WorldWideScience

Sample records for chemometric variable selection

  1. Sensor combination and chemometric variable selection for online monitoring of Streptomyces coelicolor fed-batch cultivations

    DEFF Research Database (Denmark)

    Ödman, Peter; Johansen, C.L.; Olsson, L.

    2010-01-01

    Fed-batch cultivations of Streptomyces coelicolor, producing the antibiotic actinorhodin, were monitored online by multiwavelength fluorescence spectroscopy and off-gas analysis. Partial least squares (PLS), locally weighted regression, and multilinear PLS (N-PLS) models were built for prediction of biomass and substrate (casamino acids) concentrations, respectively. The effect of combination of fluorescence and gas analyzer data as well as of different variable selection methods was investigated. Improved prediction models were obtained by combination of data from the two sensors and by variable...

  2. Non-targeted detection of chemical contamination in carbonated soft drinks using NMR spectroscopy, variable selection and chemometrics

    Energy Technology Data Exchange (ETDEWEB)

    Charlton, Adrian J. [Department for Environment, Food and Rural Affairs, Central Science Laboratory, Sand Hutton, York YO41 1LZ (United Kingdom)], E-mail: adrian.charlton@csl.gov.uk; Robb, Paul; Donarski, James A.; Godward, John [Department for Environment, Food and Rural Affairs, Central Science Laboratory, Sand Hutton, York YO41 1LZ (United Kingdom)

    2008-06-23

    An efficient method for detecting malicious and accidental contamination of foods has been developed using a combined ¹H nuclear magnetic resonance (NMR) and chemometrics approach. The method has been demonstrated, using a commercially available carbonated soft drink, to be capable of identifying atypical products and contaminant resonances. Soft-independent modelling of class analogy (SIMCA) was used to compare ¹H NMR profiles of genuine products (obtained from the manufacturer) against retail products spiked in the laboratory with impurities. The benefits of using feature selection for extracting contaminant NMR frequencies were also assessed. Using example impurities (paraquat, p-cresol and glyphosate) NMR spectra were analysed using multivariate methods resulting in detection limits of approximately 0.075, 0.2, and 0.06 mM for p-cresol, paraquat and glyphosate, respectively. These detection limits are shown to be approximately 100-fold lower than the minimum lethal dose for paraquat. The methodology presented here is used to assess the composition of complex matrices for the presence of contaminating molecules without a priori knowledge of the nature of potential contaminants. The ability to detect if a sample does not fit into the expected profile without recourse to multiple targeted analyses is a valuable tool for incident detection and forensic applications.

  3. Non-targeted detection of chemical contamination in carbonated soft drinks using NMR spectroscopy, variable selection and chemometrics

    International Nuclear Information System (INIS)

    Charlton, Adrian J.; Robb, Paul; Donarski, James A.; Godward, John

    2008-01-01

    An efficient method for detecting malicious and accidental contamination of foods has been developed using a combined ¹H nuclear magnetic resonance (NMR) and chemometrics approach. The method has been demonstrated, using a commercially available carbonated soft drink, to be capable of identifying atypical products and contaminant resonances. Soft-independent modelling of class analogy (SIMCA) was used to compare ¹H NMR profiles of genuine products (obtained from the manufacturer) against retail products spiked in the laboratory with impurities. The benefits of using feature selection for extracting contaminant NMR frequencies were also assessed. Using example impurities (paraquat, p-cresol and glyphosate) NMR spectra were analysed using multivariate methods resulting in detection limits of approximately 0.075, 0.2, and 0.06 mM for p-cresol, paraquat and glyphosate, respectively. These detection limits are shown to be approximately 100-fold lower than the minimum lethal dose for paraquat. The methodology presented here is used to assess the composition of complex matrices for the presence of contaminating molecules without a priori knowledge of the nature of potential contaminants. The ability to detect if a sample does not fit into the expected profile without recourse to multiple targeted analyses is a valuable tool for incident detection and forensic applications.

  4. Online Monitoring of Copper Damascene Electroplating Bath by Voltammetry: Selection of Variables for Multiblock and Hierarchical Chemometric Analysis of Voltammetric Data

    Directory of Open Access Journals (Sweden)

    Aleksander Jaworski

    2017-01-01

    The Real Time Analyzer (RTA), utilizing DC- and AC-voltammetric techniques, is an in situ, online monitoring system that provides a complete chemical analysis of different electrochemical deposition solutions. The RTA employs multivariate calibration when predicting concentration parameters from a multivariate data set. Although the hierarchical and multiblock Principal Component Regression (PCR)- and Partial Least Squares (PLS)-based methods can handle data sets even when the number of variables significantly exceeds the number of samples, it can be advantageous to reduce the number of variables to obtain improvement of the model predictions and better interpretation. This presentation focuses on the introduction of a multistep, rigorous method of data-selection-based Least Squares Regression, Simple Modeling of Class Analogy modeling power, and, as a novel application in electroanalysis, Uninformative Variable Elimination by PLS and by PCR, Variable Importance in the Projection coupled with PLS, Interval PLS, Interval PCR, and Moving Window PLS. Selection criteria of the optimum decomposition technique for the specific data are also demonstrated. The chief goal of this paper is to introduce to the community of electroanalytical chemists numerous variable selection methods which are well established in spectroscopy and can be successfully applied to voltammetric data analysis.

  5. Automated optimization and construction of chemometric models based on highly variable raw chromatographic data.

    Science.gov (United States)

    Sinkov, Nikolai A; Johnston, Brandon M; Sandercock, P Mark L; Harynuk, James J

    2011-07-04

    Direct chemometric interpretation of raw chromatographic data (as opposed to integrated peak tables) has been shown to be advantageous in many circumstances. However, this approach presents two significant challenges: data alignment and feature selection. In order to interpret the data, the time axes must be precisely aligned so that the signal from each analyte is recorded at the same coordinates in the data matrix for each and every analyzed sample. Several alignment approaches exist in the literature and they work well when the samples being aligned are reasonably similar. In cases where the background matrix for a series of samples to be modeled is highly variable, the performance of these approaches suffers. Considering the challenge of feature selection, when the raw data are used each signal at each time is viewed as an individual, independent variable; with the data rates of modern chromatographic systems, this generates hundreds of thousands of candidate variables, or tens of millions of candidate variables if multivariate detectors such as mass spectrometers are utilized. Consequently, an automated approach to identify and select appropriate variables for inclusion in a model is desirable. In this research we present an alignment approach that relies on a series of deuterated alkanes which act as retention anchors for an alignment signal, and couple this with an automated feature selection routine based on our novel cluster resolution metric for the construction of a chemometric model. The model system that we use to demonstrate these approaches is a series of simulated arson debris samples analyzed by passive headspace extraction, GC-MS, and interpreted using partial least squares discriminant analysis (PLS-DA). Copyright © 2011 Elsevier B.V. All rights reserved.

  6. Chemometric Analysis of Selected Organic Contaminants in Surface Water of Langat River Basin

    International Nuclear Information System (INIS)

    Mohamad Rafaie Mohamed Zubir; Rozita Osman; Norashikin Saim

    2016-01-01

    Chemometric techniques, namely hierarchical agglomerative cluster analysis (HACA), discriminant analysis (DA), principal component analysis (PCA) and factor analysis (FA), were applied to the distribution of selected organic contaminants (polycyclic aromatic hydrocarbons (PAHs), sterols, pesticides (chlorpyrifos), and phenol) to assess the potential of using these organic contaminants as chemical markers in Langat River Basin. Water samples were collected from February 2012 to January 2013 on a monthly basis for nine monitoring sites along Langat River Basin. HACA was able to classify the sampling sites into three clusters which can be correlated to the level of contamination (low, moderate and high contamination sites). DA was used to discriminate the sources of contamination using the selected organic contaminants and to relate them to the existing DOE local activities groupings. Forward and backward stepwise DA was able to discriminate two and five organic contaminant variables, respectively, from the original 13 selected variables. The five significant variables identified using backward stepwise DA were fluorene, pyrene, stigmastanol, stigmasterol and phenol. PCA and FA (varimax functionality) were used to identify the possible sources of each organic contaminant based on the inventory of local activities. Five principal components were obtained with 66.5% of the total variation. Results from FA indicated that PAHs (pyrene, fluorene, acenaphthene, benzo[a]anthracene) originated from industrial and socio-economic activities, while sterols (coprostanol, stigmastanol and stigmasterol) were associated with domestic sewage and local socio-economic activities. The occurrence of chlorpyrifos was correlated with agricultural activities and with urban and domestic discharges. This study showed that the application of chemometrics to the distribution of selected organic contaminants was able to trace the sources of contamination in surface water.

  7. Variable and subset selection in PLS regression

    DEFF Research Database (Denmark)

    Høskuldsson, Agnar

    2001-01-01

    The purpose of this paper is to present some useful methods for introductory analysis of variables and subsets in relation to PLS regression. We present here methods that are efficient in finding the appropriate variables or subset to use in the PLS regression. The general conclusion is that variable selection is important for successful analysis of chemometric data. An important aspect of the results presented is that lack of variable selection can spoil the PLS regression, and that cross-validation measures using a test set can show larger variation, when we use different subsets of X, than...

  8. Chemometrics in spectroscopy. Part 1. Classical chemometrics

    International Nuclear Information System (INIS)

    Geladi, Paul

    2003-01-01

    An overview is given of chemometrics as it can be applied to spectroscopic and other multivariate data. Major chemometrics and data analysis techniques are described. An important aspect is the focus on soft modeling for situations that are too complicated for the traditional hard models to work. Measurement noise is also given due attention. A small example is used to illustrate some ways of working, mainly by using graphics. Selected literature references are given. Part 1 deals with classical chemometrics. Part 2 presents some newer developments and includes some more elaborate examples.

  9. Benchmarking Variable Selection in QSAR.

    Science.gov (United States)

    Eklund, Martin; Norinder, Ulf; Boyer, Scott; Carlsson, Lars

    2012-02-01

    Variable selection is important in QSAR modeling since it can improve model performance and transparency, as well as reduce the computational cost of model fitting and predictions. Which variable selection methods perform well in QSAR settings is largely unknown. To address this question we, in a total of 1728 benchmarking experiments, rigorously investigated how eight variable selection methods affect the predictive performance and transparency of random forest models fitted to seven QSAR datasets covering different endpoints, descriptor sets, types of response variables, and numbers of chemical compounds. The results show that univariate variable selection methods are suboptimal and that the number of variables in the benchmarked datasets can be reduced by about 60% without significant loss in model performance when using multivariate adaptive regression splines (MARS) and forward selection. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Variable Selection via Partial Correlation.

    Science.gov (United States)

    Li, Runze; Liu, Jingyuan; Lou, Lejia

    2017-07-01

    A partial correlation based variable selection method was proposed for normal linear regression models by Bühlmann, Kalisch and Maathuis (2010) as a comparable alternative to regularization methods for variable selection. This paper addresses two important issues related to the partial correlation based variable selection method: (a) whether this method is sensitive to the normality assumption, and (b) whether this method is valid when the dimension of the predictor increases at an exponential rate of the sample size. To address issue (a), we systematically study this method for elliptical linear regression models. Our finding indicates that the original proposal may lead to inferior performance when the marginal kurtosis of the predictor is not close to that of the normal distribution. Our simulation results further confirm this finding. To ensure the superior performance of the partial correlation based variable selection procedure, we propose a thresholded partial correlation (TPC) approach to select significant variables in linear regression models. We establish the selection consistency of the TPC in the presence of ultrahigh dimensional predictors. Since the TPC procedure includes the original proposal as a special case, our theoretical results address issue (b) directly. As a by-product, the sure screening property of the first step of TPC was obtained. The numerical examples also illustrate that the TPC is competitively comparable to the commonly used regularization methods for variable selection.
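
    The idea of screening a predictor by its partial correlation with the response can be illustrated with a minimal sketch. It conditions on a single variable only and uses a fixed cutoff; the function names and the `threshold` parameter are hypothetical, and the data-driven threshold of the TPC procedure is not reproduced here.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y after removing z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def keep_variable(x, y, z, threshold):
    """Hypothetical rule: keep x if |partial correlation| exceeds a fixed cutoff."""
    return abs(partial_corr(x, y, z)) > threshold

# x and y are both driven by z, so they correlate marginally,
# but the association disappears once z is accounted for.
z = [-1.0, -1.0, 1.0, 1.0]
x = [-2.0, 0.0, 0.0, 2.0]
y = [0.0, -2.0, 0.0, 2.0]
print(pearson(x, y), partial_corr(x, y, z), keep_variable(x, y, z, 0.1))
```

    In this toy data set the marginal correlation between x and y is clearly non-zero, while the partial correlation given z is essentially zero, so the predictor would be dropped.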

  11. PG-Metrics: A chemometric-based approach for classifying bacterial peptidoglycan data sets and uncovering their subjacent chemical variability.

    Directory of Open Access Journals (Sweden)

    Keshav Kumar

    Bacterial cells are protected from osmotic and environmental stresses by an exoskeleton-like polymeric structure called peptidoglycan (PG), or murein sacculus. This structure is fundamental for bacterial viability, and thus the mechanisms underlying cell wall assembly and how it is modulated serve as targets for many of our most successful antibiotics. Therefore, it is now more important than ever to understand the genetics and structural chemistry of the bacterial cell wall in order to find new and effective methods of blocking it for the treatment of disease. In the last decades, liquid chromatography and mass spectrometry have been demonstrated to provide the required resolution and sensitivity to characterize the fine chemical structure of PG. However, the large volume of data that can be produced by these instruments today is difficult to handle without a proper data analysis workflow. Here, we present PG-metrics, a chemometric based pipeline that allows fast and easy classification of bacteria according to their muropeptide chromatographic profiles and identification of the subjacent PG chemical variability between, e.g., bacterial species, growth conditions, and mutant libraries. The pipeline is successfully validated here using PG samples from different bacterial species and mutants in cell wall proteins. The obtained results clearly demonstrate that the PG-metrics pipeline is a valuable bioanalytical tool that can lead us to cell wall classification and biomarker discovery.

  12. Seleção de variáveis em QSAR (Variable selection in QSAR)

    Directory of Open Access Journals (Sweden)

    Márcia Miguel Castro Ferreira

    2002-05-01

    The process of building mathematical models in quantitative structure-activity relationship (QSAR) studies is generally limited by the size of the dataset from which variables are selected. For huge datasets, the task of selecting a given number of variables that produces the best linear model can be enormous, if not unfeasible. In this case, some methods can be used to separate good parameter combinations from the bad ones. In this paper three methodologies are analyzed: systematic search, genetic algorithm and chemometric methods. These methods are presented and discussed through practical examples.

  13. Variable selection in PLSR and extensions to a multi-block setting for metabolomics data

    DEFF Research Database (Denmark)

    Karaman, İbrahim; Hedemann, Mette Skou; Knudsen, Knud Erik Bach

    When applying LC-MS or NMR spectroscopy in metabolomics studies, high-dimensional data are generated and effective tools for variable selection are needed in order to detect the important metabolites. Methods based on sparsity combined with PLSR have recently attracted attention in the field of genomics [1]. They quickly became well established in the field of statistics because a close relationship to the elastic net has been established. In sparse variable selection combined with PLSR, soft thresholding is applied to each loading weight separately. In the field of chemometrics, jack-knifing has been introduced for variable selection in PLSR [2]. Jack-knifing has been frequently applied in the field of spectroscopy and is implemented in software tools like The Unscrambler. In jack-knifing, uncertainty estimates of the regression coefficients are computed and a t-test is applied to these estimates...
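
    Soft thresholding of loading weights, as mentioned in the abstract above, can be sketched as follows. This is a minimal illustration of the operator itself, not of a full sparse PLSR fit; the weight values and the threshold `lam` are made-up numbers.

```python
def soft_threshold(weights, lam):
    """Shrink each weight toward zero by lam; weights with |w| <= lam become exactly 0."""
    shrunk = []
    for w in weights:
        magnitude = abs(w) - lam
        if magnitude <= 0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if w > 0 else -magnitude)
    return shrunk

# Hypothetical loading weights for four variables.
loading_weights = [0.9, -0.2, 0.05, -0.6]
sparse_weights = soft_threshold(loading_weights, 0.25)

# Variables whose weight survives the threshold are the selected ones.
selected = [i for i, w in enumerate(sparse_weights) if w != 0.0]
print(sparse_weights, selected)
```

    Variables with small loading weights are set exactly to zero and thereby removed from the component, which is what makes the thresholding act as a variable selection step.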

  14. Chemometric classification of casework arson samples based on gasoline content.

    Science.gov (United States)

    Sinkov, Nikolai A; Sandercock, P Mark L; Harynuk, James J

    2014-02-01

    Detection and identification of ignitable liquids (ILs) in arson debris is a critical part of arson investigations. The challenge of this task is due to the complex and unpredictable chemical nature of arson debris, which also contains pyrolysis products from the fire. ILs, most commonly gasoline, are complex chemical mixtures containing hundreds of compounds that will be consumed or otherwise weathered by the fire to varying extents depending on factors such as temperature, air flow, the surface on which IL was placed, etc. While methods such as ASTM E-1618 are effective, data interpretation can be a costly bottleneck in the analytical process for some laboratories. In this study, we address this issue through the application of chemometric tools. Prior to the application of chemometric tools such as PLS-DA and SIMCA, issues of chromatographic alignment and variable selection need to be addressed. Here we use an alignment strategy based on a ladder consisting of perdeuterated n-alkanes. Variable selection and model optimization was automated using a hybrid backward elimination (BE) and forward selection (FS) approach guided by the cluster resolution (CR) metric. In this work, we demonstrate the automated construction, optimization, and application of chemometric tools to casework arson data. The resulting PLS-DA and SIMCA classification models, trained with 165 training set samples, have provided classification of 55 validation set samples based on gasoline content with 100% specificity and sensitivity. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
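
    A hybrid backward elimination and forward selection loop of the kind described above might look like the following sketch. The `score` callable is a hypothetical stand-in for the cluster resolution metric, which the abstract does not define, and the ranking of variables is assumed to be given; neither is taken from the paper.

```python
def backward_then_forward(ranked, score):
    """Greedy hybrid selection.

    ranked: variables ordered from most to least promising (assumed given).
    score:  callable taking a list of variables, higher is better
            (placeholder for a model quality metric).
    """
    selected = list(ranked)
    best = score(selected)

    # Backward elimination: try dropping variables, least promising first.
    for var in reversed(ranked):
        if len(selected) == 1:
            break
        trial = [v for v in selected if v != var]
        trial_score = score(trial)
        if trial_score >= best:
            selected, best = trial, trial_score

    # Forward selection: try re-adding discarded variables, most promising first.
    for var in ranked:
        if var in selected:
            continue
        trial = selected + [var]
        trial_score = score(trial)
        if trial_score > best:
            selected, best = trial, trial_score

    return selected, best

# Toy score: reward informative variables, penalise the rest.
informative = {"a", "b"}

def toy_score(subset):
    chosen = set(subset)
    return len(chosen & informative) - 0.1 * len(chosen - informative)

subset, quality = backward_then_forward(["a", "b", "c", "d"], toy_score)
print(subset, quality)
```

    With the toy score, the uninformative variables are eliminated in the backward pass and nothing is re-added in the forward pass.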

  15. A metabolic fingerprinting approach based on selected ion flow tube mass spectrometry (SIFT-MS) and chemometrics: A reliable tool for Mediterranean origin-labeled olive oils authentication.

    Science.gov (United States)

    Bajoub, Aadil; Medina-Rodríguez, Santiago; Ajal, El Amine; Cuadros-Rodríguez, Luis; Monasterio, Romina Paula; Vercammen, Joeri; Fernández-Gutiérrez, Alberto; Carrasco-Pancorbo, Alegría

    2018-04-01

    Selected ion flow tube mass spectrometry (SIFT-MS) in combination with chemometrics was used to authenticate the geographical origin of Mediterranean virgin olive oils (VOOs) produced under geographical origin labels. In particular, 130 oil samples from six different Mediterranean regions (Kalamata (Greece); Toscana (Italy); Meknès and Tyout (Morocco); and Priego de Córdoba and Baena (Spain)) were considered. The headspace volatile fingerprints were measured by SIFT-MS in full scan with H₃O⁺, NO⁺ and O₂⁺ as precursor ions and the results were subjected to chemometric treatments. Principal Component Analysis (PCA) was used for preliminary multivariate data analysis and Partial Least Squares-Discriminant Analysis (PLS-DA) was applied to build different models (considering the three reagent ions) to classify samples according to the country of origin and regions (within the same country). The multi-class PLS-DA models showed very good performance in terms of fitting accuracy (98.90-100%) and prediction accuracy (96.70-100% accuracy for cross validation and 97.30-100% accuracy for external validation (test set)). Considering the two-class PLS-DA models, the one for the Spanish samples showed 100% sensitivity, specificity and accuracy in calibration, cross validation and external validation; the model for Moroccan oils also showed very satisfactory results (with perfect scores for almost every parameter in all the cases). Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. "Turn-off" fluorescent data array sensor based on double quantum dots coupled with chemometrics for highly sensitive and selective detection of multicomponent pesticides.

    Science.gov (United States)

    Fan, Yao; Liu, Li; Sun, Donglei; Lan, Hanyue; Fu, Haiyan; Yang, Tianming; She, Yuanbin; Ni, Chuang

    2016-04-15

    As a popular detection model, the fluorescence "turn-off" sensor based on quantum dots (QDs) has already been successfully employed in the detection of many materials, especially in research on the interactions between pesticides. However, previous studies have mainly focused on simple single track or the comparison based on similar concentration of drugs. In this work, a new detection method based on the fluorescence "turn-off" model with water-soluble ZnCdSe and CdSe QDs simultaneously as the fluorescent probes is established to detect various pesticides. The fluorescence of the two QDs can be quenched by different pesticides to varying degrees, which leads to differences in the positions and intensities of the two peaks. By combining with chemometrics methods, all the pesticides can be identified qualitatively and determined quantitatively, even in real samples, with a limit of detection of 2 × 10⁻⁸ mol L⁻¹ and a recognition rate of 100%. This work is, to the best of our knowledge, the first report on the detection of pesticides based on the fluorescence quenching phenomenon of double quantum dots combined with chemometrics methods. Moreover, the excellent selectivity of the system has been verified in different media such as mixed ion disruption, waste water, tea and water extraction liquid drugs. Copyright © 2016 Elsevier B.V. All rights reserved.

  17. Variable selection by lasso-type methods

    Directory of Open Access Journals (Sweden)

    Sohail Chand

    2011-09-01

    Variable selection is an important property of shrinkage methods. The adaptive lasso is an oracle procedure and can do consistent variable selection. In this paper, we provide an explanation of how the use of adaptive weights makes it possible for the adaptive lasso to satisfy the necessary and almost sufficient condition for consistent variable selection. We suggest a novel algorithm and give an important result: if predictors are normalised after the introduction of adaptive weights, the performance of the adaptive lasso becomes identical to that of the lasso.
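
    The last result above can be made plausible with a small sketch. It relies on the standard view that the adaptive lasso can be computed as an ordinary lasso on predictors divided by their adaptive weights; that this matches the paper's own argument is an assumption, and the column values and weight below are invented for illustration.

```python
import math

def reweight(column, weight):
    """Rescale one predictor as x_j / w_j (adaptive lasso written as a lasso)."""
    return [v / weight for v in column]

def unit_norm(column):
    """Normalise a predictor column to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in column))
    return [v / norm for v in column]

predictor = [1.0, -2.0, 0.5, 3.0]
adaptive_weight = 4.0  # hypothetical positive weight for this predictor

plain = unit_norm(predictor)
weighted_then_normalised = unit_norm(reweight(predictor, adaptive_weight))

# Normalising after reweighting cancels the weight, so both columns coincide.
same = all(abs(a - b) < 1e-12 for a, b in zip(plain, weighted_then_normalised))
print(same)
```

    Because the weight only rescales the column, normalising afterwards removes its effect, and the lasso then sees the same design matrix with or without adaptive weights.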

  18. SELECTING QUASARS BY THEIR INTRINSIC VARIABILITY

    International Nuclear Information System (INIS)

    Schmidt, Kasper B.; Rix, Hans-Walter; Jester, Sebastian; Hennawi, Joseph F.; Marshall, Philip J.; Dobler, Gregory

    2010-01-01

    We present a new and simple technique for selecting extensive, complete, and pure quasar samples, based on their intrinsic variability. We parameterize the single-band variability by a power-law model for the light-curve structure function, with amplitude A and power-law index γ. We show that quasars can be efficiently separated from other non-variable and variable sources by the location of the individual sources in the A-γ plane. We use ∼60 epochs of imaging data, taken over ∼5 years, from the SDSS stripe 82 (S82) survey, where extensive spectroscopy provides a reference sample of quasars, to demonstrate the power of variability as a quasar classifier in multi-epoch surveys. For UV-excess selected objects, variability performs just as well as the standard SDSS color selection, identifying quasars with a completeness of 90% and a purity of 95%. In the redshift range 2.5 < z < 3, where color selection is known to be problematic, variability can select quasars with a completeness of 90% and a purity of 96%. This is a factor of 5-10 times more pure than existing color selection of quasars in this redshift range. Selecting objects from a broad griz color box without u-band information, variability selection in S82 can afford completeness and purity of 92%, despite a factor of 30 more contaminants than quasars in the color-selected feeder sample. This confirms that the fraction of quasars hidden in the 'stellar locus' of color space is small. To test variability selection in the context of Pan-STARRS 1 (PS1) we created mock PS1 data by down-sampling the S82 data to just six epochs over 3 years. Even with this much sparser time sampling, variability is an encouragingly efficient classifier. For instance, a 92% pure and 44% complete quasar candidate sample is attainable from the above griz-selected catalog. Finally, we show that the presented A-γ technique, besides selecting clean and pure samples of quasars (which are stochastically varying objects), is also
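
    The power-law parameterization of the structure function can be illustrated with a minimal sketch. It assumes the form SF(Δt) = A (Δt / 1 yr)^γ and estimates A and γ by least squares in log space; both the normalisation at one year and the fitting method are assumptions, since the abstract does not state how the parameters are estimated. The data points are synthetic.

```python
import math

def fit_power_law(lags_years, sf_values):
    """Fit log SF = log A + gamma * log(lag) by ordinary least squares."""
    log_lag = [math.log(t) for t in lags_years]
    log_sf = [math.log(s) for s in sf_values]
    n = len(log_lag)
    mean_x = sum(log_lag) / n
    mean_y = sum(log_sf) / n
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_lag, log_sf))
    slope_den = sum((x - mean_x) ** 2 for x in log_lag)
    gamma = slope_num / slope_den
    amplitude = math.exp(mean_y - gamma * mean_x)
    return amplitude, gamma

# Synthetic, noise-free structure function with known parameters.
true_amplitude, true_gamma = 0.2, 0.4
lags = [0.1, 0.5, 1.0, 2.0, 5.0]
sf = [true_amplitude * t ** true_gamma for t in lags]

amplitude, gamma = fit_power_law(lags, sf)
print(amplitude, gamma)
```

    Each source then corresponds to one point in the A-γ plane, and a classifier would separate quasars from other sources by their location in that plane.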

  19. “Turn-off” fluorescent data array sensor based on double quantum dots coupled with chemometrics for highly sensitive and selective detection of multicomponent pesticides

    Energy Technology Data Exchange (ETDEWEB)

    Fan, Yao; Liu, Li; Sun, Donglei; Lan, Hanyue [The Modernization Engineering Technology Research Center of Ethnic Minority Medicine of Hubei Province, College of Pharmacy, South-Central University for Nationalities, Wuhan 430074 (China); Fu, Haiyan, E-mail: fuhaiyan@mail.scuec.edu.cn [The Modernization Engineering Technology Research Center of Ethnic Minority Medicine of Hubei Province, College of Pharmacy, South-Central University for Nationalities, Wuhan 430074 (China); Yang, Tianming, E-mail: tmyang@mail.scuec.edu.cn [The Modernization Engineering Technology Research Center of Ethnic Minority Medicine of Hubei Province, College of Pharmacy, South-Central University for Nationalities, Wuhan 430074 (China); She, Yuanbin, E-mail: sheyb@zjut.edu.cn [State Key Laboratory Breeding Base of Green Chemistry-Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310032 (China); Ni, Chuang [The Modernization Engineering Technology Research Center of Ethnic Minority Medicine of Hubei Province, College of Pharmacy, South-Central University for Nationalities, Wuhan 430074 (China)

    2016-04-15

    As a popular detection model, the fluorescence “turn-off” sensor based on quantum dots (QDs) has already been successfully employed in the detection of many materials, especially in research on the interactions between pesticides. However, previous studies have mainly focused on simple single track or the comparison based on similar concentration of drugs. In this work, a new detection method based on the fluorescence “turn-off” model with water-soluble ZnCdSe and CdSe QDs simultaneously as the fluorescent probes is established to detect various pesticides. The fluorescence of the two QDs can be quenched by different pesticides to varying degrees, which leads to differences in the positions and intensities of the two peaks. By combining with chemometrics methods, all the pesticides can be identified qualitatively and determined quantitatively, even in real samples, with a limit of detection of 2 × 10⁻⁸ mol L⁻¹ and a recognition rate of 100%. This work is, to the best of our knowledge, the first report on the detection of pesticides based on the fluorescence quenching phenomenon of double quantum dots combined with chemometrics methods. Moreover, the excellent selectivity of the system has been verified in different media such as mixed ion disruption, waste water, tea and water extraction liquid drugs. Highlights: • A new model based on double QDs is established for pesticide residue detection. • The fluorescent data array sensor is coupled with chemometrics methods. • The sensor allows highly sensitive and selective detection in actual samples.

  20. “Turn-off” fluorescent data array sensor based on double quantum dots coupled with chemometrics for highly sensitive and selective detection of multicomponent pesticides

    International Nuclear Information System (INIS)

    Fan, Yao; Liu, Li; Sun, Donglei; Lan, Hanyue; Fu, Haiyan; Yang, Tianming; She, Yuanbin; Ni, Chuang

    2016-01-01

    As a popular detection model, the fluorescence “turn-off” sensor based on quantum dots (QDs) has already been successfully employed in the detection of many materials, especially in research on the interactions between pesticides. However, previous studies have mainly focused on simple single track or the comparison based on similar concentration of drugs. In this work, a new detection method based on the fluorescence “turn-off” model with water-soluble ZnCdSe and CdSe QDs simultaneously as the fluorescent probes is established to detect various pesticides. The fluorescence of the two QDs can be quenched by different pesticides to varying degrees, which leads to differences in the positions and intensities of the two peaks. By combining with chemometrics methods, all the pesticides can be identified qualitatively and determined quantitatively, even in real samples, with a limit of detection of 2 × 10⁻⁸ mol L⁻¹ and a recognition rate of 100%. This work is, to the best of our knowledge, the first report on the detection of pesticides based on the fluorescence quenching phenomenon of double quantum dots combined with chemometrics methods. Moreover, the excellent selectivity of the system has been verified in different media such as mixed ion disruption, waste water, tea and water extraction liquid drugs. Highlights: • A new model based on double QDs is established for pesticide residue detection. • The fluorescent data array sensor is coupled with chemometrics methods. • The sensor allows highly sensitive and selective detection in actual samples.

  1. Purposeful selection of variables in logistic regression

    Directory of Open Access Journals (Sweden)

    Williams David Keith

    2008-12-01

    Full Text Available Abstract Background The main problem in many model-building situations is to choose from a large set of covariates those that should be included in the "best" model. A decision to keep a variable in the model might be based on the clinical or statistical significance. There are several variable selection algorithms in existence. Those methods are mechanical and as such carry some limitations. Hosmer and Lemeshow describe a purposeful selection of covariates within which an analyst makes a variable selection decision at each step of the modeling process. Methods In this paper we introduce an algorithm which automates that process. We conduct a simulation study to compare the performance of this algorithm with three well documented variable selection procedures in SAS PROC LOGISTIC: FORWARD, BACKWARD, and STEPWISE. Results We show that the advantage of this approach is when the analyst is interested in risk factor modeling and not just prediction. In addition to significant covariates, this variable selection procedure has the capability of retaining important confounding variables, resulting potentially in a slightly richer model. Application of the macro is further illustrated with the Hosmer and Lemeshow Worcester Heart Attack Study (WHAS) data. Conclusion If an analyst is in need of an algorithm that will help guide the retention of significant covariates as well as confounding ones, they should consider this macro as an alternative tool.

  2. Chemometrics in spectroscopy

    International Nuclear Information System (INIS)

    Geladi, Paul; Sethson, Britta; Nystroem, Josefina; Lillhonga, Tom; Lestander, Torbjoern; Burger, Jim

    2004-01-01

    Some of the principles and main methods of chemometrics are illustrated by examples. The examples are from electrochemistry, process analytical chemistry and multivariate imaging. Principal component analysis, partial least squares regression and multivariate image analysis are used to illustrate the power of chemometrical thinking. The emphasis is on plotting and visualization for showing the salient features of a model or data set.

  3. Penalized variable selection in competing risks regression.

    Science.gov (United States)

    Fu, Zhixuan; Parikh, Chirag R; Zhou, Bingqing

    2017-07-01

    Penalized variable selection methods have been extensively studied for standard time-to-event data. Such methods cannot be directly applied when subjects are at risk of multiple mutually exclusive events, known as competing risks. The proportional subdistribution hazard (PSH) model proposed by Fine and Gray (J Am Stat Assoc 94:496-509, 1999) has become a popular semi-parametric model for time-to-event data with competing risks. It allows for direct assessment of covariate effects on the cumulative incidence function. In this paper, we propose a general penalized variable selection strategy that simultaneously handles variable selection and parameter estimation in the PSH model. We rigorously establish the asymptotic properties of the proposed penalized estimators and modify the coordinate descent algorithm for implementation. Simulation studies are conducted to demonstrate the good performance of the proposed method. Data from deceased donor kidney transplants from the United Network of Organ Sharing illustrate the utility of the proposed method.

  4. Machine learning techniques to select variable stars

    Directory of Open Access Journals (Sweden)

    García-Varela Alejandro

    2017-01-01

    Full Text Available In order to perform a supervised classification of variable stars, we propose and evaluate a set of six features extracted from the magnitude density of the light curves. They are used to train automatic classification systems using state-of-the-art classifiers implemented in the R statistical computing environment. We find that random forests is the most successful method to select variables.

  5. Robust cluster analysis and variable selection

    CERN Document Server

    Ritter, Gunter

    2014-01-01

    Clustering remains a vibrant area of research in statistics. Although there are many books on this topic, there are relatively few that are well founded in the theoretical aspects. In Robust Cluster Analysis and Variable Selection, Gunter Ritter presents an overview of the theory and applications of probabilistic clustering and variable selection, synthesizing the key research results of the last 50 years. The author focuses on the robust clustering methods he found to be the most useful on simulated data and real-time applications. The book provides clear guidance for the varying needs of bot

  6. Classification of archaeological pieces into their respective stratum by a chemometric model based on the soil concentration of 25 selected elements

    International Nuclear Information System (INIS)

    Carrero, J.A.; Goienaga, N.; Fdez-Ortiz de Vallejuelo, S.; Arana, G.; Madariaga, J.M.

    2010-01-01

    The aim of this work was to demonstrate that an archaeological ceramic piece has remained buried underground in the same stratum for centuries without being removed. For this purpose, a chemometric model based on Principal Component Analysis, Soft Independent Modelling of Class Analogy and Linear Discriminant Analysis classification techniques was created with the concentrations of selected elements in both the soil of the stratum and the soil adhered to the ceramic piece. Ceramic pieces from four different stratigraphic units at a Roman archaeological site in Alava (North of Spain), and their respective stratum soils, were collected. The soil adhered to the ceramic pieces was removed and treated in the same way as the soil from its respective stratum. The digestion was carried out following the US Environmental Protection Agency EPA 3051A method. A total of 54 elements were determined in the extracts by a rapid screening inductively coupled plasma mass spectrometry method. After rejecting the major elements and those which could have changed from the original composition of the soils (migration or retention from/to the buried objects), the following elements (25) were finally taken into account to construct the model: Li, V, Co, As, Y, Nb, Sn, Ba, La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu, Au, Th and U. A total of 33 subsamples were treated from 10 soils belonging to 4 different stratigraphic units. The final model discriminates the samples into four groups according to the stratigraphic unit, with both the stratum soil and the soil adhered to the pieces falling into the same group.

  7. Comparison of selected variables of gaming performance in football

    OpenAIRE

    Parachin, Jiří

    2014-01-01

    Title: Comparison of selected variables of gaming performance in football Objectives: Analysis of selected variables of gaming performance in the matches of professional Czech football teams in the Champions League and UEFA Europa League in 2013. During the observation to register set variables, then evaluate obtained results and compare them. Methods: The use of observational analysis and comparison of selected variables of gaming performance in competitive matches of professional football. ...

  8. Fuzzy target selection using RFM variables

    NARCIS (Netherlands)

    Kaymak, U.

    2001-01-01

    An important data mining problem from the world of direct marketing is target selection. The main task in target selection is the determination of potential customers for a product from a client database. Target selection algorithms identify the profiles of customer groups for a particular product,

  9. Total sulfur determination in residues of crude oil distillation using FT-IR/ATR and variable selection methods

    Science.gov (United States)

    Müller, Aline Lima Hermes; Picoloto, Rochele Sogari; Mello, Paola de Azevedo; Ferrão, Marco Flores; dos Santos, Maria de Fátima Pereira; Guimarães, Regina Célia Lourenço; Müller, Edson Irineu; Flores, Erico Marlon Moraes

    2012-04-01

    Total sulfur concentration was determined in atmospheric residue (AR) and vacuum residue (VR) samples obtained from the petroleum distillation process by Fourier transform infrared spectroscopy with attenuated total reflectance (FT-IR/ATR) in association with chemometric methods. The calibration and prediction sets consisted of 40 and 20 samples, respectively. Calibration models were developed using two variable selection methods: interval partial least squares (iPLS) and synergy interval partial least squares (siPLS). Different treatments and pre-processing steps were also evaluated for the development of models. The pre-treatment based on multiplicative scatter correction (MSC) and mean-centered data was selected for model construction. The use of siPLS as the variable selection method provided a model with root mean square error of prediction (RMSEP) values significantly better than those obtained by the PLS model using all variables. The best model was obtained using the siPLS algorithm with spectra divided into 20 intervals and combinations of 3 intervals (911-824, 823-736 and 737-650 cm⁻¹). This model produced an RMSECV of 400 mg kg⁻¹ S and an RMSEP of 420 mg kg⁻¹ S, showing a correlation coefficient of 0.990.
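
    The interval search described in this abstract can be illustrated with a minimal sketch. It only splits a spectral axis into equal-width intervals and enumerates the 3-interval combinations that a synergy-interval search would have to score; it does not fit any PLS model. The 1000-point axis and the function names are assumptions made for illustration, not details from the paper.

```python
from itertools import combinations
from math import sqrt


def split_into_intervals(n_variables, n_intervals):
    """Return (start, stop) index pairs covering range(n_variables) as evenly as possible."""
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for i in range(n_intervals):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def interval_combinations(bounds, k):
    """Yield every k-interval combination as a flat list of column indices."""
    for combo in combinations(bounds, k):
        yield [j for (start, stop) in combo for j in range(start, stop)]


def rmse(y_true, y_pred):
    """Root mean square error, the quantity reported as RMSECV or RMSEP."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Assumed axis length of 1000 points, split into 20 intervals as in the abstract.
intervals = split_into_intervals(n_variables=1000, n_intervals=20)
candidates = list(interval_combinations(intervals, k=3))
print(len(candidates))  # 1140 candidate subsets, i.e. C(20, 3)
```

    In a full workflow each candidate column subset would be passed to a cross-validated PLS fit and ranked by its error; that step is left out here because it depends on the modeling library in use.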

  10. Variable selection in Logistic regression model with genetic algorithm.

    Science.gov (United States)

    Zhang, Zhongheng; Trevino, Victor; Hoseini, Sayed Shahabuddin; Belciug, Smaranda; Boopathi, Arumugam Manivanna; Zhang, Ping; Gorunescu, Florin; Subha, Velappan; Dai, Songshi

    2018-02-01

    Variable or feature selection is one of the most important steps in model specification. Especially in the case of medical decision making, the direct use of a medical database, without a previous analysis and preprocessing step, is often counterproductive. Variable selection is therefore the method of choosing the most relevant attributes from the database in order to build robust learning models and, thus, to improve the performance of the models used in the decision process. In biomedical research, the purpose of variable selection is to select clinically important and statistically significant variables, while excluding unrelated or noise variables. A variety of methods exist for variable selection, but none of them is without limitations. For example, the stepwise approach, which is widely used, adds the best variable in each cycle, generally producing an acceptable set of variables. Nevertheless, it is limited by the fact that it is commonly trapped in local optima. The best subset approach can systematically search the entire covariate pattern space, but the solution pool can be extremely large with tens to hundreds of variables, which is the case in present-day clinical data. Genetic algorithms (GA) are heuristic optimization approaches and can be used for variable selection in multivariable regression models. This tutorial paper aims to provide a step-by-step approach to the use of GA in variable selection. The R code provided in the text can be extended and adapted to other data analysis needs.

  11. THE TIME DOMAIN SPECTROSCOPIC SURVEY: VARIABLE SELECTION AND ANTICIPATED RESULTS

    Energy Technology Data Exchange (ETDEWEB)

    Morganson, Eric; Green, Paul J. [Harvard Smithsonian Center for Astrophysics, 60 Garden St, Cambridge, MA 02138 (United States); Anderson, Scott F.; Ruan, John J. [Department of Astronomy, University of Washington, Box 351580, Seattle, WA 98195 (United States); Myers, Adam D. [Department of Physics and Astronomy, University of Wyoming, Laramie, WY 82071 (United States); Eracleous, Michael; Brandt, William Nielsen [Department of Astronomy and Astrophysics, 525 Davey Laboratory, The Pennsylvania State University, University Park, PA 16802 (United States); Kelly, Brandon [Department of Physics, Broida Hall, University of California, Santa Barbara, CA 93106-9530 (United States); Badenes, Carlos [Department of Physics and Astronomy and Pittsburgh Particle Physics, Astrophysics and Cosmology Center (PITT PACC), University of Pittsburgh, 3941 O’Hara St, Pittsburgh, PA 15260 (United States); Bañados, Eduardo [Max-Planck-Institut für Astronomie, Königstuhl 17, D-69117 Heidelberg (Germany); Blanton, Michael R. [Center for Cosmology and Particle Physics, Department of Physics, New York University, 4 Washington Place, New York, NY 10003 (United States); Bershady, Matthew A. [Department of Astronomy, University of Wisconsin, 475 N. Charter St., Madison, WI 53706 (United States); Borissova, Jura [Instituto de Física y Astronomía, Universidad de Valparaíso, Av. Gran Bretaña 1111, Playa Ancha, Casilla 5030, and Millennium Institute of Astrophysics (MAS), Santiago (Chile); Burgett, William S. [GMTO Corp, Suite 300, 251 S. Lake Ave, Pasadena, CA 91101 (United States); Chambers, Kenneth, E-mail: emorganson@cfa.harvard.edu [Institute for Astronomy, University of Hawaii at Manoa, Honolulu, HI 96822 (United States); and others

    2015-06-20

    We present the selection algorithm and anticipated results for the Time Domain Spectroscopic Survey (TDSS). TDSS is a Sloan Digital Sky Survey (SDSS)-IV Extended Baryon Oscillation Spectroscopic Survey (eBOSS) subproject that will provide initial identification spectra of approximately 220,000 luminosity-variable objects (variable stars and active galactic nuclei) across 7500 deg² selected from a combination of SDSS and multi-epoch Pan-STARRS1 photometry. TDSS will be the largest spectroscopic survey to explicitly target variable objects, avoiding pre-selection on the basis of colors or detailed modeling of specific variability characteristics. Kernel Density Estimate analysis of our target population performed on SDSS Stripe 82 data suggests our target sample will be 95% pure (meaning 95% of objects we select have genuine luminosity variability of a few magnitudes or more). Our final spectroscopic sample will contain roughly 135,000 quasars and 85,000 stellar variables, approximately 4000 of which will be RR Lyrae stars which may be used as outer Milky Way probes. The variability-selected quasar population has a smoother redshift distribution than a color-selected sample, and variability measurements similar to those we develop here may be used to make more uniform quasar samples in large surveys. The stellar variable targets are distributed fairly uniformly across color space, indicating that TDSS will obtain spectra for a wide variety of stellar variables including pulsating variables, stars with significant chromospheric activity, cataclysmic variables, and eclipsing binaries. TDSS will serve as a pathfinder mission to identify and characterize the multitude of variable objects that will be detected photometrically in even larger variability surveys such as the Large Synoptic Survey Telescope.

  12. Chemometric methods in capillary electrophoresis

    National Research Council Canada - National Science Library

    Hanrahan, Grady; Gomez, Frank A

    2010-01-01

    ... 113 6 CHEMOMETRIC METHODS FOR THE OPTIMIZATION OF CE AND CE- MS IN PHARMACEUTICAL, ENVIRONMENTAL, AND FOOD ANALYSIS Javier Hernández-Borges, Miguel Ángel Rodríguez-Delgado, and Alejandro Cifuent...

  13. Chemometrics review for chemical sensor development, task 7 report

    International Nuclear Information System (INIS)

    1994-05-01

    This report, the seventh in a series on the evaluation of several chemical sensors for use in the U.S. Department of Energy's (DOE's) site characterization and monitoring programs, concentrates on the potential use of chemometrics techniques in analysis of sensor data. Chemometrics is the chemical discipline that uses mathematical, statistical, and other methods that employ formal logic to: design or select optimal measurement procedures and experiments and provide maximum relevant chemical information by analyzing chemical data. The report emphasizes the latter aspect. In a formal sense, two distinct phases are in chemometrics applications to analytical chemistry problems: (1) the exploratory data analysis phase and (2) the calibration and prediction phase. For use in real-world problems, it is wise to add a third aspect - the independent validation and verification phase. In practical applications, such as the ERWM work, and in order of decreasing difficulties, the most difficult tasks in chemometrics are: establishing the necessary infrastructure (to manage sampling records, data handling, and data storage and related aspects), exploring data analysis, and solving calibration problems, especially for nonlinear models. Chemometrics techniques are different for what are called zeroth-, first-, and second-order systems, and the details depend on the form of the assumed functional relationship between the measured response and the concentrations of components in mixtures. In general, linear relationships can be handled relatively easily, but nonlinear relationships can be difficult

  14. Chemometrics review for chemical sensor development, task 7 report

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1994-05-01

    This report, the seventh in a series on the evaluation of several chemical sensors for use in the U.S. Department of Energy's (DOE's) site characterization and monitoring programs, concentrates on the potential use of chemometrics techniques in analysis of sensor data. Chemometrics is the chemical discipline that uses mathematical, statistical, and other methods that employ formal logic to: design or select optimal measurement procedures and experiments and provide maximum relevant chemical information by analyzing chemical data. The report emphasizes the latter aspect. In a formal sense, two distinct phases are in chemometrics applications to analytical chemistry problems: (1) the exploratory data analysis phase and (2) the calibration and prediction phase. For use in real-world problems, it is wise to add a third aspect - the independent validation and verification phase. In practical applications, such as the ERWM work, and in order of decreasing difficulties, the most difficult tasks in chemometrics are: establishing the necessary infrastructure (to manage sampling records, data handling, and data storage and related aspects), exploring data analysis, and solving calibration problems, especially for nonlinear models. Chemometrics techniques are different for what are called zeroth-, first-, and second-order systems, and the details depend on the form of the assumed functional relationship between the measured response and the concentrations of components in mixtures. In general, linear relationships can be handled relatively easily, but nonlinear relationships can be difficult.

  15. Bayesian Group Bridge for Bi-level Variable Selection.

    Science.gov (United States)

    Mallick, Himel; Yi, Nengjun

    2017-06-01

    A Bayesian bi-level variable selection method (BAGB: Bayesian Analysis of Group Bridge) is developed for regularized regression and classification. This new development is motivated by grouped data, where generic variables can be divided into multiple groups, with variables in the same group being mechanistically related or statistically correlated. As an alternative to frequentist group variable selection methods, BAGB incorporates structural information among predictors through a group-wise shrinkage prior. Posterior computation proceeds via an efficient MCMC algorithm. In addition to the usual ease-of-interpretation of hierarchical linear models, the Bayesian formulation produces valid standard errors, a feature that is notably absent in the frequentist framework. Empirical evidence of the attractiveness of the method is illustrated by extensive Monte Carlo simulations and real data analysis. Finally, several extensions of this new approach are presented, providing a unified framework for bi-level variable selection in general models with flexible penalties.

  16. Using variable combination population analysis for variable selection in multivariate calibration.

    Science.gov (United States)

    Yun, Yong-Huan; Wang, Wei-Ting; Deng, Bai-Chuan; Lai, Guang-Bi; Liu, Xin-bo; Ren, Da-Bing; Liang, Yi-Zeng; Fan, Wei; Xu, Qing-Song

    2015-03-03

    Variable (wavelength or feature) selection techniques have become a critical step for the analysis of datasets with a high number of variables and relatively few samples. In this study, a novel variable selection strategy, variable combination population analysis (VCPA), was proposed. This strategy consists of two crucial procedures. First, the exponentially decreasing function (EDF), which follows the simple and effective principle of 'survival of the fittest' from Darwin's natural evolution theory, is employed to determine the number of variables to keep and continuously shrink the variable space. Second, in each EDF run, the binary matrix sampling (BMS) strategy, which gives each variable the same chance to be selected and generates different variable combinations, is used to produce a population of subsets to construct a population of sub-models. Then, model population analysis (MPA) is employed to find the variable subsets with the lower root mean square error of cross validation (RMSECV). The frequency of each variable appearing in the best 10% of sub-models is computed. The higher the frequency is, the more important the variable is. The performance of the proposed procedure was investigated using three real NIR datasets. The results indicate that VCPA is a good variable selection strategy when compared with four high performing variable selection methods: genetic algorithm-partial least squares (GA-PLS), Monte Carlo uninformative variable elimination by PLS (MC-UVE-PLS), competitive adaptive reweighted sampling (CARS) and iteratively retains informative variables (IRIV). The MATLAB source code of VCPA is available for academic research on the website: http://www.mathworks.com/matlabcentral/fileexchange/authors/498750. Copyright © 2015 Elsevier B.V. All rights reserved.
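
    The two procedures named in this abstract can be sketched roughly as follows. This is not the authors' MATLAB code. The exponential form used for the EDF (retained count decaying as exp(-theta * i)) is an assumed parameterization, and the toy error function merely stands in for the RMSECV of PLS sub-models, which is not computed here. All function names are hypothetical.

```python
import math
import random


def edf_retained(p, n_runs, n_final):
    """Number of variables kept after each EDF run, decaying from p towards n_final.

    Assumed form: p * exp(-theta * i) with theta chosen so run n_runs keeps n_final.
    """
    theta = math.log(p / n_final) / n_runs
    return [max(n_final, round(p * math.exp(-theta * i))) for i in range(1, n_runs + 1)]


def binary_matrix_sampling(n_submodels, n_variables, ratio, rng):
    """Binary matrix in which every column (variable) holds the same number of ones."""
    ones = round(n_submodels * ratio)
    columns = []
    for _ in range(n_variables):
        col = [1] * ones + [0] * (n_submodels - ones)
        rng.shuffle(col)
        columns.append(col)
    # Transpose so that each row describes the variables used by one sub-model.
    return [list(row) for row in zip(*columns)]


def selection_frequency(matrix, errors, top_fraction=0.1):
    """Frequency of each variable among the sub-models with the lowest error."""
    n_best = max(1, int(len(matrix) * top_fraction))
    best = sorted(range(len(matrix)), key=lambda i: errors[i])[:n_best]
    return [sum(matrix[i][j] for i in best) / n_best for j in range(len(matrix[0]))]


rng = random.Random(0)
matrix = binary_matrix_sampling(n_submodels=1000, n_variables=8, ratio=0.5, rng=rng)
# Toy stand-in for RMSECV: sub-models containing variables 0 and 1 score best.
errors = [2 - row[0] - row[1] + 0.1 * rng.random() for row in matrix]
freq = selection_frequency(matrix, errors)
kept = edf_retained(p=100, n_runs=50, n_final=2)
print(kept[:5], freq)
```

    In this toy setting the two informative variables dominate the best 10% of sub-models, which is the signal the frequency count is meant to expose.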

  17. Selecting minimum dataset soil variables using PLSR as a regressive multivariate method

    Science.gov (United States)

    Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.

    2017-04-01

    ) statistics was used to quantitatively assess the predictors most relevant for response variable estimation and then for variable selection (Andersen and Bro, 2010). PCA and SDA returned TOC and RFC as influential variables both on the set of chemical and physical data analyzed separately as well as on the whole dataset (Stellacci et al., 2016). Highly weighted variables in PCA were also TEC, followed by K, and AC, followed by Pmac and BD, in the first PC (41.2% of total variance); Olsen P and HA-FA in the second PC (12.6%), Ca in the third (10.6%) component. Variables enabling maximum discrimination among treatments for SDA were WEOC, on the whole dataset, humic substances, followed by Olsen P, EC and clay, in the separate data analyses. The highest PLS-VIP statistics were recorded for Olsen P and Pmac, followed by TOC, TEC, pH and Mg for chemical variables and clay, RFC and AC for the physical variables. Results show that different methods may provide different ranking of the selected variables and the presence of a response variable, in regressive techniques, may affect variable selection. Further investigation with different response variables and with multi-year datasets would allow to better define advantages and limits of single or combined approaches. Acknowledgment The work was supported by the projects "BIOTILLAGE, approcci innovative per il miglioramento delle performances ambientali e produttive dei sistemi cerealicoli no-tillage", financed by PSR-Basilicata 2007-2013, and "DESERT, Low-cost water desalination and sensor technology compact module" financed by ERANET-WATERWORKS 2014. References Andersen C.M. and Bro R., 2010. Variable selection in regression - a tutorial. Journal of Chemometrics, 24 728-737. Armenise et al., 2013. Developing a soil quality index to compare soil fitness for agricultural use under different managements in the mediterranean environment. Soil and Tillage Research, 130:91-98. de Paul Obade et al., 2016. A standardized soil quality index

  18. Ensembling Variable Selectors by Stability Selection for the Cox Model

    Directory of Open Access Journals (Sweden)

    Qing-Yan Yin

    2017-01-01

    Full Text Available As a pivotal tool to build interpretive models, variable selection plays an increasingly important role in high-dimensional data analysis. In recent years, variable selection ensembles (VSEs) have gained much interest due to their many advantages. Stability selection (Meinshausen and Bühlmann, 2010), a VSE technique based on subsampling in combination with a base algorithm like lasso, is an effective method to control false discovery rate (FDR) and to improve selection accuracy in linear regression models. By adopting lasso as a base learner, we attempt to extend stability selection to handle variable selection problems in a Cox model. According to our experience, it is crucial to set the regularization region Λ in lasso and the parameter λmin properly so that stability selection can work well. To the best of our knowledge, however, there is no literature addressing this problem in an explicit way. Therefore, we first provide a detailed procedure to specify Λ and λmin. Then, some simulated and real-world data with various censoring rates are used to examine how well stability selection performs. It is also compared with several other variable selection approaches. Experimental results demonstrate that it achieves better or competitive performance in comparison with several other popular techniques.
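
    The subsampling idea behind stability selection can be sketched as below. This is a simplified illustration, not the procedure of the paper: the base selector is a hypothetical stand-in that keeps the k columns most correlated with the response, whereas the paper uses lasso within a Cox model, and the regularization region Λ and λmin are not modeled at all. The threshold of 0.8 and the half-size subsamples are assumed settings.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def top_k_by_correlation(X, y, k):
    """Stand-in base selector (NOT lasso): the k columns most correlated with y."""
    p = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(p)]
    return set(sorted(range(p), key=lambda j: -scores[j])[:k])


def stability_selection(X, y, base_selector, n_subsamples=100, threshold=0.8, seed=0):
    """Run base_selector on random half-samples and keep frequently chosen variables."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)
        for j in base_selector([X[i] for i in idx], [y[i] for i in idx]):
            counts[j] += 1
    freqs = [c / n_subsamples for c in counts]
    stable = [j for j, f in enumerate(freqs) if f >= threshold]
    return stable, freqs


# Simulated data: only columns 0 and 1 carry signal.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(80)]
y = [2 * row[0] - 2 * row[1] + rng.gauss(0, 0.5) for row in X]
stable, freqs = stability_selection(X, y, lambda A, b: top_k_by_correlation(A, b, k=2))
print(stable, [round(f, 2) for f in freqs])
```

    Any selector with the same call shape could be substituted for the stand-in, which is the sense in which stability selection wraps a base algorithm.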

  19. A numeric comparison of variable selection algorithms for supervised learning

    International Nuclear Information System (INIS)

    Palombo, G.; Narsky, I.

    2009-01-01

    Datasets in modern High Energy Physics (HEP) experiments are often described by dozens or even hundreds of input variables. Reducing a full variable set to a subset that most completely represents information about data is therefore an important task in analysis of HEP data. We compare various variable selection algorithms for supervised learning using several datasets such as, for instance, imaging gamma-ray Cherenkov telescope (MAGIC) data found at the UCI repository. We use classifiers and variable selection methods implemented in the statistical package StatPatternRecognition (SPR), a free open-source C++ package developed in the HEP community (http://sourceforge.net/projects/statpatrec/). For each dataset, we select a powerful classifier and estimate its learning accuracy on variable subsets obtained by various selection algorithms. When possible, we also estimate the CPU time needed for the variable subset selection. The results of this analysis are compared with those published previously for these datasets using other statistical packages such as R and Weka. We show that the most accurate, yet slowest, method is a wrapper algorithm known as generalized sequential forward selection ('Add N Remove R') implemented in SPR.
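
    A generic 'add n, remove r' wrapper search of the kind named in this abstract can be sketched as follows. This is not the SPR implementation: the function name, the stopping rule, and the toy scoring function are assumptions. In a real wrapper, `score` would train a classifier on the candidate subset and return its estimated accuracy.

```python
def add_n_remove_r(features, score, n_add=2, n_remove=1, target_size=3):
    """Greedy wrapper search: add n_add features, then drop n_remove, and repeat.

    score(subset) must return a number where higher is better.
    """
    if n_add <= n_remove:
        raise ValueError("n_add must exceed n_remove for the subset to grow")
    if target_size + n_remove > len(features):
        raise ValueError("not enough features for the requested target size")
    selected = []
    while len(selected) < target_size:
        # Forward steps: add the feature that raises the score the most.
        for _ in range(n_add):
            remaining = [f for f in features if f not in selected]
            if not remaining:
                break
            selected.append(max(remaining, key=lambda f: score(selected + [f])))
        # Backward steps: drop the feature whose removal leaves the best score.
        for _ in range(n_remove):
            worst = max(selected, key=lambda f: score([g for g in selected if g != f]))
            selected.remove(worst)
    # Trim if the last cycle overshot the target size.
    while len(selected) > target_size:
        worst = max(selected, key=lambda f: score([g for g in selected if g != f]))
        selected.remove(worst)
    return selected


# Toy score: additive weights, with a penalty because "a" and "b" are redundant.
WEIGHTS = {"a": 5, "b": 4, "c": 3, "d": 1, "e": 0}


def toy_score(subset):
    penalty = 6 if "a" in subset and "b" in subset else 0
    return sum(WEIGHTS[f] for f in subset) - penalty


result = add_n_remove_r(list("abcde"), toy_score)
print(sorted(result))  # ['a', 'c', 'd']
```

    The backward steps are what let this search undo an early addition that later proves redundant, at the cost of many more score evaluations than plain forward selection, which is consistent with the abstract's remark that the method is accurate but slow.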

  20. A Variable-Selection Heuristic for K-Means Clustering.

    Science.gov (United States)

    Brusco, Michael J.; Cradit, J. Dennis

    2001-01-01

    Presents a variable selection heuristic for nonhierarchical (K-means) cluster analysis based on the adjusted Rand index for measuring cluster recovery. Subjected the heuristic to Monte Carlo testing across more than 2,200 datasets. Results indicate that the heuristic is extremely effective at eliminating masking variables. (SLD)
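
    The adjusted Rand index on which this heuristic relies can be computed from two label vectors as in the sketch below. It uses the standard pair-counting formula; the function name is chosen for illustration, and the variable-selection heuristic itself is not reproduced.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same objects."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0
    # Pairs placed together in both partitions, in partition A, and in partition B.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / total_pairs
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both partitions put everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # identical up to relabeling
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))  # worse than chance agreement
```

    A value of 1 means perfect cluster recovery, values near 0 mean agreement no better than chance, and negative values are possible.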

  1. Variable Selection for Regression Models of Percentile Flows

    Science.gov (United States)

    Fouad, G.

    2017-12-01

    Percentile flows describe the flow magnitude equaled or exceeded for a given percent of time, and are widely used in water resource management. However, these statistics are normally unavailable since most basins are ungauged. Percentile flows of ungauged basins are often predicted using regression models based on readily observable basin characteristics, such as mean elevation. The number of these independent variables is too large to evaluate all possible models. A subset of models is typically evaluated using automatic procedures, like stepwise regression. This ignores a large variety of methods from the field of feature (variable) selection and physical understanding of percentile flows. A study of 918 basins in the United States was conducted to compare an automatic regression procedure to the following variable selection methods: (1) principal component analysis, (2) correlation analysis, (3) random forests, (4) genetic programming, (5) Bayesian networks, and (6) physical understanding. The automatic regression procedure only performed better than principal component analysis. Poor performance of the regression procedure was due to a commonly used filter for multicollinearity, which rejected the strongest models because they had cross-correlated independent variables. Multicollinearity did not decrease model performance in validation because of a representative set of calibration basins. Variable selection methods based strictly on predictive power (numbers 2-5 from above) performed similarly, likely indicating a limit to the predictive power of the variables. Similar performance was also reached using variables selected based on physical understanding, a finding that substantiates recent calls to emphasize physical understanding in modeling for predictions in ungauged basins. The strongest variables highlighted the importance of geology and land cover, whereas widely used topographic variables were the weakest predictors. Variables suffered from a high
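
    The definition of a percentile flow given at the start of this abstract can be made concrete with a short sketch. It uses a simple rank-based convention on an observed flow record; the function name and the convention for ties and rounding are assumptions, and agencies may use interpolating variants.

```python
import math


def percentile_flow(flows, percent):
    """Flow magnitude equaled or exceeded for `percent` percent of the record."""
    if not flows:
        raise ValueError("flow record is empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ranked = sorted(flows, reverse=True)
    # Smallest rank k such that at least `percent` percent of values are >= ranked[k - 1].
    k = math.ceil(percent * len(ranked) / 100)
    return ranked[k - 1]


daily_flows = list(range(1, 101))  # toy record: flows of 1..100 units
print(percentile_flow(daily_flows, 90))  # low-flow statistic: 11
print(percentile_flow(daily_flows, 10))  # high-flow statistic: 91
```

    In the toy record, 90 of the 100 values are at least 11, so 11 is the flow equaled or exceeded 90 percent of the time.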

  2. Classification and quantitation of milk powder by near-infrared spectroscopy and mutual information-based variable selection and partial least squares

    Science.gov (United States)

    Chen, Hui; Tan, Chao; Lin, Zan; Wu, Tong

    2018-01-01

    Milk is among the most popular nutrient sources worldwide and is of great interest due to its beneficial medicinal properties. The feasibility of classifying milk powder samples with respect to their brands and of determining protein concentration is investigated by NIR spectroscopy along with chemometrics. Two datasets were prepared for the experiment. One contains 179 samples of four brands for classification and the other contains 30 samples for quantitative analysis. Principal component analysis (PCA) was used for exploratory analysis. Based on an effective model-independent variable selection method, i.e., minimal-redundancy maximal-relevance (MRMR), only 18 variables were selected to construct a partial least-square discriminant analysis (PLS-DA) model. On the test set, the PLS-DA model based on the selected variable set was compared with the full-spectrum PLS-DA model, both of which achieved 100% accuracy. In quantitative analysis, the partial least-square regression (PLSR) model constructed from the selected subset of 260 variables significantly outperforms the full-spectrum model. It seems that the combination of NIR spectroscopy, MRMR and PLS-DA or PLSR is a powerful tool for classifying different brands of milk and determining the protein content.

  3. Variable selection and estimation for longitudinal survey data

    KAUST Repository

    Wang, Li

    2014-09-01

    There is wide interest in studying longitudinal surveys where sample subjects are observed successively over time. Longitudinal surveys have been used in many areas today, for example, in the health and social sciences, to explore relationships or to identify significant variables in regression settings. This paper develops a general strategy for the model selection problem in longitudinal sample surveys. A survey weighted penalized estimating equation approach is proposed to select significant variables and estimate the coefficients simultaneously. The proposed estimators are design consistent and perform as well as the oracle procedure would if the correct submodel were known. The estimating function bootstrap is applied to obtain the standard errors of the estimated parameters with good accuracy. A fast and efficient variable selection algorithm is developed to identify significant variables for complex longitudinal survey data. Simulated examples are presented to show the usefulness of the proposed methodology under various model settings and sampling designs. © 2014 Elsevier Inc.

  4. Variable selection for mixture and promotion time cure rate models.

    Science.gov (United States)

    Masud, Abdullah; Tu, Wanzhu; Yu, Zhangsheng

    2016-11-16

    Failure-time data with cured patients are common in clinical studies. Data from these studies are typically analyzed with cure rate models. Variable selection methods have not been well developed for cure rate models. In this research, we propose two least absolute shrinkage and selection operator (LASSO)-based methods for variable selection in mixture and promotion time cure models with parametric or nonparametric baseline hazards. We conduct an extensive simulation study to assess the operating characteristics of the proposed methods. We illustrate the use of the methods using data from a study of childhood wheezing. © The Author(s) 2016.

  5. A novel variable selection approach that iteratively optimizes variable space using weighted binary matrix sampling.

    Science.gov (United States)

    Deng, Bai-chuan; Yun, Yong-huan; Liang, Yi-zeng; Yi, Lun-zhao

    2014-10-07

    In this study, a new optimization algorithm called the Variable Iterative Space Shrinkage Approach (VISSA) that is based on the idea of model population analysis (MPA) is proposed for variable selection. Unlike most of the existing optimization methods for variable selection, VISSA statistically evaluates the performance of variable space in each step of optimization. Weighted binary matrix sampling (WBMS) is proposed to generate sub-models that span the variable subspace. Two rules are highlighted during the optimization procedure. First, the variable space shrinks in each step. Second, the new variable space outperforms the previous one. The second rule, which is rarely satisfied in most of the existing methods, is the core of the VISSA strategy. Compared with some promising variable selection methods such as competitive adaptive reweighted sampling (CARS), Monte Carlo uninformative variable elimination (MCUVE) and iteratively retaining informative variables (IRIV), VISSA showed better prediction ability for the calibration of NIR data. In addition, VISSA is user-friendly; only a few insensitive parameters are needed, and the program terminates automatically without any additional conditions. The Matlab codes for implementing VISSA are freely available on the website: https://sourceforge.net/projects/multivariateanalysis/files/VISSA/.
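
    A minimal sketch of the weighted binary matrix sampling step may help make the abstract concrete. It is an illustration under stated assumptions, not the published Matlab code: the function names, the rule that a column holds round(weight × number of sub-models) ones, and the keep_fraction parameter are all hypothetical choices made for this example.

```python
import random

def wbms(weights, n_submodels, rng):
    """Weighted binary matrix sampling (sketch).

    Returns an n_submodels x n_variables 0/1 matrix. Column j holds
    round(weights[j] * n_submodels) ones, shuffled so that each row
    (one sub-model) receives a random combination of variables.
    """
    columns = []
    for w in weights:
        n_ones = round(w * n_submodels)
        col = [1] * n_ones + [0] * (n_submodels - n_ones)
        rng.shuffle(col)
        columns.append(col)
    # Transpose so that each row describes one sub-model.
    return [list(row) for row in zip(*columns)]

def update_weights(matrix, errors, keep_fraction=0.1):
    """New weight of a variable = its frequency among the best sub-models.

    errors[i] is the (e.g. cross-validated) error of sub-model i;
    keep_fraction is a hypothetical tuning parameter.
    """
    n_keep = max(1, int(len(matrix) * keep_fraction))
    best_rows = sorted(range(len(matrix)), key=lambda i: errors[i])[:n_keep]
    n_vars = len(matrix[0])
    return [sum(matrix[i][j] for i in best_rows) / n_keep for j in range(n_vars)]

# Usage: sample 8 sub-models over 4 variables with a seeded generator.
sampled = wbms([0.5, 1.0, 0.0, 0.25], 8, random.Random(0))
column_sums = [sum(row[j] for row in sampled) for j in range(4)]
```

    Iterating the two functions shrinks the variable space in the sense described above: a variable whose weight reaches 0 is never sampled again, and one whose weight reaches 1 appears in every sub-model.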

  6. Variable selection and model choice in geoadditive regression models.

    Science.gov (United States)

    Kneib, Thomas; Hothorn, Torsten; Tutz, Gerhard

    2009-06-01

    Model choice and variable selection are issues of major concern in practical regression analyses, arising in many biometric applications such as habitat suitability analyses, where the aim is to identify the influence of potentially many environmental conditions on certain species. We describe regression models for breeding bird communities that facilitate both model choice and variable selection, by a boosting algorithm that works within a class of geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, and varying coefficients. The major modeling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a smooth component with one degree of freedom to obtain a fair comparison between the model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that automatically performs model choice and variable selection.

  7. The Properties of Model Selection when Retaining Theory Variables

    DEFF Research Database (Denmark)

    Hendry, David F.; Johansen, Søren

    Economic theories are often fitted directly to data to avoid possible model selection biases. We show that embedding a theory model that specifies the correct set of m relevant exogenous variables, x{t}, within the larger set of m+k candidate variables, (x{t},w{t}), then selection over the second...... set by their statistical significance can be undertaken without affecting the estimator distribution of the theory parameters. This strategy returns the theory-parameter estimates when the theory is correct, yet protects against the theory being under-specified because some w{t} are relevant....

  8. Exhaustive Search for Sparse Variable Selection in Linear Regression

    Science.gov (United States)

    Igarashi, Yasuhiko; Takenaka, Hikaru; Nakanishi-Ohno, Yoshinori; Uemura, Makoto; Ikeda, Shiro; Okada, Masato

    2018-04-01

    We propose a K-sparse exhaustive search (ES-K) method and a K-sparse approximate exhaustive search method (AES-K) for selecting variables in linear regression. With these methods, K-sparse combinations of variables are tested exhaustively assuming that the optimal combination of explanatory variables is K-sparse. By collecting the results of exhaustively computing ES-K, various approximate methods for selecting sparse variables can be summarized as density of states. With this density of states, we can compare different methods for selecting sparse variables such as relaxation and sampling. For large problems where the combinatorial explosion of explanatory variables is crucial, the AES-K method enables density of states to be effectively reconstructed by using the replica-exchange Monte Carlo method and the multiple histogram method. Applying the ES-K and AES-K methods to type Ia supernova data, we confirmed the conventional understanding in astronomy when an appropriate K is given beforehand. However, we found it difficult to determine K from the data. Using virtual measurement and analysis, we argue that this is caused by data shortage.
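
    The exhaustive step can be sketched in a few lines. The sketch below is only an illustration of the idea (score every K-subset by its least-squares residual), not the authors' code: the function names are hypothetical, no intercept is fitted, and the normal equations are solved by plain Gaussian elimination, which assumes the selected columns are linearly independent.

```python
from itertools import combinations

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def rss(columns, y):
    """Residual sum of squares of the least-squares fit of y on the columns (no intercept)."""
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    rhs = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    beta = solve(gram, rhs)
    fitted = [sum(b * col[i] for b, col in zip(beta, columns)) for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def exhaustive_search(columns, y, k):
    """Score every k-subset of columns; return (rss, subset) pairs, best first."""
    results = [(rss([columns[j] for j in subset], y), subset)
               for subset in combinations(range(len(columns)), k)]
    return sorted(results)

# Toy data (made up): y depends only on columns 0 and 2.
cols = [[1.0, 0.0, 0.0, 1.0],
        [0.0, 1.0, 0.0, 1.0],
        [0.0, 0.0, 1.0, 1.0],
        [1.0, 1.0, 1.0, 0.0]]
y = [2.0 * a - c for a, c in zip(cols[0], cols[2])]
ranking = exhaustive_search(cols, y, 2)
```

    The sorted list of scores over all subsets is the raw material for a density-of-states style summary; the number of subsets grows combinatorially with the number of variables, which is why the abstract turns to an approximate search for large problems.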

  9. Variable selection in multivariate calibration based on clustering of variable concept.

    Science.gov (United States)

    Farrokhnia, Maryam; Karimi, Sadegh

    2016-01-01

    We recently proposed a new variable selection algorithm based on the clustering of variables concept (CLoVA) for classification problems. The same idea is applied here to a regression problem, and the results are compared with those of conventional variable selection strategies for PLS. The basic idea behind the clustering of variables is that the instrument channels are grouped into clusters by a clustering algorithm; the spectral data of each cluster are then subjected to PLS regression. Several real data sets (Cargill corn, Biscuit dough, ACE QSAR, Soy, and Tablet) were used to evaluate the influence of the clustering of variables on the prediction performance of PLS. In almost all cases, the statistical parameters, especially the prediction error, show the superiority of CLoVA-PLS over other variable selection strategies. Finally, synergy clustering of variables (sCLoVA-PLS), which uses a combination of clusters, is proposed as an efficient modification of the CLoVA algorithm. The statistical parameters obtained indicate that variable clustering can separate the useful part of the data from the redundant part, so that a stable model can be built on the informative clusters. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. ENSEMBLE VARIABILITY OF NEAR-INFRARED-SELECTED ACTIVE GALACTIC NUCLEI

    International Nuclear Information System (INIS)

    Kouzuma, S.; Yamaoka, H.

    2012-01-01

    We present the properties of the ensemble variability V for nearly 5000 near-infrared active galactic nuclei (AGNs) selected from the catalog of Quasars and Active Galactic Nuclei (13th Edition) and the SDSS-DR7 quasar catalog. From three near-infrared point source catalogs, namely, Two Micron All Sky Survey (2MASS), Deep Near Infrared Survey (DENIS), and UKIDSS/LAS catalogs, we extract 2MASS-DENIS and 2MASS-UKIDSS counterparts for cataloged AGNs by cross-identification between catalogs. We further select variable AGNs based on an optimal criterion for selecting the variable sources. The sample objects are divided into subsets according to whether near-infrared light originates by optical emission or by near-infrared emission in the rest frame; and we examine the correlations of the ensemble variability with the rest-frame wavelength, redshift, luminosity, and rest-frame time lag. In addition, we also examine the correlations of variability amplitude with optical variability, radio intensity, and radio-to-optical flux ratio. The rest-frame optical variability of our samples shows negative correlations with luminosity and positive correlations with rest-frame time lag (i.e., the structure function, SF), and this result is consistent with previous analyses. However, no well-known negative correlation exists between the rest-frame wavelength and optical variability. This inconsistency might be due to a biased sampling of high-redshift AGNs. Near-infrared variability in the rest frame is anticorrelated with the rest-frame wavelength, which is consistent with previous suggestions. However, correlations of near-infrared variability with luminosity and rest-frame time lag are the opposite of these correlations of the optical variability; that is, the near-infrared variability is positively correlated with luminosity but negatively correlated with the rest-frame time lag. 
Because these trends are qualitatively consistent with the properties of radio-loud quasars reported

  11. Source identification of petroleum hydrocarbons in soil and sediments from Iguaçu River Watershed, Paraná, Brazil using the CHEMSIC method (CHEMometric analysis of Selected Ion Chromatograms).

    Science.gov (United States)

    Gallotta, Fabiana D C; Christensen, Jan H

    2012-04-27

    A chemometric method based on principal component analysis (PCA) of pre-processed and combined sections of selected ion chromatograms (SICs) is used to characterise the hydrocarbon profiles in soil and sediment from Araucária, Guajuvira, General Lúcio and Balsa Nova Municipalities (Iguaçu River Watershed, Paraná, Brazil) and to indicate the main sources of hydrocarbon pollution. The study includes 38 SICs of polycyclic aromatic compounds (PACs) and four of petroleum biomarkers in two separate analyses. The most contaminated samples are inside the Presidente Getúlio Vargas Refinery area. These samples represent a petrogenic pattern and different weathering degrees. Samples from outside the refinery area are either less contaminated or not contaminated, or contain mixtures of diagenetic, pyrogenic and petrogenic inputs in which different proportions predominate. The locations farthest away from industrial activity (Balsa Nova) contain the lowest levels of PAC contamination. There is no evidence to support positive matches between the samples from outside the refinery area and the Cusiana spilled oil. Copyright © 2012 Elsevier B.V. All rights reserved.

  12. CHARACTERIZING THE OPTICAL VARIABILITY OF BRIGHT BLAZARS: VARIABILITY-BASED SELECTION OF FERMI ACTIVE GALACTIC NUCLEI

    International Nuclear Information System (INIS)

    Ruan, John J.; Anderson, Scott F.; MacLeod, Chelsea L.; Becker, Andrew C.; Davenport, James R. A.; Ivezić, Željko; Burnett, T. H.; Kochanek, Christopher S.; Plotkin, Richard M.; Sesar, Branimir; Stuart, J. Scott

    2012-01-01

    We investigate the use of optical photometric variability to select and identify blazars in large-scale time-domain surveys, in part to aid in the identification of blazar counterparts to the ∼30% of γ-ray sources in the Fermi 2FGL catalog still lacking reliable associations. Using data from the optical LINEAR asteroid survey, we characterize the optical variability of blazars by fitting a damped random walk model to individual light curves with two main model parameters, the characteristic timescales of variability τ, and driving amplitudes on short timescales σ-circumflex. Imposing cuts on minimum τ and σ-circumflex allows for blazar selection with high efficiency E and completeness C. To test the efficacy of this approach, we apply this method to optically variable LINEAR objects that fall within the several-arcminute error ellipses of γ-ray sources in the Fermi 2FGL catalog. Despite the extreme stellar contamination at the shallow depth of the LINEAR survey, we are able to recover previously associated optical counterparts to Fermi active galactic nuclei with E ≥ 88% and C = 88% in Fermi 95% confidence error ellipses having semimajor axis r < 8'. We find that the suggested radio counterpart to Fermi source 2FGL J1649.6+5238 has optical variability consistent with other γ-ray blazars and is likely to be the γ-ray source. Our results suggest that the variability of the non-thermal jet emission in blazars is stochastic in nature, with unique variability properties due to the effects of relativistic beaming. After correcting for beaming, we estimate that the characteristic timescale of blazar variability is ∼3 years in the rest frame of the jet, in contrast with the ∼320 day disk flux timescale observed in quasars. The variability-based selection method presented will be useful for blazar identification in time-domain optical surveys and is also a probe of jet physics.

  13. Portfolio Selection Based on Distance between Fuzzy Variables

    Directory of Open Access Journals (Sweden)

    Weiyi Qian

    2014-01-01

    This paper studies the portfolio selection problem in a fuzzy environment. We introduce a new simple method in which the distance between fuzzy variables is used to measure the divergence of the fuzzy investment return from a prior one. Firstly, two new mathematical models are proposed by expressing divergence as distance, investment return as expected value, and risk as variance and semivariance, respectively. Secondly, the crisp forms of the new models are also provided for different types of fuzzy variables. Finally, several numerical examples are given to illustrate the effectiveness of the proposed approach.

  14. Protein construct storage: Bayesian variable selection and prediction with mixtures.

    Science.gov (United States)

    Clyde, M A; Parmigiani, G

    1998-07-01

    Determining optimal conditions for protein storage while maintaining a high level of protein activity is an important question in pharmaceutical research. A designed experiment based on a space-filling design was conducted to understand the effects of factors affecting protein storage and to establish optimal storage conditions. Different model-selection strategies to identify important factors may lead to very different answers about optimal conditions. Uncertainty about which factors are important, or model uncertainty, can be a critical issue in decision-making. We use Bayesian variable selection methods for linear models to identify important variables in the protein storage data, while accounting for model uncertainty. We also use the Bayesian framework to build predictions based on a large family of models, rather than an individual model, and to evaluate the probability that certain candidate storage conditions are optimal.

  15. Mahalanobis distance and variable selection to optimize dose response

    International Nuclear Information System (INIS)

    Moore, D.H. II; Bennett, D.E.; Wyrobek, A.J.; Kranzler, D.

    1979-01-01

    A battery of statistical techniques is combined to improve detection of low-level dose response. First, Mahalanobis distances are used to classify objects as normal or abnormal. Then the proportion classified abnormal is regressed on dose. Finally, a subset of regressor variables is selected which maximizes the slope of the dose response line. Use of the techniques is illustrated by application to mouse sperm damaged by low doses of x-rays.
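
    The first two steps can be sketched for two measured variables. This is a hedged illustration, not the authors' procedure: the function names, the control points and the distance threshold are invented for the example, and the 2 × 2 covariance matrix is inverted in closed form.

```python
def mean_and_cov(points):
    """Sample mean and covariance terms (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def proportion_abnormal(points, mean, cov, threshold_sq):
    """Fraction of points whose squared distance exceeds the (hypothetical) threshold."""
    flags = [mahalanobis_sq(p, mean, cov) > threshold_sq for p in points]
    return sum(flags) / len(flags)

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (the dose-response slope)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

# Reference distribution estimated from made-up control measurements.
controls = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mean, cov = mean_and_cov(controls)
```

    Under these assumptions, the third step of the abstract would amount to recomputing the slope for each candidate subset of variables and keeping the subset that gives the steepest line.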

  16. STEPWISE SELECTION OF VARIABLES IN DEA USING CONTRIBUTION LOADS

    Directory of Open Access Journals (Sweden)

    Fernando Fernandez-Palacin

    In this paper, we propose a new methodology for variable selection in Data Envelopment Analysis (DEA). The methodology is based on an internal measure which evaluates the contribution of each variable to the calculation of the efficiency scores of DMUs. In order to apply the proposed method, an algorithm, known as “ADEA”, was developed and implemented in R. Step by step, the algorithm maximizes the load of the variable (input or output) which contributes least to the calculation of the efficiency scores, redistributing the weights of the variables without altering the efficiency scores of the DMUs. Once the weights have been redistributed, if the lowest contribution does not reach a previously given critical value, a variable with minimum contribution is removed from the model and the DEA is solved again. The algorithm stops when all variables reach a given contribution load to the DEA or when no more variables can be removed. In this way, and contrary to what is usual, the algorithm provides a clear stopping rule. In both cases, the efficiencies obtained from the DEA can be considered suitable and correctly interpreted in terms of the remaining variables and their loads; moreover, the algorithm provides a sequence of alternative nested models (potential solutions) that can be evaluated according to an external criterion. To illustrate the procedure, we have applied the proposed methodology to obtain a research ranking of Spanish public universities. In this case, at each step of the algorithm, the critical value is obtained from a simulation study.

  17. Two-step variable selection in quantile regression models

    Directory of Open Access Journals (Sweden)

    FAN Yali

    2015-06-01

    We propose a two-step variable selection procedure for high-dimensional quantile regression, in which the dimension of the covariates, p_n, is much larger than the sample size n. In the first step, we apply an ℓ1 penalty, and we demonstrate that the first-step penalized estimator with the LASSO penalty can reduce the model from an ultra-high-dimensional one to a model whose size has the same order as that of the true model, and that the selected model can cover the true model. The second step excludes the remaining irrelevant covariates by applying the adaptive LASSO penalty to the reduced model obtained from the first step. Under some regularity conditions, we show that our procedure enjoys model selection consistency. We conduct a simulation study and a real data analysis to evaluate the finite sample performance of the proposed approach.

  18. Characterizing the Optical Variability of Bright Blazars: Variability-based Selection of Fermi Active Galactic Nuclei

    Science.gov (United States)

    Ruan, John J.; Anderson, Scott F.; MacLeod, Chelsea L.; Becker, Andrew C.; Burnett, T. H.; Davenport, James R. A.; Ivezić, Željko; Kochanek, Christopher S.; Plotkin, Richard M.; Sesar, Branimir; Stuart, J. Scott

    2012-11-01

    We investigate the use of optical photometric variability to select and identify blazars in large-scale time-domain surveys, in part to aid in the identification of blazar counterparts to the ~30% of γ-ray sources in the Fermi 2FGL catalog still lacking reliable associations. Using data from the optical LINEAR asteroid survey, we characterize the optical variability of blazars by fitting a damped random walk model to individual light curves with two main model parameters, the characteristic timescales of variability τ, and driving amplitudes on short timescales σ̂. Imposing cuts on minimum τ and σ̂ allows for blazar selection with high efficiency E and completeness C. To test the efficacy of this approach, we apply this method to optically variable LINEAR objects that fall within the several-arcminute error ellipses of γ-ray sources in the Fermi 2FGL catalog. Despite the extreme stellar contamination at the shallow depth of the LINEAR survey, we are able to recover previously associated optical counterparts to Fermi active galactic nuclei with E ≥ 88% and C = 88% in Fermi 95% confidence error ellipses having semimajor axis r < 8'. We find that the suggested radio counterpart to Fermi source 2FGL J1649.6+5238 has optical variability consistent with other γ-ray blazars and is likely to be the γ-ray source. Our results suggest that the variability of the non-thermal jet emission in blazars is stochastic in nature, with unique variability properties due to the effects of relativistic beaming. After correcting for beaming, we estimate that the characteristic timescale of blazar variability is ~3 years in the rest frame of the jet, in contrast with the ~320 day disk flux timescale observed in quasars. The variability-based selection method presented will be useful for blazar identification in time-domain optical surveys and is also a probe of jet physics.

  19. A Simple K-Map Based Variable Selection Scheme in the Direct ...

    African Journals Online (AJOL)

    A multiplexer with (n-1) data select inputs can directly realise a function of n variables. In this paper, a simple k-map based variable selection scheme is proposed such that an n variable logic function can be synthesised using a multiplexer with (n-q) data input variables and q data select variables. The procedure is based on ...
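
    The opening statement can be checked on a small case: a 3-variable function realised with a 4-to-1 multiplexer (two select lines), where each data input is tied to 0, 1, the third variable C, or its complement. The sketch below illustrates only this standard construction, not the paper's k-map selection scheme; the function names and the choice of A and B as select variables are assumptions made for the example.

```python
from itertools import product

def mux_data_inputs(truth_fn):
    """For f(a, b, c), return the four data inputs of a 4-to-1 multiplexer
    whose select lines are (a, b). Each entry is '0', '1', 'C' or "C'"
    (the complement of c), read off from f(a, b, 0) and f(a, b, 1)."""
    labels = {(0, 0): "0", (1, 1): "1", (0, 1): "C", (1, 0): "C'"}
    return [labels[(truth_fn(a, b, 0), truth_fn(a, b, 1))]
            for a, b in product((0, 1), repeat=2)]

def mux_eval(data_inputs, a, b, c):
    """Evaluate the multiplexer: (a, b) selects input 2a + b, which depends only on c."""
    residual = {"0": 0, "1": 1, "C": c, "C'": 1 - c}
    return residual[data_inputs[2 * a + b]]

def majority(a, b, c):
    """Example function: 1 when at least two of the three inputs are 1."""
    return int(a + b + c >= 2)

inputs = mux_data_inputs(majority)
```

    For the majority function the data inputs come out as 0, C, C and 1, so the only logic outside the multiplexer is a wire to C. Choosing different select variables generally changes which residual functions appear on the data inputs, which is the choice a variable selection scheme has to make.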

  20. Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.

    Science.gov (United States)

    Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen

    2011-04-01

    Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.

  1. Isoenzymatic variability in tropical maize populations under reciprocal recurrent selection

    Directory of Open Access Journals (Sweden)

    Pinto Luciana Rossini

    2003-01-01

    Maize (Zea mays L.) is one of the crops in which genetic variability has been extensively studied at isoenzymatic loci. The genetic variability of the maize populations BR-105 and BR-106, and of the synthetics IG-3 and IG-4, obtained after one cycle of high-intensity reciprocal recurrent selection (RRS), was investigated at seven isoenzymatic loci. A total of twenty alleles were identified, and most of the private alleles were found in the BR-106 population. One cycle of RRS caused reductions of 12% in the number of alleles in both populations. Changes in allele frequencies were also observed between populations and synthetics, mainly for the Est 2 locus. Populations presented similar values for the number of alleles per locus, percentage of polymorphic loci, and observed and expected heterozygosities. A decrease in the genetic variation values was observed for the synthetics as a consequence of genetic drift effects and reduction of the effective population sizes. The distribution of the genetic diversity within and between populations revealed that most of the diversity was maintained within them, i.e. BR-105 x BR-106 (G_ST = 3.5%) and IG-3 x IG-4 (G_ST = 4.0%). The genetic distances between populations and synthetics increased by approximately 21%. An increase in the genetic divergence between the populations occurred without limiting new selection procedures.
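
    For readers unfamiliar with the statistic, G_ST is the share of total gene diversity that lies between populations, G_ST = (H_T − H_S) / H_T, where H_S is the mean expected heterozygosity within populations and H_T is the expected heterozygosity computed from the mean allele frequencies. The sketch below evaluates this for a single locus with equally weighted populations; the function names and the allele frequencies are made up for illustration and are not data from the study.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)

def g_st(pop_freqs):
    """Single-locus G_ST for equally weighted populations.

    pop_freqs: one list of allele frequencies per population,
    all lists in the same allele order.
    """
    n = len(pop_freqs)
    h_s = sum(expected_heterozygosity(f) for f in pop_freqs) / n
    mean_freqs = [sum(f[i] for f in pop_freqs) / n
                  for i in range(len(pop_freqs[0]))]
    h_t = expected_heterozygosity(mean_freqs)
    return (h_t - h_s) / h_t

# Two hypothetical populations with mildly different allele frequencies.
example = g_st([[0.6, 0.4], [0.4, 0.6]])
```

    With these invented frequencies about 4% of the diversity lies between the two populations, the same order of magnitude as the values quoted in the abstract, which is what "most of the diversity was maintained within them" expresses.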

  2. Chaotic Dynamical State Variables Selection Procedure Based Image Encryption Scheme

    Directory of Open Access Journals (Sweden)

    Zia Bashir

    2017-12-01

    In the modern digital era, the use of computer technologies such as smartphones, tablets and the Internet, as well as the enormous quantity of confidential information being converted into digital form, has raised security issues. This, in turn, has led to rapid developments in cryptography, due to the imminent need for system security. Low-dimensional chaotic systems have low complexity and key space, yet they achieve high encryption speed. An image encryption scheme is proposed that uses reasonable resources without compromising security. We introduce a chaotic dynamic state variables selection procedure (CDSVSP) to use all state variables of a hyper-chaotic four-dimensional dynamical system. As a result, fewer iterations of the dynamical system are required and resources are saved, making the algorithm fast and suitable for practical use. The simulation results of security and other miscellaneous tests demonstrate that the suggested algorithm excels at robustness, security and high-speed encryption.

  3. Estimation and variable selection for generalized additive partial linear models

    KAUST Repository

    Wang, Li

    2011-08-01

    We study generalized additive partial linear models, proposing the use of polynomial spline smoothing for estimation of nonparametric functions, and deriving quasi-likelihood based estimators for the linear parameters. We establish asymptotic normality for the estimators of the parametric components. The procedure avoids solving large systems of equations as in kernel-based procedures and thus results in gains in computational simplicity. We further develop a class of variable selection procedures for the linear parameters by employing a nonconcave penalized quasi-likelihood, which is shown to have an asymptotic oracle property. Monte Carlo simulations and an empirical example are presented for illustration. © Institute of Mathematical Statistics, 2011.

  4. Comparison of climate envelope models developed using expert-selected variables versus statistical selection

    Science.gov (United States)

    Brandt, Laura A.; Benscoter, Allison; Harvey, Rebecca G.; Speroterra, Carolina; Bucklin, David N.; Romañach, Stephanie; Watling, James I.; Mazzotti, Frank J.

    2017-01-01

    Climate envelope models are widely used to describe potential future distribution of species under different climate change scenarios. It is broadly recognized that there are both strengths and limitations to using climate envelope models and that outcomes are sensitive to initial assumptions, inputs, and modeling methods. Selection of predictor variables, a central step in modeling, is one of the areas where different techniques can yield varying results. Selection of climate variables to use as predictors is often done using statistical approaches that develop correlations between occurrences and climate data. These approaches have received criticism in that they rely on the statistical properties of the data rather than directly incorporating biological information about species responses to temperature and precipitation. We evaluated and compared models and prediction maps for 15 threatened or endangered species in Florida based on two variable selection techniques: expert opinion and a statistical method. We compared model performance between these two approaches for contemporary predictions, and the spatial correlation, spatial overlap and area predicted for contemporary and future climate predictions. In general, experts identified more variables as being important than the statistical method and there was low overlap in the variable sets (0.9 for area under the curve (AUC) and >0.7 for true skill statistic (TSS). Spatial overlap, which compares the spatial configuration between maps constructed using the different variable selection techniques, was only moderate overall (about 60%), with a great deal of variability across species. Difference in spatial overlap was even greater under future climate projections, indicating additional divergence of model outputs from different variable selection techniques. Our work is in agreement with other studies which have found that for broad-scale species distribution modeling, using statistical methods of variable

  5. Selection for altruism through random drift in variable size populations

    Directory of Open Access Journals (Sweden)

    Houchmandzadeh Bahram

    2012-05-01

    Background: Altruistic behavior is defined as helping others at a cost to oneself and a lowered fitness. The lower fitness implies that altruists should be selected against, which is in contradiction with their widespread presence in nature. Present models of selection for altruism (kin or multilevel) show that altruistic behaviors can have ‘hidden’ advantages if the ‘common good’ produced by altruists is restricted to some related or unrelated groups. These models are mostly deterministic, or assume a frequency dependent fitness. Results: Evolutionary dynamics is a competition between deterministic selection pressure and stochastic events due to random sampling from one generation to the next. We show here that an altruistic allele extending the carrying capacity of the habitat can win by increasing the random drift of “selfish” alleles. In other terms, the fixation probability of altruistic genes can be higher than that of selfish ones, even though altruists have a smaller fitness. Moreover, when populations are geographically structured, the altruists' advantage can be highly amplified and the fixation probability of selfish genes can tend toward zero. The above results are obtained both by numerical and analytical calculations. Analytical results are obtained in the limit of large populations. Conclusions: The theory we present does not involve kin or multilevel selection, but is based on the existence of random drift in variable size populations. The model is a generalization of the original Fisher-Wright and Moran models where the carrying capacity depends on the number of altruists.

  6. Fluorescence Spectroscopy and Chemometric Modeling for Bioprocess Monitoring

    Directory of Open Access Journals (Sweden)

    Saskia M. Faassen

    2015-04-01

    On-line sensors for the detection of crucial process parameters are desirable for the monitoring, control and automation of processes in the biotechnology, food and pharma industry. Fluorescence spectroscopy is a highly developed and non-invasive technique that enables on-line measurement of substrate and product concentrations or the identification of characteristic process states. During a cultivation process, significant changes occur in the fluorescence spectra. By means of chemometric modeling, prediction models can be calculated and applied for process supervision and control to increase the quality and productivity of bioprocesses. A range of applications for different microorganisms and analytes has been proposed in recent years. This contribution provides an overview of different analysis methods for the measured fluorescence spectra and the model-building chemometric methods used for various microbial cultivations. Most of these processes are observed using the BioView® Sensor, thanks to its robustness and insensitivity to adverse process conditions. Beyond that, the PLS method is the most frequently used chemometric method for the calculation of process models and prediction of process variables.

  7. Fluorescence Spectroscopy and Chemometric Modeling for Bioprocess Monitoring

    Science.gov (United States)

    Faassen, Saskia M.; Hitzmann, Bernd

    2015-01-01

    On-line sensors for the detection of crucial process parameters are desirable for the monitoring, control and automation of processes in the biotechnology, food and pharma industry. Fluorescence spectroscopy is a highly developed and non-invasive technique that enables on-line measurement of substrate and product concentrations or the identification of characteristic process states. During a cultivation process, significant changes occur in the fluorescence spectra. By means of chemometric modeling, prediction models can be calculated and applied for process supervision and control to increase the quality and productivity of bioprocesses. A range of applications for different microorganisms and analytes has been proposed in recent years. This contribution provides an overview of different analysis methods for the measured fluorescence spectra and the model-building chemometric methods used for various microbial cultivations. Most of these processes are observed using the BioView® Sensor, thanks to its robustness and insensitivity to adverse process conditions. Beyond that, the PLS method is the most frequently used chemometric method for the calculation of process models and prediction of process variables. PMID:25942644

  8. Application of chemometric techniques to classify the quality of surface water in the watershed of the river Bermudez in Heredia, Costa Rica

    International Nuclear Information System (INIS)

    Herrera Murillo, Jorge; Rodriguez Roman, Susana; Solis Torres, Ligia Dina; Castro Delgado, Francisco

    2009-01-01

    Selected chemometric techniques (cluster analysis, principal component analysis and factor analysis) were applied to classify river water quality and evaluate pollution data. Fourteen physicochemical parameters were monitored at 10 stations located in the watershed of the river Bermudez, from August 2005 to February 2007. The results identified two natural clusters of monitoring sites with similar contamination characteristics, and identified COD (DQO), BOD (DBO), NO₃⁻, SO₄²⁻ and TSS (SST) as the main variables that discriminate between sampling sites. (author) [es

  9. Variable selection in near-infrared spectroscopy: Benchmarking of feature selection methods on biodiesel data

    International Nuclear Information System (INIS)

    Balabin, Roman M.; Smirnov, Sergey V.

    2011-01-01

    During the past several years, near-infrared (near-IR/NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields from petroleum to biomedical sectors. The NIR spectrum (above 4000 cm⁻¹) of a sample is typically measured by modern instruments at a few hundred wavelengths. Recently, considerable effort has been directed towards developing procedures to identify variables (wavelengths) that contribute useful information. Variable selection (VS) or feature selection, also called frequency selection or wavelength selection, is a critical step in data analysis for vibrational spectroscopy (infrared, Raman, or NIRS). In this paper, we compare the performance of 16 different feature selection methods for the prediction of properties of biodiesel fuel, including density, viscosity, methanol content, and water concentration. The feature selection algorithms tested include stepwise multiple linear regression (MLR-step), interval partial least squares regression (iPLS), backward iPLS (BiPLS), forward iPLS (FiPLS), moving window partial least squares regression (MWPLS), (modified) changeable size moving window partial least squares (CSMWPLS/MCSMWPLSR), searching combination moving window partial least squares (SCMWPLS), successive projections algorithm (SPA), uninformative variable elimination (UVE, including UVE-SPA), simulated annealing (SA), back-propagation artificial neural networks (BP-ANN), Kohonen artificial neural network (K-ANN), and genetic algorithms (GAs, including GA-iPLS). Two linear techniques for calibration model building, namely multiple linear regression (MLR) and partial least squares regression/projection to latent structures (PLS/PLSR), are used for the evaluation of biofuel properties. A comparison with a non-linear calibration model, artificial neural networks (ANN-MLP), is also provided. Discussion of gasoline, ethanol-gasoline (bioethanol), and diesel fuel data is presented. The results of other spectroscopic

  10. A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data

    Directory of Open Access Journals (Sweden)

    Himmelreich Uwe

    2009-07-01

    Background. Regularized regression methods such as principal component or partial least squares regression perform well in learning tasks on high dimensional spectral data, but cannot explicitly eliminate irrelevant features. The random forest classifier with its associated Gini feature importance, on the other hand, allows for an explicit feature elimination, but may not be optimally adapted to spectral data due to the topology of its constituent classification trees which are based on orthogonal splits in feature space. Results. We propose to combine the best of both approaches, and evaluated the joint use of a feature selection based on a recursive feature elimination using the Gini importance of random forests together with regularized classification methods on spectral data sets from medical diagnostics, chemotaxonomy, biomedical analytics, food science, and synthetically modified spectral data. Here, a feature selection using the Gini feature importance with a regularized classification by discriminant partial least squares regression performed as well as or better than a filtering according to different univariate statistical tests, or using regression coefficients in a backward feature elimination. It outperformed the direct application of the random forest classifier, or the direct application of the regularized classifiers on the full set of features. Conclusion. The Gini importance of the random forest provided superior means for measuring feature relevance on spectral data, but – on an optimal subset of features – the regularized classifiers might be preferable over the random forest classifier, in spite of their limitation to model linear dependencies only. A feature selection based on Gini importance, however, may precede a regularized linear classification to identify this optimal subset of features, and to earn a double benefit of both dimensionality reduction and the elimination of noise from the classification task.
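    The recursive feature elimination loop described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `importance_fn` is a hypothetical callback that would, in practice, refit a random forest on the surviving features and return their Gini importances, and `drop_fraction` is an assumed parameter that the abstract does not specify.

```python
def recursive_feature_elimination(features, importance_fn, n_keep, drop_fraction=0.2):
    """Repeatedly drop the least important features until n_keep remain.

    importance_fn (hypothetical) maps a list of feature names to a dict
    {name: importance}; it is called again after every elimination round.
    """
    remaining = list(features)
    while len(remaining) > n_keep:
        scores = importance_fn(remaining)  # recomputed on the surviving features
        n_drop = max(1, int(len(remaining) * drop_fraction))
        n_drop = min(n_drop, len(remaining) - n_keep)  # never go below n_keep
        ranked = sorted(remaining, key=lambda name: scores[name])  # least important first
        dropped = set(ranked[:n_drop])
        remaining = [name for name in remaining if name not in dropped]
    return remaining


# Toy stand-in for a refitted model: fixed, made-up importances.
FAKE_IMPORTANCE = {"f0": 0.05, "f1": 0.40, "f2": 0.10, "f3": 0.30, "f4": 0.01, "f5": 0.14}


def fake_importance_fn(names):
    return {name: FAKE_IMPORTANCE[name] for name in names}


selected = recursive_feature_elimination(list(FAKE_IMPORTANCE), fake_importance_fn, n_keep=2)
```

    The selected subset would then be handed to a regularized classifier such as discriminant PLS, which is the combination the abstract reports as performing best.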

  11. Ethnic variability in adiposity and cardiovascular risk: the variable disease selection hypothesis.

    Science.gov (United States)

    Wells, Jonathan C K

    2009-02-01

    Evidence increasingly suggests that ethnic differences in cardiovascular risk are partly mediated by adipose tissue biology, which refers to the regional distribution of adipose tissue and its differential metabolic activity. This paper proposes a novel evolutionary hypothesis for ethnic genetic variability in adipose tissue biology. Whereas medical interest focuses on the harmful effect of excess fat, the value of adipose tissue is greatest during chronic energy insufficiency. Following Neel's influential paper on the thrifty genotype, proposed to have been favoured by exposure to cycles of feast and famine, much effort has been devoted to searching for genetic markers of 'thrifty metabolism'. However, whether famine-induced starvation was the primary selective pressure on adipose tissue biology has been questioned, while the notion that fat primarily represents a buffer against starvation appears inconsistent with historical records of mortality during famines. This paper reviews evidence for the role played by adipose tissue in immune function and proposes that adipose tissue biology responds to selective pressures acting through infectious disease. Different diseases activate the immune system in different ways and induce different metabolic costs. It is hypothesized that exposure to different infectious disease burdens has favoured ethnic genetic variability in the anatomical location of, and metabolic profile of, adipose tissue depots.

  12. Sex-specific selection for MHC variability in Alpine chamois

    Directory of Open Access Journals (Sweden)

    Schaschl Helmut

    2012-02-01

    Background. In mammals, males typically have shorter lives than females. This difference is thought to be due to behavioural traits which enhance competitive abilities, and hence male reproductive success, but impair survival. Furthermore, in many species males usually show higher parasite burden than females. Consequently, the intensity of selection for genetic factors which reduce susceptibility to pathogens may differ between sexes. High variability at the major histocompatibility complex (MHC) genes is believed to be advantageous for detecting and combating the range of infectious agents present in the environment. Increased heterozygosity at these immune genes is expected to be important for individual longevity. However, whether males in natural populations benefit more from MHC heterozygosity than females has rarely been investigated. We investigated this question in a long-term study of free-living Alpine chamois (Rupicapra rupicapra), a polygynous mountain ungulate. Results. Here we show that male chamois survive significantly (P = 0.022) longer if heterozygous at the MHC class II DRB locus, whereas females do not. Improved survival of males was not a result of heterozygote advantage per se, as background heterozygosity (estimated across twelve microsatellite loci) did not change significantly with age. Furthermore, reproductively active males depleted their body fat reserves earlier than females, leading to significantly impaired survival rates in this sex (P ...). Conclusions. Increased MHC class II DRB heterozygosity with age in males suggests that MHC heterozygous males survive longer than homozygotes. Reproductively active males appear to be less likely to survive than females, most likely because of the energetic challenge of the winter rut, accompanied by earlier depletion of their body fat stores, and a generally higher parasite burden. This scenario renders the MHC-mediated immune response more important for males than for females.

  13. Birth order and selected work-related personality variables.

    Science.gov (United States)

    Phillips, A S; Bedeian, A G; Mossholder, K W; Touliatos, J

    1988-12-01

    A possible link between birth order and various individual characteristics (e.g., intelligence, potential eminence, need for achievement, sociability) has been suggested by personality theorists such as Adler for over a century. The present study examines whether birth order is associated with selected personality variables that may be related to various work outcomes. Three of seven hypotheses were supported, and the effect sizes for these were small. Firstborns scored significantly higher than later borns on measures of dominance, good impression, and achievement via conformity. No differences between firstborns and later borns were found in managerial potential, work orientation, achievement via independence, and sociability. The study's sample consisted of 835 public, government, and industrial accountants responding to a national US survey of accounting professionals. The nature of the sample may have been partially responsible for the results obtained. Its homogeneity may have caused any birth order effects to wash out. It can be argued that successful membership in the accountancy profession requires internalization of a set of prescribed rules and standards. It may be that accountants as a group are locked in to a behavioral framework. Any differentiation would result from spurious interpersonal differences, not from predictable birth-order related characteristics. A final interpretation is that birth order effects are nonexistent or statistical artifacts. Given the present data and particularistic sample, however, the authors have insufficient information from which to draw such a conclusion.

  14. [Identification of two varieties of Citri Fructus by fingerprint and chemometrics].

    Science.gov (United States)

    Su, Jing-hua; Zhang, Chao; Sun, Lei; Gu, Bing-ren; Ma, Shuang-cheng

    2015-06-01

    Citri Fructus identification by fingerprint and chemometrics was investigated in this paper. Twenty-three Citri Fructus samples were collected, representing the two varieties recorded in the Chinese Pharmacopoeia, Citrus wilsonii and C. medica. HPLC chromatograms were obtained. The components were partly identified by reference substances, and a common pattern was then established for chemometric analysis. Similarity analysis, principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA) and a hierarchical cluster analysis heatmap were applied. The results indicated that C. wilsonii and C. medica could be clearly classified using a common pattern containing twenty-five characteristic peaks. In addition, preliminary pattern recognition verified the chemometric results. Absolute peak area (APA) was used for the relevant quantitative analysis; the results showed the differences between the two varieties and are valuable for further quality control, such as the selection of characteristic components.

  15. Principal Component Analysis: Most Favourite Tool in Chemometrics

    Indian Academy of Sciences (India)

    GENERAL ARTICLE. Principal ... Chemometrics is a discipline that combines mathematics, statis- ... workers have used PCA for air quality monitoring [8]. ..... J S Verbeke, Handbook of Chemometrics and Qualimetrics, Elsevier, New York,.

  16. The application of chemometrics on Infrared and Raman spectra as a tool for the forensic analysis of paints.

    Science.gov (United States)

    Muehlethaler, Cyril; Massonnet, Genevieve; Esseiva, Pierre

    2011-06-15

    The aim of this work is to evaluate the capabilities and limitations of chemometric methods and other mathematical treatments applied on spectroscopic data and more specifically on paint samples. The uniqueness of the spectroscopic data comes from the fact that they are multivariate (a few thousand variables) and highly correlated. Statistical methods are used to study and discriminate samples. A collection of 34 red paint samples was measured by Infrared and Raman spectroscopy. Data pretreatment and variable selection demonstrated that the use of Standard Normal Variate (SNV), together with removal of the noisy variables by a selection of the wavenumbers from 650 to 1830 cm⁻¹ and 2730 to 3600 cm⁻¹, provided the optimal results for infrared analysis. Principal component analysis (PCA) and hierarchical cluster analysis (HCA) were then used as exploratory techniques to provide evidence of structure in the data, cluster, or detect outliers. With the FTIR spectra, the Principal Components (PCs) correspond to binder types and the presence/absence of calcium carbonate. 83% of the total variance is explained by the first four PCs. As for the Raman spectra, we observe six different clusters corresponding to the different pigment compositions when plotting the first two PCs, which account for 37% and 20% respectively of the total variance. In conclusion, the use of chemometrics for the forensic analysis of paints provides a valuable tool for objective decision-making, a reduction of the possible classification errors, and a better efficiency, having robust results with time saving data treatments. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
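    The pretreatment named in this abstract (SNV scaling plus restriction to two spectral windows) might look roughly like the sketch below. The window limits come from the abstract; the toy spectrum, the function names, and the choice to select the windows before applying SNV are assumptions made only for illustration.

```python
from statistics import mean, stdev

# Spectral windows (cm^-1) reported in the abstract for the infrared data.
WINDOWS = [(650.0, 1830.0), (2730.0, 3600.0)]


def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own mean and std."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(value - m) / s for value in spectrum]


def select_windows(wavenumbers, spectrum, windows=WINDOWS):
    """Keep only the points whose wavenumber falls inside one of the windows."""
    kept = [
        (w, a)
        for w, a in zip(wavenumbers, spectrum)
        if any(lo <= w <= hi for lo, hi in windows)
    ]
    return [w for w, _ in kept], [a for _, a in kept]


# Made-up spectrum: six points, two of which lie outside the retained windows.
wavenumbers = [600.0, 700.0, 1800.0, 2000.0, 2800.0, 3500.0]
absorbance = [0.10, 0.40, 0.90, 0.20, 0.60, 0.30]

kept_w, kept_a = select_windows(wavenumbers, absorbance)
scaled = snv(kept_a)
```

    Because SNV is computed per spectrum, it needs no reference set; the scaled spectra could then be stacked into a matrix for PCA or HCA.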

  17. Variable selection methods in PLS regression - a comparison study on metabolomics data

    DEFF Research Database (Denmark)

    Karaman, İbrahim; Hedemann, Mette Skou; Knudsen, Knud Erik Bach

    . The aim of the metabolomics study was to investigate the metabolic profile in pigs fed various cereal fractions with special attention to the metabolism of lignans using LC-MS based metabolomic approach. References 1. Lê Cao KA, Rossouw D, Robert-Granié C, Besse P: A Sparse PLS for Variable Selection when...... integrated approach. Due to the high number of variables in data sets (both raw data and after peak picking) the selection of important variables in an explorative analysis is difficult, especially when different data sets of metabolomics data need to be related. Variable selection (or removal of irrelevant...... different strategies for variable selection on PLSR method were considered and compared with respect to selected subset of variables and the possibility for biological validation. Sparse PLSR [1] as well as PLSR with Jack-knifing [2] was applied to data in order to achieve variable selection prior...

  18. Input variable selection for interpolating high-resolution climate ...

    African Journals Online (AJOL)

    Although the primary input data of climate interpolations are usually meteorological data, other related (independent) variables are frequently incorporated in the interpolation process. One such variable is elevation, which is known to have a strong influence on climate. This research investigates the potential of 4 additional ...

  19. Selecting candidate predictor variables for the modelling of post ...

    African Journals Online (AJOL)

    Objectives: The objective of this project was to determine the variables most likely to be associated with post- .... (as defined subjectively by the research team) in global .... ed on their lack of knowledge of wealth scoring tools. ... HIV serology.

  20. A New Variable Weighting and Selection Procedure for K-Means Cluster Analysis

    Science.gov (United States)

    Steinley, Douglas; Brusco, Michael J.

    2008-01-01

    A variance-to-range ratio variable weighting procedure is proposed. We show how this weighting method is theoretically grounded in the inherent variability found in data exhibiting cluster structure. In addition, a variable selection procedure is proposed to operate in conjunction with the variable weighting technique. The performances of these…
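    As a rough illustration of a variance-to-range weighting, the sketch below scores each variable by its variance divided by its squared range and expresses the result relative to the smallest score. The abstract gives no formula, so the squared range (chosen to make the ratio scale-free) and the normalisation by the minimum are assumptions, as are all function names.

```python
from statistics import pvariance


def variance_to_range_ratio(column):
    """Variance divided by squared range (assumed form; scale-free)."""
    spread = max(column) - min(column)
    if spread == 0:
        return 0.0  # a constant variable carries no cluster information
    return pvariance(column) / spread ** 2


def relative_weights(columns):
    """Weight each variable by its ratio relative to the smallest non-zero ratio."""
    ratios = [variance_to_range_ratio(c) for c in columns]
    base = min(r for r in ratios if r > 0)
    return [r / base for r in ratios]


# Two made-up variables on the same range [0, 10]:
clustered = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]  # two tight groups at the extremes
uniform = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]      # evenly spread values

weights = relative_weights([clustered, uniform])
```

    In this toy case the variable with two tight groups receives the larger weight, which is the behaviour one would want from a weighting scheme intended to emphasise cluster structure before running K-means.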

  1. In-Depth Two-Year Study of Phenolic Profile Variability among Olive Oils from Autochthonous and Mediterranean Varieties in Morocco, as Revealed by a LC-MS Chemometric Profiling Approach

    Directory of Open Access Journals (Sweden)

    Aadil Bajoub

    2016-12-01

    Olive oil phenolic fraction considerably contributes to the sensory quality and nutritional value of this foodstuff. Herein, the phenolic fraction of 203 olive oil samples extracted from fruits of four autochthonous Moroccan cultivars (“Picholine Marocaine”, “Dahbia”, “Haouzia” and “Menara”), and nine Mediterranean varieties recently introduced in Morocco (“Arbequina”, “Arbosana”, “Cornicabra”, “Frantoio”, “Hojiblanca”, “Koroneiki”, “Manzanilla”, “Picholine de Languedoc” and “Picual”), was explored over two consecutive crop seasons (2012/2013 and 2013/2014) by using liquid chromatography-mass spectrometry. A total of 32 phenolic compounds (and quinic acid), belonging to five chemical classes (secoiridoids, simple phenols, flavonoids, lignans and phenolic acids), were identified and quantified. Phenolic profiling revealed that the determined phenolic compounds showed variety-dependent levels, being, at the same time, significantly affected by the crop season. Moreover, based on the obtained phenolic composition and chemometric linear discriminant analysis, statistical models were obtained allowing a very satisfactory classification and prediction of the varietal origin of the studied oils.

  2. Pathogen-mediated selection for MHC variability in wild zebrafish

    Czech Academy of Sciences Publication Activity Database

    Smith, C.; Ondračková, Markéta; Spence, R.; Adams, S.; Betts, D. S.; Mallon, E.

    2011-01-01

    Vol. 13, No. 6 (2011), pp. 589-605 ISSN 1522-0613 Institutional support: RVO:68081766 Keywords: digenean * frequency-dependent selection * heterozygote advantage * major histocompatibility complex * metazoan parasite * pathogen-driven selection Subject RIV: EG - Zoology Impact factor: 1.029, year: 2011

  3. Variable selection in multiple linear regression: The influence of ...

    African Journals Online (AJOL)

    provide an indication of whether the fit of the selected model improves or ... and calculate M(−i); quantify the influence of case i in terms of a function, f(•), of M and ..... [21] Venter JH & Snyman JLJ, 1997, Linear model selection based on risk ...

  4. Rainfall trends and variability in selected areas of Ethiopian Somali ...

    African Journals Online (AJOL)

    Moreover, proper spatial distribution of meteorological stations together with early warning system are required to further support local adaptive and coping strategies that the community designed towards rainfall variability in particular and climate change/disaster and risk at large. Keywords: Ethiopian Somali Region, Gode, ...

  5. Estimation of raw material performance in mammalian cell culture using near infrared spectra combined with chemometrics approaches.

    Science.gov (United States)

    Lee, Hae Woo; Christie, Andrew; Liu, Jun Jay; Yoon, Seongkyu

    2012-01-01

    Understanding variability in raw materials and their impacts on product quality is of critical importance in biopharmaceutical manufacturing processes. For this purpose, several spectroscopic techniques have been studied for raw material characterization, providing fast and nondestructive ways to measure quality of raw materials. However, investigations of correlation between spectra of raw materials and cell culture performance have been scarce due to their complexity and uncertainty. In this study, near-infrared spectra and bioassays of multiple soy hydrolysate lots manufactured by different vendors were analyzed using chemometrics approaches in order to address variability of raw materials as well as correlation between raw material properties and corresponding cell culture performance. Principal component analysis revealed that near-infrared spectra of different soy lots contain enough physicochemical information about soy hydrolysates to allow identification of lot-to-lot variability as well as vendor-to-vendor differences. The identified compositional variability was further analyzed in order to estimate cell growth and protein production of two mammalian cell lines under the condition of varying soy dosages using partial least squares regression combined with optimal variable selection. The performance of the resulting models demonstrates the potential of near-infrared spectroscopy as a robust lot selection tool for raw materials while providing a biological link between chemical composition of raw materials and cell culture performance. Copyright © 2012 American Institute of Chemical Engineers (AIChE).

  6. IMMAN: free software for information theory-based chemometric analysis.

    Science.gov (United States)

    Urias, Ricardo W Pino; Barigye, Stephen J; Marrero-Ponce, Yovani; García-Jacas, César R; Valdes-Martiní, José R; Perez-Gimenez, Facundo

    2015-05-01

    The features and theoretical background of a new and free computational program for chemometric analysis denominated IMMAN (acronym for Information theory-based CheMoMetrics ANalysis) are presented. This is multi-platform software developed in the Java programming language, designed with a remarkably user-friendly graphical interface for the computation of a collection of information-theoretic functions adapted for rank-based unsupervised and supervised feature selection tasks. A total of 20 feature selection parameters are presented, with the unsupervised and supervised frameworks represented by 10 approaches in each case. Several information-theoretic parameters traditionally used as molecular descriptors (MDs) are adapted for use as unsupervised rank-based feature selection methods. On the other hand, a generalization scheme for the previously defined differential Shannon's entropy is discussed, as well as the introduction of Jeffreys information measure for supervised feature selection. Moreover, well-known information-theoretic feature selection parameters, such as information gain, gain ratio, and symmetrical uncertainty are incorporated to the IMMAN software ( http://mobiosd-hub.com/imman-soft/ ), following an equal-interval discretization approach. IMMAN offers data pre-processing functionalities, such as missing values processing, dataset partitioning, and browsing. Moreover, single parameter or ensemble (multi-criteria) ranking options are provided. Consequently, this software is suitable for tasks like dimensionality reduction, feature ranking, as well as comparative diversity analysis of data matrices. Simple examples of applications performed with this program are presented. A comparative study between IMMAN and WEKA feature selection tools using the Arcene dataset was performed, demonstrating similar behavior. In addition, it is revealed that the use of IMMAN unsupervised feature selection methods improves the performance of both IMMAN and WEKA
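    One of the supervised rankers mentioned in this abstract, information gain after equal-interval discretization, can be sketched in a few lines. This is not IMMAN code: the function names, the number of bins and the toy data are assumptions, and the program's own discretization settings are not given in the abstract.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def equal_interval_bins(values, n_bins):
    """Map each value to the index of one of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def information_gain(values, labels, n_bins=2):
    """H(class) minus the bin-weighted entropy of the class within each bin."""
    bins = equal_interval_bins(values, n_bins)
    n = len(labels)
    conditional = 0.0
    for b in set(bins):
        subset = [lab for lab, bb in zip(labels, bins) if bb == b]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


labels = ["a", "a", "b", "b"]
informative = [0.1, 0.2, 0.9, 1.0]    # separates the classes perfectly
uninformative = [0.1, 0.9, 0.2, 1.0]  # mixes the classes in every bin

gains = {
    "informative": information_gain(informative, labels),
    "uninformative": information_gain(uninformative, labels),
}
ranking = sorted(gains, key=gains.get, reverse=True)
```

    Sorting features by gain gives a rank-based selection; gain ratio and symmetrical uncertainty are normalised variants built from the same entropies.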

  7. Joint Variable Selection and Classification with Immunohistochemical Data

    Directory of Open Access Journals (Sweden)

    Debashis Ghosh

    2009-01-01

    To determine if candidate cancer biomarkers have utility in a clinical setting, validation using immunohistochemical methods is typically done. Most analyses of such data have not incorporated the multivariate nature of the staining profiles. In this article, we consider modelling such data using recently developed ideas from the machine learning community. In particular, we consider the joint goals of feature selection and classification. We develop estimation procedures for the analysis of immunohistochemical profiles using the least absolute shrinkage and selection operator (LASSO). These lead to novel and flexible models and algorithms for the analysis of compositional data. The techniques are illustrated using data from a cancer biomarker study.

  8. The quasar luminosity function from a variability-selected sample

    Science.gov (United States)

    Hawkins, M. R. S.; Veron, P.

    1993-01-01

    A sample of quasars is selected from a 10-yr sequence of 30 UK Schmidt plates. Luminosity functions are derived in several redshift intervals, which in each case show a featureless power-law rise towards low luminosities. There is no sign of the 'break' found in the recent UVX sample of Boyle et al. It is suggested that reasons for the disagreement are connected with biases in the selection of the UVX sample. The question of the nature of quasar evolution appears to be still unresolved.

  9. Variable Selection in Time Series Forecasting Using Random Forests

    Directory of Open Access Journals (Sweden)

    Hristos Tyralis

    2017-10-01

    Time series forecasting using machine learning algorithms has gained popularity recently. Random forest is a machine learning algorithm implemented in time series forecasting; however, most of its forecasting properties have remained unexplored. Here we focus on assessing the performance of random forests in one-step forecasting using two large datasets of short time series with the aim to suggest an optimal set of predictor variables. Furthermore, we compare its performance to benchmarking methods. The first dataset is composed of 16,000 simulated time series from a variety of Autoregressive Fractionally Integrated Moving Average (ARFIMA) models. The second dataset consists of 135 mean annual temperature time series. The highest predictive performance of RF is observed when using a low number of recent lagged predictor variables. This outcome could be useful in relevant future applications, with the prospect to achieve higher predictive accuracy.
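    The "lagged predictor variables" in this abstract are simply the most recent past values of the series, arranged as inputs for a one-step-ahead target. A minimal sketch of that arrangement follows; the function name and the toy temperatures are illustrative, not taken from the paper.

```python
def make_lagged(series, n_lags):
    """Return (X, y): each row of X holds the n_lags values preceding the target in y."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])  # oldest lag first, most recent last
        y.append(series[t])
    return X, y


# Made-up annual mean temperatures.
temperatures = [10.1, 10.4, 9.8, 10.0, 10.6, 10.3]
X, y = make_lagged(temperatures, n_lags=2)
# X and y could then be passed to any regressor, for example a random forest.
```

    The abstract's finding that a low number of recent lags works best corresponds to choosing a small `n_lags` here, which also leaves more training rows for short series.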

  10. Effect of balance exercise on selected kinematic gait variables in ...

    African Journals Online (AJOL)

    The purpose of this study was to investigate the effect of balance exercise on some selected kinematic gait parameters in patients with knee joint osteoarthritis. Forty subjects (18 men and 22 women) participated in the study. They were divided into two groups: Group 1 (experimental) that was treated with balance exercises, ...

  11. The Relationship between Attitudes toward Censorship and Selected Academic Variables.

    Science.gov (United States)

    Dwyer, Edward J.; Summy, Mary K.

    1989-01-01

    To examine characteristics of subjects relative to their attitudes toward censorship, a study surveyed 98 college students selected from students in a public university in the southeastern United States. A 24-item Likert-style censorship scale was used to measure attitudes toward censorship. Strong agreement with affirmative items would suggest…

  12. The use of vector bootstrapping to improve variable selection precision in Lasso models

    NARCIS (Netherlands)

    Laurin, C.; Boomsma, D.I.; Lubke, G.H.

    2016-01-01

    The Lasso is a shrinkage regression method that is widely used for variable selection in statistical genetics. Commonly, K-fold cross-validation is used to fit a Lasso model. This is sometimes followed by using bootstrap confidence intervals to improve precision in the resulting variable selections.

  13. Random forest variable selection in spatial malaria transmission modelling in Mpumalanga Province, South Africa

    Directory of Open Access Journals (Sweden)

    Thandi Kapwata

    2016-11-01

    Malaria is an environmentally driven disease. In order to quantify the spatial variability of malaria transmission, it is imperative to understand the interactions between environmental variables and malaria epidemiology at a micro-geographic level using a novel statistical approach. The random forest (RF) statistical learning method, a relatively new variable-importance ranking method, measures the variable importance of potentially influential parameters through the percent increase of the mean squared error. As this value increases, so does the relative importance of the associated variable. The principal aim of this study was to create predictive malaria maps generated using the selected variables based on the RF algorithm in the Ehlanzeni District of Mpumalanga Province, South Africa. From the seven environmental variables used [temperature, lag temperature, rainfall, lag rainfall, humidity, altitude, and the normalized difference vegetation index (NDVI)], altitude was identified as the most influential predictor variable due to its high selection frequency. It was selected as the top predictor for 4 out of 12 months of the year, followed by NDVI, temperature and lag rainfall, which were each selected twice. The combination of climatic variables that produced the highest prediction accuracy was altitude, NDVI, and temperature. This suggests that these three variables have high predictive capabilities in relation to malaria transmission. Furthermore, it is anticipated that the predictive maps generated from predictions made by the RF algorithm could be used to monitor the progression of malaria and assist in intervention and prevention efforts with respect to malaria.
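    The idea behind ranking variables by the percent increase of the mean squared error can be illustrated with a simplified permutation scheme: shuffle one predictor, re-evaluate the model, and report how much the error grows. The sketch below is model-agnostic and does not reproduce the out-of-bag computation that random forest implementations use; the stand-in model, the data and all names are hypothetical.

```python
import random


def mse(model, X, y):
    """Mean squared error of the model's predictions."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def percent_increase_mse(model, X, y, column, n_repeats=20, seed=0):
    """Average percent increase in MSE after shuffling one predictor column."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    increases = []
    for _ in range(n_repeats):
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:column] + [v] + row[column + 1:] for row, v in zip(X, shuffled)]
        increases.append(100.0 * (mse(model, X_perm, y) - baseline) / baseline)
    return sum(increases) / n_repeats


def toy_model(row):
    """Stand-in for a fitted model: uses column 0 only and ignores column 1."""
    return 2.0 * row[0]


# Made-up data: y is roughly 2 * column 0; column 1 is unrelated.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y = [2.5, 3.5, 6.5, 7.5]

importance = [percent_increase_mse(toy_model, X, y, column) for column in range(2)]
```

    A predictor the model relies on shows a large increase, while one it ignores shows none, which is why a larger value is read as higher relative importance.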

  14. UV-Vis spectroscopy with chemometric data treatment. An option for on-line control in nuclear industry

    International Nuclear Information System (INIS)

    Kirsanov, Dmitry; Legin, Andrey

    2017-01-01

    Chemometrics can be very useful for the classical field of UV-Vis determination of metals in aqueous solutions. A conventional approach that uses selective bands in a univariate mode is often not applicable to real-world samples from, e.g., hydrometallurgical processes, because of overlapping signals, light scattering on foreign particles, gas bubble formation, etc. This is where chemometrics can do a good job. This paper overviews certain contributions to the field of multivariate data processing of UV-Vis spectra for the seemingly simple case of metal detection in aqueous solutions. Special attention is given to applications in the nuclear technology field. (author)

  15. Chemometrics Methods for Specificity, Authenticity and Traceability Analysis of Olive Oils: Principles, Classifications and Applications

    Directory of Open Access Journals (Sweden)

    Habib Messai

    2016-11-01

    Background. Olive oils (OOs) show high chemical variability due to several factors of genetic, environmental and anthropic types. Genetic and environmental factors are responsible for natural compositions and polymorphic diversification resulting in different varietal patterns and phenotypes. Anthropic factors, however, are at the origin of different blends’ preparation leading to normative, labelled or adulterated commercial products. Control of complex OO samples requires their (i) characterization by specific markers; (ii) authentication by fingerprint patterns; and (iii) monitoring by traceability analysis. Methods. These quality control and management aims require the use of several multivariate statistical tools: specificity highlighting requires ordination methods; authentication checking calls for classification and pattern recognition methods; traceability analysis implies the use of network-based approaches able to separate or extract mixed information and memorized signals from complex matrices. Results. This chapter presents a review of different chemometrics methods applied for the control of OO variability from metabolic and physical-chemical measured characteristics. The different chemometrics methods are illustrated by different study cases on monovarietal and blended OO originated from different countries. Conclusion. Chemometrics tools offer multiple ways for quantitative evaluations and qualitative control of complex chemical variability of OO in relation to several intrinsic and extrinsic factors.

  16. Chemometrics Methods for Specificity, Authenticity and Traceability Analysis of Olive Oils: Principles, Classifications and Applications

    Science.gov (United States)

    Messai, Habib; Farman, Muhammad; Sarraj-Laabidi, Abir; Hammami-Semmar, Asma; Semmar, Nabil

    2016-01-01

    Background. Olive oils (OOs) show high chemical variability due to several factors of genetic, environmental and anthropic types. Genetic and environmental factors are responsible for natural compositions and polymorphic diversification resulting in different varietal patterns and phenotypes. Anthropic factors, however, are at the origin of different blends’ preparation leading to normative, labelled or adulterated commercial products. Control of complex OO samples requires their (i) characterization by specific markers; (ii) authentication by fingerprint patterns; and (iii) monitoring by traceability analysis. Methods. These quality control and management aims require the use of several multivariate statistical tools: specificity highlighting requires ordination methods; authentication checking calls for classification and pattern recognition methods; traceability analysis implies the use of network-based approaches able to separate or extract mixed information and memorized signals from complex matrices. Results. This chapter presents a review of different chemometrics methods applied for the control of OO variability from metabolic and physical-chemical measured characteristics. The different chemometrics methods are illustrated by different study cases on monovarietal and blended OO originated from different countries. Conclusion. Chemometrics tools offer multiple ways for quantitative evaluations and qualitative control of complex chemical variability of OO in relation to several intrinsic and extrinsic factors. PMID:28231172

  17. Assessing the accuracy and stability of variable selection methods for random forest modeling in ecology.

    Science.gov (United States)

    Fox, Eric W; Hill, Ryan A; Leibowitz, Scott G; Olsen, Anthony R; Thornbrugh, Darren J; Weber, Marc H

    2017-07-01

    Random forest (RF) modeling has emerged as an important statistical learning method in ecology due to its exceptional predictive performance. However, for large and complex ecological data sets, there is limited guidance on variable selection methods for RF modeling. Typically, either a preselected set of predictor variables are used or stepwise procedures are employed which iteratively remove variables according to their importance measures. This paper investigates the application of variable selection methods to RF models for predicting probable biological stream condition. Our motivating data set consists of the good/poor condition of n = 1365 stream survey sites from the 2008/2009 National Rivers and Stream Assessment, and a large set (p = 212) of landscape features from the StreamCat data set as potential predictors. We compare two types of RF models: a full variable set model with all 212 predictors and a reduced variable set model selected using a backward elimination approach. We assess model accuracy using RF's internal out-of-bag estimate, and a cross-validation procedure with validation folds external to the variable selection process. We also assess the stability of the spatial predictions generated by the RF models to changes in the number of predictors and argue that model selection needs to consider both accuracy and stability. The results suggest that RF modeling is robust to the inclusion of many variables of moderate to low importance. We found no substantial improvement in cross-validated accuracy as a result of variable reduction. Moreover, the backward elimination procedure tended to select too few variables and exhibited numerous issues such as upwardly biased out-of-bag accuracy estimates and instabilities in the spatial predictions. We use simulations to further support and generalize results from the analysis of real data. A main purpose of this work is to elucidate issues of model selection bias and instability to ecologists interested in

  18. Using Random Forests to Select Optimal Input Variables for Short-Term Wind Speed Forecasting Models

    Directory of Open Access Journals (Sweden)

    Hui Wang

    2017-10-01

    Full Text Available Achieving relatively high-accuracy short-term wind speed forecasting estimates is a precondition for the construction and grid-connected operation of wind power forecasting systems for wind farms. Currently, most research is focused on the structure of forecasting models and does not consider the selection of input variables, which can have significant impacts on forecasting performance. This paper presents an input variable selection method for wind speed forecasting models. The candidate input variables for various leading periods are selected and random forests (RF) are employed to evaluate the importance of all variables as features. The feature subset with the best evaluation performance is selected as the optimal feature set. Then, a kernel-based extreme learning machine is constructed to evaluate the performance of input variable selection based on RF. The results of the case study show that by removing the uncorrelated and redundant features, RF effectively extracts the most strongly correlated set of features from the candidate input variables. By finding the optimal feature combination to represent the original information, RF simplifies the structure of the wind speed forecasting model, shortens the training time required, and substantially improves the model’s accuracy and generalization ability, demonstrating that the input variables selected by RF are effective.

  19. Variability-based active galactic nucleus selection using image subtraction in the SDSS and LSST era

    Energy Technology Data Exchange (ETDEWEB)

    Choi, Yumi; Gibson, Robert R.; Becker, Andrew C.; Ivezić, Željko; Connolly, Andrew J.; Ruan, John J.; Anderson, Scott F. [Department of Astronomy, University of Washington, Box 351580, Seattle, WA 98195 (United States); MacLeod, Chelsea L., E-mail: ymchoi@astro.washington.edu [Physics Department, U.S. Naval Academy, 572 Holloway Road, Annapolis, MD 21402 (United States)

    2014-02-10

    With upcoming all-sky surveys such as LSST poised to generate a deep digital movie of the optical sky, variability-based active galactic nucleus (AGN) selection will enable the construction of highly complete catalogs with minimum contamination. In this study, we generate g-band difference images and construct light curves (LCs) for QSO/AGN candidates listed in Sloan Digital Sky Survey Stripe 82 public catalogs compiled from different methods, including spectroscopy, optical colors, variability, and X-ray detection. Image differencing excels at identifying variable sources embedded in complex or blended emission regions such as Type II AGNs and other low-luminosity AGNs that may be omitted from traditional photometric or spectroscopic catalogs. To separate QSOs/AGNs from other sources using our difference image LCs, we explore several LC statistics and parameterize optical variability by the characteristic damping timescale (τ) and variability amplitude. By virtue of distinguishable variability parameters of AGNs, we are able to select them with high completeness of 93.4% and efficiency (i.e., purity) of 71.3%. Based on optical variability, we also select highly variable blazar candidates, whose infrared colors are consistent with known blazars. One-third of them are also radio detected. With the X-ray selected AGN candidates, we probe the optical variability of X-ray detected optically extended sources using their difference image LCs for the first time. A combination of optical variability and X-ray detection enables us to select various types of host-dominated AGNs. Contrary to the AGN unification model prediction, two Type II AGN candidates (out of six) show detectable variability on long-term timescales like typical Type I AGNs. This study will provide a baseline for future optical variability studies of extended sources.
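
    The completeness and efficiency (purity) figures quoted above follow the usual definitions for a selection method. A minimal sketch of that arithmetic is given here; the function name and the counts are hypothetical and chosen only for illustration, not taken from the study.

```python
def completeness_and_purity(true_pos, false_neg, false_pos):
    """Return (completeness, purity) of a candidate selection.

    completeness: fraction of genuine AGNs that the selection recovers
    purity:       fraction of selected sources that are genuine AGNs
    """
    completeness = true_pos / (true_pos + false_neg)
    purity = true_pos / (true_pos + false_pos)
    return completeness, purity


# Hypothetical counts for illustration only (not from the study).
c, p = completeness_and_purity(true_pos=90, false_neg=10, false_pos=30)
print(f"completeness = {c:.3f}, purity = {p:.3f}")
```

    Tightening a variability cut typically raises purity at the cost of completeness, which is why both numbers are reported together.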

  20. Assessing the accuracy and stability of variable selection methods for random forest modeling in ecology

    Science.gov (United States)

    Random forest (RF) modeling has emerged as an important statistical learning method in ecology due to its exceptional predictive performance. However, for large and complex ecological datasets there is limited guidance on variable selection methods for RF modeling. Typically, e...

  1. Chemical pattern of Brazilian apples: a chemometric approach based on the Fuji and Gala varieties

    OpenAIRE

    Vieira, Renato Giovanetti; Prestes, Rosilene Aparecida; Denardi, Frederico; Nogueira, Alessandro; Wosiacki, Gilvan

    2011-01-01

    The chemical composition of apple juices may be used to discriminate between the varieties for consumption and those for raw material. Fuji and Gala have a chemical pattern that can be used for this classification. Multivariate methods correlate independent continuous chemical descriptors with the categorical apple variety. Three main descriptors of apple juice were selected: malic acid, total reducing sugar and total phenolic compounds. A chemometric approach, employing PCA and SIMCA, was us...

  2. Near Infrared Spectroscopy Calibration for Wood Chemistry: Which Chemometric Technique Is Best for Prediction and Interpretation?

    OpenAIRE

    Via, Brian K.; Zhou, Chengfeng; Acquah, Gifty; Jiang, Wei; Eckhardt, Lori

    2014-01-01

    This paper addresses the precision in factor loadings during partial least squares (PLS) and principal components regression (PCR) of wood chemistry content from near infrared reflectance (NIR) spectra. The precision of the loadings is considered important because these estimates are often utilized to interpret chemometric models or selection of meaningful wavenumbers. Standard laboratory chemistry methods were employed on a mixed genus/species hardwood sample set. PLS and PCR, before and af...

  3. Spectroscopic and chemometric exploration of food quality

    DEFF Research Database (Denmark)

    Pedersen, Dorthe Kjær

    2002-01-01

    The desire to develop non-invasive rapid measurements of essential quality parameters in foods is the motivation of this thesis. Due to the speed and non-invasive properties of spectroscopic techniques, they have potential as on-line or at-line methods and can be employed in the food industry in order to control the quality of the end product and to continuously monitor the production. In this thesis, the possibilities and limitations of the application of spectroscopy and chemometrics in rapid control of food quality are discussed and demonstrated by the examples in the eight included... and multi-way chemometrics demonstrated the potential for screening of environmental contamination in complex food samples. Significant prediction models were established with correlation coefficients in the range from r = 0.69 to r = 0.97 for dioxin. Further development of the fluorescence measurements...

  4. Best conditions for biodegradation of diesel oil by chemometric tools

    Directory of Open Access Journals (Sweden)

    Ewa Kaczorek

    2014-01-01

    Full Text Available Diesel oil biodegradation by different bacteria-yeast-rhamnolipids consortia was tested. Chromatographic analysis of post-biodegradation residue was completed with chemometric tools (ANOVA, and a novel ranking procedure based on the sum of ranking differences). These tools were used in the selection of the most effective systems. The best results of aliphatic fractions of diesel oil biodegradation were observed for a yeast consortia with Aeromonas hydrophila KR4. For these systems the positive effect of rhamnolipids on hydrocarbon biodegradation was observed. However, rhamnolipids addition did not always have a positive influence on the biodegradation process (e.g. in case of yeast consortia with Stenotrophomonas maltophila KR7). Moreover, particular differences in the degradation pattern were observed for lower and higher alkanes than in the case with C22. Normally, the best conditions for "lower" alkanes are Aeromonas hydrophila KR4 + emulsifier independently from yeasts and e.g. Pseudomonas stutzeri KR7 for C24 alkane.

  5. Best conditions for biodegradation of diesel oil by chemometric tools

    Science.gov (United States)

    Kaczorek, Ewa; Bielicka-Daszkiewicz, Katarzyna; Héberger, Károly; Kemény, Sándor; Olszanowski, Andrzej; Voelkel, Adam

    2014-01-01

    Diesel oil biodegradation by different bacteria-yeast-rhamnolipids consortia was tested. Chromatographic analysis of post-biodegradation residue was completed with chemometric tools (ANOVA, and a novel ranking procedure based on the sum of ranking differences). These tools were used in the selection of the most effective systems. The best results of aliphatic fractions of diesel oil biodegradation were observed for a yeast consortia with Aeromonas hydrophila KR4. For these systems the positive effect of rhamnolipids on hydrocarbon biodegradation was observed. However, rhamnolipids addition did not always have a positive influence on the biodegradation process (e.g. in case of yeast consortia with Stenotrophomonas maltophila KR7). Moreover, particular differences in the degradation pattern were observed for lower and higher alkanes than in the case with C22. Normally, the best conditions for “lower” alkanes are Aeromonas hydrophila KR4 + emulsifier independently from yeasts and e.g. Pseudomonas stutzeri KR7 for C24 alkane. PMID:24948922
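
    The ranking procedure mentioned above, the sum of ranking differences (SRD), can be sketched in a few lines. This is a minimal illustration under stated assumptions: the row-wise mean is used as the reference ranking, ties share the mean rank, and the function names and data are hypothetical. The abstract does not say which reference or validation step the authors used.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def sum_of_ranking_differences(columns):
    """columns maps a system name to its values over the same objects (rows).

    Assumption: the reference ranking is taken from the row-wise mean.
    A smaller SRD means the system ranks the objects more like the reference.
    """
    n_rows = len(next(iter(columns.values())))
    reference = [sum(col[r] for col in columns.values()) / len(columns)
                 for r in range(n_rows)]
    ref_ranks = average_ranks(reference)
    return {name: sum(abs(a - b) for a, b in zip(average_ranks(col), ref_ranks))
            for name, col in columns.items()}


# Hypothetical measurements for three systems over four objects.
systems = {"A": [1, 2, 3, 4], "B": [4, 3, 2, 1], "C": [1, 3, 2, 4]}
srd = sum_of_ranking_differences(systems)
print(srd)
```

    In this toy data set, system C reproduces the reference ranking exactly and system B reverses most of it, so C receives the smallest SRD and B the largest.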

  6. Bayesian Multiresolution Variable Selection for Ultra-High Dimensional Neuroimaging Data.

    Science.gov (United States)

    Zhao, Yize; Kang, Jian; Long, Qi

    2018-01-01

    Ultra-high dimensional variable selection has become increasingly important in analysis of neuroimaging data. For example, in the Autism Brain Imaging Data Exchange (ABIDE) study, neuroscientists are interested in identifying important biomarkers for early detection of the autism spectrum disorder (ASD) using high resolution brain images that include hundreds of thousands of voxels. However, most existing methods are not feasible for solving this problem due to their extensive computational costs. In this work, we propose a novel multiresolution variable selection procedure under a Bayesian probit regression framework. It recursively uses posterior samples for coarser-scale variable selection to guide the posterior inference on finer-scale variable selection, leading to very efficient Markov chain Monte Carlo (MCMC) algorithms. The proposed algorithms are computationally feasible for ultra-high dimensional data. Also, our model incorporates two levels of structural information into variable selection using Ising priors: the spatial dependence between voxels and the functional connectivity between anatomical brain regions. Applied to the resting state functional magnetic resonance imaging (R-fMRI) data in the ABIDE study, our methods identify voxel-level imaging biomarkers highly predictive of the ASD, which are biologically meaningful and interpretable. Extensive simulations also show that our methods achieve better performance in variable selection compared to existing methods.

  7. Comparison of Sparse and Jack-knife partial least squares regression methods for variable selection

    DEFF Research Database (Denmark)

    Karaman, Ibrahim; Qannari, El Mostafa; Martens, Harald

    2013-01-01

    The objective of this study was to compare two different techniques of variable selection, Sparse PLSR and Jack-knife PLSR, with respect to their predictive ability and their ability to identify relevant variables. Sparse PLSR is a method that is frequently used in genomics, whereas Jack-knife PL...

  8. Punishment induced behavioural and neurophysiological variability reveals dopamine-dependent selection of kinematic movement parameters

    Science.gov (United States)

    Galea, Joseph M.; Ruge, Diane; Buijink, Arthur; Bestmann, Sven; Rothwell, John C.

    2013-01-01

    Action selection describes the high-level process which selects between competing movements. In animals, behavioural variability is critical for the motor exploration required to select the action which optimizes reward and minimizes cost/punishment, and is guided by dopamine (DA). The aim of this study was to test in humans whether low-level movement parameters are affected by punishment and reward in ways similar to high-level action selection. Moreover, we addressed the proposed dependence of behavioural and neurophysiological variability on DA, and whether this may underpin the exploration of kinematic parameters. Participants performed an out-and-back index finger movement and were instructed that monetary reward and punishment were based on its maximal acceleration (MA). In fact, the feedback was not contingent on the participant’s behaviour but pre-determined. Blocks highly-biased towards punishment were associated with increased MA variability relative to blocks with either reward or without feedback. This increase in behavioural variability was positively correlated with neurophysiological variability, as measured by changes in cortico-spinal excitability with transcranial magnetic stimulation over the primary motor cortex. Following the administration of a DA-antagonist, the variability associated with punishment diminished and the correlation between behavioural and neurophysiological variability no longer existed. Similar changes in variability were not observed when participants executed a pre-determined MA, nor did DA influence resting neurophysiological variability. Thus, under conditions of punishment, DA-dependent processes influence the selection of low-level movement parameters. We propose that the enhanced behavioural variability reflects the exploration of kinematic parameters for less punishing, or conversely more rewarding, outcomes. PMID:23447607

  9. Coupled variable selection for regression modeling of complex treatment patterns in a clinical cancer registry.

    Science.gov (United States)

    Schmidtmann, I; Elsäßer, A; Weinmann, A; Binder, H

    2014-12-30

    For determining a manageable set of covariates potentially influential with respect to a time-to-event endpoint, Cox proportional hazards models can be combined with variable selection techniques, such as stepwise forward selection or backward elimination based on p-values, or regularized regression techniques such as component-wise boosting. Cox regression models have also been adapted for dealing with more complex event patterns, for example, for competing risks settings with separate, cause-specific hazard models for each event type, or for determining the prognostic effect pattern of a variable over different landmark times, with one conditional survival model for each landmark. Motivated by a clinical cancer registry application, where complex event patterns have to be dealt with and variable selection is needed at the same time, we propose a general approach for linking variable selection between several Cox models. Specifically, we combine score statistics for each covariate across models by Fisher's method as a basis for variable selection. This principle is implemented for a stepwise forward selection approach as well as for a regularized regression technique. In an application to data from hepatocellular carcinoma patients, the coupled stepwise approach is seen to facilitate joint interpretation of the different cause-specific Cox models. In conditional survival models at landmark times, which address updates of prediction as time progresses and both treatment and other potential explanatory variables may change, the coupled regularized regression approach identifies potentially important, stably selected covariates together with their effect time pattern, despite having only a small number of events. These results highlight the promise of the proposed approach for coupling variable selection between Cox models, which is particularly relevant for modeling for clinical cancer registries with their complex event patterns. 
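
    The abstract above combines per-model evidence for each covariate by Fisher's method. As a minimal sketch, the textbook form of Fisher's method for k independent p-values is shown here; how the authors obtain per-model p-values from score statistics is not stated in the abstract, so the inputs and the function name are assumptions.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the joint null hypothesis.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Closed-form chi-square survival function for an even number (2k) of
    # degrees of freedom: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    combined = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
    return statistic, combined


# Hypothetical p-values for one covariate in two cause-specific models.
stat, p_comb = fisher_combined_pvalue([0.05, 0.05])
print(f"X = {stat:.3f}, combined p = {p_comb:.4f}")
```

    A covariate with moderate evidence in each model can yield a smaller combined p-value than either model alone, which is what lets selection be coupled across models.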

  10. Variable selectivity and the role of nutritional quality in food selection by a planktonic rotifer

    International Nuclear Information System (INIS)

    Sierszen, M.E.

    1990-01-01

    To investigate the potential for selective feeding to enhance fitness, I test the hypothesis that an herbivorous zooplankter selects those food items that best support its reproduction. Under this hypothesis, growth and reproduction on selected food items should be higher than on less preferred items. The hypothesis is not supported. In situ selectivity by the rotifer Keratella taurocephala for Cryptomonas relative to Chlamydomonas goes through a seasonal cycle, in apparent response to fluctuating Cryptomonas populations. However, reproduction on a unialgal diet of Cryptomonas is consistently high and similar to that on Chlamydomonas. Oocystis, which also supports reproduction equivalent to that supported by Chlamydomonas, is sometimes rejected by K. taurocephala. In addition, K. taurocephala does not discriminate between Merismopedia and Chlamydomonas even though Merismopedia supports virtually no reproduction by the rotifer. Selection by K. taurocephala does not simply maximize the intake of food items that yield high reproduction. Selectivity is a complex, dynamic process, one function of which may be the exploitation of locally or seasonally abundant foods. (author)

  11. The Effects of Variability and Risk in Selection Utility Analysis: An Empirical Comparison.

    Science.gov (United States)

    Rich, Joseph R.; Boudreau, John W.

    1987-01-01

    Investigated utility estimate variability for the selection utility of using the Programmer Aptitude Test to select computer programmers. Comparison of Monte Carlo results to other risk assessment approaches (sensitivity analysis, break-even analysis, algebraic derivation of the distribution) suggests that distribution information provided by Monte…

  12. A Time-Series Water Level Forecasting Model Based on Imputation and Variable Selection Method.

    Science.gov (United States)

    Yang, Jun-He; Cheng, Ching-Hsue; Chan, Chia-Pan

    2017-01-01

    Reservoirs are important for households and impact the national economy. This paper proposed a time-series forecasting model based on estimating a missing value followed by variable selection to forecast the reservoir's water level. This study collected data from the Taiwan Shimen Reservoir as well as daily atmospheric data from 2008 to 2015. The two datasets are concatenated into an integrated dataset based on ordering of the data as a research dataset. The proposed time-series forecasting model summarily has three foci. First, this study uses five imputation methods to directly delete the missing value. Second, we identified the key variable via factor analysis and then deleted the unimportant variables sequentially via the variable selection method. Finally, the proposed model uses a Random Forest to build the forecasting model of the reservoir's water level. This was done to compare with the listing method under the forecasting error. These experimental results indicate that the Random Forest forecasting model when applied to variable selection with full variables has better forecasting performance than the listing model. In addition, this experiment shows that the proposed variable selection can help determine five forecast methods used here to improve the forecasting capability.

  13. A Time-Series Water Level Forecasting Model Based on Imputation and Variable Selection Method

    Directory of Open Access Journals (Sweden)

    Jun-He Yang

    2017-01-01

    Full Text Available Reservoirs are important for households and impact the national economy. This paper proposed a time-series forecasting model based on estimating a missing value followed by variable selection to forecast the reservoir’s water level. This study collected data from the Taiwan Shimen Reservoir as well as daily atmospheric data from 2008 to 2015. The two datasets are concatenated into an integrated dataset based on ordering of the data as a research dataset. The proposed time-series forecasting model summarily has three foci. First, this study uses five imputation methods to directly delete the missing value. Second, we identified the key variable via factor analysis and then deleted the unimportant variables sequentially via the variable selection method. Finally, the proposed model uses a Random Forest to build the forecasting model of the reservoir’s water level. This was done to compare with the listing method under the forecasting error. These experimental results indicate that the Random Forest forecasting model when applied to variable selection with full variables has better forecasting performance than the listing model. In addition, this experiment shows that the proposed variable selection can help determine five forecast methods used here to improve the forecasting capability.

  14. Symbiosis of chemometrics and metabolomics: past, present, and future

    NARCIS (Netherlands)

    van der Greef, J.; Smilde, A. K.

    2005-01-01

    Metabolomics is a growing area in the field of systems biology. Metabolomics has already a long history and also the connection of metabolomics with chemometrics goes back some time. This review discusses the symbiosis of metabolomics and chemometrics with emphasis on the medical domain, puts the

  15. Air quality modelling using chemometric techniques | Azid | Journal ...

    African Journals Online (AJOL)

    This study presents that the chemometric techniques and modelling become an excellent tool in API assessment, air pollution source identification, apportionment and can be setbacks in designing an API monitoring network for effective air pollution resources management. Keywords: air pollutant index; chemometric; ANN; ...

  16. Novel Harmonic Regularization Approach for Variable Selection in Cox’s Proportional Hazards Model

    Directory of Open Access Journals (Sweden)

    Ge-Jin Chu

    2014-01-01

    Full Text Available Variable selection is an important issue in regression and a number of variable selection methods have been proposed involving nonconvex penalty functions. In this paper, we investigate a novel harmonic regularization method, which can approximate nonconvex Lq (1/2 < q < 1) regularizations, to select key risk factors in the Cox’s proportional hazards model using microarray gene expression data. The harmonic regularization method can be efficiently solved using our proposed direct path seeking approach, which can produce solutions that closely approximate those for the convex loss function and the nonconvex regularization. Simulation results based on the artificial datasets and four real microarray gene expression datasets, such as real diffuse large B-cell lymphoma (DLBCL), the lung cancer, and the AML datasets, show that the harmonic regularization method can be more accurate for variable selection than existing Lasso series methods.

  17. A survey of variable selection methods in two Chinese epidemiology journals

    Directory of Open Access Journals (Sweden)

    Lynn Henry S

    2010-09-01

    Full Text Available Background. Although much has been written on developing better procedures for variable selection, there is little research on how it is practiced in actual studies. This review surveys the variable selection methods reported in two high-ranking Chinese epidemiology journals. Methods. Articles published in 2004, 2006, and 2008 in the Chinese Journal of Epidemiology and the Chinese Journal of Preventive Medicine were reviewed. Five categories of methods were identified whereby variables were selected using: A - bivariate analyses; B - multivariable analysis, e.g. stepwise or individual significance testing of model coefficients; C - first bivariate analyses, followed by multivariable analysis; D - bivariate analyses or multivariable analysis; and E - other criteria like prior knowledge or personal judgment. Results. Among the 287 articles that reported using variable selection methods, 6%, 26%, 30%, 21%, and 17% were in categories A through E, respectively. One hundred sixty-three studies selected variables using bivariate analyses, 80% (130/163) via multiple significance testing at the 5% alpha-level. Of the 219 multivariable analyses, 97 (44%) used stepwise procedures, 89 (41%) tested individual regression coefficients, but 33 (15%) did not mention how variables were selected. Sixty percent (58/97) of the stepwise routines also did not specify the algorithm and/or significance levels. Conclusions. The variable selection methods reported in the two journals were limited in variety, and details were often missing. Many studies still relied on problematic techniques like stepwise procedures and/or multiple testing of bivariate associations at the 0.05 alpha-level. These deficiencies should be rectified to safeguard the scientific validity of articles published in Chinese epidemiology journals.

  18. HR-MAS NMR allied to chemometric on Hancornia speciosa varieties differentiation

    Energy Technology Data Exchange (ETDEWEB)

    Flores, Igor S. [Instituto Federal de Goiás (IFG), Luziânia, GO (Brazil); Silva, Andressa K.; Chaves, Lazaro J.; Collevatti, Rosane G.; Lião, Luciano M., E-mail: lucianoliao@ufg.br [Universidade Federal de Goiás (UFG), Goiânia, GO (Brazil); Furquim, Leonnardo C. [Faculdade Objetivo, GO (Brazil); Castro, Carlos F.S. [Instituto Federal de Educação, Ciência e Tecnologia Goiano (IFGoiano), GO (Brazil)

    2018-05-01

    This work describes the potential of chemometric analyses applied to ¹H high-resolution magic angle spinning nuclear magnetic resonance (¹H HR-MAS NMR) data for the chemotaxonomic investigation of Hancornia speciosa (Apocynaceae) varieties. This plant, popularly known as mangaba, has a complex morphological differentiation and thus chemical analyses can be used for its taxonomic classification. In comparison to traditional techniques, ¹H HR-MAS NMR allied with chemometrics provided a simple and low cost method for chemotaxonomy. Leaves of four varieties of H. speciosa from a common garden experiment were studied, and the results demonstrated that H. speciosa var. speciosa differs from the others due to its specific metabolic profile, and var. pubescens was discriminated based on its high phenolic compound content. The distinction between the latter variety and gardneri is important because it allows for the selection of samples with greater commercial value, since they produce the largest and heaviest fruits. (author)

  19. Predictive and Descriptive CoMFA Models: The Effect of Variable Selection.

    Science.gov (United States)

    Sepehri, Bakhtyar; Omidikia, Nematollah; Kompany-Zareh, Mohsen; Ghavami, Raouf

    2018-01-01

    Aims & Scope: In this research, 8 variable selection approaches were used to investigate the effect of variable selection on the predictive power and stability of CoMFA models. Three data sets including 36 EPAC antagonists, 79 CD38 inhibitors and 57 ATAD2 bromodomain inhibitors were modelled by CoMFA. First, for all three data sets, CoMFA models with all CoMFA descriptors were created; then, by applying each variable selection method, a new CoMFA model was developed, so that 9 CoMFA models were built for each data set. The results show that noisy and uninformative variables affect CoMFA results. Based on the created models, applying 5 variable selection approaches (FFD, SRD-FFD, IVE-PLS, SRD-UVE-PLS and SPA-jackknife) increases the predictive power and stability of CoMFA models significantly. Among them, SPA-jackknife removes most of the variables while FFD retains most of them. FFD and IVE-PLS are time-consuming processes, while SRD-FFD and SRD-UVE-PLS runs need only a few seconds. Also, applying FFD, SRD-FFD, IVE-PLS and SRD-UVE-PLS preserves the CoMFA contour map information for both fields.

  20. Genome-wide prediction of traits with different genetic architecture through efficient variable selection.

    Science.gov (United States)

    Wimmer, Valentin; Lehermeier, Christina; Albrecht, Theresa; Auinger, Hans-Jürgen; Wang, Yu; Schön, Chris-Carolin

    2013-10-01

    In genome-based prediction there is considerable uncertainty about the statistical model and method required to maximize prediction accuracy. For traits influenced by a small number of quantitative trait loci (QTL), predictions are expected to benefit from methods performing variable selection [e.g., BayesB or the least absolute shrinkage and selection operator (LASSO)] compared to methods distributing effects across the genome [ridge regression best linear unbiased prediction (RR-BLUP)]. We investigate the assumptions underlying successful variable selection by combining computer simulations with large-scale experimental data sets from rice (Oryza sativa L.), wheat (Triticum aestivum L.), and Arabidopsis thaliana (L.). We demonstrate that variable selection can be successful when the number of phenotyped individuals is much larger than the number of causal mutations contributing to the trait. We show that the sample size required for efficient variable selection increases dramatically with decreasing trait heritabilities and increasing extent of linkage disequilibrium (LD). We contrast and discuss contradictory results from simulation and experimental studies with respect to superiority of variable selection methods over RR-BLUP. Our results demonstrate that due to long-range LD, medium heritabilities, and small sample sizes, superiority of variable selection methods cannot be expected in plant breeding populations even for traits like FRIGIDA gene expression in Arabidopsis and flowering time in rice, assumed to be influenced by a few major QTL. We extend our conclusions to the analysis of whole-genome sequence data and infer upper bounds for the number of causal mutations which can be identified by LASSO. Our results have major impact on the choice of statistical method needed to make credible inferences about genetic architecture and prediction accuracy of complex traits.
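    A minimal sketch of why LASSO performs variable selection while ridge-type shrinkage (as in RR-BLUP) does not, assuming the simplified textbook case of an orthonormal design, where both estimators have closed forms. The function name `soft_threshold`, the penalty `lam`, and the example coefficients are illustrative assumptions and are not taken from the study.

```python
# Orthonormal-design illustration (assumption: X^T X = I).
# LASSO:  minimise 0.5*||y - Xb||^2 + lam*||b||_1   -> soft-threshold the OLS estimate
# Ridge:  minimise 0.5*||y - Xb||^2 + 0.5*lam*||b||^2 -> scale the OLS estimate by 1/(1+lam)

def soft_threshold(b, lam):
    """Shrink b towards zero by lam and set it exactly to zero if |b| <= lam."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

# Hypothetical OLS marker effects: two large effects, three small ones.
ols = [2.5, -1.8, 0.3, -0.2, 0.1]
lam = 0.5

lasso = [soft_threshold(b, lam) for b in ols]
ridge = [b / (1.0 + lam) for b in ols]

print("lasso:", lasso)  # small effects become exactly zero (selected out)
print("ridge:", ridge)  # every effect is shrunk but none is removed
```

    Under these assumptions the small effects are dropped entirely by LASSO, whereas ridge keeps all of them at reduced magnitude, which mirrors the contrast between selecting a few loci and distributing effects across the genome.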

  1. Application of mass spectrometry based electronic nose and chemometrics for fingerprinting radiation treatment

    International Nuclear Information System (INIS)

    Gupta, Sumit; Variyar, Prasad S.; Sharma, Arun

    2015-01-01

    Volatile compounds were isolated from apples and grapes employing solid phase micro extraction (SPME) and subsequently analyzed by GC/MS equipped with a transfer line without stationary phase. The single peak obtained was integrated to obtain the total mass spectrum of the volatile fraction of the samples. A data matrix containing the relative abundance of all mass-to-charge ratios was subjected to principal component analysis (PCA) and linear discriminant analysis (LDA) to identify radiation treatment. PCA results suggested that there is sufficient variability between control and irradiated samples to build classification models based on supervised techniques. LDA successfully aided in segregating control from irradiated samples at all doses (0.1, 0.25, 0.5, 1.0, 1.5, 2.0 kGy). SPME-MS with chemometrics was successfully demonstrated as a simple screening method for radiation treatment. - Highlights: • Total mass spectra obtained from HS-MS for control and irradiated fruits. • Grapes and apples are chosen for present study. • Total mass spectrum was analyzed by two chemometric techniques (PCA and LDA). • Successful segregation of control and irradiated samples achieved using chemometrics

  2. Chemometric modeling of thermogravimetric data for the compositional analysis of forest biomass.

    Science.gov (United States)

    Acquah, Gifty E; Via, Brian K; Fasina, Oladiran O; Adhikari, Sushil; Billor, Nedret; Eckhardt, Lori G

    2017-01-01

    The objective of this study was to investigate the use of chemometric modeling of thermogravimetric (TG) data as an alternative approach to estimate the chemical and proximate (i.e. volatile matter, fixed carbon and ash contents) composition of lignocellulosic biomass. Since these properties affect the conversion pathway, processing costs, yield and/or quality of products, a capability to rapidly determine these for biomass feedstock entering the process stream will be useful in the success and efficiency of bioconversion technologies. The 38-minute long methodology developed in this study enabled the simultaneous prediction of both the chemical and proximate properties of forest-derived biomass from the same TG data. Conventionally, two separate experiments had to be conducted to obtain such information. In addition, the chemometric models constructed with normalized TG data outperformed models developed via the traditional deconvolution of TG data. PLS and PCR models were especially robust in predicting the volatile matter (R² = 0.92; RPD = 3.58) and lignin (R² = 0.82; RPD = 2.40) contents of the biomass. The application of chemometrics to TG data also made it possible to predict some monomeric sugars in this study. Elucidation of PC loadings obtained from chemometric models also provided some insights into the thermal decomposition behavior of the chemical constituents of lignocellulosic biomass. For instance, similar loadings were noted for volatile matter and cellulose, and for fixed carbon and lignin. The findings indicate that common latent variables are shared between these chemical and thermal reactivity properties. Results from this study buttress literature reporting that the less thermally stable polysaccharides are responsible for the yield of volatiles whereas the more recalcitrant lignin with its higher percentage of elementary carbon contributes to the yield of fixed carbon.
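    The two figures of merit quoted in this abstract can be sketched as follows. This assumes the common convention that RPD is the standard deviation of the reference values divided by the root mean square error of prediction; the study may use a slightly different variant. The function names and the observed/predicted values are hypothetical.

```python
import math
import statistics

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = statistics.mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of reference values / RMSEP (assumed convention)."""
    rmsep = math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )
    return statistics.stdev(observed) / rmsep

# Hypothetical reference values and model predictions (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.1]

r2_value = r_squared(observed, predicted)
rpd_value = rpd(observed, predicted)
print(f"R2 = {r2_value:.3f}, RPD = {rpd_value:.2f}")
```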

  3. Chemometric modeling of thermogravimetric data for the compositional analysis of forest biomass.

    Directory of Open Access Journals (Sweden)

    Gifty E Acquah

    Full Text Available The objective of this study was to investigate the use of chemometric modeling of thermogravimetric (TG) data as an alternative approach to estimate the chemical and proximate (i.e. volatile matter, fixed carbon and ash contents) composition of lignocellulosic biomass. Since these properties affect the conversion pathway, processing costs, yield and/or quality of products, a capability to rapidly determine these for biomass feedstock entering the process stream will be useful in the success and efficiency of bioconversion technologies. The 38-minute long methodology developed in this study enabled the simultaneous prediction of both the chemical and proximate properties of forest-derived biomass from the same TG data. Conventionally, two separate experiments had to be conducted to obtain such information. In addition, the chemometric models constructed with normalized TG data outperformed models developed via the traditional deconvolution of TG data. PLS and PCR models were especially robust in predicting the volatile matter (R² = 0.92; RPD = 3.58) and lignin (R² = 0.82; RPD = 2.40) contents of the biomass. The application of chemometrics to TG data also made it possible to predict some monomeric sugars in this study. Elucidation of PC loadings obtained from chemometric models also provided some insights into the thermal decomposition behavior of the chemical constituents of lignocellulosic biomass. For instance, similar loadings were noted for volatile matter and cellulose, and for fixed carbon and lignin. The findings indicate that common latent variables are shared between these chemical and thermal reactivity properties. Results from this study buttress literature reporting that the less thermally stable polysaccharides are responsible for the yield of volatiles whereas the more recalcitrant lignin with its higher percentage of elementary carbon contributes to the yield of fixed carbon.

  4. Selection of variables for neural network analysis. Comparisons of several methods with high energy physics data

    International Nuclear Information System (INIS)

    Proriol, J.

    1994-01-01

    Five different methods are compared for selecting the most important variables with a view to classifying high energy physics events with neural networks. The different methods are: the F-test, Principal Component Analysis (PCA), a decision tree method: CART, weight evaluation, and Optimal Cell Damage (OCD). The neural networks use the variables selected with the different methods. We compare the percentages of events properly classified by each neural network. The learning set and the test set are the same for all the neural networks. (author)
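    Of the five methods listed, the F-test is the simplest to illustrate. The sketch below ranks variables by a one-way ANOVA F statistic computed between two event classes; it is a generic illustration, not the procedure of the paper. The class names (`signal`, `background`), the three variables, and all numbers are hypothetical.

```python
import statistics

def f_statistic(groups):
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical events: each row is one event, each column one candidate variable.
signal = [[1.0, 5.0, 0.2], [1.2, 4.0, 0.1], [0.9, 6.0, 0.3], [1.1, 5.0, 0.2]]
background = [[3.0, 5.5, 0.2], [3.2, 4.5, 0.3], [2.9, 5.0, 0.1], [3.1, 5.5, 0.2]]

n_vars = len(signal[0])
f_scores = [
    f_statistic([[row[j] for row in signal], [row[j] for row in background]])
    for j in range(n_vars)
]

# Variables sorted from most to least discriminating between the two classes.
ranking = sorted(range(n_vars), key=lambda j: f_scores[j], reverse=True)
print(ranking, [round(f, 3) for f in f_scores])
```

    The top-ranked variables would then be the inputs retained for the classifier; how many to keep is a separate choice not covered by this sketch.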

  5. Curve fitting and modeling with splines using statistical variable selection techniques

    Science.gov (United States)

    Smith, P. L.

    1982-01-01

    The successful application of statistical variable selection techniques to fit splines is demonstrated. Major emphasis is given to knot selection, but order determination is also discussed. Two FORTRAN backward elimination programs, using the B-spline basis, were developed. The program for knot elimination is compared in detail with two other spline-fitting methods and several statistical software packages. An example is also given for the two-variable case using a tensor product basis, with a theoretical discussion of the difficulties of their use.

  6. Sparse Reduced-Rank Regression for Simultaneous Dimension Reduction and Variable Selection

    KAUST Repository

    Chen, Lisha

    2012-12-01

    The reduced-rank regression is an effective method in predicting multiple response variables from the same set of predictor variables. It reduces the number of model parameters and takes advantage of interrelations between the response variables and hence improves predictive accuracy. We propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty. We apply a group-lasso type penalty that treats each row of the matrix of the regression coefficients as a group and show that this penalty satisfies certain desirable invariance properties. We develop two numerical algorithms to solve the penalized regression problem and establish the asymptotic consistency of the proposed method. In particular, the manifold structure of the reduced-rank regression coefficient matrix is considered and studied in our theoretical analysis. In our simulation study and real data analysis, the new method is compared with several existing variable selection methods for multivariate regression and exhibits competitive performance in prediction and variable selection. © 2012 American Statistical Association.
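    One way to write the penalized problem described in this abstract is sketched below. The notation is an assumption made for illustration: Y is the n × q response matrix, X the n × p predictor matrix, C the p × q coefficient matrix with i-th row C_{i·}, r the rank bound and λ ≥ 0 the penalty weight; the paper's own formulation may differ in scaling or parametrization.

```latex
\min_{C \in \mathbb{R}^{p \times q},\ \operatorname{rank}(C) \le r}
\; \lVert Y - XC \rVert_F^2 \;+\; \lambda \sum_{i=1}^{p} \lVert C_{i\cdot} \rVert_2
```

    Because the Euclidean norm of a whole row is penalized, a row is either kept or set to zero as a unit, so a predictor with a zero row is removed from the model for all response variables at once.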

  7. Combining epidemiologic and biostatistical tools to enhance variable selection in HIV cohort analyses.

    Directory of Open Access Journals (Sweden)

    Christopher Rentsch

    Full Text Available BACKGROUND: Variable selection is an important step in building a multivariate regression model for which several methods and statistical packages are available. A comprehensive approach for variable selection in complex multivariate regression analyses within HIV cohorts is explored by utilizing both epidemiological and biostatistical procedures. METHODS: Three different methods for variable selection were illustrated in a study comparing survival time between subjects in the Department of Defense's National History Study and the Atlanta Veterans Affairs Medical Center's HIV Atlanta VA Cohort Study. The first two methods were stepwise selection procedures, based either on significance tests (Score test), or on information theory (Akaike Information Criterion), while the third method employed a Bayesian argument (Bayesian Model Averaging). RESULTS: All three methods resulted in a similar parsimonious survival model. Three of the covariates previously used in the multivariate model were not included in the final model suggested by the three approaches. When comparing the parsimonious model to the previously published model, there was evidence of less variance in the main survival estimates. CONCLUSIONS: The variable selection approaches considered in this study allowed building a model based on significance tests, on an information criterion, and on averaging models using their posterior probabilities. A parsimonious model that balanced these three approaches was found to provide a better fit than the previously reported model.

  8. Chemometrics: A new scenario in herbal drug standardization

    Directory of Open Access Journals (Sweden)

    Ankit Bansal

    2014-08-01

    Full Text Available Chromatography and spectroscopy techniques are the most commonly used methods in the standardization of herbal medicines, but herbal systems are not easy to analyze because of the complexity of their chemical composition. Many cutting-edge analytical technologies have been introduced to evaluate the quality of medicinal plants, and a significant amount of measurement data has been produced. Chemometric techniques provide a good opportunity for mining more useful chemical information from the original data. The application of chemometrics in the field of medicinal plants is therefore natural and necessary. Comprehensive methods and hyphenated techniques associated with chemometrics, used for extracting useful information and supplying various methods of data processing, are now more and more widely used in medicinal plants, among which chemometrics resolution methods and principal component analysis (PCA) are the most commonly used techniques. This review focuses on recent important analytical techniques, important chemometrics tools and interpretation of results by PCA, and applications of chemometrics in the quality evaluation of medicinal plants with respect to authenticity, efficacy and consistency. Key words: Chemometrics, HELP, Herbal drugs, PCA, OPA

  9. Current Debates on Variability in Child Welfare Decision-Making: A Selected Literature Review

    Directory of Open Access Journals (Sweden)

    Emily Keddell

    2014-11-01

    Full Text Available This article considers selected drivers of decision variability in child welfare decision-making and explores current debates in relation to these drivers. Covering the related influences of national orientation, risk and responsibility, inequality and poverty, evidence-based practice, constructions of abuse and its causes, domestic violence and cognitive processes, it discusses the literature with regard to how each of these influences decision variability. It situates these debates in relation to the ethical issue of variability and the equity issues that variability raises. I propose that, despite the ecological complexity that drives decision variability, improving internal (within-country) decision consistency is still a valid goal. It may be that the use of annotated case examples, kind learning systems, and continued commitments to the social justice issues of inequality and individualisation can contribute to this goal.

  10. EFFECT OF CORE TRAINING ON SELECTED HEMATOLOGICAL VARIABLES AMONG BASKETBALL PLAYERS

    OpenAIRE

    K. Rejinadevi; Dr. C. Ramesh

    2017-01-01

    The purpose of the study was to find out the effect of core training on selected haematological variables among basketball players. For the purpose of the study, forty male basketball players were selected at random as subjects from S.V.N College and Arul Anandar College, Madurai, Tamilnadu; their ages ranged from 18 to 25 years. The selected subjects were divided into two groups of twenty subjects each. Group I acted as the core training group and Group II acted as the control group. The experimenta...

  11. Sparse Reduced-Rank Regression for Simultaneous Dimension Reduction and Variable Selection

    KAUST Repository

    Chen, Lisha; Huang, Jianhua Z.

    2012-01-01

    and hence improves predictive accuracy. We propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty. We apply a group-lasso type penalty that treats each row of the matrix of the regression coefficients as a group

  12. Meta-Statistics for Variable Selection: The R Package BioMark

    Directory of Open Access Journals (Sweden)

    Ron Wehrens

    2012-11-01

    Full Text Available Biomarker identification is an ever more important topic in the life sciences. With the advent of measurement methodologies based on microarrays and mass spectrometry, thousands of variables are routinely being measured on complex biological samples. Often, the question is what makes two groups of samples different. Classical hypothesis testing suffers from the multiple testing problem; however, correcting for this often leads to a lack of power. In addition, choosing α cutoff levels remains somewhat arbitrary. Also in a regression context, a model depending on few but relevant variables will be more accurate and precise, and easier to interpret biologically. We propose an R package, BioMark, implementing two meta-statistics for variable selection. The first, higher criticism, presents a data-dependent selection threshold for significance, instead of a cookbook value of α = 0.05. It is applicable in all cases where two groups are compared. The second, stability selection, is more general, and can also be applied in a regression context. This approach uses repeated subsampling of the data in order to assess the variability of the model coefficients and selects those that remain consistently important. It is shown using experimental spike-in data from the field of metabolomics that both approaches work well with real data. BioMark also contains functionality for simulating data with specific characteristics for algorithm development and testing.
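    The idea behind stability selection (repeated subsampling, keeping variables that remain consistently important) can be sketched in plain Python. This is not the BioMark implementation or its API: the Welch t-statistic as importance score, the parameter names (`n_rounds`, `frac`, `top_k`, `min_freq`) and the simulated data are all assumptions chosen for illustration.

```python
import random
import statistics

def t_stat(a, b):
    """Welch t-statistic between two samples (used here as the importance score)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0

def stability_selection(group_a, group_b, n_rounds=200, frac=0.7,
                        top_k=2, min_freq=0.8, seed=0):
    """Count how often each variable ranks in the top_k across random subsamples."""
    rng = random.Random(seed)
    n_vars = len(group_a[0])
    counts = [0] * n_vars
    for _ in range(n_rounds):
        sub_a = rng.sample(group_a, max(2, int(frac * len(group_a))))
        sub_b = rng.sample(group_b, max(2, int(frac * len(group_b))))
        scores = [
            abs(t_stat([row[j] for row in sub_a], [row[j] for row in sub_b]))
            for j in range(n_vars)
        ]
        top = sorted(range(n_vars), key=lambda j: scores[j], reverse=True)[:top_k]
        for j in top:
            counts[j] += 1
    freqs = [c / n_rounds for c in counts]
    return [j for j, f in enumerate(freqs) if f >= min_freq], freqs

def simulate(n, n_vars, shift, shifted, rng):
    """Gaussian noise, with the variables in `shifted` moved by `shift` (a toy spike-in)."""
    return [
        [rng.gauss(shift if j in shifted else 0.0, 1.0) for j in range(n_vars)]
        for _ in range(n)
    ]

rng = random.Random(42)
controls = simulate(20, 10, 0.0, set(), rng)
treated = simulate(20, 10, 4.0, {0, 1}, rng)  # variables 0 and 1 carry the signal

selected, freqs = stability_selection(controls, treated)
print(selected, freqs)
```

    Selection here depends on how often a variable is among the strongest across subsamples rather than on a fixed α cutoff applied once to the full data.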

  13. A Robust Supervised Variable Selection for Noisy High-Dimensional Data

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan; Schlenker, Anna

    2015-01-01

    Vol. 2015, Article 320385 (2015), pp. 1-10; ISSN 2314-6133; R&D Projects: GA ČR GA13-17187S; Institutional support: RVO:67985807; Keywords: dimensionality reduction * variable selection * robustness; Subject RIV: BA - General Mathematics; Impact factor: 2.134, year: 2015

  14. Automatic variable selection method and a comparison for quantitative analysis in laser-induced breakdown spectroscopy

    Science.gov (United States)

    Duan, Fajie; Fu, Xiao; Jiang, Jiajia; Huang, Tingting; Ma, Ling; Zhang, Cong

    2018-05-01

    In this work, an automatic variable selection method for quantitative analysis of soil samples using laser-induced breakdown spectroscopy (LIBS) is proposed, which is based on full spectrum correction (FSC) and modified iterative predictor weighting-partial least squares (mIPW-PLS). The method features automatic selection without manual intervention. To illustrate the feasibility and effectiveness of the method, a comparison with the genetic algorithm (GA) and the successive projections algorithm (SPA) was carried out for the detection of different elements (copper, barium and chromium) in soil. The experimental results showed that all three methods could accomplish variable selection effectively, among which FSC-mIPW-PLS required significantly shorter computation time (approximately 12 s for 40,000 initial variables) than the others. Moreover, improved quantification models were obtained with the variable selection approaches. The root mean square errors of prediction (RMSEP) of models utilizing the new method were 27.47 (copper), 37.15 (barium) and 39.70 (chromium) mg/kg, which is comparable to the prediction performance of GA and SPA.

  15. Sparse supervised principal component analysis (SSPCA) for dimension reduction and variable selection

    DEFF Research Database (Denmark)

    Sharifzadeh, Sara; Ghodsi, Ali; Clemmensen, Line H.

    2017-01-01

    Principal component analysis (PCA) is one of the main unsupervised pre-processing methods for dimension reduction. When the training labels are available, it is worth using a supervised PCA strategy. In cases that both dimension reduction and variable selection are required, sparse PCA (SPCA...

  16. Cataclysmic variables from a ROSAT/2MASS selection - I. Four new intermediate polars

    NARCIS (Netherlands)

    Gänsicke, B.T.; Marsh, T.R.; Edge, A.; Rodríguez-Gil, P.; Steeghs, D.; Araujo-Betancor, S.; Harlaftis, E.; Giannakis, O.; Pyrzas, S.; Morales-Rueda, L.; Aungwerojwit, A.

    2005-01-01

    We report the first results from a new search for cataclysmic variables (CVs) using a combined X-ray (ROSAT)/infrared (2MASS) target selection that discriminates against background active galactic nuclei. Identification spectra were obtained at the Isaac Newton Telescope for a total of 174 targets,

  17. A QSAR Study of Environmental Estrogens Based on a Novel Variable Selection Method

    Directory of Open Access Journals (Sweden)

    Aiqian Zhang

    2012-05-01

    Full Text Available A large number of descriptors were employed to characterize the molecular structure of 53 natural, synthetic, and environmental chemicals which are suspected of disrupting endocrine functions by mimicking or antagonizing natural hormones and may thus pose a serious threat to the health of humans and wildlife. In this work, a robust quantitative structure-activity relationship (QSAR) model with a novel variable selection method has been proposed for the effective estrogens. The variable selection method is based on variable interaction (VSMVI) with leave-multiple-out cross validation (LMOCV) to select the best subset. During variable selection, model construction and assessment, the Organization for Economic Co-operation and Development (OECD) principles for regulation of QSAR acceptability were fully considered, such as using an unambiguous multiple-linear regression (MLR) algorithm to build the model, using several validation methods to assess the performance of the model, defining the applicability domain, and analyzing the outliers with the results of molecular docking. The performance of the QSAR model indicates that the VSMVI is an effective, feasible and practical tool for rapid screening of the best subset from a large set of molecular descriptors.

  18. Variable selection in the explorative analysis of several data blocks in metabolomics

    DEFF Research Database (Denmark)

    Karaman, İbrahim; Nørskov, Natalja; Yde, Christian Clement

    highly correlated data sets in one integrated approach. Due to the high number of variables in data sets from metabolomics (both raw data and after peak picking) the selection of important variables in an explorative analysis is difficult, especially when different data sets of metabolomics data need...... to be related. Tools for the handling of mental overflow minimising false discovery rates both by using statistical and biological validation in an integrative approach are needed. In this paper different strategies for variable selection were considered with respect to false discovery and the possibility...... for biological validation. The data set used in this study is metabolomics data from an animal intervention study. The aim of the metabolomics study was to investigate the metabolic profile in pigs fed various cereal fractions with special attention to the metabolism of lignans using NMR and LC-MS based...

  19. Multivariate fault isolation of batch processes via variable selection in partial least squares discriminant analysis.

    Science.gov (United States)

    Yan, Zhengbing; Kuang, Te-Hui; Yao, Yuan

    2017-09-01

    In recent years, multivariate statistical monitoring of batch processes has become a popular research topic, wherein multivariate fault isolation is an important step aiming at the identification of the faulty variables contributing most to the detected process abnormality. Although contribution plots have been commonly used in statistical fault isolation, such methods suffer from the smearing effect between correlated variables. In particular, in batch process monitoring, the high autocorrelations and cross-correlations that exist in variable trajectories make the smearing effect unavoidable. To address this problem, a variable selection-based fault isolation method is proposed in this research, which transforms the fault isolation problem into a variable selection problem in partial least squares discriminant analysis and solves it by calculating a sparse partial least squares model. Unlike traditional methods, the proposed method emphasizes the relative importance of each process variable. Such information may help process engineers in conducting root-cause diagnosis. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  20. Penalized regression procedures for variable selection in the potential outcomes framework.

    Science.gov (United States)

    Ghosh, Debashis; Zhu, Yeying; Coffman, Donna L

    2015-05-10

    A recent topic of much interest in causal inference is model selection. In this article, we describe a framework in which to consider penalized regression approaches to variable selection for causal effects. The framework leads to a simple 'impute, then select' class of procedures that is agnostic to the type of imputation algorithm as well as penalized regression used. It also clarifies how model selection involves a multivariate regression model for causal inference problems and that these methods can be applied for identifying subgroups in which treatment effects are homogeneous. Analogies and links with the literature on machine learning methods, missing data, and imputation are drawn. A difference least absolute shrinkage and selection operator algorithm is defined, along with its multiple imputation analogs. The procedures are illustrated using a well-known right-heart catheterization dataset. Copyright © 2015 John Wiley & Sons, Ltd.

  1. Not accounting for interindividual variability can mask habitat selection patterns: a case study on black bears.

    Science.gov (United States)

    Lesmerises, Rémi; St-Laurent, Martin-Hugues

    2017-11-01

    Habitat selection studies conducted at the population scale commonly aim to describe general patterns that could improve our understanding of the limiting factors in species-habitat relationships. Researchers often consider interindividual variation in selection patterns to control for its effects and avoid pseudoreplication by using mixed-effect models that include individuals as random factors. Here, we highlight common pitfalls and possible misinterpretations of this strategy by describing habitat selection of 21 black bears Ursus americanus. We used Bayesian mixed-effect models and compared results obtained when using random intercept (i.e., population level) versus calculating individual coefficients for each independent variable (i.e., individual level). We then related interindividual variability to individual characteristics (i.e., age, sex, reproductive status, body condition) in a multivariate analysis. The assumption of comparable behavior among individuals was verified only in 40% of the cases in our seasonal best models. Indeed, we found strong and opposite responses among sampled bears and individual coefficients were linked to individual characteristics. For some covariates, contrasted responses canceled each other out at the population level. In other cases, interindividual variability was concealed by the composition of our sample, with the majority of the bears (e.g., old individuals and bears in good physical condition) driving the population response (e.g., selection of young forest cuts). Our results stress the need to consider interindividual variability to avoid misinterpretation and uninformative results, especially for a flexible and opportunistic species. This study helps to identify some ecological drivers of interindividual variability in bear habitat selection patterns.
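    The cancelling-out effect described here can be illustrated with a deliberately simplified linear analogy (the study itself uses Bayesian mixed-effect habitat selection models, which are not reproduced). The two individuals `bear_a` and `bear_b`, the covariate values and the responses are hypothetical.

```python
def ols_slope(xs, ys):
    """Ordinary least squares slope of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical covariate (e.g. proportion of a habitat type, rescaled) and response.
covariate = [0.0, 1.0, 2.0, 3.0]
bear_a = [0.0, 1.0, 2.0, 3.0]   # response increases with the covariate
bear_b = [3.0, 2.0, 1.0, 0.0]   # response decreases with the covariate

slope_a = ols_slope(covariate, bear_a)
slope_b = ols_slope(covariate, bear_b)

# Pooling both individuals and fitting a single (population-level) slope.
slope_pooled = ols_slope(covariate + covariate, bear_a + bear_b)

print(slope_a, slope_b, slope_pooled)
```

    In this toy case each individual responds strongly, yet the pooled slope is zero, which is the kind of masking a population-level coefficient alone would not reveal.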

  2. Effects of environmental variables on invasive amphibian activity: Using model selection on quantiles for counts

    Science.gov (United States)

    Muller, Benjamin J.; Cade, Brian S.; Schwarzkopf, Lin

    2018-01-01

    Many different factors influence animal activity. Often, the value of an environmental variable may influence significantly the upper or lower tails of the activity distribution. For describing relationships with heterogeneous boundaries, quantile regressions predict a quantile of the conditional distribution of the dependent variable. A quantile count model extends linear quantile regression methods to discrete response variables, and is useful if activity is quantified by trapping, where there may be many tied (equal) values in the activity distribution, over a small range of discrete values. Additionally, different environmental variables in combination may have synergistic or antagonistic effects on activity, so examining their effects together, in a modeling framework, is a useful approach. Thus, model selection on quantile counts can be used to determine the relative importance of different variables in determining activity, across the entire distribution of capture results. We conducted model selection on quantile count models to describe the factors affecting activity (numbers of captures) of cane toads (Rhinella marina) in response to several environmental variables (humidity, temperature, rainfall, wind speed, and moon luminosity) over eleven months of trapping. Environmental effects on activity are understudied in this pest animal. In the dry season, model selection on quantile count models suggested that rainfall positively affected activity, especially near the lower tails of the activity distribution. In the wet season, wind speed limited activity near the maximum of the distribution, while minimum activity increased with minimum temperature. This statistical methodology allowed us to explore, in depth, how environmental factors influenced activity across the entire distribution, and is applicable to any survey or trapping regime, in which environmental variables affect activity.
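    Quantile regression rests on the asymmetric "pinball" (check) loss, whose minimizer is the chosen quantile rather than the mean. The sketch below shows only this building block for an intercept-only case; the count-specific extension used in the study for tied discrete values is not implemented. The function names and the capture counts are hypothetical.

```python
def pinball_loss(q, values, tau):
    """Check loss: under-predictions weighted by tau, over-predictions by (1 - tau)."""
    total = 0.0
    for y in values:
        if y >= q:
            total += tau * (y - q)
        else:
            total += (1.0 - tau) * (q - y)
    return total

def sample_quantile(values, tau):
    """Return the observed value that minimises the pinball loss for quantile tau."""
    return min(sorted(set(values)), key=lambda q: pinball_loss(q, values, tau))

# Hypothetical nightly capture counts.
captures = [0, 1, 1, 2, 3, 5, 8, 13, 21]

median_activity = sample_quantile(captures, 0.5)  # centre of the distribution
upper_activity = sample_quantile(captures, 0.9)   # near the maximum of the distribution
print(median_activity, upper_activity)
```

    Replacing the constant `q` by a linear function of environmental covariates and minimizing the same loss gives a quantile regression, so different values of `tau` describe effects on the lower or upper tails of activity.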

  3. Uninformative variable elimination assisted by Gram-Schmidt Orthogonalization/successive projection algorithm for descriptor selection in QSAR

    DEFF Research Database (Denmark)

    Omidikia, Nematollah; Kompany-Zareh, Mohsen

    2013-01-01

    Employment of Uninformative Variable Elimination (UVE) as a robust variable selection method is reported in this study. Each regression coefficient represents the contribution of the corresponding variable in the established model, but in the presence of uninformative variables as well as collinearity, the reliability of the regression coefficient's magnitude is suspect. Successive Projection Algorithm (SPA) and Gram-Schmidt Orthogonalization (GSO) were implemented as pre-selection techniques for removing collinearity and redundancy among variables in the model. Uninformative variable elimination...
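    A minimal sketch of the usual UVE criterion, under stated assumptions: artificial noise variables are appended to the data, regression coefficients are collected over resampling runs (for example leave-one-out), and a real variable is kept only if the reliability of its coefficient (mean divided by standard deviation) exceeds, in absolute value, the largest reliability seen among the noise variables. The regression step is omitted; the coefficient table, the function name `uve_select` and the use of the maximum as cutoff are illustrative assumptions.

```python
import statistics

def uve_select(coef_runs, n_real):
    """Keep real variables whose |mean/std| of coefficients beats every noise variable.

    coef_runs: one row of coefficients per resampling run; the first n_real columns
    are real variables, the remaining columns are artificial noise variables.
    """
    n_cols = len(coef_runs[0])
    reliability = []
    for j in range(n_cols):
        column = [run[j] for run in coef_runs]
        reliability.append(statistics.mean(column) / statistics.stdev(column))
    cutoff = max(abs(c) for c in reliability[n_real:])
    kept = [j for j in range(n_real) if abs(reliability[j]) > cutoff]
    return kept, cutoff

# Hypothetical coefficients from four resampling runs: 3 real + 2 noise variables.
coef_runs = [
    [1.00,  0.05, -0.80,  0.02, -0.03],
    [1.10, -0.04, -0.75, -0.03,  0.02],
    [0.95,  0.02, -0.85,  0.01,  0.04],
    [1.05, -0.03, -0.78, -0.02, -0.01],
]

kept, cutoff = uve_select(coef_runs, n_real=3)
print(kept, round(cutoff, 3))
```

    Variable 1 has coefficients that fluctuate around zero, so its reliability is no better than that of pure noise and it is eliminated, while variables 0 and 2 are retained.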

  4. Current application of chemometrics in traditional Chinese herbal medicine research.

    Science.gov (United States)

    Huang, Yipeng; Wu, Zhenwei; Su, Rihui; Ruan, Guihua; Du, Fuyou; Li, Gongke

    2016-07-15

    Traditional Chinese herbal medicines (TCHMs) are a promising approach for the treatment of various diseases and have attracted increasing attention all over the world. Chemometric methods are very useful tools in the quality control of TCHMs, harnessing mathematics, statistics and other methods to extract the maximum information from the data obtained by various analytical approaches. This feature article focuses on recent studies that evaluate the pharmacological efficacy and quality of TCHMs by determining, identifying and discriminating the bioactive or marker components in different samples with the help of chemometric techniques. In this work, the application of chemometric techniques in the classification of TCHMs based on their efficacy and usage is introduced. The recent advances in chemometrics applied to the chemical analysis of TCHMs are reviewed in detail. Copyright © 2015 Elsevier B.V. All rights reserved.

  5. Effects of carprofen or meloxicam on selected haemostatic variables in miniature pigs after orthopaedic surgery

    Directory of Open Access Journals (Sweden)

    Petr Raušer

    2011-01-01

    The aim of the study was to detect and compare haemostatic variables and bleeding after 7-day administration of carprofen or meloxicam in clinically healthy miniature pigs. Twenty-one clinically healthy Göttingen miniature pigs were divided into 3 groups. Selected haemostatic variables (platelet count, prothrombin time, activated partial thromboplastin time, thrombin time, fibrinogen), serum biochemical variables (total protein, bilirubin, urea, creatinine, alkaline phosphatase, alanine aminotransferase and gamma-glutamyltransferase), haemoglobin, haematocrit, red blood cells, white blood cells and buccal mucosal bleeding time were assessed before and 7 days after daily intramuscular administration of saline (1.5 ml per animal, control group), carprofen (2 mg·kg⁻¹) or meloxicam (0.1 mg·kg⁻¹). In pigs receiving carprofen or meloxicam, the thrombin time was significantly increased (p < 0.05) compared to the control group. Significant differences were not detected in other haemostatic or biochemical variables or in bleeding time compared to other groups or to the pretreatment values. Intramuscular administration of carprofen or meloxicam in healthy miniature pigs for 7 days causes sporadic, but not clinically important, changes of selected haemostatic variables. Therefore, we can recommend them for perioperative use, e.g. for their analgesic effects, in orthopaedic or other surgical procedures without increased bleeding.

  6. Chemometric Analysis of High Molecular Mass Glutenin Subunits and Image Data of Bread Crumb Structure from Croatian Wheat Cultivars

    OpenAIRE

    Zorica Jurković; Rezica Sudar; Damir Magdić; Daniela Horvat; Želimir Kurtanjek

    2002-01-01

    The aim of this work is to investigate functional relationships among wheat properties, high molecular mass (weight) (HMW) glutenin subunits and bread quality produced from eleven Croatian wheat cultivars by chemometric analysis. HMW glutenin subunits were fractionated by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and subsequently analysed by scanning densitometry in order to quantify HMW glutenin fractions. Wheat properties are characterised by four variables: protein...

  7. A New Variable Selection Method Based on Mutual Information Maximization by Replacing Collinear Variables for Nonlinear Quantitative Structure-Property Relationship Models

    Energy Technology Data Exchange (ETDEWEB)

    Ghasemi, Jahan B.; Zolfonoun, Ehsan [Toosi University of Technology, Tehran (Korea, Republic of)]

    2012-05-15

    Selection of the most informative molecular descriptors from the original data set is a key step for development of quantitative structure activity/property relationship models. Recently, mutual information (MI) has gained increasing attention in feature selection problems. This paper presents an effective mutual information-based feature selection approach, named mutual information maximization by replacing collinear variables (MIMRCV), for nonlinear quantitative structure-property relationship models. The proposed variable selection method was applied to three different QSPR datasets: soil degradation half-life of 47 organophosphorus pesticides, GC-MS retention times of 85 volatile organic compounds, and water-to-micellar cetyltrimethylammonium bromide partition coefficients of 62 organic compounds. The obtained results revealed that using MIMRCV as the feature selection method improves the predictive quality of the developed models compared to conventional MI-based variable selection algorithms.
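
    The general idea in this abstract (rank descriptors by mutual information with the response, and avoid keeping collinear ones) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' MIMRCV algorithm: the equal-frequency binning, the Pearson correlation threshold, and every function name and parameter below are hypothetical choices made for the example.

```python
import math
from collections import Counter


def equal_frequency_bins(values, n_bins=3):
    """Map each value to a bin index so that bins hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = min(rank * n_bins // len(values), n_bins - 1)
    return bins


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (pa[x] * pb[y]))
        for (x, y), c in pab.items()
    )


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    su = math.sqrt(sum((x - mu) ** 2 for x in u))
    sv = math.sqrt(sum((y - mv) ** 2 for y in v))
    return cov / (su * sv) if su and sv else 0.0


def select_by_mi(columns, target, k=2, max_abs_corr=0.9, n_bins=3):
    """Pick up to k columns by MI with the target, skipping collinear ones.

    Assumption: a candidate counts as collinear when its absolute Pearson
    correlation with an already selected column reaches max_abs_corr.
    """
    y = equal_frequency_bins(target, n_bins)
    scores = {
        name: mutual_information(equal_frequency_bins(col, n_bins), y)
        for name, col in columns.items()
    }
    selected = []
    for name in sorted(scores, key=scores.get, reverse=True):
        if all(abs(pearson(columns[name], columns[s])) < max_abs_corr
               for s in selected):
            selected.append(name)
        if len(selected) == k:
            break
    return selected


# Toy data (invented for illustration): x2 is nearly collinear with x1.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
columns = {
    "x1": [2, 4, 6, 8, 10, 12, 14, 16, 18],
    "x2": [2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 15.8, 18.1],
    "x3": [16, 9, 4, 1, 0, 1, 4, 9, 16],
    "x4": [5, 1, 9, 4, 2, 8, 6, 3, 7],
}
print(select_by_mi(columns, target, k=2))
```

    In this toy run the near-duplicate descriptor x2 is passed over in favour of x3, which carries less information individually but is not redundant with x1. Histogram-based MI estimates are sensitive to the number of bins, so the bin count here is a tuning assumption rather than a recommendation.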

  8. A New Variable Selection Method Based on Mutual Information Maximization by Replacing Collinear Variables for Nonlinear Quantitative Structure-Property Relationship Models

    International Nuclear Information System (INIS)

    Ghasemi, Jahan B.; Zolfonoun, Ehsan

    2012-01-01

    Selection of the most informative molecular descriptors from the original data set is a key step for development of quantitative structure activity/property relationship models. Recently, mutual information (MI) has gained increasing attention in feature selection problems. This paper presents an effective mutual information-based feature selection approach, named mutual information maximization by replacing collinear variables (MIMRCV), for nonlinear quantitative structure-property relationship models. The proposed variable selection method was applied to three different QSPR datasets: soil degradation half-life of 47 organophosphorus pesticides, GC-MS retention times of 85 volatile organic compounds, and water-to-micellar cetyltrimethylammonium bromide partition coefficients of 62 organic compounds. The obtained results revealed that using MIMRCV as the feature selection method improves the predictive quality of the developed models compared to conventional MI-based variable selection algorithms.

  9. Input variable selection for data-driven models of Coriolis flowmeters for two-phase flow measurement

    International Nuclear Information System (INIS)

    Wang, Lijuan; Yan, Yong; Wang, Xue; Wang, Tao

    2017-01-01

    Input variable selection is an essential step in the development of data-driven models for environmental, biological and industrial applications. Through input variable selection to eliminate the irrelevant or redundant variables, a suitable subset of variables is identified as the input of a model. Meanwhile, through input variable selection the complexity of the model structure is simplified and the computational efficiency is improved. This paper describes the procedures of the input variable selection for the data-driven models for the measurement of liquid mass flowrate and gas volume fraction under two-phase flow conditions using Coriolis flowmeters. Three advanced input variable selection methods, including partial mutual information (PMI), genetic algorithm-artificial neural network (GA-ANN) and tree-based iterative input selection (IIS), are applied in this study. Typical data-driven models incorporating support vector machine (SVM) are established individually based on the input candidates resulting from the selection methods. The validity of the selection outcomes is assessed through an output performance comparison of the SVM-based data-driven models and sensitivity analysis. The validation and analysis results suggest that the input variables selected by the PMI algorithm provide more effective information for the models to measure liquid mass flowrate, while the IIS algorithm provides fewer but more effective variables for the models to predict gas volume fraction.

  10. Calibration Variable Selection and Natural Zero Determination for Semispan and Canard Balances

    Science.gov (United States)

    Ulbrich, Norbert M.

    2013-01-01

    Independent calibration variables for the characterization of semispan and canard wind tunnel balances are discussed. It is shown that the variable selection for a semispan balance is determined by the location of the resultant normal and axial forces that act on the balance. These two forces are the first and second calibration variables. The pitching moment becomes the third calibration variable after the normal and axial forces are shifted to the pitch axis of the balance. Two geometric distances, i.e., the rolling and yawing moment arms, are the fourth and fifth calibration variables. They are traditionally substituted by corresponding moments to simplify the use of calibration data during a wind tunnel test. A canard balance is related to a semispan balance. It also only measures loads on one half of a lifting surface. However, the axial force and yawing moment are of no interest to users of a canard balance. Therefore, its calibration variable set is reduced to the normal force, pitching moment, and rolling moment. The combined load diagrams of the rolling and yawing moment for a semispan balance are discussed. They may be used to illustrate connections between the wind tunnel model geometry, the test section size, and the calibration load schedule. Then, methods are reviewed that may be used to obtain the natural zeros of a semispan or canard balance. In addition, characteristics of three semispan balance calibration rigs are discussed. Finally, basic requirements for a full characterization of a semispan balance are reviewed.

  11. The Selection, Use, and Reporting of Control Variables in International Business Research

    DEFF Research Database (Denmark)

    Nielsen, Bo Bernhard; Raswant, Arpit

    2018-01-01

    This study explores the selection, use, and reporting of control variables in studies published in the leading international business (IB) research journals. We review a sample of 246 empirical studies published in the top five IB journals over the period 2012–2015 with particular emphasis on selection, use, and reporting of controls. Approximately 83% of studies included only half of what we consider Minimum Standard of Practice with regard to controls, whereas only 38% of the studies met the 75% threshold. We provide recommendations on how to effectively identify, use and report controls...

  12. Joint Bayesian variable and graph selection for regression models with network-structured predictors

    Science.gov (United States)

    Peterson, C. B.; Stingo, F. C.; Vannucci, M.

    2015-01-01

    In this work, we develop a Bayesian approach to perform selection of predictors that are linked within a network. We achieve this by combining a sparse regression model relating the predictors to a response variable with a graphical model describing conditional dependencies among the predictors. The proposed method is well-suited for genomic applications since it allows the identification of pathways of functionally related genes or proteins which impact an outcome of interest. In contrast to previous approaches for network-guided variable selection, we infer the network among predictors using a Gaussian graphical model and do not assume that network information is available a priori. We demonstrate that our method outperforms existing methods in identifying network-structured predictors in simulation settings, and illustrate our proposed model with an application to inference of proteins relevant to glioblastoma survival. PMID:26514925

  13. Demographic Variables and Selective, Sustained Attention and Planning through Cognitive Tasks among Healthy Adults

    Directory of Open Access Journals (Sweden)

    Afsaneh Zarghi

    2011-04-01

    Introduction: Cognitive tasks are considered applicable and appropriate for assessing cognitive domains. The purpose of our study is to determine whether a relationship exists between the variables of age, sex and education and selective attention, sustained attention and planning abilities, as measured by computerized cognitive tasks among healthy adults. Methods: A cross-sectional study was carried out over 6 months, from June to November 2010, on 84 healthy adults (42 male and 42 female). All participants performed computerized CPT, STROOP and TOL tests after giving consent and being trained. Results: The obtained data indicate that there is a significant correlation between the age, sex and education variables (p<0.05). Discussion: The above-mentioned tests can be used to assess selective, sustained attention and planning.

  14. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Directory of Open Access Journals (Sweden)

    C. Fernandez-Lozano

    2013-01-01

    Given the background of the use of Neural Networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: Support Vector Machines (SVM). A hybrid model that combines genetic algorithms and support vector machines is therefore suggested, in which the SVM serves as the fitness function of the Genetic Algorithm (GA) so that the most representative variables for a specific classification problem can be selected.

  15. A Time-Series Water Level Forecasting Model Based on Imputation and Variable Selection Method

    OpenAIRE

    Jun-He Yang; Ching-Hsue Cheng; Chia-Pan Chan

    2017-01-01

    Reservoirs are important for households and impact the national economy. This paper proposed a time-series forecasting model based on estimating a missing value followed by variable selection to forecast the reservoir's water level. This study collected data from the Taiwan Shimen Reservoir as well as daily atmospheric data from 2008 to 2015. The two datasets are concatenated into an integrated dataset based on ordering of the data as a research dataset. The proposed time-series forecasting m...

  16. Demographic Variables and Selective, Sustained Attention and Planning through Cognitive Tasks among Healthy Adults

    OpenAIRE

    Afsaneh Zarghi; Zali, A; Tehranidost, M; Mohammad Reza Zarindast; Ashrafi, F; Doroodgar; Khodadadi

    2011-01-01

    Introduction: Cognitive tasks are considered applicable and appropriate for assessing cognitive domains. The purpose of our study is to determine whether a relationship exists between the variables of age, sex and education and selective attention, sustained attention and planning abilities, as measured by computerized cognitive tasks among healthy adults. Methods: A cross-sectional study was carried out over 6 months, from June to November 2010, on 84 healthy adults (42 male and 42 female). The whole part...

  17. gamboostLSS: An R Package for Model Building and Variable Selection in the GAMLSS Framework

    OpenAIRE

    Hofner, Benjamin; Mayr, Andreas; Schmid, Matthias

    2014-01-01

    Generalized additive models for location, scale and shape are a flexible class of regression models that allow multiple parameters of a distribution function, such as the mean and the standard deviation, to be modeled simultaneously. With the R package gamboostLSS, we provide a boosting method to fit these models. Variable selection and model choice are naturally available within this regularized regression framework. To introduce and illustrate the R package gamboostLSS and its infrastructure, we...

  18. Data re-arranging techniques leading to proper variable selections in high energy physics

    Science.gov (United States)

    Kůs, Václav; Bouř, Petr

    2017-12-01

    We introduce a new data-based approach to homogeneity testing and variable selection carried out in high energy physics experiments, where one of the basic tasks is to test the homogeneity of weighted samples, mainly the Monte Carlo simulations (weighted) and real data measurements (unweighted). This technique is called ‘data re-arranging’ and it enables variable selection performed by means of the classical statistical homogeneity tests such as Kolmogorov-Smirnov, Anderson-Darling, or Pearson’s chi-square divergence test. P-values of our variants of homogeneity tests are investigated, and the empirical verification through 46-dimensional high energy particle physics data sets is accomplished under newly proposed (equiprobable) quantile binning. In particular, the procedure of homogeneity testing is applied to re-arranged Monte Carlo samples and real data sets measured at the Tevatron particle accelerator at Fermilab by the DØ experiment, originating from top-antitop quark pair production in two decay channels (electron, muon) with 2, 3, or 4+ jets detected. Finally, the variable selections in the electron and muon channels induced by the re-arranging procedure for homogeneity testing are provided for Tevatron top-antitop quark data sets.
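
    To make the notion of testing homogeneity between a weighted and an unweighted sample concrete, the sketch below computes a two-sample Kolmogorov-Smirnov distance between weighted empirical distribution functions. It is only an illustration of the classical statistic with per-event weights; it is not the 'data re-arranging' procedure of the paper, it does not compute p-values, and the function name and arguments are hypothetical.

```python
def weighted_ks_statistic(x, wx, y, wy):
    """Largest gap between the weighted empirical CDFs of two samples.

    x, y   : observed values of the two samples
    wx, wy : non-negative weights (use 1.0 for every unweighted event)
    """
    sx, sy = float(sum(wx)), float(sum(wy))
    # Each event carries its normalised weight for exactly one of the CDFs.
    events = sorted(
        [(v, w / sx, 0.0) for v, w in zip(x, wx)]
        + [(v, 0.0, w / sy) for v, w in zip(y, wy)]
    )
    fx = fy = distance = 0.0
    i = 0
    while i < len(events):
        value = events[i][0]
        # Absorb every observation tied at this value before comparing.
        while i < len(events) and events[i][0] == value:
            fx += events[i][1]
            fy += events[i][2]
            i += 1
        distance = max(distance, abs(fx - fy))
    return distance


# Weighted "simulation" against unweighted "data" (numbers are invented).
simulation = [1.0, 2.0]
sim_weights = [3.0, 1.0]
data = [1.0, 2.0]
data_weights = [1.0, 1.0]
print(weighted_ks_statistic(simulation, sim_weights, data, data_weights))
```

    A variable whose simulated and measured distributions show a large distance would be a candidate for exclusion or further study; turning the distance into a significance level for weighted samples needs extra care, which is the kind of problem the abstract addresses.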

  19. Improving breast cancer classification with mammography, supported on an appropriate variable selection analysis

    Science.gov (United States)

    Pérez, Noel; Guevara, Miguel A.; Silva, Augusto

    2013-02-01

    This work addresses the issue of variable selection within the context of breast cancer classification with mammography. A comprehensive repository of feature vectors was used, including a hybrid subset gathering image-based and clinical features. The aim was to gather experimental evidence on variable selection in terms of cardinality and type, and to find a classification scheme that provides the best performance over the Area Under Receiver Operating Characteristics Curve (AUC) scores using the ranked feature subset. We evaluated and classified a total of 300 subsets of features formed by the application of Chi-Square Discretization, Information-Gain, One-Rule and RELIEF methods in association with Feed-Forward Backpropagation Neural Network (FFBP), Support Vector Machine (SVM) and Decision Tree J48 (DTJ48) Machine Learning Algorithms (MLA) for a comparative performance evaluation based on AUC scores. A variable selection analysis was performed for Single-View Ranking and Multi-View Ranking groups of features. Feature subsets representing Microcalcifications (MCs), Masses and both MCs and Masses lesions achieved AUC scores of 0.91, 0.954 and 0.934 respectively. Experimental evidence demonstrated that classification performance was improved by combining image-based and clinical features. The most important clinical and image-based features were StromaDistortion and Circularity respectively. Other features, less important but worth using because of their consistency, were Contrast, Perimeter, Microcalcification, Correlation and Elongation.

  20. The XRF spectrometer and the selection of analysis conditions (instrumental variables)

    International Nuclear Information System (INIS)

    Willis, J.P.

    2002-01-01

    This presentation will begin with a brief discussion of EDXRF and flat- and curved-crystal WDXRF spectrometers, contrasting the major differences between the three types. The remainder of the presentation will contain a detailed overview of the choice and settings of the many instrumental variables contained in a modern WDXRF spectrometer, and will discuss critically the choices facing the analyst in setting up a WDXRF spectrometer for different elements and applications. In particular it will discuss the choice of tube target (when a choice is possible), the kV and mA settings, tube filters, collimator masks, collimators, analyzing crystals, secondary collimators, detectors, pulse height selection, X-ray path medium (air, nitrogen, vacuum or helium), counting times for peak and background positions and their effect on counting statistics and lower limit of detection (LLD). The use of Figure of Merit (FOM) calculations to objectively choose the best combination of instrumental variables also will be discussed. This presentation will be followed by a shorter session on a subsequent day entitled "A Selection of XRF Conditions - Practical Session", where participants will be given the opportunity to discuss in groups the selection of the best instrumental variables for three very diverse applications. Copyright (2002) Australian X-ray Analytical Association Inc

  1. Selected Macroeconomic Variables and Stock Market Movements: Empirical evidence from Thailand

    Directory of Open Access Journals (Sweden)

    Joseph Ato Forson

    2014-06-01

    This paper investigates and analyzes the long-run equilibrium relationship between the Thai Stock Exchange Index (SETI) and selected macroeconomic variables using monthly time series data that cover a 20-year period from January 1990 to December 2009. The following macroeconomic variables are included in our analysis: money supply (MS), the consumer price index (CPI), interest rate (IR) and the industrial production index (IP) (as a proxy for GDP). Our findings show that the SET Index and the selected macroeconomic variables are cointegrated at I(1) and have a significant equilibrium relationship over the long run. Money supply demonstrates a strong positive relationship with the SET Index over the long run, whereas the industrial production index and consumer price index show negative long-run relationships with the SET Index. Furthermore, in non-equilibrium situations, the error correction mechanism suggests that the consumer price index, industrial production index and money supply each contribute in some way to restore equilibrium. In addition, using Toda and Yamamoto’s augmented Granger causality test, we identify a bi-causal relationship between industrial production and money supply and unilateral causal relationships between CPI and IR, IP and CPI, MS and CPI, and IP and SETI, indicating that all of these variables are sensitive to Thai stock market movements. The policy implications of these findings are also discussed.

  2. Variable selection based near infrared spectroscopy quantitative and qualitative analysis on wheat wet gluten

    Science.gov (United States)

    Lü, Chengxu; Jiang, Xunpeng; Zhou, Xingfan; Zhang, Yinqiao; Zhang, Naiqian; Wei, Chongfeng; Mao, Wenhua

    2017-10-01

    Wet gluten is a useful quality indicator for wheat, and short wave near infrared spectroscopy (NIRS) is a high performance technique with the advantages of being economical, rapid and nondestructive. To study the feasibility of using short wave NIRS to analyze wet gluten directly from wheat seed, 54 representative wheat seed samples were collected and scanned by spectrometer. Eight spectral pretreatment methods and the genetic algorithm (GA) variable selection method were used to optimize the analysis. Both quantitative and qualitative models of wet gluten were built by partial least squares regression and discriminant analysis. For quantitative analysis, normalization is the optimal pretreatment method, 17 wet gluten sensitive variables are selected by GA, and the GA model performs better than the all-variable model, with R2V=0.88 and RMSEV=1.47. For qualitative analysis, automatic weighted least squares baseline is the optimal pretreatment method, and the all-variable models perform better than the GA models. The correct classification rates of the 3 classes of 30% wet gluten content are 95.45, 84.52, and 90.00%, respectively. The short wave NIRS technique shows potential for both quantitative and qualitative analysis of wet gluten for wheat seed.

  3. The Use of Variable Q1 Isolation Windows Improves Selectivity in LC-SWATH-MS Acquisition.

    Science.gov (United States)

    Zhang, Ying; Bilbao, Aivett; Bruderer, Tobias; Luban, Jeremy; Strambio-De-Castillia, Caterina; Lisacek, Frédérique; Hopfgartner, Gérard; Varesio, Emmanuel

    2015-10-02

    As tryptic peptides and metabolites are not equally distributed along the mass range, the probability of cross fragment ion interference is higher in certain windows when fixed Q1 SWATH windows are applied. We evaluated the benefits of utilizing variable Q1 SWATH windows with regards to selectivity improvement. Variable windows based on equalizing the distribution of either the precursor ion population (PIP) or the total ion current (TIC) within each window were generated by an in-house software, swathTUNER. These two variable Q1 SWATH window strategies outperformed, with respect to quantification and identification, the basic approach using a fixed window width (FIX) for proteomic profiling of human monocyte-derived dendritic cells (MDDCs). Thus, 13.8 and 8.4% additional peptide precursors, which resulted in 13.1 and 10.0% more proteins, were confidently identified by SWATH using the strategy PIP and TIC, respectively, in the MDDC proteomic sample. On the basis of the spectral library purity score, some improvement warranted by variable Q1 windows was also observed, albeit to a lesser extent, in the metabolomic profiling of human urine. We show that the novel concept of "scheduled SWATH" proposed here, which incorporates (i) variable isolation windows and (ii) precursor retention time segmentation further improves both peptide and metabolite identifications.
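
    The precursor ion population (PIP) strategy described above amounts to placing Q1 window boundaries at quantiles of the observed precursor m/z values, so that each window holds about the same number of precursors. The sketch below illustrates that idea only; it is not swathTUNER, the function names are hypothetical, and the m/z list is invented for the example. The TIC variant would weight each precursor by its intensity, which is not shown here.

```python
def fixed_windows(n_windows, mz_min, mz_max):
    """Equal-width Q1 windows covering [mz_min, mz_max]."""
    width = (mz_max - mz_min) / n_windows
    return [(mz_min + i * width, mz_min + (i + 1) * width)
            for i in range(n_windows)]


def variable_windows(precursor_mz, n_windows, mz_min, mz_max):
    """Q1 windows that each hold roughly the same number of precursors."""
    mz = sorted(m for m in precursor_mz if mz_min <= m <= mz_max)
    if n_windows < 1 or len(mz) < n_windows:
        raise ValueError("need at least one precursor per window")
    edges = [mz_min]
    for k in range(1, n_windows):
        cut = k * len(mz) // n_windows
        # Place the boundary halfway between neighbouring precursors.
        edges.append((mz[cut - 1] + mz[cut]) / 2)
    edges.append(mz_max)
    return list(zip(edges[:-1], edges[1:]))


def counts_per_window(precursor_mz, windows):
    """Count precursors per window; the last window includes its upper edge."""
    last = len(windows) - 1
    return [
        sum(1 for m in precursor_mz
            if lo <= m < hi or (i == last and m == hi))
        for i, (lo, hi) in enumerate(windows)
    ]


# Invented precursor list, crowded at the low end of the mass range.
precursors = [400, 410, 420, 430, 440, 450, 500, 600, 700, 800, 900, 1000]
fixed = fixed_windows(3, 400, 1000)
variable = variable_windows(precursors, 3, 400, 1000)
print(counts_per_window(precursors, fixed))     # uneven occupancy
print(counts_per_window(precursors, variable))  # balanced occupancy
```

    With fixed windows the crowded low-mass window collects most of the precursors, which is where cross fragment ion interference is most likely; the quantile-based windows become narrow where precursors are dense and wide where they are sparse. Repeated m/z values at a boundary can unbalance the counts slightly in this simple version.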

  4. Fingerprints for main varieties of Argentinean wines: terroir differentiation by inorganic, organic, and stable isotopic analyses coupled to chemometrics.

    Science.gov (United States)

    Di Paola-Naranjo, Romina D; Baroni, Maria V; Podio, Natalia S; Rubinstein, Hector R; Fabani, Maria P; Badini, Raul G; Inga, Marcela; Ostera, Hector A; Cagnoni, Mariana; Gallegos, Ernesto; Gautier, Eduardo; Peral-Garcia, Pilar; Hoogewerff, Jurian; Wunderlin, Daniel A

    2011-07-27

    Our main goal was to investigate if robust chemical fingerprints could be developed for three Argentinean red wines based on organic, inorganic, and isotopic patterns, in relation to the regional soil composition. Soils and wines from three regions (Mendoza, San Juan, and Córdoba) and three varieties (Cabernet Sauvignon, Malbec, and Syrah) were collected. The phenolic profile was determined by HPLC-MS/MS and multielemental composition by ICP-MS; ⁸⁷Sr/⁸⁶Sr and δ¹³C were determined by TIMS and IRMS, respectively. Chemometrics allowed robust differentiation between regions, wine varieties, and the same variety from different regions. Among phenolic compounds, resveratrol concentration was the most useful marker for wine differentiation, whereas Mg, K/Rb, Ca/Sr, and ⁸⁷Sr/⁸⁶Sr were the main inorganic and isotopic parameters selected. Generalized Procrustes analysis (GPA) using two studied matrices (wine and soil) shows consensus between them and clear differences between studied areas. Finally, we applied a canonical correlation analysis, demonstrating significant correlation (r = 0.99) between soil and wine composition. To our knowledge this is the first report combining independent variables, constructing a fingerprint including elemental composition, isotopic, and polyphenol patterns to differentiate wines, matching part of this fingerprint with the soil provenance.

  5. ACTIVE LEARNING TO OVERCOME SAMPLE SELECTION BIAS: APPLICATION TO PHOTOMETRIC VARIABLE STAR CLASSIFICATION

    Energy Technology Data Exchange (ETDEWEB)

    Richards, Joseph W.; Starr, Dan L.; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; Berian James, J. [Astronomy Department, University of California, Berkeley, CA 94720-7450 (United States)]; Brink, Henrik [Dark Cosmology Centre, Juliane Maries Vej 30, 2100 Copenhagen O (Denmark)]; Long, James P.; Rice, John, E-mail: jwrichar@stat.berkeley.edu [Statistics Department, University of California, Berkeley, CA 94720-7450 (United States)]

    2012-01-10

    Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because (1) standard assumptions for machine-learned model selection procedures break down and (2) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL, in which the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up, is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and Optical Gravitational Lensing Experiment, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a Web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply AL to classify variable stars in the All Sky Automated Survey, finding dramatic improvement in our agreement with the ASAS Catalog of Variable Stars, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.

  6. Social variables exert selective pressures in the evolution and form of primate mimetic musculature.

    Science.gov (United States)

    Burrows, Anne M; Li, Ly; Waller, Bridget M; Micheletta, Jerome

    2016-04-01

    Mammals use their faces in social interactions more so than any other vertebrates. Primates are an extreme among most mammals in their complex, direct, lifelong social interactions and their frequent use of facial displays is a means of proximate visual communication with conspecifics. The available repertoire of facial displays is primarily controlled by mimetic musculature, the muscles that move the face. The form of these muscles is, in turn, limited by and influenced by phylogenetic inertia but here we use examples, both morphological and physiological, to illustrate the influence that social variables may exert on the evolution and form of mimetic musculature among primates. Ecomorphology is concerned with the adaptive responses of morphology to various ecological variables such as diet, foliage density, predation pressures, and time of day activity. We present evidence that social variables also exert selective pressures on morphology, specifically using mimetic muscles among primates as an example. Social variables include group size, dominance 'style', and mating systems. We present two case studies to illustrate the potential influence of social behavior on adaptive morphology of mimetic musculature in primates: (1) gross morphology of the mimetic muscles around the external ear in closely related species of macaque (Macaca mulatta and Macaca nigra) characterized by varying dominance styles and (2) comparative physiology of the orbicularis oris muscle among select ape species. This muscle is used in both facial displays/expressions and in vocalizations/human speech. We present qualitative observations of myosin fiber-type distribution in this muscle of siamang (Symphalangus syndactylus), chimpanzee (Pan troglodytes), and human to demonstrate the potential influence of visual and auditory communication on muscle physiology. In sum, ecomorphologists should be aware of social selective pressures as well as ecological ones, and that observed morphology might...

  7. HEART RATE VARIABILITY CLASSIFICATION USING SADE-ELM CLASSIFIER WITH BAT FEATURE SELECTION

    Directory of Open Access Journals (Sweden)

    R Kavitha

    2017-07-01

    The electrical activity of the human heart is measured by a vital biomedical signal, the electrocardiogram (ECG). The ECG is a crucial source of diagnostic information about a patient’s cardiopathy, and cardiac disease is monitored and diagnosed by recording and processing ECG impulses. In recent years, much research has been devoted to developing enhanced methods that identify risks to the patient’s condition by processing and analysing the ECG signal. This analysis helps to find cardiac abnormalities, arrhythmias and many other heart problems. The ECG signal is processed to detect variability in heart rhythm: heart rate variability (HRV) is calculated from the time intervals between heart beats, that is, it is measured by the variation in the beat-to-beat interval. HRV is an essential aspect in diagnosing the properties of the heart. Recent developments enhance its potential with the aid of non-linear metrics combined with feature selection. In this paper, fundamental features are extracted from the ECG signal, the Bat algorithm is employed for feature selection to find the best features, and these are presented to the classifier for accurate classification. The popular machine learning algorithm Extreme Learning Machine (ELM) is used for classification, integrated with an evolutionary algorithm as the Self-Adaptive Differential Evolution Extreme Learning Machine (SADE-ELM) to improve the reliability of classification. It is combined with an Effective Fuzzy Kohonen Clustering Network (EFKCN) to increase the accuracy of HRV transmission classification. The experiments show that the SADE-ELM method improves precision while also optimizing computation time.
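
    The statement that HRV is derived from the intervals between successive heart beats can be illustrated with two standard time-domain measures, SDNN and RMSSD. The sketch below is a generic illustration and not the feature set used in the paper, which the abstract does not list; the function name, the returned keys and the sample RR intervals are assumptions made for the example.

```python
import math


def hrv_time_domain(rr_ms):
    """Basic time-domain HRV measures from RR intervals given in milliseconds.

    Returns the mean heart rate (beats per minute), SDNN (sample standard
    deviation of the RR intervals) and RMSSD (root mean square of the
    differences between successive RR intervals).
    """
    n = len(rr_ms)
    if n < 2:
        raise ValueError("at least two RR intervals are required")
    mean_rr = sum(rr_ms) / n
    sdnn = math.sqrt(sum((r - mean_rr) ** 2 for r in rr_ms) / (n - 1))
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {
        "mean_hr_bpm": 60000.0 / mean_rr,
        "sdnn_ms": sdnn,
        "rmssd_ms": rmssd,
    }


# Invented RR intervals (ms) for illustration only.
features = hrv_time_domain([800, 810, 790, 800])
print(features)
```

    Values such as these would form part of a feature vector from which a selection method (the Bat algorithm in the paper) picks a subset before classification.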

  8. ACTIVE LEARNING TO OVERCOME SAMPLE SELECTION BIAS: APPLICATION TO PHOTOMETRIC VARIABLE STAR CLASSIFICATION

    International Nuclear Information System (INIS)

    Richards, Joseph W.; Starr, Dan L.; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; Berian James, J.; Brink, Henrik; Long, James P.; Rice, John

    2012-01-01

    Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because (1) standard assumptions for machine-learned model selection procedures break down and (2) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL—where the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up—is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and Optical Gravitational Lensing Experiment, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a Web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply AL to classify variable stars in the All Sky Automated Survey, finding dramatic improvement in our agreement with the ASAS Catalog of Variable Stars, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.
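
    The active learning loop described above repeatedly picks unlabeled objects for manual follow-up. The authors select the objects whose inclusion would most improve predictions on the testing set; the sketch below swaps in a simpler and more generic rule, least-confident sampling, purely for illustration. The function names and the toy probabilities are hypothetical, and the classifier that produces the probabilities is assumed to exist elsewhere.

```python
def least_confident_queries(probabilities, n_queries):
    """Pick the objects whose top class probability is lowest.

    probabilities maps an object id to a list of predicted class
    probabilities. This is plain uncertainty sampling, not the selection
    criterion used in the paper.
    """
    confidence = {key: max(p) for key, p in probabilities.items()}
    # Sort by confidence, breaking ties by id so the result is deterministic.
    ranked = sorted(confidence, key=lambda key: (confidence[key], key))
    return ranked[:n_queries]


def average_confidence(probabilities):
    """Mean of the top class probability over all objects."""
    return sum(max(p) for p in probabilities.values()) / len(probabilities)


# Hypothetical classifier output for three unlabeled variable stars.
pool = {
    "star_a": [0.90, 0.05, 0.05],
    "star_b": [0.40, 0.35, 0.25],
    "star_c": [0.50, 0.30, 0.20],
}
to_label = least_confident_queries(pool, 2)
mean_conf = average_confidence(pool)
```

    In a full loop, the queried objects would be labeled by hand, added to the training set, and the classifier refit before the next round of queries.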

  9. Active Learning to Overcome Sample Selection Bias: Application to Photometric Variable Star Classification

    Science.gov (United States)

    Richards, Joseph W.; Starr, Dan L.; Brink, Henrik; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; James, J. Berian; Long, James P.; Rice, John

    2012-01-01

    Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because (1) standard assumptions for machine-learned model selection procedures break down and (2) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL—where the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up—is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and Optical Gravitational Lensing Experiment, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a Web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply AL to classify variable stars in the All Sky Automated Survey, finding dramatic improvement in our agreement with the ASAS Catalog of Variable Stars, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.

  10. Chemometric assessment of enhanced bioremediation of oil contaminated soils.

    Science.gov (United States)

    Soleimani, Mohsen; Farhoudi, Majid; Christensen, Jan H

    2013-06-15

    Bioremediation is a promising technique for reclamation of oil polluted soils. In this study, six methods for enhancing bioremediation were tested on oil contaminated soils from three refinery areas in Iran (Isfahan, Arak, and Tehran). The methods included bacterial enrichment, planting, and addition of nitrogen and phosphorous, molasses, hydrogen peroxide, and a surfactant (Tween 80). Total petroleum hydrocarbon (TPH) concentrations and CHEMometric analysis of Selected Ion Chromatograms (SIC) termed CHEMSIC method of petroleum biomarkers including terpanes, regular, diaromatic and triaromatic steranes were used for determining the level and type of hydrocarbon contamination. The same methods were used to study oil weathering of 2 to 6 ring polycyclic aromatic compounds (PACs). Results demonstrated that bacterial enrichment and addition of nutrients were most efficient with 50% to 62% removal of TPH. Furthermore, the CHEMSIC results demonstrated that the bacterial enrichment was more efficient in degradation of n-alkanes and low molecular weight PACs as well as alkylated PACs (e.g. C₃-C₄ naphthalenes, C₂ phenanthrenes and C₂-C₃ dibenzothiophenes), while nutrient addition led to a larger relative removal of isoprenoids (e.g. norpristane, pristane and phytane). It is concluded that the CHEMSIC method is a valuable tool for assessing bioremediation efficiency. Copyright © 2013 Elsevier B.V. All rights reserved.

  11. Regional regression models of percentile flows for the contiguous United States: Expert versus data-driven independent variable selection

    Directory of Open Access Journals (Sweden)

    Geoffrey Fouad

    2018-06-01

    New hydrological insights for the region: A set of three variables selected based on an expert assessment of factors that influence percentile flows performed similarly to larger sets of variables selected using a data-driven method. Expert assessment variables included mean annual precipitation, potential evapotranspiration, and baseflow index. Larger sets of up to 37 variables contributed little, if any, additional predictive information. Variables used to describe the distribution of basin data (e.g. standard deviation) were not useful, and average values were sufficient to characterize physical and climatic basin conditions. Effectiveness of the expert assessment variables may be due to the high degree of multicollinearity (i.e. cross-correlation) among additional variables. A tool is provided in the Supplementary material to predict percentile flows based on the three expert assessment variables. Future work should develop new variables with a strong understanding of the processes related to percentile flows.

  12. Cholinergic enhancement reduces functional connectivity and BOLD variability in visual extrastriate cortex during selective attention.

    Science.gov (United States)

    Ricciardi, Emiliano; Handjaras, Giacomo; Bernardi, Giulio; Pietrini, Pietro; Furey, Maura L

    2013-01-01

    Enhancing cholinergic function improves performance on various cognitive tasks and alters neural responses in task specific brain regions. We have hypothesized that the changes in neural activity observed during increased cholinergic function reflect an increase in neural efficiency that leads to improved task performance. The current study tested this hypothesis by assessing neural efficiency based on cholinergically-mediated effects on regional brain connectivity and BOLD signal variability. Nine subjects participated in a double-blind, placebo-controlled crossover fMRI study. Following an infusion of physostigmine (1 mg/h) or placebo, echo-planar imaging (EPI) was conducted as participants performed a selective attention task. During the task, two images comprised of superimposed pictures of faces and houses were presented. Subjects were instructed periodically to shift their attention from one stimulus component to the other and to perform a matching task using hand held response buttons. A control condition included phase-scrambled images of superimposed faces and houses that were presented in the same temporal and spatial manner as the attention task; participants were instructed to perform a matching task. Cholinergic enhancement improved performance during the selective attention task, with no change during the control task. Functional connectivity analyses showed that the strength of connectivity between ventral visual processing areas and task-related occipital, parietal and prefrontal regions reduced significantly during cholinergic enhancement, exclusively during the selective attention task. Physostigmine administration also reduced BOLD signal temporal variability relative to placebo throughout temporal and occipital visual processing areas, again during the selective attention task only. Together with the observed behavioral improvement, the decreases in connectivity strength throughout task-relevant regions and BOLD variability within stimulus

  13. Selection of controlled variables in bioprocesses. Application to a SHARON-Anammox process for autotrophic nitrogen removal

    DEFF Research Database (Denmark)

    Mauricio Iglesias, Miguel; Valverde Perez, Borja; Sin, Gürkan

    Selecting the right controlled variables in a bioprocess is challenging since the objectives of the process (yields, product or substrate concentration) are difficult to relate with a given actuator. We apply here process control tools that can be used to assist in the selection of controlled variables to the case of the SHARON-Anammox process for autotrophic nitrogen removal.

  14. The role of chemometrics in single and sequential extraction assays: a review. Part II. Cluster analysis, multiple linear regression, mixture resolution, experimental design and other techniques.

    Science.gov (United States)

    Giacomino, Agnese; Abollino, Ornella; Malandrino, Mery; Mentasti, Edoardo

    2011-03-04

    Single and sequential extraction procedures are used for studying element mobility and availability in solid matrices, like soils, sediments, sludge, and airborne particulate matter. In the first part of this review we reported an overview on these procedures and described the applications of chemometric uni- and bivariate techniques and of multivariate pattern recognition techniques based on variable reduction to the experimental results obtained. The second part of the review deals with the use of chemometrics not only for the visualization and interpretation of data, but also for the investigation of the effects of experimental conditions on the response, the optimization of their values and the calculation of element fractionation. We will describe the principles of the multivariate chemometric techniques considered, the aims for which they were applied and the key findings obtained. The following topics will be critically addressed: pattern recognition by cluster analysis (CA), linear discriminant analysis (LDA) and other less common techniques; modelling by multiple linear regression (MLR); investigation of spatial distribution of variables by geostatistics; calculation of fractionation patterns by a mixture resolution method (Chemometric Identification of Substrates and Element Distributions, CISED); optimization and characterization of extraction procedures by experimental design; other multivariate techniques less commonly applied. Copyright © 2010 Elsevier B.V. All rights reserved.

  15. Grading of Chinese Cantonese Sausage Using Hyperspectral Imaging Combined with Chemometric Methods

    Science.gov (United States)

    Gong, Aiping; Zhu, Susu; He, Yong; Zhang, Chu

    2017-01-01

    Fast and accurate grading of Chinese Cantonese sausage is an important concern for customers, organizations, and the industry. Hyperspectral imaging in the spectral range of 874–1734 nm, combined with chemometric methods, was applied to grade Chinese Cantonese sausage. Three grades of intact and sliced Cantonese sausages were studied, including the top, first, and second grades. Support vector machine (SVM) and random forests (RF) techniques were used to build two different models. Second derivative spectra and RF were applied to select optimal wavelengths. The optimal wavelengths were the same for intact and sliced sausages when selected from second derivative spectra, while the optimal wavelengths for intact and sliced sausages selected using RF were quite similar. The SVM and RF models, using full spectra and the optimal wavelengths, obtained acceptable results for intact and sliced sausages. Both models for intact sausages performed better than those for sliced sausages, with a classification accuracy of the calibration and prediction set of over 90%. The overall results indicated that hyperspectral imaging combined with chemometric methods could be used to grade Chinese Cantonese sausages, with intact sausages being better suited for grading. This study will help to develop fast and accurate online grading of Cantonese sausages, as well as other sausages. PMID:28757578

  16. Effects of selected design variables on three ramp, external compression inlet performance. [boundary layer control bypasses, and mass flow rate

    Science.gov (United States)

    Kamman, J. H.; Hall, C. L.

    1975-01-01

    Two inlet performance tests and one inlet/airframe drag test were conducted in 1969 at the NASA-Ames Research Center. The basic inlet system was two-dimensional, three ramp (overhead), external compression, with variable capture area. The data from these tests were analyzed to show the effects of selected design variables on the performance of this type of inlet system. The inlet design variables investigated include inlet bleed, bypass, operating mass flow ratio, inlet geometry, and variable capture area.

  17. gamboostLSS: An R Package for Model Building and Variable Selection in the GAMLSS Framework

    Directory of Open Access Journals (Sweden)

    Benjamin Hofner

    2016-10-01

    Generalized additive models for location, scale and shape are a flexible class of regression models that allow multiple parameters of a distribution function, such as the mean and the standard deviation, to be modeled simultaneously. With the R package gamboostLSS, we provide a boosting method to fit these models. Variable selection and model choice are naturally available within this regularized regression framework. To introduce and illustrate the R package gamboostLSS and its infrastructure, we use a data set on stunted growth in India. In addition to the specification and application of the model itself, we present a variety of convenience functions, including methods for tuning parameter selection, prediction and visualization of results. The package gamboostLSS is available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=gamboostLSS.

  18. Extreme precipitation variability, forage quality and large herbivore diet selection in arid environments

    Science.gov (United States)

    Cain, James W.; Gedir, Jay V.; Marshal, Jason P.; Krausman, Paul R.; Allen, Jamison D.; Duff, Glenn C.; Jansen, Brian; Morgart, John R.

    2017-01-01

    Nutritional ecology forms the interface between environmental variability and large herbivore behaviour, life history characteristics, and population dynamics. Forage conditions in arid and semi-arid regions are driven by unpredictable spatial and temporal patterns in rainfall. Diet selection by herbivores should be directed towards overcoming the most pressing nutritional limitation (i.e. energy, protein [nitrogen, N], moisture) within the constraints imposed by temporal and spatial variability in forage conditions. We investigated the influence of precipitation-induced shifts in forage nutritional quality and subsequent large herbivore responses across widely varying precipitation conditions in an arid environment. Specifically, we assessed seasonal changes in diet breadth and forage selection of adult female desert bighorn sheep Ovis canadensis mexicana in relation to potential nutritional limitations in forage N, moisture and energy content (as proxied by dry matter digestibility, DMD). Succulents were consistently high in moisture but low in N and grasses were low in N and moisture until the wet period. Nitrogen and moisture content of shrubs and forbs varied among seasons and climatic periods, whereas trees had consistently high N and moderate moisture levels. Shrubs, trees and succulents composed most of the seasonal sheep diets but had little variation in DMD. Across all seasons during drought and during summer with average precipitation, forages selected by sheep were higher in N and moisture than that of available forage. Differences in DMD between sheep diets and available forage were minor. Diet breadth was lowest during drought and increased with precipitation, reflecting a reliance on few key forage species during drought. Overall, forage selection was more strongly associated with N and moisture content than energy content. 
    Our study demonstrates that unlike north-temperate ungulates which are generally reported to be energy-limited, N and moisture

  19. Impact of perennial energy crops income variability on the crop selection of risk averse farmers

    International Nuclear Information System (INIS)

    Alexander, Peter; Moran, Dominic

    2013-01-01

    The UK Government policy is for the area of perennial energy crops in the UK to expand significantly. Farmers need to choose these crops in preference to conventional rotations for this to be achievable. This paper looks at the potential level and variability of perennial energy crop incomes and the relation to incomes from conventional arable crops. Assuming energy crop prices are correlated to oil prices, the results suggest that incomes from them are not well correlated to conventional arable crop incomes. A farm scale mathematical programming model is then used to attempt to understand the effect on risk averse farmers' crop selection. The inclusion of risk reduces the energy crop price required for the selection of these crops. However, yields towards the highest of those predicted in the UK are still required to make them an optimal choice, suggesting only a small area of energy crops within the UK would be expected to be chosen to be grown. This must be regarded as a tentative conclusion, primarily due to the high sensitivity found to crop yields, resulting in the proposal for further work to apply the model using spatially disaggregated data. - Highlights: ► Energy crop and conventional crop incomes suggested as uncorrelated. ► Diversification effect of energy crops investigated for a risk averse farmer. ► Energy crops indicated as optimal selection only on highest yielding UK sites. ► Large establishment grant rates to substantially alter crop selections.

  20. Prediction of Placental Barrier Permeability: A Model Based on Partial Least Squares Variable Selection Procedure

    Directory of Open Access Journals (Sweden)

    Yong-Hong Zhang

    2015-05-01

    Assessing the human placental barrier permeability of drugs is very important to guarantee drug safety during pregnancy. The quantitative structure–activity relationship (QSAR) method was used as an effective assessing tool for the placental transfer study of drugs, while in vitro human placental perfusion is the most widely used method. In this study, the partial least squares (PLS) variable selection and modeling procedure was used to pick out optimal descriptors from a pool of 620 descriptors of 65 compounds and to simultaneously develop a QSAR model between the descriptors and the placental barrier permeability expressed by the clearance indices (CI). The model was subjected to internal validation by cross-validation and y-randomization and to external validation by predicting CI values of 19 compounds. It was shown that the model developed is robust and has a good predictive potential (r2 = 0.9064, RMSE = 0.09, q2 = 0.7323, rp2 = 0.7656, RMSP = 0.14). The mechanistic interpretation of the final model was given by the high variable importance in projection values of descriptors. Using the PLS procedure, we can rapidly and effectively select optimal descriptors and thus construct a model with good stability and predictability. This analysis can provide an effective tool for the high-throughput screening of the placental barrier permeability of drugs.
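
    The validation statistics quoted above (q2 and RMSE) can be computed from observed values and cross-validated predictions. The sketch below assumes the usual definition q2 = 1 - PRESS / SS_total, with SS_total taken around the overall mean of the observed values; the abstract does not spell out its formula, so this is an assumption. The function name and the sample numbers are hypothetical and are not data from the study.

```python
import math


def q2_and_rmse(y_obs, y_cv_pred):
    """Return (q2, rmse) from observed values and cross-validated predictions.

    q2 = 1 - PRESS / SS_total, where PRESS is the sum of squared
    cross-validation residuals. Assumed definition; the paper may use a
    variant (for example, per-fold training means).
    """
    if len(y_obs) != len(y_cv_pred) or len(y_obs) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(y_obs)
    mean_y = sum(y_obs) / n
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_cv_pred))
    ss_total = sum((o - mean_y) ** 2 for o in y_obs)
    if ss_total == 0:
        raise ValueError("observed values have zero variance")
    q2 = 1.0 - press / ss_total
    rmse = math.sqrt(press / n)
    return q2, rmse


# Hypothetical clearance indices and their leave-one-out predictions.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.3, 0.4, 0.5, 0.8]
q2, rmse = q2_and_rmse(observed, predicted)
```

    The same function applied to an external test set, with predictions from the final model, would give statistics of the kind reported as rp2 and RMSP, again under the assumed definitions.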

  1. Locating disease genes using Bayesian variable selection with the Haseman-Elston method

    Directory of Open Access Journals (Sweden)

    He Qimei

    2003-12-01

    Background: We applied stochastic search variable selection (SSVS), a Bayesian model selection method, to the simulated data of Genetic Analysis Workshop 13. We used SSVS with the revisited Haseman-Elston method to find the markers linked to the loci determining change in cholesterol over time. To study gene-gene interaction (epistasis) and gene-environment interaction, we adopted prior structures that incorporate the relationship among the predictors. This allows SSVS to search the model space more efficiently and avoid the less likely models. Results: In applying SSVS, instead of looking at the posterior distribution of each of the candidate models, which is sensitive to the setting of the prior, we ranked the candidate variables (markers) according to their marginal posterior probability, which was shown to be more robust to the prior. Compared with traditional methods that consider one marker at a time, our method considers all markers simultaneously and obtains more favorable results. Conclusions: We showed that SSVS is a powerful method for identifying linked markers using the Haseman-Elston method, even for weak effects. SSVS is very effective because it does a smart search over the entire model space.
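
    Ranking markers by marginal posterior probability, as described above, amounts to counting how often each marker is included across the sampled models. The sketch below assumes the sampler output is available as a list of 0/1 inclusion vectors, one per retained draw; the sampler itself, the function name, and the toy draws are hypothetical and not taken from the paper.

```python
def marginal_inclusion(draws, marker_names):
    """Estimate marginal posterior inclusion probabilities and rank markers.

    draws is a list of 0/1 vectors, one per posterior draw, where a 1 in
    position j means marker j was in the sampled model. The input format
    is assumed for illustration.
    """
    if not draws:
        raise ValueError("need at least one posterior draw")
    counts = [0] * len(marker_names)
    for gamma in draws:
        for j, included in enumerate(gamma):
            counts[j] += included
    probs = {name: c / len(draws) for name, c in zip(marker_names, counts)}
    # Highest inclusion probability first; ties broken by marker name.
    ranked = sorted(marker_names, key=lambda name: (-probs[name], name))
    return probs, ranked


# Hypothetical draws for three markers.
draws = [[1, 0, 0], [1, 1, 0], [1, 0, 0], [0, 1, 0]]
probs, ranked = marginal_inclusion(draws, ["m1", "m2", "m3"])
```

    Because each marker's probability sums over every model that contains it, the ranking does not depend on any single model's posterior, which is the robustness property the abstract points to.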

  2. A fast chaos-based image encryption scheme with a dynamic state variables selection mechanism

    Science.gov (United States)

    Chen, Jun-xin; Zhu, Zhi-liang; Fu, Chong; Yu, Hai; Zhang, Li-bo

    2015-03-01

    In recent years, a variety of chaos-based image cryptosystems have been investigated to meet the increasing demand for real-time secure image transmission. Most of them are based on permutation-diffusion architecture, in which permutation and diffusion are two independent procedures with fixed control parameters. This property results in two flaws. (1) At least two chaotic state variables are required for encrypting one plain pixel, in permutation and diffusion stages respectively. Chaotic state variables produced with high computation complexity are not sufficiently used. (2) The key stream solely depends on the secret key, and hence the cryptosystem is vulnerable to known/chosen-plaintext attacks. In this paper, a fast chaos-based image encryption scheme with a dynamic state variables selection mechanism is proposed to enhance the security and promote the efficiency of chaos-based image cryptosystems. Experimental simulations and extensive cryptanalysis have been carried out and the results prove the superior security and high efficiency of the scheme.

  3. Relation between sick leave and selected exposure variables among women semiconductor workers in Malaysia

    Science.gov (United States)

    Chee, H; Rampal, K

    2003-01-01

    Aims: To determine the relation between sick leave and selected exposure variables among women semiconductor workers. Methods: This was a cross sectional survey of production workers from 18 semiconductor factories. Those selected had to be women, direct production operators up to the level of line leader, and Malaysian citizens. Sick leave and exposure to physical and chemical hazards were determined by self reporting. Three sick leave variables were used; number of sick leave days taken in the past year was the variable of interest in logistic regression models where the effects of age, marital status, work task, work schedule, work section, and duration of work in factory and work section were also explored. Results: Marital status was strongly linked to the taking of sick leave. Age, work schedule, and duration of work in the factory were significant confounders only in certain cases. After adjusting for these confounders, chemical and physical exposures, with the exception of poor ventilation and smelling chemicals, showed no significant relation to the taking of sick leave within the past year. Work section was a good predictor for taking sick leave, as wafer polishing workers faced higher odds of taking sick leave for each of the three cut off points of seven days, three days, and not at all, while parts assembly workers also faced significantly higher odds of taking sick leave. Conclusion: In Malaysia, the wafer fabrication factories only carry out a limited portion of the work processes, in particular, wafer polishing and the processes immediately prior to and following it. This study, in showing higher illness rates for workers in wafer polishing compared to semiconductor assembly, has implications for the governmental policy of encouraging the setting up of wafer fabrication plants with the full range of work processes. PMID:12660374

  4. Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis

    Directory of Open Access Journals (Sweden)

    Ueki Masao

    2012-05-01

    Background: Genome-wide gene-gene interaction analysis using single nucleotide polymorphisms (SNPs) is an attractive way to identify genetic components that confer susceptibility to human complex diseases. Individual hypothesis testing for SNP-SNP pairs, as in a common genome-wide association study (GWAS), however, involves difficulty in setting the overall p-value due to the complicated correlation structure, namely, the multiple testing problem that causes unacceptable false negative results. The number of SNP-SNP pairs being larger than the sample size, the so-called large p small n problem, precludes simultaneous analysis using multiple regression. A method that overcomes the above issues is thus needed. Results: We adopt an up-to-date method for ultrahigh-dimensional variable selection termed sure independence screening (SIS) for appropriate handling of the large number of SNP-SNP interactions by including them as predictor variables in logistic regression. We propose a ranking strategy using promising dummy coding methods and a subsequent variable selection procedure in the SIS method suitably modified for gene-gene interaction analysis. We also implemented the procedures in a software program, EPISIS, using cost-effective GPGPU (general-purpose computing on graphics processing units) technology. EPISIS can complete an exhaustive search for SNP-SNP interactions in a standard GWAS dataset within several hours. The proposed method works successfully in simulation experiments and in application to real WTCCC (Wellcome Trust Case–control Consortium) data. Conclusions: Based on the machine-learning principle, the proposed method gives a powerful and flexible genome-wide search for various patterns of gene-gene interaction.
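
    The screening step of SIS ranks each candidate predictor by a marginal measure of association with the response and keeps only the top few for a later, more careful selection. The sketch below uses absolute Pearson correlation as the marginal score, as in the basic linear-model form of SIS; the paper works with logistic regression and specific dummy codings of SNP-SNP pairs, which are not reproduced here. All names and the toy data are hypothetical, and this is not the EPISIS implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sis_screen(columns, y, d):
    """Keep the d predictors with the largest absolute marginal correlation.

    columns maps a predictor name to its values across samples. Marginal
    correlation stands in for the marginal logistic-regression score that a
    case-control analysis would use.
    """
    scores = {name: abs(pearson(col, y)) for name, col in columns.items()}
    keep = sorted(scores, key=lambda name: (-scores[name], name))[:d]
    return keep, scores


# Hypothetical case/control labels and three coded SNP-SNP interaction terms.
status = [0, 0, 1, 1, 0, 1]
interactions = {
    "snp1xsnp2": [0, 0, 1, 1, 0, 1],
    "snp1xsnp3": [1, 0, 1, 0, 1, 0],
    "snp2xsnp3": [0, 1, 1, 1, 0, 1],
}
kept, scores = sis_screen(interactions, status, 2)
```

    Each score depends on a single predictor only, so the scan over all pairs is trivially parallel, which is consistent with the GPGPU implementation mentioned in the abstract.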

  5. Multi-omics facilitated variable selection in Cox-regression model for cancer prognosis prediction.

    Science.gov (United States)

    Liu, Cong; Wang, Xujun; Genchev, Georgi Z; Lu, Hui

    2017-07-15

    New developments in high-throughput genomic technologies have enabled the measurement of diverse types of omics biomarkers in a cost-efficient and clinically-feasible manner. Developing computational methods and tools for analysis and translation of such genomic data into clinically-relevant information is an ongoing and active area of investigation. For example, several studies have utilized an unsupervised learning framework to cluster patients by integrating omics data. Despite such recent advances, predicting cancer prognosis using integrated omics biomarkers remains a challenge. There is also a shortage of computational tools for predicting cancer prognosis by using supervised learning methods. The current standard approach is to fit a Cox regression model by concatenating the different types of omics data in a linear manner, while penalty could be added for feature selection. A more powerful approach, however, would be to incorporate data by considering relationships among omics datatypes. Here we developed two methods: a SKI-Cox method and a wLASSO-Cox method to incorporate the association among different types of omics data. Both methods fit the Cox proportional hazards model and predict a risk score based on mRNA expression profiles. SKI-Cox borrows the information generated by these additional types of omics data to guide variable selection, while wLASSO-Cox incorporates this information as a penalty factor during model fitting. We show that SKI-Cox and wLASSO-Cox models select more true variables than a LASSO-Cox model in simulation studies. We assess the performance of SKI-Cox and wLASSO-Cox using TCGA glioblastoma multiforme and lung adenocarcinoma data. In each case, mRNA expression, methylation, and copy number variation data are integrated to predict the overall survival time of cancer patients. Our methods achieve better performance in predicting patients' survival in glioblastoma and lung adenocarcinoma. Copyright © 2017. Published by Elsevier

  6. Discrimination of tomatoes bred by spaceflight mutagenesis using visible/near infrared spectroscopy and chemometrics

    Science.gov (United States)

    Shao, Yongni; Xie, Chuanqi; Jiang, Linjun; Shi, Jiahui; Zhu, Jiajin; He, Yong

    2015-04-01

    Visible/near infrared spectroscopy (Vis/NIR) based on sensitive wavelengths (SWs) and chemometrics was proposed to discriminate different tomatoes bred by spaceflight mutagenesis from their leaves or fruits (green or mature). The tomato breeds were mutant M1, M2 and their parent. Partial least squares (PLS) analysis and least squares-support vector machine (LS-SVM) were implemented for calibration models. PLS analysis was implemented for calibration models with different wavebands including the visible region (400-700 nm) and the near infrared region (700-1000 nm). The best PLS models were achieved in the visible region for the leaf and green fruit samples and in the near infrared region for the mature fruit samples. Furthermore, different latent variables (4-8 LVs for leaves, 5-9 LVs for green fruits, and 4-9 LVs for mature fruits) were used as inputs of LS-SVM to develop the LV-LS-SVM models with the grid search technique and radial basis function (RBF) kernel. The optimal LV-LS-SVM models were achieved with six LVs for the leaf samples, seven LVs for green fruits, and six LVs for mature fruits, respectively, and they outperformed the PLS models. Moreover, independent component analysis (ICA) was executed to select several SWs based on loading weights. The optimal LS-SVM model was achieved with SWs of 550-560 nm, 562-574 nm, 670-680 nm and 705-715 nm for the leaf samples; 548-556 nm, 559-564 nm, 678-685 nm and 962-974 nm for the green fruit samples; and 712-718 nm, 720-729 nm, 968-978 nm and 820-830 nm for the mature fruit samples. All of them had better performance than PLS and LV-LS-SVM, with the parameters of correlation coefficient (rp), root mean square error of prediction (RMSEP) and bias of 0.9792, 0.2632 and 0.0901 based on leaf discrimination, 0.9837, 0.2783 and 0.1758 based on green fruit discrimination, 0.9804, 0.2215 and -0.0035 based on mature fruit discrimination, respectively. The overall results indicated that ICA was an effective way for the

  7. CHEMOMETRICS IN BIOANALYTICAL SAMPLE PREPARATION - A FRACTIONATED COMBINED MIXTURE AND FACTORIAL DESIGN FOR THE MODELING OF THE RECOVERY OF 5 TRICYCLIC AMINES FROM PLASMA AFTER LIQUID-LIQUID-EXTRACTION PRIOR TO HIGH-PERFORMANCE LIQUID-CHROMATOGRAPHY

    NARCIS (Netherlands)

    WIELING, J; MENSINK, CK; JONKMAN, JHG; COENEGRACHT, PMJ; DUINEVELD, CAA; DOORNBOS, DA

    1993-01-01

    A general systematic approach is described for the chemometric modelling of liquid-liquid extraction data of drugs from biological fluids. Extraction solvents were selected from Snyder's solvent selectivity triangle: methyl tert.-butyl ether, methylene chloride and chloroform. The composition of a

  8. Assessment of acute pesticide toxicity with selected biochemical variables in suicide attempting subjects

    International Nuclear Information System (INIS)

    Soomro, A.M.; Seehar, G.M.; Bhanger, M.I.

    2003-01-01

    Pesticide-induced changes were assessed in thirty-two subjects who had attempted suicide. Farmers and their family members were the group most frequently recorded among these cases. The values of seven biochemical variables in the hospitalized subjects (average age 29 years) were compared with those of an equal number of age-matched normal volunteers. The results revealed major differences in the mean values of the selected parameters. The calculated mean differences for alkaline phosphatase (178.7 mu/l), bilirubin (7.5 mg/dl), GPT (59.2 mu/l) and glucose (38.6 mg/dl) showed values higher than in the controls, which indicates hepatotoxicity induced by the pesticides in the individuals who attempted suicide. Increases in serum creatinine and urea indicated renal malfunction that could be linked with pesticide-induced nephrotoxicity. (author)

  9. VARIABILITY OF AMYLOSE AND AMYLOPECTIN IN WINTER WHEAT AND SELECTION FOR SPECIAL PURPOSES

    Directory of Open Access Journals (Sweden)

    Nikolina Weg Krstičević

    2015-06-01

    The aim of this study was to investigate the variability of amylose and amylopectin in 24 Croatian and six foreign winter wheat varieties and to detect the potential of these varieties for special purposes. Starch composition analysis was based on the separation of amylose and amylopectin and the determination of their amounts and ratios. Analysis of the amounts of amylose and amylopectin revealed statistically highly significant differences between the varieties. The tested varieties are mostly bread wheats of different quality, which have the usual content of amylose and amylopectin. Among them, some varieties were identified with high amylopectin and low amylose content, and one variety with high amylose content. These varieties have potential for future breeding programs and selection for special purposes.

  10. [Application of characteristic NIR variables selection in portable detection of soluble solids content of apple by near infrared spectroscopy].

    Science.gov (United States)

    Fan, Shu-Xiang; Huang, Wen-Qian; Li, Jiang-Bo; Guo, Zhi-Ming; Zhao, Chun-Jiang

    2014-10-01

    In order to detect the soluble solids content (SSC) of apple conveniently and rapidly, a ring fiber probe and a portable spectrometer were applied to obtain the spectra of apple. Different wavelength variable selection methods, including uninformative variable elimination (UVE), competitive adaptive reweighted sampling (CARS) and genetic algorithm (GA), were proposed to select effective wavelength variables of the NIR spectroscopy of the SSC in apple based on PLS. The back interval LS-SVM (BiLS-SVM) and GA were used to select effective wavelength variables based on LS-SVM. Selected wavelength variables and the full wavelength range were set as input variables of the PLS model and the LS-SVM model, respectively. The results indicated that the PLS model built using GA-CARS on 50 characteristic variables, selected from the full spectrum of 1512 wavelengths, achieved the optimal performance. The correlation coefficient (Rp) and root mean square error of prediction (RMSEP) for the prediction set were 0.962 and 0.403 °Brix, respectively, for SSC. The proposed GA-CARS method could effectively simplify the portable detection model of SSC in apple based on near infrared spectroscopy and enhance the predictive precision. The study can provide a reference for the development of a portable spectrometer for apple soluble solids content.
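
    The two figures of merit quoted in this record, Rp and RMSEP, can be computed from reference and predicted values as in the minimal sketch below. It is not taken from the paper: the function name `prediction_metrics` and the toy values are hypothetical, and the sign convention for bias (predicted minus reference) is an assumption, since the record does not define it.

```python
import math


def prediction_metrics(y_true, y_pred):
    """Return (rp, rmsep, bias) for a prediction set.

    Assumption: bias is the mean of (predicted - reference); some authors
    use the opposite sign.
    """
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    residuals = [p - t for t, p in zip(y_true, y_pred)]
    bias = sum(residuals) / n
    # RMSEP: root of the mean squared prediction residual.
    rmsep = math.sqrt(sum(r * r for r in residuals) / n)

    # Pearson correlation between reference and predicted values.
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    rp = cov / math.sqrt(var_t * var_p)
    return rp, rmsep, bias


# Hypothetical toy data: every prediction is exactly 1 unit too high.
reference = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 13.0, 15.0, 17.0]
rp, rmsep, bias = prediction_metrics(reference, predicted)
print(rp, rmsep, bias)
```

    The toy data illustrate why Rp alone can mislead: a constant offset leaves the correlation perfect while RMSEP and bias both report the error.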

  11. Bayesian variable selection for post-analytic interrogation of susceptibility loci.

    Science.gov (United States)

    Chen, Siying; Nunez, Sara; Reilly, Muredach P; Foulkes, Andrea S

    2017-06-01

    Understanding the complex interplay among protein coding genes and regulatory elements requires rigorous interrogation with analytic tools designed for discerning the relative contributions of overlapping genomic regions. To this aim, we offer a novel application of Bayesian variable selection (BVS) for classifying genomic class level associations using existing large meta-analysis summary level resources. This approach is applied using the expectation maximization variable selection (EMVS) algorithm to typed and imputed SNPs across 502 protein coding genes (PCGs) and 220 long intergenic non-coding RNAs (lncRNAs) that overlap 45 known loci for coronary artery disease (CAD) using publicly available Global Lipids Genetics Consortium (GLGC) (Teslovich et al., 2010; Willer et al., 2013) meta-analysis summary statistics for low-density lipoprotein cholesterol (LDL-C). The analysis reveals 33 PCGs and three lncRNAs across 11 loci with >50% posterior probabilities for inclusion in an additive model of association. The findings are consistent with previous reports, while providing some new insight into the architecture of LDL-cholesterol to be investigated further. As genomic taxonomies continue to evolve, additional classes, such as enhancer elements and splicing regions, can easily be layered into the proposed analysis framework. Moreover, application of this approach to alternative publicly available meta-analysis resources, or more generally as a post-analytic strategy to further interrogate regions that are identified through single point analysis, is straightforward. All coding examples are implemented in R version 3.2.1 and provided as supplemental material. © 2016, The International Biometric Society.

  12. A new simplex chemometric approach to identify olive oil blends with potentially high traceability.

    Science.gov (United States)

    Semmar, N; Laroussi-Mezghani, S; Grati-Kamoun, N; Hammami, M; Artaud, J

    2016-10-01

    Olive oil blends (OOBs) are complex matrices combining different cultivars at variable proportions. Although qualitative determinations of OOBs have been subjected to several chemometric works, quantitative evaluations of their contents remain poorly developed because of traceability difficulties concerning co-occurring cultivars. Around this question, we recently published an original simplex approach helping to develop predictive models of the proportions of co-occurring cultivars from chemical profiles of resulting blends (Semmar & Artaud, 2015). Beyond predictive model construction and validation, this paper presents an extension based on prediction errors' analysis to statistically define the blends with the highest predictability among all the possible ones that can be made by mixing cultivars at different proportions. This provides an interesting way to identify a priori labeled commercial products with potentially high traceability taking into account the natural chemical variability of different constitutive cultivars. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. INDUCED GENETIC VARIABILITY AND SELECTION FOR HIGH YIELDING MUTANTS IN BREAD WHEAT (TRITICUM AESTIVUM L.)

    International Nuclear Information System (INIS)

    SOBIEH, S.EL-S.S.

    2007-01-01

    This study was conducted during the two winter seasons of 2004/2005 and 2005/2006 at the experimental farm belonging to the Plant Research Department, Nuclear Research Centre, AEA, Egypt. The aim of this study was to determine the effect of gamma rays (150, 200 and 250 Gy) on the means of yield and its attributes for the exotic wheat variety vir-25, to induce genetic variability that permits visual selection in the irradiated populations, and to determine differences in seed protein patterns between the vir-25 parent variety and some selectants in the M2 generation. The results showed that the different doses of gamma rays had a non-significant effect on the mean value of yield/plant and a significant effect on the mean values of its attributes. On the other hand, considerable genetic variability was generated as a result of applying gamma irradiation. The highest amount of induced genetic variability was detected for number of grains/spike, spike length and number of spikes/plant. Additionally, these three traits exhibited strong association with grain yield/plant; hence, they were used as selection criteria. Some variant plants were selected from the 250 Gy radiation treatment, with 2-10 spikes per plant. These variant plants exhibited increases in spike length and number of grains/spike. The results also revealed that protein electrophoresis patterns varied in the number and position of bands from one genotype to another, and that various genotypes shared bands with molecular weights of 31.4 and 3.2 KD. Many bands were found to be specific to the genotype, and the nine wheat mutants were characterized by the presence of bands of molecular weights: 151.9, 125.7, 14.1 and 5.7 KD at M-1; 67.4, 21.7 and 8.2 KD at M-2; 99.7 KD at M-3; 136.1, 97.6, 49.8, 27.9 and 20.6 KD at M-4; 135.2, 95.3 and 28.1 KD at M-5; 135.5, 67.7, 47.1, 32.3, 21.9 and 9.6 KD at M-6; 126.1, 112.1, 103.3, 58.8, 20.9 and 12.1 KD at M-7; 127.7, 116.6, 93.9, 55.0 and 47.4 KD at M-8; 141.7, 96.1, 79.8, 68.9, 42.1, 32.7, 22.0 and 13

  14. The impact of selected organizational variables and managerial leadership on radiation therapists' organizational commitment

    International Nuclear Information System (INIS)

    Akroyd, Duane; Legg, Jeff; Jackowski, Melissa B.; Adams, Robert D.

    2009-01-01

    The purpose of this study was to examine the impact of selected organizational factors and the leadership behavior of supervisors on radiation therapists' commitment to their organizations. The population for this study consists of all full-time clinical radiation therapists registered by the American Registry of Radiologic Technologists (ARRT) in the United States. A random sample of 800 radiation therapists was obtained from the ARRT for this study. Questionnaires were mailed to all participants and measured organizational variables, a managerial leadership variable, and three components of organizational commitment (affective, continuance and normative). It was determined that organizational support and leadership behavior of supervisors each had a significant and positive effect on normative and affective commitment of radiation therapists, and each of the models predicted over 40% of the variance in radiation therapists' organizational commitment. This study examined radiation therapists' commitment to their organizations and found that affective (emotional attachment to the organization) and normative (feelings of obligation to the organization) commitments were more important than continuance commitment (awareness of the costs of leaving the organization). This study can help radiation oncology administrators and physicians to understand the values their radiation therapy employees hold that are predictive of their commitment to the organization. A crucial result of the study is the importance of the perceived support of the organization and the leadership skills of managers/supervisors on radiation therapists' commitment to the organization.

  15. The impact of selected organizational variables and managerial leadership on radiation therapists' organizational commitment

    Energy Technology Data Exchange (ETDEWEB)

    Akroyd, Duane [Department of Adult and Community College Education, College of Education, Campus Box 7801, North Carolina State University, Raleigh, NC 27695 (United States)], E-mail: duane_akroyd@ncsu.edu; Legg, Jeff [Department of Radiologic Sciences, Virginia Commonwealth University, Richmond, VA 23284 (United States); Jackowski, Melissa B. [Division of Radiologic Sciences, University of North Carolina School of Medicine 27599 (United States); Adams, Robert D. [Department of Radiation Oncology, University of North Carolina School of Medicine 27599 (United States)

    2009-05-15

    The purpose of this study was to examine the impact of selected organizational factors and the leadership behavior of supervisors on radiation therapists' commitment to their organizations. The population for this study consists of all full-time clinical radiation therapists registered by the American Registry of Radiologic Technologists (ARRT) in the United States. A random sample of 800 radiation therapists was obtained from the ARRT for this study. Questionnaires were mailed to all participants and measured organizational variables, a managerial leadership variable, and three components of organizational commitment (affective, continuance and normative). It was determined that organizational support and leadership behavior of supervisors each had a significant and positive effect on normative and affective commitment of radiation therapists, and each of the models predicted over 40% of the variance in radiation therapists' organizational commitment. This study examined radiation therapists' commitment to their organizations and found that affective (emotional attachment to the organization) and normative (feelings of obligation to the organization) commitments were more important than continuance commitment (awareness of the costs of leaving the organization). This study can help radiation oncology administrators and physicians to understand the values their radiation therapy employees hold that are predictive of their commitment to the organization. A crucial result of the study is the importance of the perceived support of the organization and the leadership skills of managers/supervisors on radiation therapists' commitment to the organization.

  16. Spatially variable natural selection and the divergence between parapatric subspecies of lodgepole pine (Pinus contorta, Pinaceae).

    Science.gov (United States)

    Eckert, Andrew J; Shahi, Hurshbir; Datwyler, Shannon L; Neale, David B

    2012-08-01

    Plant populations arrayed across sharp environmental gradients are ideal systems for identifying the genetic basis of ecologically relevant phenotypes. A series of five uplifted marine terraces along the northern coast of California represents one such system where morphologically distinct populations of lodgepole pine (Pinus contorta) are distributed across sharp soil gradients ranging from fertile soils near the coast to podzolic soils ca. 5 km inland. A total of 92 trees was sampled across four coastal marine terraces (N = 10-46 trees/terrace) located in Mendocino County, California and sequenced for a set of 24 candidate genes for growth and responses to various soil chemistry variables. Statistical analyses relying on patterns of nucleotide diversity were employed to identify genes whose diversity patterns were inconsistent with three null models. Most genes displayed patterns of nucleotide diversity that were consistent with null models (N = 19) or with the presence of paralogs (N = 3). Two genes, however, were exceptional: an aluminum responsive ABC-transporter with F(ST) = 0.664 and an inorganic phosphate transporter characterized by divergent haplotypes segregating at intermediate frequencies in most populations. Spatially variable natural selection along gradients of aluminum and phosphate ion concentrations likely accounted for both outliers. These results shed light on some of the genetic components comprising the extended phenotype of this ecosystem, as well as highlight ecotones as fruitful study systems for the detection of adaptive genetic variants.
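
    The F(ST) value quoted in this record measures how strongly allele frequencies differ among populations. The sketch below is a textbook heterozygosity-based illustration for a single biallelic locus, F_ST = (H_T - H_S) / H_T, and is not the estimator used by the authors (the record does not state it). The function name `fst_biallelic`, the equal weighting of populations, and the example frequencies are all assumptions.

```python
def fst_biallelic(allele_freqs):
    """Heterozygosity-based F_ST for one biallelic locus.

    allele_freqs: frequency of one allele in each population (values in 0..1).
    Assumption: populations are weighted equally.
    """
    if not allele_freqs:
        raise ValueError("need at least one population")
    if any(p < 0.0 or p > 1.0 for p in allele_freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")

    k = len(allele_freqs)
    # H_S: mean expected heterozygosity within populations.
    h_s = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / k
    # H_T: expected heterozygosity of the pooled population.
    p_bar = sum(allele_freqs) / k
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # locus is fixed everywhere, so there is no differentiation
    return (h_t - h_s) / h_t


# Hypothetical frequencies for two populations.
print(fst_biallelic([0.2, 0.8]))
```

    Identical frequencies give 0 and alternately fixed populations give 1, so a value such as the 0.664 reported above sits toward the strongly differentiated end of the scale.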

  17. Repeat what after whom? Exploring variable selectivity in a cross-dialectal shadowing task.

    Directory of Open Access Journals (Sweden)

    Abby Walker

    2015-05-01

    Twenty women from Christchurch, New Zealand and sixteen from Columbus, Ohio (dialect region U.S. Midland) participated in a bimodal lexical naming task where they repeated monosyllabic words after four speakers from four regional dialects: New Zealand, Australia, U.S. Inland North and U.S. Midland. The resulting utterances were acoustically analyzed, and presented to listeners on Amazon Mechanical Turk in an AXB task. Convergence is observed, but differs depending on the dialect of the speaker, the dialect of the model, the particular word class being shadowed, and the order in which dialects are presented to participants. We argue that these patterns are generally consistent with findings that convergence is promoted by a large phonetic distance between shadower and model (Babel, 2010, contra Kim, Horton & Bradlow, 2011), and greater existing variability in a vowel class (Babel, 2012). The results also suggest that more comparisons of accommodation towards different dialects are warranted, and that the investigation of the socio-indexical meaning of specific linguistic forms in context is a promising avenue for understanding variable selectivity in convergence.

  18. Quality Evaluation of Potentilla fruticosa L. by High Performance Liquid Chromatography Fingerprinting Associated with Chemometric Methods.

    Science.gov (United States)

    Liu, Wei; Wang, Dongmei; Liu, Jianjun; Li, Dengwu; Yin, Dongxue

    2016-01-01

    The present study was performed to assess the quality of Potentilla fruticosa L. sampled from distinct regions of China using high performance liquid chromatography (HPLC) fingerprinting coupled with a suite of chemometric methods. For this quantitative analysis, the main active phytochemical compositions and the antioxidant activity in P. fruticosa were also investigated. Considering the high percentages and antioxidant activities of phytochemicals, P. fruticosa samples from Kangding, Sichuan were selected as the most valuable raw materials. Similarity analysis (SA) of HPLC fingerprints, hierarchical cluster analysis (HCA), principal component analysis (PCA), and discriminant analysis (DA) were further employed to provide accurate classification and quality estimates of P. fruticosa. Two principal components (PCs) were collected by PCA. PC1 separated samples from Kangding, Sichuan, capturing 57.64% of the variance, whereas PC2 contributed to further separation, capturing 18.97% of the variance. Two kinds of discriminant functions with a 100% discrimination ratio were constructed. The results strongly supported the conclusion that the eight samples from different regions were clustered into three major groups, corresponding with their morphological classification, for which HPLC analysis confirmed the considerable variation in phytochemical compositions and that P. fruticosa samples from Kangding, Sichuan were of high quality. The results of SA, HCA, PCA, and DA were in agreement and performed well for the quality assessment of P. fruticosa. Consequently, HPLC fingerprinting coupled with chemometric techniques provides a highly flexible and reliable method for the quality evaluation of traditional Chinese medicines.

  19. Quality Evaluation of Potentilla fruticosa L. by High Performance Liquid Chromatography Fingerprinting Associated with Chemometric Methods

    Science.gov (United States)

    Liu, Wei; Wang, Dongmei; Liu, Jianjun; Li, Dengwu; Yin, Dongxue

    2016-01-01

    The present study was performed to assess the quality of Potentilla fruticosa L. sampled from distinct regions of China using high performance liquid chromatography (HPLC) fingerprinting coupled with a suite of chemometric methods. For this quantitative analysis, the main active phytochemical compositions and the antioxidant activity in P. fruticosa were also investigated. Considering the high percentages and antioxidant activities of phytochemicals, P. fruticosa samples from Kangding, Sichuan were selected as the most valuable raw materials. Similarity analysis (SA) of HPLC fingerprints, hierarchical cluster analysis (HCA), principal component analysis (PCA), and discriminant analysis (DA) were further employed to provide accurate classification and quality estimates of P. fruticosa. Two principal components (PCs) were collected by PCA. PC1 separated samples from Kangding, Sichuan, capturing 57.64% of the variance, whereas PC2 contributed to further separation, capturing 18.97% of the variance. Two kinds of discriminant functions with a 100% discrimination ratio were constructed. The results strongly supported the conclusion that the eight samples from different regions were clustered into three major groups, corresponding with their morphological classification, for which HPLC analysis confirmed the considerable variation in phytochemical compositions and that P. fruticosa samples from Kangding, Sichuan were of high quality. The results of SA, HCA, PCA, and DA were in agreement and performed well for the quality assessment of P. fruticosa. Consequently, HPLC fingerprinting coupled with chemometric techniques provides a highly flexible and reliable method for the quality evaluation of traditional Chinese medicines. PMID:26890416

  20. Near Infrared Spectroscopy Calibration for Wood Chemistry: Which Chemometric Technique Is Best for Prediction and Interpretation?

    Directory of Open Access Journals (Sweden)

    Brian K. Via

    2014-07-01

    This paper addresses the precision in factor loadings during partial least squares (PLS) and principal components regression (PCR) of wood chemistry content from near infrared reflectance (NIR) spectra. The precision of the loadings is considered important because these estimates are often utilized to interpret chemometric models or to select meaningful wavenumbers. Standard laboratory chemistry methods were employed on a mixed genus/species hardwood sample set. PLS and PCR, before and after 1st derivative pretreatment, were utilized for model building and loadings investigation. As demonstrated by others, PLS was found to provide better predictive diagnostics. However, PCR exhibited a more precise estimate of loading peaks, which makes PCR better for interpretation. Application of the 1st derivative appeared to assist in improving both PCR and PLS loading precision, but due to the small sample size, the two chemometric methods could not be compared statistically. This work is important because, to date, most research has committed to PLS because it yields better predictive performance. But this research suggests there is a tradeoff between better prediction and model interpretation. Future work is needed to compare PLS and PCR for a suite of spectral pretreatment techniques.

  1. Near infrared spectroscopy calibration for wood chemistry: which chemometric technique is best for prediction and interpretation?

    Science.gov (United States)

    Via, Brian K; Zhou, Chengfeng; Acquah, Gifty; Jiang, Wei; Eckhardt, Lori

    2014-07-25

    This paper addresses the precision in factor loadings during partial least squares (PLS) and principal components regression (PCR) of wood chemistry content from near infrared reflectance (NIR) spectra. The precision of the loadings is considered important because these estimates are often utilized to interpret chemometric models or to select meaningful wavenumbers. Standard laboratory chemistry methods were employed on a mixed genus/species hardwood sample set. PLS and PCR, before and after 1st derivative pretreatment, were utilized for model building and loadings investigation. As demonstrated by others, PLS was found to provide better predictive diagnostics. However, PCR exhibited a more precise estimate of loading peaks, which makes PCR better for interpretation. Application of the 1st derivative appeared to assist in improving both PCR and PLS loading precision, but due to the small sample size, the two chemometric methods could not be compared statistically. This work is important because, to date, most research has committed to PLS because it yields better predictive performance. But this research suggests there is a tradeoff between better prediction and model interpretation. Future work is needed to compare PLS and PCR for a suite of spectral pretreatment techniques.

  2. Comparison of Three Plot Selection Methods for Estimating Change in Temporally Variable, Spatially Clustered Populations.

    Energy Technology Data Exchange (ETDEWEB)

    Thompson, William L. [Bonneville Power Administration, Portland, OR (US). Environment, Fish and Wildlife

    2001-07-01

    Monitoring population numbers is important for assessing trends and meeting various legislative mandates. However, sampling across time introduces a temporal aspect to survey design in addition to the spatial one. For instance, a sample that is initially representative may lose this attribute if there is a shift in numbers and/or spatial distribution in the underlying population that is not reflected in later sampled plots. Plot selection methods that account for this temporal variability will produce the best trend estimates. Consequently, I used simulation to compare bias and relative precision of estimates of population change among stratified and unstratified sampling designs based on permanent, temporary, and partial replacement plots under varying levels of spatial clustering, density, and temporal shifting of populations. Permanent plots produced more precise estimates of change than temporary plots across all factors. Further, permanent plots performed better than partial replacement plots except for high density (5 and 10 individuals per plot) and 25% - 50% shifts in the population. Stratified designs always produced less precise estimates of population change for all three plot selection methods, and often produced biased change estimates and greatly inflated variance estimates under sampling with partial replacement. Hence, stratification that remains fixed across time should be avoided when monitoring populations that are likely to exhibit large changes in numbers and/or spatial distribution during the study period. Key words: bias; change estimation; monitoring; permanent plots; relative precision; sampling with partial replacement; temporary plots.

  3. Assessment on pattern of urban air quality by using chemometric ...

    African Journals Online (AJOL)

    The study evaluates the relationship between the main daily concentrations of criteria air pollutants in urban areas and their associations by using a chemometric technique. Data were gathered from the Department of Environmental for three years of observations (2011-2013) consisting of 5 major pollutants such as SO2, NO2, ...

  4. Principal Component Analysis: Most Favourite Tool in Chemometrics

    Indian Academy of Sciences (India)

    Abstract. Principal component analysis (PCA) is the most commonly used chemometric technique. It is an unsupervised pattern recognition technique. PCA has found applications in chemistry, biology, medicine and economics. The present work attempts to understand how PCA works and how we can interpret its results.
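
    As a rough illustration of how PCA works, the sketch below extracts only the first principal component of a small data table: it mean-centres the columns, forms the sample covariance matrix, and applies power iteration to find the direction of largest variance. It is not taken from the article; the function name `first_principal_component`, the fixed iteration count and the toy data are hypothetical, and real analyses would normally use a linear algebra library.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, explained_ratio, scores) for the first PC.

    rows: list of samples, each a list of d numeric variables.
    Uses power iteration on the sample covariance matrix. Assumption: the
    starting vector is not orthogonal to the dominant eigenvector.
    """
    n, d = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("need at least two samples")

    # Mean-centre each variable (column).
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (divisor n - 1).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Power iteration: repeatedly multiply by cov and renormalise.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all variables are constant
        v = [x / norm for x in w]

    # Variance captured by this component, as a share of total variance.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    total = sum(cov[i][i] for i in range(d))
    ratio = eigenvalue / total if total > 0.0 else 0.0

    # Scores: projection of each centred sample onto the loading vector.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, ratio, scores


# Hypothetical toy data lying exactly on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
loadings, ratio, scores = first_principal_component(data)
print(loadings, ratio, scores)
```

    For interpretation, the loadings show how much each original variable contributes to the component, the scores place each sample along it, and the explained ratio says how much of the total variance that single axis summarises.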

  5. Chemometric Strategies for Peak Detection and Profiling from Multidimensional Chromatography.

    Science.gov (United States)

    Navarro-Reig, Meritxell; Bedia, Carmen; Tauler, Romà; Jaumot, Joaquim

    2018-04-03

    The increasing complexity of omics research has encouraged the development of new instrumental technologies able to deal with these challenging samples. In this context, the rise of multidimensional separations should be highlighted, because they provide massive amounts of information together with enhanced analyte determination. Both proteomics and metabolomics benefit from the higher separation capacity achieved when different chromatographic dimensions are combined, either in LC or GC. However, this vast quantity of experimental information requires the application of chemometric data analysis strategies to retrieve the hidden knowledge, especially in the case of nontargeted studies. In this work, the most common chemometric tools and approaches for the analysis of multidimensional chromatographic data are reviewed. First, different options for data preprocessing and enhancement of the instrumental signal are introduced. Next, the most used chemometric methods for the detection of chromatographic peaks and the resolution of chromatographic and spectral contributions (profiling) are presented. The description of these data analysis approaches is complemented with enlightening examples from omics fields that demonstrate the exceptional potential of the combination of multidimensional separation techniques and chemometric tools of data analysis. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Acoustic chemometric prediction of total solids in bioslurry

    DEFF Research Database (Denmark)

    Ihunegbo, Felicia; Madsen, Michael; Esbensen, Kim

    2012-01-01

    .86%) in the range of 5.8–10.8% w/w dry matter. Based on these excellent prediction performance measures, it is concluded that acoustic chemometrics has come of age as a full grown PAT approach for on-line monitoring of dry matter (TS) in complex bioslurry, with a promising application potential in other biomass...

  7. FCERI AND HISTAMINE METABOLISM GENE VARIABILITY IN SELECTIVE RESPONDERS TO NSAIDS

    Directory of Open Access Journals (Sweden)

    Gemma Amo

    2016-09-01

    The high-affinity IgE receptor (Fcε RI) is a heterotetramer of three subunits: Fcε RIα, Fcε RIβ and Fcε RIγ (αβγ2), encoded by three genes designated as FCER1A, FCER1B (MS4A2) and FCER1G, respectively. Recent evidence points to FCERI gene variability as a relevant factor in the risk of developing allergic diseases. Because Fcε RI plays a key role in the events downstream of the triggering factors in immunological response, we hypothesized that FCERI gene variants might be related with the risk of, or with the clinical response to, selective (IgE-mediated) non-steroidal anti-inflammatory drug (NSAID) hypersensitivity. From a cohort of 314 patients suffering from selective hypersensitivity to metamizole, ibuprofen, diclofenac, paracetamol, acetylsalicylic acid (ASA), propifenazone, naproxen, ketoprofen, dexketoprofen, etofenamate, aceclofenac, etoricoxib, dexibuprofen, indomethacin, oxyphenylbutazone or piroxicam, and 585 unrelated healthy controls that tolerated these NSAIDs, we analyzed the putative effects of the FCERI SNPs FCER1A rs2494262, rs2427837 and rs2251746; FCER1B rs1441586, rs569108 and rs512555; FCER1G rs11587213, rs2070901 and rs11421. Furthermore, in order to identify additional genetic markers which might be associated with the risk of developing selective NSAID hypersensitivity, or which may modify the putative association of FCERI gene variations with risk, we analyzed polymorphisms known to affect histamine synthesis or metabolism, such as rs17740607, rs2073440, rs1801105, rs2052129, rs10156191, rs1049742 and rs1049793 in the HDC, HNMT and DAO genes. No major genetic associations with risk or with clinical presentation, and no gene-gene interactions or gene-phenotype interactions (including age, gender, IgE concentration, antecedents of atopy, culprit drug or clinical presentation), were identified in patients. However, logistic regression analyses indicated that the presence of antecedents of atopy and the DAO SNP rs2052129 (GG

  8. Disruption of Brewers' yeast by hydrodynamic cavitation: Process variables and their influence on selective release.

    Science.gov (United States)

    Balasundaram, B; Harrison, S T L

    2006-06-05

    Intracellular products, not secreted from the microbial cell, are released by breaking the cell envelope consisting of cytoplasmic membrane and an outer cell wall. Hydrodynamic cavitation has been reported to cause microbial cell disruption. By manipulating the operating variables involved, a wide range of intensity of cavitation can be achieved, resulting in a varying extent of disruption. The effect of the process variables, including cavitation number, initial cell concentration of the suspension and the number of passes across the cavitation zone, on the release of enzymes from various locations of the Brewers' yeast was studied. The enzymes whose release profiles were studied include alpha-glucosidase (periplasmic), invertase (cell wall bound), alcohol dehydrogenase (ADH; cytoplasmic) and glucose-6-phosphate dehydrogenase (G6PDH; cytoplasmic). An optimum cavitation number Cv of 0.13 for maximum disruption was observed across the range Cv 0.09-0.99. The optimum cell concentration was found to be 0.5% (w/v, wet wt) when varying over the range 0.1%-5%. The sustained effect of cavitation on the yeast cell wall when re-circulating the suspension across the cavitation zone was found to release the cell wall bound enzyme invertase (86%) to a greater extent than the enzymes from other locations of the cell (e.g. periplasmic alpha-glucosidase at 17%). Localised damage to the cell wall could be observed using transmission electron microscopy (TEM) of cells subjected to less intense cavitation conditions. The absence of significant release of cytoplasmic enzymes, the absence of micronisation as observed by TEM, and the lower number of protein bands in the culture supernatant on SDS-PAGE analysis following hydrodynamic cavitation, compared to disruption by high-pressure homogenisation, confirmed the selective release offered by hydrodynamic cavitation. Copyright 2006 Wiley Periodicals, Inc.

  9. Selection of entropy-measure parameters for knowledge discovery in heart rate variability data.

    Science.gov (United States)

    Mayer, Christopher C; Bachler, Martin; Hörtenhuber, Matthias; Stocker, Christof; Holzinger, Andreas; Wassertheurer, Siegfried

    2014-01-01

    Heart rate variability is the variation of the time interval between consecutive heartbeats. Entropy is a commonly used tool to describe the regularity of data sets. Entropy functions are defined using multiple parameters, the selection of which is controversial and depends on the intended purpose. This study describes the results of tests conducted to support parameter selection, towards the goal of enabling further biomarker discovery. This study deals with approximate, sample, fuzzy, and fuzzy measure entropies. All data were obtained from PhysioNet, a free-access, on-line archive of physiological signals, and represent various medical conditions. Five tests were defined and conducted to examine the influence of: varying the threshold value r (as multiples of the sample standard deviation σ, or the entropy-maximizing rChon), the data length N, the weighting factors n for fuzzy and fuzzy measure entropies, and the thresholds rF and rL for fuzzy measure entropy. The results were tested for normality using Lilliefors' composite goodness-of-fit test. Consequently, the p-value was calculated with either a two sample t-test or a Wilcoxon rank sum test. The first test shows a cross-over of entropy values with regard to a change of r. Thus, a clear statement that a higher entropy corresponds to a high irregularity is not possible, but is rather an indicator of differences in regularity. N should be at least 200 data points for r = 0.2 σ and should even exceed a length of 1000 for r = rChon. The results for the weighting parameters n for the fuzzy membership function show different behavior when coupled with different r values, therefore the weighting parameters have been chosen independently for the different threshold values. The tests concerning rF and rL showed that there is no optimal choice, but r = rF = rL is reasonable with r = rChon or r = 0.2σ. Some of the tests showed a dependency of the test significance on the data at hand. Nevertheless, as the medical
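
To make the roles of the threshold r and the data length N in the entry above concrete, the following is a minimal sketch of sample entropy with r given as a multiple of the sample standard deviation (r = 0.2σ by default). The function name, the embedding dimension m = 2, and the use of "distance <= r" as the matching rule are assumptions chosen for illustration; they are not taken from the study.

```python
import math

def sample_entropy(series, m=2, r_factor=0.2):
    """Plain O(N^2) sample entropy sketch; r = r_factor * sample SD (assumed convention)."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in series) / (n - 1))
    r = r_factor * sd

    def count_matches(length):
        # Use the same N - m template start points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between two templates (self-matches excluded).
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)       # matches of length m
    a = count_matches(m + 1)   # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")    # undefined for too-short or too-irregular data
    return -math.log(a / b)
```

A strictly periodic series gives a value of zero, while irregular data give positive values; short series often hit the undefined branch, which is one practical reason a minimum N matters.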

  10. Improving the Classification Accuracy for Near-Infrared Spectroscopy of Chinese Salvia miltiorrhiza Using Local Variable Selection

    Directory of Open Access Journals (Sweden)

    Lianqing Zhu

    2018-01-01

    In order to improve the classification accuracy of Chinese Salvia miltiorrhiza using near-infrared spectroscopy, a novel local variable selection strategy is proposed. Combining the strengths of the local algorithm and interval partial least squares, the spectral data are first divided into several pairs of classes in the sample direction and into equidistant subintervals in the variable direction. Then, for each pair of classes, a local classification model is built and the most suitable spectral region is selected using a new evaluation criterion that considers both classification error rate and best predictive ability under a leave-one-out cross-validation scheme. Finally, each observation is assigned to a class according to a statistical analysis of the classification results of the local classification models built on the selected variables. The performance of the proposed method was demonstrated on near-infrared spectra of cultivated or wild Salvia miltiorrhiza collected from 8 geographical origins in 5 provinces of China. For comparison, soft independent modelling of class analogy and partial least squares discriminant analysis were each employed as the classification model. Experimental results showed that the classification performance of the model with local variable selection was clearly better than that without variable selection.

  11. A modification of the successive projections algorithm for spectral variable selection in the presence of unknown interferents.

    Science.gov (United States)

    Soares, Sófacles Figueredo Carreiro; Galvão, Roberto Kawakami Harrop; Araújo, Mário César Ugulino; da Silva, Edvan Cirino; Pereira, Claudete Fernandes; de Andrade, Stéfani Iury Evangelista; Leite, Flaviano Carvalho

    2011-03-09

    This work proposes a modification to the successive projections algorithm (SPA) aimed at selecting spectral variables for multiple linear regression (MLR) in the presence of unknown interferents not included in the calibration data set. The modified algorithm favours the selection of variables in which the effect of the interferent is less pronounced. The proposed procedure can be regarded as an adaptive modelling technique, because the spectral features of the samples to be analyzed are considered in the variable selection process. The advantages of this new approach are demonstrated in two analytical problems, namely (1) ultraviolet-visible spectrometric determination of tartrazine, allure red and sunset yellow in aqueous solutions under the interference of erythrosine, and (2) near-infrared spectrometric determination of ethanol in gasoline under the interference of toluene. In these case studies, the performance of conventional MLR-SPA models is substantially degraded by the presence of the interferent. This problem is circumvented by applying the proposed Adaptive MLR-SPA approach, which results in prediction errors smaller than those obtained by three other multivariate calibration techniques, namely stepwise regression, full-spectrum partial-least-squares (PLS) and PLS with variables selected by a genetic algorithm. An inspection of the variable selection results reveals that the Adaptive approach successfully avoids spectral regions in which the interference is more intense. Copyright © 2011 Elsevier B.V. All rights reserved.
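
For orientation, the projection step at the heart of the conventional successive projections algorithm can be sketched as below. This is the classical SPA only, not the adaptive modification proposed in the entry above; the function name, the starting column, and the absence of any validation-based choice of the number of variables are simplifying assumptions.

```python
import numpy as np

def spa_select(X, n_select, start=0):
    """Classical SPA sketch: pick columns of X with minimal collinearity.

    X is (samples x variables); `start` is the index of the first variable.
    Assumes no selected column becomes exactly zero after projection.
    """
    Xp = np.array(X, dtype=float)  # working copy that is projected in place
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]]
        # Project every column onto the subspace orthogonal to the last pick.
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0  # never re-select an earlier variable
        selected.append(int(np.argmax(norms)))
    return selected
```

In a full MLR-SPA workflow the selected subsets would then be compared by prediction error on a validation set; that step is omitted here.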

  12. Chemometric methods and near-infrared spectroscopy applied to bioenergy production

    International Nuclear Information System (INIS)

    Liebmann, B.

    2010-01-01

    data analysis (i) successfully determine the concentrations of moisture, protein, and starch in the feedstock material as well as glucose, ethanol, glycerol, lactic acid, acetic acid in the processed bioethanol broths; (ii) and allow quantifying a complex biofuel's property such as the heating value. At the third stage, this thesis focuses on new chemometric methods that improve mathematical analysis of multivariate data such as NIR spectra. The newly developed method 'repeated double cross validation' (rdCV) separates optimization of regression models from tests of model performance; furthermore, rdCV estimates the variability of the model performance based on a large number of prediction errors from test samples. The rdCV procedure has been applied to both the classical PLS regression and the robust 'partial robust M' regression method, which can handle erroneous data. The peculiar and relatively unknown 'random projection' method is tested for its potential of dimensionality reduction of data from chemometrics and chemoinformatics. The main findings are: (i) rdCV fosters a realistic assessment of model performance, (ii) robust regression has outstanding performance for data containing outliers and thus is strongly recommendable, and (iii) random projection is a useful niche application for high-dimensional data combined with possible restrictions in data storage and computing time. The three chemometric methods described are available as functions for the free software R. (author) [de

  13. Chemometrics in analytical chemistry-part I: history, experimental design and data analysis tools.

    Science.gov (United States)

    Brereton, Richard G; Jansen, Jeroen; Lopes, João; Marini, Federico; Pomerantsev, Alexey; Rodionova, Oxana; Roger, Jean Michel; Walczak, Beata; Tauler, Romà

    2017-10-01

    Chemometrics has achieved major recognition and progress in the analytical chemistry field. In the first part of this tutorial, major achievements and contributions of chemometrics to some of the more important stages of the analytical process, like experimental design, sampling, and data analysis (including data pretreatment and fusion), are summarised. The tutorial is intended to give a general updated overview of the chemometrics field to further contribute to its dissemination and promotion in analytical chemistry.

  14. Selecting sagebrush seed sources for restoration in a variable climate: ecophysiological variation among genotypes

    Science.gov (United States)

    Germino, Matthew J.

    2012-01-01

    Big sagebrush (Artemisia tridentata) communities dominate a large fraction of the United States and provide critical habitat for a number of wildlife species of concern. Loss of big sagebrush due to fire followed by poor restoration success continues to reduce ecological potential of this ecosystem type, particularly in the Great Basin. Choice of appropriate seed sources for restoration efforts is currently unguided due to knowledge gaps on genetic variation and local adaptation as they relate to a changing landscape. We are assessing ecophysiological responses of big sagebrush to climate variation, comparing plants that germinated from ~20 geographically distinct populations of each of the three subspecies of big sagebrush. Seedlings were previously planted into common gardens by US Forest Service collaborators Drs. B. Richardson and N. Shaw (USFS Rocky Mountain Research Station, Provo, Utah and Boise, Idaho) as part of the Great Basin Native Plant Selection and Increase Project. Seed sources spanned all states in the conterminous Western United States. Germination, establishment, growth and ecophysiological responses are being linked to genomics and foliar palatability. New information is being produced to aid choice of appropriate seed sources by Bureau of Land Management and USFS field offices when they are planning seed acquisitions for emergency post-fire rehabilitation projects while considering climate variability and wildlife needs.

  15. Selective attrition and intraindividual variability in response time moderate cognitive change.

    Science.gov (United States)

    Yao, Christie; Stawski, Robert S; Hultsch, David F; MacDonald, Stuart W S

    2016-01-01

    Selection of a developmental time metric is useful for understanding causal processes that underlie aging-related cognitive change and for the identification of potential moderators of cognitive decline. Building on research suggesting that time to attrition is a metric sensitive to non-normative influences of aging (e.g., subclinical health conditions), we examined reason for attrition and intraindividual variability (IIV) in reaction time as predictors of cognitive performance. Three hundred and four community dwelling older adults (64-92 years) completed annual assessments in a longitudinal study. IIV was calculated from baseline performance on reaction time tasks. Multilevel models were fit to examine patterns and predictors of cognitive change. We show that time to attrition was associated with cognitive decline. Greater IIV was associated with declines on executive functioning and episodic memory measures. Attrition due to personal health reasons was also associated with decreased executive functioning compared to that of individuals who remained in the study. These findings suggest that time to attrition is a useful metric for representing cognitive change, and reason for attrition and IIV are predictive of non-normative influences that may underlie instances of cognitive loss in older adults.

  16. Resiliency and subjective health assessment. Moderating role of selected psychosocial variables

    Directory of Open Access Journals (Sweden)

    Michalina Sołtys

    2015-12-01

    Background: Resiliency is defined as a relatively permanent personality trait, which may be assigned to the category of health resources. The aim of this study was to determine the conditions under which resiliency is a significant health resource (moderation), thereby broadening knowledge of the specifics of the relationship between resiliency and subjective health assessment. Participants and procedure: The study included 142 individuals. To examine the level of resiliency, the Assessment Resiliency Scale (SPP-25) by N. Ogińska-Bulik and Z. Juczyński was used. Participants evaluated their subjective health state by means of a visual analogue scale. Additionally, the following moderating variables were controlled: sex, objective health status, having a partner, professional activity and age. These data were obtained by personal survey. Results: The results confirmed the relationship between resiliency and subjective health assessment. Multiple regression analysis revealed that sex, having a partner and professional activity are significant moderators of the association between level of resiliency and subjective health evaluation. However, statistically significant interaction effects for health status and age as moderators were not observed. Conclusions: Resiliency is associated with subjective health assessment among adults, and selected socio-demographic features (such as sex, having a partner, professional activity) moderate this relationship. This confirms the significant role of resiliency as a health resource and is a reason to emphasize the benefits of enhancing the potential of individuals for their psychophysical wellbeing. However, the research requires replication in a more homogeneous sample.

  17. Variable Selection for Nonparametric Gaussian Process Priors: Models and Computational Strategies.

    Science.gov (United States)

    Savitsky, Terrance; Vannucci, Marina; Sha, Naijun

    2011-02-01

    This paper presents a unified treatment of Gaussian process models that extends to data from the exponential dispersion family and to survival data. Our specific interest is in the analysis of data sets with predictors that have an a priori unknown form of possibly nonlinear associations to the response. The modeling approach we describe incorporates Gaussian processes in a generalized linear model framework to obtain a class of nonparametric regression models where the covariance matrix depends on the predictors. We consider, in particular, continuous, categorical and count responses. We also look into models that account for survival outcomes. We explore alternative covariance formulations for the Gaussian process prior and demonstrate the flexibility of the construction. Next, we focus on the important problem of selecting variables from the set of possible predictors and describe a general framework that employs mixture priors. We compare alternative MCMC strategies for posterior inference and achieve a computationally efficient and practical approach. We demonstrate performances on simulated and benchmark data sets.

  18. Relationship of Powder Feedstock Variability to Microstructure and Defects in Selective Laser Melted Alloy 718

    Science.gov (United States)

    Smith, T. M.; Kloesel, M. F.; Sudbrack, C. K.

    2017-01-01

    Powder-bed additive manufacturing processes use fine powders to build parts layer by layer. For selective laser melted (SLM) Alloy 718, the powders that are available off-the-shelf are in the 10-45 or 15-45 micron size range. A comprehensive investigation of sixteen powders from these typical ranges and two off-nominal-sized powders is underway to gain insight into the impact of feedstock on processing, durability and performance of 718 SLM space-flight hardware. This talk emphasizes an aspect of this work: the impact of powder variability on the microstructure and defects observed in the as-fabricated and fully heat-treated material, where lab-scale components were built using vendor-recommended parameters. These typical powders exhibit variation in composition, percentage of fines, roughness, morphology and particle size distribution. How these differences relate to the melt-pool size, porosity, grain structure, precipitate distributions, and inclusion content will be presented and discussed in the context of build quality and powder acceptance.

  19. r2VIM: A new variable selection method for random forests in genome-wide association studies.

    Science.gov (United States)

    Szymczak, Silke; Holzinger, Emily; Dasgupta, Abhijit; Malley, James D; Molloy, Anne M; Mills, James L; Brody, Lawrence C; Stambolian, Dwight; Bailey-Wilson, Joan E

    2016-01-01

    Machine learning methods and in particular random forests (RFs) are a promising alternative to standard single SNP analyses in genome-wide association studies (GWAS). RFs provide variable importance measures (VIMs) to rank SNPs according to their predictive power. However, in contrast to the established genome-wide significance threshold, no clear criteria exist to determine how many SNPs should be selected for downstream analyses. We propose a new variable selection approach, recurrent relative variable importance measure (r2VIM). Importance values are calculated relative to an observed minimal importance score for several runs of RF and only SNPs with large relative VIMs in all of the runs are selected as important. Evaluations on simulated GWAS data show that the new method controls the number of false-positives under the null hypothesis. Under a simple alternative hypothesis with several independent main effects it is only slightly less powerful than logistic regression. In an experimental GWAS data set, the same strong signal is identified while the approach selects none of the SNPs in an underpowered GWAS. The novel variable selection method r2VIM is a promising extension to standard RF for objectively selecting relevant SNPs in GWAS while controlling the number of false-positive results.
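
The selection rule described in the entry above (importance relative to the observed minimal importance, required in every run) can be sketched as follows. The input is a hypothetical runs-by-variables matrix of importance scores that would, in practice, come from repeated random forest fits; the function name and the threshold `factor` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def r2vim_select(importances, factor=1.0):
    """Return indices of variables whose relative importance is >= factor in all runs.

    importances: array of shape (n_runs, n_variables), one row per RF run.
    Assumes each run has a non-zero minimal importance score.
    """
    imp = np.asarray(importances, dtype=float)
    # Absolute value of the smallest (typically negative) importance in each run.
    min_abs = np.abs(imp.min(axis=1, keepdims=True))
    relative = imp / min_abs
    keep = (relative >= factor).all(axis=0)  # must pass in every run
    return np.flatnonzero(keep).tolist()
```

The idea is that the most negative importance in a run indicates the scale of pure noise, so only variables clearly above that scale in every run are retained.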

  20. An experiment on selecting most informative variables in socio-economic data

    Directory of Open Access Journals (Sweden)

    L. Jenkins

    2014-01-01

    In many studies where data are collected on several variables, there is a motivation to find out whether fewer variables would provide almost as much information. Variance of a variable about its mean is the common statistical measure of information content, and that is used here. We are interested in whether the variability in one variable is sufficiently correlated with that in one or more of the other variables that the first variable is redundant. We wish to find one or more ‘principal variables’ that sufficiently reflect the information content in all the original variables. The paper explains the method of principal variables and reports experiments using the technique to see if just a few variables are sufficient to reflect the information in 11 socioeconomic variables on 130 countries from a World Bank (WB) database. While the method of principal variables is highly successful in a statistical sense, the WB data vary greatly from year to year, demonstrating that fewer variables would be inadequate for these data.

  1. Application of mass spectrometry based electronic nose and chemometrics for fingerprinting radiation treatment

    Science.gov (United States)

    Gupta, Sumit; Variyar, Prasad S.; Sharma, Arun

    2015-01-01

    Volatile compounds were isolated from apples and grapes employing solid phase micro extraction (SPME) and subsequently analyzed by GC/MS equipped with a transfer line without stationary phase. The single peak obtained was integrated to obtain the total mass spectrum of the volatile fraction of the samples. A data matrix containing the relative abundance of all mass-to-charge ratios was subjected to principal component analysis (PCA) and linear discriminant analysis (LDA) to identify radiation treatment. PCA results suggested that there is sufficient variability between control and irradiated samples to build classification models based on supervised techniques. LDA successfully aided in segregating control from irradiated samples at all doses (0.1, 0.25, 0.5, 1.0, 1.5, 2.0 kGy). SPME-MS with chemometrics was successfully demonstrated as a simple screening method for radiation treatment.

  2. Distribution and mobility of metals in contaminated sites. chemometric investigation of pollutant profiles.

    Science.gov (United States)

    Abollino, Ornella; Aceto, Maurizio; Malandrino, Mery; Mentasti, Edoardo; Sarzanini, Corrado; Barberis, Renzo

    2002-01-01

    The distribution and mobility of heavy metals in the soils of two contaminated sites in Piedmont (Italy) were investigated, evaluating the horizontal and vertical profiles of 15 metals, namely Al, Cd, Cu, Cr, Fe, La, Mn, Ni, Pb, Sc, Ti, V, Y, Zn and Zr. The concentrations in the most polluted areas of the sites were higher than the acceptable limits reported in Italian and Dutch legislation for soil reclamation. Chemometric elaboration of the results by pattern recognition techniques allowed us to identify groups of samples with similar characteristics and to find correlations among the variables. The pollutant mobility was studied by extraction with water, dilute acetic acid and EDTA and by applying Tessier's procedure. The fraction of mobile species, which potentially is the most harmful for the environment, was found to be higher than the one normally present in unpolluted soils, where heavy metals are, to a higher extent, strongly bound to the matrix.

  3. Chemometric approach to evaluate heavy metals’ content in Daucus Carota from different localities in Serbia

    Directory of Open Access Journals (Sweden)

    Mitic Violeta D.

    2015-01-01

    The aim of this study was to evaluate the heavy metal content of carrots (Daucus carota) from different localities in Serbia and to assess, by cluster analysis (CA) and principal components analysis (PCA), the heavy metal contamination of carrots from these areas. Carrots were collected at 13 locations in five districts. Chemometric methods (CA and PCA) were applied to classify localities according to heavy metal content in carrots. CA separated localities into two statistically significant clusters. PCA permitted the reduction of 12 variables to four principal components explaining 79.94% of the total variance. The first and most important principal component was strongly associated with the values of Cu, Sb, Pb and Tl. This study revealed that CA and PCA appear to be useful tools for differentiating localities in different districts using the heavy metal profile of carrot samples. [Project of the Ministry of Science of the Republic of Serbia, No. 172051]
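
A statement such as "four principal components explaining 79.94% of the total variance" comes from the cumulative explained-variance ratio of a PCA. The sketch below shows one common way to compute it, via the singular values of the autoscaled data matrix; the function names are hypothetical, autoscaling is an assumption about preprocessing, and the data in the usage note are made up.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component.

    X is (samples x variables); columns are mean-centred and scaled to unit
    variance first (assumes no column is constant).
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    s = np.linalg.svd(Z, compute_uv=False)  # singular values, descending
    var = s ** 2
    return var / var.sum()

def n_components_for(X, target):
    """Smallest number of components whose cumulative ratio reaches `target`."""
    cum = np.cumsum(explained_variance_ratio(X))
    k = int(np.searchsorted(cum, target)) + 1
    return min(k, len(cum))
```

For example, `n_components_for(X, 0.75)` on a 13 x 12 concentration matrix would return how many components are needed to retain at least 75% of the variance.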

  4. Stochastic weather inputs for improved urban water demand forecasting: application of nonlinear input variable selection and machine learning methods

    Science.gov (United States)

    Quilty, J.; Adamowski, J. F.

    2015-12-01

    Urban water supply systems are often stressed during seasonal outdoor water use, as climate-related water demands are variable in nature, making it difficult to optimize the operation of the water supply system. Urban water demand (UWD) forecasts that fail to include meteorological conditions as inputs to the forecast model may be poor, as they cannot account for the increase or decrease in demand related to meteorological conditions. Meteorological records stochastically simulated into the future can be used as inputs to data-driven UWD forecasts, generally resulting in improved forecast accuracy. This study aims to produce data-driven UWD forecasts for two different Canadian water utilities (Montreal and Victoria) using machine learning methods, by first selecting historical UWD and meteorological records derived from a stochastic weather generator using nonlinear input variable selection. The nonlinear input variable selection methods considered in this work are derived from the concept of conditional mutual information, a nonlinear dependency measure based on (multivariate) probability density functions that accounts for relevancy, conditional relevancy, and redundancy within a potential set of input variables. The results of our study indicate that stochastic weather inputs can improve UWD forecast accuracy for the two sites considered in this work. Nonlinear input variable selection is suggested as a means to identify which meteorological conditions should be utilized in the forecast.
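
How conditional mutual information captures both relevancy and redundancy can be seen in a small discrete example. The sketch below is a plug-in estimate of I(X; Y | Z) from co-occurrence counts; it is a deliberate simplification, since the study works with continuous variables and density estimates, and the function name is hypothetical.

```python
from collections import Counter
from math import log2

def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits for discrete sequences."""
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), c in c_xyz.items():
        # p(x,y,z) * log[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ], written with counts.
        cmi += (c / n) * log2(c * c_z[zi] / (c_xz[(xi, zi)] * c_yz[(yi, zi)]))
    return cmi
```

Here X plays the role of the demand to be forecast, Y a candidate input, and Z the inputs already selected: a candidate that only repeats what Z already provides scores zero, while a candidate carrying new information about X scores above zero.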

  5. Variations in Carabidae assemblages across the farmland habitats in relation to selected environmental variables including soil properties

    Directory of Open Access Journals (Sweden)

    Beáta Baranová

    2018-03-01

    The variations in ground beetle (Coleoptera: Carabidae) assemblages across three types of farmland habitats (arable land, meadows and woody vegetation) were studied in relation to vegetation cover structure, intensity of agrotechnical interventions and selected soil properties. Material was pitfall trapped in 2010 and 2011 at twelve sites in the agricultural landscape of the town of Prešov and its near vicinity, Eastern Slovakia. A total of 14,763 ground beetle individuals were trapped. The material comprised 92 Carabidae species, with the following six species dominating: Poecilus cupreus, Pterostichus melanarius, Pseudoophonus rufipes, Brachinus crepitans, Anchomenus dorsalis and Poecilus versicolor. The studied habitats differed significantly in the number of trapped individuals, activity abundance and the representation of carabids according to their habitat preferences and ability to fly. However, no significant difference was observed in diversity, evenness or dominance. The most significant environmental variables affecting the species variability of Carabidae assemblages were soil moisture and the herb layer of 0-20 cm. Other important variables chosen by forward selection were intensity of agrotechnical interventions, humus content and shrub vegetation. The remaining selected soil properties seem to be of only secondary importance for adult carabids. Environmental variables have the strongest effect on habitat specialists, whereas ground beetles without special requirements for habitat quality seem to be only slightly affected by the studied environmental variables.

  6. Developing a spatial-statistical model and map of historical malaria prevalence in Botswana using a staged variable selection procedure

    Directory of Open Access Journals (Sweden)

    Mabaso Musawenkosi LH

    2007-09-01

    Background: Several malaria risk maps have been developed in recent years, many from the prevalence of infection data collated by the MARA (Mapping Malaria Risk in Africa) project, and using various environmental data sets as predictors. Variable selection is a major obstacle due to analytical problems caused by over-fitting, confounding and non-independence in the data. Testing and comparing every combination of explanatory variables in a Bayesian spatial framework remains unfeasible for most researchers. The aim of this study was to develop a malaria risk map using a systematic and practicable variable selection process for spatial analysis and mapping of historical malaria risk in Botswana. Results: Of 50 potential explanatory variables from eight environmental data themes, 42 were significantly associated with malaria prevalence in univariate logistic regression and were ranked by the Akaike Information Criterion. Those correlated with higher-ranking relatives of the same environmental theme were temporarily excluded. The remaining 14 candidates were ranked by selection frequency after running automated step-wise selection procedures on 1000 bootstrap samples drawn from the data. A non-spatial multiple-variable model was developed through step-wise inclusion in order of selection frequency. Previously excluded variables were then re-evaluated for inclusion, using further step-wise bootstrap procedures, resulting in the exclusion of another variable. Finally, a Bayesian geo-statistical model using Markov Chain Monte Carlo simulation was fitted to the data, resulting in a final model of three predictor variables, namely summer rainfall, mean annual temperature and altitude. Each was independently and significantly associated with malaria prevalence after allowing for spatial correlation. This model was used to predict malaria prevalence at unobserved locations, producing a smooth risk map for the whole country. Conclusion: We have

  7. Dynamic variable selection in SNP genotype autocalling from APEX microarray data

    Directory of Open Access Journals (Sweden)

    Zamar Ruben H

    2006-11-01

    Background: Single nucleotide polymorphisms (SNPs) are DNA sequence variations, occurring when a single nucleotide – adenine (A), thymine (T), cytosine (C) or guanine (G) – is altered. Arguably, SNPs account for more than 90% of human genetic variation. Our laboratory has developed a highly redundant SNP genotyping assay consisting of multiple probes with signals from multiple channels for a single SNP, based on arrayed primer extension (APEX). This mini-sequencing method is a powerful combination of a highly parallel microarray with distinctive Sanger-based dideoxy terminator sequencing chemistry. Using this microarray platform, our current genotype calling system (known as SNP Chart) is capable of calling single SNP genotypes by manual inspection of the APEX data, which is time-consuming and exposed to user subjectivity bias. Results: Using a set of 32 Coriell DNA samples plus three negative PCR controls as a training data set, we have developed a fully-automated genotyping algorithm based on simple linear discriminant analysis (LDA) using dynamic variable selection. The algorithm combines separate analyses based on the multiple probe sets to give a final posterior probability for each candidate genotype. We have tested our algorithm on a completely independent data set of 270 DNA samples, with validated genotypes, from patients admitted to the intensive care unit (ICU) of St. Paul's Hospital (plus one negative PCR control sample). Our method achieves a concordance rate of 98.9% with a 99.6% call rate for a set of 96 SNPs. By adjusting the threshold value for the final posterior probability of the called genotype, the call rate reduces to 94.9% with a higher concordance rate of 99.6%. We also reversed the two independent data sets in their training and testing roles, achieving a concordance rate up to 99.8%. Conclusion: The strength of this APEX chemistry-based platform is its unique redundancy having multiple probes for a single SNP. Our
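
The trade-off reported in the autocalling entry above, where raising the posterior-probability threshold lowers the call rate but raises concordance, can be reproduced in a few lines. The function name and the toy data in the usage note are assumptions for illustration only; they are not drawn from the study.

```python
def call_metrics(posteriors, calls, truth, threshold):
    """Return (call rate, concordance) when only calls with posterior >= threshold are kept.

    posteriors: posterior probability of each called genotype
    calls:      genotype called for each sample/SNP
    truth:      validated genotype for each sample/SNP
    """
    kept = [(c, t) for p, c, t in zip(posteriors, calls, truth) if p >= threshold]
    call_rate = len(kept) / len(calls)
    if not kept:
        return call_rate, float("nan")  # concordance undefined with no calls
    concordance = sum(c == t for c, t in kept) / len(kept)
    return call_rate, concordance
```

With four toy calls of which the least confident one is wrong, `call_metrics(p, calls, truth, 0.5)` keeps everything (call rate 1.0, concordance 0.75), while a threshold of 0.8 drops the doubtful call (call rate 0.75, concordance 1.0).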

  8. Network-based group variable selection for detecting expression quantitative trait loci (eQTL)

    Directory of Open Access Journals (Sweden)

    Zhang Xuegong

    2011-06-01

    Full Text Available Abstract Background Analysis of expression quantitative trait loci (eQTL) aims to identify the genetic loci associated with the expression level of genes. Penalized regression with a proper penalty is suitable for the high-dimensional biological data. Its performance should be enhanced when we incorporate biological knowledge of gene expression network and linkage disequilibrium (LD) structure between loci in high-noise background. Results We propose a network-based group variable selection (NGVS) method for QTL detection. Our method simultaneously maps highly correlated expression traits sharing the same biological function to marker sets formed by LD. By grouping markers, complex joint activity of multiple SNPs can be considered and the dimensionality of eQTL problem is reduced dramatically. In order to demonstrate the power and flexibility of our method, we used it to analyze two simulations and a mouse obesity and diabetes dataset. We considered the gene co-expression network, grouped markers into marker sets and treated the additive and dominant effect of each locus as a group: as a consequence, we were able to replicate results previously obtained on the mouse linkage dataset. Furthermore, we observed several possible sex-dependent loci and interactions of multiple SNPs. Conclusions The proposed NGVS method is appropriate for problems with high-dimensional data and high-noise background. On eQTL problem it outperforms the classical Lasso method, which does not consider biological knowledge. Introduction of proper gene expression and loci correlation information makes detecting causal markers more accurate. With reasonable model settings, NGVS can lead to novel biological findings.

  9. Bayesian nonparametric variable selection as an exploratory tool for discovering differentially expressed genes.

    Science.gov (United States)

    Shahbaba, Babak; Johnson, Wesley O

    2013-05-30

    High-throughput scientific studies involving no clear a priori hypothesis are common. For example, a large-scale genomic study of a disease may examine thousands of genes without hypothesizing that any specific gene is responsible for the disease. In these studies, the objective is to explore a large number of possible factors (e.g., genes) in order to identify a small number that will be considered in follow-up studies that tend to be more thorough and on smaller scales. A simple, hierarchical, linear regression model with random coefficients is assumed for case-control data that correspond to each gene. The specific model used will be seen to be related to a standard Bayesian variable selection model. Relatively large regression coefficients correspond to potential differences in responses for cases versus controls and thus to genes that might 'matter'. For large-scale studies, and using a Dirichlet process mixture model for the regression coefficients, we are able to find clusters of regression effects of genes with increasing potential effect or 'relevance', in relation to the outcome of interest. One cluster will always correspond to genes whose coefficients are in a neighborhood that is relatively close to zero and will be deemed least relevant. Other clusters will correspond to increasing magnitudes of the random/latent regression coefficients. Using simulated data, we demonstrate that our approach could be quite effective in finding relevant genes compared with several alternative methods. We apply our model to two large-scale studies. The first study involves transcriptome analysis of infection by human cytomegalovirus. The second study's objective is to identify differentially expressed genes between two types of leukemia. Copyright © 2012 John Wiley & Sons, Ltd.

  10. Determination of Commercials Cooking Oils and Fats Using Chemometrics Methods

    International Nuclear Information System (INIS)

    Azwan Mat Lazim; Mohd Zuli Jaafar; Phang Wei Shong, P.W.; Suzereen Jamil

    2013-01-01

    In this study, a chemometric method was used to determine oil quality. The samples were olive oil, sunflower oil and butter from two different brands. Each sample was tested under two conditions, fresh or fried. Titration, a conventional method, was used to determine free fatty acid (FFA) content, iodine value (IV), and peroxide value (PV). Twelve samples were then analysed and their FTIR spectra were measured at 4000-400 cm⁻¹. Computer simulation was used to process the data by pattern recognition, optimized by principal component analysis (PCA) and partial least squares (PLS). The PCA model was used to distinguish the properties of fresh and fried oil. The PLS model was used to predict values for a validation test in comparison with the conventional results. Results showed the validation value for fresh oil was 0.90. This indicated that the chemometric method was in agreement with the conventional method. (author)

  11. Determination and discrimination of biodiesel fuels by gas chromatographic and chemometric methods

    Science.gov (United States)

    Milina, R.; Mustafa, Z.; Bojilov, D.; Dagnon, S.; Moskovkina, M.

    2016-03-01

    Pattern recognition method (PRM) was applied to gas chromatographic (GC) data for the fatty acid methyl ester (FAME) composition of commercial and laboratory-synthesized biodiesel fuels from vegetable oils including sunflower, rapeseed, corn and palm oils. Two GC quantitative methods to calculate individual FAMEs were compared: Area % and internal standard. Both methods were applied to the analysis of two certified reference materials. The statistical processing of the obtained results demonstrates the accuracy and precision of the two methods and allows them to be compared. For further chemometric investigations of biodiesel fuels by their FAME profiles, either method can be used. PRM results of FAME profiles of samples from different vegetable oils show a successful recognition of biodiesels according to the feedstock. The information obtained can be used for selection of feedstock to produce biodiesels with certain properties, for assessing their interchangeability, and for fuel spillage and remedial actions in the environment.

  12. Determination and discrimination of biodiesel fuels by gas chromatographic and chemometric methods

    Directory of Open Access Journals (Sweden)

    Milina R.

    2016-03-01

    Full Text Available Pattern recognition method (PRM) was applied to gas chromatographic (GC) data for the fatty acid methyl ester (FAME) composition of commercial and laboratory-synthesized biodiesel fuels from vegetable oils including sunflower, rapeseed, corn and palm oils. Two GC quantitative methods to calculate individual FAMEs were compared: Area % and internal standard. Both methods were applied to the analysis of two certified reference materials. The statistical processing of the obtained results demonstrates the accuracy and precision of the two methods and allows them to be compared. For further chemometric investigations of biodiesel fuels by their FAME profiles, either method can be used. PRM results of FAME profiles of samples from different vegetable oils show a successful recognition of biodiesels according to the feedstock. The information obtained can be used for selection of feedstock to produce biodiesels with certain properties, for assessing their interchangeability, and for fuel spillage and remedial actions in the environment.

  13. QUASI-STELLAR OBJECT SELECTION ALGORITHM USING TIME VARIABILITY AND MACHINE LEARNING: SELECTION OF 1620 QUASI-STELLAR OBJECT CANDIDATES FROM MACHO LARGE MAGELLANIC CLOUD DATABASE

    International Nuclear Information System (INIS)

    Kim, Dae-Won; Protopapas, Pavlos; Alcock, Charles; Trichas, Markos; Byun, Yong-Ik; Khardon, Roni

    2011-01-01

    We present a new quasi-stellar object (QSO) selection algorithm using a Support Vector Machine, a supervised classification method, on a set of extracted time series features including period, amplitude, color, and autocorrelation value. We train a model that separates QSOs from variable stars, non-variable stars, and microlensing events using 58 known QSOs, 1629 variable stars, and 4288 non-variables in the MAssive Compact Halo Object (MACHO) database as a training set. To estimate the efficiency and the accuracy of the model, we perform a cross-validation test using the training set. The test shows that the model correctly identifies ∼80% of known QSOs with a 25% false-positive rate. The majority of the false positives are Be stars. We applied the trained model to the MACHO Large Magellanic Cloud (LMC) data set, which consists of 40 million light curves, and found 1620 QSO candidates. During the selection none of the 33,242 known MACHO variables were misclassified as QSO candidates. In order to estimate the true false-positive rate, we crossmatched the candidates with astronomical catalogs including the Spitzer Surveying the Agents of a Galaxy's Evolution LMC catalog and a few X-ray catalogs. The results further suggest that the majority of the candidates, more than 70%, are QSOs.

  14. Bayesian inference for the genetic control of water deficit tolerance in spring wheat by stochastic search variable selection.

    Science.gov (United States)

    Safari, Parviz; Danyali, Syyedeh Fatemeh; Rahimi, Mehdi

    2018-06-02

    Drought is the main abiotic stress seriously influencing wheat production. Information about the inheritance of drought tolerance is necessary to determine the most appropriate strategy to develop tolerant cultivars and populations. In this study, generation means analysis to identify the genetic effects controlling grain yield inheritance in water deficit and normal conditions was considered as a model selection problem in a Bayesian framework. Stochastic search variable selection (SSVS) was applied to identify the most important genetic effects and the best fitted models using different generations obtained from two crosses applying two water regimes in two growing seasons. The SSVS is used to evaluate the effect of each variable on the dependent variable via posterior variable inclusion probabilities. The model with the highest posterior probability is selected as the best model. In this study, the grain yield was controlled by the main effects (additive and non-additive effects) and epistatic effects. The results demonstrate that breeding methods such as recurrent selection and subsequent pedigree method and hybrid production can be useful to improve grain yield.
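    The two summaries this abstract relies on, posterior variable inclusion probabilities and the model with the highest posterior probability, are usually estimated from sampled inclusion-indicator vectors. The sketch below assumes such draws are already available from an MCMC run; the sampler itself is not shown, and the function names and toy draws are hypothetical rather than taken from the study.

```python
from collections import Counter

def inclusion_probabilities(draws):
    """Estimate each variable's posterior inclusion probability as the
    fraction of sampled models in which its indicator equals 1."""
    n_draws = len(draws)
    n_vars = len(draws[0])
    return [sum(d[j] for d in draws) / n_draws for j in range(n_vars)]

def most_probable_model(draws):
    """Return the most frequently visited indicator vector and its
    estimated posterior probability (visit frequency)."""
    counts = Counter(tuple(d) for d in draws)
    model, count = counts.most_common(1)[0]
    return model, count / len(draws)

# Toy indicator draws for three candidate effects (assumed values).
# Each tuple marks which effects are in the model at one iteration.
draws = [(1, 0, 1), (1, 0, 1), (1, 1, 0), (1, 0, 1), (0, 0, 1)]

print(inclusion_probabilities(draws))
print(most_probable_model(draws))
```

    With real output the draws would number in the thousands and be taken after burn-in; the toy list is kept short only so the arithmetic is easy to follow.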

  15. Angular scanning and variable wavelength surface plasmon resonance allowing free sensor surface selection for optimum material- and bio-sensing

    NARCIS (Netherlands)

    Lakayan, Dina; Tuppurainen, Jussipekka; Albers, Martin; van Lint, Matthijs J.; van Iperen, Dick J.; Weda, Jelmer J.A.; Kuncova-Kallio, Johana; Somsen, Govert W.; Kool, Jeroen

    2018-01-01

    A variable-wavelength Kretschmann configuration surface plasmon resonance (SPR) apparatus with angle scanning is presented. The setup provides the possibility of selecting the optimum wavelength with respect to the properties of the metal layer of the sensorchip, sample matrix, and biomolecular

  16. Multivariate modeling of complications with data driven variable selection: Guarding against overfitting and effects of data set size

    International Nuclear Information System (INIS)

    Schaaf, Arjen van der; Xu Chengjian; Luijk, Peter van; Veld, Aart A. van’t; Langendijk, Johannes A.; Schilstra, Cornelis

    2012-01-01

    Purpose: Multivariate modeling of complications after radiotherapy is frequently used in conjunction with data driven variable selection. This study quantifies the risk of overfitting in a data driven modeling method using bootstrapping for data with typical clinical characteristics, and estimates the minimum amount of data needed to obtain models with relatively high predictive power. Materials and methods: To facilitate repeated modeling and cross-validation with independent datasets for the assessment of true predictive power, a method was developed to generate simulated data with statistical properties similar to real clinical data sets. Characteristics of three clinical data sets from radiotherapy treatment of head and neck cancer patients were used to simulate data with set sizes between 50 and 1000 patients. A logistic regression method using bootstrapping and forward variable selection was used for complication modeling, resulting for each simulated data set in a selected number of variables and an estimated predictive power. The true optimal number of variables and true predictive power were calculated using cross-validation with very large independent data sets. Results: For all simulated data set sizes the number of variables selected by the bootstrapping method was on average close to the true optimal number of variables, but showed considerable spread. Bootstrapping is more accurate in selecting the optimal number of variables than the AIC and BIC alternatives, but this did not translate into a significant difference of the true predictive power. The true predictive power asymptotically converged toward a maximum predictive power for large data sets, and the estimated predictive power converged toward the true predictive power. More than half of the potential predictive power is gained after approximately 200 samples. 
Our simulations demonstrated severe overfitting (a predictive power lower than that of predicting 50% probability) in a number of small

  17. The Salience of Selected Variables on Choice for Movie Attendance among High School Students.

    Science.gov (United States)

    Austin, Bruce A.

    A questionnaire was designed for a study assessing both the importance of 28 variables in movie attendance and the importance of movie-going as a leisure-time activity. Respondents were 130 ninth and twelfth grade students. The 28 variables were broadly organized into eight categories: movie production personnel, production elements, advertising,…

  18. PLS-based and regularization-based methods for the selection of relevant variables in non-targeted metabolomics data

    Directory of Open Access Journals (Sweden)

    Renata Bujak

    2016-07-01

    Full Text Available Non-targeted metabolomics constitutes a part of systems biology and aims to determine many metabolites in complex biological samples. Datasets obtained in non-targeted metabolomics studies are multivariate and high-dimensional due to the sensitivity of mass spectrometry-based detection methods as well as complexity of biological matrices. Proper selection of variables which contribute into group classification is a crucial step, especially in metabolomics studies which are focused on searching for disease biomarker candidates. In the present study, three different statistical approaches were tested using two metabolomics datasets (RH and PH study). Orthogonal projections to latent structures-discriminant analysis (OPLS-DA) without and with multiple testing correction as well as least absolute shrinkage and selection operator (LASSO) were tested and compared. For the RH study, OPLS-DA model built without multiple testing correction, selected 46 and 218 variables based on VIP criteria using Pareto and UV scaling, respectively. In the case of the PH study, 217 and 320 variables were selected based on VIP criteria using Pareto and UV scaling, respectively. In the RH study, OPLS-DA model built with multiple testing correction, selected 4 and 19 variables as statistically significant in terms of Pareto and UV scaling, respectively. For PH study, 14 and 18 variables were selected based on VIP criteria in terms of Pareto and UV scaling, respectively. Additionally, the concept and fundamentals of the least absolute shrinkage and selection operator (LASSO) with bootstrap procedure evaluating reproducibility of results, was demonstrated. In the RH and PH study, the LASSO selected 14 and 4 variables with reproducibility between 99.3% and 100%. 
However, apart from the popularity of PLS-DA and OPLS-DA methods in metabolomics, it should be highlighted that they do not control type I or type II error, but only arbitrarily establish a cut-off value for PLS-DA loadings
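    The Pareto and UV scaling options compared in this abstract differ only in the divisor applied after mean-centring: UV (unit variance) scaling divides by the standard deviation, while Pareto scaling divides by its square root. A minimal sketch of the two transforms for a single variable follows; the function names and the toy column are illustrative assumptions, and the sample standard deviation is used here as one common convention.

```python
import math
import statistics

def uv_scale(column):
    """Unit-variance scaling: mean-centre, then divide by the
    sample standard deviation."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]

def pareto_scale(column):
    """Pareto scaling: mean-centre, then divide by the square root
    of the sample standard deviation."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / math.sqrt(sd) for x in column]

# Toy intensities for one metabolite feature (assumed values).
feature = [2.0, 4.0, 6.0]
print(uv_scale(feature))
print(pareto_scale(feature))
```

    Because Pareto scaling shrinks large-variance features less aggressively than UV scaling, the two can rank variables differently, which is consistent with the differing numbers of selected variables reported above.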

  19. The role of protozoa-driven selection in shaping human genetic variability.

    Science.gov (United States)

    Pozzoli, Uberto; Fumagalli, Matteo; Cagliani, Rachele; Comi, Giacomo P; Bresolin, Nereo; Clerici, Mario; Sironi, Manuela

    2010-03-01

    Protozoa exert a strong selective pressure in humans. The selection signatures left by these pathogens can be exploited to identify genetic modulators of infection susceptibility. We show that protozoa diversity in different geographic locations is a good measure of protozoa-driven selective pressure; protozoa diversity captured selection signatures at known malaria resistance loci and identified several selected single nucleotide polymorphisms in immune and hemolytic anemia genes. A genome-wide search enabled us to identify 5180 variants mapping to 1145 genes that are subjected to protozoa-driven selective pressure. We provide a genome-wide estimate of protozoa-driven selective pressure and identify candidate susceptibility genes for protozoa-borne diseases. Copyright 2010 Elsevier Ltd. All rights reserved.

  20. Variable selection for confounder control, flexible modeling and Collaborative Targeted Minimum Loss-based Estimation in causal inference

    Science.gov (United States)

    Schnitzer, Mireille E.; Lok, Judith J.; Gruber, Susan

    2015-01-01

    This paper investigates the appropriateness of the integration of flexible propensity score modeling (nonparametric or machine learning approaches) in semiparametric models for the estimation of a causal quantity, such as the mean outcome under treatment. We begin with an overview of some of the issues involved in knowledge-based and statistical variable selection in causal inference and the potential pitfalls of automated selection based on the fit of the propensity score. Using a simple example, we directly show the consequences of adjusting for pure causes of the exposure when using inverse probability of treatment weighting (IPTW). Such variables are likely to be selected when using a naive approach to model selection for the propensity score. We describe how the method of Collaborative Targeted minimum loss-based estimation (C-TMLE; van der Laan and Gruber, 2010) capitalizes on the collaborative double robustness property of semiparametric efficient estimators to select covariates for the propensity score based on the error in the conditional outcome model. Finally, we compare several approaches to automated variable selection in low-and high-dimensional settings through a simulation study. From this simulation study, we conclude that using IPTW with flexible prediction for the propensity score can result in inferior estimation, while Targeted minimum loss-based estimation and C-TMLE may benefit from flexible prediction and remain robust to the presence of variables that are highly correlated with treatment. However, in our study, standard influence function-based methods for the variance underestimated the standard errors, resulting in poor coverage under certain data-generating scenarios. PMID:26226129
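    As a rough illustration of the inverse probability of treatment weighting (IPTW) estimator discussed in this abstract, the sketch below computes a normalized IPTW estimate of the mean outcome under treatment from given propensity scores. The function name and toy numbers are hypothetical, and the propensity scores are assumed to have been estimated elsewhere. The sketch does not implement TMLE or C-TMLE.

```python
def iptw_mean_treated(outcomes, treated, propensity):
    """Normalized (Hajek-type) IPTW estimate of the mean outcome under
    treatment: treated units are weighted by 1 / propensity score."""
    weighted_sum = 0.0
    weight_total = 0.0
    for y, a, e in zip(outcomes, treated, propensity):
        if a == 1:
            w = 1.0 / e  # small e gives a large, unstable weight
            weighted_sum += w * y
            weight_total += w
    return weighted_sum / weight_total

# Toy data (assumed values): outcome, treatment indicator, propensity score.
outcomes = [1.0, 3.0, 5.0, 2.0]
treated = [1, 1, 0, 1]
propensity = [0.5, 0.25, 0.5, 0.5]

print(iptw_mean_treated(outcomes, treated, propensity))
```

    The weight 1 / e grows quickly as the propensity score approaches zero, which is one way to see why variables that strongly predict treatment, such as the pure causes of exposure mentioned above, can destabilize IPTW.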

  1. Chemometric and Statistical Analyses of ToF-SIMS Spectra of Increasingly Complex Biological Samples

    Energy Technology Data Exchange (ETDEWEB)

    Berman, E S; Wu, L; Fortson, S L; Nelson, D O; Kulp, K S; Wu, K J

    2007-10-24

    Characterizing and classifying molecular variation within biological samples is critical for determining fundamental mechanisms of biological processes that will lead to new insights including improved disease understanding. Towards these ends, time-of-flight secondary ion mass spectrometry (ToF-SIMS) was used to examine increasingly complex samples of biological relevance, including monosaccharide isomers, pure proteins, complex protein mixtures, and mouse embryo tissues. The complex mass spectral data sets produced were analyzed using five common statistical and chemometric multivariate analysis techniques: principal component analysis (PCA), linear discriminant analysis (LDA), partial least squares discriminant analysis (PLSDA), soft independent modeling of class analogy (SIMCA), and decision tree analysis by recursive partitioning. PCA was found to be a valuable first step in multivariate analysis, providing insight both into the relative groupings of samples and into the molecular basis for those groupings. For the monosaccharides, pure proteins and protein mixture samples, all of LDA, PLSDA, and SIMCA were found to produce excellent classification given a sufficient number of compound variables calculated. For the mouse embryo tissues, however, SIMCA did not produce as accurate a classification. The decision tree analysis was found to be the least successful for all the data sets, providing neither as accurate a classification nor chemical insight for any of the tested samples. Based on these results we conclude that as the complexity of the sample increases, so must the sophistication of the multivariate technique used to classify the samples. PCA is a preferred first step for understanding ToF-SIMS data that can be followed by either LDA or PLSDA for effective classification analysis. This study demonstrates the strength of ToF-SIMS combined with multivariate statistical and chemometric techniques to classify increasingly complex biological samples

  2. Chemometrics applications in biotechnology processes: predicting column integrity and impurity clearance during reuse of chromatography resin.

    Science.gov (United States)

    Rathore, Anurag S; Mittal, Shachi; Lute, Scott; Brorson, Kurt

    2012-01-01

    Separation media, in particular chromatography media, is typically one of the major contributors to the cost of goods for production of a biotechnology therapeutic. To be cost-effective, it is industry practice that media be reused over several cycles before being discarded. The traditional approach for estimating the number of cycles a particular media can be reused for involves performing laboratory scale experiments that monitor column performance and carryover. This dataset is then used to predict the number of cycles the media can be used at manufacturing scale (concurrent validation). Although, well accepted and widely practiced, there are challenges associated with extrapolating the laboratory scale data to manufacturing scale due to differences that may exist across scales. Factors that may be different include: level of impurities in the feed material, lot to lot variability in feedstock impurities, design of the column housing unit with respect to cleanability, and homogeneity of the column packing. In view of these challenges, there is a need for approaches that may be able to predict column underperformance at the manufacturing scale over the product lifecycle. In case such an underperformance is predicted, the operators can unpack and repack the chromatography column beforehand and thus avoid batch loss. Chemometrics offers one such solution. In this article, we present an application of chemometrics toward the analysis of a set of chromatography profiles with the intention of predicting the various events of column underperformance including the backpressure buildup and inefficient deoxyribonucleic acid clearance. Copyright © 2012 American Institute of Chemical Engineers (AIChE).

  3. Classification of Tropical River Using Chemometrics Technique: Case Study in Pahang River, Malaysia

    International Nuclear Information System (INIS)

    Mohd Khairul Amri Kamarudin; Mohd Ekhwan Toriman; Nur Hishaam Sulaiman

    2015-01-01

    River classification is important for characterizing the rivers in a study area, and the resulting database helps in understanding river behaviour. This article discusses river classification using chemometric techniques on the mainstream of the Pahang River. Based on river survey, GIS and remote sensing databases, Hierarchical Agglomerative Cluster Analysis (HACA) was used to identify clusters along the Pahang River. A calibration and validation process using Discriminant Analysis (DA) was used to confirm the HACA result. Principal Component Analysis (PCA) was used to identify the variables with strong coefficients by which the Pahang River was classed. The results indicated that the main Pahang River was classed into three main clusters: upstream, middle stream and downstream. Based on DA, the calibration and validation models show 100% agreement. The PCA indicates three variables with a significant correlation: domination slope (R² 0.796), L/D ratio (R² -0.868) and sinuosity (R² 0.557). A map of the river classification with moving classes was also produced, in which green denotes the valley erosion zone, yellow the low terrace of land near the channels, and red the flood plain and valley deposition zone. These results provide basic information for understanding the characteristics of the main Pahang River. They can help local authorities make decisions according to the clusters and serve as guidelines for future studies of the Pahang River, Malaysia, specifically and of tropical rivers generally. The research findings also provide local authorities with basic data as guidelines for integrated river management of the Pahang River and of tropical rivers in general. (author)

  4. Multivariate Approaches for Simultaneous Determination of Avanafil and Dapoxetine by UV Chemometrics and HPLC-QbD in Binary Mixtures and Pharmaceutical Product.

    Science.gov (United States)

    2016-04-07

    Multivariate UV-spectrophotometric methods and Quality by Design (QbD) HPLC are described for concurrent estimation of avanafil (AV) and dapoxetine (DP) in the binary mixture and in the dosage form. Chemometric methods have been developed, including classical least-squares, principal component regression, partial least-squares, and multiway partial least-squares. Analytical figures of merit, such as sensitivity, selectivity, analytical sensitivity, LOD, and LOQ were determined. QbD consists of three steps, starting with the screening approach to determine the critical process parameter and response variables. This is followed by understanding of factors and levels, and lastly the application of a Box-Behnken design containing four critical factors that affect the method. From an Ishikawa diagram and a risk assessment tool, four main factors were selected for optimization. Design optimization, statistical calculation, and final-condition optimization of all the reactions were carried out. Twenty-five experiments were done, and a quadratic model was used for all response variables. Desirability plot, surface plot, design space, and three-dimensional plots were calculated. In the optimized condition, HPLC separation was achieved on Phenomenex Gemini C18 column (250 × 4.6 mm, 5 μm) using acetonitrile-buffer (ammonium acetate buffer at pH 3.7 with acetic acid) as a mobile phase at flow rate of 0.7 mL/min. Quantification was done at 239 nm, and temperature was set at 20°C. The developed methods were validated and successfully applied for simultaneous determination of AV and DP in the dosage form.

  5. The selection of a mode of urban transportation: Integrating psychological variables to discrete choice models

    International Nuclear Information System (INIS)

    Cordoba Maquilon, Jorge E; Gonzalez Calderon, Carlos A; Posada Henao, John J

    2011-01-01

    A study using revealed preference surveys and psychological tests was conducted. Key psychological variables of behavior involved in the choice of transportation mode in a population sample of the Metropolitan Area of the Valle de Aburra were detected. The experiment used the random utility theory for discrete choice models and reasoned action in order to assess beliefs. This was used as a tool for analysis of the psychological variables using the sixteen personality factor questionnaire (16PF test). In addition to the revealed preference surveys, two other surveys were carried out: one with socio-economic characteristics and the other with latent indicators. This methodology allows for an integration of discrete choice models and latent variables. The integration makes the model operational and quantifies the unobservable psychological variables. The most relevant result obtained was that anxiety affects the choice of urban transportation mode and shows that physiological alterations, as well as problems in perception and beliefs, can affect the decision-making process.

  6. Oracle Efficient Variable Selection in Random and Fixed Effects Panel Data Models

    DEFF Research Database (Denmark)

    Kock, Anders Bredahl

    This paper generalizes the results for the Bridge estimator of Huang et al. (2008) to linear random and fixed effects panel data models which are allowed to grow in both dimensions. In particular we show that the Bridge estimator is oracle efficient. It can correctly distinguish between relevant...... and irrelevant variables and the asymptotic distribution of the estimators of the coefficients of the relevant variables is the same as if only these had been included in the model, i.e. as if an oracle had revealed the true model prior to estimation. In the case of more explanatory variables than observations......, we prove that the Marginal Bridge estimator can asymptotically correctly distinguish between relevant and irrelevant explanatory variables. We do this without restricting the dependence between covariates and without assuming sub Gaussianity of the error terms thereby generalizing the results...

  7. Spatial Air Quality Modelling Using Chemometrics Techniques: A Case Study in Peninsular Malaysia

    International Nuclear Information System (INIS)

    Azman Azid; Hafizan Juahir; Mohammad Azizi Amran; Zarizal Suhaili; Mohamad Romizan Osman; Asyaari Muhamad; Ismail Zainal Abidin; Nur Hishaam Sulaiman; Ahmad Shakir Mohd Saudi

    2015-01-01

    This study shows the effectiveness of hierarchical agglomerative cluster analysis (HACA), discriminant analysis (DA), principal component analysis (PCA), and multiple linear regression (MLR) for the assessment of air quality data and the recognition of air pollution sources. Twelve months of data (January-December 2007) from 14 stations in Peninsular Malaysia, covering 14 parameters, were used. Three significant clusters were generated via HACA: low pollution source (LPS), moderate pollution source (MPS), and slightly high pollution source (SHPS). Forward stepwise DA discriminated eight variables, whereas backward stepwise DA discriminated nine variables out of fourteen. The PCA and FA results show that the main contributor to air pollution in Peninsular Malaysia is the combustion of fossil fuel from industrial activities, transportation and agriculture systems. Four MLR models show that PM10 is the largest pollution contributor to Malaysian air quality. From the study, it can be stipulated that the application of chemometric techniques can disclose meaningful information on the spatial variability of large and complex air quality data. A clearer view of air quality and a novel design of the air quality monitoring network for better management of air pollution can be achieved via these methods. (author)

  8. Diagnostic Value of Selected Echocardiographic Variables to Identify Pulmonary Hypertension in Dogs with Myxomatous Mitral Valve Disease.

    Science.gov (United States)

    Tidholm, A; Höglund, K; Häggström, J; Ljungvall, I

    2015-01-01

    Pulmonary hypertension (PH) is commonly associated with myxomatous mitral valve disease (MMVD). Because dogs with PH present without measureable tricuspid regurgitation (TR), it would be useful to investigate echocardiographic variables that can identify PH. To investigate associations between estimated systolic TR pressure gradient (TRPG) and dog characteristics and selected echocardiographic variables. 156 privately owned dogs. Prospective observational study comparing the estimations of TRPG with dog characteristics and selected echocardiographic variables in dogs with MMVD and measureable TR. Tricuspid regurgitation pressure gradient was significantly (P modeled as linear variables LA/Ao (P modeled as second order polynomial variables: AT/DT (P = .0039) and LVIDDn (P value for the final model was 0.45 and receiver operating characteristic curve analysis suggested the model's performance to predict PH, defined as 36, 45, and 55 mmHg as fair (area under the curve [AUC] = 0.80), good (AUC = 0.86), and excellent (AUC = 0.92), respectively. In dogs with MMVD, the presence of PH might be suspected with the combination of decreased PA AT/DT, increased RVIDDn and LA/Ao, and a small or great LVIDDn. Copyright © 2015 The Authors Journal of Veterinary Internal Medicine published by Wiley Periodicals, Inc. on behalf of the American College of Veterinary Internal Medicine.

  9. Effects of musical tempo on physiological, affective, and perceptual variables and performance of self-selected walking pace.

    Science.gov (United States)

    Almeida, Flávia Angélica Martins; Nunes, Renan Felipe Hartmann; Ferreira, Sandro Dos Santos; Krinski, Kleverton; Elsangedy, Hassan Mohamed; Buzzachera, Cosme Franklin; Alves, Ragami Chaves; Gregorio da Silva, Sergio

    2015-06-01

    [Purpose] This study investigated the effects of musical tempo on physiological, affective, and perceptual responses as well as the performance of self-selected walking pace. [Subjects] The study included 28 adult women between 29 and 51 years old. [Methods] The subjects were divided into three groups: no musical stimulation group (control), and 90 and 140 beats per minute musical tempo groups. Each subject underwent three experimental sessions: familiarization with the equipment, an incremental test to exhaustion, and a 30-min walk on a treadmill at a self-selected pace, respectively. During the self-selected walking session, physiological, perceptual, and affective variables were evaluated, and walking performance was evaluated at the end. [Results] There were no significant differences in physiological variables or affective response among groups. However, there were significant differences in perceptual response and walking performance among groups. [Conclusion] Fast music (140 beats per minute) promotes a higher rating of perceived exertion and greater performance in self-selected walking pace without significantly altering physiological variables or affective response.

  10. Determination of main fruits in adulterated nectars by ATR-FTIR spectroscopy combined with multivariate calibration and variable selection methods.

    Science.gov (United States)

    Miaw, Carolina Sheng Whei; Assis, Camila; Silva, Alessandro Rangel Carolino Sales; Cunha, Maria Luísa; Sena, Marcelo Martins; de Souza, Scheilla Vitorino Carvalho

    2018-07-15

    Grape, orange, peach and passion fruit nectars were formulated and adulterated by dilution with syrup, apple and cashew juices at 10 levels for each adulterant. Attenuated total reflectance Fourier transform mid infrared (ATR-FTIR) spectra were obtained. Partial least squares (PLS) multivariate calibration models allied to different variable selection methods, such as interval partial least squares (iPLS), ordered predictors selection (OPS) and genetic algorithm (GA), were used to quantify the main fruits. PLS improved by iPLS-OPS variable selection showed the highest predictive capacity to quantify the main fruit contents. The selected variables in the final models varied from 72 to 100; the root mean square errors of prediction were estimated from 0.5 to 2.6%; the correlation coefficients of prediction ranged from 0.948 to 0.990; and, the mean relative errors of prediction varied from 3.0 to 6.7%. All of the developed models were validated. Copyright © 2018 Elsevier Ltd. All rights reserved.
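
    The figures of merit quoted above (root mean square error of prediction, correlation coefficient of prediction, mean relative error of prediction) can be illustrated with a minimal sketch. The abstract does not give the authors' exact formulas, so the definitions below are the common textbook ones and are an assumption; the function names and the reference/predicted values are hypothetical and used only for illustration.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a test set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation between reference and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (sd_t * sd_p)


def mean_relative_error(y_true, y_pred):
    """Mean absolute relative error in percent (reference values must be nonzero)."""
    n = len(y_true)
    return 100.0 * sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / n


# Hypothetical reference and predicted main-fruit contents (%), not data from the study.
reference = [20.0, 30.0, 40.0, 50.0]
predicted = [21.0, 29.0, 41.0, 49.0]

print(rmsep(reference, predicted))
print(pearson_r(reference, predicted))
print(mean_relative_error(reference, predicted))
```

    In a variable selection workflow, these values would be computed on an external prediction set for each candidate set of selected variables and compared across methods.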

  11. Genotype-by-environment interactions leads to variable selection on life-history strategy in Common Evening Primrose (Oenothera biennis).

    Science.gov (United States)

    Johnson, M T J

    2007-01-01

    Monocarpic plant species, where reproduction is fatal, frequently exhibit variation in the length of their prereproductive period prior to flowering. If this life-history variation in flowering strategy has a genetic basis, genotype-by-environment interactions (G x E) may maintain phenotypic diversity in flowering strategy. The native monocarpic plant Common Evening Primrose (Oenothera biennis L., Onagraceae) exhibits phenotypic variation for annual vs. biennial flowering strategies. I tested whether there was a genetic basis to variation in flowering strategy in O. biennis, and whether environmental variation causes G x E that imposes variable selection on flowering strategy. In a field experiment, I randomized more than 900 plants from 14 clonal families (genotypes) into five distinct habitats that represented a natural productivity gradient. G x E strongly affected the lifetime fruit production of O. biennis, with the rank-order in relative fitness of genotypes changing substantially between habitats. I detected genetic variation in annual vs. biennial strategies in most habitats, as well as a G x E effect on flowering strategy. This variation in flowering strategy was correlated with genetic variation in relative fitness, and phenotypic and genotypic selection analyses revealed that environmental variation resulted in variable directional selection on annual vs. biennial strategies. Specifically, a biennial strategy was favoured in moderately productive environments, whereas an annual strategy was favoured in low-productivity environments. These results highlight the importance of variable selection for the maintenance of genetic variation in the life-history strategy of a monocarpic plant.

  12. Impact of menstruation on select hematology and clinical chemistry variables in cynomolgus macaques.

    Science.gov (United States)

    Perigard, Christopher J; Parrula, M Cecilia M; Larkin, Matthew H; Gleason, Carol R

    2016-06-01

    In preclinical studies with cynomolgus macaques, it is common to have one or more females presenting with menses. Published literature indicates that the blood lost during menses causes decreases in red blood cell mass variables (RBC, HGB, and HCT), which would be a confounding factor in the interpretation of drug-related effects on clinical pathology data, but no scientific data have been published to support this claim. This investigation was conducted to determine if the amount of blood lost during menses in cynomolgus macaques has an effect on routine hematology and serum chemistry variables. Ten female cynomolgus macaques (Macaca fascicularis), 5 to 6.5 years old, were observed daily during approximately 3 months (97 days) for the presence of menses. Hematology and serum chemistry variables were evaluated twice weekly. The results indicated that menstruation affects the erythrogram including RBC, HGB, HCT, MCHC, MCV, reticulocyte count, RDW, the leukogram including neutrophil, lymphocyte, and monocyte counts, and chemistry variables, including GGT activity, and the concentrations of total proteins, albumin, globulins, and calcium. The magnitude of the effect of menstruation on susceptible variables is dependent on the duration of the menstrual phase. Macaques with menstrual phases lasting ≥ 7 days are more likely to develop changes in variables related to chronic blood loss. In preclinical toxicology studies with cynomolgus macaques, interpretation of changes in several commonly evaluated hematology and serum chemistry variables requires adequate clinical observation and documentation concerning presence and duration of menses. There is a concern that macaques with long menstrual cycles can develop iron deficiency anemia due to chronic menstrual blood loss. © 2016 American Society for Veterinary Clinical Pathology.

  13. Using Variable Dwell Time to Accelerate Gaze-based Web Browsing with Two-step Selection

    OpenAIRE

    Chen, Zhaokang; Shi, Bertram E.

    2017-01-01

    In order to avoid the "Midas Touch" problem, gaze-based interfaces for selection often introduce a dwell time: a fixed amount of time the user must fixate upon an object before it is selected. Past interfaces have used a uniform dwell time across all objects. Here, we propose an algorithm for adjusting the dwell times of different objects based on the inferred probability that the user intends to select them. In particular, we introduce a probabilistic model of natural gaze behavior while sur...

  14. FIRE: an SPSS program for variable selection in multiple linear regression analysis via the relative importance of predictors.

    Science.gov (United States)

    Lorenzo-Seva, Urbano; Ferrando, Pere J

    2011-03-01

    We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the equation regression, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.

  15. Firefly as a novel swarm intelligence variable selection method in spectroscopy.

    Science.gov (United States)

    Goodarzi, Mohammad; dos Santos Coelho, Leandro

    2014-12-10

    A critical step in multivariate calibration is wavelength selection, which is used to build models with better prediction performance when applied to spectral data. Up to now, many feature selection techniques have been developed. Among all different types of feature selection techniques, those based on swarm intelligence optimization methodologies are more interesting since they are usually simulated based on animal and insect life behavior to, e.g., find the shortest path between a food source and their nests. This decision is made by a crowd, leading to a more robust model that is less prone to falling into local minima during the optimization cycle. This paper presents a novel approach to feature selection for spectroscopic data, leading to more robust calibration models. The performance of the firefly algorithm, a swarm intelligence paradigm, was evaluated and compared with genetic algorithm and particle swarm optimization. All three techniques were coupled with partial least squares (PLS) and applied to three spectroscopic data sets. They demonstrate improved prediction results in comparison to a PLS model built using all wavelengths. Results show that the firefly algorithm, as a novel swarm paradigm, leads to a lower number of selected wavelengths while the prediction performance of the resulting PLS model stays the same. Copyright © 2014. Published by Elsevier B.V.
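
    The swarm idea described above can be sketched with the basic continuous firefly update, in which each firefly moves toward every brighter (lower-cost) firefly with an attractiveness that decays with distance, plus a small random step. This is a minimal illustration on a toy objective, not the authors' wavelength-selection implementation: the function names, parameter values, and the sphere objective are assumptions, and a real application would encode wavelength subsets and use a cross-validated PLS error as the cost.

```python
import math
import random


def firefly_minimize(objective, dim, n_fireflies=15, n_iter=100,
                     alpha=0.2, beta0=1.0, gamma=1.0,
                     bounds=(-5.0, 5.0), seed=0):
    """Basic firefly search; returns the best position and cost seen."""
    rng = random.Random(seed)
    lo, hi = bounds
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_fireflies)]
    cost = [objective(x) for x in swarm]
    best_cost = min(cost)
    best_x = list(swarm[cost.index(best_cost)])
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter, so i moves toward j
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # distance-dependent attraction
                    swarm[i] = [
                        min(hi, max(lo, a + beta * (b - a) + alpha * (rng.random() - 0.5)))
                        for a, b in zip(swarm[i], swarm[j])
                    ]
                    cost[i] = objective(swarm[i])
                    if cost[i] < best_cost:
                        best_cost, best_x = cost[i], list(swarm[i])
        alpha *= 0.97  # gradually shrink the random step
    return best_x, best_cost


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_cost = firefly_minimize(sphere, dim=2)
print(best_x, best_cost)
```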

  16. AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity.

    Science.gov (United States)

    Sun, Lei; Wang, Jun; Wei, Jinmao

    2017-03-14

    The Receiver Operator Characteristic (ROC) curve is well known for evaluating classification performance in the biomedical field. Owing to its superiority in dealing with imbalanced and cost-sensitive data, the ROC curve has been exploited as a popular metric to evaluate and identify disease-related genes (features). The existing ROC-based feature selection approaches are simple and effective in evaluating individual features. However, these approaches may fail to find the real target feature subset because they lack an effective means of reducing the redundancy between features, which is essential in machine learning. In this paper, we propose to assess feature complementarity by measuring the distances between the misclassified instances and their nearest misses on the dimensions of pairwise features. If a misclassified instance and its nearest miss on one feature dimension are far apart on another feature dimension, the two features are regarded as complementary to each other. Subsequently, we propose a novel filter feature selection approach on the basis of the ROC analysis. The new approach employs an efficient heuristic search strategy to select optimal features with the highest complementarities. The experimental results on a broad range of microarray data sets validate that the classifiers built on the feature subset selected by our approach can achieve the minimal balanced error rate with a small number of significant features. Compared with other ROC-based feature selection approaches, our new approach can select fewer features and effectively improve the classification performance.
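
    The individual-feature evaluation that the abstract attributes to existing ROC-based approaches can be sketched as follows: each feature is treated as a score and its AUC is computed as the fraction of positive/negative pairs that it orders correctly. This sketch does not implement the complementarity measure proposed in the paper; the function names and the toy data are hypothetical.

```python
def feature_auc(values, labels):
    """AUC of one feature used as a score: P(pos > neg) + 0.5 * P(tie)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features(columns, labels):
    """Rank features by AUC, treating AUC below 0.5 as an inverted but useful score."""
    scored = []
    for name, values in columns.items():
        auc = feature_auc(values, labels)
        scored.append((name, max(auc, 1.0 - auc)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy expression values for three hypothetical genes, six samples (three per class).
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "gene_a": [1, 2, 3, 4, 5, 6],
    "gene_b": [3, 1, 4, 2, 5, 6],
    "gene_c": [5, 5, 5, 5, 5, 5],
}
ranking = rank_features(columns, labels)
print(ranking)
```

    Ranking features one at a time in this way cannot detect that two highly ranked features are redundant, which is the limitation the abstract's complementarity measure is meant to address.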

  17. An Investigation of Selected Variables Related to Student Algebra I Performance in Mississippi

    Science.gov (United States)

    Scott, Undray

    2016-01-01

    This research study attempted to determine if specific variables were related to student performance on the Algebra I subject-area test. This study also sought to determine in which of grades 8, 9, or 10 students performed better on the Algebra I Subject Area Test. This study also investigated the different criteria that are used to schedule…

  18. Variable Selection Strategies for Small-area Estimation Using FIA Plots and Remotely Sensed Data

    Science.gov (United States)

    Andrew Lister; Rachel Riemann; James Westfall; Mike Hoppus

    2005-01-01

    The USDA Forest Service's Forest Inventory and Analysis (FIA) unit maintains a network of tens of thousands of georeferenced forest inventory plots distributed across the United States. Data collected on these plots include direct measurements of tree diameter and height and other variables. We present a technique by which FIA plot data and coregistered...

  19. Cortical Response Variability as a Developmental Index of Selective Auditory Attention

    Science.gov (United States)

    Strait, Dana L.; Slater, Jessica; Abecassis, Victor; Kraus, Nina

    2014-01-01

    Attention induces synchronicity in neuronal firing for the encoding of a given stimulus at the exclusion of others. Recently, we reported decreased variability in scalp-recorded cortical evoked potentials to attended compared with ignored speech in adults. Here we aimed to determine the developmental time course for this neural index of auditory…

  20. Variable selection for modelling effects of eutrophication on stream and river ecosystems

    NARCIS (Netherlands)

    Nijboer, R.C.; Verdonschot, P.F.M.

    2004-01-01

    Models are needed for forecasting the effects of eutrophication on stream and river ecosystems. Most of the current models do not include differences in local stream characteristics and effects on the biota. To define the most important variables that should be used in a stream eutrophication model,

  1. SPATIAL AND TEMPORAL VARIABILITY IN ACROLEIN AND SELECT VOLATILE ORGANIC COMPOUNDS IN DETROIT, MICHIGAN

    Science.gov (United States)

    The variability in outdoor concentrations of acrolein, benzene, toluene, ethylbenzene and xylenes (BTEX), and 1,3-butadiene was examined for data measured during summer 2004 of the Detroit Exposure and Aerosol Research Study (DEARS). Results for acrolein indicated no significant...

  2. Temporal variability of selected chemical and physical properties of topsoil of three soil types

    Czech Academy of Sciences Publication Activity Database

    Jirků, V.; Kodešová, R.; Nikodem, A.; Mühlhanselová, M.; Žigová, Anna

    2013-01-01

    Vol. 15 (2013). ISSN 1607-7962. [EGU General Assembly /10./, 07.04.2013-12.04.2013, Vienna]. R&D Projects: GA ČR GA526/08/0434. Institutional support: RVO:67985831. Keywords: soil properties; soil types; temporal variability. Subject RIV: DF - Soil Science. http://meetingorganizer.copernicus.org/EGU2013/EGU2013-7650-1.pdf

  3. A variational conformational dynamics approach to the selection of collective variables in metadynamics

    Science.gov (United States)

    McCarty, James; Parrinello, Michele

    2017-11-01

    In this paper, we combine two powerful computational techniques, well-tempered metadynamics and time-lagged independent component analysis. The aim is to develop a new tool for studying rare events and exploring complex free energy landscapes. Metadynamics is a well-established and widely used enhanced sampling method whose efficiency depends on an appropriate choice of collective variables. Often the initial choice is not optimal leading to slow convergence. However by analyzing the dynamics generated in one such run with a time-lagged independent component analysis and the techniques recently developed in the area of conformational dynamics, we obtain much more efficient collective variables that are also better capable of illuminating the physics of the system. We demonstrate the power of this approach in two paradigmatic examples.
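
    For orientation, time-lagged independent component analysis is commonly written as a generalized eigenvalue problem on time-lagged covariance matrices. The notation below (descriptors d_j, lag time tau, eigenvectors b_i) is chosen here for illustration and is not taken from the paper; any reweighting needed to account for the metadynamics bias is not shown.

```latex
C(\tau)\,\mathbf{b}_i = \lambda_i\, C(0)\,\mathbf{b}_i ,
\qquad
C_{jk}(\tau) = \big\langle \delta d_j(t)\,\delta d_k(t+\tau) \big\rangle_t ,
\qquad
\delta d_j(t) = d_j(t) - \langle d_j \rangle .
```

    The eigenvectors with the largest eigenvalues define linear combinations of the descriptors that decorrelate most slowly, which is what makes them candidates for collective variables.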

  4. Identifying market segments in consumer markets: variable selection and data interpretation

    OpenAIRE

    Tonks, D G

    2004-01-01

    Market segmentation is often articulated as being a process which displays the recognised features of classical rationalism but in part; convention, convenience, prior experience and the overarching impact of rhetoric will influence if not determine the outcomes of a segmentation exercise. Particular examples of this process are addressed critically in this paper which concentrates on the issues of variable choice for multivariate approaches to market segmentation and also the methods used fo...

  5. Variability in dose estimates associated with the food-chain transport and ingestion of selected radionuclides

    International Nuclear Information System (INIS)

    Hoffman, F.O.; Gardner, R.H.; Eckerman, K.F.

    1982-06-01

    Dose predictions for the ingestion of ⁹⁰Sr and ¹³⁷Cs, using aquatic and terrestrial food chain transport models similar to those in the Nuclear Regulatory Commission's Regulatory Guide 1.109, are evaluated by estimating the variability of model parameters and determining the effect of this variability on model output. The variability in the predicted dose equivalent is determined using analytical and numerical procedures. In addition, a detailed discussion is included on ⁹⁰Sr dosimetry. The overall estimates of uncertainty are most relevant to conditions where site-specific data are unavailable and when model structure and parameter estimates are unbiased. Based on the comparisons performed in this report, it is concluded that the use of the generic default parameters in Regulatory Guide 1.109 will usually produce conservative dose estimates that exceed the 90th percentile of the predicted distribution of dose equivalents. An exception is the meat pathway for ¹³⁷Cs, in which use of generic default values results in a dose estimate at the 24th percentile. Among the terrestrial pathways of exposure, the non-leafy vegetable pathway is the most important for ⁹⁰Sr. For ⁹⁰Sr, the parameters for soil retention, soil-to-plant transfer, and internal dosimetry contribute most significantly to the variability in the predicted dose for the combined exposure to all terrestrial pathways. For ¹³⁷Cs, the meat transfer coefficient, the mass interception factor for pasture forage, and the ingestion dose factor are the most important parameters. The freshwater finfish bioaccumulation factor is the most important parameter for the dose prediction of ⁹⁰Sr and ¹³⁷Cs transported over the water-fish-man pathway.

  6. Rationalization of dye uptake on titania slides for dye-sensitized solar cells by a combined chemometric and structural approach.

    Science.gov (United States)

    Gianotti, Valentina; Favaro, Giada; Bonandini, Luca; Palin, Luca; Croce, Gianluca; Boccaleri, Enrico; Artuso, Emma; van Beek, Wouter; Barolo, Claudia; Milanesio, Marco

    2014-11-01

    A model photosensitizer (D5) for application in dye-sensitized solar cells has been studied by a combination of XRD, theoretical calculations, and spectroscopic/chemometric methods. The conformational stability and flexibility of D5 and molecular interactions between adjacent molecules were characterized to obtain the driving forces that govern D5 uptake and grafting and to infer the most likely arrangement of the molecules on the surface of TiO2. A spectroscopic/chemometric approach was then used to yield information about the correlations between three variables that govern the uptake itself: D5 concentration, dispersant (chenodeoxycholic acid; CDCA) concentration, and contact time. The obtained regression model shows that large uptakes can be obtained at high D5 concentrations in the presence of CDCA with a long contact time, or in absence of CDCA if the contact time is short, which suggests how dye uptake and photovoltaic device preparation can be optimized. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Fourier transform infrared spectroscopy and chemometrics for the characterization and discrimination of writing/photocopier paper types: Application in forensic document examinations.

    Science.gov (United States)

    Kumar, Raj; Kumar, Vinay; Sharma, Vishal

    2017-01-05

    The aim of the present work is to explore the non-destructive application of the ATR-FTIR technique for the characterization and discrimination of paper samples, which could provide forensic aid in resolving legal cases. Twenty-four paper brands were purchased from local markets in and around Chandigarh, India. All paper samples were subjected to ATR-FTIR analysis over the 400 to 4000 cm⁻¹ wavenumber range. The qualitative features and chemometrics of the obtained spectral data are used for characterization and discrimination. Characterization is achieved by matching the peaks with standards of cellulose and inorganic fillers, the usual constituents of paper. Three IR regions, i.e. 400-2000 cm⁻¹, 2000-4000 cm⁻¹ and 400-4000 cm⁻¹, were selected for differentiation by chemometric analysis. Discrimination is achieved on the basis of three principal components, i.e. PC 1, PC 2 and PC 3. Maximum discrimination was obtained in the 2000-4000 cm⁻¹ wavenumber range. Discriminating power was also calculated on the basis of qualitative features, and the paper samples were discriminated better by chemometric analysis than by qualitative features. The discriminating power by chemometrics is 99.64%, the largest achieved so far by any group for the present number of samples. The present result confirms that this study will be highly useful in forensic document examination in legal cases where the authenticity of a document is challenged. The results are completely analytical and therefore overcome the problems encountered in the traditional routine light/radiation scanning methods still practised by various questioned-document laboratories. Copyright © 2016 Elsevier B.V. All rights reserved.

  8. Fourier transform infrared spectroscopy and chemometrics for the characterization and discrimination of writing/photocopier paper types: Application in forensic document examinations

    Science.gov (United States)

    Kumar, Raj; Kumar, Vinay; Sharma, Vishal

    2017-01-01

    The aim of the present work is to explore the non-destructive application of the ATR-FTIR technique for the characterization and discrimination of paper samples, which could provide forensic aid in resolving legal cases. Twenty-four paper brands were purchased from local markets in and around Chandigarh, India. All paper samples were subjected to ATR-FTIR analysis over the 400 to 4000 cm⁻¹ wavenumber range. The qualitative features and chemometrics of the obtained spectral data are used for characterization and discrimination. Characterization is achieved by matching the peaks with standards of cellulose and inorganic fillers, the usual constituents of paper. Three IR regions, i.e. 400-2000 cm⁻¹, 2000-4000 cm⁻¹ and 400-4000 cm⁻¹, were selected for differentiation by chemometric analysis. Discrimination is achieved on the basis of three principal components, i.e. PC 1, PC 2 and PC 3. Maximum discrimination was obtained in the 2000-4000 cm⁻¹ wavenumber range. Discriminating power was also calculated on the basis of qualitative features, and the paper samples were discriminated better by chemometric analysis than by qualitative features. The discriminating power by chemometrics is 99.64%, the largest achieved so far by any group for the present number of samples. The present result confirms that this study will be highly useful in forensic document examination in legal cases where the authenticity of a document is challenged. The results are completely analytical and therefore overcome the problems encountered in the traditional routine light/radiation scanning methods still practised by various questioned-document laboratories.

  9. Ultrahigh Dimensional Variable Selection for Interpolation of Point Referenced Spatial Data: A Digital Soil Mapping Case Study

    Science.gov (United States)

    Lamb, David W.; Mengersen, Kerrie

    2016-01-01

    Modern soil mapping is characterised by the need to interpolate point referenced (geostatistical) observations and the availability of large numbers of environmental characteristics for consideration as covariates to aid this interpolation. Modelling tasks of this nature also occur in other fields such as biogeography and environmental science. This analysis employs the Least Angle Regression (LAR) algorithm for fitting Least Absolute Shrinkage and Selection Operator (LASSO) penalized Multiple Linear Regression models. This analysis demonstrates the efficiency of the LAR algorithm at selecting covariates to aid the interpolation of geostatistical soil carbon observations. Where an exhaustive search of the models that could be constructed from 800 potential covariate terms and 60 observations would be prohibitively demanding, LASSO variable selection is accomplished with trivial computational investment. PMID:27603135
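
    How a LASSO penalty selects covariates when there are more candidate terms than observations can be shown with a small sketch. Note that this sketch uses cyclic coordinate descent with soft-thresholding rather than the LAR algorithm used in the study; both target the same penalized least-squares objective, here taken as (1/2n)·||y − Xβ||² + λ·||β||₁ with centred data and no intercept. The function names, the penalty value, and the simulated data are assumptions for illustration.

```python
import random


def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly zero."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=200):
    """LASSO by cyclic coordinate descent on (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta starting at zero
    z = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if z[j] == 0.0:
                continue
            # Correlation of column j with the partial residual that excludes beta[j].
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + z[j] * beta[j]
            new = soft_threshold(rho, lam) / z[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


# Simulated data with more covariates (40) than observations (20); only the
# first two covariates actually influence the response.
rng = random.Random(1)
n_obs, n_cov = 20, 40
X = [[rng.gauss(0.0, 1.0) for _ in range(n_cov)] for _ in range(n_obs)]
y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0.0, 0.1) for row in X]

beta = lasso_cd(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(selected)
```

    Covariates whose coefficients are shrunk exactly to zero are dropped from the model, so selection comes out of a single fit instead of a search over subsets.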

  10. Empirically Driven Variable Selection for the Estimation of Causal Effects with Observational Data

    Science.gov (United States)

    Keller, Bryan; Chen, Jianshen

    2016-01-01

    Observational studies are common in educational research, where subjects self-select or are otherwise non-randomly assigned to different interventions (e.g., educational programs, grade retention, special education). Unbiased estimation of a causal effect with observational data depends crucially on the assumption of ignorability, which specifies…

  11. Computed ABC Analysis for Rational Selection of Most Informative Variables in Multivariate Data.

    Science.gov (United States)

    Ultsch, Alfred; Lötsch, Jörn

    2015-01-01

    Multivariate data sets often differ in several factors or derived statistical parameters, which have to be selected for a valid interpretation. Basing this selection on traditional statistical limits leads occasionally to the perception of losing information from a data set. This paper proposes a novel method for calculating precise limits for the selection of parameter sets. The algorithm is based on an ABC analysis and calculates these limits on the basis of the mathematical properties of the distribution of the analyzed items. The limits implement the aim of any ABC analysis, i.e., comparing the increase in yield to the required additional effort. In particular, the limit for set A, the "important few", is optimized in a way that both, the effort and the yield for the other sets (B and C), are minimized and the additional gain is optimized. As a typical example from biomedical research, the feasibility of the ABC analysis as an objective replacement for classical subjective limits to select highly relevant variance components of pain thresholds is presented. The proposed method improved the biological interpretation of the results and increased the fraction of valid information that was obtained from the experimental data. The method is applicable to many further biomedical problems including the creation of diagnostic complex biomarkers or short screening tests from comprehensive test batteries. Thus, the ABC analysis can be proposed as a mathematically valid replacement for traditional limits to maximize the information obtained from multivariate research data.
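
    The trade-off between yield and effort described above can be illustrated with a simple sketch. It builds the cumulative curve of an ABC analysis (fraction of items taken, largest first, against fraction of total yield) and places the limit for set A at the curve point nearest the ideal corner of zero effort and full yield. This nearest-point criterion is one plausible reading of the abstract and is an assumption; the published algorithm may define its limits differently. Function names and values are hypothetical.

```python
import math


def abc_curve(values):
    """Return items sorted largest first and the (effort, yield) points of the ABC curve."""
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    n = len(ordered)
    points = []
    running = 0.0
    for k, v in enumerate(ordered, start=1):
        running += v
        points.append((k / n, running / total))
    return ordered, points


def set_a_by_nearest_ideal(values):
    """Set A = items up to the curve point closest to (effort 0, yield 1)."""
    ordered, points = abc_curve(values)
    best_k = min(
        range(len(points)),
        key=lambda k: math.hypot(points[k][0], 1.0 - points[k][1]),
    )
    return ordered[: best_k + 1]


# Hypothetical importance values for six variables (for example, variance explained).
importance = [50, 30, 10, 5, 3, 2]
print(set_a_by_nearest_ideal(importance))
```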

  12. The study of variability and strain selection in Streptomyces atroolivaceus. III

    International Nuclear Information System (INIS)

    Blumauerova, M.; Lipavska, H.; Stajner, K.; Vanek, Z.

    1976-01-01

    Mutants of Streptomyces atroolivaceus blocked in the biosynthesis of mithramycin were isolated both by natural selection and after treatment with mutagenic factors (UV and gamma rays, nitrous acid). Both physical factors were more effective than nitrous acid. The selection was complicated by the high instability of isolates, out of which 20 to 80%=. (depending on their origin) reversed spontaneously to the parent type. Primary screening (selection of morphological variants and determination of their activity using the method of agar blocks) made it possible to detect only potentially non-productive strains; however, the final selection always had to be made under submerged conditions. Fifty-four stable non-productive mutants were divided, according to results of the chromatographic analysis, into five groups differing in the production of the six biologically inactive metabolites. The mutants did not accumulate chromomycinone, chromocyclomycin and chromocyclin. On mixed cultivation none of the pairs of mutants was capable of the cosynthesis of mithramycin or of new compounds differing from standard metabolites. Possible causes of the above results are discussed. (author)

  13. COPD phenotypes on computed tomography and its correlation with selected lung function variables in severe patients

    Directory of Open Access Journals (Sweden)

    da Silva SMD

    2016-03-01

    Silvia Maria Doria da Silva, Ilma Aparecida Paschoal, Eduardo Mello De Capitani, Marcos Mello Moreira, Luciana Campanatti Palhares, Mônica Corso Pereira. Pneumology Service, Department of Internal Medicine, School of Medical Sciences, State University of Campinas (UNICAMP), Campinas, São Paulo, Brazil. Background: Computed tomography (CT) phenotypic characterization helps in understanding the clinical diversity of chronic obstructive pulmonary disease (COPD) patients, but its clinical relevance and its relationship with functional features are not clarified. Volumetric capnography (VC) uses the principle of gas washout and analyzes the pattern of CO2 elimination as a function of expired volume. The main variables analyzed were end-tidal concentration of carbon dioxide (ETCO2), slope of phase 2 (Slp2), and slope of phase 3 (Slp3) of the capnogram, the curve which represents the total amount of CO2 eliminated by the lungs during each breath. Objective: To investigate, in a group of patients with severe COPD, if the phenotypic analysis by CT could identify different subsets of patients, and if there was an association of CT findings and functional variables. Subjects and methods: Sixty-five patients with COPD Gold III–IV were admitted for clinical evaluation, high-resolution CT, and functional evaluation (spirometry, 6-minute walk test [6MWT], and VC). The presence and profusion of tomography findings were evaluated, and later, the patients were identified as having emphysema (EMP) or airway disease (AWD) phenotype. EMP and AWD groups were compared; tomography findings scores were evaluated versus spirometric, 6MWT, and VC variables. Results: Bronchiectasis was found in 33.8% and peribronchial thickening in 69.2% of the 65 patients. Structural findings of airways had no significant correlation with spirometric variables. Air trapping and EMP were strongly correlated with VC variables, but in opposite directions. There was some overlap between the EMP and AWD

  14. Characterization of Machine Variability and Progressive Heat Treatment in Selective Laser Melting of Inconel 718

    Science.gov (United States)

    Prater, Tracie; Tilson, Will; Jones, Zack

    2015-01-01

    The absence of an economy of scale in spaceflight hardware makes additive manufacturing an immensely attractive option for propulsion components. As additive manufacturing techniques are increasingly adopted by government and industry to produce propulsion hardware in human-rated systems, significant development efforts are needed to establish these methods as reliable alternatives to conventional subtractive manufacturing. One of the critical challenges facing powder bed fusion techniques in this application is variability between machines used to perform builds. Even with implementation of robust process controls, it is possible for two machines operating at identical parameters with equivalent base materials to produce specimens with slightly different material properties. The machine variability study presented here evaluates 60 specimens of identical geometry built using the same parameters. Thirty samples were produced on machine 1 (M1) and the other 30 samples were built on machine 2 (M2). Each of the 30-sample sets was further subdivided into three subsets (with 10 specimens in each subset) to assess the effect of progressive heat treatment on machine variability. The three categories for post-processing were: stress relief, stress relief followed by hot isostatic press (HIP), and stress relief followed by HIP followed by heat treatment per AMS 5664. Each specimen (a round, smooth tensile) was mechanically tested per ASTM E8. Two formal statistical techniques, hypothesis testing for equivalency of means and one-way analysis of variance (ANOVA), were applied to characterize the impact of machine variability and heat treatment on six material properties: tensile stress, yield stress, modulus of elasticity, fracture elongation, and reduction of area. This work represents the type of development effort that is critical as NASA, academia, and the industrial base work collaboratively to establish a path to certification for additively manufactured parts. For future
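
    As an illustration of the one-way ANOVA mentioned in this abstract, the following is a minimal sketch in plain Python. The function name, the group labels and all numeric values are hypothetical and are not taken from the study; the sketch only computes the F statistic and its degrees of freedom, and leaves the p-value lookup to a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical yield-stress values (MPa) for three post-processing subsets.
stress_relief = [640.0, 655.0, 648.0, 651.0]
stress_relief_hip = [612.0, 620.0, 615.0, 618.0]
full_heat_treatment = [1010.0, 1022.0, 1015.0, 1019.0]

f_stat, df_b, df_w = one_way_anova_f(
    [stress_relief, stress_relief_hip, full_heat_treatment]
)
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

    A large F value relative to the F distribution with the reported degrees of freedom would suggest that at least one subset mean differs; comparing the two machines within a single subset would use the same function with two groups.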

  15. The relationship between selected variables and customer loyalty within an optometric practice environment

    Directory of Open Access Journals (Sweden)

    T. Van Vuuren

    2012-12-01

    Full Text Available Purpose: The purpose of the research that informed this article was to examine the relationship between customer satisfaction, trust, supplier image, commitment and customer loyalty within an optometric practice environment. Problem investigated: Optometric businesses need to adapt their strategies to enhance loyalty, as customer satisfaction is not enough to ensure loyalty and customer retention. An understanding of the variables influencing loyalty could help businesses within the optometric service environment to retain their customers and become more profitable. Methodology: The methodological approach followed was exploratory and quantitative in nature. The sample consisted of 357 customers who visited the practice twice or more over the previous six years. A structured questionnaire, with a five-point Likert scale, was fielded to gather the data. Descriptive and multiple regression analyses were used to analyse the results. Collinearity statistics and Pearson's correlation coefficient were also calculated to determine which independent variable has the largest influence on customer loyalty. Findings and implications: The main finding is that customer satisfaction had the highest correlation with customer loyalty. The other independent variables, however, also appear to significantly influence customer loyalty within an optometric practice environment. The implication is that optometric practices need to focus on customer satisfaction, trust, supplier image and commitment when addressing the improvement of customer loyalty. Originality and value of the research: The article contributes to the improvement of customer loyalty within a service business environment that could assist in facilitating larger market share, higher customer retention and greater profitability for the business over the long term.

  16. Select injury-related variables are affected by stride length and foot strike style during running.

    Science.gov (United States)

    Boyer, Elizabeth R; Derrick, Timothy R

    2015-09-01

    Some frontal plane and transverse plane variables have been associated with running injury, but it is not known if they differ with foot strike style or as stride length is shortened. To identify if step width, iliotibial band strain and strain rate, positive and negative free moment, pelvic drop, hip adduction, knee internal rotation, and rearfoot eversion differ between habitual rearfoot and habitual mid-/forefoot strikers when running with both a rearfoot strike (RFS) and a mid-/forefoot strike (FFS) at 3 stride lengths. Controlled laboratory study. A total of 42 healthy runners (21 habitual rearfoot, 21 habitual mid-/forefoot) ran overground at 3.35 m/s with both a RFS and a FFS at their preferred stride lengths and 5% and 10% shorter. Variables did not differ between habitual groups. Step width was 1.5 cm narrower for FFS, widening to 0.8 cm as stride length shortened. Iliotibial band strain and strain rate did not differ between foot strikes but decreased as stride length shortened (0.3% and 1.8%/s, respectively). Pelvic drop was reduced 0.7° for FFS compared with RFS, and both pelvic drop and hip adduction decreased as stride length shortened (0.8° and 1.5°, respectively). Peak knee internal rotation was not affected by foot strike or stride length. Peak rearfoot eversion was not different between foot strikes but decreased 0.6° as stride length shortened. Peak positive free moment (normalized to body weight [BW] and height [h]) was not affected by foot strike or stride length. Peak negative free moment was -0.0038 BW·m/h greater for FFS and decreased -0.0004 BW·m/h as stride length shortened. The small decreases in most variables as stride length shortened were likely associated with the concomitant wider step width. RFS had slightly greater pelvic drop, while FFS had slightly narrower step width and greater negative free moment. Shortening one's stride length may decrease or at least not increase propensity for running injuries based on the variables

  17. Effect of Integrated Yoga Module on Selected Psychological Variables among Women with Anxiety Problem.

    Science.gov (United States)

    Parthasarathy, S; Jaiganesh, K; Duraisamy

    2014-01-01

    The implementation of yogic practices has proven benefits in both organic and psychological diseases. Forty-five women with anxiety selected by a random sampling method were divided into three groups. Experimental group I was subjected to asanas, relaxation and pranayama while Experimental group II was subjected to an integrated yoga module. The control group did not receive any intervention. Anxiety was measured by Taylor's Manifest Anxiety Scale before and after treatment. Frustration was measured through Reaction to Frustration Scale. All data were spread in an Excel sheet to be analysed with SPSS 16 software using analysis of covariance (ANCOVA). Selected yoga and asanas decreased anxiety and frustration scores but treatment with an integrated yoga module resulted in significant reduction of anxiety and frustration. To conclude, the practice of asanas and yoga decreased anxiety in women, and yoga as an integrated module significantly improved anxiety scores in young women with proven anxiety without any ill effects.

  18. Induction and selection of superior genetic variables of oil seed rape (Brassica napus L.)

    International Nuclear Information System (INIS)

    Shah, S.S.; Ali, I.; Rehman, K.

    1990-01-01

    Dry and uniform seeds of two rape seed varieties, Ganyou-5 and Tower, were subjected to different doses of gamma rays. Genetic variation in yield and yield components generated in M1 was studied in M2 and 30 useful variants were isolated from a large mutagenized population. The selected mutants were progeny tested for stability of the characters in M3. Only five out of 30 progenies were identified to be uniform and stable. Further selection was made in the segregating M3 progenies. Results on some of the promising mutants are reported. The effect of irradiation treatment was highly pronounced on pod length, seeds per pod and 1000-seed weight. The genetic changes thus induced would help to evolve high yielding versions of different rape seed varieties under local environmental conditions. (author)

  19. Travelling green : Variables influencing students’ intention to select a green hotel

    OpenAIRE

    Lindqvist, Julia; Andersson, Mikaela

    2015-01-01

    Problematization: Tourism has a major impact on the environment. However, there is a conflict of interest making it difficult for the hotel business to decrease this impact. On the one hand, there is a pressure for environmentally friendly behaviour from society. On the other hand, the customers want to be pampered during their hotel stay. This makes it necessary to further investigate what influences customers’ intention to select a green hotel. Therefore this thesis examines students’ inten...

  20. NMR and Chemometric Characterization of Vacuum Residues and Vacuum Gas Oils from Crude Oils of Different Origin

    Directory of Open Access Journals (Sweden)

    Jelena Parlov Vuković

    2015-03-01

    Full Text Available NMR spectroscopy in combination with statistical methods was used to study vacuum residues and vacuum gas oils from 32 crude oils of different origin. Two chemometric methods were applied. Firstly, principal component analysis on complete spectra was used to perform classification of samples, and a clear distinction between vacuum residues and vacuum light and heavy gas oils was obtained. To quantitatively predict the composition of asphaltenes, principal component regression models using areas of resonance signals spanned by 11 frequency bins of the 1H NMR spectra were built. The first 5 principal components accounted for more than 94 % of variations in the input data set and the coefficient of determination for correlation between measured and predicted values was R2 = 0.7421. Although this value is not significant, it shows the underlying linear dependence in the data. Pseudo two-dimensional DOSY NMR experiments were used to assess the composition and structural properties of asphaltenes in a selected crude oil and its vacuum residue on the basis of their different hydrodynamic behavior and translational diffusion coefficients. DOSY spectra showed the presence of several asphaltene aggregates differing in size and in the interactions they formed. The obtained results have shown that NMR techniques in combination with chemometrics are very useful to analyze vacuum residues and vacuum gas oils. Furthermore, we expect that our ongoing investigation of asphaltenes from crude oils of different origin will elucidate in more detail the composition, structure and properties of these complex molecular systems.

  1. The effect of aquatic plyometric training with and without resistance on selected physical fitness variables among volleyball players

    Directory of Open Access Journals (Sweden)

    K. KAMALAKKANNAN

    2011-06-01

    Full Text Available The purpose of this study is to analyze the effect of aquatic plyometric training with and without the use of weights on selected physical fitness variables among volleyball players. To achieve the purpose of this study, 36 physically active undergraduate volleyball players between 18 and 20 years of age volunteered as participants. The participants were randomly categorized into three groups of 12 each: a control group (CG), an aquatic plyometric training with weight group (APTWG), and an aquatic plyometric training without weight group (APTWOG). The subjects of the control group were not exposed to any training. Both experimental groups underwent their respective experimental treatment for 12 weeks, 3 days per week, with a single session on each day. Speed, endurance, and explosive power were measured as the dependent variables for this study. Thirty-six days of experimental treatment were conducted for all the groups, and pre and post data were collected. The collected data were analyzed using an analysis of covariance (ANCOVA) followed by Scheffé’s post hoc test. The results revealed significant differences between groups on all the selected dependent variables. This study demonstrated that aquatic plyometric training can be one effective means for improving speed, endurance, and explosive power in volleyball players.

  2. Chemometric characterization of soil depth profiles

    International Nuclear Information System (INIS)

    Krieg, M.; Einax, J.

    1994-01-01

    The application of multivariate-statistical methods to the description of the metal distribution in soil depth profiles is shown. By means of cluster analysis, it is possible to get a first overview of the main differences in the metal status of the soil horizons. In case of anthropogenic soil pollution or geogenic enrichment, cluster analysis was able to detect the extent of the polluted soil layer or the different geological layers. The results of cluster analysis can be confirmed by means of multidimensional variance and discriminant analysis. Methods of discriminant analysis can also be used as a tool to determine the optimum number of variables which has to be measured for the classification of unknown soil samples into different pollution levels. Factor analysis yields an identification of not directly observable relationships between the variables. With additional knowledge about the orographic situation of the area and the probable sources of emission the factor loadings give information on the immission structure at the sampling location. (orig.)

  3. Chemometric techniques in oil classification from oil spill fingerprinting.

    Science.gov (United States)

    Ismail, Azimah; Toriman, Mohd Ekhwan; Juahir, Hafizan; Kassim, Azlina Md; Zain, Sharifuddin Md; Ahmad, Wan Kamaruzaman Wan; Wong, Kok Fah; Retnam, Ananthy; Zali, Munirah Abdul; Mokhtar, Mazlin; Yusri, Mohd Ayub

    2016-10-15

    Extended use of GC-FID and GC-MS in oil spill fingerprinting and matching is significantly important for oil classification from the oil spill sources collected from various areas of Peninsular Malaysia and Sabah (East Malaysia). Oil spill fingerprinting from GC-FID and GC-MS coupled with chemometric techniques (discriminant analysis and principal component analysis) is used as a diagnostic tool to classify the types of oil polluting the water. Clustering and discrimination of oil spill compounds in the water from the actual site of oil spill events are divided into four groups viz. diesel, Heavy Fuel Oil (HFO), Mixture Oil containing Light Fuel Oil (MOLFO) and Waste Oil (WO) according to the similarity of their intrinsic chemical properties. Principal component analysis (PCA) demonstrates that diesel, HFO, MOLFO and WO are types of oil or oil products from complex oil mixtures with a total variance of 85.34% and are identified with various anthropogenic activities related to either intentional releasing of oil or accidental discharge of oil into the environment. Our results show that the use of chemometric techniques is significant in providing independent validation for classifying the types of spilled oil in the investigation of oil spill pollution in Malaysia. This, in consequence would result in cost and time saving in identification of the oil spill sources. Copyright © 2016. Published by Elsevier Ltd.

  4. Neuronal Intra-Individual Variability Masks Response Selection Differences between ADHD Subtypes—A Need to Change Perspectives

    Directory of Open Access Journals (Sweden)

    Annet Bluschke

    2017-06-01

    Full Text Available Due to the high intra-individual variability in attention deficit/hyperactivity disorder (ADHD), there may be considerable bias in knowledge about altered neurophysiological processes underlying executive dysfunctions in patients with different ADHD subtypes. When aiming to establish dimensional cognitive-neurophysiological constructs representing symptoms of ADHD as suggested by the initiative for Research Domain Criteria, it is crucial to consider such processes independent of variability. We examined patients with the predominantly inattentive subtype (attention deficit disorder, ADD) and the combined subtype of ADHD (ADHD-C) in a flanker task measuring conflict control. Groups were matched for task performance. Besides using classic event-related potential (ERP) techniques and source localization, neurophysiological data were also analyzed using residue iteration decomposition (RIDE) to statistically account for intra-individual variability and S-LORETA to estimate the sources of the activations. The analysis of classic ERPs related to conflict monitoring revealed no differences between patients with ADD and ADHD-C. When individual variability was accounted for, clear differences became apparent in the RIDE C-cluster (analog to the P3 ERP-component). While patients with ADD distinguished between compatible and incompatible flanker trials early on, patients with ADHD-C seemed to employ more cognitive resources overall. These differences are reflected in inferior parietal areas. The study demonstrates differences in neuronal mechanisms related to response selection processes between ADD and ADHD-C which, according to source localization, arise from the inferior parietal cortex. Importantly, these differences could only be detected when accounting for intra-individual variability. The results imply that it is very likely that differences in neurophysiological processes between ADHD subtypes are underestimated and have not been recognized because intra

  5. Robust portfolio selection based on asymmetric measures of variability of stock returns

    Science.gov (United States)

    Chen, Wei; Tan, Shaohua

    2009-10-01

    This paper addresses a new uncertainty set--interval random uncertainty set for robust optimization. The form of interval random uncertainty set makes it suitable for capturing the downside and upside deviations of real-world data. These deviation measures capture distributional asymmetry and lead to better optimization results. We also apply our interval random chance-constrained programming to robust mean-variance portfolio selection under interval random uncertainty sets in the elements of mean vector and covariance matrix. Numerical experiments with real market data indicate that our approach results in better portfolio performance.

  6. Functional Data Analysis Applied in Chemometrics

    DEFF Research Database (Denmark)

    Muller, Martha

    nutritional status and metabolic phenotype. We want to understand how metabolomic spectra can be analysed using functional data analysis to detect the influence of different factors on specific metabolites. These factors can include, for example, gender, diet culture or dietary intervention. In Paper I we apply...... representation of each spectrum. Subset selection of wavelet coefficients generates the input to mixed models. Mixed-model methodology enables us to take the study design into account while modelling covariates. Bootstrap-based inference preserves the correlation structure between curves and enables the estimation...

  7. Conflict Management Styles of Selected Managers and Their Relationship With Management and Organization Variables

    Directory of Open Access Journals (Sweden)

    Concepcion Martires

    1990-12-01

    Full Text Available This study sought to determine the relationship between the conflict management styles of managers and certain management and organization factors. A total of 462 top, middle, and lower managers from 72 companies participated in the study, which utilized the Thomas-Kilmann Conflict Mode Instrument. To facilitate the computation of the statistical data, a microcomputer and a software package were used. The majority of the managers of the 17 types of organization included in the study use the collaborative mode of managing conflict. This finding is congruent with the findings of past studies conducted on managers of commercial banks, service, manufacturing, trading, advertising, appliance, investment houses, and overseas recruitment industries showing their high degree of objectivity and assertiveness of their own personal goals and of other people's concerns. The second dominant style, which is compromising, indicates their desire in sharing and searching for solutions that result in satisfaction among conflicting parties. This finding is highly consistent with the strong Filipino value of smooth interpersonal relationships (SIR) as reflected and discussed in the numerous researches on Filipino values. The chi-square tests generated by the statistics package showed independence between the manager's conflict management styles and each of the variables of sex, civil status, position level at work, work experience, type of corporation, and number of subordinates. This result is again congruent with those of past studies conducted in the Philippines. The past and present findings may imply that conflict management mode may be a highly personal style that is not dependent on any of these variables included in the study. However, the chi-square tests show that management style is dependent on the manager's age and educational attainment.
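
    The chi-square test of independence referred to in this abstract can be sketched as follows. This is a minimal illustration in plain Python, not the software package used in the study; the function name and the contingency counts are hypothetical. It returns the Pearson chi-square statistic and the degrees of freedom, which would then be compared against a chi-square critical value.

```python
def chi_square_independence(table):
    """Return (chi2, dof) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Hypothetical counts: rows = two age bands,
# columns = collaborating / compromising / other styles.
counts = [
    [60, 40, 30],
    [90, 35, 25],
]
chi2, dof = chi_square_independence(counts)
print(f"chi2 = {chi2:.2f} with {dof} degrees of freedom")
```

    A statistic below the critical value for the given degrees of freedom is read as no evidence against independence, which is how the abstract reports sex, civil status and the other listed variables.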

  8. Joint High-Dimensional Bayesian Variable and Covariance Selection with an Application to eQTL Analysis

    KAUST Repository

    Bhadra, Anindya

    2013-04-22

    We describe a Bayesian technique to (a) perform a sparse joint selection of significant predictor variables and significant inverse covariance matrix elements of the response variables in a high-dimensional linear Gaussian sparse seemingly unrelated regression (SSUR) setting and (b) perform an association analysis between the high-dimensional sets of predictors and responses in such a setting. To search the high-dimensional model space, where both the number of predictors and the number of possibly correlated responses can be larger than the sample size, we demonstrate that a marginalization-based collapsed Gibbs sampler, in combination with spike and slab type of priors, offers a computationally feasible and efficient solution. As an example, we apply our method to an expression quantitative trait loci (eQTL) analysis on publicly available single nucleotide polymorphism (SNP) and gene expression data for humans where the primary interest lies in finding the significant associations between the sets of SNPs and possibly correlated genetic transcripts. Our method also allows for inference on the sparse interaction network of the transcripts (response variables) after accounting for the effect of the SNPs (predictor variables). We exploit properties of Gaussian graphical models to make statements concerning conditional independence of the responses. Our method compares favorably to existing Bayesian approaches developed for this purpose. © 2013, The International Biometric Society.

  9. Identification of solid state fermentation degree with FT-NIR spectroscopy: Comparison of wavelength variable selection methods of CARS and SCARS

    Science.gov (United States)

    Jiang, Hui; Zhang, Hang; Chen, Quansheng; Mei, Congli; Liu, Guohai

    2015-10-01

    The use of wavelength variable selection before partial least squares discriminant analysis (PLS-DA) for qualitative identification of solid state fermentation degree by the FT-NIR spectroscopy technique was investigated in this study. Two wavelength variable selection methods, competitive adaptive reweighted sampling (CARS) and stability competitive adaptive reweighted sampling (SCARS), were employed to select the important wavelengths. PLS-DA was applied to calibrate the identification model using the wavelength variables selected by CARS and SCARS for identification of solid state fermentation degree. Experimental results showed that the numbers of wavelength variables selected by CARS and SCARS were 58 and 47, respectively, from the 1557 original wavelength variables. Compared with the results of full-spectrum PLS-DA, both wavelength variable selection methods enhanced the performance of the identification models. Meanwhile, compared with the CARS-PLS-DA model, the SCARS-PLS-DA model achieved better results, with an identification rate of 91.43% in the validation process. The overall results demonstrate that a PLS-DA model constructed using wavelength variables selected by a proper wavelength variable selection method can identify solid state fermentation degree more accurately.

  10. Soil Cd, Cr, Cu, Ni, Pb and Zn sorption and retention models using SVM: Variable selection and competitive model.

    Science.gov (United States)

    González Costa, J J; Reigosa, M J; Matías, J M; Covelo, E F

    2017-09-01

    The aim of this study was to model the sorption and retention of Cd, Cu, Ni, Pb and Zn in soils. To that end, the sorption and retention of these metals were studied and the soil characterization was performed separately. Multiple stepwise regression was used to produce multivariate models with linear techniques and with support vector machines, all of which included 15 explanatory variables characterizing soils. When the R-squared values are represented, two different groups are noticed. Cr, Cu and Pb sorption and retention show a higher R-squared; the most explanatory variables being humified organic matter, Al oxides and, in some cases, cation-exchange capacity (CEC). The other group of metals (Cd, Ni and Zn) shows a lower R-squared, and clays are the most explanatory variables, including a percentage of vermiculite and slime. In some cases, quartz, plagioclase or hematite percentages also show some explanatory capacity. Support Vector Machine (SVM) regression shows that the different models are not as regular as in multiple regression in terms of number of variables, the regression for nickel adsorption being the one with the highest number of variables in its optimal model. On the other hand, there are cases where the most explanatory variables are the same for two metals, as happens with Cd and Cr adsorption. A similar adsorption mechanism is thus postulated. These patterns of the introduction of variables in the model allow us to create explainability sequences. Those which are the most similar to the selectivity sequences obtained by Covelo (2005) are Mn oxides in multiple regression and change capacity in SVM. Among all the variables, the only one that is explanatory for all the metals after applying the maximum parsimony principle is the percentage of sand in the retention process. In the competitive model arising from the aforementioned sequences, the most intense competitiveness for the adsorption and retention of different metals appears between

  11. SEASONAL VARIABILITY OF SELECTED NUTRIENTS IN THE WATERS OF LAKES NIEPRUSZEWSKIE, PAMIATKOWSKIE AND STRYKOWSKIE

    Directory of Open Access Journals (Sweden)

    Anna Zbierska

    2016-09-01

    Full Text Available The paper presents the evaluation of seasonal and long-term changes in selected nutrients of three lakes of the Poznań Lakeland. The lakes were selected due to the high risk of pollution from agricultural and residential areas. Water samples were taken in 6 control points in the spring, summer and autumn, from 2004 to 2014. Trophic status of the lakes was evaluated based on the concentration of nutrients (nitrates, nitrites, ammonium, nitrogen and phosphorus and indicators of eutrophication. Studies have shown that the concentration of nutrients varied greatly both in individual years and seasons of the analyzed decades, especially in Lakes Niepruszewskie and Pamiątkowskie. The main problem is the high concentration of nitrates. In general, it showed an upward trend until 2013, especially in the spring. This may indicate that actions restricting runoff pollution from agricultural sources have not been fully effective. On the other hand, a marked downward trend in the concentrations of NH4 over the years from 2004 to 2014, especially after 2007, indicates a gradual improvement of wastewater management. Moreover, seasonal variation in NH4 concentrations differed from those of NO3 and NO2. The highest values were reported in the autumn season, the lowest in the summer. Concentrations of nutrients and eutrophication indexes reached high values in all analysed lakes, indicating a eutrophic or hypertrophic state of the lakes. The high value of the N:P ratio indicates that the lakes had a huge surplus of nitrogen, and phosphorus is a productivity limiting factor.

  12. Chemometrics applied to quality control and metabolomics for traditional Chinese medicines.

    Science.gov (United States)

    Liu, Shao; Liang, Yi-Zeng; Liu, Hai-Tao

    2016-03-15

    Traditional Chinese medicines (TCMs) bring a great challenge in quality control and evaluating the efficacy because of their complexity of chemical composition. Chemometric techniques provide a good opportunity for mining more useful chemical information from TCMs. Then, the application of chemometrics in the field of TCMs is spontaneous and necessary. This review focuses on the recent various important chemometrics tools for chromatographic fingerprinting, including peak alignment information features, baseline correction and applications of chemometrics in metabolomics and modernization of TCMs, including authentication and evaluation of the quality of TCMs, evaluating the efficacy of TCMs and essence of TCM syndrome. In the conclusions, the general trends and some recommendations for improving chromatographic metabolomics data analysis are provided. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. A Variable Service Broker Routing Policy for data center selection in cloud analyst

    Directory of Open Access Journals (Sweden)

    Ahmad M. Manasrah

    2017-07-01

    Full Text Available Cloud computing depends on sharing distributed computing resources to handle different services such as servers, storage and applications. The applications and infrastructures are provided as pay-per-use services through data centers to the end user. The data centers are located at different geographic locations. However, these data centers can get overloaded with the increasing number of client applications being serviced at the same time and location; this will degrade the overall QoS of the distributed services. Since different user applications may require different configurations and requirements, measuring the performance of user applications on various resources is challenging. The service provider cannot make decisions for the right level of resources. Therefore, we propose a Variable Service Broker Routing Policy – VSBRP, which is a heuristic-based technique that aims to achieve minimum response time through considering the communication channel bandwidth, latency and the size of the job. The proposed service broker policy will also reduce the overloading of the data centers by redirecting the user requests to the next data center that yields better response and processing time. The simulation shows promising results in terms of response and processing time compared to other known broker policies from the literature.
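
    The abstract does not give the VSBRP formula, so the following is only a hypothetical sketch of a broker heuristic in the same spirit: it estimates a response time from latency, bandwidth and job size, skips data centers that are at capacity, and picks the lowest estimate. The class, field names, cost model and numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    latency_ms: float       # network latency to the user base
    bandwidth_mbps: float   # available channel bandwidth
    active_jobs: int        # jobs currently being served
    capacity: int           # assumed maximum concurrent jobs


def estimated_response_ms(dc, job_size_mb):
    """Assumed cost model: latency plus time to transfer the job."""
    transfer_ms = job_size_mb * 8 / dc.bandwidth_mbps * 1000
    return dc.latency_ms + transfer_ms


def select_data_center(centers, job_size_mb):
    """Pick the non-overloaded data center with the lowest estimate."""
    available = [dc for dc in centers if dc.active_jobs < dc.capacity]
    # Fall back to every data center if all of them are at capacity.
    candidates = available or centers
    return min(candidates, key=lambda dc: estimated_response_ms(dc, job_size_mb))


centers = [
    DataCenter("dc-near", latency_ms=10, bandwidth_mbps=100, active_jobs=3, capacity=10),
    DataCenter("dc-fast", latency_ms=50, bandwidth_mbps=1000, active_jobs=9, capacity=10),
]
print(select_data_center(centers, job_size_mb=10).name)
```

    With this cost model a small job favors the low-latency data center and a large job favors the high-bandwidth one, which is the kind of trade-off a job-size-aware policy is meant to capture.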

  14. Bias and Stability of Single Variable Classifiers for Feature Ranking and Selection.

    Science.gov (United States)

    Fakhraei, Shobeir; Soltanian-Zadeh, Hamid; Fotouhi, Farshad

    2014-11-01

    Feature rankings are often used for supervised dimension reduction especially when discriminating power of each feature is of interest, dimensionality of dataset is extremely high, or computational power is limited to perform more complicated methods. In practice, it is recommended to start dimension reduction via simple methods such as feature rankings before applying more complex approaches. Single Variable Classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of such feature ranking method. We study whether the classifiers influence the SVC rankings or the discriminative power of features themselves has a dominant impact on the final rankings. We show the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study if heterogeneous classifiers ensemble approaches provide more unbiased rankings and if they improve final classification performance. Furthermore, we calculate an empirical prediction performance loss for using the same classifier in SVC feature ranking and final classification from the optimal choices.
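
    The idea of Single Variable Classifier ranking can be sketched as follows. This is a simplified illustration, not the authors' experimental setup: the classifier is assumed to be a one-feature threshold rule, the score is training accuracy on binary 0/1 labels rather than a cross-validated estimate, and the feature names and values are hypothetical.

```python
def stump_accuracy(values, labels):
    """Best training accuracy of a one-feature threshold rule (labels 0/1)."""
    n = len(labels)
    best = 0.0
    for threshold in sorted(set(values)):
        predictions = [1 if v >= threshold else 0 for v in values]
        correct = sum(p == y for p, y in zip(predictions, labels))
        # The rule may point either way, so also score the flipped predictions.
        best = max(best, correct / n, (n - correct) / n)
    return best


def svc_ranking(features, labels):
    """Rank features by the accuracy of a classifier built on each one alone."""
    scores = {name: stump_accuracy(column, labels) for name, column in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data set: two features, four samples.
features = {
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "feature_b": [1.0, 4.0, 2.0, 3.0],
}
labels = [0, 0, 1, 1]
for name, score in svc_ranking(features, labels):
    print(name, score)
```

    Swapping the threshold rule for a different single-feature classifier changes the scores, which is the source of the classifier-dependent bias the abstract investigates.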

  15. Impact of oil price shocks on selected macroeconomic variables in Nigeria

    International Nuclear Information System (INIS)

    Iwayemi, Akin; Fowowe, Babajide

    2011-01-01

    The impact of oil price shocks on the macroeconomy has received a great deal of attention since the 1970s. Initially, many empirical studies found a significant negative relationship between oil price shocks and GDP but, more recently, empirical studies have reported an insignificant relationship between oil shocks and the macroeconomy. A key feature of existing research is that it applies predominantly to advanced, oil-importing countries. For oil-exporting countries, different conclusions are expected but this can only be ascertained empirically. This study conducts an empirical analysis of the effects of oil price shocks on a developing country oil-exporter - Nigeria. Our findings showed that oil price shocks do not have a major impact on most macroeconomic variables in Nigeria. The results of the Granger-causality tests, impulse response functions, and variance decomposition analysis all showed that different measures of linear and positive oil shocks have not caused output, government expenditure, inflation, and the real exchange rate. The tests support the existence of asymmetric effects of oil price shocks because we find that negative oil shocks significantly cause output and the real exchange rate. (author)

  16. On the selection of significant variables in a model for the deteriorating process of facades

    Science.gov (United States)

    Serrat, C.; Gibert, V.; Casas, J. R.; Rapinski, J.

    2017-10-01

    In previous works the authors of this paper have introduced a predictive system that uses survival analysis techniques for the study of time-to-failure in the facades of a building stock. The approach is population based, in order to obtain information on the evolution of the stock across time, and to help the manager in the decision-making process on global maintenance strategies. For decision making it is crucial to determine those covariates (such as materials, morphology and characteristics of the facade, orientation or environmental conditions) that play a significant role in the progression of different failures. The proposed platform also incorporates an open source GIS plugin that includes survival and test modules that allow the investigator to model the time until a lesion, taking into account the variables collected during the inspection process. The aim of this paper is twofold: a) to briefly introduce the predictive system, as well as the inspection and the analysis methodologies, and b) to introduce and illustrate the modeling strategy for the deteriorating process of an urban front. The illustration will be focused on the city of L’Hospitalet de Llobregat (Barcelona, Spain), in which more than 14,000 facades have been inspected and analyzed.

  17. Variable selection based on clustering analysis for improvement of polyphenols prediction in green tea using synchronous fluorescence spectra

    Science.gov (United States)

    Shan, Jiajia; Wang, Xue; Zhou, Hao; Han, Shuqing; Riza, Dimas Firmanda Al; Kondo, Naoshi

    2018-04-01

    Synchronous fluorescence spectra, combined with multivariate analysis, were used to predict flavonoids content in green tea rapidly and nondestructively. This paper presented a new and efficient spectral interval selection method called clustering based partial least squares (CL-PLS), which selected informative wavelengths by combining the clustering concept and partial least squares (PLS) methods to improve models’ performance with synchronous fluorescence spectra. The fluorescence spectra of tea samples were obtained, k-means and Kohonen self-organizing map clustering algorithms were carried out to cluster the full spectra into several clusters, and a sub-PLS regression model was developed on each cluster. Finally, CL-PLS models consisting of gradually selected clusters were built. The correlation coefficient (R) was used to evaluate the effect on prediction performance of PLS models. In addition, variable influence on projection partial least squares (VIP-PLS), selectivity ratio partial least squares (SR-PLS), interval partial least squares (iPLS) models and a full-spectra PLS model were investigated and the results were compared. The results showed that CL-PLS presented the best result for flavonoids prediction using synchronous fluorescence spectra.

  18. Selective nature and inherent variability of interrill erosion across prolonged rainfall simulation

    Science.gov (United States)

    Hu, Y.; Kuhn, N. J.; Fister, W.

    2012-04-01

    Sediment of interrill erosion has been generally recognized to be selectively enriched with soil organic carbon (SOC) and fine fractions (clay/silt-sized particles or aggregates) in comparison to source area soil. Limited kinetic energy and lack of concentrated runoff are the dominant factors causing selective detachment and transportation. Although enrichment ratios of SOC (ERsoc) in eroded sediment were generally reported to be > 1, the values varied widely. Causal factors of this variation, such as initial soil properties, rainfall properties and experimental conditions, have been extensively discussed. But less attention was directed to the potential influence of prolonged rainfall time on the temporal pattern of ERsoc. Conservation of mass dictates that ERsoc must be balanced by a decline in the source material, which should also lead to a reduced or even negative ERsoc in sediment over time. Besides, the stabilizing effects of structural crust on reducing erosional variation, and the unavoidable variations of erosional response induced by the inherent complexity of interrill erosion, have scarcely been integrated. Moreover, during a prolonged rainfall event surface roughness evolves and affects the movement of eroded aggregates and mineral particles. In this study, two silt loams from Möhlin, Switzerland, organically (OS) and conventionally farmed (CS), were exposed to simulated rainfall of 30 mm h⁻¹ for up to 6 hours. Round donut-flumes with a confined eroding area (1845 cm²) and limited transporting distance (20 cm) were used. Sediments, runoff and subsurface flow were collected in intervals of 30 min. Loose aggregates left on the eroded soil surface, crusts and the soil underneath the crusts were collected after the experiment. All the samples were analyzed for total organic carbon (TOC) content and texture. Laser scanning of the soil surface was applied before and after the rainfall event. The whole experiment was repeated 10 times. Results from this study showed

  19. The influence of selected socio-demographic variables on symptoms occurring during the menopause

    Directory of Open Access Journals (Sweden)

    Marta Makara-Studzińska

    2015-02-01

    Introduction: It is considered that the lifestyle conditioned by socio-demographic or socio-economic factors determines the health condition of people to the greatest extent. The aim of this study is to evaluate the influence of selected socio-demographic factors on the kinds of symptoms occurring during menopause. Material and methods: The study group consisted of 210 women aged 45 to 65, not using hormone replacement therapy, staying at healthcare centers for rehabilitation treatment. The study was carried out in 2013-2014 in the Silesian, Podlaskie and Lesser Poland voivodeships. The set of tools consisted of the authors’ own survey questionnaire and the Menopause Rating Scale (MRS). Results: The most commonly occurring symptom in the group of studied women was a depressive mood, from the group of psychological symptoms, followed by physical and mental fatigue, and discomfort connected with muscle and joint pain. The greatest intensity of symptoms was observed in the group of women with the lowest level of education, reporting an average or bad material situation, and unemployed women. Conclusions: An alarmingly high number of reported psychological symptoms in the group of menopausal women was observed, in particular among the group of low socio-economic status. Career seems to be a factor reducing the risk of occurrence of psychological symptoms. There is an urgent need for health promotion and prophylaxis in the group of menopausal women, and in many cases for implementation of specialist psychological assistance.

  20. Detecting temporal changes in acoustic scenes: The variable benefit of selective attention.

    Science.gov (United States)

    Demany, Laurent; Bayle, Yann; Puginier, Emilie; Semal, Catherine

    2017-09-01

    Four experiments investigated change detection in acoustic scenes consisting of a sum of five amplitude-modulated pure tones. As the tones were about 0.7 octave apart and were amplitude-modulated with different frequencies (in the range 2-32 Hz), they were perceived as separate streams. Listeners had to detect a change in the frequency (experiments 1 and 2) or the shape (experiments 3 and 4) of the modulation of one of the five tones, in the presence of an informative cue orienting selective attention either before the scene (pre-cue) or after it (post-cue). The changes left intensity unchanged and were not detectable in the spectral (tonotopic) domain. Performance was much better with pre-cues than with post-cues. Thus, change deafness was manifest in the absence of an appropriate focusing of attention when the change occurred, even though the streams and the changes to be detected were acoustically very simple (in contrast to the conditions used in previous demonstrations of change deafness). In one case, the results were consistent with a model based on the assumption that change detection was possible if and only if attention was endogenously focused on a single tone. However, it was also found that changes resulting in a steepening of amplitude rises were to some extent able to draw attention exogenously. Change detection was not markedly facilitated when the change produced a discontinuity in the modulation domain, contrary to what could be expected from the perspective of predictive coding. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. A volatolomic approach for studying plant variability: the case of selected Helichrysum species (Asteraceae).

    Science.gov (United States)

    Giuliani, Claudia; Lazzaro, Lorenzo; Calamassi, Roberto; Calamai, Luca; Romoli, Riccardo; Fico, Gelsomina; Foggi, Bruno; Mariotti Lippi, Marta

    2016-10-01

    The species of Helichrysum sect. Stoechadina (Asteraceae) are well-known for their secondary metabolite content and their characteristic aromatic bouquets. In the wild, populations exhibit a wide phenotypic plasticity which makes critical the circumscription of species and infraspecific ranks. Previous investigations on the Helichrysum italicum complex focused on a possible phytochemical typification based on hydrodistilled essential oils. The aims of this paper are threefold: (i) characterizing the volatile profiles of different populations, testing (ii) how these profiles vary across populations and (iii) how the phytochemical diversity may contribute to solving taxonomic problems. Nine selected Helichrysum populations, included within the H. italicum complex, Helichrysum litoreum and Helichrysum stoechas, were investigated. H. stoechas was chosen as the outgroup for validating the method. After collection in the wild, plants were cultivated in standard growing conditions for over one year. Annual leafy shoots were screened in the post-blooming period for the emissions of volatile organic compounds (VOCs) by means of headspace solid phase microextraction coupled with gas chromatography and mass spectrometry (HS-SPME-GC/MS). The VOC composition analysis revealed the production of 386 different compounds overall, with terpenes being the most represented compound class. Statistical data processing allowed the identification of the indicator compounds that differentiate the single populations, revealing the influence of the geographical provenance area in determining the volatile profiles. These results suggested the potential use of VOCs as valuable diacritical characters in discriminating the Helichrysum populations. In addition, the cross-validation analysis hinted at the potential of this volatolomic study for the discrimination of the Helichrysum species and subspecies, highlighting a general congruence with the current taxonomic treatment of the genus. The consistency

  2. Habitat Heterogeneity Variably Influences Habitat Selection by Wild Herbivores in a Semi-Arid Tropical Savanna Ecosystem.

    Directory of Open Access Journals (Sweden)

    Victor K Muposhi

    An understanding of the habitat selection patterns of wild herbivores is critical for adaptive management, particularly towards ecosystem management and wildlife conservation in semi-arid savanna ecosystems. We tested the following predictions: (i) surface water availability, habitat quality and human presence have a strong influence on the spatial distribution of wild herbivores in the dry season, (ii) habitat suitability for large herbivores would be higher compared to medium-sized herbivores in the dry season, and (iii) the spatial extent of suitable habitats for wild herbivores will be different between years, i.e., 2006 and 2010, in Matetsi Safari Area, Zimbabwe. MaxEnt modeling was done to determine the habitat suitability of large herbivores and medium-sized herbivores. MaxEnt modeling of habitat suitability for large herbivores using the environmental variables was successful for the selected species in 2006 and 2010, except for elephant (Loxodonta africana) for the year 2010. Overall, the probability of occurrence of large herbivores was mostly influenced by distance from rivers. Distance from roads influenced much of the variability in the probability of occurrence of medium-sized herbivores. The overall predicted area for large and medium-sized herbivores was not different. Large herbivores may not necessarily utilize larger habitat patches than medium-sized herbivores, due to the habitat-homogenizing effect of water provisioning. The effect of surface water availability and of proximity to riverine ecosystems and roads on the habitat suitability of large and medium-sized herbivores in the dry season was highly variable and thus could change from one year to another. We recommend adaptive management initiatives aimed at ensuring dynamic water supply in protected areas through temporal closure and/or opening of water points to promote heterogeneity of wildlife habitats.

  3. Microwave-assisted of dispersive liquid-liquid microextraction and spectrophotometric determination of uranium after optimization based on Box-Behnken design and chemometrics methods

    Science.gov (United States)

    Niazi, Ali; Khorshidi, Neda; Ghaemmaghami, Pegah

    2015-01-01

    In this study an analytical procedure based on microwave-assisted dispersive liquid-liquid microextraction (MA-DLLME) and spectrophotometry coupled with chemometrics methods is proposed to determine uranium. In the proposed method, 4-(2-pyridylazo) resorcinol (PAR) is used as a chelating agent, and chloroform and ethanol are selected as extraction and dispersive solvents. The optimization strategy is carried out by using two-level full factorial designs. Results of the two-level full factorial design (2⁴) based on an analysis of variance demonstrated that the pH, concentration of PAR, and amounts of dispersive and extraction solvents are statistically significant. Optimal conditions for three variables: pH, concentration of PAR, amount of dispersive and extraction solvents are obtained by using a Box-Behnken design. Under the optimum conditions, the calibration graphs are linear in the range of 20.0-350.0 ng mL⁻¹ with a detection limit of 6.7 ng mL⁻¹ (3δB/slope), and the enrichment factor of this method for uranium reached 135. The relative standard deviation (R.S.D.) is 1.64% (n = 7, c = 50 ng mL⁻¹). Partial least squares (PLS) modeling was used for multivariate calibration of the spectrophotometric data. Orthogonal signal correction (OSC) was used for preprocessing of data matrices, and the prediction results of the model, with and without using OSC, were statistically compared. The MA-DLLME-OSC-PLS method was presented for the first time in this study. The root mean square errors of prediction (RMSEP) for uranium determination using PLS and OSC-PLS models were 4.63 and 0.98, respectively. This procedure allows the determination of uranium in synthetic and real samples such as waste water with good reliability.
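
    For readers unfamiliar with the screening step, a two-level full factorial design with four factors simply enumerates every low/high combination. The sketch below uses the conventional coded levels -1/+1; the factor names follow the abstract, while the actual low and high settings are not given there and are not assumed here.

```python
from itertools import product

# Factor names follow the abstract; coded levels -1 (low) and +1 (high)
# are the usual convention for two-level designs.
factors = ["pH", "PAR concentration", "dispersive solvent", "extraction solvent"]

# Every combination of low and high levels: 2**4 = 16 experimental runs.
design = [
    dict(zip(factors, levels))
    for levels in product((-1, +1), repeat=len(factors))
]

for run_number, run in enumerate(design, start=1):
    print(run_number, run)
```

    Each factor appears at its low and high level in exactly half of the runs, which is what allows an analysis of variance to separate the main effects.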

  4. Microwave-assisted of dispersive liquid-liquid microextraction and spectrophotometric determination of uranium after optimization based on Box-Behnken design and chemometrics methods.

    Science.gov (United States)

    Niazi, Ali; Khorshidi, Neda; Ghaemmaghami, Pegah

    2015-01-25

    In this study an analytical procedure based on microwave-assisted dispersive liquid-liquid microextraction (MA-DLLME) and spectrophotometry coupled with chemometrics methods is proposed to determine uranium. In the proposed method, 4-(2-pyridylazo) resorcinol (PAR) is used as a chelating agent, and chloroform and ethanol are selected as extraction and dispersive solvents. The optimization strategy is carried out by using two-level full factorial designs. Results of the two-level full factorial design (2⁴) based on an analysis of variance demonstrated that the pH, concentration of PAR, and amounts of dispersive and extraction solvents are statistically significant. Optimal conditions for three variables: pH, concentration of PAR, amount of dispersive and extraction solvents are obtained by using a Box-Behnken design. Under the optimum conditions, the calibration graphs are linear in the range of 20.0-350.0 ng mL⁻¹ with a detection limit of 6.7 ng mL⁻¹ (3δB/slope), and the enrichment factor of this method for uranium reached 135. The relative standard deviation (R.S.D.) is 1.64% (n = 7, c = 50 ng mL⁻¹). Partial least squares (PLS) modeling was used for multivariate calibration of the spectrophotometric data. Orthogonal signal correction (OSC) was used for preprocessing of data matrices, and the prediction results of the model, with and without using OSC, were statistically compared. The MA-DLLME-OSC-PLS method was presented for the first time in this study. The root mean square errors of prediction (RMSEP) for uranium determination using PLS and OSC-PLS models were 4.63 and 0.98, respectively. This procedure allows the determination of uranium in synthetic and real samples such as waste water with good reliability. Copyright © 2014. Published by Elsevier B.V.

  5. Association and discrimination of diesel fuels using chemometric procedures.

    Science.gov (United States)

    Marshall, Lucas J; McIlroy, John W; McGuffin, Victoria L; Waddell Smith, Ruth

    2009-08-01

    Five neat diesel samples were analyzed by gas chromatography-mass spectrometry and total ion chromatograms as well as extracted ion profiles of the alkane and aromatic compound classes were generated. A retention time alignment algorithm was employed to align chromatograms prior to peak area normalization. Pearson product moment correlation coefficients and principal components analysis were then employed to investigate association and discrimination among the diesel samples. The same procedures were also used to investigate the association of a diesel residue to its neat counterpart. Current limitations in the retention time alignment algorithm and the subsequent effect on the association and discrimination of the diesel samples are discussed. An understanding of these issues is crucial to ensure the accuracy of data interpretation based on such chemometric procedures.
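
    The association step described above (peak area normalization followed by Pearson product moment correlation) can be sketched as below. The peak areas are hypothetical, retention time alignment is assumed to have been done already, and normalization to total area is one common choice rather than necessarily the one used in the study.

```python
from math import sqrt


def normalize_to_total_area(peak_areas):
    # Express each peak area as a fraction of the total area.
    total = sum(peak_areas)
    return [a / total for a in peak_areas]


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical aligned peak areas (same peaks, same order in every sample).
diesel_a = [120.0, 340.0, 560.0, 310.0, 90.0]
diesel_a_replicate = [240.0, 680.0, 1120.0, 620.0, 180.0]  # same profile, twice the signal
diesel_b = [400.0, 150.0, 100.0, 300.0, 500.0]

a, a_rep, b = map(normalize_to_total_area, (diesel_a, diesel_a_replicate, diesel_b))
print(pearson_r(a, a_rep), pearson_r(a, b))
```

    Pearson's r is itself unaffected by a uniform rescaling of one sample, so normalization matters mainly for the principal components analysis; a misaligned peak, by contrast, changes which values are paired and can lower r substantially, which is why the alignment limitations noted in the abstract affect the conclusions.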

  6. A primer to nutritional metabolomics by NMR spectroscopy and chemometrics

    DEFF Research Database (Denmark)

    Savorani, Francesco; Rasmussen, Morten Arendt; Mikkelsen, Mette Skau

    2013-01-01

    This paper outlines the advantages and disadvantages of using high throughput NMR metabolomics for nutritional studies with emphasis on the workflow and data analytical methods for generation of new knowledge. The paper describes one-by-one the major research activities in the interdisciplinary...... metabolomics platform and highlights the opportunities that NMR spectra can provide in future nutrition studies. Three areas are emphasized: (1) NMR as an unbiased and non-destructive platform for providing an overview of the metabolome under investigation, (2) NMR for providing versatile information and data...... structures for multivariate pattern recognition methods and (3) NMR for providing a unique fingerprint of the lipoprotein status of the subject. For the first time in history, by combining NMR spectroscopy and chemometrics we are able to perform inductive nutritional research as a complement to the deductive...

  7. Chemometrics approach to substrate development, case: semisyntetic cheese

    DEFF Research Database (Denmark)

    Nielsen, Per Væggemose; Hansen, Birgitte Vedel

    1998-01-01

    from food production facilities. The Chemometrics approach to substrate development is illustrated by the development of a semisynthetic cheese substrate. Growth, colour formation and mycotoxin production of 6 cheese-related fungi were studied on 9 types of natural cheeses and 24 synthetic cheese......, the most frequently occurring contaminant on semi-hard cheese. Growth experiments on the substrate were repeatable and reproducible. The substrate was also suitable for the starter P. camemberti. Mineral elements in cheese were shown to have a strong effect on growth, mycotoxin production and colour...... formation of fungi. For P. roqueforti, P. discolor, P. verrucosum and Aspergillus versicolor the substrate was less suitable as a model cheese substrate, which indicates great variation in nutritional demands of the fungi. Substrates suitable for studies of specific cheese types were found for P. roqueforti...

  8. A chemometric approach to the characterisation of historical mortars

    International Nuclear Information System (INIS)

    Rampazzi, L.; Pozzi, A.; Sansonetti, A.; Toniolo, L.; Giussani, B.

    2006-01-01

    Knowledge of the composition of historical mortars is of great concern in provenance and dating investigations and in conservation works, since the nature of the raw materials suggests the most compatible conservation products. The classic characterisation usually goes through various analytical determinations, while conservation laboratories call for simple and quick analyses able to elucidate the nature of mortars, usually in terms of the binder fraction. A chemometric approach to the matter is undertaken here. Specimens of mortars were prepared with calcitic and dolomitic binders and analysed by Atomic Spectroscopy. Principal Components Analysis (PCA) was used to investigate the features of specimens and samples. A Partial Least Squares (PLS1) regression was done in order to predict the binder/aggregate ratio. The model was applied to historical mortars from the churches of St. Lorenzo (Milan) and St. Abbondio (Como). The agreement between the predictive model and the real samples is discussed.

  9. An adaptive technique for multiscale approximate entropy (MAEbin) threshold (r) selection: application to heart rate variability (HRV) and systolic blood pressure variability (SBPV) under postural stress.

    Science.gov (United States)

    Singh, Amritpal; Saini, Barjinder Singh; Singh, Dilbag

    2016-06-01

    Multiscale approximate entropy (MAE) is used to quantify the complexity of a time series as a function of time scale τ. Selection of the approximate entropy (ApEn) tolerance threshold 'r' is based on either (1) arbitrary selection in the recommended range (0.1-0.25) times the standard deviation of the time series, or (2) finding the maximum ApEn (ApEnmax), i.e., the point where self-matches start to prevail over other matches, and choosing the corresponding 'r' (rmax) as the threshold, or (3) computing rchon by empirically finding the relation between rmax, the SD1/SD2 ratio and N using curve fitting, where SD1 and SD2 are the short-term and long-term variability of a time series, respectively. None of these methods is a gold standard for selection of 'r'. In our previous study [1], an adaptive procedure for selection of 'r' was proposed for approximate entropy (ApEn). In this paper, this is extended to multiple time scales using MAEbin and multiscale cross-MAEbin (XMAEbin). We applied this to simulations, i.e., 50 realizations (n = 50) of random number series, fractional Brownian motion (fBm) and MIX (P) [1] series of data length N = 300, and to short-term recordings of HRV and SBPV performed under postural stress from supine to standing. MAEbin and XMAEbin analysis was performed on laboratory-recorded data of 50 healthy young subjects experiencing postural stress from supine to upright. The study showed that (i) ApEnbin of HRV is higher than that of SBPV in the supine position but lower than that of SBPV in the upright position, (ii) ApEnbin of HRV decreases from supine, i.e., 1.7324 ± 0.112 (mean ± SD), to upright, 1.4916 ± 0.108, due to vagal inhibition, (iii) ApEnbin of SBPV increases from supine, i.e., 1.5535 ± 0.098, to upright, i.e., 1.6241 ± 0.101, due to sympathetic activation, (iv) individual and cross complexities of RRi and systolic blood pressure (SBP) series depend on the time scale under consideration, (v) XMAEbin calculated using ApEnmax is correlated with cross-MAE calculated using ApEn (0.1-0.26) in steps of 0
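
    As background for the threshold discussion above, the sketch below computes approximate entropy with the simplest of the listed options, a fixed 'r' equal to a fraction of the standard deviation. It does not implement the adaptive MAEbin procedure, whose details are not given in the abstract; the coarse-graining helper follows the usual multiscale-entropy construction, which is assumed here, and the test series are synthetic.

```python
from math import log
from random import Random
from statistics import pstdev


def approximate_entropy(series, m=2, r_factor=0.2):
    """ApEn(m, r) with r = r_factor * SD of the series (self-matches counted)."""
    n = len(series)
    r = r_factor * pstdev(series)

    def phi(dim):
        templates = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for t_i in templates:
            # Count templates within Chebyshev distance r of t_i (includes t_i itself).
            matches = sum(
                1 for t_j in templates
                if max(abs(a - b) for a, b in zip(t_i, t_j)) <= r
            )
            total += log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)


def coarse_grain(series, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n_windows = len(series) // scale
    return [sum(series[k * scale:(k + 1) * scale]) / scale for k in range(n_windows)]


# Synthetic examples: a perfectly regular series and a random number series.
regular = [0, 1] * 50
rng = Random(0)
irregular = [rng.random() for _ in range(100)]

apen_regular = approximate_entropy(regular)
apen_irregular = approximate_entropy(irregular)
print(apen_regular, apen_irregular)

# A multiscale profile applies ApEn to coarse-grained copies of the series.
profile = [approximate_entropy(coarse_grain(irregular, tau)) for tau in (1, 2)]
```

    Because the result depends on `r_factor`, repeating the calculation over a grid of values is one way to see why the choice of 'r' is treated as a problem in its own right.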

  10. Model selection for semiparametric marginal mean regression accounting for within-cluster subsampling variability and informative cluster size.

    Science.gov (United States)

    Shen, Chung-Wei; Chen, Yi-Hau

    2018-03-13

    We propose a model selection criterion for semiparametric marginal mean regression based on generalized estimating equations. The work is motivated by a longitudinal study on the physical frailty outcome in the elderly, where the cluster size, that is, the number of the observed outcomes in each subject, is "informative" in the sense that it is related to the frailty outcome itself. The new proposal, called Resampling Cluster Information Criterion (RCIC), is based on the resampling idea utilized in the within-cluster resampling method (Hoffman, Sen, and Weinberg, 2001, Biometrika 88, 1121-1134) and accommodates informative cluster size. The implementation of RCIC, however, is free of performing actual resampling of the data and hence is computationally convenient. Compared with the existing model selection methods for marginal mean regression, the RCIC method incorporates an additional component accounting for variability of the model over within-cluster subsampling, and leads to remarkable improvements in selecting the correct model, regardless of whether the cluster size is informative or not. Applying the RCIC method to the longitudinal frailty study, we identify being female, old age, low income and life satisfaction, and chronic health conditions as significant risk factors for physical frailty in the elderly. © 2018, The International Biometric Society.

  11. Petroleomics by electrospray ionization FT-ICR mass spectrometry coupled to partial least squares with variable selection methods: prediction of the total acid number of crude oils.

    Science.gov (United States)

    Terra, Luciana A; Filgueiras, Paulo R; Tose, Lílian V; Romão, Wanderson; de Souza, Douglas D; de Castro, Eustáquio V R; de Oliveira, Mirela S L; Dias, Júlio C M; Poppi, Ronei J

    2014-10-07

    Negative-ion mode electrospray ionization, ESI(-), with Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was coupled to a Partial Least Squares (PLS) regression and variable selection methods to estimate the total acid number (TAN) of Brazilian crude oil samples. Generally, ESI(-)-FT-ICR mass spectra present a power of resolution of ca. 500,000 and a mass accuracy less than 1 ppm, producing a data matrix containing over 5700 variables per sample. These variables correspond to heteroatom-containing species detected as deprotonated molecules, [M - H]⁻ ions, which are identified primarily as naphthenic acids, phenols and carbazole analog species. The TAN values for all samples ranged from 0.06 to 3.61 mg of KOH g⁻¹. To facilitate the spectral interpretation, three methods of variable selection were studied: variable importance in the projection (VIP), interval partial least squares (iPLS) and elimination of uninformative variables (UVE). The UVE method seems to be more appropriate for selecting important variables, reducing the dimension of the variables to 183 and producing a root mean square error of prediction of 0.32 mg of KOH g⁻¹. By reducing the size of the data, it was possible to relate the selected variables with their corresponding molecular formulas, thus identifying the main chemical species responsible for the TAN values.
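
    The figure of merit quoted above, the root mean square error of prediction (RMSEP), is the square root of the mean squared difference between reference and predicted values on a test set. A minimal sketch with hypothetical TAN values (not data from the study):

```python
from math import sqrt


def rmsep(reference, predicted):
    """Root mean square error of prediction over an independent test set."""
    squared_errors = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical TAN values in mg KOH per g; chosen only for illustration.
tan_reference = [0.10, 0.50, 1.20, 2.40, 3.60]
tan_predicted = [0.30, 0.40, 1.50, 2.00, 3.70]
print(round(rmsep(tan_reference, tan_predicted), 3))
```

    RMSEP carries the units of the predicted property, so it can be read directly against the range of TAN values in the sample set.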

  12. Metabolomic differentiation of maca (Lepidium meyenii) accessions cultivated under different conditions using NMR and chemometric analysis.

    Science.gov (United States)

    Zhao, Jianping; Avula, Bharathi; Chan, Michael; Clément, Céline; Kreuzer, Michael; Khan, Ikhlas A

    2012-01-01

    To gain insights on the effects of color type, cultivation history, and growing site on the composition alterations of maca (Lepidium meyenii Walpers) hypocotyls, NMR profiling combined with chemometric analysis was applied to investigate the metabolite variability in different maca accessions. Maca hypocotyls with different colors (yellow, pink, violet, and lead-colored) cultivated at different geographic sites and different areas were examined for differences in metabolite expression. Differentiations of the maca accessions grown under the different cultivation conditions were determined by principal component analyses (PCAs) which were performed on the datasets derived from their ¹H NMR spectra. A total of 16 metabolites were identified by NMR analysis, and the changes in metabolite levels in relation to the color types and growing conditions of maca hypocotyls were evaluated using univariate statistical analysis. In addition, the changes of the correlation pattern among the metabolites identified in the maca accessions planted at the two different sites were examined. The results from both multivariate and univariate analysis indicated that the planting site was the major determining factor with regard to metabolite variations in maca hypocotyls, while the color of the maca accession seems to be of minor importance in this respect. © Georg Thieme Verlag KG Stuttgart · New York.

  13. Influence of genotype and crop year in the chemometrics of almond and pistachio oils.

    Science.gov (United States)

    Rabadán, Adrián; Álvarez-Ortí, Manuel; Gómez, Ricardo; de Miguel, Concepción; Pardo, José E

    2018-04-01

    Almond and pistachio oils can be considered as interesting products to produce and commercialize owing to their health-promoting properties. However, these properties are not consistent because of the differences that appear in oils as a result of the genotype and the crop year. The analysis of these variations and their origin is decisive in ensuring the commercial future prospects of these nut oils. Although significant variability has been reported in almond and pistachio oils as a result of the crop year and the interaction between crop year and genotype, the genotype itself remains the main factor determining oil chemometrics. The oil fatty acid profile has been mainly determined by the genotype, with the exception of palmitic fatty acid in pistachio oil. However, the crop year affects the concentration of some minor components of crucial nutritional interest, such as total polyphenols and phytosterols. Regarding reported differences in oil, some almond and pistachio genotypes should be prioritized for oil extraction. Breeding programmes focused on the improvement of specific characteristics of almond and pistachio oils should focus on chemical parameters mainly determined by the genotype. © 2017 Society of Chemical Industry.

  14. Botanical discrimination of Greek unifloral honeys with physico-chemical and chemometric analyses.

    Science.gov (United States)

    Karabagias, Ioannis K; Badeka, Anastasia V; Kontakos, Stavros; Karabournioti, Sofia; Kontominas, Michael G

    2014-12-15

    The aim of the present study was to investigate the possibility of characterisation and classification of Greek unifloral honeys (pine, thyme, fir and orange blossom) according to botanical origin using volatile compounds, conventional physico-chemical parameters and chemometric analyses (MANOVA and Linear Discriminant Analysis). For this purpose, 119 honey samples were collected during the harvesting period 2011 from 14 different regions in Greece known to produce unifloral honey of good quality. Physico-chemical analysis included the identification and semi quantification of fifty five volatile compounds performed by Headspace Solid Phase Microextraction coupled to gas chromatography/mass spectroscopy and the determination of conventional quality parameters such as pH, free, lactonic, total acidity, electrical conductivity, moisture, ash, lactonic/free acidity ratio and colour parameters L, a, b. Results showed that using 40 diverse variables (30 volatile compounds of different classes and 10 physico-chemical parameters) the honey samples were satisfactorily classified according to botanical origin using volatile compounds (84.0% correct prediction), physicochemical parameters (97.5% correct prediction), and the combination of both (95.8% correct prediction) indicating that multi element analysis comprises a powerful tool for honey discrimination purposes. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. A chemometric method to identify enzymatic reactions leading to the transition from glycolytic oscillations to waves

    Science.gov (United States)

    Zimányi, László; Khoroshyy, Petro; Mair, Thomas

    2010-06-01

    In the present work we demonstrate that FTIR-spectroscopy is a powerful tool for the time resolved and noninvasive measurement of multi-substrate/product interactions in complex metabolic networks as exemplified by the oscillating glycolysis in a yeast extract. Based on a spectral library constructed from the pure glycolytic intermediates, chemometric analysis of the complex spectra allowed us to identify many of these intermediates. Singular value decomposition and multiple level wavelet decomposition were used to separate drifting substances from oscillating ones. This enabled us to identify slow and fast variables of glycolytic oscillations. Most importantly, we can attribute a qualitative change in the positive feedback regulation of the autocatalytic reaction to the transition from homogeneous oscillations to travelling waves. During the oscillatory phase the enzyme phosphofructokinase is mainly activated by its own product ADP, whereas the transition to waves is accompanied by a shift of the positive feedback from ADP to AMP. This indicates that the overall energetic state of the yeast extract determines the transition between spatially homogeneous oscillations and travelling waves.

  16. Chemometric study of Andalusian extra virgin olive oils Raman spectra: Qualitative and quantitative information.

    Science.gov (United States)

    Sánchez-López, E; Sánchez-Rodríguez, M I; Marinas, A; Marinas, J M; Urbano, F J; Caridad, J M; Moalem, M

    2016-08-15

    Authentication of extra virgin olive oil (EVOO) is an important topic for olive oil industry. The fraudulent practices in this sector are a major problem affecting both producers and consumers. This study analyzes the capability of FT-Raman combined with chemometric treatments of prediction of the fatty acid contents (quantitative information), using gas chromatography as the reference technique, and classification of diverse EVOOs as a function of the harvest year, olive variety, geographical origin and Andalusian PDO (qualitative information). The optimal number of PLS components that summarizes the spectral information was introduced progressively. For the estimation of the fatty acid composition, the lowest error (both in fitting and prediction) corresponded to MUFA, followed by SAFA and PUFA though such errors were close to zero in all cases. As regards the qualitative variables, discriminant analysis allowed a correct classification of 94.3%, 84.0%, 89.0% and 86.6% of samples for harvest year, olive variety, geographical origin and PDO, respectively. Copyright © 2016 Elsevier B.V. All rights reserved.

  17. Fast data preprocessing for chromatographic fingerprints of tomato cell wall polysaccharides using chemometric methods.

    Science.gov (United States)

    Quéméner, Bernard; Bertrand, Dominique; Marty, Isabelle; Causse, Mathilde; Lahaye, Marc

    2007-02-02

    The variability in the chemistry of cell wall polysaccharides in pericarp tissue of red-ripe tomato fruit (Solanum lycopersicon Mill.) was characterized by chemical methods and enzymatic degradations coupled to high performance anion exchange chromatography (HPAEC) and mass spectrometry analysis. Large fruited line, Levovil (LEV) carrying introgressed chromosome fragments from a cherry tomato line Cervil (CER) on chromosomes 4 (LC4), 9 (LC9), or on chromosomes 1, 2, 4 and 9 (LCX) and containing quantitative trait loci (QTLs) for texture traits, was studied. In order to differentiate cell wall polysaccharide modifications in the tomato fruit collection by multivariate analysis, chromatograms were corrected for baseline drift and shift of the component elution time using an approach derived from image analysis and mathematical morphology. The baseline was first corrected by using a "moving window" approach while the peak-matching method developed was based upon location of peaks as local maxima within a window of a definite size. The fast chromatographic data preprocessing proposed was a prerequisite for the different chemometric treatments, such as variance and principal component analysis applied herein to the analysis. Applied to the tomato collection, the combined enzymatic degradations and HPAEC analyses revealed that the firm LCX and CER genotypes showed a higher proportion of glucuronoxylans and pectic arabinan side chains while the mealy LC9 genotype demonstrated the highest content of pectic galactan side chains. QTLs on tomato chromosomes 1, 2, 4 and 9 contain important genes controlling glucuronoxylan and pectic neutral side chains biosynthesis and/or metabolism.
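
    The two preprocessing steps named in this abstract (a "moving window" baseline correction and peak location as local maxima within a window) can be illustrated with a minimal sketch. It is not the authors' implementation: the morphological opening used for the baseline, the window sizes, the height threshold, and the synthetic chromatogram are all assumptions chosen for illustration.

```python
import math

def opening_baseline(signal, half_window):
    """Estimate a slowly varying baseline by a morphological opening:
    a moving minimum (erosion) followed by a moving maximum (dilation)."""
    n = len(signal)
    eroded = [min(signal[max(0, i - half_window): i + half_window + 1])
              for i in range(n)]
    return [max(eroded[max(0, i - half_window): i + half_window + 1])
            for i in range(n)]

def find_peaks(signal, half_window, min_height):
    """Return indices whose value is the maximum of the surrounding window."""
    peaks = []
    for i, value in enumerate(signal):
        window = signal[max(0, i - half_window): i + half_window + 1]
        if value >= min_height and value == max(window):
            peaks.append(i)
    return peaks

# Synthetic chromatogram (made-up values): linear drift plus two Gaussian peaks.
chromatogram = [0.01 * t
                + 5.0 * math.exp(-((t - 30) ** 2) / 8.0)
                + 5.0 * math.exp(-((t - 70) ** 2) / 8.0)
                for t in range(100)]

# The baseline window must be wider than the widest peak, otherwise
# the opening follows the peak instead of the drift.
baseline = opening_baseline(chromatogram, half_window=10)
corrected = [s - b for s, b in zip(chromatogram, baseline)]
peaks = find_peaks(corrected, half_window=5, min_height=1.0)
print(peaks)
```

    In this sketch the detected peak indices would then serve as anchors for matching elution times across chromatograms before multivariate analysis.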

  18. The influence of some selected variables from accounting system on profit or loss of agricultural companies in the Slovak republic

    Directory of Open Access Journals (Sweden)

    Alexandra Ferenczi Vaňová

    2017-01-01

    Full Text Available The article presents the influence assessment of significance of some selected variables from the entrepreneurs' accounting system on the achieved profit or loss of the agricultural companies in the Slovak Republic. Accounting information serves as an active tool for internal users for operational as well as strategic company management, and for external users the information is determined as legally binding output information which is a subject to disclosure. Individual financial statements of assessed agricultural companies are considered to be the relevant source of information. Agricultural companies are represented by commercial companies and agricultural cooperatives. Profit or loss after income tax presents the final complex effect of economic company's performance. The existence and development of companies is conditioned by assets whose amount and structure depend on focus and the range of subject activity but as well as on specific factors set by the production process in the agricultural primary production. The increase in liabilities is notable by the influence of insufficient amount of own company funding sources, mainly the increase in trade payables. The continuance of company reproduction process is secured by a bank loan drawdown. The income situation of companies of agricultural primary production is favourably influenced by the subsidies of non-investment character. During the observed period of years 2004 - 2014 the examined variables were assessed by means of statistical methods. 
The obtained results of rate determination of statistical correlation between selected variables by means of classical canonical analysis and non-parametric correlation analysis secured that in the assessed group of companies all analysed variables influenced statistically significantly profit or loss after income tax, mainly the total value of assets and non-investment subsidies, except for years 2010, 2012 and 2013, when the statistically

  19. Genetic variability and natural selection at the ligand domain of the Duffy binding protein in brazilian Plasmodium vivax populations

    Directory of Open Access Journals (Sweden)

    Gil Luiz HS

    2010-11-01

    Full Text Available Background: Plasmodium vivax malaria is a major public health challenge in Latin America, Asia and Oceania, with 130-435 million clinical cases per year worldwide. Invasion of host blood cells by P. vivax mainly depends on a type I membrane protein called Duffy binding protein (PvDBP). The erythrocyte-binding motif of PvDBP is a 170 amino-acid stretch located in its cysteine-rich region II (PvDBPII), which is the most variable segment of the protein. Methods: To test whether diversifying natural selection has shaped the nucleotide diversity of PvDBPII in Brazilian populations, this region was sequenced in 122 isolates from six different geographic areas. A Bayesian method was applied to test for the action of natural selection under a population genetic model that incorporates recombination. The analysis was integrated with a structural model of PvDBPII, and T- and B-cell epitopes were localized on the 3-D structure. Results: The results suggest that: (i) recombination plays an important role in determining the haplotype structure of PvDBPII, and (ii) PvDBPII appears to contain neutrally evolving codons as well as codons evolving under natural selection. Diversifying selection preferentially acts on sites identified as epitopes, particularly on amino acid residues 417, 419, and 424, which show strong linkage disequilibrium. Conclusions: This study shows that some polymorphisms of PvDBPII are present near the erythrocyte-binding domain and might serve to elude antibodies that inhibit cell invasion. Therefore, these polymorphisms should be taken into account when designing vaccines aimed at eliciting antibodies to inhibit erythrocyte invasion.

  20. Synthesis, Characterization, and Variable-Temperature NMR Studies of Silver(I) Complexes for Selective Nitrene Transfer.

    Science.gov (United States)

    Huang, Minxue; Corbin, Joshua R; Dolan, Nicholas S; Fry, Charles G; Vinokur, Anastasiya I; Guzei, Ilia A; Schomaker, Jennifer M

    2017-06-05

    An array of silver complexes supported by nitrogen-donor ligands catalyze the transformation of C═C and C-H bonds to valuable C-N bonds via nitrene transfer. The ability to achieve high chemoselectivity and site selectivity in an amination event requires an understanding of both the solid- and solution-state behavior of these catalysts. X-ray structural characterizations were helpful in determining ligand features that promote the formation of monomeric versus dimeric complexes. Variable-temperature ¹H and DOSY NMR experiments were especially useful for understanding how the ligand identity influences the nuclearity, coordination number, and fluxional behavior of silver(I) complexes in solution. These insights are valuable for developing improved ligand designs.

  1. Application of chemometric analysis based on physicochemical and chromatographic data for the differentiation origin of plant protection products containing chlorpyrifos.

    Science.gov (United States)

    Miszczyk, Marek; Płonka, Marlena; Bober, Katarzyna; Dołowy, Małgorzata; Pyka, Alina; Pszczolińska, Klaudia

    2015-01-01

    The aim of this study was to investigate the similarities and dissimilarities between the pesticide samples in form of emulsifiable concentrates (EC) formulation containing chlorpyrifos as active ingredient coming from different sources (i.e., shops and wholesales) and also belonging to various series. The results obtained by the Headspace Gas Chromatography-Mass Spectrometry method and also some selected physicochemical properties of examined pesticides including pH, density, stability, active ingredient and water content in pesticides tested were compared using two chemometric methods. Applicability of simple cluster analysis and also principal component analysis of obtained data in differentiation of examined plant protection products coming from different sources was confirmed. It would be advantageous in the routine control of originality and also in the detection of counterfeit pesticides, respectively, among commercially available pesticides containing chlorpyrifos as an active ingredient.

  2. Modelling of Hydrophilic Interaction Liquid Chromatography Stationary Phases Using Chemometric Approaches

    Science.gov (United States)

    Ortiz-Villanueva, Elena; Tauler, Romà

    2017-01-01

    Metabolomics is a powerful and widely used approach that aims to screen endogenous small molecules (metabolites) of different families present in biological samples. The large variety of compounds to be determined and their wide diversity of physical and chemical properties have promoted the development of different types of hydrophilic interaction liquid chromatography (HILIC) stationary phases. However, the selection of the most suitable HILIC stationary phase is not straightforward. In this work, four different HILIC stationary phases have been compared to evaluate their potential application for the analysis of a complex mixture of metabolites, a situation similar to that found in non-targeted metabolomics studies. The obtained chromatographic data were analyzed by different chemometric methods to explore the behavior of the considered stationary phases. ANOVA-simultaneous component analysis (ASCA), principal component analysis (PCA) and partial least squares regression (PLS) were used to explore the experimental factors affecting the stationary phase performance, the main similarities and differences among chromatographic conditions used (stationary phase and pH) and the molecular descriptors most useful to understand the behavior of each stationary phase. PMID:29064436

  3. Chemical pattern of brazilian apples: a chemometric approach based on the Fuji and Gala varieties

    Directory of Open Access Journals (Sweden)

    Renato Giovanetti Vieira

    2011-06-01

    Full Text Available The chemical composition of apple juices may be used to discriminate between the varieties for consumption and those for raw material. Fuji and Gala have a chemical pattern that can be used for this classification. Multivariate methods correlate independent continuous chemical descriptors with the categorical apple variety. Three main descriptors of apple juice were selected: malic acid, total reducing sugar and total phenolic compounds. A chemometric approach, employing PCA and SIMCA, was used to classify apple juice samples. PCA was performed with 24 juices from Fuji and Gala, and SIMCA, with 15 juices. The exploratory and predictive models recognized 88% and 64%, respectively, as belonging to a mixed domain. The apple juice from commercial fruits shows a pattern related to cv. Fuji and Gala with boundaries from 0.18 to 0.389 g.100 mL-1 (malic acid), from 8.65 to 15.18 g.100 mL-1 (total reducing sugar) and from 100 to 400 mg.L-1 (total phenolic compounds), but such boundaries were slightly shorter in the remaining set of commercial apple juices, specifically from 0.16 to 0.36 g.100 mL-1, from 9.25 to 15.5 g.100 mL-1 and from 180 to 606 mg.L-1 for acidity, reducing sugar and phenolic compounds, respectively, representing the acid, sweet and bitter tastes.

  4. Chemometric expertise of the quality of groundwater sources for domestic use.

    Science.gov (United States)

    Spanos, Thomas; Ene, Antoaneta; Simeonova, Pavlina

    2015-01-01

    In the present study 49 representative sites have been selected for the collection of water samples from central water supplies with different geographical locations in the region of Kavala, Northern Greece. Ten physicochemical parameters (pH, electric conductivity, nitrate, chloride, sodium, potassium, total alkalinity, total hardness, bicarbonate and calcium) were analyzed monthly, in the period from January 2010 to December 2010. Chemometric methods were used for monitoring data mining and interpretation (cluster analysis, principal components analysis and source apportioning by principal components regression). The clustering of the chemical indicators delivers two major clusters related to the water hardness and the mineral components (impacted by sea, bedrock and acidity factors). The sampling locations are separated into three major clusters corresponding to the spatial distribution of the sites - coastal, lowland and semi-mountainous. The principal components analysis reveals two latent factors responsible for the data structures, which are also an indication for the sources determining the groundwater quality of the region (conditionally named "mineral" factor and "water hardness" factor). By the apportionment approach it is shown what the contribution is of each of the identified sources to the formation of the total concentration of each one of the chemical parameters. The mean values of the studied physicochemical parameters were found to be within the limits given in the 98/83/EC Directive. The water samples are appropriate for human consumption. The results of this study provide an overview of the hydrogeological profile of water supply system for the studied area.

  5. Genetic and Psychosocial Predictors of Aggression: Variable Selection and Model Building With Component-Wise Gradient Boosting

    Directory of Open Access Journals (Sweden)

    Robert Suchting

    2018-05-01

    Full Text Available Rationale: Given datasets with a large or diverse set of predictors of aggression, machine learning (ML) provides efficient tools for identifying the most salient variables and building a parsimonious statistical model. ML techniques permit efficient exploration of data, have not been widely used in aggression research, and may have utility for those seeking prediction of aggressive behavior. Objectives: The present study examined predictors of aggression and constructed an optimized model using ML techniques. Predictors were derived from a dataset that included demographic, psychometric and genetic predictors, specifically FK506 binding protein 5 (FKBP5) polymorphisms, which have been shown to alter response to threatening stimuli, but have not been tested as predictors of aggressive behavior in adults. Methods: The data analysis approach utilized component-wise gradient boosting and model reduction via backward elimination to: (a) select variables from an initial set of 20 to build a model of trait aggression; and then (b) reduce that model to maximize parsimony and generalizability. Results: From a dataset of N = 47 participants, component-wise gradient boosting selected 8 of 20 possible predictors to model Buss-Perry Aggression Questionnaire (BPAQ) total score, with R² = 0.66. This model was simplified using backward elimination, retaining six predictors: smoking status, psychopathy (interpersonal manipulation and callous affect), childhood trauma (physical abuse and neglect), and the FKBP5_13 gene (rs1360780). The six-factor model approximated the initial eight-factor model at 99.4% of R². Conclusions: Using an inductive data science approach, the gradient boosting model identified predictors consistent with previous experimental work in aggression; specifically psychopathy and trauma exposure. Additionally, allelic variants in FKBP5 were identified for the first time, but the relatively small sample size limits generality of results and calls for

  6. Genetic and Psychosocial Predictors of Aggression: Variable Selection and Model Building With Component-Wise Gradient Boosting.

    Science.gov (United States)

    Suchting, Robert; Gowin, Joshua L; Green, Charles E; Walss-Bass, Consuelo; Lane, Scott D

    2018-01-01

    Rationale: Given datasets with a large or diverse set of predictors of aggression, machine learning (ML) provides efficient tools for identifying the most salient variables and building a parsimonious statistical model. ML techniques permit efficient exploration of data, have not been widely used in aggression research, and may have utility for those seeking prediction of aggressive behavior. Objectives: The present study examined predictors of aggression and constructed an optimized model using ML techniques. Predictors were derived from a dataset that included demographic, psychometric and genetic predictors, specifically FK506 binding protein 5 (FKBP5) polymorphisms, which have been shown to alter response to threatening stimuli, but have not been tested as predictors of aggressive behavior in adults. Methods: The data analysis approach utilized component-wise gradient boosting and model reduction via backward elimination to: (a) select variables from an initial set of 20 to build a model of trait aggression; and then (b) reduce that model to maximize parsimony and generalizability. Results: From a dataset of N = 47 participants, component-wise gradient boosting selected 8 of 20 possible predictors to model Buss-Perry Aggression Questionnaire (BPAQ) total score, with R² = 0.66. This model was simplified using backward elimination, retaining six predictors: smoking status, psychopathy (interpersonal manipulation and callous affect), childhood trauma (physical abuse and neglect), and the FKBP5_13 gene (rs1360780). The six-factor model approximated the initial eight-factor model at 99.4% of R². Conclusions: Using an inductive data science approach, the gradient boosting model identified predictors consistent with previous experimental work in aggression; specifically psychopathy and trauma exposure. Additionally, allelic variants in FKBP5 were identified for the first time, but the relatively small sample size limits generality of results and calls for
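
    Component-wise gradient boosting, the variable selection step named in this abstract, can be sketched in a few lines for a squared-error loss. The sketch below is an illustration only and not the study's code: the function name, step size `nu`, number of steps, and the synthetic data are assumptions, and the backward-elimination stage is not shown. At each step every predictor is fitted alone to the current residuals, and only the best-fitting one has its coefficient updated, so predictors that are never chosen keep a coefficient of exactly zero.

```python
import random

def componentwise_l2_boost(X, y, n_steps=100, nu=0.1):
    """Component-wise L2 boosting with simple linear base learners.
    Returns one coefficient per (mean-centred) predictor column."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    sxx = [sum(v * v for v in col) for col in cols]
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]
    coefs = [0.0] * p
    for _ in range(n_steps):
        best_j, best_beta, best_gain = None, 0.0, -1.0
        for j, col in enumerate(cols):
            if sxx[j] == 0.0:
                continue  # constant column carries no information
            sxr = sum(v * r for v, r in zip(col, resid))
            gain = sxr * sxr / sxx[j]  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_beta, best_gain = j, sxr / sxx[j], gain
        if best_j is None:
            break
        # Shrunken update of the single selected coefficient.
        coefs[best_j] += nu * best_beta
        resid = [r - nu * best_beta * v for r, v in zip(resid, cols[best_j])]
    return coefs

# Synthetic data (assumed): only predictors 0 and 2 influence the outcome.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
y = [3.0 * row[0] - 2.0 * row[2] + rng.gauss(0, 0.5) for row in X]

coefs = componentwise_l2_boost(X, y)
ranked = sorted(range(len(coefs)), key=lambda j: -abs(coefs[j]))
print(ranked[:2], [round(c, 2) for c in coefs])
```

    Stopping early (a small `n_steps`) is what makes the procedure act as a variable selector; with many steps, noise predictors may also pick up small non-zero coefficients.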

  7. THE HOST GALAXY PROPERTIES OF VARIABILITY SELECTED AGN IN THE PAN-STARRS1 MEDIUM DEEP SURVEY

    Energy Technology Data Exchange (ETDEWEB)

    Heinis, S.; Gezari, S.; Kumar, S. [Department of Astronomy, University of Maryland, College Park, MD (United States); Burgett, W. S.; Flewelling, H.; Huber, M. E.; Kaiser, N.; Wainscoat, R. J.; Waters, C. [Institute for Astronomy, University of Hawaii at Manoa, Honolulu, HI 96822 (United States)

    2016-07-20

    We study the properties of 975 active galactic nuclei (AGNs) selected by variability in the Pan-STARRS1 Medium Deep Survey. Using complementary multi-wavelength data from the ultraviolet to the far-infrared, we use spectral energy distribution fitting to determine the AGN and host properties at z < 1 and compare to a well-matched control sample. We confirm the trend previously observed: that the variability amplitude decreases with AGN luminosity, but we also observe that the slope of this relation steepens with wavelength, resulting in a “redder when brighter” trend at low luminosities. Our results show that AGNs are hosted by more massive hosts than control sample galaxies, while the rest frame dust-corrected NUV-r color distribution of AGN hosts is similar to control galaxies. We find a positive correlation between the AGN luminosity and star formation rate (SFR), independent of redshift. AGN hosts populate the entire range of SFRs within and outside of the Main Sequence of star-forming galaxies. Comparing the distribution of AGN hosts and control galaxies, we show that AGN hosts are less likely to be hosted by quiescent galaxies and more likely to be hosted by Main Sequence or starburst galaxies.

  8. Relation of desert pupfish abundance to selected environmental variables in natural and manmade habitats in the Salton Sea basin

    Science.gov (United States)

    Martin, B.A.; Saiki, M.K.

    2005-01-01

    We assessed the relation between abundance of desert pupfish, Cyprinodon macularius, and selected biological and physicochemical variables in natural and manmade habitats within the Salton Sea Basin. Field sampling in a natural tributary, Salt Creek, and three agricultural drains captured eight species including pupfish (1.1% of the total catch), the only native species encountered. According to Bray-Curtis resemblance functions, fish species assemblages differed mostly between Salt Creek and the drains (i.e., the three drains had relatively similar species assemblages). Pupfish numbers and environmental variables varied among sites and sample periods. Canonical correlation showed that pupfish abundance was positively correlated with abundance of western mosquitofish, Gambusia affinis, and negatively correlated with abundance of porthole livebearers, Poeciliopsis gracilis, tilapias (Sarotherodon mossambica and Tilapia zillii), longjaw mudsuckers, Gillichthys mirabilis, and mollies (Poecilia latipinna and Poecilia mexicana). In addition, pupfish abundance was positively correlated with cover, pH, and salinity, and negatively correlated with sediment factor (a measure of sediment grain size) and dissolved oxygen. Pupfish abundance was generally highest in habitats where water quality extremes (especially high pH and salinity, and low dissolved oxygen) seemingly limited the occurrence of nonnative fishes. This study also documented evidence of predation by mudsuckers on pupfish. These findings support the contention of many resource managers that pupfish populations are adversely influenced by ecological interactions with nonnative fishes. © Springer 2005.

  9. Managing anthelmintic resistance-Variability in the dose of drug reaching the target worms influences selection for resistance?

    Science.gov (United States)

    Leathwick, Dave M; Luo, Dongwen

    2017-08-30

    The concentration profile of anthelmintic reaching the target worms in the host can vary between animals even when administered doses are tailored to individual liveweight at the manufacturer's recommended rate. Factors contributing to variation in drug concentration include weather, breed of animal, formulation and the route by which drugs are administered. The implications of this variability for the development of anthelmintic resistance was investigated using Monte-Carlo simulation. A model framework was established where 100 animals each received a single drug treatment. The 'dose' of drug allocated to each animal (i.e. the concentration-time profile of drug reaching the target worms) was sampled at random from a distribution of doses with mean m and standard deviation s. For each animal the dose of drug was used in conjunction with pre-determined dose-response relationships, representing single and poly-genetic inheritance, to calculate efficacy against susceptible and resistant genotypes. These data were then used to calculate the overall change in resistance gene frequency for the worm population as a result of the treatment. Values for m and s were varied to reflect differences in both mean dose and the variability in dose, and for each combination of these 100,000 simulations were run. The resistance gene frequency in the population after treatment increased as m decreased and as s increased. This occurred for both single and poly-gene models and for different levels of dominance (survival under treatment) of the heterozygote genotype(s). The results indicate that factors which result in lower and/or more variable concentrations of active reaching the target worms are more likely to select for resistance. The potential of different routes of anthelmintic administration to play a role in the development of anthelmintic resistance is discussed. Copyright © 2017 Elsevier B.V. All rights reserved.
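
    The sampling-and-pooling structure of the simulation described here (draw a dose per animal from a distribution with mean m and standard deviation s, apply genotype-specific dose-response curves, pool survivors, recompute the resistance gene frequency) can be sketched as follows. This is a simplified illustration, not the published model: the Hill-type dose-response curve, the ED50 values, the starting gene frequency, and the Hardy-Weinberg starting genotype frequencies are all hypothetical, and only the single-gene case is shown. Whether such a sketch reproduces the reported trends depends entirely on the assumed dose-response relationships.

```python
import random

def efficacy(dose, ed50, hill=4.0):
    """Hypothetical Hill-type dose-response: fraction of worms killed."""
    if dose <= 0.0:
        return 0.0
    return dose ** hill / (dose ** hill + ed50 ** hill)

def post_treatment_frequency(mean_dose, sd_dose, q0=0.1, n_animals=100, rng=None):
    """Resistance allele frequency among worms surviving one treatment."""
    rng = rng or random.Random(0)
    # Assumed ED50 per genotype (S = susceptible allele, R = resistant allele).
    ed50 = {"SS": 0.2, "RS": 0.5, "RR": 1.5}
    # Hardy-Weinberg genotype frequencies before treatment.
    start = {"SS": (1 - q0) ** 2, "RS": 2 * q0 * (1 - q0), "RR": q0 ** 2}
    survivors = {g: 0.0 for g in start}
    for _ in range(n_animals):
        # Dose reaching the worms in this animal; negative draws are set to 0.
        dose = max(0.0, rng.gauss(mean_dose, sd_dose))
        for g in start:
            survivors[g] += start[g] * (1.0 - efficacy(dose, ed50[g]))
    total = sum(survivors.values())
    return (survivors["RR"] + 0.5 * survivors["RS"]) / total

def mean_frequency(mean_dose, sd_dose, n_runs=200, seed=0):
    """Average the post-treatment frequency over repeated simulated flocks."""
    rng = random.Random(seed)
    runs = [post_treatment_frequency(mean_dose, sd_dose, rng=rng)
            for _ in range(n_runs)]
    return sum(runs) / n_runs

untreated = post_treatment_frequency(0.0, 0.0)   # no drug: frequency unchanged
uniform = post_treatment_frequency(1.0, 0.0)     # every animal gets dose 1.0
variable = mean_frequency(1.0, 0.3)              # same mean, variable dose
print(untreated, uniform, variable)
```

    Varying `mean_dose` and `sd_dose` over a grid, with many runs per combination, mirrors the design described in the abstract at a much smaller scale.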

  10. A selective review of the first 20 years of instrumental variables models in health-services research and medicine.

    Science.gov (United States)

    Cawley, John

    2015-01-01

    The method of instrumental variables (IV) is useful for estimating causal effects. Intuitively, it exploits exogenous variation in the treatment, sometimes called natural experiments or instruments. This study reviews the literature in health-services research and medical research that applies the method of instrumental variables, documents trends in its use, and offers examples of various types of instruments. A literature search of the PubMed and EconLit research databases for English-language journal articles published after 1990 yielded a total of 522 original research articles. Citation counts for each article were derived from the Web of Science. A selective review was conducted, with articles prioritized based on number of citations, validity and power of the instrument, and type of instrument. The average annual number of papers in health services research and medical research that apply the method of instrumental variables rose from 1.2 in 1991-1995 to 41.8 in 2006-2010. Commonly-used instruments (natural experiments) in health and medicine are relative distance to a medical care provider offering the treatment and the medical care provider's historic tendency to administer the treatment. Less common but still noteworthy instruments include randomization of treatment for reasons other than research, randomized encouragement to undertake the treatment, day of week of admission as an instrument for waiting time for surgery, and genes as an instrument for whether the respondent has a heritable condition. The use of the method of IV has increased dramatically in the past 20 years, and a wide range of instruments have been used. Applications of the method of IV have in several cases upended conventional wisdom that was based on correlations and led to important insights about health and healthcare. Future research should pursue new applications of existing instruments and search for new instruments that are powerful and valid.
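
    The intuition behind the IV method can be shown on simulated data with a single instrument. The sketch below is an illustration under stated assumptions, not an analysis from the review: the data-generating process, the effect sizes, and the variable names are invented. With one instrument z, treatment x, and outcome y, the IV estimate reduces to the ratio cov(z, y) / cov(z, x), whereas the ordinary regression slope cov(x, y) / var(x) is biased by the unobserved confounder.

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)

rng = random.Random(42)
n = 20000
true_effect = 2.0  # assumed causal effect of treatment on outcome

z, x, y = [], [], []
for _ in range(n):
    u = rng.gauss(0, 1)    # unobserved confounder (affects x and y)
    zi = rng.gauss(0, 1)   # instrument: affects x, independent of u
    xi = zi + u + rng.gauss(0, 1)
    yi = true_effect * xi + 3.0 * u + rng.gauss(0, 1)
    z.append(zi)
    x.append(xi)
    y.append(yi)

# Naive regression slope: contaminated by the confounder.
ols_slope = covariance(x, y) / covariance(x, x)
# IV (Wald / ratio) estimate: uses only the variation in x induced by z.
iv_slope = covariance(z, y) / covariance(z, x)

print(round(ols_slope, 2), round(iv_slope, 2))
```

    In this simulated setting the naive slope is pulled well above the assumed effect, while the IV estimate stays close to it; the result relies on the instrument being unrelated to the confounder, which real applications have to argue rather than assume.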

  11. Discrimination of Brazilian propolis according to the seasoning using chemometrics and machine learning based on UV-Vis scanning data.

    Science.gov (United States)

    Tomazzoli, Maíra M; Pai Neto, Remi D; Moresco, Rodolfo; Westphal, Larissa; Zeggio, Amelia R S; Specht, Leandro; Costa, Christopher; Rocha, Miguel; Maraschin, Marcelo

    2015-12-01

    Propolis is a chemically complex biomass produced by honeybees (Apis mellifera) from plant resins with added salivary enzymes, beeswax, and pollen. The biological activities described for propolis were also identified for donor plant's resin, but a big challenge for the standardization of the chemical composition and biological effects of propolis remains on a better understanding of the influence of seasonality on the chemical constituents of that raw material. Since propolis quality depends, among other variables, on the local flora which is strongly influenced by (a)biotic factors over the seasons, to unravel the harvest season effect on the propolis chemical profile is an issue of recognized importance. For that, fast, cheap, and robust analytical techniques seem to be the best choice for large scale quality control processes in the most demanding markets, e.g., human health applications. For that, UV-Visible (UV-Vis) scanning spectrophotometry of hydroalcoholic extracts (HE) of seventy-three propolis samples, collected over the seasons in 2014 (summer, spring, autumn, and winter) and 2015 (summer and autumn) in Southern Brazil was adopted. Further machine learning and chemometrics techniques were applied to the UV-Vis dataset aiming to gain insights as to the seasonality effect on the claimed chemical heterogeneity of propolis samples determined by changes in the flora of the geographic region under study. Descriptive and classification models were built following a chemometric approach, i.e. principal component analysis (PCA) and hierarchical clustering analysis (HCA) supported by scripts written in the R language. The UV-Vis profiles associated with chemometric analysis allowed identifying a typical pattern in propolis samples collected in the summer. Importantly, the discrimination based on PCA could be improved by using the dataset of the fingerprint region of phenolic compounds (λ = 280-400 nm), suggesting that besides the biological activities of those

  12. Genetic variability, partial regression, Co-heritability studies and their implication in selection of high yielding potato gen

    International Nuclear Information System (INIS)

    Iqbal, Z.M.; Khan, S.A.

    2003-01-01

    Partial regression coefficients, genotypic and phenotypic variabilities, heritability, co-heritability and genetic advance were studied in 15 potato varieties of exotic and local origin. Both genotypic and phenotypic coefficients of variation were high for scab and rhizoctonia incidence percentage. A significant partial regression coefficient for emergence percentage indicated its relative importance in tuber yield. High heritability (broad-sense) estimates coupled with high genetic advance for plant height, number of stems per plant and scab percentage revealed a substantial contribution of additive genetic variance in the expression of these traits. Hence, selection based on these characters could play a significant role in their improvement. Dominance and epistatic variance were more important for the expression of yield ha⁻¹, emergence and rhizoctonia percentage. This phenomenon is mainly due to the cumulative effects of low heritability and low to moderate genetic advance. The high co-heritability coupled with negative genotypic and phenotypic covariance revealed that selection of varieties having low scab and rhizoctonia percentage resulted in higher potato yield. (author)
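
The quantities named in this abstract follow standard quantitative-genetics definitions, which can be sketched as below. This is a minimal illustration, not the authors' computation: the variance components and trait mean are made-up numbers, and the selection differential k = 2.06 (5% selection intensity) is an assumed setting that the abstract does not state.

```python
import math


def genetic_parameters(var_genotypic, var_phenotypic, trait_mean, k=2.06):
    """Return GCV, PCV, broad-sense heritability and genetic advance.

    k is the standardized selection differential; 2.06 corresponds to
    5% selection intensity (an assumption, not taken from the abstract).
    """
    # Genotypic and phenotypic coefficients of variation, in percent.
    gcv = 100.0 * math.sqrt(var_genotypic) / trait_mean
    pcv = 100.0 * math.sqrt(var_phenotypic) / trait_mean
    # Broad-sense heritability: share of phenotypic variance that is genotypic.
    heritability = var_genotypic / var_phenotypic
    # Expected genetic advance, in trait units and as percent of the mean.
    advance = k * heritability * math.sqrt(var_phenotypic)
    advance_pct = 100.0 * advance / trait_mean
    return {
        "gcv": gcv,
        "pcv": pcv,
        "heritability": heritability,
        "genetic_advance": advance,
        "genetic_advance_pct": advance_pct,
    }


# Hypothetical variance components for a trait such as plant height (cm).
params = genetic_parameters(var_genotypic=36.0, var_phenotypic=48.0, trait_mean=60.0)
for name, value in params.items():
    print(f"{name}: {value:.2f}")
```

High heritability together with a high genetic advance (as percent of the mean) is the combination the abstract associates with additive genetic variance.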

  13. Risk estimates for hip fracture from clinical and densitometric variables and impact of database selection in Lebanese subjects.

    Science.gov (United States)

    Badra, Mohammad; Mehio-Sibai, Abla; Zeki Al-Hazzouri, Adina; Abou Naja, Hala; Baliki, Ghassan; Salamoun, Mariana; Afeiche, Nadim; Baddoura, Omar; Bulos, Suhayl; Haidar, Rachid; Lakkis, Suhayl; Musharrafieh, Ramzi; Nsouli, Afif; Taha, Assaad; Tayim, Ahmad; El-Hajj Fuleihan, Ghada

    2009-01-01

    Bone mineral density (BMD) and fracture incidence vary greatly worldwide. The data, if any, on clinical and densitometric characteristics of patients with hip fractures from the Middle East are scarce. The objective of the study was to define risk estimates from clinical and densitometric variables and the impact of database selection on such estimates. Clinical and densitometric information were obtained in 60 hip fracture patients and 90 controls. Hip fracture subjects were 74 yr (9.4) old, were significantly taller, lighter, and more likely to be taking anxiolytics and sleeping pills than controls. National Health and Nutrition Examination Survey (NHANES) database selection resulted in a higher sensitivity and almost equal specificity in identifying patients with a hip fracture compared with the Lebanese database. The odds ratio (OR) and its confidence interval (CI) for hip fracture per standard deviation (SD) decrease in total hip BMD was 2.1 (1.45-3.05) with the NHANES database, and 2.11 (1.36-2.37) when adjusted for age and body mass index (BMI). Risk estimates were higher in male compared with female subjects. In Lebanese subjects, BMD- and BMI-derived hip fracture risk estimates are comparable to western standards. The study validates the universal use of the NHANES database, and the applicability of BMD- and BMI-derived risk fracture estimates in the World Health Organization (WHO) global fracture risk model, to the Lebanese.
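
A short worked example may help in reading the "odds ratio per SD decrease" figure. The sketch below assumes the reported interval is a 95% Wald interval that is symmetric on the log-odds scale, which the abstract does not confirm; the function names are illustrative only.

```python
import math


def log_odds_from_or(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the log-odds coefficient and its standard error.

    Assumes a Wald interval of the form exp(beta +/- z * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


def odds_ratio_for_drop(odds_ratio_per_sd, sd_drop):
    """Odds ratio implied for a BMD that is `sd_drop` SDs lower."""
    return odds_ratio_per_sd ** sd_drop


# Values reported in the abstract for the NHANES database.
beta, se = log_odds_from_or(2.1, 1.45, 3.05)
or_two_sd = odds_ratio_for_drop(2.1, 2)

print(f"beta per SD decrease: {beta:.3f}, standard error: {se:.3f}")
print(f"implied odds ratio for a 2 SD decrease: {or_two_sd:.2f}")
```

Under a logistic model the odds ratios multiply, so a two-SD decrease corresponds to the per-SD odds ratio squared.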

  14. Dynamic surface-enhanced Raman spectroscopy and Chemometric methods for fast detection and intelligent identification of methamphetamine and 3, 4-Methylenedioxy methamphetamine in human urine

    Science.gov (United States)

    Weng, Shizhuang; Dong, Ronglu; Zhu, Zede; Zhang, Dongyan; Zhao, Jinling; Huang, Linsheng; Liang, Dong

    2018-01-01

    Conventional surface-enhanced Raman spectroscopy (SERS) for fast detection of drugs in urine on a portable Raman spectrometer remains challenging because of low sensitivity, unreliable Raman signals, and spectral processing that requires manual intervention. Here, we develop a novel method for detecting drugs in urine using chemometric methods and dynamic SERS (D-SERS) with mPEG-SH coated gold nanorods (GNRs). D-SERS combined with the uniform GNRs can obtain giant enhancement, and the signal is also highly reproducible. On the basis of these advantages, we obtained the spectra of urine, urine with methamphetamine (MAMP), and urine with 3,4-methylenedioxymethamphetamine (MDMA) using D-SERS. In addition, chemometric methods were introduced for the intelligent and automatic analysis of the spectra. First, the spectra at the critical state were selected using K-means. Then, the spectra were processed by random forest (RF) with feature selection and principal component analysis (PCA) to develop the recognition model. The identification accuracies of the model were 100%, 98.7% and 96.7%, respectively. To further validate the method in practice, urine samples from drug abusers with 0.4, 3 and 30 ppm MAMP were detected using D-SERS and identified by the classification model. The high recognition accuracy of > 92.0% can meet the demands of practical application. Additionally, the parameter optimization of the RF classification model was simple. Compared with the general laboratory method, acquiring urine spectra using D-SERS needs only 2 min and a 2 μL sample volume, and the identification of spectra based on chemometric methods can be finished in seconds. The results verify that the proposed approach can provide accurate, convenient and rapid detection of drugs in urine.

  15. NUMBER OF SUCCESSIVE CYCLES NECESSARY TO ACHIEVE STABILITY OF SELECTED GROUND REACTION FORCE VARIABLES DURING CONTINUOUS JUMPING

    Directory of Open Access Journals (Sweden)

    James M.W. Brownjohn

    2009-12-01

    Because of inherent variability in all human cyclical movements, such as walking, running and jumping, data collected across a single cycle might be atypical and potentially unable to represent an individual's generalized performance. The study described here was designed to determine the number of successive cycles of continuous, repetitive countermovement jumping which a test subject should perform in a single experimental session to achieve stability of the mean of the corresponding continuously measured ground reaction force (GRF) variables. Seven vertical GRF variables (period of jumping cycle, duration of contact phase, peak force amplitude and its timing, average rate of force development, average rate of force relaxation and impulse) were extracted on a cycle-by-cycle basis from vertical jumping force time histories generated by twelve participants who were jumping in response to regular electronic metronome beats in the range 2-2.8 Hz. Stability of the selected GRF variables across successive jumping cycles was examined for three jumping rates (2, 2.4 and 2.8 Hz) using two statistical methods: intra-class correlation (ICC) analysis and the segmental averaging technique (SAT). Results of the ICC analysis indicated that an average of four successive cycles (mean 4.5 ± 2.7 for 2 Hz; 3.9 ± 2.6 for 2.4 Hz; 3.3 ± 2.7 for 2.8 Hz) were necessary to achieve maximum ICC values. Except for jumping period, maximum ICC values ranged from 0.592 to 0.991 and all were significantly (p < 0.05) different from zero. Results of the SAT revealed that an average of ten successive cycles (mean 10.5 ± 3.5 for 2 Hz; 9.2 ± 3.8 for 2.4 Hz; 9.0 ± 3.9 for 2.8 Hz) were necessary to achieve stability of the selected parameters using criteria previously reported in the literature. Using 10 reference trials, the SAT required standard deviation criterion values of 0.49, 0.41 and 0.55 for 2 Hz, 2.4 Hz and 2.8 Hz jumping rates, respectively, in order to approximate
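
The idea of waiting until a running mean settles can be sketched as follows. This is only a cumulative-mean stability check in the spirit of the averaging technique named in the abstract; the authors' exact criterion is not given in this record, so the band width (`sd_fraction`), the use of all cycles as the reference, and the sample data are assumptions.

```python
import statistics


def cycles_to_stability(values, sd_fraction=0.25):
    """Return the first cycle count from which the cumulative mean stays
    within +/- sd_fraction * SD of the reference mean.

    The reference mean and SD are taken over all supplied cycles
    (an assumed convention, chosen for illustration).
    """
    ref_mean = statistics.fmean(values)
    band = sd_fraction * statistics.stdev(values)

    # Flag, for each cycle count n, whether the mean of the first n cycles
    # lies inside the tolerance band.
    inside = []
    running_sum = 0.0
    for n, value in enumerate(values, start=1):
        running_sum += value
        inside.append(abs(running_sum / n - ref_mean) <= band)

    # Walk backwards to find where the uninterrupted "inside" run begins.
    stable_from = len(values)
    for n in range(len(values), 0, -1):
        if not inside[n - 1]:
            break
        stable_from = n
    return stable_from


# Hypothetical peak force amplitudes (in body weights) for ten jump cycles.
peak_force = [2.9, 3.4, 3.1, 3.0, 3.2, 3.1, 3.05, 3.15, 3.1, 3.1]
print("cycles needed:", cycles_to_stability(peak_force))
```

A narrower band (smaller `sd_fraction`) demands more cycles, which is consistent with the abstract's point that the required cycle count depends on the criterion chosen.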

  16. Chemometric, physicomechanical and rheological analysis of the sol-gel dynamics and degree of crosslinking of glycosidic polymers

    International Nuclear Information System (INIS)

    Choonara, Y E; Pillay, V; Singh, N; Ndesendo, V M K; Khan, R A

    2008-01-01

    The influence of calcium (Ca²⁺), zinc (Zn²⁺) and barium (Ba²⁺) ions on the sol-gel interconversion dynamics, degree of crosslinking and the matrix resilience of crosslinked alginate gelispheres was determined. The dependent compositional and operational variables of crosslinking make it a challenging task to optimize the degree of crosslinking and the physicomechanical properties of alginate gelispheres. The combinatory approach of textural profiling, assessing pertinent rheological descriptors and chemometric model analysis of the sol-gel interconversion mechanisms and energy paradigms involved during crosslinking, hydration and erosion of gelispheres was explored. Molecular structural modelling of the gelispheres provided a mechanistic understanding of the sol-gel interconversion phenomena and their influence on the degree of crosslinking, the hydrational dynamics and gelisphere formation. Rheological analysis revealed offset yield point values of 6.1 mg ml⁻¹ and 8.0 mg ml⁻¹, computed from fitted regression curves for determining the crosslinker concentration required for combinatory crosslinkers such as Ca/Zn/Ba ions and Ba/Zn, respectively. The influence of hydration on the erosion was a direct function of the gelispheres' physicomechanical strength. Textural profiling characterized the gelisphere matrices for their resilience. The various crosslinkers interacted with monomeric units at varying intensities. Ba-crosslinked gelispheres were brittle with dense polymeric networks. Zn-crosslinked gelispheres produced permeable resilient matrices when hydrated and Ca-crosslinked gelispheres demonstrated intermediate resilience with greater G/M ratio alginate grades. Chemometrical analysis explicated a potential link between several phenomena such as the type of crosslinkers employed, the static shear-rate viscosity attained, the matrix resilience and the associated sol-gel mechanisms and energy paradigms of crosslinked gelispheres

  17. Rapid detection of Listeria monocytogenes in milk using confocal micro-Raman spectroscopy and chemometric analysis.

    Science.gov (United States)

    Wang, Junping; Xie, Xinfang; Feng, Jinsong; Chen, Jessica C; Du, Xin-jun; Luo, Jiangzhao; Lu, Xiaonan; Wang, Shuo

    2015-07-02

    Listeria monocytogenes is a facultatively anaerobic, Gram-positive, rod-shaped foodborne bacterium causing invasive infection, listeriosis, in susceptible populations. Rapid and high-throughput detection of this pathogen in dairy products is critical as milk and other dairy products have been implicated as food vehicles in several outbreaks. Here we evaluated confocal micro-Raman spectroscopy (785 nm laser) coupled with chemometric analysis to distinguish six closely related Listeria species, including L. monocytogenes, in both liquid media and milk. Raman spectra of different Listeria species and other bacteria (i.e., Staphylococcus aureus, Salmonella enterica and Escherichia coli) were collected to create two independent databases for detection in media and milk, respectively. Unsupervised chemometric models including principal component analysis and hierarchical cluster analysis were applied to differentiate L. monocytogenes from Listeria and other bacteria. To further evaluate the performance and reliability of unsupervised chemometric analyses, supervised chemometrics were performed, including two discriminant analyses (DA) and soft independent modeling of class analogies (SIMCA). By analyzing Raman spectra via two DA-based chemometric models, average identification accuracies of 97.78% and 98.33% for L. monocytogenes in media, and 95.28% and 96.11% in milk were obtained, respectively. SIMCA analysis also resulted in satisfactory average classification accuracies (over 93% in both media and milk). This Raman spectroscopy-based detection of L. monocytogenes in media and milk can be finished within a few hours and requires no extensive sample preparation. Copyright © 2015 Elsevier B.V. All rights reserved.

  18. Selective dopamine D3 receptor antagonism by SB-277011A attenuates cocaine reinforcement as assessed by progressive-ratio and variable-cost–variable-payoff fixed-ratio cocaine self-administration in rats

    OpenAIRE

    Xi, Zheng-Xiong; Gilbert, Jeremy G.; Pak, Arlene C.; Ashby, Charles R.; Heidbreder, Christian A.; Gardner, Eliot L.

    2005-01-01

    In rats, acute administration of SB-277011A, a highly selective dopamine (DA) D3 receptor antagonist, blocks cocaine-enhanced brain stimulation reward, cocaine-seeking behaviour and reinstatement of cocaine-seeking behaviour. Here, we investigated whether SB-277011A attenuates cocaine reinforcement as assessed by cocaine self-administration under variable-cost–variable-payoff fixed-ratio (FR) and progressive-ratio (PR) reinforcement schedules. Acute i.p. administration of SB-277011A (3–24 mg/...

  19. Near infrared spectroscopy combined with chemometrics for growth stage classification of cannabis cultivated in a greenhouse from seized seeds

    Science.gov (United States)

    Borille, Bruna Tassi; Marcelo, Marcelo Caetano Alexandre; Ortiz, Rafael Scorsatto; Mariotti, Kristiane de Cássia; Ferrão, Marco Flôres; Limberger, Renata Pereira

    2017-02-01

    Cannabis sativa L. (cannabis, Cannabaceae), popularly called marijuana, is one of the oldest plants known to man and it is the illicit drug most used worldwide. It also has been the subject of increasing discussions from the scientific and political points of view due to its medicinal properties. In recent years in Brazil, the form of cannabis drug trafficking has been changing and the Brazilian Federal Police has exponentially increased the number of seizures of cannabis seeds sent by mail. This new form of trafficking encouraged the study of seized cannabis seeds germinated in a greenhouse through NIR spectroscopy combined with chemometrics. The plants were cultivated in a homemade greenhouse under controlled conditions. In three different growth periods (5.5 weeks, 7.5 weeks and 10 weeks), they were harvested, dried, ground and directly analyzed. The iPCA was used to select the best NIR spectral range (4000-4375 cm⁻¹) in order to develop unsupervised and supervised methods. The PCA and HCA showed a good separation between the three groups of cannabis samples at different growth stages. The PLS-DA and SVM-DA classified the samples with good results in terms of sensitivity and specificity. The sensitivity and specificity for SVM-DA classification were equal to unity. This separation may be due to the correlation of cannabinoids and volatile compounds concentration during the growth of the cannabis plant. Therefore, the growth stage of cannabis can be predicted by NIR spectroscopy and chemometric tools in the early stages of indoor cannabis cultivation.

  20. Towards the determination of the geographical origin of yellow cake samples by laser-induced breakdown spectroscopy and chemometrics

    International Nuclear Information System (INIS)

    Sirven, J.B.; Pailloux, A.; M'Baye, Y.; Coulon, N.; Alpettaz, Th.; Gosse, St.

    2009-01-01

    Yellow cake is a commonly used name for powdered uranium concentrate, produced from uranium ore. It is the first step in the fabrication of nuclear fuel. As it contains fissile material, its circulation needs to be controlled in order to avoid proliferation. In particular there is an interest in onsite determination of the geographical origin of a sample. The yellow cake elemental composition depends on its production site and can therefore be used to identify its origin. In this work laser-induced breakdown spectroscopy (LIBS) associated with chemometric techniques is used to discriminate yellow cake samples of different geographical origin. Eleven samples, one per origin, are analyzed with commercial equipment in laboratory experimental conditions. Spectra are then processed by multivariate techniques like Principal Components Analysis (PCA) and Soft Independent Modeling of Class Analogy (SIMCA). Successive global PCAs are first performed on the whole spectra and enable one to discriminate all samples. The method is then refined by selecting several emission lines in the spectra and by using them as input data of the chemometric treatments. With a SIMCA model applied to these data a rate of correct identification of 100% is obtained for all classes. Then, to define the specifications of a future onsite LIBS system, the use of a more compact spectrometer is simulated by a numerical treatment of experimental spectra. Simultaneously the reduction of spectral data used by the model is also investigated to decrease the spectral bandwidth of the measurement. The rate of correct identification remains very high. This work shows the very good ability of SIMCA associated with LIBS to discriminate yellow cake samples with a very high rate of success, in controlled laboratory conditions. (authors)
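
The class-modelling idea behind SIMCA can be illustrated with a heavily simplified sketch: one principal component per class, found by power iteration, with an unknown sample assigned to the class whose model leaves the smallest residual. Full SIMCA implementations choose the number of components per class and apply statistical acceptance limits, both omitted here; the class labels and "emission line intensities" below are invented for illustration and are not data from the study.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def first_loading(centered, iterations=200):
    """Leading principal axis of centered data via power iteration on X^T X."""
    start = max(centered, key=lambda row: sum(c * c for c in row))
    norm = math.sqrt(sum(c * c for c in start))
    if norm == 0.0:  # degenerate class: all replicates identical
        p = len(start)
        return [1.0 / math.sqrt(p)] * p
    v = [c / norm for c in start]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        new_v = [sum(s * row[j] for s, row in zip(scores, centered))
                 for j in range(len(v))]
        new_norm = math.sqrt(sum(c * c for c in new_v))
        v = [c / new_norm for c in new_v]
    return v


def fit_class_model(spectra):
    """One-component PCA model (center, loading) for a single class."""
    center = mean_vector(spectra)
    centered = [[x - m for x, m in zip(row, center)] for row in spectra]
    return center, first_loading(centered)


def residual_distance(spectrum, model):
    """Distance from a spectrum to the class model's principal axis."""
    center, loading = model
    diff = [x - m for x, m in zip(spectrum, center)]
    score = sum(d * w for d, w in zip(diff, loading))
    return math.sqrt(sum((d - score * w) ** 2 for d, w in zip(diff, loading)))


def classify(spectrum, models):
    """Assign the class whose model gives the smallest residual."""
    return min(models, key=lambda name: residual_distance(spectrum, models[name]))


# Hypothetical replicate intensities of three emission lines for two origins.
training = {
    "A": [[9.0, 1.0, 1.0], [10.0, 1.1, 0.9], [11.0, 0.9, 1.1]],
    "B": [[1.0, 9.0, 1.0], [0.9, 10.0, 1.1], [1.1, 11.0, 0.9]],
}
models = {name: fit_class_model(rows) for name, rows in training.items()}
print(classify([10.5, 1.0, 1.0], models))
```

Because each class gets its own model, adding a new origin only requires fitting one more class model, which is one reason class-modelling methods suit origin identification.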

  1. A spatio-temporal nonparametric Bayesian variable selection model of fMRI data for clustering correlated time courses.

    Science.gov (United States)

    Zhang, Linlin; Guindani, Michele; Versace, Francesco; Vannucci, Marina

    2014-07-15

    In this paper we present a novel wavelet-based Bayesian nonparametric regression model for the analysis of functional magnetic resonance imaging (fMRI) data. Our goal is to provide a joint analytical framework that allows us to detect regions of the brain which exhibit neuronal activity in response to a stimulus and, simultaneously, infer the association, or clustering, of spatially remote voxels that exhibit fMRI time series with similar characteristics. We start by modeling the data with a hemodynamic response function (HRF) with a voxel-dependent shape parameter. We detect regions of the brain activated in response to a given stimulus by using mixture priors with a spike at zero on the coefficients of the regression model. We account for the complex spatial correlation structure of the brain by using a Markov random field (MRF) prior on the parameters guiding the selection of the activated voxels, thereby capturing correlation among nearby voxels. In order to infer association of the voxel time courses, we assume correlated errors, in particular long memory, and exploit the whitening properties of discrete wavelet transforms. Furthermore, we achieve clustering of the voxels by imposing a Dirichlet process (DP) prior on the parameters of the long memory process. For inference, we use Markov Chain Monte Carlo (MCMC) sampling techniques that combine Metropolis-Hastings schemes employed in Bayesian variable selection with sampling algorithms for nonparametric DP models. We explore the performance of the proposed model on simulated data, with both block- and event-related design, and on real fMRI data. Copyright © 2014 Elsevier Inc. All rights reserved.

  2. At-line determination of pharmaceuticals small molecule's blending end point using chemometric modeling combined with Fourier transform near infrared spectroscopy

    Science.gov (United States)

    Tewari, Jagdish; Strong, Richard; Boulas, Pierre

    2017-02-01

    This article summarizes the development and validation of a Fourier transform near infrared spectroscopy (FT-NIR) method for the rapid at-line prediction of active pharmaceutical ingredient (API) in a powder blend to optimize small molecule formulations. The method was used to determine the blend uniformity end-point for a pharmaceutical solid dosage formulation containing a range of API concentrations. A set of calibration spectra from samples with concentrations ranging from 1% to 15% of API (w/w) was collected at-line from 4000 to 12,500 cm⁻¹. The ability of the FT-NIR method to predict API concentration in the blend samples was validated against a reference high performance liquid chromatography (HPLC) method. The prediction efficiency of four multivariate data modeling methods, namely partial least-squares 1 (PLS1), partial least-squares 2 (PLS2), principal component regression (PCR) and artificial neural network (ANN), was compared using relevant multivariate figures of merit. The prediction ability of the regression models was cross-validated against results generated with the reference HPLC method. PLS1 and ANN showed superior prediction abilities compared to PLS2 and PCR. Based upon these results and because of its decreased complexity compared to ANN, PLS1 was selected as the best chemometric method to predict blend uniformity at-line. The FT-NIR measurement and the associated chemometric analysis were implemented in the production environment for rapid at-line determination of the end-point of the small molecule blending operation. Figure 1: Correlation coefficient vs. rank plot. Figure 2: FT-NIR spectra of different blending steps and the final blend. Figure 3: Prediction ability of PCR. Figure 4: Blend uniformity prediction ability of PLS2. Figure 5: Prediction efficiency of blend uniformity using ANN. Figure 6: Comparison of prediction efficiency of chemometric models. Table 1: Order of addition for blending steps.
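
To make the PLS1 calibration step concrete, the sketch below fits a single-component PLS1 model (one response variable) and scores it by root mean square error of prediction against reference values. It is an illustration under stated assumptions, not the authors' model: the absorbance values are synthetic, the number of latent variables is fixed at one, and the reference concentrations merely stand in for HPLC results.

```python
import math


def fit_pls1_one_component(X, y):
    """Fit a one-latent-variable PLS1 model; returns (x_mean, y_mean, w, q)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    Xc = [[v - m for v, m in zip(row, x_mean)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in spectral space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(c * c for c in w))
    w = [c / norm for c in w]
    # Scores of the calibration samples on the latent variable.
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    # Regression of the centered response on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, w, q


def predict(model, spectrum):
    """Predict the response for one spectrum."""
    x_mean, y_mean, w, q = model
    score = sum((a - m) * b for a, m, b in zip(spectrum, x_mean, w))
    return y_mean + q * score


def rmsep(predicted, reference):
    """Root mean square error of prediction against reference values."""
    return math.sqrt(
        sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)
    )


# Synthetic absorbances at three wavenumbers; API content in % (w/w).
reference_api = [1.0, 5.0, 10.0, 15.0]
spectra = [[0.1 * c, 0.05 * c + 0.2, 0.3] for c in reference_api]

model = fit_pls1_one_component(spectra, reference_api)
calibration_fit = [predict(model, s) for s in spectra]
print("RMSE on calibration set:", rmsep(calibration_fit, reference_api))
print("predicted API for a new blend:", predict(model, [0.8, 0.6, 0.3]))
```

On real spectra the number of latent variables would be chosen by cross-validation, and the error would be reported on samples held out from calibration.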

  3. Discrimination of sugarcane according to cultivar by 1H NMR and chemometric analyses

    Energy Technology Data Exchange (ETDEWEB)

    Alves Filho, Elenilson G.; Silva, Lorena M.A.; Choze, Rafael; Liao, Luciano M. [Laboratorio de Ressonancia Magnetica Nuclear, Instituto de Quimica, Universidade Federal de Goias (UFG), Goiania, GO (Brazil); Honda, Neli K.; Alcantara, Glaucia B. [Departamento de Quimica, Universidade Federal de Mato Grosso do Sul (UFMS), Campo Grande, MS (Brazil)

    2012-07-01

    Several technologies for the development of new sugarcane cultivars have mainly focused on increased productivity and greater disease resistance. Sugarcane cultivars are usually identified by the organography of the leaves and stems, the analysis of peroxidase and esterase isoenzyme activities and the total soluble protein as well as soluble solid content. Nuclear magnetic resonance (NMR) associated with chemometric analysis has proven to be a valuable tool for cultivar assessment. Thus, this article describes the potential of chemometric analysis applied to ¹H high resolution magic angle spinning (HRMAS) and NMR in solution for the investigation of sugarcane cultivars. For this purpose, leaves from eight different cultivars of sugarcane were investigated by ¹H NMR spectroscopy in combination with chemometric analysis. The approach proved to be a useful tool for the distinction and classification of different sugarcane cultivars as well as for assessing the differences in their chemical composition. (author)

  4. Using Variable Precision Rough Set for Selection and Classification of Biological Knowledge Integrated in DNA Gene Expression

    Directory of Open Access Journals (Sweden)

    Calvo-Dmgz D.

    2012-12-01

    DNA microarrays have contributed to the exponential growth of genomic and experimental data in the last decade. This large amount of gene expression data has been used by researchers seeking diagnosis of diseases like cancer using machine learning methods. In turn, explicit biological knowledge about gene functions has also grown tremendously over the last decade. This work integrates explicit biological knowledge, provided as gene sets, into the classification process by means of Variable Precision Rough Set Theory (VPRS). The proposed model is able to highlight which part of the provided biological knowledge has been important for classification. This paper presents a novel model for microarray data classification which is able to incorporate prior biological knowledge in the form of gene sets. Based on this knowledge, we transform the input microarray data into supergenes, and then we apply rough set theory to select the most promising supergenes and to derive a set of easily interpretable classification rules. The proposed model is evaluated on three breast cancer microarray datasets, obtaining successful results compared to classical classification techniques. The experimental results show that there are no significant differences between our model and classical techniques, but it is able to provide a biologically interpretable explanation of how it classifies new samples.

  5. Chromatography methods and chemometrics for determination of milk fat adulterants

    Science.gov (United States)

    Trbović, D.; Petronijević, R.; Đorđević, V.

    2017-09-01

    Milk and milk-based products are among the leading food categories according to reported cases of food adulteration. Although many authentication problems exist in all areas of the food industry, adequate control methods are required to evaluate the authenticity of milk and milk products in the dairy industry. Moreover, gas chromatography (GC) analysis of triacylglycerols (TAGs) or fatty acid (FA) profiles of milk fat (MF) in combination with multivariate statistical data processing have been used to detect adulterations of milk and dairy products with foreign fats. The adulteration of milk and butter is a major issue for the dairy industry. The major adulterants of MF are vegetable oils (soybean, sunflower, groundnut, coconut, palm and peanut oil) and animal fat (cow tallow and pork lard). Multivariate analysis enables adulterated MF to be distinguished from authentic MF, while taking into account many analytical factors. Various multivariate analysis methods have been proposed to quantitatively detect levels of adulterant non-MFs, with multiple linear regression (MLR) seemingly the most suitable. There is a need for increased use of chemometric data analyses to detect adulterated MF in foods and for their expanded use in routine quality assurance testing.

  6. Chemometric approach for prediction of uranium pathways in the soil

    International Nuclear Information System (INIS)

    Stojanovic, Mirjana; Nihajlovic, Marija; Petrovic, Jelena; Petrovic, Marija; Sostaric, Tanja; Milojkovic, Jelena; Pezo, Lato

    2014-01-01

    Understanding the effect of soil parameters (pH, Eh and organic and inorganic ligand availability) on uranium mobility under different geochemical conditions is fundamental for reliable prediction of its behaviour and fate in the environment. In this study, the impact of total and available phosphorus content, humus and acidity of Serbian agricultural soils on the content of total and available uranium was evaluated by Response Surface Methodology (RSM), second order polynomial regression models (SOPs) and artificial neural networks (ANNs). The performance of ANNs was compared with the performance of SOPs and experimental results. SOPs showed high coefficients of determination (0.785-0.956), while the ANN models showed high prediction accuracy (0.8893-0.904). According to the results, total and available uranium content in the soil were mostly affected by pH, statistically significant at the p < 0.05 level. For the same responses the total phosphorus was also found to be very influential, statistically significant at the p < 0.05 and p < 0.10 levels. Available phosphorus and humus had a much greater influence on total and available uranium content than total phosphorus content. The proposed chemometric approach will be very helpful in preserving natural resources and in practical risk assessment modeling of uranium environmental pathways.
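
A second order polynomial model and its coefficient of determination can be sketched for a single factor as follows. The study's models involve several soil parameters and their interactions; this one-factor version is a simplification for illustration, and the coded pH levels and uranium values are invented numbers, not data from the paper.

```python
def solve_linear(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    powers = [sum(x ** k for x in xs) for k in range(5)]
    a = [[powers[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    return solve_linear(a, b)


def r_squared(xs, ys, coef):
    """Coefficient of determination of the fitted quadratic."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum(
        (y - (coef[0] + coef[1] * x + coef[2] * x * x)) ** 2
        for x, y in zip(xs, ys)
    )
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical coded pH levels (-1 = most acidic) and available uranium (mg/kg).
coded_ph = [-1.0, -0.5, 0.0, 0.5, 1.0]
uranium = [0.82, 1.05, 1.21, 1.30, 1.33]

coef = fit_quadratic(coded_ph, uranium)
r2 = r_squared(coded_ph, uranium, coef)
print("b0, b1, b2:", [round(c, 4) for c in coef])
print("R^2:", round(r2, 4))
```

Working with coded factor levels, as is common in response surface designs, keeps the normal equations well conditioned and makes the coefficients directly comparable across factors.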

  7. Chemometric evaluation of trace elements in Brazilian medicinal plants

    International Nuclear Information System (INIS)

    Silva, Paulo S.C. da; Francisconi, Lucilaine S.; Goncalves, Rodolfo D.M.R.

    2013-01-01

    The growing interest in herbal medicines has required standardization in order to ensure their safe use, therapeutic efficacy and quality of the products. Despite the vast flora and the extensive use of medicinal plants by the Brazilian population, scientific studies on the subject are still insufficient. In this study, 59 medicinal plants were analyzed for the determination of As, Ba, Br, Ca, Cl, Cs, Co, Cr, Fe, Hf, K, Mg, Mn, Na, Rb, Sb, Sc, Se, Ta, Th, U, Zn and Zr by neutron activation analysis and Cu, Ni, Pb, Cd and Hg by atomic absorption. The results were analyzed by chemometric methods (correlation analysis, principal component analysis and cluster analysis) in order to verify whether or not there is similarity with respect to their mineral and trace metal contents. The results obtained made it possible to classify distinct groups among the analyzed plants and extracts, so that these data can be useful in future studies concerning the therapeutic action that the elements determined here may exert. (author)

  8. Chemometric evaluation of trace elements in Brazilian medicinal plants

    Energy Technology Data Exchange (ETDEWEB)

    Silva, Paulo S.C. da; Francisconi, Lucilaine S.; Goncalves, Rodolfo D.M.R., E-mail: pscsilva@ipen.br [Instituto de Pesquisas Energeticas e Nucleares (IPEN/CNEN-SP), Sao Paulo, SP (Brazil). Centro do Reator de Pesquisas

    2013-07-01

    The growing interest in herbal medicines has required standardization in order to ensure their safe use, therapeutic efficacy and quality of the products. Despite the vast flora and the extensive use of medicinal plants by the Brazilian population, scientific studies on the subject are still insufficient. In this study, 59 medicinal plants were analyzed for the determination of As, Ba, Br, Ca, Cl, Cs, Co, Cr, Fe, Hf, K, Mg, Mn, Na, Rb, Sb, Sc, Se, Ta, Th, U, Zn and Zr by neutron activation analysis and Cu, Ni, Pb, Cd and Hg by atomic absorption. The results were analyzed by chemometric methods (correlation analysis, principal component analysis and cluster analysis) in order to verify whether or not there is similarity with respect to their mineral and trace metal contents. The results obtained made it possible to classify distinct groups among the analyzed plants and extracts, so that these data can be useful in future studies concerning the therapeutic action that the elements determined here may exert. (author)

  9. Grape juice quality control by means of ¹H NMR spectroscopy and chemometric analyses

    Energy Technology Data Exchange (ETDEWEB)

    Grandizoli, Caroline Werner Pereira da Silva; Campos, Francinete Ramos; Simonelli, Fabio; Barison, Andersson [Universidade Federal do Paraná (UFPR), Curitiba (Brazil). Departamento de Química

    2014-07-01

    This work shows the application of ¹H NMR spectroscopy and chemometrics for quality control of grape juice. A wide range of quality assurance parameters were assessed by single ¹H NMR experiments acquired directly from juice. The investigation revealed that conditions and time of storage should be revised and indicated on all labels. The sterilization process of homemade grape juices was efficient, making it possible to store them for long periods without additives. Furthermore, chemometric analysis classified the best commercial grape juices to be similar to homemade grape juices, indicating that this approach can be used to determine the authenticity after adulteration. (author)

  10. Authenticity study of Phyllanthus species by NMR and FT-IR techniques coupled with chemometric methods

    International Nuclear Information System (INIS)

    Santos, Maiara S.; Pereira-Filho, Edenir R.; Ferreira, Antonio G.; Boffo, Elisangela F.; Figueira, Glyn M.

    2012-01-01

    The importance of medicinal plants and their use in industrial applications is increasing worldwide, especially in Brazil. Phyllanthus species, popularly known as 'quebra-pedras' in Brazil, are used in folk medicine for treating urinary infections and renal calculus. This paper reports an authenticity study, based on herbal drugs from Phyllanthus species, involving commercial and authentic samples using spectroscopic techniques: FT-IR, ¹H HR-MAS NMR and ¹H NMR in solution, combined with chemometric analysis. The spectroscopic techniques evaluated, coupled with chemometric methods, have great potential in the investigation of complex matrices. Furthermore, several metabolites were identified by the NMR techniques. (author)

  11. Authenticity study of Phyllanthus species by NMR and FT-IR Techniques coupled with chemometric methods

    Directory of Open Access Journals (Sweden)

    Maiara S. Santos

    2012-01-01

    The importance of medicinal plants and their use in industrial applications is increasing worldwide, especially in Brazil. Phyllanthus species, popularly known as "quebra-pedras" in Brazil, are used in folk medicine for treating urinary infections and renal calculus. This paper reports an authenticity study, based on herbal drugs from Phyllanthus species, involving commercial and authentic samples using spectroscopic techniques: FT-IR, ¹H HR-MAS NMR and ¹H NMR in solution, combined with chemometric analysis. The spectroscopic techniques evaluated, coupled with chemometric methods, have great potential in the investigation of complex matrices. Furthermore, several metabolites were identified by the NMR techniques.

  12. Authenticity study of Phyllanthus species by NMR and FT-IR techniques coupled with chemometric methods

    Energy Technology Data Exchange (ETDEWEB)

    Santos, Maiara S.; Pereira-Filho, Edenir R.; Ferreira, Antonio G. [Universidade Federal de Sao Carlos (UFSCAR), SP (Brazil). Dept. de Quimica]; Boffo, Elisangela F. [Universidade Federal da Bahia (UFBA), Salvador, BA (Brazil). Inst. de Quimica]; Figueira, Glyn M., E-mail: maiarassantos@yahoo.com.br [Universidade Estadual de Campinas (UNICAMP), Campinas, SP (Brazil). Centro Pluridisciplinar de Pesquisas Quimicas, Biologicas e Agricolas]

    2012-07-01

    The importance of medicinal plants and their use in industrial applications is increasing worldwide, especially in Brazil. Phyllanthus species, popularly known as 'quebra-pedras' in Brazil, are used in folk medicine for treating urinary infections and renal calculus. This paper reports an authenticity study, based on herbal drugs from Phyllanthus species, involving commercial and authentic samples using spectroscopic techniques: FT-IR, ¹H HR-MAS NMR and ¹H NMR in solution, combined with chemometric analysis. The spectroscopic techniques evaluated, coupled with chemometric methods, have great potential in the investigation of complex matrices. Furthermore, several metabolites were identified by the NMR techniques. (author)

  13. A review of near infrared spectroscopy and chemometrics in pharmaceutical technologies.

    Science.gov (United States)

    Roggo, Yves; Chalus, Pascal; Maurer, Lene; Lema-Martinez, Carmen; Edmond, Aurélie; Jent, Nadine

    2007-07-27

    Near-infrared spectroscopy (NIRS) is a fast and non-destructive analytical method. Associated with chemometrics, it becomes a powerful tool for the pharmaceutical industry. Indeed, NIRS is suitable for analysis of solid, liquid and biotechnological pharmaceutical forms. Moreover, NIRS can be implemented during pharmaceutical development, in production for process monitoring or in quality control laboratories. This review focuses on chemometric techniques and pharmaceutical NIRS applications. The following topics are covered: qualitative analyses, quantitative methods and on-line applications. Theoretical and practical aspects are described with pharmaceutical examples of NIRS applications.

  14. Detection of Genetically Modified Sugarcane by Using Terahertz Spectroscopy and Chemometrics

    Science.gov (United States)

    Liu, J.; Xie, H.; Zha, B.; Ding, W.; Luo, J.; Hu, C.

    2018-03-01

    A methodology is proposed to distinguish genetically modified sugarcane from non-genetically modified sugarcane by using terahertz spectroscopy and chemometrics techniques, including linear discriminant analysis (LDA), support vector machine-discriminant analysis (SVM-DA), and partial least squares-discriminant analysis (PLS-DA). The classification rates of the above-mentioned methods are compared, and different types of preprocessing are considered. According to the experimental results, the best option is PLS-DA, with an identification rate of 98%. The results indicated that THz spectroscopy and chemometrics techniques are a powerful tool to identify genetically modified and non-genetically modified sugarcane.

  16. Attempt to separate the fluorescence spectra of adrenaline and noradrenaline using chemometrics

    DEFF Research Database (Denmark)

    Nikolajsen, Rikke P; Hansen, Åse Marie; Bro, R

    2000-01-01

    An investigation was conducted on whether the fluorescence spectra of the very similar catecholamines adrenaline and noradrenaline could be separated using chemometric methods. The fluorescence landscapes (several excitation and emission spectra were measured) of two data sets with respectively 16...... regression (Unfold-PLSR) on the larger data set and parallel factor analysis (PARAFAC) of the six samples of the smaller set showed that there was no difference between the fluorescence landscapes of adrenaline and noradrenaline. It can be concluded that chemometric separation of adrenaline and noradrenaline...

  17. Experimental variability and data pre-processing as factors affecting the discrimination power of some chemometric approaches (PCA, CA and a new algorithm based on linear regression) applied to (+/-)ESI/MS and RPLC/UV data: Application on green tea extracts.

    Science.gov (United States)

    Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A

    2016-08-01

    The influence of the experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis -PCA- and Cluster Analysis -CA-) as well as a new algorithm based on linear regression was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extractions in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were directly applied to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by the linear regression analysis applied on pairs of very large experimental data series successfully retain information resulting from high frequency instrumental acquisition rates, obviously better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods. Mass spectra produced through ionization from liquid state in atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origins provided an excellent data
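
    The regression-based comparison described in this abstract can be illustrated with a minimal sketch, assuming two profiles sampled at the same points (for example two chromatograms). The function name compare_series and the sample values are hypothetical; this is ordinary least-squares regression of one series on the other, not the authors' algorithm.

```python
from math import sqrt


def compare_series(x, y):
    """Regress series y on series x; return (slope, intercept, r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Two hypothetical intensity profiles; identical shapes give r = 1.
profile_a = [1.0, 2.0, 3.0, 4.0]
profile_b = [3.0, 5.0, 7.0, 9.0]
slope, intercept, r = compare_series(profile_a, profile_b)
```

    Under this reading, each pair of profiles maps to one point (slope, intercept, r), and replicate measurements scatter around that point to form the volume the abstract mentions.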

  18. Improved Variable Selection Algorithm Using a LASSO-Type Penalty, with an Application to Assessing Hepatitis B Infection Relevant Factors in Community Residents

    Science.gov (United States)

    Guo, Pi; Zeng, Fangfang; Hu, Xiaomin; Zhang, Dingmei; Zhu, Shuming; Deng, Yu; Hao, Yuantao

    2015-01-01

    Objectives: In epidemiological studies, it is important to identify independent associations between collective exposures and a health outcome. The current stepwise selection technique ignores stochastic errors and suffers from a lack of stability. The alternative LASSO-penalized regression model can be applied to detect significant predictors from a pool of candidate variables. However, this technique is prone to false positives and tends to create excessive biases. It remains challenging to develop robust variable selection methods and enhance predictability. Material and methods: Two improved algorithms denoted the two-stage hybrid and bootstrap ranking procedures, both using a LASSO-type penalty, were developed for epidemiological association analysis. The performance of the proposed procedures and other methods including conventional LASSO, Bolasso, stepwise and stability selection models were evaluated using intensive simulation. In addition, methods were compared by using an empirical analysis based on large-scale survey data of hepatitis B infection-relevant factors among Guangdong residents. Results: The proposed procedures produced comparable or less biased selection results when compared to conventional variable selection models. In total, the two newly proposed procedures were stable with respect to various scenarios of simulation, demonstrating a higher power and a lower false positive rate during variable selection than the compared methods. In empirical analysis, the proposed procedures yielding a sparse set of hepatitis B infection-relevant factors gave the best predictive performance and showed that the procedures were able to select a more stringent set of factors. The individual history of hepatitis B vaccination, family and individual history of hepatitis B infection were associated with hepatitis B infection in the studied residents according to the proposed procedures. Conclusions: The newly proposed procedures improve the identification of
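
    For readers unfamiliar with the LASSO penalty that these procedures build on, the following is a minimal sketch of conventional LASSO fitted by coordinate descent, assuming the objective (1/(2n))·||y − Xb||² + lam·||b||₁. It is not the two-stage hybrid or bootstrap ranking procedure from the paper; the names lasso_cd and soft_threshold and the toy data are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Plain LASSO via cyclic coordinate descent (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            norm = sum(v * v for v in col) / n
            if norm == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that excludes j.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            new_b = soft_threshold(rho, lam) / norm
            delta = new_b - beta[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                beta[j] = new_b
    return beta


# Toy design with orthogonal columns: the weak second predictor is dropped.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]
coefficients = lasso_cd(X, y, lam=0.2)
```

    Variables whose coefficient is exactly zero are the ones the penalty deselects. A bootstrap-style procedure would, loosely speaking, repeat such a fit on resampled data and rank variables by how often they survive.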

  19. Variable Selection in Heterogeneous Datasets: A Truncated-rank Sparse Linear Mixed Model with Applications to Genome-wide Association Studies.

    Science.gov (United States)

    Wang, Haohan; Aragam, Bryon; Xing, Eric P

    2018-04-26

    A fundamental and important challenge in modern datasets of ever increasing dimensionality is variable selection, which has taken on renewed interest recently due to the growth of biological and medical datasets with complex, non-i.i.d. structures. Naïvely applying classical variable selection methods such as the Lasso to such datasets may lead to a large number of false discoveries. Motivated by genome-wide association studies in genetics, we study the problem of variable selection for datasets arising from multiple subpopulations, when this underlying population structure is unknown to the researcher. We propose a unified framework for sparse variable selection that adaptively corrects for population structure via a low-rank linear mixed model. Most importantly, the proposed method does not require prior knowledge of sample structure in the data and adaptively selects a covariance structure of the correct complexity. Through extensive experiments, we illustrate the effectiveness of this framework over existing methods. Further, we test our method on three different genomic datasets from plants, mice, and human, and discuss the knowledge we discover with our method. Copyright © 2018. Published by Elsevier Inc.

  20. Impact of strong selection for the PrP major gene on genetic variability of four French sheep breeds (Open Access publication)

    Directory of Open Access Journals (Sweden)

    Pantano Thais

    2008-11-01

    Effective selection on the PrP gene has been implemented since October 2001 in all French sheep breeds. After four years, the ARR "resistant" allele frequency increased by about 35% in young males. The aim of this study was to evaluate the impact of this strong selection on genetic variability. It is focussed on four French sheep breeds and based on the comparison of two groups of 94 animals within each breed: the first group of animals was born before the selection began, and the second, 3–4 years later. Genetic variability was assessed using genealogical and molecular data (29 microsatellite markers). The expected loss of genetic variability on the PrP gene was confirmed. Moreover, among the five markers located in the PrP region, only the three closest ones were affected. The evolution of the number of alleles, heterozygote deficiency within population, expected heterozygosity and the Reynolds distances agreed with the criteria from pedigree and pointed out that neutral genetic variability was not much affected. This trend depended on breed, i.e. on their initial states (population size, PrP frequencies) and on the selection strategies for improving scrapie resistance while carrying out selection for production traits.

  1. Exploring 5-nitrofuran derivatives against nosocomial pathogens: synthesis, antimicrobial activity and chemometric analysis.

    Science.gov (United States)

    Zorzi, Rodrigo Rocha; Jorge, Salomão Dória; Palace-Berl, Fanny; Pasqualoto, Kerly Fernanda Mesquita; Bortolozzo, Leandro de Sá; de Castro Siqueira, André Murillo; Tavares, Leoberto Costa

    2014-05-15

    The burden of nosocomial or health care-associated infection (HCAI) is increasing worldwide. According to the World Health Organization (WHO), it is several fold higher in low- and middle-income countries. Considering the multidrug-resistant infections, the development of new and more effective drugs is crucial. Herein, two series (I and II) of 5-nitrofuran derivatives were designed, synthesized and assayed against microorganisms, including Gram-positive and -negative bacteria, and fungi. The pathogens screened were directly related either to the most currently relevant HCAI or to multidrug-resistant infection caused by MRSA/VRSA strains, for instance. Sets I and II were composed of substituted-[N'-(5-nitrofuran-2-yl)methylene]benzhydrazide and 3-acetyl-5-(substituted-phenyl)-2-(5-nitro-furan-2-yl)-2,3-dihydro-1,3,4-oxadiazole compounds, respectively. The selection of the substituent groups was based upon physicochemical properties, such as hydrophobicity and electronic effect. The compounds showed better activity against Staphylococcus aureus, Escherichia coli, and Enterococcus faecalis. The findings from the S. aureus strain, which was more susceptible, were used to investigate the intersample and intervariable relationships by applying chemometric methods. It is noteworthy that the compound 4-butyl-[N'-(5-nitrofuran-2-yl)methylene]benzhydrazide showed a MIC value similar to that of vancomycin, which is the reference drug for multidrug-resistant S. aureus infections. Taken together, the findings suggest that the 5-nitrofuran derivatives might be considered promising hits for developing novel antimicrobial drugs against nosocomial infection. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Diagnosis of human malignancies using laser-induced breakdown spectroscopy in combination with chemometric methods

    Science.gov (United States)

    Chen, Xue; Li, Xiaohui; Yu, Xin; Chen, Deying; Liu, Aichun

    2018-01-01

    Diagnosis of malignancies is a challenging clinical issue. In this work, we present quick and robust diagnosis and discrimination of lymphoma and multiple myeloma (MM) using laser-induced breakdown spectroscopy (LIBS) conducted on human serum samples, in combination with chemometric methods. The serum samples collected from lymphoma and MM cancer patients and healthy controls were deposited on filter papers and ablated with a pulsed 1064 nm Nd:YAG laser. 24 atomic lines of Ca, Na, K, H, O, and N were selected for malignancy diagnosis. Principal component analysis (PCA), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and k nearest neighbors (kNN) classification were applied to build the malignancy diagnosis and discrimination models. The performances of the models were evaluated using 10-fold cross validation. The discrimination accuracy, confusion matrix and receiver operating characteristic (ROC) curves were obtained. The values of area under the ROC curve (AUC), sensitivity and specificity at the cut-points were determined. The kNN model exhibits the best performances with overall discrimination accuracy of 96.0%. Distinct discrimination between malignancies and healthy controls has been achieved with AUC, sensitivity and specificity for healthy controls all approaching 1. For lymphoma, the best discrimination performance values are AUC = 0.990, sensitivity = 0.970 and specificity = 0.956. For MM, the corresponding values are AUC = 0.986, sensitivity = 0.892 and specificity = 0.994. The results show that the serum-LIBS technique can serve as a quick, less invasive and robust method for diagnosis and discrimination of human malignancies.
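
    The per-class sensitivity and specificity values quoted in this abstract can be read as one-vs-rest quantities. The sketch below assumes that reading; the function name and the five toy labels are hypothetical and are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for the class `positive`."""
    tp = fn = tn = fp = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive:
            if guess == positive:
                tp += 1  # positive sample correctly detected
            else:
                fn += 1  # positive sample missed
        else:
            if guess == positive:
                fp += 1  # other class wrongly flagged as positive
            else:
                tn += 1  # other class correctly not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels for five serum samples.
truth = ["lymphoma", "lymphoma", "MM", "healthy", "healthy"]
guess = ["lymphoma", "MM", "MM", "healthy", "lymphoma"]
sens, spec = sensitivity_specificity(truth, guess, positive="lymphoma")
```

    In a 10-fold cross-validation setting, such counts would typically be accumulated over the held-out folds before the ratios are taken.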

  3. Chemometric approach for development, optimization, and validation of different chromatographic methods for separation of opium alkaloids.

    Science.gov (United States)

    Acevska, J; Stefkov, G; Petkovska, R; Kulevanova, S; Dimitrovska, A

    2012-05-01

    The excessive and continuously growing interest in the simultaneous determination of poppy alkaloids imposes the development and optimization of convenient high-throughput methods for the assessment of the qualitative and quantitative profile of alkaloids in poppy straw. Systematic optimization of two chromatographic methods (gas chromatography (GC)/flame ionization detector (FID)/mass spectrometry (MS) and reversed-phase (RP)-high-performance liquid chromatography (HPLC)/diode array detector (DAD)) for the separation of alkaloids from Papaver somniferum L. (Papaveraceae) was carried out. The effects of various conditions on the predefined chromatographic descriptors were investigated using chemometrics. A full factorial linear design of experiments for determining the relationship between chromatographic conditions and the retention behavior of the analytes was used. Central composite circumscribed design was utilized for the final method optimization. By conducting the optimization of the methods in a very rational manner, a great deal of excessive and unproductive laboratory research work was avoided. The developed chromatographic methods were validated and compared in line with the resolving power, sensitivity, accuracy, speed, cost, ecological aspects, and compatibility with the poppy straw extraction procedure. The separation of the opium alkaloids using the GC/FID/MS method was achieved within 10 min, avoiding any derivatization step. This method has a stronger resolving power, a shorter analysis time, and a better cost/effectiveness factor than the RP-HPLC/DAD method and is in line with the "green trend" of the analysis. The RP-HPLC/DAD method on the other hand displayed better sensitivity for all tested alkaloids. The proposed methods provide both fast screening and an accurate content assessment of the six alkaloids in the poppy samples obtained from the selection program of Papaver strains.

  4. Energy-efficient relay selection and optimal power allocation for performance-constrained dual-hop variable-gain AF relaying

    KAUST Repository

    Zafar, Ammar; Radaydeh, Redha Mahmoud Mesleh; Chen, Yunfei; Alouini, Mohamed-Slim

    2013-01-01

    This paper investigates the energy-efficiency enhancement of a variable-gain dual-hop amplify-and-forward (AF) relay network utilizing selective relaying. The objective is to minimize the total consumed power while keeping the end-to-end signal

  5. Chemometric Analysis of High Molecular Mass Glutenin Subunits and Image Data of Bread Crumb Structure from Croatian Wheat Cultivars

    Directory of Open Access Journals (Sweden)

    Zorica Jurković

    2002-01-01

    The aim of this work is to investigate functional relationships among wheat properties, high molecular mass (weight) (HMW) glutenin subunits and bread quality produced from eleven Croatian wheat cultivars by chemometric analysis. HMW glutenin subunits were fractionated by sodium dodecylsulfate polyacrylamide gel electrophoresis (SDS-PAGE) and subsequently analysed by scanning densitometry in order to quantify HMW glutenin fractions. Wheat properties are characterised by four variables: protein content, sedimentation value, wet gluten and gluten index. Bread quality is assessed by the standard measurement of loaf volume, and visual quality of bread slice is quantified by 8 parameters by the use of computer image analysis. The data matrix with 21 columns (measured variables) and 11 rows (cultivars) is analysed for determination of number of latent variables. It was found that the first two latent variables account for 92, 85 and 87 % of variance of wheat quality properties, HMW glutenin fractions, and the bread quality parameters, respectively. Classification and functional relationships are discussed from the case data (cultivars) and variable projections to the planes of the first two latent variables. The strongest positive correlations are found between Glu-D1y proportion and the bread quality parameters (the standard parameter loaf volume and the bread crumb cell area fraction determined by image analysis): r = 0.651 and r = 0.885, respectively. The strongest negative correlations are found between Glu-B1x proportion and the bread quality parameters: r = -0.535 and r = -0.841, respectively. The results are discussed in view of possible development of new and improvement of existing wheat cultivars and optimisation of bread production.

  6. An explorative chemometric approach applied to hyperspectral images for the study of illuminated manuscripts

    Science.gov (United States)

    Catelli, Emilio; Randeberg, Lise Lyngsnes; Alsberg, Bjørn Kåre; Gebremariam, Kidane Fanta; Bracci, Silvano

    2017-04-01

    Hyperspectral imaging (HSI) is a fast non-invasive imaging technology recently applied in the field of art conservation. With the help of chemometrics, important information about the spectral properties and spatial distribution of pigments can be extracted from HSI data. With the intent of expanding the applications of chemometrics to the interpretation of hyperspectral images of historical documents, and, at the same time, to study the colorants and their spatial distribution on ancient illuminated manuscripts, an explorative chemometric approach is here presented. The method makes use of chemometric tools for spectral de-noising (minimum noise fraction (MNF)) and image analysis (multivariate image analysis (MIA) and iterative key set factor analysis (IKSFA)/spectral angle mapper (SAM)) which have given an efficient separation, classification and mapping of colorants from visible-near-infrared (VNIR) hyperspectral images of an ancient illuminated fragment. The identification of colorants was achieved by extracting and interpreting the VNIR spectra as well as by using a portable X-ray fluorescence (XRF) spectrometer.
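
    Of the tools named in this abstract, the spectral angle mapper (SAM) is simple enough to sketch: each pixel spectrum is assigned to the reference spectrum with which it forms the smallest angle. The example below is a minimal illustration under that standard definition; the function names, the two-band spectra and the colorant labels are hypothetical.

```python
from math import acos, sqrt


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return acos(cosine)


def classify_pixel(pixel, references):
    """Return the name of the reference spectrum closest in angle."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))


# Hypothetical two-band reference spectra for two colorants.
references = {"red": [1.0, 0.0], "blue": [0.0, 1.0]}
label = classify_pixel([0.9, 0.1], references)
```

    Because the angle ignores vector length, a pixel and a brighter or darker copy of it are treated as the same material, which is one reason SAM is popular for mapping pigments under uneven illumination.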

  7. [Application of chemometrics in composition-activity relationship research of traditional Chinese medicine].

    Science.gov (United States)

    Han, Sheng-Nan

    2014-07-01

    Chemometrics is a new branch of chemistry which is widely applied to various fields of analytical chemistry. Chemometrics can use theories and methods of mathematics, statistics, computer science and other related disciplines to optimize the chemical measurement process and maximize access to chemical information and other information on material systems by analyzing chemical measurement data. In recent years, traditional Chinese medicine has attracted widespread attention. In the research of traditional Chinese medicine, a key problem has been how to interpret the relationship between the various chemical components and efficacy, which seriously restricts the modernization of Chinese medicine. As chemometrics brings multivariate analysis methods into chemical research, it has been applied as an effective research tool in the composition-activity relationship research of Chinese medicine. This article reviews the applications of chemometric methods in composition-activity relationship research in recent years. The applications of multivariate statistical analysis methods (such as regression analysis, correlation analysis, principal component analysis, etc.) and artificial neural networks (such as back propagation artificial neural network, radial basis function neural network, support vector machine, etc.) are summarized, including the brief fundamental principles, the research contents and the advantages and disadvantages. Finally, the main existing problems and prospects for future research are discussed.

  8. Chemometrics for ion mobility spectrometry data: recent advances and future prospects

    NARCIS (Netherlands)

    Szymanska, E.; Davies, Antony N.; Buydens, L.M.C.

    2016-01-01

    Historically, advances in the field of ion mobility spectrometry have been hindered by the variation in measured signals between instruments developed by different research laboratories or manufacturers. This has triggered the development and application of chemometric techniques able to reveal and

  9. Big (Bio)Chemical Data Mining Using Chemometric Methods: A Need for Chemists.

    Science.gov (United States)

    Tauler, Roma; Parastar, Hadi

    2018-03-23

    This review aims to demonstrate abilities to analyze Big (Bio)Chemical Data (BBCD) with multivariate chemometric methods and to show some of the more important challenges of modern analytical research. In this review, the capabilities and versatility of chemometric methods will be discussed in light of the BBCD challenges that are being encountered in chromatographic, spectroscopic and hyperspectral imaging measurements, with an emphasis on their application to omics sciences. In addition, insights and perspectives on how to address the analysis of BBCD are provided along with a discussion of the procedures necessary to obtain more reliable qualitative and quantitative results. In this review, the importance of Big Data and of their relevance to (bio)chemistry are first discussed. Then, analytical tools which can produce BBCD are presented, together with some basics needed to understand the prospects and limitations of chemometric techniques when they are applied to BBCD. Finally, the significance of the combination of chemometric approaches with BBCD analysis in different chemical disciplines is highlighted with some examples. In this paper, we have tried to cover some of the applications of big data analysis in the (bio)chemistry field. However, this coverage is not exhaustive and does not include everything done in the field. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Experimental Design, Near-Infrared Spectroscopy, and Multivariate Calibration: An Advanced Project in a Chemometrics Course

    Science.gov (United States)

    de Oliveira, Rodrigo R.; das Neves, Luiz S.; de Lima, Kassio M. G.

    2012-01-01

    A chemometrics course is offered to students in their fifth semester of the chemistry undergraduate program that includes an in-depth project. Students carry out the project over five weeks (three 8-h sessions per week) and conduct it in parallel to other courses or other practical work. The students conduct a literature search, carry out…

  11. The dimerization study of some cationic monomethine cyanine dyes by chemometrics method

    Czech Academy of Sciences Publication Activity Database

    Ahmadi, S.; Deligeorgiev, T.G.; Vasilev, A.; Kubista, Mikael

    2012-01-01

    Vol. 86, No. 13 (2012), pp. 1974-1981. ISSN 0036-0244. Institutional research plan: CEZ:AV0Z50520701. Keywords: dimerization; chemometrics; UV-vis spectroscopy; monomethine cyanine dyes. Subject RIV: CF - Physical; Theoretical Chemistry. Impact factor: 0.386, year: 2012

  12. Use of chemometric and quantum-mechanical methods in the analysis of bioactive terpenoids and phenylpropanoids against the Aedes aegypti

    Directory of Open Access Journals (Sweden)

    Reginaldo Bezerra dos Santos

    2010-01-01

    Dengue fever is one of the main public health problems in the world. Many mosquitoes have developed resistance to the conventional insecticides used. Thus, the search for vegetable extracts and natural substances as alternative insecticides has increased. In this study, chemometric methods were employed to classify a group of terpenoid and phenylpropanoid compounds with biological activity against the larvae of A. aegypti mosquitoes. The AM1 (Austin Model 1) method was used to calculate a set of molecular descriptors (properties) for the studied compounds. Then, the descriptors were analyzed using the following methods of pattern recognition: Principal Component Analysis (PCA) and Hierarchical Clustering Analysis (HCA). The PCA and HCA methods were shown to be very effective for the classification of the studied compounds into two groups (active and inactive). The electronic variables EHOMO-1, EHOMO-2, ELUMO, ELUMO+2, and the structural variable LogP were used to classify the compounds as active or inactive. For most of the studied compounds, the variables responsible for separating active from inactive compounds were electronic descriptors. Thus, it can be concluded that electronic effects play a fundamental role in the interaction between the biological receptor and terpenoid and phenylpropanoid compounds with activity against larval A. aegypti mosquitoes.

  13. Control of Variability in the Performance of Selective Laser Melting (SLM) Parts through Microstructure Control and Design

    Data.gov (United States)

    National Aeronautics and Space Administration — The high variability and low repeatability of metal parts produced using Additive Manufacturing (AM) represent a major barrier in getting AM into the mainstream....

  14. Non-destructive geographical traceability of sea cucumber (Apostichopus japonicus) using near infrared spectroscopy combined with chemometric methods.

    Science.gov (United States)

    Guo, Xiuhan; Cai, Rui; Wang, Shisheng; Tang, Bo; Li, Yueqing; Zhao, Weijie

    2018-01-01

    Sea cucumber is the major tonic seafood worldwide, and geographical origin traceability is an important part of its quality and safety control. In this work, a non-destructive method for origin traceability of sea cucumber (Apostichopus japonicus) from northern China Sea and East China Sea using near infrared spectroscopy (NIRS) and multivariate analysis methods was proposed. Total fat contents of 189 fresh sea cucumber samples were determined and partial least-squares (PLS) regression was used to establish the quantitative NIRS model. The ordered predictor selection algorithm was performed to select feasible wavelength regions for the construction of PLS and identification models. The identification model was developed by principal component analysis combined with Mahalanobis distance and scaling to the first range algorithms. In the test set of the optimum PLS models, the root mean square error of prediction was 0.45, and the correlation coefficient was 0.90. Correct classification rates of 100% were obtained in both the identification calibration model and the test model. The overall results indicated that the NIRS method combined with chemometric analysis was a suitable tool for origin traceability and identification of fresh sea cucumber samples from nine origins in China.
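
    The root mean square error of prediction (RMSEP) reported in this abstract is the square root of the mean squared difference between reference and predicted values on a test set. The sketch below shows that calculation only; the function name and the numbers are hypothetical and are not taken from the study.

```python
from math import sqrt


def rmsep(reference, predicted):
    """Root mean square error of prediction over a test set."""
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical reference and model-predicted values for three test samples.
error = rmsep([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

    RMSEP is expressed in the units of the predicted property, so it should be judged against the range of reference values in the test set.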

  15. Detecting correlation between allele frequencies and environmental variables as a signature of selection. A fast computational approach for genome-wide studies

    DEFF Research Database (Denmark)

    Guillot, Gilles; Vitalis, Renaud; Rouzic, Arnaud le

    2014-01-01

    to disentangle the potential effect of environmental variables from the confounding effect of population history. For the routine analysis of genome-wide datasets, one also needs fast inference and model selection algorithms. We propose a method based on an explicit spatial model which is an instance of spatial...... for the most common types of genetic markers, obtained either at the individual or at the population level. Analyzing the simulated data produced under a geostatistical model then under an explicit model of selection, we show that the method is efficient. We also re-analyze a dataset relative to nineteen pine...

  16. A flow system for generation of concentration perturbation in two-dimensional correlation near-infrared spectroscopy: application to variable selection in multivariate calibration.

    Science.gov (United States)

    Pereira, Claudete Fernandes; Pasquini, Celio

    2010-05-01

    A flow system is proposed to produce a concentration perturbation in liquid samples, aiming at the generation of two-dimensional correlation near-infrared spectra. The system presents advantages in relation to batch systems employed for the same purpose: the experiments are accomplished in a closed system; application of perturbation is rapid and easy; and the experiments can be carried out with micro-scale volumes. The perturbation system has been evaluated in the investigation and selection of relevant variables for multivariate calibration models for the determination of quality parameters of gasoline, including ethanol content, MON (motor octane number), and RON (research octane number). The main advantage of this variable selection approach is the direct association between spectral features and chemical composition, allowing easy interpretation of the regression models.

  17. Combination of Analytical and Chemometric Methods as a Useful Tool for the Characterization of Extra Virgin Argan Oil and Other Edible Virgin Oils. Role of Polyphenols and Tocopherols.

    Science.gov (United States)

    Rueda, Ascensión; Samaniego-Sánchez, Cristina; Olalla, Manuel; Giménez, Rafael; Cabrera-Vique, Carmen; Seiquer, Isabel; Lara, Luis

    2016-01-01

    Analysis of phenolic profile and tocopherol fractions in conjunction with chemometrics techniques were used for the accurate characterization of extra virgin argan oil and eight other edible vegetable virgin oils (olive, soybean, wheat germ, walnut, almond, sesame, avocado, and linseed) and to establish similarities among them. Phenolic profile and tocopherols were determined by HPLC coupled with diode-array and fluorescence detectors, respectively. Multivariate factor analysis (MFA) and linear correlations were applied. Significant negative correlations were found between tocopherols and some of the polyphenols identified, but more intensely (P tocopherol and oleuropein, pinoresinol, and luteolin. MFA revealed that tocopherols, especially γ-fraction, most strongly influenced the oil characterization. Among the phenolic compounds, syringic acid, dihydroxybenzoic acid, oleuropein, pinoresinol, and luteolin also contributed to the discrimination of the oils. According to the variables analyzed in the present study, argan oil presented the greatest similarity with walnut oil, followed by sesame and linseed oils. Olive, avocado, and almond oils showed close similarities.

  18. An Analysis of the Effectiveness of Supplemental Instruction: The Problem of Selection Bias and Limited Dependent Variables

    Science.gov (United States)

    Bowles, Tyler J.; Jones, Jason

    2004-01-01

    Single equation regression models have been used rather extensively to test the effectiveness of Supplemental Instruction (SI). This approach, however, fails to account for the possibility that SI attendance and the outcome of SI attendance are jointly determined endogenous variables. Moreover, the standard approach fails to account for the fact…

  19. An Application of Supervised Learning Methods to Search for Variable Stars in a Selected Field of the VVV Survey

    Science.gov (United States)

    Rodríguez-Feliciano, B.; García-Varela, A.; Pérez-Ortiz, M. F.; Sabogal, B. E.; Minniti, D.

    2017-07-01

    We characterize properties of time series of variable stars in the B278 field of the VVV survey, using robust statistics. Using random forest and support vector machine classifiers, we propose 47 RR Lyrae candidates and 12 W Ursae Majoris eclipsing binary candidates.

  20. Clonal variability for water use efficiency and carbon isotope discrimination (Δ13C) in selected clones of a few Eucalyptus species

    CSIR Research Space (South Africa)

    Mohan Raju, B

    2011-11-01

    Full Text Available and develop high water use efficient clones to cultivate under water limited environments. The major objective was to assess the eucalyptus clones for variability in WUE and to determine the relationship between WUE and carbon isotope discrimination ( 13C...

  1. Positive selection in the chromosome 16 VKORC1 genomic region has contributed to the variability of anticoagulant response in humans.

    Directory of Open Access Journals (Sweden)

    Blandine Patillon

    Full Text Available VKORC1 (vitamin K epoxide reductase complex subunit 1, 16p11.2) is the main genetic determinant of human response to oral anticoagulants of antivitamin K type (AVK). This gene was recently suggested to be a putative target of positive selection in East Asian populations. In this study, we genotyped the HGDP-CEPH Panel for six VKORC1 SNPs and downloaded chromosome 16 genotypes from the HGDP-CEPH database in order to characterize the geographic distribution of footprints of positive selection within and around this locus. A unique VKORC1 haplotype carrying the promoter mutation associated with AVK sensitivity showed especially high frequencies in all the 17 HGDP-CEPH East Asian population samples. VKORC1 and 24 neighboring genes were found to lie in a 505 kb region of strong linkage disequilibrium in these populations. Patterns of allele frequency differentiation and haplotype structure suggest that this genomic region has been submitted to a near complete selective sweep in all East Asian populations and only in this geographic area. The most extreme scores of the different selection tests are found within a smaller 45 kb region that contains VKORC1 and three other genes (BCKDK, MYST1 (KAT8), and PRSS8) with different functions. Because of the strong linkage disequilibrium, it is not possible to determine if VKORC1 or one of the three other genes is the target of this strong positive selection that could explain present-day differences among human populations in AVK dose requirement. Our results show that the extended region surrounding a presumable single target of positive selection should be analyzed for genetic variation in a wide range of genetically diverse populations in order to account for other neighboring and confounding selective events and the hitchhiking effect.

  2. Isotope ratio mass spectrometry in combination with chemometrics for characterization of geographical origin and agronomic practices of table grape.

    Science.gov (United States)

    Longobardi, Francesco; Casiello, Grazia; Centonze, Valentina; Catucci, Lucia; Agostiano, Angela

    2017-08-01

    Although table grape is one of the most cultivated and consumed fruits worldwide, no study has been reported on its geographical origin or agronomic practice based on stable isotope ratios. This study aimed to evaluate the usefulness of isotopic ratios (i.e. 2H/1H, 13C/12C, 15N/14N and 18O/16O) as possible markers to discriminate the agronomic practice (conventional versus organic farming) and provenance of table grape. In order to quantitatively evaluate which of the isotopic variables were more discriminating, a t test was carried out, in light of which only δ13C and δ18O provided statistically significant differences (P ≤ 0.05) for the discrimination of geographical origin and farming method. Principal component analysis (PCA) showed no good separation of samples differing in geographical area and agronomic practice; thus, for classification purposes, supervised approaches were carried out. In particular, general discriminant analysis (GDA) was used, resulting in prediction abilities of 75.0 and 92.2% for the discrimination of farming method and origin respectively. The present findings suggest that stable isotopes (i.e. δ18O, δ2H and δ13C) combined with chemometrics can be successfully applied to discriminate the provenance of table grape. However, the use of bulk nitrogen isotopes was not effective for farming method discrimination. © 2016 Society of Chemical Industry.
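
    As a rough illustration of the univariate screening step mentioned in this abstract, the Python sketch below computes a two-sample t statistic. Welch's form is an assumption here, since the abstract does not say which variant was used, and the δ13C values are invented for illustration rather than taken from the study.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error contribution of each group (sample variance / n).
    se_a = statistics.variance(group_a) / n_a
    se_b = statistics.variance(group_b) / n_b
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    t_stat = mean_diff / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, dof

# Hypothetical delta-13C values (per mil) for two growing regions.
region_a = [-26.0, -25.5, -26.5, -26.0]
region_b = [-24.0, -24.5, -23.5, -24.0]
t_stat, dof = welch_t(region_a, region_b)
```

    The magnitude of the statistic would then be compared with the critical value of the t distribution for the computed degrees of freedom (about 2.45 for 6 degrees of freedom at a two-sided 5% level).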

  3. Utilization of Chemometric Technique to Determine the Quality of Fresh and Used Palm, Corn and Coconut Oil

    International Nuclear Information System (INIS)

    Hamizah Mat Agil; Mohd Zuli Jaafar; Suzeren Jamil; Azwan Mat Lazim

    2014-01-01

    This study was conducted to evaluate the quality of natural oil and the deterioration of frying oil. A total of 12 different oil samples from palm oil, corn oil and coconut oil were used. The frying process was repeated four times at 180 °C in order to observe the stability of the oil towards oxidation. Three main parameters were studied to determine oil quality: peroxide value, iodine value and acid value. This study emphasized the use of FTIR in the range of 4000-700 cm-1. Alternatively, a chemometric method based on pattern recognition was used to determine the oil quality. Data analysis was conducted using the PCA and PLS methods in Matlab. PCA provided data classification according to type of oil, while PLS predicted the oil quality parameters studied. For the classification of pure oil, the variance explained by PC1 was 70 % while that of PC2 was 15 %. For the fried/used oil, PC1 gave 57 % while PC2 gave 25 %. Using PLS, the best model for pure oils was the iodine value model, with R2CV > 0.984, whereas for fried/used oils the best model was the peroxide value model, with R2CV > 0.7423. (author)

  4. Swift Observations of Mrk 421 in Selected Epochs. II. An Extreme Spectral Flux Variability in 2009–2012

    Science.gov (United States)

    Kapanadze, B.; Vercellone, S.; Romano, P.; Hughes, P.; Aller, M.; Aller, H.; Kharshiladze, O.; Tabagari, L.

    2018-05-01

    We present the results from a detailed spectral and timing study of Mrk 421 based on the rich archival Swift data obtained during 2009–2012. Best fits of the 0.3–10 keV spectra were mostly obtained using the log-parabolic model showing the relatively low spectral curvature that is expected in the case of efficient stochastic acceleration of particles. The position of the synchrotron spectral energy density peak E p of 173 spectra is found at energies higher than 2 keV. The photon index at 1 keV exhibited a very broad range of values a = 1.51–3.02, and very hard spectra with a historical state and that corresponding to a rate higher than 100 cts s‑1. Moreover, 113 instances of intraday variability were revealed, exhibiting shortest flux-doubling/halving times of about 1.2 hr, as well as brightenings by 7%–24% in 180–720 s and declines by 68%–22% in 180–900 s. The X-ray and very high-energy fluxes generally showed a correlated variability, although one incidence of a more complicated variability was also detected, indicating that the multifrequency emission of Mrk 421 could not be generated in a single zone.

  5. Energy-efficient relay selection and optimal power allocation for performance-constrained dual-hop variable-gain AF relaying

    KAUST Repository

    Zafar, Ammar

    2013-12-01

    This paper investigates the energy-efficiency enhancement of a variable-gain dual-hop amplify-and-forward (AF) relay network utilizing selective relaying. The objective is to minimize the total consumed power while keeping the end-to-end signal-to-noise-ratio (SNR) above a certain peak value and satisfying the peak power constraints at the source and relay nodes. To achieve this objective, an optimal relay selection and power allocation strategy is derived by solving the power minimization problem. Numerical results show that the derived optimal strategy enhances the energy-efficiency as compared to a benchmark scheme in which both the source and the selected relay transmit at peak power. © 2013 IEEE.

  6. Do birds of a feather flock together? The variable bases for African American, Asian American, and European American adolescents' selection of similar friends.

    Science.gov (United States)

    Hamm, J V

    2000-03-01

    Variability in adolescent-friend similarity is documented in a diverse sample of African American, Asian American, and European American adolescents. Similarity was greatest for substance use, modest for academic orientations, and low for ethnic identity. Compared with Asian American and European American adolescents, African American adolescents chose friends who were less similar with respect to academic orientation or substance use but more similar with respect to ethnic identity. For all three ethnic groups, personal endorsement of the dimension in question and selection of cross-ethnic-group friends heightened similarity. Similarity was a relative rather than an absolute selection criterion: Adolescents did not choose friends with identical orientations. These findings call for a comprehensive theory of friendship selection sensitive to diversity in adolescents' experiences. Implications for peer influence and self-development are discussed.

  7. A comparison of small-area estimation techniques to estimate selected stand attributes using LiDAR-derived auxiliary variables

    Science.gov (United States)

    Michael E. Goerndt; Vicente J. Monleon; Hailemariam. Temesgen

    2011-01-01

    One of the challenges often faced in forestry is the estimation of forest attributes for smaller areas of interest within a larger population. Small-area estimation (SAE) is a set of techniques well suited to estimation of forest attributes for small areas in which the existing sample size is small and auxiliary information is available. Selected SAE methods were...

  8. Optimal Selective Harmonic Mitigation Technique on Variable DC Link Cascaded H-Bridge Converter to Meet Power Quality Standards

    DEFF Research Database (Denmark)

    Najjar, Mohammad; Moeini, Amirhossein; Dowlatabadi, Mohammadkazem Bakhshizadeh

    2016-01-01

    In this paper, the power quality standards such as IEC 61000-3-6, IEC 61000-2-12, EN 50160, and CIGRE WG 36-05 are fulfilled for single- and three-phase medium voltage applications by using Selective Harmonic Mitigation-PWM (SHM-PWM) in a Cascaded H-Bridge (CHB) converter. Furthermore, the ER G5/...

  9. Simultaneous determination of three herbicides by differential pulse voltammetry and chemometrics.

    Science.gov (United States)

    Ni, Yongnian; Wang, Lin; Kokot, Serge

    2011-01-01

    A novel differential pulse voltammetry method (DPV) was researched and developed for the simultaneous determination of Pendimethalin, Dinoseb and sodium 5-nitroguaiacolate (5NG) with the aid of chemometrics. The voltammograms of these three compounds overlapped significantly, and to facilitate the simultaneous determination of the three analytes, chemometrics methods were applied. These included classical least squares (CLS), principal component regression (PCR), partial least squares (PLS) and radial basis function-artificial neural networks (RBF-ANN). A separately prepared verification data set was used to confirm the calibrations, which were built from the original and first derivative data matrices of the voltammograms. On the basis of relative prediction errors and recoveries of the analytes, the RBF-ANN and the DPLS (D - first derivative spectra) models performed best and are particularly recommended for application. The DPLS calibration model was applied satisfactorily for the prediction of the three analytes from market vegetables and lake water samples.

  10. Chilean flour and wheat grain: tracing their origin using near infrared spectroscopy and chemometrics.

    Science.gov (United States)

    González-Martín, Ma Inmaculada; Wells Moncada, Guillermo; González-Pérez, Claudio; Zapata San Martín, Nelson; López-González, Fernando; Lobos Ortega, Iris; Hernández-Hierro, Jose-Miguel

    2014-02-15

    Instrumental techniques such as near-infrared spectroscopy (NIRS) are used in industry to monitor and establish product composition and quality. As occurs with other food industries, the Chilean flour industry needs simple, rapid techniques to objectively assess the origin of different products, which is often related to their quality. In this sense, NIRS has been used in combination with chemometric methods to predict the geographic origin of wheat grain and flour samples produced in different regions of Chile. Here, the spectral data obtained with NIRS were analysed using a supervised pattern recognition method, Discriminant Partial Least Squares (DPLS). The method correctly classified 76% of the wheat grain samples and between 90% and 96% of the flour samples according to their geographic origin. The results show that NIRS, together with chemometric methods, provides a rapid tool for the classification of wheat grain and flour samples according to their geographic origin. Copyright © 2013 Elsevier Ltd. All rights reserved.

  11. Circum-Arctic petroleum systems identified using decision-tree chemometrics

    Science.gov (United States)

    Peters, K.E.; Ramos, L.S.; Zumberge, J.E.; Valin, Z.C.; Scotese, C.R.; Gautier, D.L.

    2007-01-01

    Source- and age-related biomarker and isotopic data were measured for more than 1000 crude oil samples from wells and seeps collected above approximately 55°N latitude. A unique, multitiered chemometric (multivariate statistical) decision tree was created that allowed automated classification of 31 genetically distinct circum-Arctic oil families based on a training set of 622 oil samples. The method, which we call decision-tree chemometrics, uses principal components analysis and multiple tiers of K-nearest neighbor and SIMCA (soft independent modeling of class analogy) models to classify and assign confidence limits for newly acquired oil samples and source rock extracts. Geochemical data for each oil sample were also used to infer the age, lithology, organic matter input, depositional environment, and identity of its source rock. These results demonstrate the value of large petroleum databases where all samples were analyzed using the same procedures and instrumentation. Copyright © 2007. The American Association of Petroleum Geologists. All rights reserved.

  12. GC/MS analysis of pesticides in the Ferrara area (Italy) surface water: a chemometric study.

    Science.gov (United States)

    Pasti, Luisa; Nava, Elisabetta; Morelli, Marco; Bignami, Silvia; Dondi, Francesco

    2007-01-01

    The development of a network to monitor surface waters is a critical element in the assessment, restoration and protection of water quality. In this study, concentrations of 42 pesticides--determined by GC-MS on samples from 11 points along the Ferrara area rivers--have been analyzed by chemometric tools. The data were collected over a three-year period (2002-2004). Principal component analysis of the detected pesticides was carried out in order to define the best spatial locations for the sampling points. The results obtained have been interpreted in view of agricultural land use. Time series data regarding pesticide contents in surface waters has been analyzed using the Autocorrelation function. This chemometric tool allows for seasonal trends and makes it possible to optimize sampling frequency in order to detect the effective maximum pesticide content.
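
    To make the autocorrelation step concrete, here is a minimal Python sketch of a sample autocorrelation function applied to a monthly series. The series is synthetic (a hypothetical 12-month cycle over three years), and the function name and estimator (sums normalized by the lag-0 value) are assumptions for illustration, not the authors' implementation.

```python
import math

def autocorrelation(series, max_lag):
    """Return sample autocorrelation coefficients for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    # Lag-0 sum of squares, used to normalize every coefficient.
    variance = sum(c * c for c in centered)
    acf = []
    for lag in range(max_lag + 1):
        cov = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        acf.append(cov / variance)
    return acf

# Hypothetical monthly concentrations: three years with a 12-month cycle.
monthly = [1.0 + math.sin(2 * math.pi * m / 12) for m in range(36)]
acf = autocorrelation(monthly, 12)
```

    For a series like this one, the coefficient is strongly positive at a lag of 12 months and strongly negative at 6 months, which is the kind of signature that indicates a seasonal trend and suggests how often samples need to be taken.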

  13. GC/MS Analysis of Pesticides in the Ferrara Area (Italy) Surface Water: A Chemometric Study

    International Nuclear Information System (INIS)

    Pasti, L.; Dondi, F.; Nava, E.; Morelli, M.; Bignami, S.

    2007-01-01

    The development of a network to monitor surface waters is a critical element in the assessment, restoration and protection of water quality. In this study, concentrations of 42 pesticides - determined by GC-MS on samples from 11 points along the Ferrara area rivers - have been analyzed by chemometric tools. The data were collected over a three-year period (2002-2004). Principal component analysis of the detected pesticides was carried out in order to define the best spatial locations for the sampling points. The results obtained have been interpreted in view of agricultural land use. Time series data regarding pesticide contents in surface waters has been analyzed using the Autocorrelation function. This chemometric tool allows for seasonal trends and makes it possible to optimize sampling frequency in order to detect the effective maximum pesticide content

  14. Chemometric strategy for automatic chromatographic peak detection and background drift correction in chromatographic data.

    Science.gov (United States)

    Yu, Yong-Jie; Xia, Qiao-Ling; Wang, Sheng; Wang, Bing; Xie, Fu-Wei; Zhang, Xiao-Bing; Ma, Yun-Ming; Wu, Hai-Long

    2014-09-12

    Peak detection and background drift correction (BDC) are the key stages in using chemometric methods to analyze chromatographic fingerprints of complex samples. This study developed a novel chemometric strategy for simultaneous automatic chromatographic peak detection and BDC. A robust statistical method was used for intelligent estimation of instrumental noise level coupled with first-order derivative of chromatographic signal to automatically extract chromatographic peaks in the data. A local curve-fitting strategy was then employed for BDC. Simulated and real liquid chromatographic data were designed with various kinds of background drift and degree of overlapped chromatographic peaks to verify the performance of the proposed strategy. The underlying chromatographic peaks can be automatically detected and reasonably integrated by this strategy. Meanwhile, chromatograms with BDC can be precisely obtained. The proposed method was used to analyze a complex gas chromatography dataset that monitored quality changes in plant extracts during storage procedure. Copyright © 2014 Elsevier B.V. All rights reserved.
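
    The general idea of pairing a robust noise estimate with the first derivative can be sketched in Python as follows. This is not the authors' algorithm: the median absolute deviation (MAD) of first differences is swapped in as the robust noise estimator, the baseline is taken as the signal median, and the threshold factor and the synthetic chromatogram are all hypothetical.

```python
import math
import statistics

def estimate_noise(signal):
    """Robust noise level: scaled MAD of the first differences (an assumed choice)."""
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    center = statistics.median(diffs)
    mad = statistics.median([abs(d - center) for d in diffs])
    return 1.4826 * mad  # scale factor that makes MAD comparable to a standard deviation

def detect_peaks(signal, threshold_factor=5.0):
    """Return indices where the first difference changes sign and the height exceeds the noise threshold."""
    sigma = estimate_noise(signal)
    baseline = statistics.median(signal)
    peaks = []
    for i in range(1, len(signal) - 1):
        rising = signal[i] > signal[i - 1]
        falling = signal[i + 1] <= signal[i]
        if rising and falling and signal[i] - baseline > threshold_factor * sigma:
            peaks.append(i)
    return peaks

def gaussian(x, center, width):
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

# Synthetic chromatogram: two Gaussian peaks plus a small bounded ripple standing in for noise.
signal = [
    gaussian(i, 50, 3.0) + gaussian(i, 120, 3.0) + 0.01 * math.sin(2.0 * i)
    for i in range(200)
]
peaks = detect_peaks(signal)
```

    A real chromatogram would additionally need the background drift correction described in the abstract before a median baseline like this one could be trusted.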

  15. Emergency department documentation templates: variability in template selection and association with physical examination and test ordering in dizziness presentations

    Directory of Open Access Journals (Sweden)

    Meurer William J

    2011-03-01

    Full Text Available Abstract Background: Clinical documentation systems, such as templates, have been associated with process utilization. The T-System emergency department (ED) templates are widely used, but analyses of the templates' association with processes are lacking. This system is also unique because of the many different template options available, and thus the selection of the template may also be important. We aimed to describe the selection of templates in ED dizziness presentations and to investigate the association between items on templates and process utilization. Methods: Dizziness visits were captured from a population-based study of EDs that use documentation templates. Two relevant process outcomes were assessed: head computerized tomography (CT) scan and nystagmus examination. Multivariable logistic regression was used to estimate the probability of each outcome for patients who did or did not receive a relevant-item template. Propensity scores were also used to adjust for selection effects. Results: The final cohort was 1,485 visits. Thirty-one different templates were used. Use of a template with a head CT item was associated with an increase in the adjusted probability of head CT utilization from 12.2% (95% CI, 8.9%-16.6%) to 29.3% (95% CI, 26.0%-32.9%). The adjusted probability of documentation of a nystagmus assessment increased from 12.0% (95% CI, 8.8%-16.2%) when a nystagmus-item template was not used to 95.0% (95% CI, 92.8%-96.6%) when a nystagmus-item template was used. The associations remained significant after propensity score adjustments. Conclusions: Providers use many different templates in dizziness presentations. Important differences exist in the various templates and the template that is used likely impacts process utilization, even though selection may be arbitrary. The optimal design and selection of templates may offer a feasible and effective opportunity to improve care delivery.

  16. The employment of FTIR spectroscopy in combination with chemometrics for analysis of rat meat in meatball formulation.

    Science.gov (United States)

    Rahmania, Halida; Sudjadi; Rohman, Abdul

    2015-02-01

    For the Indonesian community, meatball is one of the favorite meat food products. For economic gain, beef may be substituted with rat meat because of the price difference between rat meat and beef. In the present research, FTIR spectroscopy in combination with partial least squares (PLS) multivariate calibration was used for the quantitative analysis of rat meat in binary mixtures with beef in meatball formulation. Meanwhile, principal component analysis (PCA) was used for the classification of rat meat and beef meatballs. Several frequency regions in the mid-infrared region were optimized, and finally the frequency region of 750-1000 cm-1 was selected for PLS and PCA modeling. For quantitative analysis, the relationship between actual values (x-axis) and FTIR predicted values (y-axis) of rat meat is described by the equation y = 0.9417x + 2.8410, with a coefficient of determination (R2) of 0.993 and a root mean square error of calibration (RMSEC) of 1.79%. Furthermore, PCA was successfully used for the classification of rat meat meatball and beef meatball.
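
    The calibration figures of merit quoted in this abstract (a fitted line of predicted versus actual values, R2, and RMSEC) can be computed as in the Python sketch below. The actual and predicted values are hypothetical and are not the study's data, and RMSEC is computed here with a divisor of n, which is an assumption because some software divides by the degrees of freedom instead.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def rmse(actual, predicted):
    """Root mean square error between reference and predicted values."""
    squared = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical calibration set: actual vs. model-predicted adulterant content (%).
actual = [0.0, 10.0, 20.0, 30.0, 40.0]
predicted = [1.0, 11.0, 19.0, 31.0, 39.0]
slope, intercept = fit_line(actual, predicted)
r2 = r_squared(actual, predicted, slope, intercept)
rmsec = rmse(actual, predicted)
```

    A slope close to 1 and an intercept close to 0 indicate that the predictions track the reference values without systematic bias.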

  17. Chemometric characterization of the hydrogen bonding complexes of secondary amides and aromatic hydrocarbons

    OpenAIRE

    Jović, Branislav; Nikolić, Aleksandar; Petrović, Slobodan

    2012-01-01

    The paper reports the results of the study of hydrogen bonding complexes between secondary amides and various aromatic hydrocarbons. The possibility of using chemometric methods was investigated in order to characterize N-H•••π hydrogen bonded complexes. Hierarchical clustering and Principal Component Analysis (PCA) have been applied on infrared spectroscopic and Taft parameters of 43 N-substituted amide complexes with different aromatic hydrocarbons. Results obtained in this report are...

  18. Application of FTIR Spectroscopy and Chemometrics for Halal Authentication of Beef Meatball Adulterated with Dog Meat

    OpenAIRE

    Rahayu, Wiranti Sri; Rohman, Abdul; Martono, Sudibyo; Sudjadi, Sudjadi

    2018-01-01

    Beef meatball is one of the favorite meat-based food products among the Indonesian community. Currently, beef is very expensive in the Indonesian market compared to other common meat types such as chicken and lamb. This situation has tempted some unethical meatball producers to replace or adulterate beef with lower-priced meat such as dog meat. The objective of this study was to evaluate the capability of FTIR spectroscopy combined with chemometrics for identification and quantification of dog meat (D...

  19. Prediction of physicochemical properties of FCC feedstock by Chemometric analysis of their ultraviolet spectrum

    International Nuclear Information System (INIS)

    Baldrich Ferrer, Carlos A

    2008-01-01

    Chemometric analysis by Partial Least Squares (PLS) has been applied in this work to correlate the ultraviolet spectrum of combined Fluid Catalytic Cracking (FCC) feedstock with their physicochemical properties. The prediction errors obtained in the validation process using refinery samples demonstrate the accuracy of the predicted properties. This new analytical methodology allows obtaining in one analysis detailed information about the most important physicochemical properties of FCC feedstock and could be used as a valuable tool for operational analysis

  20. Characterization and authentication of Spanish PDO wine vinegars using multidimensional fluorescence and chemometrics

    DEFF Research Database (Denmark)

    Ríos-Reina, Rocío; Elcoroaristizabal, Saioa; Ocaña-Gonzalez, Juan A.

    2017-01-01

    This work assesses the potential of multidimensional fluorescence spectroscopy combined with chemometrics for characterization and authentication of Spanish Protected Designation of Origin (PDO) wine vinegars. Seventy-nine vinegars of different categories (aged and sweet) belonging to the Spanish...... obtained better results (>92% of classification). In each category, SVM also allows the differentiation between PDOs. The proposed methodology could be used as an analysis method for the authentication of Spanish PDO wine vinegars....

  1. New liquid chromatographic-chemometric approach for the determination of sunset yellow and tartrazine in commercial preparation.

    Science.gov (United States)

    Dinç, Erdal; Aktaş, A Hakan; Ustündağ, Ozgür

    2005-01-01

    A new liquid chromatographic (LC)-chemometric approach was developed for the determination of sunset yellow (SUN) and tartrazine (TAR) in commercial preparations. This approach uses LC and chemometric calibration methods, i.e., classical least-squares (CLS), principal component regression (PCR), and partial-least squares (PLS), simultaneously. The combined LC-chemometric approaches, denoted as LC-CLS, LC-PCR, and LC-PLS, are based on photodiode array (PDA) detection at multiple wavelengths. Optimum chromatographic separation of SUN and TAR with allura red as the internal standard (IS) was obtained by using a Waters Symmetry C18 column, 5 µm, 4.6 x 250 mm, and 0.2 M acetate buffer (pH 5)-acetonitrile-methanol-bidistilled water (55 + 20 + 15 + 10, v/v) as the mobile phase at a flow rate of 1.9 mL/min. The LC data sets consisting of the ratios of analyte peak areas to the IS peak area were obtained by using PDA detection at 5 wavelengths (465, 470, 475, 480, and 485 nm). LC-chemometric calibrations for SUN and TAR were separately constructed by using the relationship between the peak-area ratio and the training sets for each colorant. LC-chemometric approaches were tested for different synthetic mixtures containing SUN and TAR in the presence of the IS. These LC-chemometric calibrations were applied to a commercial preparation of the 2 colorants. The experimental results of the LC-chemometric approaches were compared with those obtained by a developed classical LC method using single-wavelength detection.
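
    A single-analyte classical least-squares calibration over several detection wavelengths, in the spirit of the LC-CLS approach named in this abstract, might look like the following Python sketch. The per-wavelength sensitivities, training concentrations, and function names are hypothetical, and a model through the origin (ratio = k × concentration) is assumed.

```python
def cls_calibrate(concentrations, ratio_rows):
    """Estimate one sensitivity per wavelength by least squares through the origin."""
    sum_c2 = sum(c * c for c in concentrations)
    n_wavelengths = len(ratio_rows[0])
    return [
        sum(c * row[j] for c, row in zip(concentrations, ratio_rows)) / sum_c2
        for j in range(n_wavelengths)
    ]

def cls_predict(sensitivities, ratios):
    """Least-squares concentration estimate from the ratios at all wavelengths."""
    numerator = sum(k * r for k, r in zip(sensitivities, ratios))
    denominator = sum(k * k for k in sensitivities)
    return numerator / denominator

# Hypothetical sensitivities at 465, 470, 475, 480 and 485 nm, used only to simulate data.
true_k = [0.10, 0.12, 0.15, 0.13, 0.11]
training_conc = [2.0, 4.0, 6.0, 8.0]
# Simulated peak-area ratios (analyte / internal standard) for each training standard.
training_ratios = [[c * k for k in true_k] for c in training_conc]

k_hat = cls_calibrate(training_conc, training_ratios)
unknown_ratios = [5.0 * k for k in true_k]
estimated_conc = cls_predict(k_hat, unknown_ratios)
```

    Using all five wavelengths at once averages out wavelength-specific noise, which is the usual motivation for multiwavelength calibration over single-wavelength detection.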

  2. Input Selection for Return Temperature Estimation in Mixing Loops using Partial Mutual Information with Flow Variable Delay

    DEFF Research Database (Denmark)

    Overgaard, Anders; Kallesøe, Carsten Skovmose; Bendtsen, Jan Dimon

    2017-01-01

    access to data, it is desired to create a data-driven model for control. Because of the large amount of available data, an input selection method called "Partial Mutual Information" (PMI) is used. This article introduces a method for including flow-variable delays in PMI. Data from an...... office building in Bjerringbro are used for the analysis. It is shown that both "Mutual Information" and a "Generalized Regression Neural Network" are improved by using flow-variable delays rather than constant delays....

  3. Chemometrics-assisted spectrophotometry method for the determination of chemical oxygen demand in pulping effluent.

    Science.gov (United States)

    Chen, Honglei; Chen, Yuancai; Zhan, Huaiyu; Fu, Shiyu

    2011-04-01

    A new method has been developed for the determination of chemical oxygen demand (COD) in pulping effluent using chemometrics-assisted spectrophotometry. Two calibration models were established using UV-visible spectroscopy (model 1) and derivative spectroscopy (model 2), combined with the chemometrics software Smica-P. The correlation coefficients of the two models are 0.9954 (model 1) and 0.9963 (model 2) when the COD of samples is in the range of 0 to 405 mg/L. The sensitivities of the two models are 0.0061 (model 1) and 0.0056 (model 2), and the method detection limits are 2.02-2.45 mg/L (model 1) and 2.13-2.51 mg/L (model 2). A validation experiment showed that the average standard deviation of model 2 was 1.11 and that of model 1 was 1.54. Similarly, the average relative error of model 2 (4.25%) was lower than that of model 1 (5.00%), which indicated that the predictability of model 2 was better than that of model 1. The chemometrics-assisted spectrophotometry method did not need the chemical reagents and digestion required in conventional methods, and its testing time was significantly shorter than that of the conventional ones. The proposed method can be used to measure COD in pulping effluent as an environmentally friendly approach with satisfactory results.

  4. Classification of java tea ( Orthosiphon aristatus ) quality using FTIR spectroscopy and chemometrics

    International Nuclear Information System (INIS)

    Heryanto, R; Pradono, D I; Darusman, L K; Marlina, E

    2017-01-01

    Java tea ( Orthosiphon aristatus ) is a plant that is widely used as a medicinal herb in Indonesia. Its quality varies depending on various factors, such as cultivating area, climate and harvesting time. This study aimed to investigate the effectiveness of FTIR spectroscopy coupled with chemometrics for discriminating the quality of java tea from different cultivating areas. FTIR spectra of ethanolic extracts were collected from five different regions of origin of java tea. Prior to chemometrics evaluation, spectra were pre-processed by using baselining, normalization and derivatization. Principal Components Analysis (PCA) was used to reduce the spectra to two PCs, which explained 73% of the total variance. The score plot of the two PCs showed groupings of the samples according to their regions of origin. Furthermore, Partial Least Squares-Discriminant Analysis (PLSDA) was applied to the pre-processed data. The approach produced an external validation success rate of 100%. This study shows that FTIR analysis and chemometrics have discriminatory power to classify java tea based on its quality related to the region of origin. (paper)

  5. A Simple Photometer and Chemometrics Analysis for Quality Control of Sambiloto (Andrographis paniculata) Raw Material

    Directory of Open Access Journals (Sweden)

    Rudi Heryanto

    2017-09-01

    In this paper, we described the use of a light emitting diode (LED)-based photometer and chemometric analysis for quality control of king of bitter or sambiloto (Andrographis paniculata) raw material. The quality of medicinal plants is determined by their chemical composition. The quantities of chemical components in medicinal plants can be assessed using spectroscopic techniques. We used an “in house” photometer to generate spectra of sambiloto. The spectra were analyzed by chemometric methods, i.e. principal component analysis (PCA) and partial least square discriminant analysis (PLS-DA), with the aim of herbal quality classification based on the harvesting time. From the results obtained, based on thin layer chromatography analysis, sambiloto with different collection times (1, 2, and 3 months) contained different amounts of active compounds. Evaluation of sambiloto using its spectra and chemometric analysis successfully differentiated its quality based on harvesting time. PCA with the first two PCs (PC-1 = 60% and PC-2 = 35%) was able to differentiate according to the harvesting time of sambiloto. Three models were obtained by PLS-DA and could be used to predict unknown samples of sambiloto according to the harvesting time.

  6. Spatial assessment and source identification of heavy metals pollution in surface water using several chemometric techniques.

    Science.gov (United States)

    Ismail, Azimah; Toriman, Mohd Ekhwan; Juahir, Hafizan; Zain, Sharifuddin Md; Habir, Nur Liyana Abdul; Retnam, Ananthy; Kamaruddin, Mohd Khairul Amri; Umar, Roslan; Azid, Azman

    2016-05-15

    This study presents the determination of the spatial variation and source identification of heavy metal pollution in surface water along the Straits of Malacca using several chemometric techniques. Clustering and discrimination of heavy metal compounds in surface water into two groups (northern and southern regions) are observed according to level of concentrations via the application of chemometric techniques. Principal component analysis (PCA) demonstrates that Cu and Cr dominate the source apportionment in northern region with a total variance of 57.62% and is identified with mining and shipping activities. These are the major contamination contributors in the Straits. Land-based pollution originating from vehicular emission with a total variance of 59.43% is attributed to the high level of Pb concentration in the southern region. The results revealed that one state representing each cluster (northern and southern regions) is significant as the main location for investigating heavy metal concentration in the Straits of Malacca which would save monitoring cost and time. The monitoring of spatial variation and source of heavy metals pollution at the northern and southern regions of the Straits of Malacca, Malaysia, using chemometric analysis. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Estimating relations between temperature, relative humidity as independent variables and selected water quality parameters in Lake Manzala, Egypt

    Directory of Open Access Journals (Sweden)

    Gehan A.H. Sallam

    2018-03-01

    In Egypt, Lake Manzala is the largest and the most productive of the northern coastal lakes. In this study, the continuous measurement data of the Real Time Water Quality Monitoring stations in Lake Manzala were statistically analyzed to measure the regional and seasonal variations of the selected water quality parameters in relation to the change of air temperature and relative humidity. Simple formulas are elaborated using the DataFit software to predict the selected water quality parameters of the lake, including pH, Dissolved Oxygen (DO), Electrical Conductivity (EC), Total Dissolved Solids (TDS), Turbidity, and Chlorophyll, as a function of air temperature, relative humidity and the quantities and qualities of the drainage water that discharges into the lake. An empirical positive relation was found between air temperature and relative humidity and pH, EC and TDS, and a negative relation with DO. There is no significant effect on the other two parameters, turbidity and chlorophyll.

  8. A chemometric approach to the evaluation of atmospheric and fluvial pollutant inputs in aquatic systems: The Guadalquivir River estuary as a case study

    International Nuclear Information System (INIS)

    Lopez-Lopez, Jose A.; Garcia-Vargas, Manuel; Moreno, Carlos

    2011-01-01

    To establish the quality of waters it is necessary to identify both point and non-point pollution sources. In this work, we propose the combination of clean analytical methodologies and chemometric tools to study discrete and diffuse pollution caused in a river by tributaries and precipitations, respectively. During a two-year period, water samples were taken in the Guadalquivir river (selected as a case study) and its main tributaries before and after precipitations. Samples were characterized by analysing nutrients, pH, dissolved oxygen, total and volatile suspended solids, carbon species, and heavy metals. Results were used to estimate fluvial and atmospheric inputs and as tracers for anthropic activities. Multivariate analysis was used to estimate the background pollution, and to identify pollution inputs. Principal Component Analysis and Cluster Analysis were used as data exploratory tools, while box-whiskers plots and Linear Discriminant Analysis were used to analyse and distinguish the different types of water samples. - Highlights: → Atmospheric and fluvial inputs of pollutants in Guadalquivir River were identified. → Point (tributary rivers) and non-point sources (rains) were studied. → Nature and extension of anthropogenic pollution in the river were established. - By combining trace environmental analysis and selected chemometric tools atmospheric and fluvial inputs of pollutants in rivers may be identified. The extension of the pollution originated by each anthropic activity developed along the River may be established, as well as the identification of the pollution introduced into the river by the tributary rivers (point sources) and by rains (non-point sources).

  9. Estimation of genetic variability and selection response for clutch length in dwarf brown-egg layers carrying or not the naked neck gene

    Directory of Open Access Journals (Sweden)

    Tixier-Boichard Michèle

    2003-03-01

    In order to investigate the possibility of using the dwarf gene for egg production, two dwarf brown-egg laying lines were selected for 16 generations on average clutch length; one line (L1) was normally feathered and the other (L2) was homozygous for the naked neck gene NA. A control line from the same base population, dwarf and segregating for the NA gene, was maintained during the selection experiment under random mating. The average clutch length was normalized using a Box-Cox transformation. Genetic variability and selection response were estimated either with the mixed model methodology, or with the classical methods for calculating genetic gain, as the deviation from the control line, and the realized heritability, as the ratio of the selection response on cumulative selection differentials. Heritability of average clutch length was estimated to be 0.42 ± 0.02, with a multiple trait animal model, whereas the estimates of the realized heritability were lower, being 0.28 and 0.22 in lines L1 and L2, respectively. REML estimates of heritability were found to decline with generations of selection, suggesting a departure from the infinitesimal model, either because a limited number of genes was involved, or their frequencies were changed. The yearly genetic gains in average clutch length, after normalization, were estimated to be 0.37 ± 0.02 and 0.33 ± 0.04 with the classical methods, 0.46 ± 0.02 and 0.43 ± 0.01 with animal model methodology, for lines L1 and L2 respectively, which represented about 30% of the genetic standard deviation on the transformed scale. Selection response appeared to be faster in line L2, homozygous for the NA gene, but the final cumulated selection response for clutch length was not different between the L1 and L2 lines at generation 16.
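
    The realized heritability described in the abstract above, the ratio of the selection response to the cumulative selection differential, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name and the per-generation numbers in the usage note are hypothetical, and the abstract's additional steps (Box-Cox normalization, deviation from the control line) are assumed to have been applied to the inputs already.

```python
def realized_heritability(responses, differentials):
    """Ratio of cumulative selection response to cumulative selection differential.

    responses:     per-generation responses (e.g. deviations from a control line)
    differentials: per-generation selection differentials, same units
    Both inputs are hypothetical illustrations, not data from the study.
    """
    total_response = sum(responses)
    total_differential = sum(differentials)
    if total_differential == 0:
        raise ValueError("cumulative selection differential must be non-zero")
    return total_response / total_differential
```

    For example, `realized_heritability([0.3, 0.4, 0.2], [1.0, 1.2, 0.8])` gives a value of about 0.3 for these made-up numbers.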

  10. 30 min of treadmill walking at self-selected speed does not increase gait variability in independent elderly.

    Science.gov (United States)

    Da Rocha, Emmanuel S; Kunzler, Marcos R; Bobbert, Maarten F; Duysens, Jacques; Carpes, Felipe P

    2018-06-01

    Walking is one of the preferred exercises among elderly, but could a prolonged walking increase gait variability, a risk factor for a fall in the elderly? Here we determine whether 30 min of treadmill walking increases coefficient of variation of gait in elderly. Because gait responses to exercise depend on fitness level, we included 15 sedentary and 15 active elderly. Sedentary participants preferred a lower gait speed and made smaller steps than the actives. Step length coefficient of variation decreased ~16.9% by the end of the exercise in both the groups. Stride length coefficient of variation decreased ~9% after 10 minutes of walking, and sedentary elderly showed a slightly larger step width coefficient of variation (~2%) at 10 min than active elderly. Active elderly showed higher walk ratio (step length/cadence) than sedentary in all times of walking, but the times did not differ in both the groups. In conclusion, treadmill gait kinematics differ between sedentary and active elderly, but changes over time are similar in sedentary and active elderly. As a practical implication, 30 min of walking might be a good strategy of exercise for elderly, independently of the fitness level, because it did not increase variability in step and stride kinematics, which is considered a risk of fall in this population.
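
    The two gait measures named in the abstract above, the coefficient of variation and the walk ratio (step length/cadence), can be computed as in the sketch below. The function names, the use of the sample standard deviation, and the example step lengths are assumptions for illustration only; the study's own processing is not described in the abstract.

```python
import statistics

def coefficient_of_variation(values):
    """Coefficient of variation in percent: sample SD divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def walk_ratio(step_length, cadence):
    """Walk ratio as defined in the abstract: step length divided by cadence."""
    return step_length / cadence

# Hypothetical step lengths in metres for a handful of consecutive steps.
step_lengths = [0.60, 0.62, 0.58, 0.61, 0.59]
cv_percent = coefficient_of_variation(step_lengths)
```

    A lower coefficient of variation at the end of a walking bout than at its start would correspond to the decrease in variability reported in the abstract.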

  11. Interval ridge regression (iRR) as a fast and robust method for quantitative prediction and variable selection applied to edible oil adulteration.

    Science.gov (United States)

    Jović, Ozren; Smrečki, Neven; Popović, Zora

    2016-04-01

    A novel quantitative prediction and variable selection method called interval ridge regression (iRR) is studied in this work. The method is performed on six data sets of FTIR, two data sets of UV-vis and one data set of DSC. The obtained results show that models built with ridge regression on optimal variables selected with iRR significantly outperform models built with ridge regression on all variables in both calibration (6 out of 9 cases) and validation (2 out of 9 cases). In this study, iRR is also compared with interval partial least squares regression (iPLS). iRR outperformed iPLS in validation (insignificantly in 6 out of 9 cases and significantly in one out of 9 cases for poil, a well known health beneficial nutrient, is studied in this work by mixing it with cheap and widely used oils such as soybean (So) oil, rapeseed (R) oil and sunflower (Su) oil. Binary mixture sets of hempseed oil with these three oils (HSo, HR and HSu) and a ternary mixture set of H oil, R oil and Su oil (HRSu) were considered. The obtained accuracy indicates that using iRR on FTIR and UV-vis data, each particular oil can be very successfully quantified (in all 8 cases RMSEPoil (R(2)>0.99). Copyright © 2015 Elsevier B.V. All rights reserved.

  12. Evaluation of Phenolic Content Variability along with Antioxidant, Antimicrobial, and Cytotoxic Potential of Selected Traditional Medicinal Plants from India.

    Science.gov (United States)

    Singh, Garima; Passsari, Ajit K; Leo, Vincent V; Mishra, Vineet K; Subbarayan, Sarathbabu; Singh, Bhim P; Kumar, Brijesh; Kumar, Sunil; Gupta, Vijai K; Lalhlenmawia, Hauzel; Nachimuthu, Senthil K

    2016-01-01

    Plants have been used since ancient times as an important source of biologically active substances. The aim of the present study was to investigate the phytochemical constituents (flavonoids and phenolics), antioxidant potential, cytotoxicity against HepG2 (human hepato carcinoma) cancer cell lines, and the antimicrobial activity of the methanol extract of selected traditional medicinal plants collected from Mizoram, India. A number of phenolic compounds were detected using HPLC-DAD-ESI-TOF-MS, mainly Luteolin, Kaempferol, Myricetin, Gallic Acid, Quercetin and Rutin, some of which have been described for the first time in the selected plants. The total phenolic and flavonoid contents showed high variation, ranging from 4.44 to 181.91 μg of Gallic Acid equivalent per milligram DW (GAE/mg DW) and 3.17 to 102.2 μg of Quercetin/mg, respectively. The antioxidant capacity was determined by DPPH (IC50 values range from 34.22 to 131.4 μg/mL), ABTS (IC50 values range from 24.08 to 513.4 μg/mL), and reducing power assays. Antimicrobial activity was assayed against gram positive (Staphylococcus aureus), gram negative (Escherichia coli, Pseudomonas aeruginosa), and yeast (Candida albicans) organisms, demonstrating that the methanol extracts of some plants were efficacious antimicrobial agents. Additionally, cytotoxicity was assessed on human hepato carcinoma (HepG2) cancer cell lines, and the extracts of Albizia lebbeck, Dillenia indica, and Bombax ceiba were found to significantly decrease cell viability at low concentrations, with IC50 values of 24.03, 25.09, and 29.66 μg/mL, respectively. This is the first report of detection of phenolic compounds along with antimicrobial, antioxidant and cytotoxic potential of selected medicinal plants from India, which indicates that these plants might be a valuable source for human and animal health.

  13. Evaluation of phenolic content variability, antioxidant, antimicrobial and cytotoxic potential of selected traditional medicinal plants from India

    Directory of Open Access Journals (Sweden)

    Garima Singh

    2016-03-01

    Plants have been used since ancient times as an important source of biologically active substances. The aim of the present study was to investigate the phytochemical constituents (flavonoids and phenolics), antioxidant potential, cytotoxicity against HepG2 (human hepato carcinoma) cancer cell lines and the antimicrobial activity of the methanol extract of selected traditional medicinal plants collected from Mizoram, India. A number of phenolic compounds were detected using HPLC-DAD-ESI-TOF-MS, mainly Luteolin, Kaempferol, Myricetin, Gallic Acid, Quercetin and Rutin, some of which have been described for the first time in the selected plants. The total phenolic and flavonoid contents showed high variation, ranging from 4.44 to 181.91 µg of Gallic Acid equivalent per milligram DW (GAE/mg DW) and 3.17 to 102.2 µg of Quercetin/mg, respectively. The antioxidant capacity was determined by DPPH (IC50 values range from 34.22 to 131.4 µg/mL), ABTS (IC50 values range from 24.08 to 513.4 µg/mL) and reducing power assays. Antimicrobial activity was assayed against gram positive (Staphylococcus aureus), gram negative (Escherichia coli, Pseudomonas aeruginosa) and yeast (Candida albicans) organisms, demonstrating that the methanol extracts of some plants were efficacious antimicrobial agents. Additionally, cytotoxicity was assessed on human hepato carcinoma (HepG2) cancer cell lines, and the extracts of Albizia lebbeck, Dillenia indica and Bombax ceiba were found to significantly decrease cell viability at low concentrations, with IC50 values of 24.03, 25.09 and 29.66 µg/mL, respectively. This is the first report of detection of phenolic compounds along with antimicrobial, antioxidant and cytotoxic potential of selected medicinal plants from India, which indicates that these plants might be a valuable source for human and animal health.

  14. The Criteria and Variables Affecting the Selection of Quality Book Ideally Suited for Translation: The Perspectives of King Saud University Staff

    Directory of Open Access Journals (Sweden)

    Abdulaziz Abdulrahman Abanomey

    2015-04-01

    This study investigated the ideal definition of a Quality Book (QB), that is, one ideally suited for translation, and the variables affecting its selection criteria among 136 members of the King Saud University (KSU) academic staff. A workshop was held to elicit the ideal definition of QB to answer the first question, and a 19-item electronic questionnaire with four domains was designed to help collect the data necessary to answer the other two questions of the study. The results revealed that all four domains scored low; “Authorship and Publication” scored the highest with a mean score of 2.28 and “Titling and Contents” scored the lowest with a mean score of 1.76. A 5-way ANOVA (without interaction) was applied to the mean scores in accordance with the variables of the study at α≤ 0.05. The analysis revealed significant effects for gender, for having previously translated one or more books, and for having participated in a conference devoted to translation, whereas qualification and having revised a translated book did not reveal any statistical significance. Key words: Quality Book, KSU, Authorship, Translation, Titling

  15. Developing a NIR multispectral imaging for prediction and visualization of peanut protein content using variable selection algorithms

    Science.gov (United States)

    Cheng, Jun-Hu; Jin, Huali; Liu, Zhiwei

    2018-01-01

    The feasibility of developing a multispectral imaging method using important wavelengths from hyperspectral images selected by genetic algorithm (GA), successive projection algorithm (SPA) and regression coefficient (RC) methods for modeling and predicting protein content in peanut kernels was investigated for the first time. A partial least squares regression (PLSR) calibration model was established between the spectral data from the selected optimal wavelengths and the reference measured protein content, which ranged from 23.46% to 28.43%. The RC-PLSR model established using eight key wavelengths (1153, 1567, 1972, 2143, 2288, 2339, 2389 and 2446 nm) showed the best predictive results, with a coefficient of determination of prediction (R2P) of 0.901, a root mean square error of prediction (RMSEP) of 0.108 and a residual predictive deviation (RPD) of 2.32. Based on the obtained best model and image processing algorithms, the distribution maps of protein content were generated. The overall results of this study indicated that developing a rapid and online multispectral imaging system using the feature wavelengths and PLSR analysis is feasible and has potential for determination of the protein content in peanut kernels.
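
    The three prediction statistics quoted in the abstract above (R2P, RMSEP and RPD) can be illustrated with the sketch below. The definitions used here are common conventions and are assumptions, since the abstract does not state its formulas: RMSEP as the root mean square of the prediction residuals, R2P as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the reference values divided by RMSEP. The function name and the numbers are hypothetical.

```python
import math
import statistics

def prediction_metrics(y_ref, y_pred):
    """Return R2P, RMSEP and RPD for reference vs. predicted values.

    Assumed definitions (conventions vary between papers):
      RMSEP = sqrt(mean((y_ref - y_pred)^2))
      R2P   = 1 - SS_res / SS_tot
      RPD   = sample SD of y_ref / RMSEP
    """
    residuals = [r - p for r, p in zip(y_ref, y_pred)]
    ss_res = sum(e * e for e in residuals)
    mean_ref = statistics.mean(y_ref)
    ss_tot = sum((r - mean_ref) ** 2 for r in y_ref)
    rmsep = math.sqrt(ss_res / len(y_ref))
    return {
        "R2P": 1.0 - ss_res / ss_tot,
        "RMSEP": rmsep,
        "RPD": statistics.stdev(y_ref) / rmsep,
    }

# Hypothetical reference and predicted values, not data from the study.
metrics = prediction_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.1, 3.9])
```

    Under these definitions a larger RPD means the prediction error is small relative to the natural spread of the reference values.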

  16. Is DAS28-CRP with three and four variables interchangeable in individual patients selected for biological treatment in daily clinical practice?

    DEFF Research Database (Denmark)

    Madsen, Ole Rintek

    2011-01-01

    DAS28 is a widely used composite score for the assessment of disease activity in patients with rheumatoid arthritis (RA) and is often used as a treatment decision tool in the daily clinic. Different versions of DAS28 are available. DAS28-CRP(3) is calculated based on three variables: swollen...... and tender joint counts and CRP. DAS28-CRP(4) also includes patient global assessment. Thresholds for low and high disease activity are the same for the two scores. Based on the Bland-Altman method, the interchangeability between DAS28-CRP with three and four variables was examined in 319 RA patients...... selected for initiating biological treatment. Data were extracted from the Danish registry for biological treatment in rheumatology (DANBIO). Multiple regression analysis was used to assess the predictability of the DAS28 scores by several measures of disease activity. The overall mean DAS28-CRP was 4...
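
    The Bland-Altman comparison mentioned in the abstract above is usually summarized by the mean difference between two measures (the bias) and the 95% limits of agreement, bias ± 1.96 standard deviations of the differences. The sketch below shows that calculation under those standard assumptions; the function name and the paired scores are hypothetical and are not DANBIO data.

```python
import statistics

def bland_altman(scores_a, scores_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias   = mean of the paired differences (a - b)
    limits = bias -/+ 1.96 * sample SD of the differences
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired scores for three patients (two versions of one index).
bias, lower, upper = bland_altman([4.0, 5.0, 6.0], [3.8, 5.1, 5.9])
```

    Two scores are generally regarded as interchangeable for individual patients only if the limits of agreement are narrow relative to clinically relevant differences; that threshold is a judgment the code does not make.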

  17. Comprehensive analysis of Polygoni Multiflori Radix of different geographical origins using ultra-high-performance liquid chromatography fingerprints and multivariate chemometric methods

    Directory of Open Access Journals (Sweden)

    Li-Li Sun

    2018-01-01

    Polygoni Multiflori Radix (PMR) is increasingly being used not just as a traditional herbal medicine but also as a popular functional food. In this study, multivariate chemometric methods and mass spectrometry were combined to analyze the ultra-high-performance liquid chromatography (UPLC) fingerprints of PMR from six different geographical origins. A chemometric strategy based on multivariate curve resolution–alternating least squares (MCR–ALS) and three classification methods is proposed to analyze the UPLC fingerprints obtained. Common chromatographic problems, including the background contribution, baseline contribution, and peak overlap, were handled by the established MCR–ALS model. A total of 22 components were resolved. Moreover, relative species concentrations were obtained from the MCR–ALS model, which was used for multivariate classification analysis. Principal component analysis (PCA) and Ward's method have been applied to classify 72 PMR samples from six different geographical regions. The PCA score plot showed that the PMR samples fell into four clusters, which related to the geographical location and climate of the source areas. The results were then corroborated by Ward's method. In addition, according to the variance-weighted distance between cluster centers obtained from Ward's method, five components were identified as the most significant variables (chemical markers) for cluster discrimination. A counter-propagation artificial neural network has been applied to confirm and predict the effects of chemical markers on different samples. Finally, the five chemical markers were identified by UPLC–quadrupole time-of-flight mass spectrometer. Components 3, 12, 16, 18, and 19 were identified as 2,3,5,4′-tetrahydroxy-stilbene-2-O-β-d-glucoside, emodin-8-O-β-d-glucopyranoside, emodin-8-O-(6′-O-acetyl)-β-d-glucopyranoside, emodin, and physcion, respectively. In conclusion, the proposed method can be applied for the

  18. Water-quality characteristics for selected sites on the Cape Fear River, North Carolina, 1955-80; variability, loads, and trends of selected constituents

    Science.gov (United States)

    Crawford, J. Kent

    1983-01-01

    Water-quality data for selected sites in the Cape Fear River basin collected by the U.S. Geological Survey, the North Carolina Department of Natural Resources and Community Development and the University of North Carolina at Chapel Hill are analyzed and interpreted in this report. Emphasis is given to the Cape Fear River at Lock 1 near Kelly, where data are most complete. Other data included in the report were collected from the Cape Fear River at Lillington, the Haw River near the Jordan Dam, and the Deep River at Moncure. Available data indicate that concentrations of dissolved oxygen at study sites are almost always within U.S. Environmental Protection Agency criteria; however, on two sampling dates, the concentration of dissolved oxygen in the Cape Fear at Lock 1 fell slightly below the 5.0 mg/L recommended for fish populations. Measurements of pH from all stations were frequently below the lower limit of 6.5 pH units recommended for protection of freshwater aquatic life. Major dissolved ions detected are sodium and bicarbonate. Sodium concentration averages 8.6 mg/L and bicarbonate averages 17.5 mg/L at Lock 1. Concentrations of dissolved substances and suspended sediment decrease in the downstream direction, presumably because the more heavily populated part of the basin is near the headwaters of the system. Heavy metals, with the exceptions of cadmium and mercury, rarely exceed Environmental Protection Agency criteria for the protection of aquatic life. Concentrations of mercury in the Haw River, which exceed the recommended 0.20 mg/L needed to protect aquatic life, have frequently been reported by other authors. Several of the most toxic metals, arsenic, cadmium, and cobalt, are about five times more concentrated in water from the Haw River site than from other study sites in the basin. Iron and manganese frequently exceed North Carolina water-quality standards. Available nitrogen averages 1.21 mg/L and available phosphorus averages 0.21 mg/L at Lock 1

  19. Selective interaction of heparin with the variable region 3 within surface glycoprotein of laboratory-adapted feline immunodeficiency virus.

    Directory of Open Access Journals (Sweden)

    Qiong-Ying Hu

    Heparan sulfate proteoglycans (HSPG) can act as binding receptors for certain laboratory-adapted (TCA) strains of feline immunodeficiency virus (FIV) and human immunodeficiency virus (HIV). Heparin, a soluble heparan sulfate (HS), can inhibit TCA HIV and FIV entry mediated by HSPG interaction in vitro. In the present study, we further determined the selective interaction of heparin with the V3 loop of TCA FIV. Our current results indicate that heparin selectively inhibits infection by TCA strains, but not by field isolates (FS). Heparin also specifically interferes with TCA surface glycoprotein (SU) binding to CXCR4, by interactions with HSPG binding sites on the V3 loop of the FIV envelope protein. Peptides representing either the N- or C-terminal side of the V3 loop and containing HSPG binding sites were able to compete away the heparin block of TCA SU binding to CXCR4. Heparin does not interfere with the interaction of SU with anti-V3 antibodies that target the CXCR4 binding region or with the interaction between FS FIV and anti-V3 antibodies since FS SU has no HSPG binding sites within the HSPG binding region. Our data show that heparin blocks TCA FIV infection or entry not only through its competition of HSPG on the cell surface interaction with SU, but also by its interference with CXCR4 binding to SU. These studies aid in the design and development of heparin derivatives or analogues that can inhibit steps in virus infection and are informative regarding the HSPG/SU interaction.

  20. Impact of selected personal factors on seasonal variability of recreationist weather perceptions and preferences in Warsaw (Poland)

    Science.gov (United States)

    Lindner-Cendrowska, Katarzyna; Błażejczyk, Krzysztof

    2018-01-01

    Weather and climate are important natural resources for tourism and recreation, although sometimes they can make outdoor leisure activities less satisfying or even impossible. The aim of this work was to determine the seasonal variability of weather perception among people staying outdoors in an urban environment for tourism and recreation, and to determine whether personal factors influence recreationists' assessment of actual biometeorological conditions and their expectations towards weather elements. To investigate how human thermal sensations vary with meteorological conditions typical of a temperate climate, field studies of weather perception were conducted in Warsaw (Poland) in all seasons. Urban recreationists' preference for slightly warm thermal conditions and sunny, windless and cloudless weather was identified, and PET values considered optimal for sightseeing were found to lie between 27.3 and 31.7 °C. The results confirmed the existence of a phenomenon called alliesthesia, which manifested in divergent thermal perception of comparable biometeorological conditions in transitional seasons. The results suggest that recreationists' thermal sensations differed from other interviewees' responses and were affected not only by physiological processes but also by psychological factors (i.e. attitude, expectations). A significant impact of respondents' place of origin and its climate on thermal sensations and preferences was observed. Sex and age influence thermal preferences, whereas state of acclimatization is related to thermal sensations to some extent.

  1. Area-Selective ZnO Thin Film Deposition on Variable Microgap Electrodes and Their Impact on UV Sensing

    Directory of Open Access Journals (Sweden)

    Q. Humayun

    2013-01-01

    ZnO thin films were deposited on patterned gold electrodes using the sol-gel spin coating technique. A conventional photolithography process was used to obtain the variable microgaps of 30 and 43 μm in butterfly topology by using a zero-gap chrome mask. The structural, morphological, and electrical properties of the deposited thin films were characterized by X-ray diffraction (XRD), scanning electron microscope (SEM), and Keithley SourceMeter, respectively. The current-voltage (I-V) characterization was performed to investigate the effect of UV light on the fabricated devices. The fabricated ZnO sensors showed photo to dark current (Iph/Id) ratios of 6.26 for 30 μm and 5.28 for 43 μm gap electrode spacing, respectively. Dynamic responses of both fabricated sensors were observed up to 1 V with good reproducibility. At the applied voltage of 1 V, the response time was observed to be 4.817 s and 3.704 s while the recovery time was observed to be 0.3738 s and 0.2891 s for 30 and 43 μm gaps, respectively. The signal detection at low operating voltages suggested that the fabricated sensors could be used for miniaturized devices with low power consumption.

  2. The Use of Asymptotic Functions for Determining Empirical Values of CN Parameter in Selected Catchments of Variable Land Cover

    Science.gov (United States)

    Wałęga, Andrzej; Młyński, Dariusz; Wachulec, Katarzyna

    2017-12-01

    The aim of the study was to assess the applicability of asymptotic functions for determining the value of the CN parameter as a function of precipitation depth in mountain and upland catchments. The analyses were carried out in two catchments: the Rudawa, a left tributary of the Vistula, and the Kamienica, a right tributary of the Dunajec. The input material included data on precipitation and flows for the multi-year period 1980-2012, obtained from IMGW PIB in Warsaw. Two models were used to determine empirical values of the CNobs parameter as a function of precipitation depth: the standard Hawkins model and the 2-CN model, which allows for the heterogeneous nature of a catchment area. The analyses confirmed that asymptotic functions properly described the P-CNobs relationship for the entire range of precipitation variability. In the case of high rainfalls, CNobs remained above or below the commonly accepted average antecedent moisture conditions AMCII. The calculations indicated that the runoff amount calculated according to the original SCS-CN method might be underestimated, and this could adversely affect the values of design flows required for the design of hydraulic engineering projects. In catchments with heterogeneous land cover, the results of CNobs were more accurate when the 2-CN model was used instead of the standard Hawkins model. The 2-CN model is more precise in accounting for differences in runoff formation depending on the retention capacity of the substrate. It was also demonstrated that the commonly accepted initial abstraction coefficient λ = 0.20 yielded too large an initial loss of precipitation in the analyzed catchments and, therefore, the computed direct runoff was underestimated. The best results were obtained for λ = 0.05.
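
    The effect of the initial abstraction coefficient described above can be illustrated with a minimal sketch. It assumes the textbook SI form of the SCS-CN runoff equation and the commonly cited standard asymptotic form CN(P) = CN∞ + (100 − CN∞)·exp(−kP); the function names and the example values (P = 50 mm, CN = 70) are hypothetical and are not taken from the study. Reusing one CN value for both λ settings is a simplification made only for illustration.

```python
import math


def potential_retention_mm(cn: float) -> float:
    """Potential maximum retention S in mm for curve number CN (SI form)."""
    return 25400.0 / cn - 254.0


def direct_runoff_mm(p_mm: float, cn: float, lam: float = 0.20) -> float:
    """Direct runoff Q in mm from the SCS-CN equation with initial abstraction Ia = lam * S."""
    s = potential_retention_mm(cn)
    ia = lam * s
    if p_mm <= ia:
        return 0.0  # all precipitation is lost to initial abstraction
    return (p_mm - ia) ** 2 / (p_mm - ia + s)


def cn_asymptotic(p_mm: float, cn_inf: float, k: float) -> float:
    """Standard asymptotic behaviour: CN approaches cn_inf as precipitation grows."""
    return cn_inf + (100.0 - cn_inf) * math.exp(-k * p_mm)


if __name__ == "__main__":
    # Hypothetical event: 50 mm of precipitation on a catchment with CN = 70.
    for lam in (0.20, 0.05):
        print(lam, round(direct_runoff_mm(50.0, 70.0, lam), 2))
```

    With the same CN, the smaller coefficient leaves less precipitation in the initial loss term and therefore yields a larger computed runoff, which is the direction of the effect the abstract reports.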

  3. Variability and changes in selected climate elements in Madrid and Alicante in the period 2000-2014

    Directory of Open Access Journals (Sweden)

    Cielecka Katarzyna

    2015-10-01

    The aim of this study is to compare climatic conditions between the interior of the Iberian Peninsula and the southeastern coast of Spain. The article analyzes selected elements of climate over the last 15 years (2000-2014). Synoptic data from the airport meteorological stations Madrid Barajas and Alicante Elche were used. Attention was focused on annual air temperature, relative humidity and precipitation. The mean climatic conditions over the period 2000-2014 were compared with those over the 1961-1990 period, which is recommended by the WMO as the climate normal, and with data for 1971-2000 from the 'Climate Atlas' of the Spanish meteorological agency AEMET. Two of the climate elements discussed showed significant changes. The annual air temperature in 2000-2014 was higher than in 1961-1990 by about 0.2°C in Alicante and 0.9°C in Madrid. Winters in 2000-2014 were colder than in 1961-1990 at both stations. A gradual decrease in annual precipitation totals was observed at both stations. In 1961-1990 the annual average precipitation amounted to 414 mm in Madrid and 356 mm in Alicante. In 2000-2014 these totals were lower, reaching 364.1 mm in the central part of Spain and 245.7 mm on the southeastern coast.

  4. Multi-component determination and chemometric analysis of Paris polyphylla by ultra high performance liquid chromatography with photodiode array detection.

    Science.gov (United States)

    Chen, Pei; Jin, Hong-Yu; Sun, Lei; Ma, Shuang-Cheng

    2016-09-01

    Multi-source analysis of traditional Chinese medicine is key to ensuring its safety and efficacy. Compared with traditional experimental differentiation, chemometric analysis is a simpler strategy to identify traditional Chinese medicines. Multi-component analysis plays an increasingly vital role in the quality control of traditional Chinese medicines. A novel strategy, based on chemometric analysis and quantitative analysis of multiple components, was proposed to easily and effectively control the quality of traditional Chinese medicines such as Chonglou. Ultra high performance liquid chromatography proved convenient and efficient for this purpose. Five species of Chonglou were distinguished by chemometric analysis, and nine saponins, including Chonglou saponins I, II, V, VI, VII, D, and H, as well as dioscin and gracillin, were determined in 18 min. The method is feasible and credible, and can improve the quality control of traditional Chinese medicines and natural products. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. Genetic variability, local selection and demographic history: genomic evidence of evolving towards allopatric speciation in Asian seabass.

    Science.gov (United States)

    Wang, Le; Wan, Zi Yi; Lim, Huan Sein; Yue, Gen Hua

    2016-08-01

    Genomewide analysis of genetic divergence is critically important in understanding the genetic processes of allopatric speciation. We sequenced RAD tags of 131 Asian seabass individuals of six populations from South-East Asia and Australia/Papua New Guinea. Using 32 433 SNPs, we examined the genetic diversity and patterns of population differentiation across all the populations. We found significant evidence of genetic heterogeneity between South-East Asian and Australian/Papua New Guinean populations. The Australian/Papua New Guinean populations showed a rather lower level of genetic diversity. FST and principal components analysis revealed striking divergence between South-East Asian and Australian/Papua New Guinean populations. Interestingly, no evidence of contemporary gene flow was observed. The demographic history was further tested based on the folded joint site frequency spectrum. The scenario of ancient migration with historical population size changes was suggested to be the best fit model to explain the genetic divergence of Asian seabass between South-East Asia and Australia/Papua New Guinea. This scenario also revealed that Australian/Papua New Guinean populations were founded by ancestors from South-East Asia during mid-Pleistocene and were completely isolated from the ancestral population after the last glacial retreat. We also detected footprints of local selection, which might be related to differential ecological adaptation. The ancient gene flow was examined and deemed likely insufficient to counteract the genetic differentiation caused by genetic drift. The observed genomic pattern of divergence conflicted with the 'genomic islands' scenario. Altogether, Asian seabass have likely been evolving towards allopatric speciation since the split from the ancestral population during mid-Pleistocene. © 2016 John Wiley & Sons Ltd.

  6. Thermodynamic Study of the Ion-Pair Complexation Equilibria of Dye and Surfactant by Spectral Titration and Chemometric Analysis

    Directory of Open Access Journals (Sweden)

    Hakimeh Abbasi Awal

    2017-12-01

    Surfactant-dye interactions are very important in chemical and dyeing processes. Dyes interact strongly with surfactants and show new spectrophotometric properties, so UV-vis absorption spectrophotometry was used to study this process and extract thermodynamic parameters. In this work, the association equilibria between ionic dyes and ionic surfactants were studied by analyzing spectrophotometric data using chemometric methods. Crystal violet and methyl orange were selected as model cationic and anionic dyes, respectively. Sodium dodecyl sulphate and cetyltrimethylammonium bromide were selected as the anionic and cationic surfactants, respectively. Hard-model methods such as target transform fitting (TTF) and classical multi-wavelength fitting, and soft-model methods such as multivariate curve resolution (MCR), were used to analyze data recorded as a function of surfactant concentration in the premicellar and postmicellar regions. Hard-model methods were used to resolve the data with an ion-pair model in the premicellar region in order to extract the concentration and spectral profiles of the individual components and the related thermodynamic parameters. The equilibrium constants and other thermodynamic parameters of the interaction of the dyes with the surfactants were determined by studying the dependence of their absorption spectra on temperature in the range 293–308 K at concentrations of 5 × 10⁻⁶ M and 8 × 10⁻⁶ M for crystal violet and methyl orange, respectively. In the postmicellar region, the MCR-ALS method was applied to resolve the data and obtain the spectra and concentration profiles in complex mixtures of dyes and surfactants.
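
    The abstract does not state how the thermodynamic parameters were derived from the temperature dependence. One common route is a van't Hoff analysis, ln K = −ΔH/(RT) + ΔS/R, sketched below as an assumption rather than as the authors' procedure. The function names are hypothetical, and the equilibrium constants in the usage example are synthetic values generated from assumed ΔH and ΔS, not data from the paper.

```python
import math

R = 8.314  # gas constant in J mol^-1 K^-1


def vant_hoff_fit(temps_k, k_values):
    """Least-squares fit of ln K against 1/T; returns (delta_h, delta_s) in J/mol and J/(mol K)."""
    x = [1.0 / t for t in temps_k]
    y = [math.log(k) for k in k_values]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # slope = -delta_h / R and intercept = delta_s / R
    return -slope * R, intercept * R


def gibbs_energy(delta_h, delta_s, temp_k):
    """Gibbs energy change in J/mol at temperature temp_k."""
    return delta_h - temp_k * delta_s


if __name__ == "__main__":
    # Synthetic constants over the 293-308 K range mentioned in the abstract.
    temps = [293.0, 298.0, 303.0, 308.0]
    ks = [math.exp(20000.0 / (R * t) + 30.0 / R) for t in temps]
    print(vant_hoff_fit(temps, ks))
```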

  7. Iron porphyrins doped sol-gel glasses: a chemometric study

    International Nuclear Information System (INIS)

    Sacco, Herica C.; Vidoto, Ednalva A.; Nascimento, Otaciro R.

    2000-01-01

    This paper describes the optimized conditions for preparation of iron porphyrin-template doped silica Fe (PDS-template) obtained by the sol-gel process. The following porphyrins (Fe P) were used: Fe TFPP Cl, Fe TDCSPP(Na)₄Cl and Fe TCPP(Na)₄Cl. Pyridine or 4-phenylimidazole was used as template. The variables that present significant influence on iron porphyrin loading on xerogel were identified and the values that maximize the iron porphyrin loading on xerogel were established. The variables (Solvent volume, fractional factorial design in two levels, 2⁵⁻¹ type, generating 16 total experiments for each Fe P studied. (author)

  8. Iron porphyrins doped sol-gel glasses: a chemometric study

    Energy Technology Data Exchange (ETDEWEB)

    Sacco, Herica C.; Vidoto, Ednalva A.; Nascimento, Otaciro R. [Sao Paulo Univ. (USP), Sao Carlos (Brazil). Inst. de Fisica; Biazzotto, Juliana C.; Serra, Osvaldo A.; Iamamoto, Yassuko [Sao Paulo Univ. (USP), Ribeirao Preto, SP (Brazil). Faculdade de Filosofia, Ciencias e Letras; Ciuffi, Katia J.; Mello, Cesar A.; Oliveira, Daniela C. de [Universidade de Franca, SP (Brazil)

    2000-07-01

    This paper describes the optimized conditions for preparation of iron porphyrin-template doped silica Fe (PDS-template) obtained by the sol-gel process. The following porphyrins (Fe P) were used: Fe TFPP Cl, Fe TDCSPP(Na)₄Cl and Fe TCPP(Na)₄Cl. Pyridine or 4-phenylimidazole was used as template. The variables that present significant influence on iron porphyrin loading on xerogel were identified and the values that maximize the iron porphyrin loading on xerogel were established. The variables Solvent volume, fractional factorial design in two levels, 2⁵⁻¹ type, generating 16 total experiments for each Fe P studied. (author)
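
    A two-level 2⁵⁻¹ fractional factorial design of the kind mentioned above has 16 runs. The sketch below shows one conventional way to build such a design: a full factorial in four factors with the fifth set by the generator E = ABCD. The abstract does not give the generator or the full list of variables, so the generator and the labels A to E are assumptions made for illustration.

```python
from itertools import product


def half_fraction_2_5_1():
    """Return the 16 runs of a 2^(5-1) design as tuples of coded levels (-1 or +1)."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        e = a * b * c * d  # assumed generator E = ABCD (defining relation I = ABCDE)
        runs.append((a, b, c, d, e))
    return runs


if __name__ == "__main__":
    print(" A  B  C  D  E")
    for run in half_fraction_2_5_1():
        print(" ".join(f"{level:+d}" for level in run))
```

    Each column is balanced (eight runs at each level), so main effects can be estimated from half the runs of the full 2⁵ design.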

  9. Area- and depth- weighted averages of selected SSURGO variables for the conterminous United States and District of Columbia

    Science.gov (United States)

    Wieczorek, Michael

    2014-01-01

    This digital data release consists of seven data files of soil attributes for the United States and the District of Columbia. The files are derived from the Natural Resources Conservation Service's (NRCS) Soil Survey Geographic database (SSURGO). The data files can be linked to the raster datasets of soil mapping unit identifiers (MUKEY) available through the NRCS's Gridded Soil Survey Geographic (gSSURGO) database (http://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/geo/?cid=nrcs142p2_053628). The associated files, named DRAINAGECLASS, HYDRATING, HYDGRP, HYDRICCONDITION, LAYER, TEXT, and WTDEP are area- and depth-weighted average values for selected soil characteristics from the SSURGO database for the conterminous United States and the District of Columbia. The SSURGO tables were acquired from the NRCS on March 5, 2014. The soil characteristics in the DRAINAGE table are drainage class (DRNCLASS), which identifies the natural drainage conditions of the soil and refers to the frequency and duration of wet periods. The soil characteristics in the HYDRATING table are hydric rating (HYDRATE), a yes/no field that indicates whether or not a map unit component is classified as a "hydric soil". The soil characteristics in the HYDGRP table are the percentages for each hydrologic group per MUKEY. The soil characteristics in the HYDRICCONDITION table are hydric condition (HYDCON), which describes the natural condition of the soil component. The soil characteristics in the LAYER table are available water capacity (AVG_AWC), bulk density (AVG_BD), saturated hydraulic conductivity (AVG_KSAT), vertical saturated hydraulic conductivity (AVG_KV), soil erodibility factor (AVG_KFACT), porosity (AVG_POR), field capacity (AVG_FC), the soil fraction passing a number 4 sieve (AVG_NO4), the soil fraction passing a number 10 sieve (AVG_NO10), the soil fraction passing a number 200 sieve (AVG_NO200), and organic matter (AVG_OM). The soil characteristics in the TEXT table are

  10. Chemometric brand differentiation of commercial spices using direct analysis in real time mass spectrometry.

    Science.gov (United States)

    Pavlovich, Matthew J; Dunn, Emily E; Hall, Adam B

    2016-05-15

    Commercial spices represent an emerging class of fuels for improvised explosives. Being able to classify such spices not only by type but also by brand would represent an important step in developing methods to analytically investigate these explosive compositions. Therefore, a combined ambient mass spectrometric/chemometric approach was developed to quickly and accurately classify commercial spices by brand. Direct analysis in real time mass spectrometry (DART-MS) was used to generate mass spectra for samples of black pepper, cayenne pepper, and turmeric, along with four different brands of cinnamon, all dissolved in methanol. Unsupervised learning techniques showed that the cinnamon samples clustered according to brand. Then, we used supervised machine learning algorithms to build chemometric models with a known training set and classified the brands of an unknown testing set of cinnamon samples. Ten independent runs of five-fold cross-validation showed that the training set error for the best-performing models (i.e., the linear discriminant and neural network models) was lower than 2%. The false-positive percentages for these models were 3% or lower, and the false-negative percentages were lower than 10%. In particular, the linear discriminant model perfectly classified the testing set with 0% error. Repeated iterations of training and testing gave similar results, demonstrating the reproducibility of these models. Chemometric models were able to classify the DART mass spectra of commercial cinnamon samples according to brand, with high specificity and low classification error. This method could easily be generalized to other classes of spices, and it could be applied to authenticating questioned commercial samples of spices or to examining evidence from improvised explosives. Copyright © 2016 John Wiley & Sons, Ltd.

  11. Application of Raman spectroscopy and chemometric techniques to assess sensory characteristics of young dairy bull beef.

    Science.gov (United States)

    Zhao, Ming; Nian, Yingqun; Allen, Paul; Downey, Gerard; Kerry, Joseph P; O'Donnell, Colm P

    2018-05-01

    This work aims to develop a rapid analytical technique to predict beef sensory attributes using Raman spectroscopy (RS) and to investigate correlations between sensory attributes using chemometric analysis. Beef samples (n = 72) were obtained from young dairy bulls (Holstein-Friesian and Jersey×Holstein-Friesian) slaughtered at 15 and 19 months old. Trained sensory panel evaluation and Raman spectral data acquisition were both carried out on the same longissimus thoracis muscles after ageing for 21 days. The best prediction results were obtained using a Raman frequency range of 1300-2800 cm⁻¹. Prediction performance of partial least squares regression (PLSR) models developed using all samples was moderate to high for all sensory attributes (R²CV values of 0.50-0.84 and RMSECV values of 1.31-9.07) and was particularly high for desirable flavour attributes (R²CVs of 0.80-0.84, RMSECVs of 4.21-4.65). For PLSR models developed on subsets of beef samples, i.e. beef of an identical age or breed type, significant improvements in prediction performance were achieved for overall sensory attributes (R²CVs of 0.63-0.89 and RMSECVs of 0.38-6.88 for each breed type; R²CVs of 0.52-0.89 and RMSECVs of 0.96-6.36 for each age group). Chemometric analysis revealed strong correlations between sensory attributes. Raman spectroscopy combined with chemometric analysis was demonstrated to have high potential as a rapid and non-destructive technique to predict the sensory quality traits of young dairy bull beef. Copyright © 2018. Published by Elsevier Ltd.
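
    The cross-validated statistics quoted above (R²CV and RMSECV) can be illustrated with a minimal sketch. It deliberately uses a one-variable least-squares line as a stand-in for PLSR, and it computes R²CV as 1 − SSE/SST, which is one of several definitions in use; the abstract does not say which definition or which cross-validation scheme the authors applied. All function names and the interleaved fold assignment are assumptions.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kfold_cv_predictions(x, y, k=5):
    """Predict every sample from a model fitted without its own fold."""
    n = len(x)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in range(fold, n, k):  # held-out samples of this fold
            preds[i] = slope * x[i] + intercept
    return preds


def rmsecv(y, preds):
    """Root mean square error of the cross-validated predictions."""
    return math.sqrt(sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y))


def r2cv(y, preds):
    """Cross-validated coefficient of determination, 1 - SSE/SST."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst
```

    RMSECV is expressed in the units of the sensory score, so values are only comparable between attributes scored on the same scale.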

  12. Application of high performance liquid chromatography for the profiling of complex chemical mixtures with the aid of chemometrics.

    Science.gov (United States)

    Ni, Yongnian; Zhang, Liangsheng; Churchill, Jane; Kokot, Serge

    2007-06-15

    In this paper, chemometrics methods were applied to resolve the high performance liquid chromatography (HPLC) fingerprints of complex, many-component substances to compare samples from a batch from a given manufacturer, or from those of different producers. As an example of such complex substances, we used a common Chinese traditional medicine, Huoxiang Zhengqi Tincture (HZT), for this research. Twenty-one samples, each representing a separate HZT production batch from one of three manufacturers, were analyzed by HPLC with the aid of a diode array detector (DAD). An Agilent Zorbax Eclipse XDB-C18 column with an Agilent Zorbax high pressure reliance cartridge guard-column was used. The mobile phase consisted of water (A) and methanol (B) with a gradient program of 25-65% (v/v, B) during 0-30 min, 65-55% (v/v, B) during 30-35 min and 55-100% (v/v, B) during 35-60 min (flow rate, 1.0 mL min⁻¹; injection volume, 20 μL; column temperature, ambient). The detection wavelength was adjusted for maximum sensitivity at different time periods. A peak area matrix with 21 objects × 14 HPLC variables was obtained by sampling each chromatogram at 14 common retention times. Similarities were then calculated to discriminate the batch-to-batch samples, and a more informative multi-criteria decision making (MCDM) methodology, PROMETHEE and GAIA, was applied to obtain more information from the chromatograms in order to rank and compare the complex HZT profiles. The results showed that with the MCDM analysis, it was possible to match and discriminate correctly the batch samples from the three different manufacturers. Fourier transform infrared (FT-IR) spectra taken from samples from several batches were compared by the common similarity method with the HPLC results. It was found that the FT-IR spectra did not discriminate the samples from the different batches.
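
    The abstract does not specify which similarity measure was used. A frequent choice for chromatographic fingerprints is the cosine of the angle between each sample's peak-area vector and a reference vector such as the mean profile; the sketch below assumes that choice. The function names are hypothetical, and a real input would be a list of 21 rows with 14 peak areas each.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two peak-area vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_to_mean(profiles):
    """Similarity of every sample (row) to the column-wise mean profile."""
    n = len(profiles)
    mean_profile = [sum(column) / n for column in zip(*profiles)]
    return [cosine_similarity(row, mean_profile) for row in profiles]
```

    A single similarity index compresses each fingerprint to one number, which may explain why the authors turned to the multi-criteria PROMETHEE and GAIA ranking for a more informative comparison.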

  13. Synergistic effect of the simultaneous chemometric analysis of ¹H NMR spectroscopic and stable isotope (SNIF-NMR, ¹⁸O, ¹³C) data: Application to wine analysis

    Energy Technology Data Exchange (ETDEWEB)

    Monakhova, Yulia B., E-mail: yul-monakhova@mail.ru [Chemisches und Veterinäruntersuchungsamt (CVUA) Karlsruhe, Weissenburger Strasse 3, Karlsruhe 76187 (Germany); Bruker Biospin GmbH, Silberstreifen, Rheinstetten 76287 (Germany); Department of Chemistry, Saratov State University, Astrakhanskaya Street 83, Saratov 410012 (Russian Federation); Godelmann, Rolf [Chemisches und Veterinäruntersuchungsamt (CVUA) Karlsruhe, Weissenburger Strasse 3, Karlsruhe 76187 (Germany); Hermann, Armin [Landesuntersuchungsamt -Institut für Lebensmittelchemie und Arzneimittelprüfung, Emy-Roeder-Straße 1, Mainz 55129 (Germany); Kuballa, Thomas [Chemisches und Veterinäruntersuchungsamt (CVUA) Karlsruhe, Weissenburger Strasse 3, Karlsruhe 76187 (Germany); Cannet, Claire; Schäfer, Hartmut; Spraul, Manfred [Bruker Biospin GmbH, Silberstreifen, Rheinstetten 76287 (Germany); Rutledge, Douglas N. [AgroParisTech, UMR 1145, Ingénierie Procédés Aliments, 16 rue Claude Bernard, Paris F-75005 (France)

    2014-06-23

    Highlights: • ¹H NMR profilings of 718 wines were fused with stable isotope analysis data (SNIF-NMR, ¹⁸O, ¹³C). • The best improvement was obtained for prediction of the geographical origin of wine. • Certain enhancement was also obtained for the year of vintage (from 88 to 97% for ¹H NMR to 99% for the fused data). • Independent component analysis was used as an alternative chemometric tool for classification. - Abstract: It is known that ¹H NMR spectroscopy represents a good tool for predicting the grape variety, the geographical origin, and the year of vintage of wine. In the present study we have shown that classification models can be improved when ¹H NMR profiles are fused with stable isotope (SNIF-NMR, ¹⁸O, ¹³C) data. Variable selection based on clustering of latent variables was performed on ¹H NMR data. Afterwards, the combined data of 718 wine samples from Germany were analyzed using linear discriminant analysis (LDA), partial least squares-discriminant analysis (PLS-DA), factorial discriminant analysis (FDA) and independent components analysis (ICA). Moreover, several specialized multiblock methods (common components and specific weights analysis (ComDim), consensus PCA and consensus PLS-DA) were applied to the data. The best improvement in comparison with ¹H NMR data was obtained for prediction of the geographical origin (up to 100% for the fused data, whereas stable isotope data resulted only in 60–70% correct prediction and ¹H NMR data alone in 82–89% respectively). Certain enhancement was obtained also for the year of vintage (from 88 to 97% for ¹H NMR to 99% for the fused data), whereas in case of grape varieties improved models were not obtained. The combination of ¹H NMR data with stable isotope data improves efficiency of classification models for geographical origin and vintage of wine and can be potentially used for other food products as well.

  14. The Impact of Variability of Selected Geological and Mining Parameters on the Value and Risks of Projects in the Hard Coal Mining Industry

    Science.gov (United States)

    Kopacz, Michał

    2017-09-01

    The paper attempts to assess the impact of variability of selected geological (deposit) parameters on the value and risks of projects in the hard coal mining industry. The study was based on simulated discounted cash flow analysis, while the results were verified for three existing bituminous coal seams. The Monte Carlo simulation was based on the nonparametric bootstrap method, while correlations between individual deposit parameters were replicated using an empirical copula. The calculations take into account the uncertainty in the parameters of the empirical distributions of the deposit variables. The Net Present Value (NPV) and the Internal Rate of Return (IRR) were selected as the main measures of value and risk, respectively. The impact of volatility and correlation of deposit parameters was analyzed in two aspects, by identifying the overall effect of the correlated variability of the parameters and the individual impact of the correlation on the NPV and IRR. For this purpose, a differential approach was used, which allows the possible errors in the calculation of these measures to be determined in numerical terms. Based on the study it can be concluded that the mean value of the overall effect of the variability does not exceed 11.8% of NPV and 2.4 percentage points of IRR. Neglecting the correlations results in overestimating the NPV and the IRR by up to 4.4% and 0.4 percentage points, respectively. It should be noted, however, that the differences in NPV and IRR values can vary significantly, while their interpretation depends on the likelihood of implementation. Generalizing the obtained results, based on the average values, the maximum value of the risk premium in the given calculation conditions of the "X" deposit, and the correspondingly large datasets (greater than 2500), should not be higher than 2.4 percentage points.
The impact of the analyzed geological parameters on the NPV and IRR depends primarily on their co-existence, which can be

  15. Chemometric characterization of the hydrogen bonding complexes of secondary amides and aromatic hydrocarbons

    Directory of Open Access Journals (Sweden)

    Jović Branislav

    2012-01-01

    The paper reports the results of a study of hydrogen bonding complexes between secondary amides and various aromatic hydrocarbons. The possibility of using chemometric methods to characterize N-H•••π hydrogen bonded complexes was investigated. Hierarchical clustering and Principal Component Analysis (PCA) were applied to infrared spectroscopic and Taft parameters of 43 N-substituted amide complexes with different aromatic hydrocarbons. The results obtained are in good agreement with the conclusions of other spectroscopic and thermodynamic analyses.

  16. Evolving chemometric models for predicting dynamic process parameters in viscose production

    Energy Technology Data Exchange (ETDEWEB)

    Cernuda, Carlos [Department of Knowledge-Based Mathematical Systems, Johannes Kepler University Linz (Austria); Lughofer, Edwin, E-mail: edwin.lughofer@jku.at [Department of Knowledge-Based Mathematical Systems, Johannes Kepler University Linz (Austria); Suppan, Lisbeth [Kompetenzzentrum Holz GmbH, St. Peter-Str. 25, 4021 Linz (Austria); Roeder, Thomas; Schmuck, Roman [Lenzing AG, 4860 Lenzing (Austria); Hintenaus, Peter [Software Research Center, Paris Lodron University Salzburg (Austria); Maerzinger, Wolfgang [i-RED Infrarot Systeme GmbH, Linz (Austria); Kasberger, Juergen [Recendt GmbH, Linz (Austria)

    2012-05-06

    Highlights: • Quality assurance of process parameters in viscose production. • Automatic prediction of spin-bath concentrations based on FT-NIR spectra. • Evolving chemometric models for efficiently handling changing system dynamics over time (no time-intensive re-calibration needed). • Significant reduction of huge errors produced by statistical state-of-the-art calibration methods. • Sufficient flexibility achieved by gradual forgetting mechanisms. - Abstract: In viscose production, it is important to monitor three process parameters in order to assure a high quality of the final product: the concentrations of H₂SO₄, Na₂SO₄ and ZnSO₄. During on-line production these process parameters usually show quite high dynamics depending on the fiber type that is produced. Thus, conventional chemometric models, which are trained on calibration spectra collected from Fourier transform near infrared (FT-NIR) measurements and kept fixed during the whole lifetime of the on-line process, behave quite imprecisely and unreliably when predicting the concentrations for new on-line data. In this paper, we demonstrate evolving chemometric models which are able to adapt automatically to varying process dynamics by updating their inner structures and parameters in a single-pass incremental manner. These models exploit the Takagi-Sugeno fuzzy model architecture, which can flexibly model the different degrees of non-linearity implicitly contained in the mapping between near infrared (NIR) spectra and reference values. Updating the inner structures is achieved by moving the position of already existing local regions and by evolving (increasing non-linearity) or merging (decreasing non-linearity) new local linear predictors on demand, which are guided by distance-based and similarity criteria.
Gradual

  17. Classification of different tomato seed cultivars by multispectral visible-near infrared spectroscopy and chemometrics

    DEFF Research Database (Denmark)

    Shrestha, Santosh; Deleuran, Lise Christina; Gislum, René

    2016-01-01

    nm were extracted from multispectral images of tomato seeds. Principal component analysis (PCA) was used for data exploration, while partial least squares discriminant analysis (PLS-DA) and support vector machine discriminant analysis (SVM-DA) were used to classify the five different tomato cultivars....... The results showed very good classification accuracy for two independent test sets ranging from 94% to 100% for all tomato cultivars irrespective of chemometric methods. The overall classification error rates were 3.2% and 0.4% for the PLS-DA and SVM-DA calibration models, respectively. The results indicate...

  18. Structural Analysis of Multi-component Amyloid Systems by Chemometric SAXS Data Decomposition

    DEFF Research Database (Denmark)

    Trillo, Isabel Fatima Herranz; Jensen, Minna Grønning; van Maarschalkerweerd, Andreas

    2017-01-01

    Formation of amyloids is the hallmark of several neurodegenerative pathologies. Structural investigation of these complex transformation processes poses significant experimental challenges due to the co-existence of multiple species. The additive nature of small-angle X-ray scattering (SAXS) data...... least squares (MCR-ALS) chemometric method. The approach enables rigorous and robust decomposition of synchrotron SAXS data by simultaneously introducing these data in different representations that emphasize molecular changes at different time and structural resolution ranges. The approach has allowed...

  19. (Poly)phenolic fingerprint and chemometric analysis of white (Morus alba L.) and black (Morus nigra L.) mulberry leaves by using a non-targeted UHPLC-MS approach.

    Science.gov (United States)

    Sánchez-Salcedo, Eva M; Tassotti, Michele; Del Rio, Daniele; Hernández, Francisca; Martínez, Juan José; Mena, Pedro

    2016-12-01

    This study reports the (poly)phenolic fingerprinting and chemometric discrimination of leaves of eight mulberry clones from Morus alba and Morus nigra cultivated in Spain. UHPLC-MS(n) (Ultra High Performance Liquid Chromatography-Mass Spectrometry) high-throughput analysis allowed the tentative identification of a total of 31 compounds. The phenolic profile of mulberry leaf was characterized by the presence of a high number of flavonol derivatives, mainly glycosylated forms of quercetin and kaempferol. Caffeoylquinic acids, simple phenolic acids, and some organic acids were also detected. Seven compounds were identified for the first time in mulberry leaves. The chemometric analysis (cluster analysis and principal component analysis) of the chromatographic data allowed the characterization of the different mulberry clones and served to explain the great intraspecific variability in mulberry secondary metabolism. This screening of the complete phenolic profile of mulberry leaves can assist the increasing interest for purposes related to quality control, germplasm screening, and bioactivity evaluation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. [Study of near infrared spectral preprocessing and wavelength selection methods for endometrial cancer tissue].

    Science.gov (United States)

    Zhao, Li-Ting; Xiang, Yu-Hong; Dai, Yin-Mei; Zhang, Zhuo-Yong

    2010-04-01

    Near infrared spectroscopy was applied to tissue slices of endometrial tissue to collect spectra. A total of 154 spectra were obtained from 154 samples. The numbers of normal, hyperplasia, and malignant samples were 36, 60, and 58, respectively. Original near infrared spectra contain many variables, including interference information such as instrument errors and physical effects such as particle size and light scatter. In order to reduce these influences, the original spectral data should be treated with spectral preprocessing methods to compress the variables and extract useful information. Spectral preprocessing and wavelength selection methods therefore play an important role in near infrared spectroscopy. In the present paper the raw spectra were processed using various preprocessing methods including first derivative, multiplicative scatter correction, the Savitzky-Golay first derivative algorithm, standard normal variate, smoothing, and moving-window median. Standard deviation was used to select the optimal spectral region of 4 000-6 000 cm⁻¹. Principal component analysis was then used for classification. The principal component analysis results showed that the three types of samples could be discriminated completely, with an accuracy of almost 100%. This study demonstrated that near infrared spectroscopy combined with chemometric methods could be a fast, efficient, and novel means of diagnosing cancer. The proposed methods would be a promising and significant technique for diagnosing early stage cancer.
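
    Three of the preprocessing steps named above can be sketched in a few lines. The versions below are generic textbook forms (standard normal variate with the sample standard deviation, a simple forward-difference first derivative, and a moving-window median with shrinking windows at the edges); the window sizes, edge handling and function names are assumptions and not the settings used in the study.

```python
import math
from statistics import median


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


def first_derivative(spectrum, step=1.0):
    """Forward-difference first derivative; the result is one point shorter."""
    return [(b - a) / step for a, b in zip(spectrum, spectrum[1:])]


def moving_median(spectrum, window=3):
    """Moving-window median filter; the window shrinks at both ends of the spectrum."""
    half = window // 2
    filtered = []
    for i in range(len(spectrum)):
        lo = max(0, i - half)
        hi = min(len(spectrum), i + half + 1)
        filtered.append(median(spectrum[lo:hi]))
    return filtered
```

    SNV is applied to each spectrum independently, so it needs no reference spectrum, whereas multiplicative scatter correction regresses each spectrum against a reference such as the mean spectrum.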

  1. Spectrophotometric and chemometric methods for determination of imipenem, ciprofloxacin hydrochloride, dexamethasone sodium phosphate, paracetamol and cilastatin sodium in human urine

    Science.gov (United States)

    El-Kosasy, A. M.; Abdel-Aziz, Omar; Magdy, N.; El Zahar, N. M.

    2016-03-01

    New accurate, sensitive and selective spectrophotometric and chemometric methods were developed and subsequently validated for determination of imipenem (IMP), ciprofloxacin hydrochloride (CIPRO), dexamethasone sodium phosphate (DEX), paracetamol (PAR) and cilastatin sodium (CIL) in human urine. These methods include a new derivative ratio method, namely extended derivative ratio (EDR), principal component regression (PCR) and partial least-squares (PLS) methods. A novel EDR method was developed for the determination of these drugs, where each component in the mixture was determined by using a mixture of the other four components as divisor. Peak amplitudes were recorded at 293.0 nm, 284.0 nm, 276.0 nm, 257.0 nm and 221.0 nm within linear concentration ranges 3.00-45.00, 1.00-15.00, 4.00-40.00, 1.50-25.00 and 4.00-50.00 μg mL-1 for IMP, CIPRO, DEX, PAR and CIL, respectively. PCR and PLS-2 models were established for simultaneous determination of the studied drugs in the range of 3.00-15.00, 1.00-13.00, 4.00-12.00, 1.50-9.50, and 4.00-12.00 μg mL-1 for IMP, CIPRO, DEX, PAR and CIL, respectively, by using eighteen mixtures as calibration set and seven mixtures as validation set. The suggested methods were validated according to the International Conference on Harmonisation (ICH) guidelines and the results revealed that they were accurate, precise and reproducible. The obtained results were statistically compared with those of the published methods and there was no significant difference.

  2. New insight into protein-nanomaterial interactions with UV-visible spectroscopy and chemometrics: human serum albumin and silver nanoparticles.

    Science.gov (United States)

    Wang, Yong; Ni, Yongnian

    2014-01-21

    In recent years, great efforts have focused on the exploration and fabrication of protein nanoconjugates due to potential applications in many fields including bioanalytical science, biosensors, biocatalysis, biofuel cells and bio-based nanodevices. An important aspect of our understanding of protein nanoconjugates is to quantitatively understand how proteins interact with nanomaterials. In this report, human serum albumin (HSA) and citrate-coated silver nanoparticles (AgNPs) are selected as a case study of protein-nanomaterial interactions. UV-visible spectroscopy together with multivariate curve resolution by alternating least squares (MCR-ALS) algorithm is first exploited for the detailed study of AgNPs-HSA interactions. Introduction of the chemometrics tool allows extracting the kinetic profiles, spectra and distribution diagrams of two major absorbing pure species (AgNPs and AgNPs-HSA conjugate). These resolved profiles are then analysed to give the thermodynamic, kinetic and structural information of HSA binding to AgNPs. Transmission electron microscopy, circular dichroism spectroscopy and Fourier transform infrared spectroscopy are used to further characterize the complex system. Moreover, a sensitive spectroscopic biosensor for HSA is fabricated with the MCR-ALS resolved concentration of absorbing pure species. It is found that the linear range for the HSA nanosensor was from 1.9 nM to 45.0 nM with a detection limit of 0.9 nM. It is believed that the proposed method will play an important role in the fabrication and optimization of a robust nanobiosensor or cross-reactive sensors array for the detection and identification of biocomponents.

  3. Authentication of vegetable oils on the basis of their physico-chemical properties with the aid of chemometrics.

    Science.gov (United States)

    Zhang, Guowen; Ni, Yongnian; Churchill, Jane; Kokot, Serge

    2006-09-15

    In food production, reliable analytical methods for confirmation of purity or degree of spoilage are required by growers, food quality assessors, processors, and consumers. Seven parameters of physico-chemical properties, such as acid number, colority, density, refractive index, moisture and volatility, saponification value and peroxide value, were measured for quality and adulterated soybean, as well as quality and rancid rapeseed oils. Chemometrics methods were then applied for qualitative and quantitative discrimination and prediction of the oils by methods such as exploratory principal component analysis (PCA), partial least squares (PLS), radial basis function-artificial neural networks (RBF-ANN), and multi-criteria decision making methods (MCDM), PROMETHEE and GAIA. In general, the soybean and rapeseed oils were discriminated by PCA, and the two spoilt oils behaved differently, with the rancid rapeseed samples exhibiting more object scatter on the PC-scores plot than the adulterated soybean oil. For the PLS and RBF-ANN prediction methods, suitable training models were devised, which were able to predict satisfactorily the category of the four different oil samples in the verification set. Rank ordering with the use of MCDM models indicated that the oil types can be discriminated on the PROMETHEE II scale. For the first time, it was demonstrated how ranking of oil objects with the use of PROMETHEE and GAIA could be utilized as a versatile indicator of quality performance of products on the basis of a standard selected by the stakeholder. In principle, this approach provides a very flexible method for assessment of product quality directly from the measured data.
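
The PROMETHEE II rank ordering mentioned above can be sketched as a net outranking flow. This is an illustration only, not the study's model: it assumes the "usual" preference function (a strict win on a criterion counts fully), equal weights, and made-up values for two criteria that are both to be minimised. The function name `promethee_ii` and the sample labels are hypothetical.

```python
def promethee_ii(scores, weights, maximise):
    """Return the PROMETHEE II net flow (phi) of every object,
    using the 'usual' preference function on each criterion."""
    names = list(scores)
    n = len(names)
    total_w = sum(weights)

    def pi(a, b):
        # Weighted share of criteria on which a is strictly better than b.
        pref = 0.0
        for j, w in enumerate(weights):
            d = scores[a][j] - scores[b][j]
            if not maximise[j]:
                d = -d  # for a criterion to minimise, smaller is better
            if d > 0:
                pref += w
        return pref / total_w

    phi = {}
    for a in names:
        plus = sum(pi(a, b) for b in names if b != a) / (n - 1)   # leaving flow
        minus = sum(pi(b, a) for b in names if b != a) / (n - 1)  # entering flow
        phi[a] = plus - minus
    return phi


# Hypothetical oils scored on [acid number, peroxide value]; lower is better.
oils = {"fresh": [0.2, 2.0], "aged": [0.6, 6.0], "rancid": [1.5, 20.0]}
phi = promethee_ii(oils, weights=[1.0, 1.0], maximise=[False, False])
ranking = sorted(phi, key=phi.get, reverse=True)
```

The net flow runs from -1 to +1, so objects can be ordered on a single scale; other preference functions and stakeholder-chosen weights would change the flows but not the procedure.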

  4. Chemical Profiling of the Essential Oils of Syzygium aqueum, Syzygium samarangense and Eugenia uniflora and Their Discrimination Using Chemometric Analysis.

    Science.gov (United States)

    Sobeh, Mansour; Braun, Markus Santhosh; Krstin, Sonja; Youssef, Fadia S; Ashour, Mohamed L; Wink, Michael

    2016-11-01

    The essential oil compositions of the leaves of three related Myrtaceae species, namely Syzygium aqueum, Syzygium samarangense and Eugenia uniflora, were investigated using GLC/MS and GLC/FID. Altogether, 125 compounds were identified: α-selinene (13.85%), β-caryophyllene (12.72%) and β-selinene constitute the most abundant constituents in S. aqueum. Germacrene D (21.62%) represents the major compound in S. samarangense whereas in E. uniflora, spathulenol (15.80%) represents the predominant component. Multivariate chemometric analyses were used to discriminate the essential oils using hierarchical cluster analysis (HCA) and principal component analysis (PCA) based on the chromatographic results. The antimicrobial activity of the popularly used E. uniflora essential oil was assessed using broth microdilution method against six Gram-positive, three Gram-negative bacteria and two fungi. The oil showed moderate antimicrobial activity against Bacillus licheniformis exhibiting MIC and MMC of 0.63 mg/ml. The cytotoxic activity of E. uniflora essential oil was investigated against Trypanosoma brucei brucei (T. b. brucei) and MCF-7 cancer cell line using MTT assay. It showed moderate activity against MCF-7 cells with an IC50 value of 76.40 μg/ml. On the other hand, T. brucei was highly susceptible to E. uniflora essential oil with IC50 of 11.20 μg/ml, and a selectivity index of 6.82. © 2016 Wiley-VHCA AG, Zurich, Switzerland.
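
The abstract does not state how the selectivity index was computed. Assuming it is the ratio of the IC50 against MCF-7 cells to the IC50 against T. brucei, the two reported values reproduce the reported index:

```python
# IC50 values as reported in the abstract (μg/ml).
ic50_mcf7 = 76.40      # MCF-7 cancer cell line
ic50_t_brucei = 11.20  # Trypanosoma brucei brucei

# Assumed definition: ratio of the two IC50 values.
selectivity_index = ic50_mcf7 / ic50_t_brucei
print(round(selectivity_index, 2))  # 6.82
```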

  5. Comprehensive analysis of yeast metabolite GC x GC-TOFMS data: combining discovery-mode and deconvolution chemometric software.

    Science.gov (United States)

    Mohler, Rachel E; Dombek, Kenneth M; Hoggard, Jamin C; Pierce, Karisa M; Young, Elton T; Synovec, Robert E

    2007-08-01

    The first extensive study of yeast metabolite GC x GC-TOFMS data from cells grown under fermenting, R, and respiring, DR, conditions is reported. In this study, recently developed chemometric software for use with three-dimensional instrumentation data was implemented, using a statistically-based Fisher ratio method. The Fisher ratio method is fully automated and will rapidly reduce the data to pinpoint two-dimensional chromatographic peaks differentiating sample types while utilizing all the mass channels. The effect of lowering the Fisher ratio threshold on peak identification was studied. At the lowest threshold (just above the noise level), 73 metabolite peaks were identified, nearly three-fold greater than the number of previously reported metabolite peaks identified (26). In addition to the 73 identified metabolites, 81 unknown metabolites were also located. A Parallel Factor Analysis graphical user interface (PARAFAC GUI) was applied to selected mass channels to obtain a concentration ratio, for each metabolite under the two growth conditions. Of the 73 known metabolites identified by the Fisher ratio method, 54 were statistically changing to the 95% confidence limit between the DR and R conditions according to the rigorous Student's t-test. PARAFAC determined the concentration ratio and provided a fully-deconvoluted (i.e. mathematically resolved) mass spectrum for each of the metabolites. The combination of the Fisher ratio method with the PARAFAC GUI provides high-throughput software for discovery-based metabolomics research, and is novel for GC x GC-TOFMS data due to the use of the entire data set in the analysis (640 MB x 70 runs, double precision floating point).
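
As a rough illustration of the Fisher ratio idea, the sketch below computes, for a single variable, the between-class variance divided by the pooled within-class variance, and keeps the variables that exceed a threshold. It assumes the one-way ANOVA form of the ratio; the software described in the abstract may define and apply it differently. All signal values, variable names and the threshold are made up.

```python
from statistics import mean


def fisher_ratio(groups):
    """Between-class variance over pooled within-class variance
    for one variable measured in several sample classes."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean([x for g in groups for x in g])
    means = [mean(g) for g in groups]
    between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means)) / (k - 1)
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (n - k)
    return between / within


# Hypothetical signals at two data points, three replicates per growth condition.
signals = {
    "point_a": ([1.0, 1.2, 0.8], [3.0, 3.2, 2.8]),  # differs between classes
    "point_b": ([1.0, 1.2, 0.8], [1.1, 0.9, 1.0]),  # essentially noise
}
ratios = {name: fisher_ratio(list(groups)) for name, groups in signals.items()}

threshold = 10.0  # illustrative value only
selected = [name for name, r in ratios.items() if r > threshold]
```

Lowering `threshold` admits more candidate peaks at the cost of more false positives, which is the trade-off the study examined.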

  6. Genetic variability in arbuscular mycorrhizal fungi compatibility supports the selection of durum wheat genotypes for enhancing soil ecological services and cropping systems in Canada.

    Science.gov (United States)

    Singh, A K; Hamel, C; Depauw, R M; Knox, R E

    2012-03-01

    Crop nutrient- and water-use efficiency could be improved by using crop varieties highly compatible with arbuscular mycorrhizal fungi (AMF). Two greenhouse experiments demonstrated the presence of genetic variability for this trait in modern durum wheat ( Triticum turgidum L. var. durum Desf.) germplasm. Among the five cultivars tested, 'AC Morse' had consistently low levels of AM root colonization and DT710 had consistently high levels of AM root colonization, whereas 'Commander', which had the highest colonization levels under low soil fertility conditions, developed poor colonization levels under medium fertility level. The presence of genetic variability in durum wheat compatibility with AMF was further evidenced by significant genotype × inoculation interaction effects in grain and straw biomass production; grain P, straw P, and straw K concentrations under medium soil fertility level; and straw K and grain Fe concentrations at low soil fertility. Mycorrhizal dependency was an undesirable trait of 'Mongibello', which showed poor growth and nutrient balance in the absence of AMF. An AMF-mediated reduction in grain Cd under low soil fertility indicated that breeding durum wheat for compatibility with AMF could help reduce grain Cd concentration in durum wheat. Durum wheat genotypes should be selected for compatibility with AMF rather than for mycorrhizal dependency.

  7. Selective dopamine D3 receptor antagonism by SB-277011A attenuates cocaine reinforcement as assessed by progressive-ratio and variable-cost–variable-payoff fixed-ratio cocaine self-administration in rats

    Science.gov (United States)

    Xi, Zheng-Xiong; Gilbert, Jeremy G.; Pak, Arlene C.; Ashby, Charles R.; Heidbreder, Christian A.; Gardner, Eliot L.

    2013-01-01

    In rats, acute administration of SB-277011A, a highly selective dopamine (DA) D3 receptor antagonist, blocks cocaine-enhanced brain stimulation reward, cocaine-seeking behaviour and reinstatement of cocaine-seeking behaviour. Here, we investigated whether SB-277011A attenuates cocaine reinforcement as assessed by cocaine self-administration under variable-cost–variable-payoff fixed-ratio (FR) and progressive-ratio (PR) reinforcement schedules. Acute i.p. administration of SB-277011A (3–24 mg/kg) did not significantly alter cocaine (0.75 mg/kg/infusion) self-administration reinforced under FR1 (one lever press for one cocaine infusion) conditions. However, acute administration of SB-277011A (24 mg/kg, i.p.) progressively attenuated cocaine self-administration when: (a) the unit dose of self-administered cocaine was lowered from 0.75 to 0.125–0.5 mg/kg, and (b) the work demand for cocaine reinforcement was increased from FR1 to FR10. Under PR (increasing number of lever presses for each successive cocaine infusion) cocaine reinforcement, acute administration of SB-277011A (6–24 mg/kg i.p.) lowered the PR break point for cocaine self-administration in a dose-dependent manner. The reduction in the cocaine (0.25–1.0 mg/kg) dose–response break-point curve produced by 24 mg/kg SB-277011A is consistent with a reduction in cocaine’s reinforcing efficacy. When substituted for cocaine, SB-277011A alone did not sustain self-administration behaviour. In contrast with the mixed DA D2/D3 receptor antagonist haloperidol (1 mg/kg), SB-277011A (3, 12 or 24 mg/kg) failed to impede locomotor activity, failed to impair rearing behaviour, failed to produce catalepsy and failed to impair rotarod performance. These results show that SB-277011A significantly inhibits acute cocaine-induced reinforcement except at high cocaine doses and low work requirement for cocaine. If these results extrapolate to humans, SB-277011A or similar selective DA D3 receptor antagonists may be

  8. Genetic variability in G2 and F2 region between biological clones of human respiratory syncytial virus with or without host immune selection pressure

    Directory of Open Access Journals (Sweden)

    Claudia Trigo Pedroso Moraes

    2015-02-01

    Human respiratory syncytial virus (HRSV) is an important respiratory pathogen among children between zero and five years old. Host immunity and viral genetic variability are important factors that can make vaccine production difficult. In this work, differences between biological clones of HRSV were detected in clinical samples in the absence and presence of serum collected from children in the convalescent phase of the illness and from their biological mothers. Viral clones were selected by plaque assay in the absence and presence of serum and nucleotide sequences of the G2 and F2 genes of HRSV biological clones were compared. One non-synonymous mutation was found in the F gene (Ile5Asn) in one clone of an HRSV-B sample and one non-synonymous mutation was found in the G gene (Ser291Pro) in four clones of the same HRSV-B sample. Only one of these clones was obtained after treatment with the child's serum. In addition, some synonymous mutations were determined in two clones of the HRSV-A samples. In conclusion, it is possible that minor sequences could be selected by host antibodies contributing to the HRSV evolutionary process, hampering the development of an effective vaccine, since we verify the same codon alteration in absence and presence of human sera in individual clones of BR-85 sample.

  9. Genetic variability in G2 and F2 region between biological clones of human respiratory syncytial virus with or without host immune selection pressure.

    Science.gov (United States)

    Moraes, Claudia Trigo Pedroso; Oliveira, Danielle Bruna Leal; Campos, Angelica Cristine Almeida; Bosso, Patricia Alves; Lima, Hildener Nogueira; Stewien, Klaus Eberhard; Gilio, Alfredo Elias; Vieira, Sandra Elisabete; Botosso, Viviane Fongaro; Durigon, Edison Luiz

    2015-02-01

    Human respiratory syncytial virus (HRSV) is an important respiratory pathogen among children between zero and five years old. Host immunity and viral genetic variability are important factors that can make vaccine production difficult. In this work, differences between biological clones of HRSV were detected in clinical samples in the absence and presence of serum collected from children in the convalescent phase of the illness and from their biological mothers. Viral clones were selected by plaque assay in the absence and presence of serum and nucleotide sequences of the G2 and F2 genes of HRSV biological clones were compared. One non-synonymous mutation was found in the F gene (Ile5Asn) in one clone of an HRSV-B sample and one non-synonymous mutation was found in the G gene (Ser291Pro) in four clones of the same HRSV-B sample. Only one of these clones was obtained after treatment with the child's serum. In addition, some synonymous mutations were determined in two clones of the HRSV-A samples. In conclusion, it is possible that minor sequences could be selected by host antibodies contributing to the HRSV evolutionary process, hampering the development of an effective vaccine, since we verify the same codon alteration in absence and presence of human sera in individual clones of BR-85 sample.

  10. AEP's selection of GE Energy's variable frequency transformer (VFT) for their grid interconnection project between the United States and Mexico

    Energy Technology Data Exchange (ETDEWEB)

    Spurlock, M.; O' Keefe, R. [American Electric Power, Gahanna, OH (United States); Kidd, D. [American Electric Power, Tulsa, OK (United States); Larsen, E. [GE Energy, Schenectady, NY (United States); Roedel, J. [GE Energy, Denver, CO (United States); Bodo, R. [GE Energy, Carrolton, TX (United States); Marken, P. [GE Energy, Columbia City, IN (United States)

    2006-07-01

    Variable frequency transformers (VFTs) are controllable, bi-directional transmission devices capable of allowing power transfer between asynchronous networks. The VFT uses a rotary transformer with 3-phase windings on both the rotor and the stator. A motor and drive system is also used to manipulate the rotational position of the rotor in order to control the magnitude and direction of the power flow. The VFT was recently selected by American Electric Power (AEP) for its new asynchronous transmission link between the United States and Mexico. This paper provided details of the feasibility studies conducted to select the technology. Three categories of asynchronous interconnection devices were evaluated: (1) a VFT; (2) a voltage source converter; and (3) a conventional high voltage direct current (HVDC) back-to-back system. Stability performance system studies were conducted for all options. The overall reliability benefits of the options were reviewed, as well as their ability to meet steady-state system requirements. Dynamic models were used to conduct the comparative evaluation. Results of the feasibility study indicated that both the VFT and the voltage source converter performed better than the HVDC system. However, the VFT was more stable than the voltage source converter. 5 refs., 3 figs.

  11. Investigation of Arctic and Antarctic spatial and depth patterns of sea water in CTD profiles using chemometric data analysis

    DEFF Research Database (Denmark)

    Kotwa, Ewelina Katarzyna; Lacorte, Silvia; Duarte, Carlos

    2014-01-01

    In this paper we examine 2- and 3-way chemometric methods for analysis of Arctic and Antarctic water samples. Standard CTD (conductivity–temperature–depth) sensor devices were used during two oceanographic expeditions (July 2007 in the Arctic; February 2009 in the Antarctic) covering a total of 174...

  12. Chemometric applications to assess quality and critical parameters of virgin and extra-virgin olive oil. A review.

    Science.gov (United States)

    Gómez-Caravaca, Ana M; Maggio, Rubén M; Cerretani, Lorenzo

    2016-03-24

    Today virgin and extra-virgin olive oils (VOO and EVOO) are foods subject to a large number of analytical tests intended to ensure their quality and genuineness. Almost all official methods make heavy demands on reagents and manpower. Because of that, analytical development in this area is continuously evolving. Therefore, this review focuses on analytical methods for EVOO/VOO which use fast and smart approaches based on chemometric techniques in order to reduce time of analysis, reagent consumption, high cost equipment and manpower. Experimental approaches of chemometrics coupled with fast analytical techniques such as UV-Vis spectroscopy, fluorescence, vibrational spectroscopies (NIR, MIR and Raman fluorescence), NMR spectroscopy, and other more complex techniques like chromatography, calorimetry and electrochemical techniques applied to EVOO/VOO production and analysis have been discussed throughout this work. The advantages and drawbacks of this association have also been highlighted. Chemometrics has been evidenced as a powerful tool for the oil industry. In fact, it has been shown how chemometrics can be implemented all along the different steps of EVOO/VOO production: raw material input control, monitoring during process and quality control of final product. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. An Advanced Analytical Chemistry Experiment Using Gas Chromatography-Mass Spectrometry, MATLAB, and Chemometrics to Predict Biodiesel Blend Percent Composition

    Science.gov (United States)

    Pierce, Karisa M.; Schale, Stephen P.; Le, Trang M.; Larson, Joel C.

    2011-01-01

    We present a laboratory experiment for an advanced analytical chemistry course where we first focus on the chemometric technique partial least-squares (PLS) analysis applied to one-dimensional (1D) total-ion-current gas chromatography-mass spectrometry (GC-TIC) separations of biodiesel blends. Then, we focus on n-way PLS (n-PLS) applied to…

  14. Simultaneous spectrophotometric determination of copper, cobalt, nickel and iron in foodstuffs and vegetables with a new bis thiosemicarbazone ligand using chemometric approaches.

    Science.gov (United States)

    Rohani Moghadam, Masoud; Poorakbarian Jahromi, Sayedeh Maria; Darehkordi, Ali

    2016-02-01

    A newly synthesized bis thiosemicarbazone ligand, (2Z,2'Z)-2,2'-((4S,5R)-4,5,6-trihydroxyhexane-1,2-diylidene)bis(N-phenylhydrazinecarbothioamide), was used to make a complex with Cu(2+), Ni(2+), Co(2+) and Fe(3+) for their simultaneous spectrophotometric determination using chemometric methods. By Job's method, the ratio of metal to ligand in Ni(2+) was found to be 1:2, whereas it was 1:4 for the others. The effect of pH on the sensitivity and selectivity of the formed complexes was studied according to the net analyte signal (NAS). Under optimum conditions, the calibration graphs were linear in the ranges of 0.10-3.83, 0.20-3.83, 0.23-5.23 and 0.32-8.12 mg L(-1) with the detection limits of 2, 3, 4 and 10 μg L(-1) for Cu(2+), Co(2+), Ni(2+) and Fe(3+) respectively. The OSC-PLS1 for Cu(2+) and Ni(2+), the PLS1 for Co(2+) and the PC-FFANN for Fe(3+) were selected as the best models. The selected models were successfully applied for the simultaneous determination of elements in some foodstuffs and vegetables. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Early detection of emerging street drugs by near infrared spectroscopy and chemometrics.

    Science.gov (United States)

    Risoluti, R; Materazzi, S; Gregori, A; Ripani, L

    2016-06-01

    Near-infrared (NIR) spectroscopy is spreading as the tool of choice for fast and non-destructive analysis and detection of different compounds in complex matrices. This paper investigated the feasibility of using NIR spectroscopy coupled with chemometric calibration to detect new psychoactive substances in street samples. The capabilities of this approach in forensic chemistry were assessed in the determination of new molecules that have appeared on the illicit market and are often claimed to contain "non-illegal" compounds, although they exhibit important psychoactive effects. The study focused on synthetic molecules belonging to the classes of synthetic cannabinoids and phenethylamines. The approach was validated by comparing results with official methods and has been successfully applied for "in site" determination of illicit drugs in confiscated real samples, in cooperation with the Scientific Investigation Department (Carabinieri-RIS) of Rome. The results indicate that NIR spectroscopy followed by chemometrics is a fast, cost-effective and useful tool for the preliminary determination of new psychoactive substances in forensic science. Copyright © 2016 Elsevier B.V. All rights reserved.

  16. Application of Chemometric Techniques to Colorimetric Data in Classifying Automobile Paint

    International Nuclear Information System (INIS)

    Nur Awatif Rosli; Rozita Osman; Norashikin Saim; Mohd Zuli Jaafar

    2015-01-01

    The analysis of paint chips is of great interest to forensic investigators, particularly in the examination of hit-and-run cases. This study proposes a direct and rapid method for classifying automobile paint samples based on colorimetric data sets: absorption value, reflectance value, luminosity value (L), degree of redness (a) and degree of yellowness (b), obtained with the video spectral comparator (VSC) technique. A total of 42 automobile paint samples from 7 manufacturers were analysed. The colorimetric datasets obtained from VSC analysis were subjected to the chemometric techniques of cluster analysis (CA) and principal component analysis (PCA). Based on CA, 5 clusters were generated: cluster 1 consisted of silver color, cluster 2 of white color, cluster 3 of blue and black colors, cluster 4 of red color and cluster 5 of light blue color. PCA resulted in two latent factors explaining 95.58 % of the total variance, which made it possible to group the 42 automobile paints into five groups. Chemometric application to colorimetric datasets provides meaningful classification of automobile paints based on their tone colour (L, a, b) and light intensity. These approaches have the potential to ease the interpretation of complex spectral data involving a large number of comparisons. (author)

  17. Simultaneous chemometric determination of pyridoxine hydrochloride and isoniazid in tablets by multivariate regression methods.

    Science.gov (United States)

    Dinç, Erdal; Ustündağ, Ozgür; Baleanu, Dumitru

    2010-08-01

    The sole use of isoniazid during treatment of tuberculosis gives rise to pyridoxine deficiency. Therefore, a combination of pyridoxine hydrochloride and isoniazid is used in pharmaceutical dosage form in tuberculosis treatment to reduce this side effect. In this study, two chemometric methods, partial least squares (PLS) and principal component regression (PCR), were applied to the simultaneous determination of pyridoxine (PYR) and isoniazid (ISO) in their tablets. A concentration training set comprising 20 different binary mixtures of PYR and ISO was randomly prepared in 0.1 M HCl. Both multivariate calibration models were constructed using the relationships between the concentration data set (concentration data matrix) and absorbance data matrix in the spectral region 200-330 nm. The accuracy and the precision of the proposed chemometric methods were validated by analyzing synthetic mixtures containing the investigated drugs. The recovery results obtained by applying PCR and PLS calibrations to the artificial mixtures were found between 100.0 and 100.7%. Satisfactory results were obtained by applying the PLS and PCR methods to both artificial and commercial samples. The results obtained in this manuscript strongly encourage us to use them for the quality control and the routine analysis of the marketing tablets containing PYR and ISO drugs. Copyright © 2010 John Wiley & Sons, Ltd.
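
To make the calibration idea concrete, here is a minimal PLS1 sketch (NIPALS with deflation) in plain Python. It is not the authors' procedure: the two five-point "pure spectra", the five-mixture design and the helper names `pls1_fit` and `pls1_predict` are all hypothetical, and the data are noise-free so that a two-component model recovers the concentration of the first analyte in an unseen mixture.

```python
def pls1_fit(X, y, n_components):
    """Fit a PLS1 model by NIPALS. X is a list of spectra (rows), y a list of
    concentrations. Returns the means and the (w, p, q) vectors per component."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred spectra
    f = [v - y_mean for v in y]                                # centred concentrations
    comps = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]  # weights
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]  # deflate X
        f = [f[i] - q * t[i] for i in range(n)]                            # deflate y
        comps.append((w, p, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict the concentration for one new spectrum x."""
    x_mean, y_mean, comps = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


# Hypothetical pure-component spectra (absorbance per unit concentration).
s_pyr = [0.1, 0.5, 0.9, 0.4, 0.1]
s_iso = [0.8, 0.3, 0.2, 0.6, 0.9]
# Hypothetical training design: (PYR, ISO) concentrations of five mixtures.
design = [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0), (2.0, 2.0)]
X = [[c1 * a + c2 * b for a, b in zip(s_pyr, s_iso)] for c1, c2 in design]
y = [c1 for c1, _ in design]  # calibrate for PYR only

model = pls1_fit(X, y, n_components=2)
unknown = [2.5 * a + 1.5 * b for a, b in zip(s_pyr, s_iso)]
predicted_pyr = pls1_predict(model, unknown)  # close to 2.5 for noise-free data
```

With real spectra the number of components would be chosen by cross-validation rather than fixed at two, and a second model (or PLS2) would be needed for ISO.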

  18. Analysis of lard in meatball broth using Fourier transform infrared spectroscopy and chemometrics.

    Science.gov (United States)

    Kurniawati, Endah; Rohman, Abdul; Triyana, Kuwat

    2014-01-01

    Meatball is one of the favorite foods in Indonesia. For economic reasons (the price difference), beef may be substituted with pork. In this study, FTIR spectroscopy in combination with the chemometric techniques of partial least squares (PLS) and principal component analysis (PCA) was used for analysis of pork fat (lard) in meatball broth. Lard in meatball broth was quantitatively determined at the wavenumber region of 1018-1284 cm(-1). The coefficient of determination (R(2)) and root mean square error of calibration (RMSEC) values obtained were 0.9975 and 1.34% (v/v), respectively. Furthermore, the classification of lard and beef fat in meatball broth as well as in commercial samples was performed at the wavenumber region of 1200-1000 cm(-1). The results showed that FTIR spectroscopy coupled with chemometrics can be used for quantitative analysis and classification of lard in meatball broth for Halal verification studies. The developed method is simple in operation, rapid and does not involve extensive sample preparation. © 2013.
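
The two figures of merit quoted above can be computed as in the sketch below. It assumes the plain definitions (RMSEC with an n denominator, R(2) as one minus the ratio of residual to total sum of squares); some software subtracts the number of latent variables in the RMSEC denominator. The lard percentages are made up and are not the study's data.

```python
from math import sqrt


def rmsec(actual, predicted):
    """Root mean square error of calibration, using an n denominator."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination between actual and predicted values."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical calibration set: actual vs. model-predicted lard content, % (v/v).
actual = [0.0, 10.0, 20.0, 30.0, 40.0]
predicted = [1.0, 9.0, 21.0, 29.0, 40.0]

error = rmsec(actual, predicted)
r2 = r_squared(actual, predicted)
```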

  19. Activated sludge characterization through microscopy: A review on quantitative image analysis and chemometric techniques

    Energy Technology Data Exchange (ETDEWEB)

    Mesquita, Daniela P. [IBB-Institute for Biotechnology and Bioengineering, Centre of Biological Engineering, Universidade do Minho, Campus de Gualtar, 4710-057 Braga (Portugal); Amaral, A. Luís [IBB-Institute for Biotechnology and Bioengineering, Centre of Biological Engineering, Universidade do Minho, Campus de Gualtar, 4710-057 Braga (Portugal); Instituto Politécnico de Coimbra, ISEC, DEQB, Rua Pedro Nunes, Quinta da Nora, 3030-199 Coimbra (Portugal); Ferreira, Eugénio C., E-mail: ecferreira@deb.uminho.pt [IBB-Institute for Biotechnology and Bioengineering, Centre of Biological Engineering, Universidade do Minho, Campus de Gualtar, 4710-057 Braga (Portugal)

    2013-11-13

    Highlights:
    • Quantitative image analysis shows potential to monitor activated sludge systems.
    • Staining techniques increase the potential for detection of operational problems.
    • Chemometrics combined with quantitative image analysis is valuable for process monitoring.

    Abstract: In wastewater treatment processes, and particularly in activated sludge systems, efficiency is quite dependent on the operating conditions, and a number of problems may arise due to sludge structure and proliferation of specific microorganisms. In fact, bacterial communities and protozoa identification by microscopy inspection is already routinely employed in a considerable number of cases. Furthermore, quantitative image analysis techniques have been increasingly used throughout the years for the assessment of aggregates and filamentous bacteria properties. These procedures are able to provide an ever growing amount of data for wastewater treatment processes in which chemometric techniques can be a valuable tool. However, the determination of microbial communities’ properties remains a current challenge in spite of the great diversity of microscopy techniques applied. In this review, activated sludge characterization is discussed highlighting the aggregates structure and filamentous bacteria determination by image analysis on bright-field, phase-contrast, and fluorescence microscopy. An in-depth analysis is performed to summarize the many new findings that have been obtained, and future developments for these biological processes are further discussed.

  20. Chemometrics and chromatographic fingerprints to classify plant food supplements according to the content of regulated plants.

    Science.gov (United States)

    Deconinck, E; Sokeng Djiogo, C A; Courselle, P

    2017-09-05

    Plant food supplements are gaining popularity, resulting in a broader spectrum of available products and an increased consumption. Next to the problem of adulteration of these products with synthetic drugs the presence of regulated or toxic plants is an important issue, especially when the products are purchased from irregular sources. This paper focusses on this problem by using specific chromatographic fingerprints for five targeted plants and chemometric classification techniques in order to extract the important information from the fingerprints and determine the presence of the targeted plants in plant food supplements in an objective way. Two approaches were followed: (1) a multiclass model, (2) 2-class model for each of the targeted plants separately. For both approaches good classification models were obtained, especially when using SIMCA and PLS-DA. For each model, misclassification rates for the external test set of maximum one sample could be obtained. The models were applied to five real samples resulting in the identification of the correct plants, confirmed by mass spectrometry. Therefore chromatographic fingerprinting combined with chemometric modelling can be considered interesting to make a more objective decision on whether a regulated plant is present in a plant food supplement or not, especially when no mass spectrometry equipment is available. The results suggest also that the use of a battery of 2-class models to screen for several plants is the approach to be preferred. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Simultaneous quantitative determination of paracetamol and tramadol in tablet formulation using UV spectrophotometry and chemometric methods

    Science.gov (United States)

    Glavanović, Siniša; Glavanović, Marija; Tomišić, Vladislav

    2016-03-01

    The UV spectrophotometric methods for simultaneous quantitative determination of paracetamol and tramadol in paracetamol-tramadol tablets were developed. The spectrophotometric data obtained were processed by means of partial least squares (PLS) and genetic algorithm coupled with PLS (GA-PLS) methods in order to determine the content of active substances in the tablets. The results gained by chemometric processing of the spectroscopic data were statistically compared with those obtained by means of validated ultra-high performance liquid chromatographic (UHPLC) method. The accuracy and precision of data obtained by the developed chemometric models were verified by analysing the synthetic mixture of drugs, and by calculating recovery as well as relative standard error (RSE). A statistically good agreement was found between the amounts of paracetamol determined using PLS and GA-PLS algorithms, and that obtained by UHPLC analysis, whereas for tramadol GA-PLS results were proven to be more reliable compared to those of PLS. The simplest and the most accurate and precise models were constructed by using the PLS method for paracetamol (mean recovery 99.5%, RSE 0.89%) and the GA-PLS method for tramadol (mean recovery 99.4%, RSE 1.69%).
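
The recovery and RSE figures quoted in this abstract can be illustrated with a small calculation. The sketch below is not the authors' code: the function names and sample amounts are hypothetical, and the RSE formula (square root of the summed squared errors over the summed squared added amounts) is one definition common in chemometric validation, assumed here because the abstract does not state one.

```python
import math

def mean_recovery(added, found):
    """Mean percent recovery: average of 100 * found / added per sample."""
    return sum(100.0 * f / a for a, f in zip(added, found)) / len(added)

def relative_standard_error(added, found):
    """RSE (%) = 100 * sqrt(sum((found - added)^2) / sum(added^2)).
    Assumed definition; the abstract does not give the formula."""
    num = sum((f - a) ** 2 for a, f in zip(added, found))
    den = sum(a ** 2 for a in added)
    return 100.0 * math.sqrt(num / den)

# Hypothetical amounts (mg) in synthetic mixtures, for illustration only
added = [325.0, 300.0, 350.0]
found = [323.1, 301.2, 348.0]
print(f"mean recovery = {mean_recovery(added, found):.2f}%")
print(f"RSE = {relative_standard_error(added, found):.2f}%")
```

A mean recovery near 100% together with a small RSE is what the abstract reports for the preferred models.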

  2. Chemometrics-based process analytical technology (PAT) tools: applications and adaptation in pharmaceutical and biopharmaceutical industries.

    Science.gov (United States)

    Challa, Shruthi; Potumarthi, Ravichandra

    2013-01-01

    Process analytical technology (PAT) is used to monitor and control critical process parameters in raw materials and in-process products to maintain the critical quality attributes and build quality into the product. Process analytical technology can be successfully implemented in pharmaceutical and biopharmaceutical industries not only to impart quality into the products but also to prevent out-of-specification results and improve productivity. PAT implementation eliminates the drawbacks of traditional methods, which involve excessive sampling, and facilitates rapid testing through direct sampling without any destruction of the sample. However, to successfully adapt PAT tools to pharmaceutical and biopharmaceutical environments, a thorough understanding of the process is needed along with mathematical and statistical tools to analyze the large multidimensional spectral data generated by PAT tools. Chemometrics is a chemical discipline which incorporates both statistical and mathematical methods to obtain and analyze relevant information from PAT spectral tools. Applications of commonly used PAT tools in combination with appropriate chemometric methods, along with their advantages and working principles, are discussed. Finally, systematic application of PAT tools in the biopharmaceutical environment to control critical process parameters for achieving product quality is diagrammatically represented.

  3. Chromatographic fingerprinting through chemometric techniques for herbal slimming pills: A way of adulterant identification.

    Science.gov (United States)

    Shekari, Nafiseh; Vosough, Maryam; Tabar Heidar, Kourosh

    2018-05-01

    In the current study, gas chromatography-mass spectrometry (GC-MS) fingerprinting of herbal slimming pills assisted by chemometric methods is presented. Deconvolution of two-way chromatographic signals of nine herbal slimming pills into pure chromatographic and spectral patterns was performed. The peak clusters were resolved using multivariate curve resolution-alternating least squares (MCR-ALS) by employing appropriate constraints. It was revealed that more useful chemical information about the composition of the slimming pills can be obtained by employing a sophisticated GC-MS method coupled with proper chemometric tools, yielding an extended number of identified constituents. The thorough fingerprinting of the complex mixtures proved the presence of some toxic or carcinogenic components, such as toluene, furfural, furfuryl alcohol, styrene, itaconic anhydride, citraconic anhydride, trimethyl phosphate, phenol, pyrocatechol, p-propenylanisole and pyrogallol. In addition, some samples were shown to be adulterated with undeclared ingredients, including stimulants, anorexiants and laxatives such as phenolphthalein, amfepramone, caffeine and sibutramine. Copyright © 2018 Elsevier B.V. All rights reserved.

  4. Chemometric analysis for discrimination of extra virgin olive oils from whole and stoned olive pastes.

    Science.gov (United States)

    De Luca, Michele; Restuccia, Donatella; Clodoveo, Maria Lisa; Puoci, Francesco; Ragno, Gaetano

    2016-07-01

    Chemometric discrimination of extra virgin olive oils (EVOO) from whole and stoned olive pastes was carried out by using Fourier transform infrared (FTIR) data and partial least squares-discriminant analysis (PLS1-DA) approach. Four Italian commercial EVOO brands, all in both whole and stoned version, were considered in this study. The adopted chemometric methodologies were able to describe the different chemical features in phenolic and volatile compounds contained in the two types of oil by using unspecific IR spectral information. Principal component analysis (PCA) was employed in cluster analysis to capture data patterns and to highlight differences between technological processes and EVOO brands. The PLS1-DA algorithm was used as supervised discriminant analysis to identify the different oil extraction procedures. Discriminant analysis was extended to the evaluation of possible adulteration by addition of aliquots of oil from whole paste to the most valuable oil from stoned olives. The statistical parameters from external validation of all the PLS models were very satisfactory, with low root mean square error of prediction (RMSEP) and relative error (RE%). Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Determination of geographical origin and icariin content of Herba Epimedii using near infrared spectroscopy and chemometrics

    Science.gov (United States)

    Yang, Yue; Wu, Yongjiang; Li, Weili; Liu, Xuesong; Zheng, Jiyu; Zhang, Wentao; Chen, Yong

    2018-02-01

    Near infrared (NIR) spectroscopy coupled with chemometrics was used to discriminate the geographical origin of Herba Epimedii in this work. Four different classification models, namely discriminant analysis (DA), back propagation neural network (BPNN), K-nearest neighbor (KNN), and support vector machine (SVM), were constructed, and their performances in terms of recognition accuracy were compared. The results indicated that the SVM model was superior to the other models in the geographical origin identification of Herba Epimedii. The recognition rates of the optimum SVM model were up to 100% for the calibration set and 94.44% for the prediction set. In addition, the feasibility of NIR spectroscopy with the CARS-PLSR calibration model in prediction of icariin content of Herba Epimedii was also investigated. The determination coefficient (R_P^2) and root-mean-square error (RMSEP) for the prediction set were 0.9269 and 0.0480, respectively. It can be concluded that the NIR spectroscopy technique in combination with chemometrics has great potential in determination of geographical origin and icariin content of Herba Epimedii. This study can provide a valuable reference for rapid quality control of food products.
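
The two prediction-set statistics reported in this abstract, the determination coefficient and the RMSEP, follow from standard formulas. The sketch below shows how they are typically computed; the function names and the reference and predicted values are hypothetical and are not taken from the study.

```python
import math

def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction over a held-out prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2_prediction(y_true, y_pred):
    """Determination coefficient on the prediction set: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical reference and predicted contents (illustrative values only)
reference = [0.50, 0.62, 0.71, 0.80, 0.95]
predicted = [0.52, 0.60, 0.75, 0.78, 0.93]
print(round(rmsep(reference, predicted), 4))
print(round(r2_prediction(reference, predicted), 4))
```

RMSEP is expressed in the units of the measured quantity, so it is only meaningful relative to the range of the reference values.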

  6. Effect of emodin on Candida albicans growth investigated by microcalorimetry combined with chemometric analysis.

    Science.gov (United States)

    Kong, W J; Wang, J B; Jin, C; Zhao, Y L; Dai, C M; Xiao, X H; Li, Z L

    2009-07-01

    Using the 3114/3115 thermal activity monitor (TAM) air isothermal microcalorimeter, ampoule mode, the heat output of Candida albicans growth at 37 °C was measured, and the effect of emodin on C. albicans growth was evaluated by microcalorimetry coupled with chemometric methods. The similarities between the heat flow power (HFP)-time curves of C. albicans growth affected by different concentrations of emodin were calculated by similarity analysis (SA). In the correspondence analysis (CA) diagram of eight quantitative parameters taken from the HFP-time curves, it could be deduced that emodin had a definite dose-effect relationship, as the distance between different concentrations increased along with the dosage and the effect. From the principal component analysis (PCA) on eight quantitative parameters, the action of emodin on C. albicans growth could be easily evaluated by analyzing the change of values of the main two parameters, growth rate constant k (2) and maximum power output P(2)(m). The coherent results of SA, CA, and PCA showed that emodin at different concentrations had different effects on C. albicans growth metabolism: a low concentration (0-10 µg ml⁻¹) poorly inhibited the growth of C. albicans, and a high concentration (15-35 µg ml⁻¹) could notably inhibit growth of this fungus. This work provided a useful idea of the combination of microcalorimetry and chemometric analysis for investigating the effect of drugs and other compounds on microbes.

  7. Discriminating the Geographical Origins of Chinese White Lotus Seeds by Near-Infrared Spectroscopy and Chemometrics

    Directory of Open Access Journals (Sweden)

    Lu Xu

    2015-01-01

    The traceability of a Chinese white lotus seed (WLS) with Protected Designation of Origin (PDO) was investigated using near-infrared (NIR) spectroscopy and chemometrics. Three chemometrics methods, discrimination analysis (DA), class modeling, and a newly proposed strategy, the fusion of DA and class modeling, were investigated to compare their capacity to trace the geographical origins of WLS. Least squares support vector machine (LS-SVM) was developed to distinguish the PDO WLS from non-PDO WLS of four main producing areas. A class modeling technique, one-class partial least squares (OCPLS), was developed only using the data of PDO WLS. By the fusion of LS-SVM and OCPLS, the best prediction sensitivity and specificity were 0.900 and 0.973, respectively. The results indicate that fusion of DA and class modeling can enhance the specificity for detection of non-PDO products. The conclusion is that DA and class modeling should be combined for tracing food geographical origins.
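
Sensitivity and specificity, the two figures of merit quoted in this abstract, can be computed from true and predicted labels as sketched below. The `fuse_and` helper is a hypothetical illustration of one plausible fusion rule (accept a sample as PDO only when both models accept it); the abstract does not describe the actual fusion scheme, and the labels are invented.

```python
def sensitivity_specificity(y_true, y_pred, positive="PDO"):
    """Return (sensitivity, specificity) treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

def fuse_and(pred_a, pred_b, positive="PDO", negative="non-PDO"):
    """Hypothetical fusion rule: positive only if both models say positive."""
    return [positive if a == positive and b == positive else negative
            for a, b in zip(pred_a, pred_b)]

# Invented labels for illustration only
truth      = ["PDO", "PDO", "PDO", "non-PDO", "non-PDO", "non-PDO"]
discrim    = ["PDO", "PDO", "PDO", "PDO",     "non-PDO", "non-PDO"]
class_mod  = ["PDO", "PDO", "non-PDO", "non-PDO", "non-PDO", "PDO"]
fused = fuse_and(discrim, class_mod)
print(sensitivity_specificity(truth, discrim))
print(sensitivity_specificity(truth, fused))
```

Under an AND rule of this kind, specificity can only stay the same or rise relative to either model alone, while sensitivity can only stay the same or fall, which is consistent with the abstract's remark that fusion enhances specificity.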

  8. Cider fermentation process monitoring by Vis-NIR sensor system and chemometrics.

    Science.gov (United States)

    Villar, Alberto; Vadillo, Julen; Santos, Jose I; Gorritxategi, Eneko; Mabe, Jon; Arnaiz, Aitor; Fernández, Luis A

    2017-04-15

    Optimization of a multivariate calibration process has been undertaken for a Visible-Near Infrared (400-1100 nm) sensor system, applied in the monitoring of the fermentation process of the cider produced in the Basque Country (Spain). The main parameters that were monitored included alcoholic proof, L-lactic acid content, glucose+fructose and acetic acid content. The multivariate calibration was carried out using a combination of different variable selection techniques, and the most suitable pre-processing strategies were selected based on the spectra characteristics obtained by the sensor system. The variable selection techniques studied in this work include the Martens uncertainty test, interval Partial Least Squares Regression (iPLS) and Genetic Algorithm (GA). This procedure arises from the need to improve the prediction ability of the calibration models for cider monitoring. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Structural characterization and discrimination of Chinese medicinal materials with multiple botanical origins based on metabolite profiling and chemometrics analysis: Clematidis Radix et Rhizoma as a case study.

    Science.gov (United States)

    Guo, Lin-Xiu; Li, Rui; Liu, Ke; Yang, Jie; Li, Hui-Jun; Li, Song-Lin; Liu, Jian-Qun; Liu, Li-Fang; Xin, Gui-Zhong

    2015-12-18

    Traditional Chinese medicines (TCMs)-based products are becoming more and more popular all over the world. To ensure safety and efficacy, authentication of Chinese medicinal materials has been an important issue, especially for those with multiple botanical origins (one-to-multiple). Taking Clematidis Radix et Rhizoma (CRR) as a case study, we herein developed an integrated platform based on metabolite profiling and chemometrics analysis to characterize, classify, and predict the "one-to-multiple" herbs. Firstly, the predominant constituents, triterpenoid saponins, in three Clematis CRR were rapidly characterized by a novel UPLC-QTOF/MS-based strategy, and a total of 49 triterpenoid saponins were identified. Secondly, metabolite profiling was performed by UPLC-QTOF/MS, and 4623 variables were extracted and aligned as the dataset. Thirdly, by using pattern recognition analysis, a clear separation of the three Clematis CRR was achieved, and a total of 28 variables were screened as the valuable variables for discrimination. By matching with identified saponins, these 28 variables were found to correspond to 10 saponins, which were identified as marker compounds. Fourthly, based on the relative intensity of the marker compounds-related variables, genetic algorithm optimized support vector machines (GA-SVM) were employed to predict the species of CRR samples. The obtained model showed excellent prediction performance with a prediction accuracy of 100%. Finally, a heatmap visualization was employed for clarifying the distribution of identified saponins, which could be useful for phytochemotaxonomy study of Clematis herbs. These results indicated that our proposed platform was a powerful tool for chemical profiling and discrimination of herbs with multiple botanical origins, providing promising perspectives in tracking the formulation processes of TCMs products. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. A qualitative chemometric study of resin composite polymerization

    Directory of Open Access Journals (Sweden)

    Regina Ferraz Mendes

    2008-01-01

    Objective: An experiment was carried out to assess the effect produced by different polymerization techniques on resin composite color after it has been immersed in coffee. Methods: Samples were manufactured using TPH Spectrum composite. It was polymerized for 10 or 40 seconds, with the light tip at one or zero millimeters from the resin surface, and afterwards the samples were immersed in coffee for 24 hours or 7 days. Ten different evaluators classified the samples according to their degree of staining. Results: The samples that were polymerized for 10 seconds were more susceptible to staining than the ones polymerized for 40 seconds. Samples immersed in coffee for 7 days were more susceptible to staining than the ones immersed for 24 hours. Conclusion: The variables polymerization time and immersion time were determinant in the staining susceptibility of the studied composite by coffee. However, there was no significant difference, irrespective of whether the resin was polymerized 10 or zero millimeters away from the resin surface.

  11. SELECTION OF BURST-LIKE TRANSIENTS AND STOCHASTIC VARIABLES USING MULTI-BAND IMAGE DIFFERENCING IN THE PAN-STARRS1 MEDIUM-DEEP SURVEY

    International Nuclear Information System (INIS)

    Kumar, S.; Gezari, S.; Heinis, S.; Chornock, R.; Berger, E.; Soderberg, A.; Stubbs, C. W.; Kirshner, R. P.; Rest, A.; Huber, M. E.; Narayan, G.; Marion, G. H.; Burgett, W. S.; Foley, R. J.; Scolnic, D.; Riess, A. G.; Lawrence, A.; Smartt, S. J.; Smith, K.; Wood-Vasey, W. M.

    2015-01-01

    We present a novel method for the light-curve characterization of Pan-STARRS1 Medium Deep Survey (PS1 MDS) extragalactic sources into stochastic variables (SVs) and burst-like (BL) transients, using multi-band image-differencing time-series data. We select detections in difference images associated with galaxy hosts using a star/galaxy catalog extracted from the deep PS1 MDS stacked images, and adopt a maximum a posteriori formulation to model their difference-flux time-series in four Pan-STARRS1 photometric bands g_P1, r_P1, i_P1, and z_P1. We use three deterministic light-curve models to fit BL transients; a Gaussian, a Gamma distribution, and an analytic supernova (SN) model, and one stochastic light-curve model, the Ornstein-Uhlenbeck process, in order to fit variability that is characteristic of active galactic nuclei (AGNs). We assess the quality of fit of the models band-wise and source-wise, using their estimated leave-out-one cross-validation likelihoods and corrected Akaike information criteria. We then apply a K-means clustering algorithm on these statistics, to determine the source classification in each band. The final source classification is derived as a combination of the individual filter classifications, resulting in two measures of classification quality, from the averages across the photometric filters of (1) the classifications determined from the closest K-means cluster centers, and (2) the square distances from the clustering centers in the K-means clustering spaces. For a verification set of AGNs and SNe, we show that SV and BL occupy distinct regions in the plane constituted by these measures. We use our clustering method to characterize 4361 extragalactic image difference detected sources, in the first 2.5 yr of the PS1 MDS, into 1529 BL, and 2262 SV, with a purity of 95.00% for AGNs, and 90.97% for SN based on our verification sets. We combine our light-curve classifications with their nuclear or off-nuclear host galaxy offsets, to

  12. SELECTION OF BURST-LIKE TRANSIENTS AND STOCHASTIC VARIABLES USING MULTI-BAND IMAGE DIFFERENCING IN THE PAN-STARRS1 MEDIUM-DEEP SURVEY

    Energy Technology Data Exchange (ETDEWEB)

    Kumar, S.; Gezari, S.; Heinis, S. [Department of Astronomy, University of Maryland, Stadium Drive, College Park, MD 21224 (United States); Chornock, R.; Berger, E.; Soderberg, A.; Stubbs, C. W.; Kirshner, R. P. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Rest, A. [Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218 (United States); Huber, M. E.; Narayan, G.; Marion, G. H.; Burgett, W. S. [Institute for Astronomy, University of Hawaii, 2680 Woodlawn Drive, Honolulu, HI 96822 (United States); Foley, R. J. [Astronomy Department, University of Illinois at Urbana-Champaign, 1002 West Green Street, Urbana, IL 61801 (United States); Scolnic, D.; Riess, A. G. [Department of Physics and Astronomy, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218 (United States); Lawrence, A. [Institute for Astronomy, University of Edinburgh Scottish Universities Physics Alliance, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ (United Kingdom); Smartt, S. J.; Smith, K. [Astrophysics Research Centre, School of Mathematics and Physics, Queen's University Belfast, Belfast BT7 1NN (United Kingdom); Wood-Vasey, W. M. [Pittsburgh Particle Physics, Astrophysics, and Cosmology Center, Department of Physics and Astronomy, University of Pittsburgh, 3941 O'Hara Street, Pittsburgh, PA 15260 (United States); and others

    2015-03-20

    We present a novel method for the light-curve characterization of Pan-STARRS1 Medium Deep Survey (PS1 MDS) extragalactic sources into stochastic variables (SVs) and burst-like (BL) transients, using multi-band image-differencing time-series data. We select detections in difference images associated with galaxy hosts using a star/galaxy catalog extracted from the deep PS1 MDS stacked images, and adopt a maximum a posteriori formulation to model their difference-flux time-series in four Pan-STARRS1 photometric bands g_P1, r_P1, i_P1, and z_P1. We use three deterministic light-curve models to fit BL transients; a Gaussian, a Gamma distribution, and an analytic supernova (SN) model, and one stochastic light-curve model, the Ornstein-Uhlenbeck process, in order to fit variability that is characteristic of active galactic nuclei (AGNs). We assess the quality of fit of the models band-wise and source-wise, using their estimated leave-out-one cross-validation likelihoods and corrected Akaike information criteria. We then apply a K-means clustering algorithm on these statistics, to determine the source classification in each band. The final source classification is derived as a combination of the individual filter classifications, resulting in two measures of classification quality, from the averages across the photometric filters of (1) the classifications determined from the closest K-means cluster centers, and (2) the square distances from the clustering centers in the K-means clustering spaces. For a verification set of AGNs and SNe, we show that SV and BL occupy distinct regions in the plane constituted by these measures. We use our clustering method to characterize 4361 extragalactic image difference detected sources, in the first 2.5 yr of the PS1 MDS, into 1529 BL, and 2262 SV, with a purity of 95.00% for AGNs, and 90.97% for SN based on our verification sets. We combine our light-curve classifications with their nuclear or off-nuclear host
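
The corrected Akaike information criterion mentioned in this abstract has a standard closed form, AICc = 2k - 2 ln L + 2k(k+1)/(n - k - 1), where k is the number of fitted parameters and n the number of data points. The sketch below compares candidate light-curve models by AICc; the model names echo the abstract, but the log-likelihoods, parameter counts, and epoch count are invented for illustration and this is not the survey pipeline.

```python
def aic(log_likelihood, k):
    """Akaike information criterion for a fit with k free parameters."""
    return 2 * k - 2 * log_likelihood

def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC; requires n > k + 1 data points."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)

# Hypothetical fits: model name -> (maximized log-likelihood, parameter count)
fits = {
    "gaussian": (-120.4, 3),
    "gamma": (-118.9, 4),
    "ornstein_uhlenbeck": (-131.0, 3),
}
n_epochs = 40  # hypothetical number of photometric epochs in one band

scores = {name: aicc(ll, k, n_epochs) for name, (ll, k) in fits.items()}
best = min(scores, key=scores.get)  # lower AICc indicates the preferred model
print(best, round(scores[best], 2))
```

The correction term grows as k approaches n, so it penalizes heavily parameterized models most strongly for sparsely sampled light curves.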

  13. Improved intact soil-core carbon determination applying regression shrinkage and variable selection techniques to complete spectrum laser-induced breakdown spectroscopy (LIBS).

    Science.gov (United States)

    Bricklemyer, Ross S; Brown, David J; Turk, Philip J; Clegg, Sam M

    2013-10-01

    Laser-induced breakdown spectroscopy (LIBS) provides a potential method for rapid, in situ soil C measurement. In previous research on the application of LIBS to intact soil cores, we hypothesized that ultraviolet (UV) spectrum LIBS (200-300 nm) might not provide sufficient elemental information to reliably discriminate between soil organic C (SOC) and inorganic C (IC). In this study, using a custom complete spectrum (245-925 nm) core-scanning LIBS instrument, we analyzed 60 intact soil cores from six wheat fields. Predictive multi-response partial least squares (PLS2) models using full and reduced spectrum LIBS were compared for directly determining soil total C (TC), IC, and SOC. Two regression shrinkage and variable selection approaches, the least absolute shrinkage and selection operator (LASSO) and sparse multivariate regression with covariance estimation (MRCE), were tested for soil C predictions and the identification of wavelengths important for soil C prediction. Using complete spectrum LIBS for PLS2 modeling reduced the calibration standard error of prediction (SEP) 15 and 19% for TC and IC, respectively, compared to UV spectrum LIBS. The LASSO and MRCE approaches provided significantly improved calibration accuracy and reduced SEP 32-55% over UV spectrum PLS2 models. We conclude that (1) complete spectrum LIBS is superior to UV spectrum LIBS for predicting soil C for intact soil cores without pretreatment; (2) LASSO and MRCE approaches provide improved calibration prediction accuracy over PLS2 but require additional testing with increased soil and target analyte diversity; and (3) measurement errors associated with analyzing intact cores (e.g., sample density and surface roughness) require further study and quantification.
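
The LASSO named in this abstract selects variables by shrinking some regression coefficients exactly to zero. The sketch below is a minimal coordinate-descent implementation for the objective (1/(2n))·||y − Xβ||² + α·||β||₁, assuming centered data and no intercept. It is an illustration of the general technique, not the authors' procedure, and the tiny data set is invented.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/(2n))*||y - X b||^2 + alpha*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # squared norm of column j
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return beta

# Invented example: column 0 carries signal, column 1 is nearly irrelevant
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]
beta = lasso_cd(X, y, alpha=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

In a spectral setting each column would be one wavelength channel, and the indices left in `selected` would be the wavelengths retained by the model.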

  14. Assessment of sediment quality in the Mediterranean Sea-Boughrara lagoon exchange areas (southeastern Tunisia): GIS approach-based chemometric methods.

    Science.gov (United States)

    Kharroubi, Adel; Gargouri, Dorra; Baati, Houda; Azri, Chafai

    2012-06-01

    Concentrations of selected heavy metals (Cd, Pb, Zn, Cu, Mn, and Fe) in surface sediments from 66 sites in both northern and eastern Mediterranean Sea-Boughrara lagoon exchange areas (southeastern Tunisia) were studied in order to understand current metal contamination due to the urbanization and economic development of several nearby coastal regions of the Gulf of Gabès. Multiple approaches were applied for the sediment quality assessment. These approaches were based on GIS coupled with chemometric methods (enrichment factors, geoaccumulation index, principal component analysis, and cluster analysis). Enrichment factors and principal component analysis revealed two distinct groups of metals. The first group corresponded to Fe and Mn derived from natural sources, and the second group contained Cd, Pb, Zn, and Cu originating from man-made sources. For these latter metals, cluster analysis showed two distinct distributions in the selected areas. They were attributed to temporal and spatial variations of contaminant source inputs. The geoaccumulation index (I_geo) values indicated that only Cd, Pb, and Cu can be considered as moderate to extreme pollutants in the studied sediments.
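
The two indices named in this abstract have widely used textbook definitions: the geoaccumulation index I_geo = log2(C_n / (1.5·B_n)), with C_n the measured concentration and B_n the geochemical background, and the enrichment factor EF = (M/R)_sample / (M/R)_background, with R a conservative reference element. The sketch below applies those definitions; the choice of Fe as reference element and all concentrations are assumptions for illustration, since the abstract gives neither.

```python
import math

def geoaccumulation_index(concentration, background):
    """I_geo = log2(C_n / (1.5 * B_n)); 1.5 allows for natural background variation."""
    return math.log2(concentration / (1.5 * background))

def enrichment_factor(metal_sample, ref_sample, metal_background, ref_background):
    """EF = (metal/ref in sample) / (metal/ref in background)."""
    return (metal_sample / ref_sample) / (metal_background / ref_background)

# Invented concentrations (mg/kg), with Fe assumed as the reference element
cd_sample, cd_background = 1.2, 0.3
fe_sample, fe_background = 30000.0, 35000.0

print(round(geoaccumulation_index(cd_sample, cd_background), 2))
print(round(enrichment_factor(cd_sample, fe_sample, cd_background, fe_background), 2))
```

An EF close to 1 points to a natural (crustal) origin, while values well above 1 suggest anthropogenic input, which matches the two metal groups described in the abstract.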

  15. Bayesian variable selection for multistate Markov models with interval-censored data in an ecological momentary assessment study of smoking cessation.

    Science.gov (United States)

    Koslovsky, Matthew D; Swartz, Michael D; Chan, Wenyaw; Leon-Novelo, Luis; Wilkinson, Anna V; Kendzor, Darla E; Businelle, Michael S

    2017-10-11

    The application of sophisticated analytical methods to intensive longitudinal data, collected with ecological momentary assessments (EMA), has helped researchers better understand smoking behaviors after a quit attempt. Unfortunately, the wealth of information captured with EMAs is typically underutilized in practice. Thus, novel methods are needed to extract this information in exploratory research studies. One of the main objectives of intensive longitudinal data analysis is identifying relations between risk factors and outcomes of interest. Our goal is to develop and apply expectation maximization variable selection for Bayesian multistate Markov models with interval-censored data to generate new insights into the relation between potential risk factors and transitions between smoking states. Through simulation, we demonstrate the effectiveness of our method in identifying associated risk factors and its ability to outperform the LASSO in a special case. Additionally, we use the expectation conditional-maximization algorithm to simplify estimation, a deterministic annealing variant to reduce the algorithm's dependence on starting values, and Louis's method to estimate unknown parameter uncertainty. We then apply our method to intensive longitudinal data collected with EMA to identify risk factors associated with transitions between smoking states after a quit attempt in a cohort of socioeconomically disadvantaged smokers who were interested in quitting. © 2017, The International Biometric Society.

  16. Fast classification of hazelnut cultivars through portable infrared spectroscopy and chemometrics

    Science.gov (United States)

    Manfredi, Marcello; Robotti, Elisa; Quasso, Fabio; Mazzucco, Eleonora; Calabrese, Giorgio; Marengo, Emilio

    2018-01-01

    The authentication and traceability of hazelnuts is very important for both the consumer and the food industry, to safeguard the protected varieties and the food quality. This study investigates the use of a portable FTIR spectrometer coupled to multivariate statistical analysis for the classification of raw hazelnuts. The method discriminates hazelnuts from different origins/cultivars based on differences of the signal intensities of their IR spectra. The multivariate classification methods, namely principal component analysis (PCA) followed by linear discriminant analysis (LDA) and partial least square discriminant analysis (PLS-DA), with or without variable selection, allowed a very good discrimination among the groups, with PLS-DA coupled to variable selection providing the best results. Due to the fast analysis, high sensitivity, simplicity and no sample preparation, the proposed analytical methodology could be successfully used to verify the cultivar of hazelnuts, and the analysis can be performed quickly and directly on site.

  17. Chemometrics. Development of a sensor for the evaluation of palatable water; Chemometrics. Oishii mizu sensor no kaihatsu

    Energy Technology Data Exchange (ETDEWEB)

    Sasaki, K.; Hamaoka, T. [Hiroshima-Denki Inst. of Technology, Hiroshima (Japan)

    1997-05-20

    A method was developed to detect and determine palatable waters. Multiple regression analysis was carried out to verify the determination on renowned palatable waters based on requirements for palatable waters specified by the Ministry of Health and Welfare. Test samples used in the analysis are renowned palatable waters produced in the Hiroshima and other areas. It was found possible to express palatability of the waters to some extent by detecting total hardness, organic matter content, total iron and bicarbonate ion, in relation to the correlation thereof with a result of organoleptic examination. Fuzzy inference was applied to evaluation of the renowned waters. As the inference rule, selections were made on organic matter content, total hardness, bicarbonate ion and total iron for the antecedents, and values to express palatability for the consequents. A neural network method was devised, in which error from the reference water determination is minimized. The fuzzy inference was capable of expressing human sense in a numerical value. A prototype sensor was fabricated to determine reputed waters suitable for brewing Japanese Sake. Water suitable for fermentation is the water containing minerals that increase fermentation strength in yeast, and not containing such impurities as organic matters and iron. A bio-sensor was developed, which is capable of evaluating latent fermenting capability of mineral containing waters. 10 refs., 5 figs., 3 tabs.

  18. Chemometric profile, antioxidant and tyrosinase inhibitory activity of Camel's foot creeper leaves (Bauhinia vahlii).

    Science.gov (United States)

    Panda, Pritipadma; Dash, Priyanka; Ghosh, Goutam

    2018-03-01

    The present study is the first effort toward a comprehensive evaluation of antityrosinase activity and chemometric analysis of Bauhinia vahlii. The experimental results revealed that the methanol extract of Bauhinia vahlii (BVM) possesses higher polyphenolic compounds and total antioxidant activity than those reported elsewhere for other more conventional and geographically different varieties. The BVM contains fatty acids such as hexadecanoic acid (10.15%), octadecanoic acid (1.97%), oleic acid (0.61%) and cis-vaccenic acid (2.43%) along with vitamin E (12.71%), α-amyrin (9.84%), methyl salicylate (2.39%) and β-sitosterol (17.35%), which were mainly responsible for antioxidant as well as tyrosinase inhibitory activity. Tyrosinase inhibitory activity of this extract was comparable to that of kojic acid. These findings suggested that the B. vahlii leaves could be exploited as a potential source of natural antioxidant and tyrosinase inhibitory agents.

  19. Chemometrics applied to the incorporation of omega-3 in tilapia fillet feed flaxseed flour

    Directory of Open Access Journals (Sweden)

    Márcia Fernandes Nishiyama

    2014-09-01

    This study evaluated the effect of adding flaxseed flour to the diet of Nile tilapia on the fatty acid composition of fillets using chemometrics. A traditional and an experimental diet containing flaxseed flour were used to feed the fish for 60 days. An increase of 18:3 n-3 and 22:6 n-3 and a decrease of 18:2 n-6 were observed in the tilapia fillets fed the experimental diet. There was a reduction in the n-6:n-3 ratio. A period of 45 days of incorporation caused a significant change in tilapia chemical composition. Principal Component Analysis showed that the time periods of 45 and 60 days positively contributed to the total content of n-3, LNA, and DHA, highlighting the effect of omega-3 incorporation in the treatment containing flaxseed flour.

  20. Authentication of monofloral Yemeni Sidr honey using ultraviolet spectroscopy and chemometric analysis.

    Science.gov (United States)

    Roshan, Abdul-Rahman A; Gad, Haidy A; El-Ahmady, Sherweit H; Khanbash, Mohamed S; Abou-Shoer, Mohamed I; Al-Azizi, Mohamed M

    2013-08-14

    This work describes a simple model developed for the authentication of monofloral Yemeni Sidr honey using UV spectroscopy together with the chemometric techniques of hierarchical cluster analysis (HCA), principal component analysis (PCA), and soft independent modeling of class analogy (SIMCA). The model was constructed using 13 genuine Sidr honey samples and challenged with 25 honey samples of different botanical origins. HCA and PCA were able to present a preliminary clustering pattern to segregate the genuine Sidr samples from the lower priced local polyfloral and non-Sidr samples. The SIMCA model presented a clear demarcation of the samples and was used to identify genuine Sidr honey samples as well as to detect admixture with lower priced polyfloral honey, with detection limits >10%. The constructed model presents a simple and efficient method of analysis and may serve as a basis for the authentication of other honey types worldwide.

  1. Detection of irradiated beef by nuclear magnetic resonance lipid profiling combined with chemometric techniques.

    Science.gov (United States)

    Zanardi, Emanuela; Caligiani, Augusta; Padovani, Enrico; Mariani, Mario; Ghidini, Sergio; Palla, Gerardo; Ianieri, Adriana

    2013-02-01