integrating statistical predictions: Topics by WorldWideScience.org

Sample records for integrating statistical predictions

Personalizing oncology treatments by predicting drug efficacy, side-effects, and improved therapy: mathematics, statistics, and their integration.

Science.gov (United States)

Agur, Zvia; Elishmereni, Moran; Kheifetz, Yuri

2014-01-01

Despite its great promise, personalized oncology still faces many hurdles, and it is increasingly clear that targeted drugs and molecular biomarkers alone yield only modest clinical benefit. One reason is the complex relationships between biomarkers and the patient's response to drugs, obscuring the true weight of the biomarkers in the overall patient's response. This complexity can be disentangled by computational models that integrate the effects of personal biomarkers into a simulator of drug-patient dynamic interactions, for predicting the clinical outcomes. Several computational tools have been developed for personalized oncology, notably evidence-based tools for simulating pharmacokinetics, Bayesian-estimated tools for predicting survival, etc. We describe representative statistical and mathematical tools, and discuss their merits, shortcomings and preliminary clinical validation attesting to their potential. Yet, the individualization power of mathematical models alone, or statistical models alone, is limited. More accurate and versatile personalization tools can be constructed by a new application of the statistical/mathematical nonlinear mixed effects modeling (NLMEM) approach, which until recently has been used only in drug development. Using these advanced tools, clinical data from patient populations can be integrated with mechanistic models of disease and physiology, for generating personal mathematical models. Upon a more substantial validation in the clinic, this approach will hopefully be applied in personalized clinical trials, P-trials, hence aiding the establishment of personalized medicine within the main stream of clinical oncology. © 2014 Wiley Periodicals, Inc.
Statistical timing for parametric yield prediction of digital integrated circuits

NARCIS (Netherlands)

Jess, J.A.G.; Kalafala, K.; Naidu, S.R.; Otten, R.H.J.M.; Visweswariah, C.

2006-01-01

Uncertainty in circuit performance due to manufacturing and environmental variations is increasing with each new generation of technology. It is therefore important to predict the performance of a chip as a probabilistic quantity. This paper proposes three novel path-based algorithms for statistical
A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data.

Science.gov (United States)

Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu

2015-05-27

Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. The ability of predicting functional regions as well as its generalizable statistical framework makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu.
GAPIT: genome association and prediction integrated tool.

Science.gov (United States)

Lipka, Alexander E; Tian, Feng; Wang, Qishan; Peiffer, Jason; Li, Meng; Bradbury, Peter J; Gore, Michael A; Buckler, Edward S; Zhang, Zhiwu

2012-09-15

Software programs that conduct genome-wide association studies and genomic prediction and selection need to use methodologies that maximize statistical power, provide high prediction accuracy and run in a computationally efficient manner. We developed an R package called Genome Association and Prediction Integrated Tool (GAPIT) that implements advanced statistical methods including the compressed mixed linear model (CMLM) and CMLM-based genomic prediction and selection. The GAPIT package can handle large datasets in excess of 10 000 individuals and 1 million single-nucleotide polymorphisms with minimal computational time, while providing user-friendly access and concise tables and graphs to interpret results. http://www.maizegenetics.net/GAPIT. zhiwu.zhang@cornell.edu Supplementary data are available at Bioinformatics online.
Multimodal integration in statistical learning

DEFF Research Database (Denmark)

Mitchell, Aaron; Christiansen, Morten Hyllekvist; Weiss, Dan

2014-01-01

, we investigated the ability of adults to integrate audio and visual input during statistical learning. We presented learners with a speech stream synchronized with a video of a speaker’s face. In the critical condition, the visual (e.g., /gi/) and auditory (e.g., /mi/) signals were occasionally...... facilitated participants’ ability to segment the speech stream. Our results therefore demonstrate that participants can integrate audio and visual input to perceive the McGurk illusion during statistical learning. We interpret our findings as support for modality-interactive accounts of statistical learning.......Recent advances in the field of statistical learning have established that learners are able to track regularities of multimodal stimuli, yet it is unknown whether the statistical computations are performed on integrated representations or on separate, unimodal representations. In the present study...
Statistical Methods in Integrative Genomics

Science.gov (United States)

Richardson, Sylvia; Tseng, George C.; Sun, Wei

2016-01-01

Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions. PMID:27482531
Statistical inference an integrated approach

CERN Document Server

Migon, Helio S; Louzada, Francisco

2014-01-01

Introduction Information The concept of probability Assessing subjective probabilities An example Linear algebra and probability Notation Outline of the bookElements of Inference Common statistical modelsLikelihood-based functions Bayes theorem Exchangeability Sufficiency and exponential family Parameter elimination Prior Distribution Entirely subjective specification Specification through functional forms Conjugacy with the exponential family Non-informative priors Hierarchical priors Estimation Introduction to decision theoryBayesian point estimation Classical point estimation Empirical Bayes estimation Comparison of estimators Interval estimation Estimation in the Normal model Approximating Methods The general problem of inference Optimization techniquesAsymptotic theory Other analytical approximations Numerical integration methods Simulation methods Hypothesis Testing Introduction Classical hypothesis testingBayesian hypothesis testing Hypothesis testing and confidence intervalsAsymptotic tests Prediction...
Functional region prediction with a set of appropriate homologous sequences-an index for sequence selection by integrating structure and sequence information with spatial statistics

Science.gov (United States)

2012-01-01

Background The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence
Exclusion statistics and integrable models

International Nuclear Information System (INIS)

Mashkevich, S.

1998-01-01

The definition of exclusion statistics, as given by Haldane, allows for a statistical interaction between distinguishable particles (multi-species statistics). The thermodynamic quantities for such statistics ca be evaluated exactly. The explicit expressions for the cluster coefficients are presented. Furthermore, single-species exclusion statistics is realized in one-dimensional integrable models. The interesting questions of generalizing this correspondence onto the higher-dimensional and the multi-species cases remain essentially open
Exclusion statistics and integrable models

International Nuclear Information System (INIS)

Mashkevich, S.

1998-01-01

The definition of exclusion statistics that was given by Haldane admits a 'statistical interaction' between distinguishable particles (multispecies statistics). For such statistics, thermodynamic quantities can be evaluated exactly; explicit expressions are presented here for cluster coefficients. Furthermore, single-species exclusion statistics is realized in one-dimensional integrable models of the Calogero-Sutherland type. The interesting questions of generalizing this correspondence to the higher-dimensional and the multispecies cases remain essentially open; however, our results provide some hints as to searches for the models in question
Probability and statistics with integrated software routines

CERN Document Server

Deep, Ronald

2005-01-01

Probability & Statistics with Integrated Software Routines is a calculus-based treatment of probability concurrent with and integrated with statistics through interactive, tailored software applications designed to enhance the phenomena of probability and statistics. The software programs make the book unique.The book comes with a CD containing the interactive software leading to the Statistical Genie. The student can issue commands repeatedly while making parameter changes to observe the effects. Computer programming is an excellent skill for problem solvers, involving design, prototyping, data gathering, testing, redesign, validating, etc, all wrapped up in the scientific method.See also: CD to accompany Probability and Stats with Integrated Software Routines (0123694698)* Incorporates more than 1,000 engaging problems with answers* Includes more than 300 solved examples* Uses varied problem solving methods
Functional integral approach to classical statistical dynamics

International Nuclear Information System (INIS)

Jensen, R.V.

1980-04-01

A functional integral method is developed for the statistical solution of nonlinear stochastic differential equations which arise in classical dynamics. The functional integral approach provides a very natural and elegant derivation of the statistical dynamical equations that have been derived using the operator formalism of Martin, Siggia, and Rose
Integrating Expert Knowledge with Statistical Analysis for Landslide Susceptibility Assessment at Regional Scale

Directory of Open Access Journals (Sweden)

Christos Chalkias

2016-03-01

Full Text Available In this paper, an integration landslide susceptibility model by combining expert-based and bivariate statistical analysis (Landslide Susceptibility Index—LSI approaches is presented. Factors related with the occurrence of landslides—such as elevation, slope angle, slope aspect, lithology, land cover, Mean Annual Precipitation (MAP and Peak Ground Acceleration (PGA—were analyzed within a GIS environment. This integrated model produced a landslide susceptibility map which categorized the study area according to the probability level of landslide occurrence. The accuracy of the final map was evaluated by Receiver Operating Characteristics (ROC analysis depending on an independent (validation dataset of landslide events. The prediction ability was found to be 76% revealing that the integration of statistical analysis with human expertise can provide an acceptable landslide susceptibility assessment at regional scale.
Statistical Approaches for Spatiotemporal Prediction of Low Flows

Science.gov (United States)

Fangmann, A.; Haberlandt, U.

2017-12-01

An adequate assessment of regional climate change impacts on streamflow requires the integration of various sources of information and modeling approaches. This study proposes simple statistical tools for inclusion into model ensembles, which are fast and straightforward in their application, yet able to yield accurate streamflow predictions in time and space. Target variables for all approaches are annual low flow indices derived from a data set of 51 records of average daily discharge for northwestern Germany. The models require input of climatic data in the form of meteorological drought indices, derived from observed daily climatic variables, averaged over the streamflow gauges' catchments areas. Four different modeling approaches are analyzed. Basis for all pose multiple linear regression models that estimate low flows as a function of a set of meteorological indices and/or physiographic and climatic catchment descriptors. For the first method, individual regression models are fitted at each station, predicting annual low flow values from a set of annual meteorological indices, which are subsequently regionalized using a set of catchment characteristics. The second method combines temporal and spatial prediction within a single panel data regression model, allowing estimation of annual low flow values from input of both annual meteorological indices and catchment descriptors. The third and fourth methods represent non-stationary low flow frequency analyses and require fitting of regional distribution functions. Method three is subject to a spatiotemporal prediction of an index value, method four to estimation of L-moments that adapt the regional frequency distribution to the at-site conditions. The results show that method two outperforms successive prediction in time and space. Method three also shows a high performance in the near future period, but since it relies on a stationary distribution, its application for prediction of far future changes may be
Review of Statistical Learning Methods in Integrated Omics Studies (An Integrated Information Science).

Science.gov (United States)

Zeng, Irene Sui Lan; Lumley, Thomas

2018-01-01

Integrated omics is becoming a new channel for investigating the complex molecular system in modern biological science and sets a foundation for systematic learning for precision medicine. The statistical/machine learning methods that have emerged in the past decade for integrated omics are not only innovative but also multidisciplinary with integrated knowledge in biology, medicine, statistics, machine learning, and artificial intelligence. Here, we review the nontrivial classes of learning methods from the statistical aspects and streamline these learning methods within the statistical learning framework. The intriguing findings from the review are that the methods used are generalizable to other disciplines with complex systematic structure, and the integrated omics is part of an integrated information science which has collated and integrated different types of information for inferences and decision making. We review the statistical learning methods of exploratory and supervised learning from 42 publications. We also discuss the strengths and limitations of the extended principal component analysis, cluster analysis, network analysis, and regression methods. Statistical techniques such as penalization for sparsity induction when there are fewer observations than the number of features and using Bayesian approach when there are prior knowledge to be integrated are also included in the commentary. For the completeness of the review, a table of currently available software and packages from 23 publications for omics are summarized in the appendix.
Statistical prediction of Late Miocene climate

Digital Repository Service at National Institute of Oceanography (India)

Fernandes, A.A; Gupta, S.M.

by making certain simplifying assumptions; for example in modelling ocean 4 currents, the geostrophic approximation is made. In case of statistical prediction no such a priori assumption need be made. statistical prediction comprises of using observed data... the number of equations. In this case the equations are overdetermined, and therefore one has to look for a solution that best fits the sample data in a least squares sense. To this end we express the sample .... (2.1)+ ry = y + data as follows: n L c. (x...
Learning Predictive Statistics: Strategies and Brain Mechanisms.

Science.gov (United States)

Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe

2017-08-30

When immersed in a new environment, we are challenged to decipher initially incomprehensible streams of sensory information. However, quite rapidly, the brain finds structure and meaning in these incoming signals, helping us to predict and prepare ourselves for future actions. This skill relies on extracting the statistics of event streams in the environment that contain regularities of variable complexity from simple repetitive patterns to complex probabilistic combinations. Here, we test the brain mechanisms that mediate our ability to adapt to the environment's statistics and predict upcoming events. By combining behavioral training and multisession fMRI in human participants (male and female), we track the corticostriatal mechanisms that mediate learning of temporal sequences as they change in structure complexity. We show that learning of predictive structures relates to individual decision strategy; that is, selecting the most probable outcome in a given context (maximizing) versus matching the exact sequence statistics. These strategies engage distinct human brain regions: maximizing engages dorsolateral prefrontal, cingulate, sensory-motor regions, and basal ganglia (dorsal caudate, putamen), whereas matching engages occipitotemporal regions (including the hippocampus) and basal ganglia (ventral caudate). Our findings provide evidence for distinct corticostriatal mechanisms that facilitate our ability to extract behaviorally relevant statistics to make predictions. SIGNIFICANCE STATEMENT Making predictions about future events relies on interpreting streams of information that may initially appear incomprehensible. Past work has studied how humans identify repetitive patterns and associative pairings. However, the natural environment contains regularities that vary in complexity from simple repetition to complex probabilistic combinations. Here, we combine behavior and multisession fMRI to track the brain mechanisms that mediate our ability to adapt to
Statistical Analysis of CO₂ Exposed Wells to Predict Long Term Leakage through the Development of an Integrated Neural-Genetic Algorithm

Energy Technology Data Exchange (ETDEWEB)

Guo, Boyun [Univ. of Louisiana, Lafayette, LA (United States); Duguid, Andrew [Battelle, Columbus, OH (United States); Nygaard, Ronar [Missouri Univ. of Science and Technology, Rolla, MO (United States)

2017-08-05

The objective of this project is to develop a computerized statistical model with the Integrated Neural-Genetic Algorithm (INGA) for predicting the probability of long-term leak of wells in CO₂ sequestration operations. This object has been accomplished by conducting research in three phases: 1) data mining of CO₂-explosed wells, 2) INGA computer model development, and 3) evaluation of the predictive performance of the computer model with data from field tests. Data mining was conducted for 510 wells in two CO₂ sequestration projects in the Texas Gulf Coast region. They are the Hasting West field and Oyster Bayou field in the Southern Texas. Missing wellbore integrity data were estimated using an analytical and Finite Element Method (FEM) model. The INGA was first tested for performances of convergence and computing efficiency with the obtained data set of high dimension. It was concluded that the INGA can handle the gathered data set with good accuracy and reasonable computing time after a reduction of dimension with a grouping mechanism. A computerized statistical model with the INGA was then developed based on data pre-processing and grouping. Comprehensive training and testing of the model were carried out to ensure that the model is accurate and efficient enough for predicting the probability of long-term leak of wells in CO₂ sequestration operations. The Cranfield in the southern Mississippi was select as the test site. Observation wells CFU31F2 and CFU31F3 were used for pressure-testing, formation-logging, and cement-sampling. Tools run in the wells include Isolation Scanner, Slim Cement Mapping Tool (SCMT), Cased Hole Formation Dynamics Tester (CHDT), and Mechanical Sidewall Coring Tool (MSCT). Analyses of the obtained data indicate no leak of CO₂ cross the cap zone while it is evident that the well cement sheath was invaded by the CO₂ from the storage zone. This observation is consistent
IGESS: a statistical approach to integrating individual-level genotype data and summary statistics in genome-wide association studies.

Science.gov (United States)

Dai, Mingwei; Ming, Jingsi; Cai, Mingxuan; Liu, Jin; Yang, Can; Wan, Xiang; Xu, Zongben

2017-09-15

Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, known as 'polygenicity'. Tens of thousands of samples are often required to ensure statistical power of identifying these variants with small effects. However, it is often the case that a research group can only get approval for the access to individual-level genotype data with a limited sample size (e.g. a few hundreds or thousands). Meanwhile, summary statistics generated using single-variant-based analysis are becoming publicly available. The sample sizes associated with the summary statistics datasets are usually quite large. How to make the most efficient use of existing abundant data resources largely remains an open question. In this study, we propose a statistical approach, IGESS, to increasing statistical power of identifying risk variants and improving accuracy of risk prediction by i ntegrating individual level ge notype data and s ummary s tatistics. An efficient algorithm based on variational inference is developed to handle the genome-wide analysis. Through comprehensive simulation studies, we demonstrated the advantages of IGESS over the methods which take either individual-level data or summary statistics data as input. We applied IGESS to perform integrative analysis of Crohns Disease from WTCCC and summary statistics from other studies. IGESS was able to significantly increase the statistical power of identifying risk variants and improve the risk prediction accuracy from 63.2% ( ±0.4% ) to 69.4% ( ±0.1% ) using about 240 000 variants. The IGESS software is available at https://github.com/daviddaigithub/IGESS . zbxu@xjtu.edu.cn or xwan@comp.hkbu.edu.hk or eeyang@hkbu.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Statistical inference an integrated Bayesianlikelihood approach

CERN Document Server

Aitkin, Murray

2010-01-01

Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing.After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre

Permutation statistical methods an integrated approach

CERN Document Server

Berry, Kenneth J; Johnston, Janis E

2016-01-01

This research monograph provides a synthesis of a number of statistical tests and measures, which, at first consideration, appear disjoint and unrelated. Numerous comparisons of permutation and classical statistical methods are presented, and the two methods are compared via probability values and, where appropriate, measures of effect size. Permutation statistical methods, compared to classical statistical methods, do not rely on theoretical distributions, avoid the usual assumptions of normality and homogeneity of variance, and depend only on the data at hand. This text takes a unique approach to explaining statistics by integrating a large variety of statistical methods, and establishing the rigor of a topic that to many may seem to be a nascent field in statistics. This topic is new in that it took modern computing power to make permutation methods available to people working in the mainstream of research. This research monograph addresses a statistically-informed audience, and can also easily serve as a ...
Statistical distribution of components of energy eigenfunctions: from nearly-integrable to chaotic

International Nuclear Information System (INIS)

Wang, Jiaozi; Wang, Wen-ge

2016-01-01

We study the statistical distribution of components in the non-perturbative parts of energy eigenfunctions (EFs), in which main bodies of the EFs lie. Our numerical simulations in five models show that deviation of the distribution from the prediction of random matrix theory (RMT) is useful in characterizing the process from nearly-integrable to chaotic, in a way somewhat similar to the nearest-level-spacing distribution. But, the statistics of EFs reveals some more properties, as described below. (i) In the process of approaching quantum chaos, the distribution of components shows a delay feature compared with the nearest-level-spacing distribution in most of the models studied. (ii) In the quantum chaotic regime, the distribution of components always shows small but notable deviation from the prediction of RMT in models possessing classical counterparts, while, the deviation can be almost negligible in models not possessing classical counterparts. (iii) In models whose Hamiltonian matrices possess a clear band structure, tails of EFs show statistical behaviors obviously different from those in the main bodies, while, the difference is smaller for Hamiltonian matrices without a clear band structure.
Statistical Methodologies to Integrate Experimental and Computational Research

Science.gov (United States)

Parker, P. A.; Johnson, R. T.; Montgomery, D. C.

2008-01-01

Development of advanced algorithms for simulating engine flow paths requires the integration of fundamental experiments with the validation of enhanced mathematical models. In this paper, we provide an overview of statistical methods to strategically and efficiently conduct experiments and computational model refinement. Moreover, the integration of experimental and computational research efforts is emphasized. With a statistical engineering perspective, scientific and engineering expertise is combined with statistical sciences to gain deeper insights into experimental phenomenon and code development performance; supporting the overall research objectives. The particular statistical methods discussed are design of experiments, response surface methodology, and uncertainty analysis and planning. Their application is illustrated with a coaxial free jet experiment and a turbulence model refinement investigation. Our goal is to provide an overview, focusing on concepts rather than practice, to demonstrate the benefits of using statistical methods in research and development, thereby encouraging their broader and more systematic application.
Learning predictive statistics from temporal sequences: Dynamics and strategies.

Science.gov (United States)

Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe

2017-10-01

Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics-that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments.
An integrated model of statistical process control and maintenance based on the delayed monitoring

International Nuclear Information System (INIS)

Yin, Hui; Zhang, Guojun; Zhu, Haiping; Deng, Yuhao; He, Fei

2015-01-01

This paper develops an integrated model of statistical process control and maintenance decision. The proposal of delayed monitoring policy postpones the sampling process till a scheduled time and contributes to ten-scenarios of the production process, where equipment failure may occur besides quality shift. The equipment failure and the control chart alert trigger the corrective maintenance and the predictive maintenance, respectively. The occurrence probability, the cycle time and the cycle cost of each scenario are obtained by integral calculation; therefore, a mathematical model is established to minimize the expected cost by using the genetic algorithm. A Monte Carlo simulation experiment is conducted and compared with the integral calculation in order to ensure the analysis of the ten-scenario model. Another ordinary integrated model without delayed monitoring is also established as comparison. The results of a numerical example indicate satisfactory economic performance of the proposed model. Finally, a sensitivity analysis is performed to investigate the effect of model parameters. - Highlights: • We develop an integrated model of statistical process control and maintenance. • We propose delayed monitoring policy and derive an economic model with 10 scenarios. • We consider two deterioration mechanisms, quality shift and equipment failure. • The delayed monitoring policy will help reduce the expected cost
Integrating Statistical Machine Learning in a Semantic Sensor Web for Proactive Monitoring and Control

Directory of Open Access Journals (Sweden)

Jude Adekunle Adeleke

2017-04-01

Full Text Available Proactive monitoring and control of our natural and built environments is important in various application scenarios. Semantic Sensor Web technologies have been well researched and used for environmental monitoring applications to expose sensor data for analysis in order to provide responsive actions in situations of interest. While these applications provide quick response to situations, to minimize their unwanted effects, research efforts are still necessary to provide techniques that can anticipate the future to support proactive control, such that unwanted situations can be averted altogether. This study integrates a statistical machine learning based predictive model in a Semantic Sensor Web using stream reasoning. The approach is evaluated in an indoor air quality monitoring case study. A sliding window approach that employs the Multilayer Perceptron model to predict short term PM 2 . 5 pollution situations is integrated into the proactive monitoring and control framework. Results show that the proposed approach can effectively predict short term PM 2 . 5 pollution situations: precision of up to 0.86 and sensitivity of up to 0.85 is achieved over half hour prediction horizons, making it possible for the system to warn occupants or even to autonomously avert the predicted pollution situations within the context of Semantic Sensor Web.
Integrating Statistical Machine Learning in a Semantic Sensor Web for Proactive Monitoring and Control.

Science.gov (United States)

Adeleke, Jude Adekunle; Moodley, Deshendran; Rens, Gavin; Adewumi, Aderemi Oluyinka

2017-04-09

Proactive monitoring and control of our natural and built environments is important in various application scenarios. Semantic Sensor Web technologies have been well researched and used for environmental monitoring applications to expose sensor data for analysis in order to provide responsive actions in situations of interest. While these applications provide quick response to situations, to minimize their unwanted effects, research efforts are still necessary to provide techniques that can anticipate the future to support proactive control, such that unwanted situations can be averted altogether. This study integrates a statistical machine learning based predictive model in a Semantic Sensor Web using stream reasoning. The approach is evaluated in an indoor air quality monitoring case study. A sliding window approach that employs the Multilayer Perceptron model to predict short term PM 2 . 5 pollution situations is integrated into the proactive monitoring and control framework. Results show that the proposed approach can effectively predict short term PM 2 . 5 pollution situations: precision of up to 0.86 and sensitivity of up to 0.85 is achieved over half hour prediction horizons, making it possible for the system to warn occupants or even to autonomously avert the predicted pollution situations within the context of Semantic Sensor Web.
Prediction of lacking control power in power plants using statistical models

DEFF Research Database (Denmark)

Odgaard, Peter Fogh; Mataji, B.; Stoustrup, Jakob

2007-01-01

Prediction of the performance of plants like power plants is of interest, since the plant operator can use these predictions to optimize the plant production. In this paper the focus is addressed on a special case where a combination of high coal moisture content and a high load limits the possible...... plant load, meaning that the requested plant load cannot be met. The available models are in this case uncertain. Instead statistical methods are used to predict upper and lower uncertainty bounds on the prediction. Two different methods are used. The first relies on statistics of recent prediction...... errors; the second uses operating point depending statistics of prediction errors. Using these methods on the previous mentioned case, it can be concluded that the second method can be used to predict the power plant performance, while the first method has problems predicting the uncertain performance...
Model output statistics applied to wind power prediction

Energy Technology Data Exchange (ETDEWEB)

Joensen, A; Giebel, G; Landberg, L [Risoe National Lab., Roskilde (Denmark); Madsen, H; Nielsen, H A [The Technical Univ. of Denmark, Dept. of Mathematical Modelling, Lyngby (Denmark)

1999-03-01

Being able to predict the output of a wind farm online for a day or two in advance has significant advantages for utilities, such as better possibility to schedule fossil fuelled power plants and a better position on electricity spot markets. In this paper prediction methods based on Numerical Weather Prediction (NWP) models are considered. The spatial resolution used in NWP models implies that these predictions are not valid locally at a specific wind farm. Furthermore, due to the non-stationary nature and complexity of the processes in the atmosphere, and occasional changes of NWP models, the deviation between the predicted and the measured wind will be time dependent. If observational data is available, and if the deviation between the predictions and the observations exhibits systematic behavior, this should be corrected for; if statistical methods are used, this approaches is usually referred to as MOS (Model Output Statistics). The influence of atmospheric turbulence intensity, topography, prediction horizon length and auto-correlation of wind speed and power is considered, and to take the time-variations into account, adaptive estimation methods are applied. Three estimation techniques are considered and compared, Extended Kalman Filtering, recursive least squares and a new modified recursive least squares algorithm. (au) EU-JOULE-3. 11 refs.
Risk prediction model: Statistical and artificial neural network approach

Science.gov (United States)

Paiman, Nuur Azreen; Hariri, Azian; Masood, Ibrahim

2017-04-01

Prediction models are increasingly gaining popularity and had been used in numerous areas of studies to complement and fulfilled clinical reasoning and decision making nowadays. The adoption of such models assist physician's decision making, individual's behavior, and consequently improve individual outcomes and the cost-effectiveness of care. The objective of this paper is to reviewed articles related to risk prediction model in order to understand the suitable approach, development and the validation process of risk prediction model. A qualitative review of the aims, methods and significant main outcomes of the nineteen published articles that developed risk prediction models from numerous fields were done. This paper also reviewed on how researchers develop and validate the risk prediction models based on statistical and artificial neural network approach. From the review done, some methodological recommendation in developing and validating the prediction model were highlighted. According to studies that had been done, artificial neural network approached in developing the prediction model were more accurate compared to statistical approach. However currently, only limited published literature discussed on which approach is more accurate for risk prediction model development.
Fatigue crack initiation and growth life prediction with statistical consideration

International Nuclear Information System (INIS)

Kwon, J.D.; Choi, S.H.; Kwak, S.G.; Chun, K.O.

1991-01-01

Life prediction or residual life prediction of structures or machines is one of the most strongly world wide needed problems as requirement in the stage of slowly developing economy which comes after rapidly and highly developing stage. For the purpose of statistical life prediction, fatigue test was conducted under the 3 stress levels, and for each stress level, 20 specimens are used. The statistical properties of the crack growth parameter m and C in the fatigue crack growth law of da/dN = C(ΔK) m , and the relationship between m and C, and the statistical distribution pattern of fatigue crack initiation, growth and fracture lives can be obtained by experimental results
Statistics of spatially integrated speckle intensity difference

DEFF Research Database (Denmark)

Hanson, Steen Grüner; Yura, Harold

2009-01-01

We consider the statistics of the spatially integrated speckle intensity difference obtained from two separated finite collecting apertures. For fully developed speckle, closed-form analytic solutions for both the probability density function and the cumulative distribution function are derived...... here for both arbitrary values of the mean number of speckles contained within an aperture and the degree of coherence of the optical field. Additionally, closed-form expressions are obtained for the corresponding nth statistical moments....
Statistical short-term earthquake prediction.

Science.gov (United States)

Kagan, Y Y; Knopoff, L

1987-06-19

A statistical procedure, derived from a theoretical model of fracture growth, is used to identify a foreshock sequence while it is in progress. As a predictor, the procedure reduces the average uncertainty in the rate of occurrence for a future strong earthquake by a factor of more than 1000 when compared with the Poisson rate of occurrence. About one-third of all main shocks with local magnitude greater than or equal to 4.0 in central California can be predicted in this way, starting from a 7-year database that has a lower magnitude cut off of 1.5. The time scale of such predictions is of the order of a few hours to a few days for foreshocks in the magnitude range from 2.0 to 5.0.
Nonparametric predictive inference in statistical process control

NARCIS (Netherlands)

Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

2000-01-01

New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities hased on
A weighted generalized score statistic for comparison of predictive values of diagnostic tests.

Science.gov (United States)

Kosinski, Andrzej S

2013-03-15

Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.
THE INTEGRATED SHORT-TERM STATISTICAL SURVEYS: EXPERIENCE OF NBS IN MOLDOVA

Directory of Open Access Journals (Sweden)

Oleg CARA

2012-07-01

Full Text Available The users’ rising need for relevant, reliable, coherent, timely data for the early diagnosis of the economic vulnerability and of the turning points in the business cycles, especially during a financial and economic crisis, asks for a prompt answer, coordinated by statistical institutions. High quality short term statistics are of special interest for the emerging market economies, such as the Moldavian one, being extremely vulnerable when facing economic recession. Answering to the challenges of producing a coherent and adequate image of the economic activity, by using the system of indicators and definitions efficiently applied at the level of the European Union, the National Bureau of Statistics (NBS of the Republic of Moldova has launched the development process of an integrated system of short term statistics (STS based on the advanced international experience.Thus, in 2011, BNS implemented the integrated statistical survey on STS based on consistent concepts, harmonized with the EU standards. The integration of the production processes, which were previously separated, is based on a common technical infrastructure, standardized procedures and techniques for data production. The achievement of this complex survey with holistic approach has allowed the consolidation of the statistical data quality, comparable at European level and the signifi cant reduction of information burden on business units, especially of small size.The reformation of STS based on the integrated survey has been possible thanks to the consistent methodological and practical support given to NBS by the National Institute of Statistics (INS of Romania, for which we would like to thank to our Romanian colleagues.
Nonparametric predictive inference in statistical process control

NARCIS (Netherlands)

Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

2004-01-01

Statistical process control (SPC) is used to decide when to stop a process as confidence in the quality of the next item(s) is low. Information to specify a parametric model is not always available, and as SPC is of a predictive nature, we present a control chart developed using nonparametric
Wind speed prediction using statistical regression and neural network

Indian Academy of Sciences (India)

Prediction of wind speed in the atmospheric boundary layer is important for wind energy assess- ment,satellite launching and aviation,etc.There are a few techniques available for wind speed prediction,which require a minimum number of input parameters.Four different statistical techniques,viz.,curve ﬁtting,Auto Regressive ...
Statistical model for prediction of hearing loss in patients receiving cisplatin chemotherapy.

Science.gov (United States)

Johnson, Andrew; Tarima, Sergey; Wong, Stuart; Friedland, David R; Runge, Christina L

2013-03-01

This statistical model might be used to predict cisplatin-induced hearing loss, particularly in patients undergoing concomitant radiotherapy. To create a statistical model based on pretreatment hearing thresholds to provide an individual probability for hearing loss from cisplatin therapy and, secondarily, to investigate the use of hearing classification schemes as predictive tools for hearing loss. Retrospective case-control study. Tertiary care medical center. A total of 112 subjects receiving chemotherapy and audiometric evaluation were evaluated for the study. Of these subjects, 31 met inclusion criteria for analysis. The primary outcome measurement was a statistical model providing the probability of hearing loss following the use of cisplatin chemotherapy. Fifteen of the 31 subjects had significant hearing loss following cisplatin chemotherapy. American Academy of Otolaryngology-Head and Neck Society and Gardner-Robertson hearing classification schemes revealed little change in hearing grades between pretreatment and posttreatment evaluations for subjects with or without hearing loss. The Chang hearing classification scheme could effectively be used as a predictive tool in determining hearing loss with a sensitivity of 73.33%. Pretreatment hearing thresholds were used to generate a statistical model, based on quadratic approximation, to predict hearing loss (C statistic = 0.842, cross-validated = 0.835). The validity of the model improved when only subjects who received concurrent head and neck irradiation were included in the analysis (C statistic = 0.91). A calculated cutoff of 0.45 for predicted probability has a cross-validated sensitivity and specificity of 80%. Pretreatment hearing thresholds can be used as a predictive tool for cisplatin-induced hearing loss, particularly with concomitant radiotherapy.
Integrative Analysis of Gene Expression Data Including an Assessment of Pathway Enrichment for Predicting Prostate Cancer

Directory of Open Access Journals (Sweden)

Pingzhao Hu

2006-01-01

biological pathways. In particular, we observed that by integrating information from the insulin signalling pathway into our prediction model, we achieved better prediction of prostate cancer. Conclusions: Our data integration methodology provides an efficient way to identify biologically sound and statistically significant pathways from gene expression data. The significant gene expression phenotypes identified in our study have the potential to characterize complex genetic alterations in prostate cancer.

IBM Watson Analytics: Automating Visualization, Descriptive, and Predictive Statistics.

Science.gov (United States)

Hoyt, Robert Eugene; Snider, Dallas; Thompson, Carla; Mantravadi, Sarita

2016-10-11

We live in an era of explosive data generation that will continue to grow and involve all industries. One of the results of this explosion is the need for newer and more efficient data analytics procedures. Traditionally, data analytics required a substantial background in statistics and computer science. In 2015, International Business Machines Corporation (IBM) released the IBM Watson Analytics (IBMWA) software that delivered advanced statistical procedures based on the Statistical Package for the Social Sciences (SPSS). The latest entry of Watson Analytics into the field of analytical software products provides users with enhanced functions that are not available in many existing programs. For example, Watson Analytics automatically analyzes datasets, examines data quality, and determines the optimal statistical approach. Users can request exploratory, predictive, and visual analytics. Using natural language processing (NLP), users are able to submit additional questions for analyses in a quick response format. This analytical package is available free to academic institutions (faculty and students) that plan to use the tools for noncommercial purposes. To report the features of IBMWA and discuss how this software subjectively and objectively compares to other data mining programs. The salient features of the IBMWA program were examined and compared with other common analytical platforms, using validated health datasets. Using a validated dataset, IBMWA delivered similar predictions compared with several commercial and open source data mining software applications. The visual analytics generated by IBMWA were similar to results from programs such as Microsoft Excel and Tableau Software. In addition, assistance with data preprocessing and data exploration was an inherent component of the IBMWA application. Sensitivity and specificity were not included in the IBMWA predictive analytics results, nor were odds ratios, confidence intervals, or a confusion matrix
Disturbance metrics predict a wetland Vegetation Index of Biotic Integrity

Science.gov (United States)

Stapanian, Martin A.; Mack, John; Adams, Jean V.; Gara, Brian; Micacchion, Mick

2013-01-01

Indices of biological integrity of wetlands based on vascular plants (VIBIs) have been developed in many areas in the USA. Knowledge of the best predictors of VIBIs would enable management agencies to make better decisions regarding mitigation site selection and performance monitoring criteria. We use a novel statistical technique to develop predictive models for an established index of wetland vegetation integrity (Ohio VIBI), using as independent variables 20 indices and metrics of habitat quality, wetland disturbance, and buffer area land use from 149 wetlands in Ohio, USA. For emergent and forest wetlands, predictive models explained 61% and 54% of the variability, respectively, in Ohio VIBI scores. In both cases the most important predictor of Ohio VIBI score was a metric that assessed habitat alteration and development in the wetland. Of secondary importance as a predictor was a metric that assessed microtopography, interspersion, and quality of vegetation communities in the wetland. Metrics and indices assessing disturbance and land use of the buffer area were generally poor predictors of Ohio VIBI scores. Our results suggest that vegetation integrity of emergent and forest wetlands could be most directly enhanced by minimizing substrate and habitat disturbance within the wetland. Such efforts could include reducing or eliminating any practices that disturb the soil profile, such as nutrient enrichment from adjacent farm land, mowing, grazing, or cutting or removing woody plants.
Enterprise Human Resources Integration-Statistical Data Mart (EHRI-SDM) Status Data

Data.gov (United States)

Office of Personnel Management — The Enterprise Human Resources Integration-Statistical Data Mart (EHRI-SDM) is a statistically cleansed sub-set of the data contained in the EHRI data warehouse. It...
Enterprise Human Resources Integration-Statistical Data Mart (EHRI-SDM) Dynamics Data

Data.gov (United States)

Office of Personnel Management — The Enterprise Human Resources Integration-Statistical Data Mart (EHRI-SDM) is a statistically cleansed sub-set of the data contained in the EHRI data warehouse. It...
Statistical models for expert judgement and wear prediction

International Nuclear Information System (INIS)

Pulkkinen, U.

1994-01-01

This thesis studies the statistical analysis of expert judgements and prediction of wear. The point of view adopted is the one of information theory and Bayesian statistics. A general Bayesian framework for analyzing both the expert judgements and wear prediction is presented. Information theoretic interpretations are given for some averaging techniques used in the determination of consensus distributions. Further, information theoretic models are compared with a Bayesian model. The general Bayesian framework is then applied in analyzing expert judgements based on ordinal comparisons. In this context, the value of information lost in the ordinal comparison process is analyzed by applying decision theoretic concepts. As a generalization of the Bayesian framework, stochastic filtering models for wear prediction are formulated. These models utilize the information from condition monitoring measurements in updating the residual life distribution of mechanical components. Finally, the application of stochastic control models in optimizing operational strategies for inspected components are studied. Monte-Carlo simulation methods, such as the Gibbs sampler and the stochastic quasi-gradient method, are applied in the determination of posterior distributions and in the solution of stochastic optimization problems. (orig.) (57 refs., 7 figs., 1 tab.)
Multiple Kernel Learning with Random Effects for Predicting Longitudinal Outcomes and Data Integration

Science.gov (United States)

Chen, Tianle; Zeng, Donglin

2015-01-01

Summary Predicting disease risk and progression is one of the main goals in many clinical research studies. Cohort studies on the natural history and etiology of chronic diseases span years and data are collected at multiple visits. Although kernel-based statistical learning methods are proven to be powerful for a wide range of disease prediction problems, these methods are only well studied for independent data but not for longitudinal data. It is thus important to develop time-sensitive prediction rules that make use of the longitudinal nature of the data. In this paper, we develop a novel statistical learning method for longitudinal data by introducing subject-specific short-term and long-term latent effects through a designed kernel to account for within-subject correlation of longitudinal measurements. Since the presence of multiple sources of data is increasingly common, we embed our method in a multiple kernel learning framework and propose a regularized multiple kernel statistical learning with random effects to construct effective nonparametric prediction rules. Our method allows easy integration of various heterogeneous data sources and takes advantage of correlation among longitudinal measures to increase prediction power. We use different kernels for each data source taking advantage of the distinctive feature of each data modality, and then optimally combine data across modalities. We apply the developed methods to two large epidemiological studies, one on Huntington's disease and the other on Alzheimer's Disease (Alzheimer's Disease Neuroimaging Initiative, ADNI) where we explore a unique opportunity to combine imaging and genetic data to study prediction of mild cognitive impairment, and show a substantial gain in performance while accounting for the longitudinal aspect of the data. PMID:26177419
Spatial statistics for predicting flow through a rock fracture

International Nuclear Information System (INIS)

Coakley, K.J.

1989-03-01

Fluid flow through a single rock fracture depends on the shape of the space between the upper and lower pieces of rock which define the fracture. In this thesis, the normalized flow through a fracture, i.e. the equivalent permeability of a fracture, is predicted in terms of spatial statistics computed from the arrangement of voids, i.e. open spaces, and contact areas within the fracture. Patterns of voids and contact areas, with complexity typical of experimental data, are simulated by clipping a correlated Gaussian process defined on a N by N pixel square region. The voids have constant aperture; the distance between the upper and lower surfaces which define the fracture is either zero or a constant. Local flow is assumed to be proportional to local aperture cubed times local pressure gradient. The flow through a pattern of voids and contact areas is solved using a finite-difference method. After solving for the flow through simulated 10 by 10 by 30 pixel patterns of voids and contact areas, a model to predict equivalent permeability is developed. The first model is for patterns with 80% voids where all voids have the same aperture. The equivalent permeability of a pattern is predicted in terms of spatial statistics computed from the arrangement of voids and contact areas within the pattern. Four spatial statistics are examined. The change point statistic measures how often adjacent pixel alternate from void to contact area (or vice versa ) in the rows of the patterns which are parallel to the overall flow direction. 37 refs., 66 figs., 41 tabs
Statistical learning and probabilistic prediction in music cognition: mechanisms of stylistic enculturation.

Science.gov (United States)

Pearce, Marcus T

2018-05-11

Music perception depends on internal psychological models derived through exposure to a musical culture. It is hypothesized that this musical enculturation depends on two cognitive processes: (1) statistical learning, in which listeners acquire internal cognitive models of statistical regularities present in the music to which they are exposed; and (2) probabilistic prediction based on these learned models that enables listeners to organize and process their mental representations of music. To corroborate these hypotheses, I review research that uses a computational model of probabilistic prediction based on statistical learning (the information dynamics of music (IDyOM) model) to simulate data from empirical studies of human listeners. The results show that a broad range of psychological processes involved in music perception-expectation, emotion, memory, similarity, segmentation, and meter-can be understood in terms of a single, underlying process of probabilistic prediction using learned statistical models. Furthermore, IDyOM simulations of listeners from different musical cultures demonstrate that statistical learning can plausibly predict causal effects of differential cultural exposure to musical styles, providing a quantitative model of cultural distance. Understanding the neural basis of musical enculturation will benefit from close coordination between empirical neuroimaging and computational modeling of underlying mechanisms, as outlined here. © 2018 The Authors. Annals of the New York Academy of Sciences published by Wiley Periodicals, Inc. on behalf of New York Academy of Sciences.
Knowledge-Intensive Gathering and Integration of Statistical Information on European Fisheries

NARCIS (Netherlands)

Klinkert, M.; Treur, J.; Verwaart, T.; Loganantharaj, R.; Palm, G.; Ali, M.

2000-01-01

Gathering, maintenance, integration and presentation of statistics are major activities of the Dutch Agricultural Economics Research Institute LEI. In this paper we explore how knowledge and agent technology can be exploited to support the information gathering and integration process. In
A neighborhood statistics model for predicting stream pathogen indicator levels.

Science.gov (United States)

Pandey, Pramod K; Pasternack, Gregory B; Majumder, Mahbubul; Soupir, Michelle L; Kaiser, Mark S

2015-03-01

Because elevated levels of water-borne Escherichia coli in streams are a leading cause of water quality impairments in the U.S., water-quality managers need tools for predicting aqueous E. coli levels. Presently, E. coli levels may be predicted using complex mechanistic models that have a high degree of unchecked uncertainty or simpler statistical models. To assess spatio-temporal patterns of instream E. coli levels, herein we measured E. coli, a pathogen indicator, at 16 sites (at four different times) within the Squaw Creek watershed, Iowa, and subsequently, the Markov Random Field model was exploited to develop a neighborhood statistics model for predicting instream E. coli levels. Two observed covariates, local water temperature (degrees Celsius) and mean cross-sectional depth (meters), were used as inputs to the model. Predictions of E. coli levels in the water column were compared with independent observational data collected from 16 in-stream locations. The results revealed that spatio-temporal averages of predicted and observed E. coli levels were extremely close. Approximately 66 % of individual predicted E. coli concentrations were within a factor of 2 of the observed values. In only one event, the difference between prediction and observation was beyond one order of magnitude. The mean of all predicted values at 16 locations was approximately 1 % higher than the mean of the observed values. The approach presented here will be useful while assessing instream contaminations such as pathogen/pathogen indicator levels at the watershed scale.
Nonparametric Bayesian predictive distributions for future order statistics

Science.gov (United States)

Richard A. Johnson; James W. Evans; David W. Green

1999-01-01

We derive the predictive distribution for a specified order statistic, determined from a future random sample, under a Dirichlet process prior. Two variants of the approach are treated and some limiting cases studied. A practical application to monitoring the strength of lumber is discussed including choices of prior expectation and comparisons made to a Bayesian...
Scalable Integrated Region-Based Image Retrieval Using IRM and Statistical Clustering.

Science.gov (United States)

Wang, James Z.; Du, Yanping

Statistical clustering is critical in designing scalable image retrieval systems. This paper presents a scalable algorithm for indexing and retrieving images based on region segmentation. The method uses statistical clustering on region features and IRM (Integrated Region Matching), a measure developed to evaluate overall similarity between images…
Integrated Detection and Prediction of Influenza Activity for Real-Time Surveillance: Algorithm Design.

Science.gov (United States)

Spreco, Armin; Eriksson, Olle; Dahlström, Örjan; Cowling, Benjamin John; Timpka, Toomas

2017-06-15

Influenza is a viral respiratory disease capable of causing epidemics that represent a threat to communities worldwide. The rapidly growing availability of electronic "big data" from diagnostic and prediagnostic sources in health care and public health settings permits advance of a new generation of methods for local detection and prediction of winter influenza seasons and influenza pandemics. The aim of this study was to present a method for integrated detection and prediction of influenza virus activity in local settings using electronically available surveillance data and to evaluate its performance by retrospective application on authentic data from a Swedish county. An integrated detection and prediction method was formally defined based on a design rationale for influenza detection and prediction methods adapted for local surveillance. The novel method was retrospectively applied on data from the winter influenza season 2008-09 in a Swedish county (population 445,000). Outcome data represented individuals who met a clinical case definition for influenza (based on International Classification of Diseases version 10 [ICD-10] codes) from an electronic health data repository. Information from calls to a telenursing service in the county was used as syndromic data source. The novel integrated detection and prediction method is based on nonmechanistic statistical models and is designed for integration in local health information systems. The method is divided into separate modules for detection and prediction of local influenza virus activity. The function of the detection module is to alert for an upcoming period of increased load of influenza cases on local health care (using influenza-diagnosis data), whereas the function of the prediction module is to predict the timing of the activity peak (using syndromic data) and its intensity (using influenza-diagnosis data). For detection modeling, exponential regression was used based on the assumption that the beginning
A systems approach to integrative biology: an overview of statistical methods to elucidate association and architecture.

Science.gov (United States)

Ciaccio, Mark F; Finkle, Justin D; Xue, Albert Y; Bagheri, Neda

2014-07-01

An organism's ability to maintain a desired physiological response relies extensively on how cellular and molecular signaling networks interpret and react to environmental cues. The capacity to quantitatively predict how networks respond to a changing environment by modifying signaling regulation and phenotypic responses will help inform and predict the impact of a changing global enivronment on organisms and ecosystems. Many computational strategies have been developed to resolve cue-signal-response networks. However, selecting a strategy that answers a specific biological question requires knowledge both of the type of data being collected, and of the strengths and weaknesses of different computational regimes. We broadly explore several computational approaches, and we evaluate their accuracy in predicting a given response. Specifically, we describe how statistical algorithms can be used in the context of integrative and comparative biology to elucidate the genomic, proteomic, and/or cellular networks responsible for robust physiological response. As a case study, we apply this strategy to a dataset of quantitative levels of protein abundance from the mussel, Mytilus galloprovincialis, to uncover the temperature-dependent signaling network. © The Author 2014. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
Using HIV&AIDS statistics in pre-service Mathematics Education to integrate HIV&AIDS education.

Science.gov (United States)

van Laren, Linda

2012-12-01

In South Africa, the HIV&AIDS education policy documents indicate opportunities for integration across disciplines/subjects. There are different interpretations of integration/inclusion and mainstreaming HIV&AIDS education, and numerous levels of integration. Integration ensures that learners experience the disciplines/subjects as being linked and related, and integration is required to support and expand the learners' opportunities to attain skills, acquire knowledge and develop attitudes and values across the curriculum. This study makes use of self-study methodology where I, a teacher educator, aim to improve my practice through including HIV&AIDS statistics in Mathematics Education. This article focuses on how I used HIV&AIDS statistics to facilitate pre-service teacher reflection and introduce them to integration of HIV&AIDS education across the curriculum. After pre-service teachers were provided with HIV statistics, they drew a pie chart which graphically illustrated the situation and reflected on issues relating to HIV&AIDS. Three themes emerged from the analysis of their reflections. The themes relate to the need for further HIV&AIDS education, the changing pastoral role of teachers and the changing context of teaching. This information indicates that the use of statistics is an appropriate means of initiating the integration of HIV&AIDS education into the academic curriculum.
Analysis and Evaluation of Statistical Models for Integrated Circuits Design

Directory of Open Access Journals (Sweden)

Sáenz-Noval J.J.

2011-10-01

Full Text Available Statistical models for integrated circuits (IC allow us to estimate the percentage of acceptable devices in the batch before fabrication. Actually, Pelgrom is the statistical model most accepted in the industry; however it was derived from a micrometer technology, which does not guarantee reliability in nanometric manufacturing processes. This work considers three of the most relevant statistical models in the industry and evaluates their limitations and advantages in analog design, so that the designer has a better criterion to make a choice. Moreover, it shows how several statistical models can be used for each one of the stages and design purposes.
BetaTPred: prediction of beta-TURNS in a protein using statistical algorithms.

Science.gov (United States)

Kaur, Harpreet; Raghava, G P S

2002-03-01

beta-turns play an important role from a structural and functional point of view. beta-turns are the most common type of non-repetitive structures in proteins and comprise on average, 25% of the residues. In the past numerous methods have been developed to predict beta-turns in a protein. Most of these prediction methods are based on statistical approaches. In order to utilize the full potential of these methods, there is a need to develop a web server. This paper describes a web server called BetaTPred, developed for predicting beta-TURNS in a protein from its amino acid sequence. BetaTPred allows the user to predict turns in a protein using existing statistical algorithms. It also allows to predict different types of beta-TURNS e.g. type I, I', II, II', VI, VIII and non-specific. This server assists the users in predicting the consensus beta-TURNS in a protein. The server is accessible from http://imtech.res.in/raghava/betatpred/
A statistically rigorous sampling design to integrate avian monitoring and management within Bird Conservation Regions.

Science.gov (United States)

Pavlacky, David C; Lukacs, Paul M; Blakesley, Jennifer A; Skorkowsky, Robert C; Klute, David S; Hahn, Beth A; Dreitz, Victoria J; George, T Luke; Hanni, David J

2017-01-01

Monitoring is an essential component of wildlife management and conservation. However, the usefulness of monitoring data is often undermined by the lack of 1) coordination across organizations and regions, 2) meaningful management and conservation objectives, and 3) rigorous sampling designs. Although many improvements to avian monitoring have been discussed, the recommendations have been slow to emerge in large-scale programs. We introduce the Integrated Monitoring in Bird Conservation Regions (IMBCR) program designed to overcome the above limitations. Our objectives are to outline the development of a statistically defensible sampling design to increase the value of large-scale monitoring data and provide example applications to demonstrate the ability of the design to meet multiple conservation and management objectives. We outline the sampling process for the IMBCR program with a focus on the Badlands and Prairies Bird Conservation Region (BCR 17). We provide two examples for the Brewer's sparrow (Spizella breweri) in BCR 17 demonstrating the ability of the design to 1) determine hierarchical population responses to landscape change and 2) estimate hierarchical habitat relationships to predict the response of the Brewer's sparrow to conservation efforts at multiple spatial scales. The collaboration across organizations and regions provided economy of scale by leveraging a common data platform over large spatial scales to promote the efficient use of monitoring resources. We designed the IMBCR program to address the information needs and core conservation and management objectives of the participating partner organizations. Although it has been argued that probabilistic sampling designs are not practical for large-scale monitoring, the IMBCR program provides a precedent for implementing a statistically defensible sampling design from local to bioregional scales. We demonstrate that integrating conservation and management objectives with rigorous statistical
A statistically rigorous sampling design to integrate avian monitoring and management within Bird Conservation Regions.

Directory of Open Access Journals (Sweden)

David C Pavlacky

Full Text Available Monitoring is an essential component of wildlife management and conservation. However, the usefulness of monitoring data is often undermined by the lack of 1 coordination across organizations and regions, 2 meaningful management and conservation objectives, and 3 rigorous sampling designs. Although many improvements to avian monitoring have been discussed, the recommendations have been slow to emerge in large-scale programs. We introduce the Integrated Monitoring in Bird Conservation Regions (IMBCR program designed to overcome the above limitations. Our objectives are to outline the development of a statistically defensible sampling design to increase the value of large-scale monitoring data and provide example applications to demonstrate the ability of the design to meet multiple conservation and management objectives. We outline the sampling process for the IMBCR program with a focus on the Badlands and Prairies Bird Conservation Region (BCR 17. We provide two examples for the Brewer's sparrow (Spizella breweri in BCR 17 demonstrating the ability of the design to 1 determine hierarchical population responses to landscape change and 2 estimate hierarchical habitat relationships to predict the response of the Brewer's sparrow to conservation efforts at multiple spatial scales. The collaboration across organizations and regions provided economy of scale by leveraging a common data platform over large spatial scales to promote the efficient use of monitoring resources. We designed the IMBCR program to address the information needs and core conservation and management objectives of the participating partner organizations. Although it has been argued that probabilistic sampling designs are not practical for large-scale monitoring, the IMBCR program provides a precedent for implementing a statistically defensible sampling design from local to bioregional scales. We demonstrate that integrating conservation and management objectives with rigorous
Optimizing Groundwater Monitoring Networks Using Integrated Statistical and Geostatistical Approaches

Directory of Open Access Journals (Sweden)

Jay Krishna Thakur

2015-08-01

Full Text Available The aim of this work is to investigate new approaches using methods based on statistics and geo-statistics for spatio-temporal optimization of groundwater monitoring networks. The formulated and integrated methods were tested with the groundwater quality data set of Bitterfeld/Wolfen, Germany. Spatially, the monitoring network was optimized using geo-statistical methods. Temporal optimization of the monitoring network was carried out using Sen’s method (1968. For geostatistical network optimization, a geostatistical spatio-temporal algorithm was used to identify redundant wells in 2- and 2.5-D Quaternary and Tertiary aquifers. Influences of interpolation block width, dimension, contaminant association, groundwater flow direction and aquifer homogeneity on statistical and geostatistical methods for monitoring network optimization were analysed. The integrated approach shows 37% and 28% redundancies in the monitoring network in Quaternary aquifer and Tertiary aquifer respectively. The geostatistical method also recommends 41 and 22 new monitoring wells in the Quaternary and Tertiary aquifers respectively. In temporal optimization, an overall optimized sampling interval was recommended in terms of lower quartile (238 days, median quartile (317 days and upper quartile (401 days in the research area of Bitterfeld/Wolfen. Demonstrated methods for improving groundwater monitoring network can be used in real monitoring network optimization with due consideration given to influencing factors.

The extraction and integration framework: a two-process account of statistical learning.

Science.gov (United States)

Thiessen, Erik D; Kronstein, Alexandra T; Hufnagle, Daniel G

2013-07-01

The term statistical learning in infancy research originally referred to sensitivity to transitional probabilities. Subsequent research has demonstrated that statistical learning contributes to infant development in a wide array of domains. The range of statistical learning phenomena necessitates a broader view of the processes underlying statistical learning. Learners are sensitive to a much wider range of statistical information than the conditional relations indexed by transitional probabilities, including distributional and cue-based statistics. We propose a novel framework that unifies learning about all of these kinds of statistical structure. From our perspective, learning about conditional relations outputs discrete representations (such as words). Integration across these discrete representations yields sensitivity to cues and distributional information. To achieve sensitivity to all of these kinds of statistical structure, our framework combines processes that extract segments of the input with processes that compare across these extracted items. In this framework, the items extracted from the input serve as exemplars in long-term memory. The similarity structure of those exemplars in long-term memory leads to the discovery of cues and categorical structure, which guides subsequent extraction. The extraction and integration framework provides a way to explain sensitivity to both conditional statistical structure (such as transitional probabilities) and distributional statistical structure (such as item frequency and variability), and also a framework for thinking about how these different aspects of statistical learning influence each other. 2013 APA, all rights reserved
Advanced data analysis in neuroscience integrating statistical and computational models

CERN Document Server

Durstewitz, Daniel

2017-01-01

This book is intended for use in advanced graduate courses in statistics / machine learning, as well as for all experimental neuroscientists seeking to understand statistical methods at a deeper level, and theoretical neuroscientists with a limited background in statistics. It reviews almost all areas of applied statistics, from basic statistical estimation and test theory, linear and nonlinear approaches for regression and classification, to model selection and methods for dimensionality reduction, density estimation and unsupervised clustering. Its focus, however, is linear and nonlinear time series analysis from a dynamical systems perspective, based on which it aims to convey an understanding also of the dynamical mechanisms that could have generated observed time series. Further, it integrates computational modeling of behavioral and neural dynamics with statistical estimation and hypothesis testing. This way computational models in neuroscience are not only explanat ory frameworks, but become powerfu...
Statistical predictions from anarchic field theory landscapes

International Nuclear Information System (INIS)

Balasubramanian, Vijay; Boer, Jan de; Naqvi, Asad

2010-01-01

Consistent coupling of effective field theories with a quantum theory of gravity appears to require bounds on the rank of the gauge group and the amount of matter. We consider landscapes of field theories subject to such to boundedness constraints. We argue that appropriately 'coarse-grained' aspects of the randomly chosen field theory in such landscapes, such as the fraction of gauge groups with ranks in a given range, can be statistically predictable. To illustrate our point we show how the uniform measures on simple classes of N=1 quiver gauge theories localize in the vicinity of theories with certain typical structures. Generically, this approach would predict a high energy theory with very many gauge factors, with the high rank factors largely decoupled from the low rank factors if we require asymptotic freedom for the latter.
Using machine learning, neural networks and statistics to predict bankruptcy

NARCIS (Netherlands)

Pompe, P.P.M.; Feelders, A.J.; Feelders, A.J.

1997-01-01

Recent literature strongly suggests that machine learning approaches to classification outperform "classical" statistical methods. We make a comparison between the performance of linear discriminant analysis, classification trees, and neural networks in predicting corporate bankruptcy. Linear
Application of statistical classification methods for predicting the acceptability of well-water quality

Science.gov (United States)

Cameron, Enrico; Pilla, Giorgio; Stella, Fabio A.

2018-01-01

The application of statistical classification methods is investigated—in comparison also to spatial interpolation methods—for predicting the acceptability of well-water quality in a situation where an effective quantitative model of the hydrogeological system under consideration cannot be developed. In the example area in northern Italy, in particular, the aquifer is locally affected by saline water and the concentration of chloride is the main indicator of both saltwater occurrence and groundwater quality. The goal is to predict if the chloride concentration in a water well will exceed the allowable concentration so that the water is unfit for the intended use. A statistical classification algorithm achieved the best predictive performances and the results of the study show that statistical classification methods provide further tools for dealing with groundwater quality problems concerning hydrogeological systems that are too difficult to describe analytically or to simulate effectively.
Output from Statistical Predictive Models as Input to eLearning Dashboards

Directory of Open Access Journals (Sweden)

Marlene A. Smith

2015-06-01

Full Text Available We describe how statistical predictive models might play an expanded role in educational analytics by giving students automated, real-time information about what their current performance means for eventual success in eLearning environments. We discuss how an online messaging system might tailor information to individual students using predictive analytics. The proposed system would be data-driven and quantitative; e.g., a message might furnish the probability that a student will successfully complete the certificate requirements of a massive open online course. Repeated messages would prod underperforming students and alert instructors to those in need of intervention. Administrators responsible for accreditation or outcomes assessment would have ready documentation of learning outcomes and actions taken to address unsatisfactory student performance. The article’s brief introduction to statistical predictive models sets the stage for a description of the messaging system. Resources and methods needed to develop and implement the system are discussed.
Technical Topic 3.2.2.d Bayesian and Non-Parametric Statistics: Integration of Neural Networks with Bayesian Networks for Data Fusion and Predictive Modeling

Science.gov (United States)

2016-05-31

Distribution Unlimited UU UU UU UU 31-05-2016 15-Apr-2014 14-Jan-2015 Final Report: Technical Topic 3.2.2.d Bayesian and Non- parametric Statistics...of Papers published in non peer-reviewed journals: Final Report: Technical Topic 3.2.2.d Bayesian and Non- parametric Statistics: Integration of Neural...Transfer N/A Number of graduating undergraduates who achieved a 3.5 GPA to 4.0 (4.0 max scale ): Number of graduating undergraduates funded by a DoD funded
Performance prediction for silicon photonics integrated circuits with layout-dependent correlated manufacturing variability.

Science.gov (United States)

Lu, Zeqin; Jhoja, Jaspreet; Klein, Jackson; Wang, Xu; Liu, Amy; Flueckiger, Jonas; Pond, James; Chrostowski, Lukas

2017-05-01

This work develops an enhanced Monte Carlo (MC) simulation methodology to predict the impacts of layout-dependent correlated manufacturing variations on the performance of photonics integrated circuits (PICs). First, to enable such performance prediction, we demonstrate a simple method with sub-nanometer accuracy to characterize photonics manufacturing variations, where the width and height for a fabricated waveguide can be extracted from the spectral response of a racetrack resonator. By measuring the spectral responses for a large number of identical resonators spread over a wafer, statistical results for the variations of waveguide width and height can be obtained. Second, we develop models for the layout-dependent enhanced MC simulation. Our models use netlist extraction to transfer physical layouts into circuit simulators. Spatially correlated physical variations across the PICs are simulated on a discrete grid and are mapped to each circuit component, so that the performance for each component can be updated according to its obtained variations, and therefore, circuit simulations take the correlated variations between components into account. The simulation flow and theoretical models for our layout-dependent enhanced MC simulation are detailed in this paper. As examples, several ring-resonator filter circuits are studied using the developed enhanced MC simulation, and statistical results from the simulations can predict both common-mode and differential-mode variations of the circuit performance.
Predicting statistical properties of open reading frames in bacterial genomes.

Directory of Open Access Journals (Sweden)

Katharina Mir

Full Text Available An analytical model based on the statistical properties of Open Reading Frames (ORFs of eubacterial genomes such as codon composition and sequence length of all reading frames was developed. This new model predicts the average length, maximum length as well as the length distribution of the ORFs of 70 species with GC contents varying between 21% and 74%. Furthermore, the number of annotated genes is predicted with high accordance. However, the ORF length distribution in the five alternative reading frames shows interesting deviations from the predicted distribution. In particular, long ORFs appear more often than expected statistically. The unexpected depletion of stop codons in these alternative open reading frames cannot completely be explained by a biased codon usage in the +1 frame. While it is unknown if the stop codon depletion has a biological function, it could be due to a protein coding capacity of alternative ORFs exerting a selection pressure which prevents the fixation of stop codon mutations. The comparison of the analytical model with bacterial genomes, therefore, leads to a hypothesis suggesting novel gene candidates which can now be investigated in subsequent wet lab experiments.
Integrating source-language context into phrase-based statistical machine translation

NARCIS (Netherlands)

Haque, R.; Kumar Naskar, S.; Bosch, A.P.J. van den; Way, A.

2011-01-01

The translation features typically used in Phrase-Based Statistical Machine Translation (PB-SMT) model dependencies between the source and target phrases, but not among the phrases in the source language themselves. A swathe of research has demonstrated that integrating source context modelling
Estimating cross-validatory predictive p-values with integrated importance sampling for disease mapping models.

Science.gov (United States)

Li, Longhai; Feng, Cindy X; Qiu, Shi

2017-06-30

An important statistical task in disease mapping problems is to identify divergent regions with unusually high or low risk of disease. Leave-one-out cross-validatory (LOOCV) model assessment is the gold standard for estimating predictive p-values that can flag such divergent regions. However, actual LOOCV is time-consuming because one needs to rerun a Markov chain Monte Carlo analysis for each posterior distribution in which an observation is held out as a test case. This paper introduces a new method, called integrated importance sampling (iIS), for estimating LOOCV predictive p-values with only Markov chain samples drawn from the posterior based on a full data set. The key step in iIS is that we integrate away the latent variables associated the test observation with respect to their conditional distribution without reference to the actual observation. By following the general theory for importance sampling, the formula used by iIS can be proved to be equivalent to the LOOCV predictive p-value. We compare iIS and other three existing methods in the literature with two disease mapping datasets. Our empirical results show that the predictive p-values estimated with iIS are almost identical to the predictive p-values estimated with actual LOOCV and outperform those given by the existing three methods, namely, the posterior predictive checking, the ordinary importance sampling, and the ghosting method by Marshall and Spiegelhalter (2003). Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
A statistical approach for predicting thermal diffusivity profiles in fusion plasmas as a transport model

International Nuclear Information System (INIS)

Yokoyama, Masayuki

2014-01-01

A statistical approach is proposed to predict thermal diffusivity profiles as a transport “model” in fusion plasmas. It can provide regression expressions for the ion and electron heat diffusivities (χ i and χ e ), separately, to construct their radial profiles. An approach that this letter is proposing outstrips the conventional scaling laws for the global confinement time (τ E ) since it also deals with profiles (temperature, density, heating depositions etc.). This approach has become possible with the analysis database accumulated by the extensive application of the integrated transport analysis suite to experiment data. In this letter, TASK3D-a analysis database for high-ion-temperature (high-T i ) plasmas in the LHD (Large Helical Device) is used as an example to describe an approach. (author)
Predicting Flowering Behavior and Exploring Its Genetic Determinism in an Apple Multi-family Population Based on Statistical Indices and Simplified Phenotyping.

Science.gov (United States)

Durand, Jean-Baptiste; Allard, Alix; Guitton, Baptiste; van de Weg, Eric; Bink, Marco C A M; Costes, Evelyne

2017-01-01

Irregular flowering over years is commonly observed in fruit trees. The early prediction of tree behavior is highly desirable in breeding programmes. This study aims at performing such predictions, combining simplified phenotyping and statistics methods. Sequences of vegetative vs. floral annual shoots (AS) were observed along axes in trees belonging to five apple related full-sib families. Sequences were analyzed using Markovian and linear mixed models including year and site effects. Indices of flowering irregularity, periodicity and synchronicity were estimated, at tree and axis scales. They were used to predict tree behavior and detect QTL with a Bayesian pedigree-based analysis, using an integrated genetic map containing 6,849 SNPs. The combination of a Biennial Bearing Index (BBI) with an autoregressive coefficient (γ g ) efficiently predicted and classified the genotype behaviors, despite few misclassifications. Four QTLs common to BBIs and γ g and one for synchronicity were highlighted and revealed the complex genetic architecture of the traits. Irregularity resulted from high AS synchronism, whereas regularity resulted from either asynchronous locally alternating or continual regular AS flowering. A relevant and time-saving method, based on a posteriori sampling of axes and statistical indices is proposed, which is efficient to evaluate the tree breeding values for flowering regularity and could be transferred to other species.
Predictive Analytics in Information Systems Research

NARCIS (Netherlands)

G. Shmueli (Galit); O.R. Koppius (Otto)

2011-01-01

textabstractThis research essay highlights the need to integrate predictive analytics into information systems research and shows several concrete ways in which this goal can be accomplished. Predictive analytics include empirical methods (statistical and other) that generate data predictions as
An integrated artificial neural networks approach for predicting global radiation

International Nuclear Information System (INIS)

Azadeh, A.; Maghsoudi, A.; Sohrabkhani, S.

2009-01-01

This article presents an integrated artificial neural network (ANN) approach for predicting solar global radiation by climatological variables. The integrated ANN trains and tests data with multi layer perceptron (MLP) approach which has the lowest mean absolute percentage error (MAPE). The proposed approach is particularly useful for locations where no available measurement equipment. Also, it considers all related climatological and meteorological parameters as input variables. To show the applicability and superiority of the integrated ANN approach, monthly data were collected for 6 years (1995-2000) in six nominal cities in Iran. Separate model for each city is considered and the quantity of solar global radiation in each city is calculated. Furthermore an integrated ANN model has been introduced for prediction of solar global radiation. The acquired results of the integrated model have shown high accuracy of about 94%. The results of the integrated model have been compared with traditional angstrom's model to show its considerable accuracy. Therefore, the proposed approach can be used as an efficient tool for prediction of solar radiation in the remote and rural locations with no direct measurement equipment.
UK Environmental Prediction - integration and evaluation at the convective scale

Science.gov (United States)

Fallmann, Joachim; Lewis, Huw; Castillo, Juan Manuel; Pearson, David; Harris, Chris; Saulter, Andy; Bricheno, Lucy; Blyth, Eleanor

2016-04-01

Traditionally, the simulation of regional ocean, wave and atmosphere components of the Earth System have been considered separately, with some information on other components provided by means of boundary or forcing conditions. More recently, the potential value of a more integrated approach, as required for global climate and Earth System prediction, for regional short-term applications has begun to gain increasing research effort. In the UK, this activity is motivated by an understanding that accurate prediction and warning of the impacts of severe weather requires an integrated approach to forecasting. The substantial impacts on individuals, businesses and infrastructure of such events indicate a pressing need to understand better the value that might be delivered through more integrated environmental prediction. To address this need, the Met Office, NERC Centre for Ecology & Hydrology and NERC National Oceanography Centre have begun to develop the foundations of a coupled high resolution probabilistic forecast system for the UK at km-scale. This links together existing model components of the atmosphere, coastal ocean, land surface and hydrology. Our initial focus has been on a 2-year Prototype project to demonstrate the UK coupled prediction concept in research mode. This presentation will provide an update on UK environmental prediction activities. We will present the results from the initial implementation of an atmosphere-land-ocean coupled system, including a new eddy-permitting resolution ocean component, and discuss progress and initial results from further development to integrate wave interactions in this relatively high resolution system. We will discuss future directions and opportunities for collaboration in environmental prediction, and the challenges to realise the potential of integrated regional coupled forecasting for improving predictions and applications.
IPF-LASSO: Integrative L1-Penalized Regression with Penalty Factors for Prediction Based on Multi-Omics Data

Directory of Open Access Journals (Sweden)

Anne-Laure Boulesteix

2017-01-01

Full Text Available As modern biotechnologies advance, it has become increasingly frequent that different modalities of high-dimensional molecular data (termed “omics” data in this paper, such as gene expression, methylation, and copy number, are collected from the same patient cohort to predict the clinical outcome. While prediction based on omics data has been widely studied in the last fifteen years, little has been done in the statistical literature on the integration of multiple omics modalities to select a subset of variables for prediction, which is a critical task in personalized medicine. In this paper, we propose a simple penalized regression method to address this problem by assigning different penalty factors to different data modalities for feature selection and prediction. The penalty factors can be chosen in a fully data-driven fashion by cross-validation or by taking practical considerations into account. In simulation studies, we compare the prediction performance of our approach, called IPF-LASSO (Integrative LASSO with Penalty Factors and implemented in the R package ipflasso, with the standard LASSO and sparse group LASSO. The use of IPF-LASSO is also illustrated through applications to two real-life cancer datasets. All data and codes are available on the companion website to ensure reproducibility.
A statistical model for predicting muscle performance

Science.gov (United States)

Byerly, Diane Leslie De Caix

The objective of these studies was to develop a capability for predicting muscle performance and fatigue to be utilized for both space- and ground-based applications. To develop this predictive model, healthy test subjects performed a defined, repetitive dynamic exercise to failure using a Lordex spinal machine. Throughout the exercise, surface electromyography (SEMG) data were collected from the erector spinae using a Mega Electronics ME3000 muscle tester and surface electrodes placed on both sides of the back muscle. These data were analyzed using a 5th order Autoregressive (AR) model and statistical regression analysis. It was determined that an AR derived parameter, the mean average magnitude of AR poles, significantly correlated with the maximum number of repetitions (designated Rmax) that a test subject was able to perform. Using the mean average magnitude of AR poles, a test subject's performance to failure could be predicted as early as the sixth repetition of the exercise. This predictive model has the potential to provide a basis for improving post-space flight recovery, monitoring muscle atrophy in astronauts and assessing the effectiveness of countermeasures, monitoring astronaut performance and fatigue during Extravehicular Activity (EVA) operations, providing pre-flight assessment of the ability of an EVA crewmember to perform a given task, improving the design of training protocols and simulations for strenuous International Space Station assembly EVA, and enabling EVA work task sequences to be planned enhancing astronaut performance and safety. Potential ground-based, medical applications of the predictive model include monitoring muscle deterioration and performance resulting from illness, establishing safety guidelines in the industry for repetitive tasks, monitoring the stages of rehabilitation for muscle-related injuries sustained in sports and accidents, and enhancing athletic performance through improved training protocols while reducing
Statistical Analysis of CFD Solutions from the Fourth AIAA Drag Prediction Workshop

Science.gov (United States)

Morrison, Joseph H.

2010-01-01

A graphical framework is used for statistical analysis of the results from an extensive N-version test of a collection of Reynolds-averaged Navier-Stokes computational fluid dynamics codes. The solutions were obtained by code developers and users from the U.S., Europe, Asia, and Russia using a variety of grid systems and turbulence models for the June 2009 4th Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration for this workshop was a new subsonic transport model, the Common Research Model, designed using a modern approach for the wing and included a horizontal tail. The fourth workshop focused on the prediction of both absolute and incremental drag levels for wing-body and wing-body-horizontal tail configurations. This work continues the statistical analysis begun in the earlier workshops and compares the results from the grid convergence study of the most recent workshop with earlier workshops using the statistical framework.
Statistical Modeling and Prediction for Tourism Economy Using Dendritic Neural Network.

Science.gov (United States)

Yu, Ying; Wang, Yirui; Gao, Shangce; Tang, Zheng

2017-01-01

With the impact of global internationalization, tourism economy has also been a rapid development. The increasing interest aroused by more advanced forecasting methods leads us to innovate forecasting methods. In this paper, the seasonal trend autoregressive integrated moving averages with dendritic neural network model (SA-D model) is proposed to perform the tourism demand forecasting. First, we use the seasonal trend autoregressive integrated moving averages model (SARIMA model) to exclude the long-term linear trend and then train the residual data by the dendritic neural network model and make a short-term prediction. As the result showed in this paper, the SA-D model can achieve considerably better predictive performances. In order to demonstrate the effectiveness of the SA-D model, we also use the data that other authors used in the other models and compare the results. It also proved that the SA-D model achieved good predictive performances in terms of the normalized mean square error, absolute percentage of error, and correlation coefficient.

Predictability of the recent slowdown and subsequent recovery of large-scale surface warming using statistical methods

Science.gov (United States)

Mann, Michael E.; Steinman, Byron A.; Miller, Sonya K.; Frankcombe, Leela M.; England, Matthew H.; Cheung, Anson H.

2016-04-01

The temporary slowdown in large-scale surface warming during the early 2000s has been attributed to both external and internal sources of climate variability. Using semiempirical estimates of the internal low-frequency variability component in Northern Hemisphere, Atlantic, and Pacific surface temperatures in concert with statistical hindcast experiments, we investigate whether the slowdown and its recent recovery were predictable. We conclude that the internal variability of the North Pacific, which played a critical role in the slowdown, does not appear to have been predictable using statistical forecast methods. An additional minor contribution from the North Atlantic, by contrast, appears to exhibit some predictability. While our analyses focus on combining semiempirical estimates of internal climatic variability with statistical hindcast experiments, possible implications for initialized model predictions are also discussed.
Uncertainty propagation for statistical impact prediction of space debris

Science.gov (United States)

Hoogendoorn, R.; Mooij, E.; Geul, J.

2018-01-01

Predictions of the impact time and location of space debris in a decaying trajectory are highly influenced by uncertainties. The traditional Monte Carlo (MC) method can be used to perform accurate statistical impact predictions, but requires a large computational effort. A method is investigated that directly propagates a Probability Density Function (PDF) in time, which has the potential to obtain more accurate results with less computational effort. The decaying trajectory of Delta-K rocket stages was used to test the methods using a six degrees-of-freedom state model. The PDF of the state of the body was propagated in time to obtain impact-time distributions. This Direct PDF Propagation (DPP) method results in a multi-dimensional scattered dataset of the PDF of the state, which is highly challenging to process. No accurate results could be obtained, because of the structure of the DPP data and the high dimensionality. Therefore, the DPP method is less suitable for practical uncontrolled entry problems and the traditional MC method remains superior. Additionally, the MC method was used with two improved uncertainty models to obtain impact-time distributions, which were validated using observations of true impacts. For one of the two uncertainty models, statistically more valid impact-time distributions were obtained than in previous research.
Predicting radiotherapy outcomes using statistical learning techniques

International Nuclear Information System (INIS)

El Naqa, Issam; Bradley, Jeffrey D; Deasy, Joseph O; Lindsay, Patricia E; Hope, Andrew J

2009-01-01

Radiotherapy outcomes are determined by complex interactions between treatment, anatomical and patient-related variables. A common obstacle to building maximally predictive outcome models for clinical practice is the failure to capture potential complexity of heterogeneous variable interactions and applicability beyond institutional data. We describe a statistical learning methodology that can automatically screen for nonlinear relations among prognostic variables and generalize to unseen data before. In this work, several types of linear and nonlinear kernels to generate interaction terms and approximate the treatment-response function are evaluated. Examples of institutional datasets of esophagitis, pneumonitis and xerostomia endpoints were used. Furthermore, an independent RTOG dataset was used for 'generalizabilty' validation. We formulated the discrimination between risk groups as a supervised learning problem. The distribution of patient groups was initially analyzed using principle components analysis (PCA) to uncover potential nonlinear behavior. The performance of the different methods was evaluated using bivariate correlations and actuarial analysis. Over-fitting was controlled via cross-validation resampling. Our results suggest that a modified support vector machine (SVM) kernel method provided superior performance on leave-one-out testing compared to logistic regression and neural networks in cases where the data exhibited nonlinear behavior on PCA. For instance, in prediction of esophagitis and pneumonitis endpoints, which exhibited nonlinear behavior on PCA, the method provided 21% and 60% improvements, respectively. Furthermore, evaluation on the independent pneumonitis RTOG dataset demonstrated good generalizabilty beyond institutional data in contrast with other models. This indicates that the prediction of treatment response can be improved by utilizing nonlinear kernel methods for discovering important nonlinear interactions among model
Statistical Analysis of CFD Solutions From the Fifth AIAA Drag Prediction Workshop

Science.gov (United States)

Morrison, Joseph H.

2013-01-01

A graphical framework is used for statistical analysis of the results from an extensive N-version test of a collection of Reynolds-averaged Navier-Stokes computational fluid dynamics codes. The solutions were obtained by code developers and users from North America, Europe, Asia, and South America using a common grid sequence and multiple turbulence models for the June 2012 fifth Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration for this workshop was the Common Research Model subsonic transport wing-body previously used for the 4th Drag Prediction Workshop. This work continues the statistical analysis begun in the earlier workshops and compares the results from the grid convergence study of the most recent workshop with previous workshops.
Predicting Flowering Behavior and Exploring Its Genetic Determinism in an Apple Multi-family Population Based on Statistical Indices and Simplified Phenotyping

Directory of Open Access Journals (Sweden)

Jean-Baptiste Durand

2017-06-01

Full Text Available Irregular flowering over years is commonly observed in fruit trees. The early prediction of tree behavior is highly desirable in breeding programmes. This study aims at performing such predictions, combining simplified phenotyping and statistics methods. Sequences of vegetative vs. floral annual shoots (AS were observed along axes in trees belonging to five apple related full-sib families. Sequences were analyzed using Markovian and linear mixed models including year and site effects. Indices of flowering irregularity, periodicity and synchronicity were estimated, at tree and axis scales. They were used to predict tree behavior and detect QTL with a Bayesian pedigree-based analysis, using an integrated genetic map containing 6,849 SNPs. The combination of a Biennial Bearing Index (BBI with an autoregressive coefficient (γg efficiently predicted and classified the genotype behaviors, despite few misclassifications. Four QTLs common to BBIs and γg and one for synchronicity were highlighted and revealed the complex genetic architecture of the traits. Irregularity resulted from high AS synchronism, whereas regularity resulted from either asynchronous locally alternating or continual regular AS flowering. A relevant and time-saving method, based on a posteriori sampling of axes and statistical indices is proposed, which is efficient to evaluate the tree breeding values for flowering regularity and could be transferred to other species.
Integration of Multi-Modal Biomedical Data to Predict Cancer Grade and Patient Survival.

Science.gov (United States)

Phan, John H; Hoffman, Ryan; Kothari, Sonal; Wu, Po-Yen; Wang, May D

2016-02-01

The Big Data era in Biomedical research has resulted in large-cohort data repositories such as The Cancer Genome Atlas (TCGA). These repositories routinely contain hundreds of matched patient samples for genomic, proteomic, imaging, and clinical data modalities, enabling holistic and multi-modal integrative analysis of human disease. Using TCGA renal and ovarian cancer data, we conducted a novel investigation of multi-modal data integration by combining histopathological image and RNA-seq data. We compared the performances of two integrative prediction methods: majority vote and stacked generalization. Results indicate that integration of multiple data modalities improves prediction of cancer grade and outcome. Specifically, stacked generalization, a method that integrates multiple data modalities to produce a single prediction result, outperforms both single-data-modality prediction and majority vote. Moreover, stacked generalization reveals the contribution of each data modality (and specific features within each data modality) to the final prediction result and may provide biological insights to explain prediction performance.
Integrating functional data to prioritize causal variants in statistical fine-mapping studies.

Directory of Open Access Journals (Sweden)

Gleb Kichaev

2014-10-01

Full Text Available Standard statistical approaches for prioritization of variants for functional testing in fine-mapping studies either use marginal association statistics or estimate posterior probabilities for variants to be causal under simplifying assumptions. Here, we present a probabilistic framework that integrates association strength with functional genomic annotation data to improve accuracy in selecting plausible causal variants for functional validation. A key feature of our approach is that it empirically estimates the contribution of each functional annotation to the trait of interest directly from summary association statistics while allowing for multiple causal variants at any risk locus. We devise efficient algorithms that estimate the parameters of our model across all risk loci to further increase performance. Using simulations starting from the 1000 Genomes data, we find that our framework consistently outperforms the current state-of-the-art fine-mapping methods, reducing the number of variants that need to be selected to capture 90% of the causal variants from an average of 13.3 to 10.4 SNPs per locus (as compared to the next-best performing strategy. Furthermore, we introduce a cost-to-benefit optimization framework for determining the number of variants to be followed up in functional assays and assess its performance using real and simulation data. We validate our findings using a large scale meta-analysis of four blood lipids traits and find that the relative probability for causality is increased for variants in exons and transcription start sites and decreased in repressed genomic regions at the risk loci of these traits. Using these highly predictive, trait-specific functional annotations, we estimate causality probabilities across all traits and variants, reducing the size of the 90% confidence set from an average of 17.5 to 13.5 variants per locus in this data.
Statistical Analysis of a Method to Predict Drug-Polymer Miscibility

DEFF Research Database (Denmark)

Knopp, Matthias Manne; Olesen, Niels Erik; Huang, Yanbin

2016-01-01

In this study, a method proposed to predict drug-polymer miscibility from differential scanning calorimetry measurements was subjected to statistical analysis. The method is relatively fast and inexpensive and has gained popularity as a result of the increasing interest in the formulation of drug...... as provided in this study. © 2015 Wiley Periodicals, Inc. and the American Pharmacists Association J Pharm Sci....
Statistical Modeling and Prediction for Tourism Economy Using Dendritic Neural Network

Directory of Open Access Journals (Sweden)

Ying Yu

2017-01-01

Full Text Available With the impact of global internationalization, tourism economy has also been a rapid development. The increasing interest aroused by more advanced forecasting methods leads us to innovate forecasting methods. In this paper, the seasonal trend autoregressive integrated moving averages with dendritic neural network model (SA-D model is proposed to perform the tourism demand forecasting. First, we use the seasonal trend autoregressive integrated moving averages model (SARIMA model to exclude the long-term linear trend and then train the residual data by the dendritic neural network model and make a short-term prediction. As the result showed in this paper, the SA-D model can achieve considerably better predictive performances. In order to demonstrate the effectiveness of the SA-D model, we also use the data that other authors used in the other models and compare the results. It also proved that the SA-D model achieved good predictive performances in terms of the normalized mean square error, absolute percentage of error, and correlation coefficient.
Translating visual information into action predictions: Statistical learning in action and nonaction contexts.

Science.gov (United States)

Monroy, Claire D; Gerson, Sarah A; Hunnius, Sabine

2018-05-01

Humans are sensitive to the statistical regularities in action sequences carried out by others. In the present eyetracking study, we investigated whether this sensitivity can support the prediction of upcoming actions when observing unfamiliar action sequences. In two between-subjects conditions, we examined whether observers would be more sensitive to statistical regularities in sequences performed by a human agent versus self-propelled 'ghost' events. Secondly, we investigated whether regularities are learned better when they are associated with contingent effects. Both implicit and explicit measures of learning were compared between agent and ghost conditions. Implicit learning was measured via predictive eye movements to upcoming actions or events, and explicit learning was measured via both uninstructed reproduction of the action sequences and verbal reports of the regularities. The findings revealed that participants, regardless of condition, readily learned the regularities and made correct predictive eye movements to upcoming events during online observation. However, different patterns of explicit-learning outcomes emerged following observation: Participants were most likely to re-create the sequence regularities and to verbally report them when they had observed an actor create a contingent effect. These results suggest that the shift from implicit predictions to explicit knowledge of what has been learned is facilitated when observers perceive another agent's actions and when these actions cause effects. These findings are discussed with respect to the potential role of the motor system in modulating how statistical regularities are learned and used to modify behavior.
Toward integration of genomic selection with crop modelling: the development of an integrated approach to predicting rice heading dates.

Science.gov (United States)

Onogi, Akio; Watanabe, Maya; Mochizuki, Toshihiro; Hayashi, Takeshi; Nakagawa, Hiroshi; Hasegawa, Toshihiro; Iwata, Hiroyoshi

2016-04-01

It is suggested that accuracy in predicting plant phenotypes can be improved by integrating genomic prediction with crop modelling in a single hierarchical model. Accurate prediction of phenotypes is important for plant breeding and management. Although genomic prediction/selection aims to predict phenotypes on the basis of whole-genome marker information, it is often difficult to predict phenotypes of complex traits in diverse environments, because plant phenotypes are often influenced by genotype-environment interaction. A possible remedy is to integrate genomic prediction with crop/ecophysiological modelling, which enables us to predict plant phenotypes using environmental and management information. To this end, in the present study, we developed a novel method for integrating genomic prediction with phenological modelling of Asian rice (Oryza sativa, L.), allowing the heading date of untested genotypes in untested environments to be predicted. The method simultaneously infers the phenological model parameters and whole-genome marker effects on the parameters in a Bayesian framework. By cultivating backcross inbred lines of Koshihikari × Kasalath in nine environments, we evaluated the potential of the proposed method in comparison with conventional genomic prediction, phenological modelling, and two-step methods that applied genomic prediction to phenological model parameters inferred from Nelder-Mead or Markov chain Monte Carlo algorithms. In predicting heading dates of untested lines in untested environments, the proposed and two-step methods tended to provide more accurate predictions than the conventional genomic prediction methods, particularly in environments where phenotypes from environments similar to the target environment were unavailable for training genomic prediction. The proposed method showed greater accuracy in prediction than the two-step methods in all cross-validation schemes tested, suggesting the potential of the integrated approach in
Revisiting EOR Projects in Indonesia through Integrated Study: EOR Screening, Predictive Model, and Optimisation

KAUST Repository

Hartono, A. D.; Hakiki, Farizal; Syihab, Z.; Ambia, F.; Yasutra, A.; Sutopo, S.; Efendi, M.; Sitompul, V.; Primasari, I.; Apriandi, R.

2017-01-01

EOR preliminary analysis is pivotal to be performed at early stage of assessment in order to elucidate EOR feasibility. This study proposes an in-depth analysis toolkit for EOR preliminary evaluation. The toolkit incorporates EOR screening, predictive, economic, risk analysis and optimisation modules. The screening module introduces algorithms which assimilates statistical and engineering notions into consideration. The United States Department of Energy (U.S. DOE) predictive models were implemented in the predictive module. The economic module is available to assess project attractiveness, while Monte Carlo Simulation is applied to quantify risk and uncertainty of the evaluated project. Optimization scenario of EOR practice can be evaluated using the optimisation module, in which stochastic methods of Genetic Algorithms (GA), Particle Swarm Optimization (PSO) and Evolutionary Strategy (ES) were applied in the algorithms. The modules were combined into an integrated package of EOR preliminary assessment. Finally, we utilised the toolkit to evaluate several Indonesian oil fields for EOR evaluation (past projects) and feasibility (future projects). The attempt was able to update the previous consideration regarding EOR attractiveness and open new opportunity for EOR implementation in Indonesia.
Revisiting EOR Projects in Indonesia through Integrated Study: EOR Screening, Predictive Model, and Optimisation

KAUST Repository

Hartono, A. D.

2017-10-17

EOR preliminary analysis is pivotal to be performed at early stage of assessment in order to elucidate EOR feasibility. This study proposes an in-depth analysis toolkit for EOR preliminary evaluation. The toolkit incorporates EOR screening, predictive, economic, risk analysis and optimisation modules. The screening module introduces algorithms which assimilates statistical and engineering notions into consideration. The United States Department of Energy (U.S. DOE) predictive models were implemented in the predictive module. The economic module is available to assess project attractiveness, while Monte Carlo Simulation is applied to quantify risk and uncertainty of the evaluated project. Optimization scenario of EOR practice can be evaluated using the optimisation module, in which stochastic methods of Genetic Algorithms (GA), Particle Swarm Optimization (PSO) and Evolutionary Strategy (ES) were applied in the algorithms. The modules were combined into an integrated package of EOR preliminary assessment. Finally, we utilised the toolkit to evaluate several Indonesian oil fields for EOR evaluation (past projects) and feasibility (future projects). The attempt was able to update the previous consideration regarding EOR attractiveness and open new opportunity for EOR implementation in Indonesia.
A multimetric approach for predicting the ecological integrity of New Zealand streams

Directory of Open Access Journals (Sweden)

Clapcott J.E.

2014-01-01

Full Text Available Integrating multiple measures of stream health into a combined metric can provide a holistic assessment of the ecological integrity of a stream. The aim of this study was to develop a multimetric index (MMI of stream integrity based on predictive modelling of national data sets of water quality, macroinvertebrates, fish and ecosystem process metrics. We used a boosted regression tree approach to calculate an observed/expected score for each metric prior to combining metrics in a MMI based on data availability and the strength of predictive models. The resulting MMI provides a geographically meaningful prediction of the ecological integrity of rivers in New Zealand, but identifies limitations in data and approach, providing focus for ongoing research.
Statistical Model Predictions for p+p and Pb+Pb Collisions at LHC

CERN Document Server

Kraus, I; Oeschler, H; Redlich, K; Wheaton, S

2009-01-01

Particle production in p+p and central Pb+Pb collisions at LHC is discussed in the context of the statistical thermal model. For heavy-ion collisions, predictions of various particle ratios are presented. The sensitivity of several ratios on the temperature and the baryon chemical potential is studied in detail, and some of them, which are particularly appropriate to determine the chemical freeze-out point experimentally, are indicated. Considering elementary interactions on the other hand, we focus on strangeness production and its possible suppression. Extrapolating the thermal parameters to LHC energy, we present predictions of the statistical model for particle yields in p+p collisions. We quantify the strangeness suppression by the correlation volume parameter and discuss its influence on particle production. We propose observables that can provide deeper insight into the mechanism of strangeness production and suppression at LHC.
On Extrapolating Past the Range of Observed Data When Making Statistical Predictions in Ecology.

Directory of Open Access Journals (Sweden)

Paul B Conn

Full Text Available Ecologists are increasingly using statistical models to predict animal abundance and occurrence in unsampled locations. The reliability of such predictions depends on a number of factors, including sample size, how far prediction locations are from the observed data, and similarity of predictive covariates in locations where data are gathered to locations where predictions are desired. In this paper, we propose extending Cook's notion of an independent variable hull (IVH, developed originally for application with linear regression models, to generalized regression models as a way to help assess the potential reliability of predictions in unsampled areas. Predictions occurring inside the generalized independent variable hull (gIVH can be regarded as interpolations, while predictions occurring outside the gIVH can be regarded as extrapolations worthy of additional investigation or skepticism. We conduct a simulation study to demonstrate the usefulness of this metric for limiting the scope of spatial inference when conducting model-based abundance estimation from survey counts. In this case, limiting inference to the gIVH substantially reduces bias, especially when survey designs are spatially imbalanced. We also demonstrate the utility of the gIVH in diagnosing problematic extrapolations when estimating the relative abundance of ribbon seals in the Bering Sea as a function of predictive covariates. We suggest that ecologists routinely use diagnostics such as the gIVH to help gauge the reliability of predictions from statistical models (such as generalized linear, generalized additive, and spatio-temporal regression models.
Structural integrity of frontostriatal connections predicts longitudinal changes in self-esteem.

Science.gov (United States)

Chavez, Robert S; Heatherton, Todd F

2017-06-01

Diverse neurological and psychiatric conditions are marked by a diminished sense of positive self-regard, and reductions in self-esteem are associated with risk for these disorders. Recent evidence has shown that the connectivity of frontostriatal circuitry reflects individual differences in self-esteem. However, it remains an open question as to whether the integrity of these connections can predict self-esteem changes over larger timescales. Using diffusion magnetic resonance imaging and probabilistic tractography, we demonstrate that the integrity of white matter pathways linking the medial prefrontal cortex to the ventral striatum predicts changes in self-esteem 8 months after initial scanning in a sample of 30 young adults. Individuals with greater integrity of this pathway during the scanning session at Time 1 showed increased levels of self-esteem at follow-up, whereas individuals with lower integrity showed stifled or decreased levels of self-esteem. These results provide evidence that frontostriatal white matter integrity predicts the trajectory of self-esteem development in early adulthood, which may contribute to blunted levels of positive self-regard seen in multiple psychiatric conditions, including depression and anxiety.
Estimating Predictive Variance for Statistical Gas Distribution Modelling

International Nuclear Information System (INIS)

Lilienthal, Achim J.; Asadi, Sahar; Reggente, Matteo

2009-01-01

Recent publications in statistical gas distribution modelling have proposed algorithms that model mean and variance of a distribution. This paper argues that estimating the predictive concentration variance entails not only a gradual improvement but is rather a significant step to advance the field. This is, first, since the models much better fit the particular structure of gas distributions, which exhibit strong fluctuations with considerable spatial variations as a result of the intermittent character of gas dispersal. Second, because estimating the predictive variance allows to evaluate the model quality in terms of the data likelihood. This offers a solution to the problem of ground truth evaluation, which has always been a critical issue for gas distribution modelling. It also enables solid comparisons of different modelling approaches, and provides the means to learn meta parameters of the model, to determine when the model should be updated or re-initialised, or to suggest new measurement locations based on the current model. We also point out directions of related ongoing or potential future research work.
Integration of modern statistical tools for the analysis of climate extremes into the web-GIS “CLIMATE”

Science.gov (United States)

Ryazanova, A. A.; Okladnikov, I. G.; Gordov, E. P.

2017-11-01

The frequency of occurrence and magnitude of precipitation and temperature extreme events show positive trends in several geographical regions. These events must be analyzed and studied in order to better understand their impact on the environment, predict their occurrences, and mitigate their effects. For this purpose, we augmented web-GIS called “CLIMATE” to include a dedicated statistical package developed in the R language. The web-GIS “CLIMATE” is a software platform for cloud storage processing and visualization of distributed archives of spatial datasets. It is based on a combined use of web and GIS technologies with reliable procedures for searching, extracting, processing, and visualizing the spatial data archives. The system provides a set of thematic online tools for the complex analysis of current and future climate changes and their effects on the environment. The package includes new powerful methods of time-dependent statistics of extremes, quantile regression and copula approach for the detailed analysis of various climate extreme events. Specifically, the very promising copula approach allows obtaining the structural connections between the extremes and the various environmental characteristics. The new statistical methods integrated into the web-GIS “CLIMATE” can significantly facilitate and accelerate the complex analysis of climate extremes using only a desktop PC connected to the Internet.
Statistical prediction of biomethane potentials based on the composition of lignocellulosic biomass

DEFF Research Database (Denmark)

Thomsen, Sune Tjalfe; Spliid, Henrik; Østergård, Hanne

2014-01-01

Mixture models are introduced as a new and stronger methodology for statistical prediction of biomethane potentials (BPM) from lignocellulosic biomass compared to the linear regression models previously used. A large dataset from literature combined with our own data were analysed using canonical...

Statistical and Biophysical Models for Predicting Total and Outdoor Water Use in Los Angeles

Science.gov (United States)

Mini, C.; Hogue, T. S.; Pincetl, S.

2012-04-01

Modeling water demand is a complex exercise in the choice of the functional form, techniques and variables to integrate in the model. The goal of the current research is to identify the determinants that control total and outdoor residential water use in semi-arid cities and to utilize that information in the development of statistical and biophysical models that can forecast spatial and temporal urban water use. The City of Los Angeles is unique in its highly diverse socio-demographic, economic and cultural characteristics across neighborhoods, which introduces significant challenges in modeling water use. Increasing climate variability also contributes to uncertainties in water use predictions in urban areas. Monthly individual water use records were acquired from the Los Angeles Department of Water and Power (LADWP) for the 2000 to 2010 period. Study predictors of residential water use include socio-demographic, economic, climate and landscaping variables at the zip code level collected from US Census database. Climate variables are estimated from ground-based observations and calculated at the centroid of each zip code by inverse-distance weighting method. Remotely-sensed products of vegetation biomass and landscape land cover are also utilized. Two linear regression models were developed based on the panel data and variables described: a pooled-OLS regression model and a linear mixed effects model. Both models show income per capita and the percentage of landscape areas in each zip code as being statistically significant predictors. The pooled-OLS model tends to over-estimate higher water use zip codes and both models provide similar RMSE values.Outdoor water use was estimated at the census tract level as the residual between total water use and indoor use. This residual is being compared with the output from a biophysical model including tree and grass cover areas, climate variables and estimates of evapotranspiration at very high spatial resolution. A
An integrated logistic formula for prediction of complications from radiosurgery

International Nuclear Information System (INIS)

Flickinger, J.C.

1989-01-01

An integrated logistic model for predicting the probability of complications when small volumes of tissue receive an inhomogeneous radiation dose is described. This model can be used with either an exponential or linear quadratic correction for dose per fraction and time. Both the exponential and linear quadratic versions of this integrated logistic formula provide reasonable estimates of the tolerance of brain to radiosurgical dose distributions where there are small volumes of brain receiving high radiation doses and larger volumes receiving lower doses. This makes it possible to predict the probability of complications from stereotactic radiosurgery, as well as combinations of fractionated large volume irradiation with a radiosurgical boost. Complication probabilities predicted for single fraction radiosurgery with the Leksell Gamma Unit using 4, 8, 14, and 18 mm diameter collimators as well as for whole brain irradiation combined with a radiosurgical boost are presented. The exponential and linear quadratic versions of the integrated logistic formula provide useful methods of calculating the probability of complications from radiosurgical treatment
Probably not future prediction using probability and statistical inference

CERN Document Server

Dworsky, Lawrence N

2008-01-01

An engaging, entertaining, and informative introduction to probability and prediction in our everyday lives Although Probably Not deals with probability and statistics, it is not heavily mathematical and is not filled with complex derivations, proofs, and theoretical problem sets. This book unveils the world of statistics through questions such as what is known based upon the information at hand and what can be expected to happen. While learning essential concepts including "the confidence factor" and "random walks," readers will be entertained and intrigued as they move from chapter to chapter. Moreover, the author provides a foundation of basic principles to guide decision making in almost all facets of life including playing games, developing winning business strategies, and managing personal finances. Much of the book is organized around easy-to-follow examples that address common, everyday issues such as: How travel time is affected by congestion, driving speed, and traffic lights Why different gambling ...
The Prediction of Exchange Rates with the Use of Auto-Regressive Integrated Moving-Average Models

Directory of Open Access Journals (Sweden)

Daniela Spiesová

2014-10-01

Full Text Available Currency market is recently the largest world market during the existence of which there have been many theories regarding the prediction of the development of exchange rates based on macroeconomic, microeconomic, statistic and other models. The aim of this paper is to identify the adequate model for the prediction of non-stationary time series of exchange rates and then use this model to predict the trend of the development of European currencies against Euro. The uniqueness of this paper is in the fact that there are many expert studies dealing with the prediction of the currency pairs rates of the American dollar with other currency but there is only a limited number of scientific studies concerned with the long-term prediction of European currencies with the help of the integrated ARMA models even though the development of exchange rates has a crucial impact on all levels of economy and its prediction is an important indicator for individual countries, banks, companies and businessmen as well as for investors. The results of this study confirm that to predict the conditional variance and then to estimate the future values of exchange rates, it is adequate to use the ARIMA (1,1,1 model without constant, or ARIMA [(1,7,1,(1,7] model, where in the long-term, the square root of the conditional variance inclines towards stable value.
Statistical modeling of an integrated boiler for coal fired thermal power plant.

Science.gov (United States)

Chandrasekharan, Sreepradha; Panda, Rames Chandra; Swaminathan, Bhuvaneswari Natrajan

2017-06-01

The coal fired thermal power plants plays major role in the power production in the world as they are available in abundance. Many of the existing power plants are based on the subcritical technology which can produce power with the efficiency of around 33%. But the newer plants are built on either supercritical or ultra-supercritical technology whose efficiency can be up to 50%. Main objective of the work is to enhance the efficiency of the existing subcritical power plants to compensate for the increasing demand. For achieving the objective, the statistical modeling of the boiler units such as economizer, drum and the superheater are initially carried out. The effectiveness of the developed models is tested using analysis methods like R 2 analysis and ANOVA (Analysis of Variance). The dependability of the process variable (temperature) on different manipulated variables is analyzed in the paper. Validations of the model are provided with their error analysis. Response surface methodology (RSM) supported by DOE (design of experiments) are implemented to optimize the operating parameters. Individual models along with the integrated model are used to study and design the predictive control of the coal-fired thermal power plant.
Predictive Analytics in Information Systems Research

OpenAIRE

Shmueli, Galit; Koppius, Otto

2011-01-01

textabstractThis research essay highlights the need to integrate predictive analytics into information systems research and shows several concrete ways in which this goal can be accomplished. Predictive analytics include empirical methods (statistical and other) that generate data predictions as well as methods for assessing predictive power. Predictive analytics not only assist in creating practically useful models, they also play an important role alongside explanatory modeling in theory bu...
Testing earthquake prediction algorithms: Statistically significant advance prediction of the largest earthquakes in the Circum-Pacific, 1992-1997

Science.gov (United States)

Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.

1999-01-01

Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8 and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, correspondingly, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5 + , 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events has doubled and all of them become exclusively normal or reversed faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. ?? 1999 Elsevier
An Integrated Model to Predict Corporate Failure of Listed Companies in Sri Lanka

Directory of Open Access Journals (Sweden)

Nisansala Wijekoon

2015-07-01

Full Text Available The primary objective of this study is to develop an integrated model to predict corporate failure of listed companies in Sri Lanka. The logistic regression analysis was employed to a data set of 70 matched-pairs of failed and non-failed companies listed in the Colombo Stock Exchange (CSE in Sri Lanka over the period 2002 to 2010. A total of fifteen financial ratios and eight corporate governance variables were used as predictor variables of corporate failure. Analysis of the statistical testing results indicated that model consists with both corporate governance variables and financial ratios improved the prediction accuracy to reach 88.57 per cent one year prior to failure. Furthermore, predictive accuracy of this model in all three years prior to failure is above 80 per cent. Hence model is robust in obtaining accurate results for up to three years prior to failure. It was further found that two financial ratios, working capital to total assets and cash flow from operating activities to total assets, and two corporate governance variables, outside director ratio and company audit committee are having more explanatory power to predict corporate failure. Therefore, model developed in this study can assist investors, managers, shareholders, financial institutions, auditors and regulatory agents in Sri Lanka to forecast corporate failure of listed companies.
Addressing issues associated with evaluating prediction models for survival endpoints based on the concordance statistic.

Science.gov (United States)

Wang, Ming; Long, Qi

2016-09-01

Prediction models for disease risk and prognosis play an important role in biomedical research, and evaluating their predictive accuracy in the presence of censored data is of substantial interest. The standard concordance (c) statistic has been extended to provide a summary measure of predictive accuracy for survival models. Motivated by a prostate cancer study, we address several issues associated with evaluating survival prediction models based on c-statistic with a focus on estimators using the technique of inverse probability of censoring weighting (IPCW). Compared to the existing work, we provide complete results on the asymptotic properties of the IPCW estimators under the assumption of coarsening at random (CAR), and propose a sensitivity analysis under the mechanism of noncoarsening at random (NCAR). In addition, we extend the IPCW approach as well as the sensitivity analysis to high-dimensional settings. The predictive accuracy of prediction models for cancer recurrence after prostatectomy is assessed by applying the proposed approaches. We find that the estimated predictive accuracy for the models in consideration is sensitive to NCAR assumption, and thus identify the best predictive model. Finally, we further evaluate the performance of the proposed methods in both settings of low-dimensional and high-dimensional data under CAR and NCAR through simulations. © 2016, The International Biometric Society.
Statistical approach to predict compressive strength of high workability slag-cement mortars

International Nuclear Information System (INIS)

Memon, N.A.; Memon, N.A.; Sumadi, S.R.

2009-01-01

This paper reports an attempt made to develop empirical expressions to estimate/ predict the compressive strength of high workability slag-cement mortars. Experimental data of 54 mix mortars were used. The mortars were prepared with slag as cement replacement of the order of 0, 50 and 60%. The flow (workability) was maintained at 136+-3%. The numerical and statistical analysis was performed by using database computer software Microsoft Office Excel 2003. Three empirical mathematical models were developed to estimate/predict 28 days compressive strength of high workability slag cement-mortars with 0, 50 and 60% slag which predict the values accurate between 97 and 98%. Finally a generalized empirical mathematical model was proposed which can predict 28 days compressive strength of high workability mortars up to degree of accuracy 95%. (author)
Direct Breakthrough Curve Prediction From Statistics of Heterogeneous Conductivity Fields

Science.gov (United States)

Hansen, Scott K.; Haslauer, Claus P.; Cirpka, Olaf A.; Vesselinov, Velimir V.

2018-01-01

This paper presents a methodology to predict the shape of solute breakthrough curves in heterogeneous aquifers at early times and/or under high degrees of heterogeneity, both cases in which the classical macrodispersion theory may not be applicable. The methodology relies on the observation that breakthrough curves in heterogeneous media are generally well described by lognormal distributions, and mean breakthrough times can be predicted analytically. The log-variance of solute arrival is thus sufficient to completely specify the breakthrough curves, and this is calibrated as a function of aquifer heterogeneity and dimensionless distance from a source plane by means of Monte Carlo analysis and statistical regression. Using the ensemble of simulated groundwater flow and solute transport realizations employed to calibrate the predictive regression, reliability estimates for the prediction are also developed. Additional theoretical contributions include heuristics for the time until an effective macrodispersion coefficient becomes applicable, and also an expression for its magnitude that applies in highly heterogeneous systems. It is seen that the results here represent a way to derive continuous time random walk transition distributions from physical considerations rather than from empirical field calibration.
The Meta-Analysis of Clinical Judgment Project: Fifty-Six Years of Accumulated Research on Clinical Versus Statistical Prediction

Science.gov (United States)

Aegisdottir, Stefania; White, Michael J.; Spengler, Paul M.; Maugherman, Alan S.; Anderson, Linda A.; Cook, Robert S.; Nichols, Cassandra N.; Lampropoulos, Georgios K.; Walker, Blain S.; Cohen, Genna; Rush, Jeffrey D.

2006-01-01

Clinical predictions made by mental health practitioners are compared with those using statistical approaches. Sixty-seven studies were identified from a comprehensive search of 56 years of research; 92 effect sizes were derived from these studies. The overall effect of clinical versus statistical prediction showed a somewhat greater accuracy for…
Statistical model predictions for p+p and Pb+Pb collisions at LHC

NARCIS (Netherlands)

Kraus, I.; Cleymans, J.; Oeschler, H.; Redlich, K.; Wheaton, S.

2009-01-01

Particle production in p+p and central collisions at LHC is discussed in the context of the statistical thermal model. For heavy-ion collisions, predictions of various particle ratios are presented. The sensitivity of several ratios on the temperature and the baryon chemical potential is studied in
Measuring the data universe data integration using statistical data and metadata exchange

CERN Document Server

Stahl, Reinhold

2018-01-01

This richly illustrated book provides an easy-to-read introduction to the challenges of organizing and integrating modern data worlds, explaining the contribution of public statistics and the ISO standard SDMX (Statistical Data and Metadata Exchange). As such, it is a must for data experts as well those aspiring to become one. Today, exponentially growing data worlds are increasingly determining our professional and private lives. The rapid increase in the amount of globally available data, fueled by search engines and social networks but also by new technical possibilities such as Big Data, offers great opportunities. But whatever the undertaking – driving the block chain revolution or making smart phones even smarter – success will be determined by how well it is possible to integrate, i.e. to collect, link and evaluate, the required data. One crucial factor in this is the introduction of a cross-domain order system in combination with a standardization of the data structure. Using everyday examples, th...
A reductionist perspective on quantum statistical mechanics: Coarse-graining of path integrals.

Science.gov (United States)

Sinitskiy, Anton V; Voth, Gregory A

2015-09-07

Computational modeling of the condensed phase based on classical statistical mechanics has been rapidly developing over the last few decades and has yielded important information on various systems containing up to millions of atoms. However, if a system of interest contains important quantum effects, well-developed classical techniques cannot be used. One way of treating finite temperature quantum systems at equilibrium has been based on Feynman's imaginary time path integral approach and the ensuing quantum-classical isomorphism. This isomorphism is exact only in the limit of infinitely many classical quasiparticles representing each physical quantum particle. In this work, we present a reductionist perspective on this problem based on the emerging methodology of coarse-graining. This perspective allows for the representations of one quantum particle with only two classical-like quasiparticles and their conjugate momenta. One of these coupled quasiparticles is the centroid particle of the quantum path integral quasiparticle distribution. Only this quasiparticle feels the potential energy function. The other quasiparticle directly provides the observable averages of quantum mechanical operators. The theory offers a simplified perspective on quantum statistical mechanics, revealing its most reductionist connection to classical statistical physics. By doing so, it can facilitate a simpler representation of certain quantum effects in complex molecular environments.
A reductionist perspective on quantum statistical mechanics: Coarse-graining of path integrals

International Nuclear Information System (INIS)

Sinitskiy, Anton V.; Voth, Gregory A.

2015-01-01

Computational modeling of the condensed phase based on classical statistical mechanics has been rapidly developing over the last few decades and has yielded important information on various systems containing up to millions of atoms. However, if a system of interest contains important quantum effects, well-developed classical techniques cannot be used. One way of treating finite temperature quantum systems at equilibrium has been based on Feynman’s imaginary time path integral approach and the ensuing quantum-classical isomorphism. This isomorphism is exact only in the limit of infinitely many classical quasiparticles representing each physical quantum particle. In this work, we present a reductionist perspective on this problem based on the emerging methodology of coarse-graining. This perspective allows for the representations of one quantum particle with only two classical-like quasiparticles and their conjugate momenta. One of these coupled quasiparticles is the centroid particle of the quantum path integral quasiparticle distribution. Only this quasiparticle feels the potential energy function. The other quasiparticle directly provides the observable averages of quantum mechanical operators. The theory offers a simplified perspective on quantum statistical mechanics, revealing its most reductionist connection to classical statistical physics. By doing so, it can facilitate a simpler representation of certain quantum effects in complex molecular environments
Program integration of predictive maintenance with reliability centered maintenance

International Nuclear Information System (INIS)

Strong, D.K. Jr; Wray, D.M.

1990-01-01

This paper addresses improving the safety and reliability of power plants in a cost-effective manner by integrating the recently developed reliability centered maintenance techniques with the traditional predictive maintenance techniques of nuclear power plants. The topics of the paper include a description of reliability centered maintenance (RCM), enhancing RCM with predictive maintenance, predictive maintenance programs, condition monitoring techniques, performance test techniques, the mid-Atlantic Reliability Centered Maintenance Users Group, test guides and the benefits of shared guide development
Statistical Basis for Predicting Technological Progress

Science.gov (United States)

Nagy, Béla; Farmer, J. Doyne; Bui, Quan M.; Trancik, Jessika E.

2013-01-01

Forecasting technological progress is of great interest to engineers, policy makers, and private investors. Several models have been proposed for predicting technological improvement, but how well do these models perform? An early hypothesis made by Theodore Wright in 1936 is that cost decreases as a power law of cumulative production. An alternative hypothesis is Moore's law, which can be generalized to say that technologies improve exponentially with time. Other alternatives were proposed by Goddard, Sinclair et al., and Nordhaus. These hypotheses have not previously been rigorously tested. Using a new database on the cost and production of 62 different technologies, which is the most expansive of its kind, we test the ability of six different postulated laws to predict future costs. Our approach involves hindcasting and developing a statistical model to rank the performance of the postulated laws. Wright's law produces the best forecasts, but Moore's law is not far behind. We discover a previously unobserved regularity that production tends to increase exponentially. A combination of an exponential decrease in cost and an exponential increase in production would make Moore's law and Wright's law indistinguishable, as originally pointed out by Sahal. We show for the first time that these regularities are observed in data to such a degree that the performance of these two laws is nearly the same. Our results show that technological progress is forecastable, with the square root of the logarithmic error growing linearly with the forecasting horizon at a typical rate of 2.5% per year. These results have implications for theories of technological change, and assessments of candidate technologies and policies for climate change mitigation. PMID:23468837
Statistical basis for predicting technological progress.

Directory of Open Access Journals (Sweden)

Béla Nagy

Full Text Available Forecasting technological progress is of great interest to engineers, policy makers, and private investors. Several models have been proposed for predicting technological improvement, but how well do these models perform? An early hypothesis made by Theodore Wright in 1936 is that cost decreases as a power law of cumulative production. An alternative hypothesis is Moore's law, which can be generalized to say that technologies improve exponentially with time. Other alternatives were proposed by Goddard, Sinclair et al., and Nordhaus. These hypotheses have not previously been rigorously tested. Using a new database on the cost and production of 62 different technologies, which is the most expansive of its kind, we test the ability of six different postulated laws to predict future costs. Our approach involves hindcasting and developing a statistical model to rank the performance of the postulated laws. Wright's law produces the best forecasts, but Moore's law is not far behind. We discover a previously unobserved regularity that production tends to increase exponentially. A combination of an exponential decrease in cost and an exponential increase in production would make Moore's law and Wright's law indistinguishable, as originally pointed out by Sahal. We show for the first time that these regularities are observed in data to such a degree that the performance of these two laws is nearly the same. Our results show that technological progress is forecastable, with the square root of the logarithmic error growing linearly with the forecasting horizon at a typical rate of 2.5% per year. These results have implications for theories of technological change, and assessments of candidate technologies and policies for climate change mitigation.
MirZ: an integrated microRNA expression atlas and target prediction resource.

Science.gov (United States)

Hausser, Jean; Berninger, Philipp; Rodak, Christoph; Jantscher, Yvonne; Wirth, Stefan; Zavolan, Mihaela

2009-07-01

MicroRNAs (miRNAs) are short RNAs that act as guides for the degradation and translational repression of protein-coding mRNAs. A large body of work showed that miRNAs are involved in the regulation of a broad range of biological functions, from development to cardiac and immune system function, to metabolism, to cancer. For most of the over 500 miRNAs that are encoded in the human genome the functions still remain to be uncovered. Identifying miRNAs whose expression changes between cell types or between normal and pathological conditions is an important step towards characterizing their function as is the prediction of mRNAs that could be targeted by these miRNAs. To provide the community the possibility of exploring interactively miRNA expression patterns and the candidate targets of miRNAs in an integrated environment, we developed the MirZ web server, which is accessible at www.mirz.unibas.ch. The server provides experimental and computational biologists with statistical analysis and data mining tools operating on up-to-date databases of sequencing-based miRNA expression profiles and of predicted miRNA target sites in species ranging from Caenorhabditis elegans to Homo sapiens.

Comparison of classical statistical methods and artificial neural network in traffic noise prediction

International Nuclear Information System (INIS)

Nedic, Vladimir; Despotovic, Danijela; Cvetanovic, Slobodan; Despotovic, Milan; Babic, Sasa

2014-01-01

Traffic is the main source of noise in urban environments and significantly affects human mental and physical health and labor productivity. Therefore it is very important to model the noise produced by various vehicles. Techniques for traffic noise prediction are mainly based on regression analysis, which generally is not good enough to describe the trends of noise. In this paper the application of artificial neural networks (ANNs) for the prediction of traffic noise is presented. As input variables of the neural network, the proposed structure of the traffic flow and the average speed of the traffic flow are chosen. The output variable of the network is the equivalent noise level in the given time period L eq . Based on these parameters, the network is modeled, trained and tested through a comparative analysis of the calculated values and measured levels of traffic noise using the originally developed user friendly software package. It is shown that the artificial neural networks can be a useful tool for the prediction of noise with sufficient accuracy. In addition, the measured values were also used to calculate equivalent noise level by means of classical methods, and comparative analysis is given. The results clearly show that ANN approach is superior in traffic noise level prediction to any other statistical method. - Highlights: • We proposed an ANN model for prediction of traffic noise. • We developed originally designed user friendly software package. • The results are compared with classical statistical methods. • The results are much better predictive capabilities of ANN model
Comparison of classical statistical methods and artificial neural network in traffic noise prediction

Energy Technology Data Exchange (ETDEWEB)

Nedic, Vladimir, E-mail: vnedic@kg.ac.rs [Faculty of Philology and Arts, University of Kragujevac, Jovana Cvijića bb, 34000 Kragujevac (Serbia); Despotovic, Danijela, E-mail: ddespotovic@kg.ac.rs [Faculty of Economics, University of Kragujevac, Djure Pucara Starog 3, 34000 Kragujevac (Serbia); Cvetanovic, Slobodan, E-mail: slobodan.cvetanovic@eknfak.ni.ac.rs [Faculty of Economics, University of Niš, Trg kralja Aleksandra Ujedinitelja, 18000 Niš (Serbia); Despotovic, Milan, E-mail: mdespotovic@kg.ac.rs [Faculty of Engineering, University of Kragujevac, Sestre Janjic 6, 34000 Kragujevac (Serbia); Babic, Sasa, E-mail: babicsf@yahoo.com [College of Applied Mechanical Engineering, Trstenik (Serbia)

2014-11-15

Traffic is the main source of noise in urban environments and significantly affects human mental and physical health and labor productivity. Therefore it is very important to model the noise produced by various vehicles. Techniques for traffic noise prediction are mainly based on regression analysis, which generally is not good enough to describe the trends of noise. In this paper the application of artificial neural networks (ANNs) for the prediction of traffic noise is presented. As input variables of the neural network, the proposed structure of the traffic flow and the average speed of the traffic flow are chosen. The output variable of the network is the equivalent noise level in the given time period L{sub eq}. Based on these parameters, the network is modeled, trained and tested through a comparative analysis of the calculated values and measured levels of traffic noise using the originally developed user friendly software package. It is shown that the artificial neural networks can be a useful tool for the prediction of noise with sufficient accuracy. In addition, the measured values were also used to calculate equivalent noise level by means of classical methods, and comparative analysis is given. The results clearly show that ANN approach is superior in traffic noise level prediction to any other statistical method. - Highlights: • We proposed an ANN model for prediction of traffic noise. • We developed originally designed user friendly software package. • The results are compared with classical statistical methods. • The results are much better predictive capabilities of ANN model.
Prediction of transmission loss through an aircraft sidewall using statistical energy analysis

Science.gov (United States)

Ming, Ruisen; Sun, Jincai

1989-06-01

The transmission loss of randomly incident sound through an aircraft sidewall is investigated using statistical energy analysis. Formulas are also obtained for the simple calculation of sound transmission loss through single- and double-leaf panels. Both resonant and nonresonant sound transmissions can be easily calculated using the formulas. The formulas are used to predict sound transmission losses through a Y-7 propeller airplane panel. The panel measures 2.56 m x 1.38 m and has two windows. The agreement between predicted and measured values through most of the frequency ranges tested is quite good.
Predicting future protection of respirator users: Statistical approaches and practical implications.

Science.gov (United States)

Hu, Chengcheng; Harber, Philip; Su, Jing

2016-01-01

The purpose of this article is to describe a statistical approach for predicting a respirator user's fit factor in the future based upon results from initial tests. A statistical prediction model was developed based upon joint distribution of multiple fit factor measurements over time obtained from linear mixed effect models. The model accounts for within-subject correlation as well as short-term (within one day) and longer-term variability. As an example of applying this approach, model parameters were estimated from a research study in which volunteers were trained by three different modalities to use one of two types of respirators. They underwent two quantitative fit tests at the initial session and two on the same day approximately six months later. The fitted models demonstrated correlation and gave the estimated distribution of future fit test results conditional on past results for an individual worker. This approach can be applied to establishing a criterion value for passing an initial fit test to provide reasonable likelihood that a worker will be adequately protected in the future; and to optimizing the repeat fit factor test intervals individually for each user for cost-effective testing.
Statistical prediction of AVB wear growth and initiation in model F steam generator tubes using Monte Carlo method

International Nuclear Information System (INIS)

Lee, Jae Bong; Park, Jae Hak; Kim, Hong Deok; Chung, Han Sub; Kim, Tae Ryong

2005-01-01

The growth of AVB wear in Model F steam generator tubes is predicted using the Monte Carlo Method and statistical approaches. The statistical parameters that represent the characteristics of wear growth and wear initiation are derived from In-Service Inspection (ISI) Non-Destructive Evaluation (NDE) data. Based on the statistical approaches, wear growth model are proposed and applied to predict wear distribution at the End Of Cycle (EOC). Probabilistic distributions of the number of wear flaws and maximum wear depth at EOC are obtained from the analysis. Comparing the predicted EOC wear flaw data with the known EOC data the usefulness of the proposed method is examined and satisfactory results are obtained
Statistical prediction of AVB wear growth and initiation in model F steam generator tubes using Monte Carlo method

Energy Technology Data Exchange (ETDEWEB)

Lee, Jae Bong; Park, Jae Hak [Chungbuk National Univ., Cheongju (Korea, Republic of); Kim, Hong Deok; Chung, Han Sub; Kim, Tae Ryong [Korea Electtric Power Research Institute, Daejeon (Korea, Republic of)

2005-07-01

The growth of AVB wear in Model F steam generator tubes is predicted using the Monte Carlo Method and statistical approaches. The statistical parameters that represent the characteristics of wear growth and wear initiation are derived from In-Service Inspection (ISI) Non-Destructive Evaluation (NDE) data. Based on the statistical approaches, wear growth model are proposed and applied to predict wear distribution at the End Of Cycle (EOC). Probabilistic distributions of the number of wear flaws and maximum wear depth at EOC are obtained from the analysis. Comparing the predicted EOC wear flaw data with the known EOC data the usefulness of the proposed method is examined and satisfactory results are obtained.
Statistical Indicators of Ethno-Cultural Community Integration in Canadian Society = Indicateurs statistiques de l'integration des communautes ethnoculturelles dans la societe canadienne.

Science.gov (United States)

de Vries, John

This paper addresses the issue of measuring the integration of various ethnocultural communities into Canadian society by means of statistical or social indicators. The overall philosophy of the study is based on the following principles: (1) indicators should have a clear meaning with respect to the underlying concept of integration; (2)…
Thermodynamics and statistical mechanics an integrated approach

CERN Document Server

Shell, M Scott

2015-01-01

Learn classical thermodynamics alongside statistical mechanics with this fresh approach to the subjects. Molecular and macroscopic principles are explained in an integrated, side-by-side manner to give students a deep, intuitive understanding of thermodynamics and equip them to tackle future research topics that focus on the nanoscale. Entropy is introduced from the get-go, providing a clear explanation of how the classical laws connect to the molecular principles, and closing the gap between the atomic world and thermodynamics. Notation is streamlined throughout, with a focus on general concepts and simple models, for building basic physical intuition and gaining confidence in problem analysis and model development. Well over 400 guided end-of-chapter problems are included, addressing conceptual, fundamental, and applied skill sets. Numerous worked examples are also provided together with handy shaded boxes to emphasize key concepts, making this the complete teaching package for students in chemical engineer...
Long-time predictions in nonlinear dynamics

Science.gov (United States)

Szebehely, V.

1980-01-01

It is known that nonintegrable dynamical systems do not allow precise predictions concerning their behavior for arbitrary long times. The available series solutions are not uniformly convergent according to Poincare's theorem and numerical integrations lose their meaningfulness after the elapse of arbitrary long times. Two approaches are the use of existing global integrals and statistical methods. This paper presents a generalized method along the first approach. As examples long-time predictions in the classical gravitational satellite and planetary problems are treated.
Predictive Solar-Integrated Commercial Building Load Control

Energy Technology Data Exchange (ETDEWEB)

Glasgow, Nathan [EdgePower Inc., Aspen, CO (United States)

2017-01-31

This report is the final technical report for the Department of Energy SunShot award number EE0007180 to EdgePower Inc., for the project entitled “Predictive Solar-Integrated Commercial Building Load Control.” The goal of this project was to successfully prove that the integration of solar forecasting and building load control can reduce demand charge costs for commercial building owners with solar PV. This proof of concept Tier 0 project demonstrated its value through a pilot project at a commercial building. This final report contains a summary of the work completed through he duration of the project. Clean Power Research was a sub-recipient on the award.
Online Sentence Comprehension in PPA: Verb-Based Integration and Prediction

Directory of Open Access Journals (Sweden)

Jennifer E Mack

2015-05-01

Full Text Available Introduction. Impaired language comprehension is frequently observed in primary progressive aphasia (PPA. Word comprehension deficits are characteristic of the semantic variant (PPA-S whereas sentence comprehension deficits are more prevalent in the agrammatic (PPA-G and logopenic (PPA-L variants (Amici et al., 2007; Gorno-Tempini et al., 2011; Thompson et al., 2013. Word and sentence comprehension deficits have also been shown to have distinct neural substrates in PPA (Mesulam, Thompson, Weintraub, & Rogalski, in press. However, little is known about the relationship between word and sentence comprehension processes in PPA, specifically how words are accessed, combined, and used to predict upcoming elements within a sentence. A previous study demonstrated that listeners with stroke-induced agrammatic aphasia rapidly access verb meanings and use them to semantically integrate verb-arguments; however, they show deficits in using verb meanings predictively (Mack, Ji, & Thompson, 2013. The present study tested whether listeners with PPA are able to access verb meanings and to use this information to integrate and predict verb-arguments. Methods. Fifteen adults with PPA (8 with PPA-G, 3 with PPA-L, and 4 with PPA-S and ten age-matched controls participated in two eyetracking experiments. In both experiments, participants heard sentences with restrictive verbs that were semantically compatible with only one object in a four-picture visual array (e.g., eat when the array included a cake and three non-edible objects and unrestrictive verbs (e.g., move that were compatible with all four objects. The verb-based integration experiment tested access to verb meaning and its effects on integration of the direct object (e.g., Susan will eat/move the cake; the verb-based prediction experiment examined prediction of the direct object (e.g., Susan will eat/move the …. The dependent variable was the rate of fixations on the target picture (e.g., the cake in the
Integrated statistical learning of metabolic ion mobility spectrometry profiles for pulmonary disease identification

DEFF Research Database (Denmark)

Hauschild, A.C.; Baumbach, Jan; Baumbach, J.

2012-01-01

sophisticated statistical learning techniques for VOC-based feature selection and supervised classification into patient groups. We analyzed breath data from 84 volunteers, each of them either suffering from chronic obstructive pulmonary disease (COPD), or both COPD and bronchial carcinoma (COPD + BC), as well...... as from 35 healthy volunteers, comprising a control group (CG). We standardized and integrated several statistical learning methods to provide a broad overview of their potential for distinguishing the patient groups. We found that there is strong potential for separating MCC/IMS chromatograms of healthy...... patients from healthy controls. We conclude that these statistical learning methods have a generally high accuracy when applied to well-structured, medical MCC/IMS data....
The IntFOLD server: an integrated web resource for protein fold recognition, 3D model quality assessment, intrinsic disorder prediction, domain prediction and ligand binding site prediction.

Science.gov (United States)

Roche, Daniel B; Buenavista, Maria T; Tetchner, Stuart J; McGuffin, Liam J

2011-07-01

The IntFOLD server is a novel independent server that integrates several cutting edge methods for the prediction of structure and function from sequence. Our guiding principles behind the server development were as follows: (i) to provide a simple unified resource that makes our prediction software accessible to all and (ii) to produce integrated output for predictions that can be easily interpreted. The output for predictions is presented as a simple table that summarizes all results graphically via plots and annotated 3D models. The raw machine readable data files for each set of predictions are also provided for developers, which comply with the Critical Assessment of Methods for Protein Structure Prediction (CASP) data standards. The server comprises an integrated suite of five novel methods: nFOLD4, for tertiary structure prediction; ModFOLD 3.0, for model quality assessment; DISOclust 2.0, for disorder prediction; DomFOLD 2.0 for domain prediction; and FunFOLD 1.0, for ligand binding site prediction. Predictions from the IntFOLD server were found to be competitive in several categories in the recent CASP9 experiment. The IntFOLD server is available at the following web site: http://www.reading.ac.uk/bioinf/IntFOLD/.
Gaussian orthogonal ensemble statistics in graphene billiards with the shape of classically integrable billiards.

Science.gov (United States)

Yu, Pei; Li, Zi-Yuan; Xu, Hong-Ya; Huang, Liang; Dietz, Barbara; Grebogi, Celso; Lai, Ying-Cheng

2016-12-01

A crucial result in quantum chaos, which has been established for a long time, is that the spectral properties of classically integrable systems generically are described by Poisson statistics, whereas those of time-reversal symmetric, classically chaotic systems coincide with those of random matrices from the Gaussian orthogonal ensemble (GOE). Does this result hold for two-dimensional Dirac material systems? To address this fundamental question, we investigate the spectral properties in a representative class of graphene billiards with shapes of classically integrable circular-sector billiards. Naively one may expect to observe Poisson statistics, which is indeed true for energies close to the band edges where the quasiparticle obeys the Schrödinger equation. However, for energies near the Dirac point, where the quasiparticles behave like massless Dirac fermions, Poisson statistics is extremely rare in the sense that it emerges only under quite strict symmetry constraints on the straight boundary parts of the sector. An arbitrarily small amount of imperfection of the boundary results in GOE statistics. This implies that, for circular-sector confinements with arbitrary angle, the spectral properties will generically be GOE. These results are corroborated by extensive numerical computation. Furthermore, we provide a physical understanding for our results.
Gaussian orthogonal ensemble statistics in graphene billiards with the shape of classically integrable billiards

Science.gov (United States)

Yu, Pei; Li, Zi-Yuan; Xu, Hong-Ya; Huang, Liang; Dietz, Barbara; Grebogi, Celso; Lai, Ying-Cheng

2016-12-01

A crucial result in quantum chaos, which has been established for a long time, is that the spectral properties of classically integrable systems generically are described by Poisson statistics, whereas those of time-reversal symmetric, classically chaotic systems coincide with those of random matrices from the Gaussian orthogonal ensemble (GOE). Does this result hold for two-dimensional Dirac material systems? To address this fundamental question, we investigate the spectral properties in a representative class of graphene billiards with shapes of classically integrable circular-sector billiards. Naively one may expect to observe Poisson statistics, which is indeed true for energies close to the band edges where the quasiparticle obeys the Schrödinger equation. However, for energies near the Dirac point, where the quasiparticles behave like massless Dirac fermions, Poisson statistics is extremely rare in the sense that it emerges only under quite strict symmetry constraints on the straight boundary parts of the sector. An arbitrarily small amount of imperfection of the boundary results in GOE statistics. This implies that, for circular-sector confinements with arbitrary angle, the spectral properties will generically be GOE. These results are corroborated by extensive numerical computation. Furthermore, we provide a physical understanding for our results.
Statistical modeling of an integrated boiler for coal fired thermal power plant

Directory of Open Access Journals (Sweden)

Sreepradha Chandrasekharan

2017-06-01

Full Text Available The coal fired thermal power plants plays major role in the power production in the world as they are available in abundance. Many of the existing power plants are based on the subcritical technology which can produce power with the efficiency of around 33%. But the newer plants are built on either supercritical or ultra-supercritical technology whose efficiency can be up to 50%. Main objective of the work is to enhance the efficiency of the existing subcritical power plants to compensate for the increasing demand. For achieving the objective, the statistical modeling of the boiler units such as economizer, drum and the superheater are initially carried out. The effectiveness of the developed models is tested using analysis methods like R2 analysis and ANOVA (Analysis of Variance. The dependability of the process variable (temperature on different manipulated variables is analyzed in the paper. Validations of the model are provided with their error analysis. Response surface methodology (RSM supported by DOE (design of experiments are implemented to optimize the operating parameters. Individual models along with the integrated model are used to study and design the predictive control of the coal-fired thermal power plant. Keywords: Chemical engineering, Applied mathematics
Five year prediction of Sea Surface Temperature in the Tropical Atlantic: a comparison of simple statistical methods

OpenAIRE

Laepple, Thomas; Jewson, Stephen; Meagher, Jonathan; O'Shay, Adam; Penzer, Jeremy

2007-01-01

We are developing schemes that predict future hurricane numbers by first predicting future sea surface temperatures (SSTs), and then apply the observed statistical relationship between SST and hurricane numbers. As part of this overall goal, in this study we compare the historical performance of three simple statistical methods for making five-year SST forecasts. We also present SST forecasts for 2006-2010 using these methods and compare them to forecasts made from two structural time series ...
Predicting Protein Function via Semantic Integration of Multiple Networks.

Science.gov (United States)

Yu, Guoxian; Fu, Guangyuan; Wang, Jun; Zhu, Hailong

2016-01-01

Determining the biological functions of proteins is one of the key challenges in the post-genomic era. The rapidly accumulated large volumes of proteomic and genomic data drives to develop computational models for automatically predicting protein function in large scale. Recent approaches focus on integrating multiple heterogeneous data sources and they often get better results than methods that use single data source alone. In this paper, we investigate how to integrate multiple biological data sources with the biological knowledge, i.e., Gene Ontology (GO), for protein function prediction. We propose a method, called SimNet, to Semantically integrate multiple functional association Networks derived from heterogenous data sources. SimNet firstly utilizes GO annotations of proteins to capture the semantic similarity between proteins and introduces a semantic kernel based on the similarity. Next, SimNet constructs a composite network, obtained as a weighted summation of individual networks, and aligns the network with the kernel to get the weights assigned to individual networks. Then, it applies a network-based classifier on the composite network to predict protein function. Experiment results on heterogenous proteomic data sources of Yeast, Human, Mouse, and Fly show that, SimNet not only achieves better (or comparable) results than other related competitive approaches, but also takes much less time. The Matlab codes of SimNet are available at https://sites.google.com/site/guoxian85/simnet.
Predicting axillary lymph node metastasis from kinetic statistics of DCE-MRI breast images

Science.gov (United States)

Ashraf, Ahmed B.; Lin, Lilie; Gavenonis, Sara C.; Mies, Carolyn; Xanthopoulos, Eric; Kontos, Despina

2012-03-01

The presence of axillary lymph node metastases is the most important prognostic factor in breast cancer and can influence the selection of adjuvant therapy, both chemotherapy and radiotherapy. In this work we present a set of kinetic statistics derived from DCE-MRI for predicting axillary node status. Breast DCE-MRI images from 69 women with known nodal status were analyzed retrospectively under HIPAA and IRB approval. Axillary lymph nodes were positive in 12 patients while 57 patients had no axillary lymph node involvement. Kinetic curves for each pixel were computed and a pixel-wise map of time-to-peak (TTP) was obtained. Pixels were first partitioned according to the similarity of their kinetic behavior, based on TTP values. For every kinetic curve, the following pixel-wise features were computed: peak enhancement (PE), wash-in-slope (WIS), wash-out-slope (WOS). Partition-wise statistics for every feature map were calculated, resulting in a total of 21 kinetic statistic features. ANOVA analysis was done to select features that differ significantly between node positive and node negative women. Using the computed kinetic statistic features a leave-one-out SVM classifier was learned that performs with AUC=0.77 under the ROC curve, outperforming the conventional kinetic measures, including maximum peak enhancement (MPE) and signal enhancement ratio (SER), (AUCs of 0.61 and 0.57 respectively). These findings suggest that our DCE-MRI kinetic statistic features can be used to improve the prediction of axillary node status in breast cancer patients. Such features could ultimately be used as imaging biomarkers to guide personalized treatment choices for women diagnosed with breast cancer.
The integration of weighted human gene association networks based on link prediction.

Science.gov (United States)

Yang, Jian; Yang, Tinghong; Wu, Duzhi; Lin, Limei; Yang, Fan; Zhao, Jing

2017-01-31

Physical and functional interplays between genes or proteins have important biological meaning for cellular functions. Some efforts have been made to construct weighted gene association meta-networks by integrating multiple biological resources, where the weight indicates the confidence of the interaction. However, it is found that these existing human gene association networks share only quite limited overlapped interactions, suggesting their incompleteness and noise. Here we proposed a workflow to construct a weighted human gene association network using information of six existing networks, including two weighted specific PPI networks and four gene association meta-networks. We applied link prediction algorithm to predict possible missing links of the networks, cross-validation approach to refine each network and finally integrated the refined networks to get the final integrated network. The common information among the refined networks increases notably, suggesting their higher reliability. Our final integrated network owns much more links than most of the original networks, meanwhile its links still keep high functional relevance. Being used as background network in a case study of disease gene prediction, the final integrated network presents good performance, implying its reliability and application significance. Our workflow could be insightful for integrating and refining existing gene association data.

Statistical and Machine-Learning Data Mining Techniques for Better Predictive Modeling and Analysis of Big Data

CERN Document Server

Ratner, Bruce

2011-01-01

The second edition of a bestseller, Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data is still the only book, to date, to distinguish between statistical data mining and machine-learning data mining. The first edition, titled Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data, contained 17 chapters of innovative and practical statistical data mining techniques. In this second edition, renamed to reflect the increased coverage of machine-learning data mining techniques, the author has
RADSS: an integration of GIS, spatial statistics, and network service for regional data mining

Science.gov (United States)

Hu, Haitang; Bao, Shuming; Lin, Hui; Zhu, Qing

2005-10-01

Regional data mining, which aims at the discovery of knowledge about spatial patterns, clusters or association between regions, has widely applications nowadays in social science, such as sociology, economics, epidemiology, crime, and so on. Many applications in the regional or other social sciences are more concerned with the spatial relationship, rather than the precise geographical location. Based on the spatial continuity rule derived from Tobler's first law of geography: observations at two sites tend to be more similar to each other if the sites are close together than if far apart, spatial statistics, as an important means for spatial data mining, allow the users to extract the interesting and useful information like spatial pattern, spatial structure, spatial association, spatial outlier and spatial interaction, from the vast amount of spatial data or non-spatial data. Therefore, by integrating with the spatial statistical methods, the geographical information systems will become more powerful in gaining further insights into the nature of spatial structure of regional system, and help the researchers to be more careful when selecting appropriate models. However, the lack of such tools holds back the application of spatial data analysis techniques and development of new methods and models (e.g., spatio-temporal models). Herein, we make an attempt to develop such an integrated software and apply it into the complex system analysis for the Poyang Lake Basin. This paper presents a framework for integrating GIS, spatial statistics and network service in regional data mining, as well as their implementation. After discussing the spatial statistics methods involved in regional complex system analysis, we introduce RADSS (Regional Analysis and Decision Support System), our new regional data mining tool, by integrating GIS, spatial statistics and network service. RADSS includes the functions of spatial data visualization, exploratory spatial data analysis, and
A statistical approach to the prediction of pressure tube fracture toughness

International Nuclear Information System (INIS)

Pandey, M.D.; Radford, D.D.

2008-01-01

The fracture toughness of the zirconium alloy (Zr-2.5Nb) is an important parameter in determining the flaw tolerance for operation of pressure tubes in a nuclear reactor. Fracture toughness data have been generated by performing rising pressure burst tests on sections of pressure tubes removed from operating reactors. The test data were used to generate a lower-bound fracture toughness curve, which is used in defining the operational limits of pressure tubes. The paper presents a comprehensive statistical analysis of burst test data and develops a multivariate statistical model to relate toughness with material chemistry, mechanical properties, and operational history. The proposed model can be useful in predicting fracture toughness of specific in-service pressure tubes, thereby minimizing conservatism associated with a generic lower-bound approach
Predicting energy performance of a net-zero energy building: A statistical approach

International Nuclear Information System (INIS)

Kneifel, Joshua; Webb, David

2016-01-01

Highlights: • A regression model is applied to actual energy data from a net-zero energy building. • The model is validated through a rigorous statistical analysis. • Comparisons are made between model predictions and those of a physics-based model. • The model is a viable baseline for evaluating future models from the energy data. - Abstract: Performance-based building requirements have become more prevalent because it gives freedom in building design while still maintaining or exceeding the energy performance required by prescriptive-based requirements. In order to determine if building designs reach target energy efficiency improvements, it is necessary to estimate the energy performance of a building using predictive models and different weather conditions. Physics-based whole building energy simulation modeling is the most common approach. However, these physics-based models include underlying assumptions and require significant amounts of information in order to specify the input parameter values. An alternative approach to test the performance of a building is to develop a statistically derived predictive regression model using post-occupancy data that can accurately predict energy consumption and production based on a few common weather-based factors, thus requiring less information than simulation models. A regression model based on measured data should be able to predict energy performance of a building for a given day as long as the weather conditions are similar to those during the data collection time frame. This article uses data from the National Institute of Standards and Technology (NIST) Net-Zero Energy Residential Test Facility (NZERTF) to develop and validate a regression model to predict the energy performance of the NZERTF using two weather variables aggregated to the daily level, applies the model to estimate the energy performance of hypothetical NZERTFs located in different cities in the Mixed-Humid Climate Zone, and compares these
Evaluating Computer-Based Simulations, Multimedia and Animations that Help Integrate Blended Learning with Lectures in First Year Statistics

Science.gov (United States)

Neumann, David L.; Neumann, Michelle M.; Hood, Michelle

2011-01-01

The discipline of statistics seems well suited to the integration of technology in a lecture as a means to enhance student learning and engagement. Technology can be used to simulate statistical concepts, create interactive learning exercises, and illustrate real world applications of statistics. The present study aimed to better understand the…
Predicting the Diagnostic and Statistical Manual of Mental Disorders (Fifth Edition): The Mystery of How to Constrain Unchecked Growth.

Science.gov (United States)

Blashfield, Roger K; Fuller, A Kenneth

2016-06-01

Twenty years ago, slightly after the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition was published, we predicted the characteristics of the future Diagnostic and Statistical Manual of Mental Disorders (fifth edition) (). Included in our predictions were how many diagnoses it would contain, the physical size of the Diagnostic and Statistical Manual of Mental Disorders (fifth edition), who its leader would be, how many professionals would be involved in creating it, the revenue generated, and the color of its cover. This article reports on the accuracy of our predictions. Our largest prediction error concerned financial revenue. The earnings growth of the DSM's has been remarkable. Drug company investments, insurance benefits, the financial need of the American Psychiatric Association, and the research grant process are factors that have stimulated the growth of the DSM's. Restoring order and simplicity to the classification of mental disorders will not be a trivial task.
Comparison of statistical and clinical predictions of functional outcome after ischemic stroke.

Directory of Open Access Journals (Sweden)

Douglas D Thompson

Full Text Available To determine whether the predictions of functional outcome after ischemic stroke made at the bedside using a doctor's clinical experience were more or less accurate than the predictions made by clinical prediction models (CPMs.A prospective cohort study of nine hundred and thirty one ischemic stroke patients recruited consecutively at the outpatient, inpatient and emergency departments of the Western General Hospital, Edinburgh between 2002 and 2005. Doctors made informal predictions of six month functional outcome on the Oxford Handicap Scale (OHS. Patients were followed up at six months with a validated postal questionnaire. For each patient we calculated the absolute predicted risk of death or dependence (OHS≥3 using five previously described CPMs. The specificity of a doctor's informal predictions of OHS≥3 at six months was good 0.96 (95% CI: 0.94 to 0.97 and similar to CPMs (range 0.94 to 0.96; however the sensitivity of both informal clinical predictions 0.44 (95% CI: 0.39 to 0.49 and clinical prediction models (range 0.38 to 0.45 was poor. The prediction of the level of disability after stroke was similar for informal clinical predictions (ordinal c-statistic 0.74 with 95% CI 0.72 to 0.76 and CPMs (range 0.69 to 0.75. No patient or clinician characteristic affected the accuracy of informal predictions, though predictions were more accurate in outpatients.CPMs are at least as good as informal clinical predictions in discriminating between good and bad functional outcome after ischemic stroke. The place of these models in clinical practice has yet to be determined.
Symbol recognition via statistical integration of pixel-level constraint histograms: a new descriptor.

Science.gov (United States)

Yang, Su

2005-02-01

A new descriptor for symbol recognition is proposed. 1) A histogram is constructed for every pixel to figure out the distribution of the constraints among the other pixels. 2) All the histograms are statistically integrated to form a feature vector with fixed dimension. The robustness and invariance were experimentally confirmed.
Integrating geophysics and hydrology for reducing the uncertainty of groundwater model predictions and improved prediction performance

DEFF Research Database (Denmark)

Christensen, Nikolaj Kruse; Christensen, Steen; Ferre, Ty

the integration of geophysical data in the construction of a groundwater model increases the prediction performance. We suggest that modelers should perform a hydrogeophysical “test-bench” analysis of the likely value of geophysics data for improving groundwater model prediction performance before actually...... and the resulting predictions can be compared with predictions from the ‘true’ model. By performing this analysis we expect to give the modeler insight into how the uncertainty of model-based prediction can be reduced.......A major purpose of groundwater modeling is to help decision-makers in efforts to manage the natural environment. Increasingly, it is recognized that both the predictions of interest and their associated uncertainties should be quantified to support robust decision making. In particular, decision...
Hybrid perturbation methods based on statistical time series models

Science.gov (United States)

San-Juan, Juan Félix; San-Martín, Montserrat; Pérez, Iván; López, Rosario

2016-04-01

In this work we present a new methodology for orbit propagation, the hybrid perturbation theory, based on the combination of an integration method and a prediction technique. The former, which can be a numerical, analytical or semianalytical theory, generates an initial approximation that contains some inaccuracies derived from the fact that, in order to simplify the expressions and subsequent computations, not all the involved forces are taken into account and only low-order terms are considered, not to mention the fact that mathematical models of perturbations not always reproduce physical phenomena with absolute precision. The prediction technique, which can be based on either statistical time series models or computational intelligence methods, is aimed at modelling and reproducing missing dynamics in the previously integrated approximation. This combination results in the precision improvement of conventional numerical, analytical and semianalytical theories for determining the position and velocity of any artificial satellite or space debris object. In order to validate this methodology, we present a family of three hybrid orbit propagators formed by the combination of three different orders of approximation of an analytical theory and a statistical time series model, and analyse their capability to process the effect produced by the flattening of the Earth. The three considered analytical components are the integration of the Kepler problem, a first-order and a second-order analytical theories, whereas the prediction technique is the same in the three cases, namely an additive Holt-Winters method.
Integrated Simulation for HVAC Performance Prediction: State-of-the-Art Illustration

NARCIS (Netherlands)

Hensen, J.L.M.; Clarke, J.A.

2000-01-01

This paper aims to outline the current state-of-the-art in integrated building simulation for performance prediction of heating, ventilating and air-conditioning (HVAC) systems. The ESP-r system is used as an example where integrated simulation is a core philosophy behind the development. The
A Unified Statistical Rain-Attenuation Model for Communication Link Fade Predictions and Optimal Stochastic Fade Control Design Using a Location-Dependent Rain-Statistic Database

Science.gov (United States)

Manning, Robert M.

1990-01-01

A static and dynamic rain-attenuation model is presented which describes the statistics of attenuation on an arbitrarily specified satellite link for any location for which there are long-term rainfall statistics. The model may be used in the design of the optimal stochastic control algorithms to mitigate the effects of attenuation and maintain link reliability. A rain-statistics data base is compiled, which makes it possible to apply the model to any location in the continental U.S. with a resolution of 0-5 degrees in latitude and longitude. The model predictions are compared with experimental observations, showing good agreement.
Statistical Energy Analysis (SEA) and Energy Finite Element Analysis (EFEA) Predictions for a Floor-Equipped Composite Cylinder

Science.gov (United States)

Grosveld, Ferdinand W.; Schiller, Noah H.; Cabell, Randolph H.

2011-01-01

Comet Enflow is a commercially available, high frequency vibroacoustic analysis software founded on Energy Finite Element Analysis (EFEA) and Energy Boundary Element Analysis (EBEA). Energy Finite Element Analysis (EFEA) was validated on a floor-equipped composite cylinder by comparing EFEA vibroacoustic response predictions with Statistical Energy Analysis (SEA) and experimental results. Statistical Energy Analysis (SEA) predictions were made using the commercial software program VA One 2009 from ESI Group. The frequency region of interest for this study covers the one-third octave bands with center frequencies from 100 Hz to 4000 Hz.
Predicting tube repair at French nuclear steam generators using statistical modeling

Energy Technology Data Exchange (ETDEWEB)

Mathon, C., E-mail: cedric.mathon@edf.fr [EDF Generation, Basic Design Department (SEPTEN), 69628 Villeurbanne (France); Chaudhary, A. [EDF Generation, Basic Design Department (SEPTEN), 69628 Villeurbanne (France); Gay, N.; Pitner, P. [EDF Generation, Nuclear Operation Division (UNIE), Saint-Denis (France)

2014-04-01

Electricité de France (EDF) currently operates a total of 58 Nuclear Pressurized Water Reactors (PWR) which are composed of 34 units of 900 MWe, 20 units of 1300 MWe and 4 units of 1450 MWe. This report provides an overall status of SG tube bundles on the 1300 MWe units. These units are 4 loop reactors using the AREVA 68/19 type SG model which are equipped either with Alloy 600 thermally treated (TT) tubes or Alloy 690 TT tubes. As of 2011, the effective full power years of operation (EFPY) ranges from 13 to 20 and during this time, the main degradation mechanisms observed on SG tubes are primary water stress corrosion cracking (PWSCC) and wear at anti-vibration bars (AVB) level. Statistical models have been developed for each type of degradation in order to predict the growth rate and number of affected tubes. Additional plugging is also performed to prevent other degradations such as tube wear due to foreign objects or high-cycle flow-induced fatigue. The contribution of these degradation mechanisms on the rate of tube plugging is described. The results from the statistical models are then used in predicting the long-term life of the steam generators and therefore providing a useful tool toward their effective life management and possible replacement.
About the statistical description of gas-liquid flows

Energy Technology Data Exchange (ETDEWEB)

Sanz, D.; Guido-Lavalle, G.; Carrica, P. [Centro Atomico Bariloche and Instituto Balseiro (Argentina)] [and others

1995-09-01

Elements of the probabilistic geometry are used to derive the bubble coalescence term of the statistical description of gas liquid flows. It is shown that the Boltzmann`s hypothesis, that leads to the kinetic theory of dilute gases, is not appropriate for this kind of flows. The resulting integro-differential transport equation is numerically integrated to study the flow development in slender bubble columns. The solution remarkably predicts the transition from bubbly to slug flow pattern. Moreover, a bubbly bimodal size distribution is predicted, which has already been observed experimentally.
DYNAMIC STABILITY OF THE SOLAR SYSTEM: STATISTICALLY INCONCLUSIVE RESULTS FROM ENSEMBLE INTEGRATIONS

Energy Technology Data Exchange (ETDEWEB)

Zeebe, Richard E., E-mail: zeebe@soest.hawaii.edu [School of Ocean and Earth Science and Technology, University of Hawaii at Manoa, 1000 Pope Road, MSB 629, Honolulu, HI 96822 (United States)

2015-01-01

Due to the chaotic nature of the solar system, the question of its long-term stability can only be answered in a statistical sense, for instance, based on numerical ensemble integrations of nearby orbits. Destabilization of the inner planets, leading to close encounters and/or collisions can be initiated through a large increase in Mercury's eccentricity, with a currently assumed likelihood of ∼1%. However, little is known at present about the robustness of this number. Here I report ensemble integrations of the full equations of motion of the eight planets and Pluto over 5 Gyr, including contributions from general relativity. The results show that different numerical algorithms lead to statistically different results for the evolution of Mercury's eccentricity (e{sub M}). For instance, starting at present initial conditions (e{sub M}≃0.21), Mercury's maximum eccentricity achieved over 5 Gyr is, on average, significantly higher in symplectic ensemble integrations using heliocentric rather than Jacobi coordinates and stricter error control. In contrast, starting at a possible future configuration (e{sub M}≃0.53), Mercury's maximum eccentricity achieved over the subsequent 500 Myr is, on average, significantly lower using heliocentric rather than Jacobi coordinates. For example, the probability for e{sub M} to increase beyond 0.53 over 500 Myr is >90% (Jacobi) versus only 40%-55% (heliocentric). This poses a dilemma because the physical evolution of the real system—and its probabilistic behavior—cannot depend on the coordinate system or the numerical algorithm chosen to describe it. Some tests of the numerical algorithms suggest that symplectic integrators using heliocentric coordinates underestimate the odds for destabilization of Mercury's orbit at high initial e{sub M}.
Statistical characterization of pitting corrosion process and life prediction

International Nuclear Information System (INIS)

Sheikh, A.K.; Younas, M.

1995-01-01

In order to prevent corrosion failures of machines and structures, it is desirable to know in advance when the corrosion damage will take place, and appropriate measures are needed to mitigate the damage. The corrosion predictions are needed both at development as well as operational stage of machines and structures. There are several forms of corrosion process through which varying degrees of damage can occur. Under certain conditions these corrosion processes at alone and in other set of conditions, several of these processes may occur simultaneously. For a certain type of machine elements and structures, such as gears, bearing, tubes, pipelines, containers, storage tanks etc., are particularly prone to pitting corrosion which is an insidious form of corrosion. The corrosion predictions are usually based on experimental results obtained from test coupons and/or field experiences of similar machines or parts of a structure. Considerable scatter is observed in corrosion processes. The probabilities nature and kinetics of pitting process makes in necessary to use statistical method to forecast the residual life of machine of structures. The focus of this paper is to characterization pitting as a time-dependent random process, and using this characterization the prediction of life to reach a critical level of pitting damage can be made. Using several data sets from literature on pitting corrosion, the extreme value modeling of pitting corrosion process, the evolution of the extreme value distribution in time, and their relationship to the reliability of machines and structure are explained. (author)
Comparison of four statistical and machine learning methods for crash severity prediction.

Science.gov (United States)

Iranitalab, Amirfarrokh; Khattak, Aemal

2017-11-01

Crash severity prediction models enable different agencies to predict the severity of a reported crash with unknown severity or the severity of crashes that may be expected to occur sometime in the future. This paper had three main objectives: comparison of the performance of four statistical and machine learning methods including Multinomial Logit (MNL), Nearest Neighbor Classification (NNC), Support Vector Machines (SVM) and Random Forests (RF), in predicting traffic crash severity; developing a crash costs-based approach for comparison of crash severity prediction methods; and investigating the effects of data clustering methods comprising K-means Clustering (KC) and Latent Class Clustering (LCC), on the performance of crash severity prediction models. The 2012-2015 reported crash data from Nebraska, United States was obtained and two-vehicle crashes were extracted as the analysis data. The dataset was split into training/estimation (2012-2014) and validation (2015) subsets. The four prediction methods were trained/estimated using the training/estimation dataset and the correct prediction rates for each crash severity level, overall correct prediction rate and a proposed crash costs-based accuracy measure were obtained for the validation dataset. The correct prediction rates and the proposed approach showed NNC had the best prediction performance in overall and in more severe crashes. RF and SVM had the next two sufficient performances and MNL was the weakest method. Data clustering did not affect the prediction results of SVM, but KC improved the prediction performance of MNL, NNC and RF, while LCC caused improvement in MNL and RF but weakened the performance of NNC. Overall correct prediction rate had almost the exact opposite results compared to the proposed approach, showing that neglecting the crash costs can lead to misjudgment in choosing the right prediction method. Copyright © 2017 Elsevier Ltd. All rights reserved.
Statistical Models to Assess the Health Effects and to Forecast Ground Level Ozone

Czech Academy of Sciences Publication Activity Database

Schlink, U.; Herbath, O.; Richter, M.; Dorling, S.; Nunnari, G.; Cawley, G.; Pelikán, Emil

2006-01-01

Roč. 21, č. 4 (2006), s. 547-558 ISSN 1364-8152 R&D Projects: GA AV ČR 1ET400300414 Institutional research plan: CEZ:AV0Z10300504 Keywords : statistical models * ground level ozone * health effects * logistic model * forecasting * prediction performance * neural network * generalised additive model * integrated assessment Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 1.992, year: 2006
Edaphic history over seedling characters predicts integration and plasticity of integration across geologically variable populations of Arabidopsis thaliana.

Science.gov (United States)

Cousins, Elsa A; Murren, Courtney J

2017-12-01

Studies on phenotypic plasticity and plasticity of integration have uncovered functionally linked modules of aboveground traits and seedlings of Arabidopsis thaliana , but we lack details about belowground variation in adult plants. Functional modules can be comprised of additional suites of traits that respond to environmental variation. We assessed whether shoot and root responses to nutrient environments in adult A. thaliana were predictable from seedling traits or population-specific geologic soil characteristics at the site of origin. We compared 17 natural accessions from across the native range of A. thaliana using 14-day-old seedlings grown on agar or sand and plants grown to maturity across nutrient treatments in sand. We measured aboveground size, reproduction, timing traits, root length, and root diameter. Edaphic characteristics were obtained from a global-scale dataset and related to field data. We detected significant among-population variation in root traits of seedlings and adults and in plasticity in aboveground and belowground traits of adult plants. Phenotypic integration of roots and shoots varied by population and environment. Relative integration was greater in roots than in shoots, and integration was predicted by edaphic soil history, particularly organic carbon content, whereas seedling traits did not predict later ontogenetic stages. Soil environment of origin has significant effects on phenotypic plasticity in response to nutrients, and on phenotypic integration of root modules and shoot modules. Root traits varied among populations in reproductively mature individuals, indicating potential for adaptive and integrated functional responses of root systems in annuals. © 2017 Botanical Society of America.

COMPUTING THERAPY FOR PRECISION MEDICINE: COLLABORATIVE FILTERING INTEGRATES AND PREDICTS MULTI-ENTITY INTERACTIONS.

Science.gov (United States)

Regenbogen, Sam; Wilkins, Angela D; Lichtarge, Olivier

2016-01-01

Biomedicine produces copious information it cannot fully exploit. Specifically, there is considerable need to integrate knowledge from disparate studies to discover connections across domains. Here, we used a Collaborative Filtering approach, inspired by online recommendation algorithms, in which non-negative matrix factorization (NMF) predicts interactions among chemicals, genes, and diseases only from pairwise information about their interactions. Our approach, applied to matrices derived from the Comparative Toxicogenomics Database, successfully recovered Chemical-Disease, Chemical-Gene, and Disease-Gene networks in 10-fold cross-validation experiments. Additionally, we could predict each of these interaction matrices from the other two. Integrating all three CTD interaction matrices with NMF led to good predictions of STRING, an independent, external network of protein-protein interactions. Finally, this approach could integrate the CTD and STRING interaction data to improve Chemical-Gene cross-validation performance significantly, and, in a time-stamped study, it predicted information added to CTD after a given date, using only data prior to that date. We conclude that collaborative filtering can integrate information across multiple types of biological entities, and that as a first step towards precision medicine it can compute drug repurposing hypotheses.
Integrated Wind Power Planning Tool

DEFF Research Database (Denmark)

Rosgaard, M. H.; Hahmann, Andrea N.; Nielsen, T. S.

This poster describes the status as of April 2012 of the Public Service Obligation (PSO) funded project PSO 10464 \\Integrated Wind Power Planning Tool". The project goal is to integrate a meso scale numerical weather prediction (NWP) model with a statistical tool in order to better predict short...... term power variation from off shore wind farms, as well as to conduct forecast error assessment studies in preparation for later implementation of such a feature in an existing simulation model. The addition of a forecast error estimation feature will further increase the value of this tool, as it...
Optimal day-ahead wind-thermal unit commitment considering statistical and predicted features of wind speeds

International Nuclear Information System (INIS)

Sun, Yanan; Dong, Jizhe; Ding, Lijuan

2017-01-01

Highlights: • A day–ahead wind–thermal unit commitment model is presented. • Wind speed transfer matrix is formed to depict the sequential wind features. • Spinning reserve setting considering wind power accuracy and variation is proposed. • Verified study is performed to check the correctness of the program. - Abstract: The increasing penetration of intermittent wind power affects the secure operation of power systems and leads to a requirement of robust and economic generation scheduling. This paper presents an optimal day–ahead wind–thermal generation scheduling method that considers the statistical and predicted features of wind speeds. In this method, the statistical analysis of historical wind data, which represents the local wind regime, is first implemented. Then, according to the statistical results and the predicted wind power, the spinning reserve requirements for the scheduling period are calculated. Based on the calculated spinning reserve requirements, the wind–thermal generation scheduling is finally conducted. To validate the program, a verified study is performed on a test system. Then, numerical studies to demonstrate the effectiveness of the proposed method are conducted.
Sparse Power-Law Network Model for Reliable Statistical Predictions Based on Sampled Data

Directory of Open Access Journals (Sweden)

Alexander P. Kartun-Giles

2018-04-01

Full Text Available A projective network model is a model that enables predictions to be made based on a subsample of the network data, with the predictions remaining unchanged if a larger sample is taken into consideration. An exchangeable model is a model that does not depend on the order in which nodes are sampled. Despite a large variety of non-equilibrium (growing and equilibrium (static sparse complex network models that are widely used in network science, how to reconcile sparseness (constant average degree with the desired statistical properties of projectivity and exchangeability is currently an outstanding scientific problem. Here we propose a network process with hidden variables which is projective and can generate sparse power-law networks. Despite the model not being exchangeable, it can be closely related to exchangeable uncorrelated networks as indicated by its information theory characterization and its network entropy. The use of the proposed network process as a null model is here tested on real data, indicating that the model offers a promising avenue for statistical network modelling.
Comparison of accuracy in predicting emotional instability from MMPI data: fisherian versus contingent probability statistics

Energy Technology Data Exchange (ETDEWEB)

Berghausen, P.E. Jr.; Mathews, T.W.

1987-01-01

The security plans of nuclear power plants generally require that all personnel who are to have access to protected areas or vital islands be screened for emotional stability. In virtually all instances, the screening involves the administration of one or more psychological tests, usually including the Minnesota Multiphasic Personality Inventory (MMPI). At some plants, all employees receive a structured clinical interview after they have taken the MMPI and results have been obtained. At other plants, only those employees with dirty MMPI are interviewed. This latter protocol is referred to as interviews by exception. Behaviordyne Psychological Corp. has succeeded in removing some of the uncertainty associated with interview-by-exception protocols by developing an empirically based, predictive equation. This equation permits utility companies to make informed choices regarding the risks they are assuming. A conceptual problem exists with the predictive equation, however. Like most predictive equations currently in use, it is based on Fisherian statistics, involving least-squares analyses. Consequently, Behaviordyne Psychological Corp., in conjunction with T.W. Mathews and Associates, has just developed a second predictive equation, one based on contingent probability statistics. The particular technique used in the multi-contingent analysis of probability systems (MAPS) approach. The present paper presents a comparison of predictive accuracy of the two equations: the one derived using Fisherian techniques versus the one thing contingent probability techniques.
Comparison of accuracy in predicting emotional instability from MMPI data: fisherian versus contingent probability statistics

International Nuclear Information System (INIS)

Berghausen, P.E. Jr.; Mathews, T.W.

1987-01-01

The security plans of nuclear power plants generally require that all personnel who are to have access to protected areas or vital islands be screened for emotional stability. In virtually all instances, the screening involves the administration of one or more psychological tests, usually including the Minnesota Multiphasic Personality Inventory (MMPI). At some plants, all employees receive a structured clinical interview after they have taken the MMPI and results have been obtained. At other plants, only those employees with dirty MMPI are interviewed. This latter protocol is referred to as interviews by exception. Behaviordyne Psychological Corp. has succeeded in removing some of the uncertainty associated with interview-by-exception protocols by developing an empirically based, predictive equation. This equation permits utility companies to make informed choices regarding the risks they are assuming. A conceptual problem exists with the predictive equation, however. Like most predictive equations currently in use, it is based on Fisherian statistics, involving least-squares analyses. Consequently, Behaviordyne Psychological Corp., in conjunction with T.W. Mathews and Associates, has just developed a second predictive equation, one based on contingent probability statistics. The particular technique used in the multi-contingent analysis of probability systems (MAPS) approach. The present paper presents a comparison of predictive accuracy of the two equations: the one derived using Fisherian techniques versus the one thing contingent probability techniques
Online neural monitoring of statistical learning.

Science.gov (United States)

Batterink, Laura J; Paller, Ken A

2017-05-01

The extraction of patterns in the environment plays a critical role in many types of human learning, from motor skills to language acquisition. This process is known as statistical learning. Here we propose that statistical learning has two dissociable components: (1) perceptual binding of individual stimulus units into integrated composites and (2) storing those integrated representations for later use. Statistical learning is typically assessed using post-learning tasks, such that the two components are conflated. Our goal was to characterize the online perceptual component of statistical learning. Participants were exposed to a structured stream of repeating trisyllabic nonsense words and a random syllable stream. Online learning was indexed by an EEG-based measure that quantified neural entrainment at the frequency of the repeating words relative to that of individual syllables. Statistical learning was subsequently assessed using conventional measures in an explicit rating task and a reaction-time task. In the structured stream, neural entrainment to trisyllabic words was higher than in the random stream, increased as a function of exposure to track the progression of learning, and predicted performance on the reaction time (RT) task. These results demonstrate that monitoring this critical component of learning via rhythmic EEG entrainment reveals a gradual acquisition of knowledge whereby novel stimulus sequences are transformed into familiar composites. This online perceptual transformation is a critical component of learning. Copyright © 2017 Elsevier Ltd. All rights reserved.
Using Patient Demographics and Statistical Modeling to Predict Knee Tibia Component Sizing in Total Knee Arthroplasty.

Science.gov (United States)

Ren, Anna N; Neher, Robert E; Bell, Tyler; Grimm, James

2018-06-01

Preoperative planning is important to achieve successful implantation in primary total knee arthroplasty (TKA). However, traditional TKA templating techniques are not accurate enough to predict the component size to a very close range. With the goal of developing a general predictive statistical model using patient demographic information, ordinal logistic regression was applied to build a proportional odds model to predict the tibia component size. The study retrospectively collected the data of 1992 primary Persona Knee System TKA procedures. Of them, 199 procedures were randomly selected as testing data and the rest of the data were randomly partitioned between model training data and model evaluation data with a ratio of 7:3. Different models were trained and evaluated on the training and validation data sets after data exploration. The final model had patient gender, age, weight, and height as independent variables and predicted the tibia size within 1 size difference 96% of the time on the validation data, 94% of the time on the testing data, and 92% on a prospective cadaver data set. The study results indicated the statistical model built by ordinal logistic regression can increase the accuracy of tibia sizing information for Persona Knee preoperative templating. This research shows statistical modeling may be used with radiographs to dramatically enhance the templating accuracy, efficiency, and quality. In general, this methodology can be applied to other TKA products when the data are applicable. Copyright © 2018 Elsevier Inc. All rights reserved.
A Statistical Approach for Gain Bandwidth Prediction of Phoenix-Cell Based Reflect arrays

Directory of Open Access Journals (Sweden)

Hassan Salti

2018-01-01

Full Text Available A new statistical approach to predict the gain bandwidth of Phoenix-cell based reflectarrays is proposed. It combines the effects of both main factors that limit the bandwidth of reflectarrays: spatial phase delays and intrinsic bandwidth of radiating cells. As an illustration, the proposed approach is successfully applied to two reflectarrays based on new Phoenix cells.
Prediction of rotor blade-vortex interaction using Volterra integrals

Energy Technology Data Exchange (ETDEWEB)

Wong, A.; Nitzsche, F. [Carleton Univ., Dept. of Mechanical and Aerospace Engineering, Ottawa, Ontario (Canada)]. E-mail: Fred_Nitzsche@carleton.ca; Khalid, M. [National Research Council Canada, Inst. for Aerospace Research, Ottawa, Ontario (Canada)

2004-07-01

The theory of Volterra integral equations for nonlinear system is applied to the prediction of the nonlinear aerodynamic response of an NACA 0012 airfoil experiencing blade-vortex interaction. The phenomenon is first modeled in two-dimensions using an Euler/Navier-Stoke code, and the resulting unsteady aerodynamic flow field sequences are appropriately combined to form a training dataset. The Volterra kernels are identified in the time-domain characteristics of the selected data, which is in turn used to predict the nonlinear aerodynamic response of the airfoil. The Volterra kernel based data is then compared against a standard airfoil response. The predicted lift time histories of the airfoil are shown to be in good agreement with the aerodynamic data. (author)
Prediction of rotor blade-vortex interaction using Volterra integrals

International Nuclear Information System (INIS)

Wong, A.; Nitzsche, F.; Khalid, M.

2004-01-01

The theory of Volterra integral equations for nonlinear system is applied to the prediction of the nonlinear aerodynamic response of an NACA 0012 airfoil experiencing blade-vortex interaction. The phenomenon is first modeled in two-dimensions using an Euler/Navier-Stoke code, and the resulting unsteady aerodynamic flow field sequences are appropriately combined to form a training dataset. The Volterra kernels are identified in the time-domain characteristics of the selected data, which is in turn used to predict the nonlinear aerodynamic response of the airfoil. The Volterra kernel based data is then compared against a standard airfoil response. The predicted lift time histories of the airfoil are shown to be in good agreement with the aerodynamic data. (author)
Integrated prediction based on GIS for sandstone-type uranium deposits in the northwest of Ordos Basin

International Nuclear Information System (INIS)

Han Shaoyang; Ke Dan; Hu Shuiqing; Guo Qingyin; Hou Huiqun

2005-01-01

The integrated prediction model of sandstone-type uranium deposits and its integrated evaluation methods as well as flow of the work based on GIS are studied. A software for extracting metallogenic information is also developed. A multi-source exploring information database is established in the northwest of Ordos Basin, and an integrated digital mineral deposit prospecting model of sandstone-type uranium deposits is designed based on GIS. The authors have completed metallogenic information extraction and integrated evaluation of sandstone-type uranium deposits based on GIS in the study area. Research results prove that the integrated prediction of sandstone-type uranium deposits based on GIS may further delineate prospective target areas rapidly and improve the predictive precision. (authors)
Prediction of slant path rain attenuation statistics at various locations

Science.gov (United States)

Goldhirsh, J.

1977-01-01

The paper describes a method for predicting slant path attenuation statistics at arbitrary locations for variable frequencies and path elevation angles. The method involves the use of median reflectivity factor-height profiles measured with radar as well as the use of long-term point rain rate data and assumed or measured drop size distributions. The attenuation coefficient due to cloud liquid water in the presence of rain is also considered. Absolute probability fade distributions are compared for eight cases: Maryland (15 GHz), Texas (30 GHz), Slough, England (19 and 37 GHz), Fayetteville, North Carolina (13 and 18 GHz), and Cambridge, Massachusetts (13 and 18 GHz).
Predicting co-complexed protein pairs using genomic and proteomic data integration

Directory of Open Access Journals (Sweden)

King Oliver D

2004-04-01

Full Text Available Abstract Background Identifying all protein-protein interactions in an organism is a major objective of proteomics. A related goal is to know which protein pairs are present in the same protein complex. High-throughput methods such as yeast two-hybrid (Y2H and affinity purification coupled with mass spectrometry (APMS have been used to detect interacting proteins on a genomic scale. However, both Y2H and APMS methods have substantial false-positive rates. Aside from high-throughput interaction screens, other gene- or protein-pair characteristics may also be informative of physical interaction. Therefore it is desirable to integrate multiple datasets and utilize their different predictive value for more accurate prediction of co-complexed relationship. Results Using a supervised machine learning approach – probabilistic decision tree, we integrated high-throughput protein interaction datasets and other gene- and protein-pair characteristics to predict co-complexed pairs (CCP of proteins. Our predictions proved more sensitive and specific than predictions based on Y2H or APMS methods alone or in combination. Among the top predictions not annotated as CCPs in our reference set (obtained from the MIPS complex catalogue, a significant fraction was found to physically interact according to a separate database (YPD, Yeast Proteome Database, and the remaining predictions may potentially represent unknown CCPs. Conclusions We demonstrated that the probabilistic decision tree approach can be successfully used to predict co-complexed protein (CCP pairs from other characteristics. Our top-scoring CCP predictions provide testable hypotheses for experimental validation.
Integrative approaches to the prediction of protein functions based on the feature selection

Directory of Open Access Journals (Sweden)

Lee Hyunju

2009-12-01

Full Text Available Abstract Background Protein function prediction has been one of the most important issues in functional genomics. With the current availability of various genomic data sets, many researchers have attempted to develop integration models that combine all available genomic data for protein function prediction. These efforts have resulted in the improvement of prediction quality and the extension of prediction coverage. However, it has also been observed that integrating more data sources does not always increase the prediction quality. Therefore, selecting data sources that highly contribute to the protein function prediction has become an important issue. Results We present systematic feature selection methods that assess the contribution of genome-wide data sets to predict protein functions and then investigate the relationship between genomic data sources and protein functions. In this study, we use ten different genomic data sources in Mus musculus, including: protein-domains, protein-protein interactions, gene expressions, phenotype ontology, phylogenetic profiles and disease data sources to predict protein functions that are labelled with Gene Ontology (GO terms. We then apply two approaches to feature selection: exhaustive search feature selection using a kernel based logistic regression (KLR, and a kernel based L1-norm regularized logistic regression (KL1LR. In the first approach, we exhaustively measure the contribution of each data set for each function based on its prediction quality. In the second approach, we use the estimated coefficients of features as measures of contribution of data sources. Our results show that the proposed methods improve the prediction quality compared to the full integration of all data sources and other filter-based feature selection methods. We also show that contributing data sources can differ depending on the protein function. Furthermore, we observe that highly contributing data sets can be similar among
Comparison and validation of statistical methods for predicting power outage durations in the event of hurricanes.

Science.gov (United States)

Nateghi, Roshanak; Guikema, Seth D; Quiring, Steven M

2011-12-01

This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard models (Cox PH)) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate additive regression splines). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy. © 2011 Society for Risk Analysis.
Evaluating and Predicting Patient Safety for Medical Devices With Integral Information Technology

Science.gov (United States)

2005-01-01

323 Evaluating and Predicting Patient Safety for Medical Devices with Integral Information Technology Jiajie Zhang, Vimla L. Patel, Todd R...errors are due to inappropriate designs for user interactions, rather than mechanical failures. Evaluating and predicting patient safety in medical ...the users on the identified trouble spots in the devices. We developed two methods for evaluating and predicting patient safety in medical devices
Subtask 2.4 - Integration and Synthesis in Climate Change Predictive Modeling

Energy Technology Data Exchange (ETDEWEB)

Jaroslav Solc

2009-06-01

The Energy & Environmental Research Center (EERC) completed a brief evaluation of the existing status of predictive modeling to assess options for integration of our previous paleohydrologic reconstructions and their synthesis with current global climate scenarios. Results of our research indicate that short-term data series available from modern instrumental records are not sufficient to reconstruct past hydrologic events or predict future ones. On the contrary, reconstruction of paleoclimate phenomena provided credible information on past climate cycles and confirmed their integration in the context of regional climate history is possible. Similarly to ice cores and other paleo proxies, acquired data represent an objective, credible tool for model calibration and validation of currently observed trends. It remains a subject of future research whether further refinement of our results and synthesis with regional and global climate observations could contribute to improvement and credibility of climate predictions on a regional and global scale.
Improving Allergen Prediction in Main Crops Using a Weighted Integrative Method.

Science.gov (United States)

Li, Jing; Wang, Jing; Li, Jing

2017-12-01

As a public health problem, food allergy is frequently caused by food allergy proteins, which trigger a type-I hypersensitivity reaction in the immune system of atopic individuals. The food allergens in our daily lives are mainly from crops including rice, wheat, soybean and maize. However, allergens in these main crops are far from fully uncovered. Although some bioinformatics tools or methods predicting the potential allergenicity of proteins have been proposed, each method has their limitation. In this paper, we built a novel algorithm PREAL W , which integrated PREAL, FAO/WHO criteria and motif-based method by a weighted average score, to benefit the advantages of different methods. Our results illustrated PREAL W has better performance significantly in the crops' allergen prediction. This integrative allergen prediction algorithm could be useful for critical food safety matters. The PREAL W could be accessed at http://lilab.life.sjtu.edu.cn:8080/prealw .
Statistics and predictions of population, energy and environment problems

International Nuclear Information System (INIS)

Sobajima, Makoto

1999-03-01

In the situation that world's population, especially in developing countries, is rapidly growing, humankind is facing to global problems that they cannot steadily live unless they find individual places to live, obtain foods, and peacefully get energy necessary for living for centuries. For this purpose, humankind has to think what behavior they should take in the finite environment, talk, agree and execute. Though energy has been long respected as a symbol for improving living, demanded and used, they have come to limit the use making the global environment more serious. If there is sufficient energy not loading cost to the environment. If nuclear energy regarded as such one sustain the resource for long and has market competitiveness. What situation of realization of compensating new energy is now in the case the use of nuclear energy is restricted by the society fearing radioactivity. If there are promising ones for the future. One concerning with the study of energy cannot go without knowing these. The statistical materials compiled here are thought to be useful for that purpose, and are collected mainly from ones viewing future prediction based on past practices. Studies on the prediction is so important to have future measures that these data bases are expected to be improved for better accuracy. (author)

Information trimming: Sufficient statistics, mutual information, and predictability from effective channel states

Science.gov (United States)

James, Ryan G.; Mahoney, John R.; Crutchfield, James P.

2017-06-01

One of the most basic characterizations of the relationship between two random variables, X and Y , is the value of their mutual information. Unfortunately, calculating it analytically and estimating it empirically are often stymied by the extremely large dimension of the variables. One might hope to replace such a high-dimensional variable by a smaller one that preserves its relationship with the other. It is well known that either X (or Y ) can be replaced by its minimal sufficient statistic about Y (or X ) while preserving the mutual information. While intuitively reasonable, it is not obvious or straightforward that both variables can be replaced simultaneously. We demonstrate that this is in fact possible: the information X 's minimal sufficient statistic preserves about Y is exactly the information that Y 's minimal sufficient statistic preserves about X . We call this procedure information trimming. As an important corollary, we consider the case where one variable is a stochastic process' past and the other its future. In this case, the mutual information is the channel transmission rate between the channel's effective states. That is, the past-future mutual information (the excess entropy) is the amount of information about the future that can be predicted using the past. Translating our result about minimal sufficient statistics, this is equivalent to the mutual information between the forward- and reverse-time causal states of computational mechanics. We close by discussing multivariate extensions to this use of minimal sufficient statistics.
Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

Science.gov (United States)

Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis

2012-01-01

Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606
Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks.

Science.gov (United States)

Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A; Kellis, Manolis

2012-07-01

Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein-protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level.
The use of machine learning and nonlinear statistical tools for ADME prediction.

Science.gov (United States)

Sakiyama, Yojiro

2009-02-01

Absorption, distribution, metabolism and excretion (ADME)-related failure of drug candidates is a major issue for the pharmaceutical industry today. Prediction of ADME by in silico tools has now become an inevitable paradigm to reduce cost and enhance efficiency in pharmaceutical research. Recently, machine learning as well as nonlinear statistical tools has been widely applied to predict routine ADME end points. To achieve accurate and reliable predictions, it would be a prerequisite to understand the concepts, mechanisms and limitations of these tools. Here, we have devised a small synthetic nonlinear data set to help understand the mechanism of machine learning by 2D-visualisation. We applied six new machine learning methods to four different data sets. The methods include Naive Bayes classifier, classification and regression tree, random forest, Gaussian process, support vector machine and k nearest neighbour. The results demonstrated that ensemble learning and kernel machine displayed greater accuracy of prediction than classical methods irrespective of the data set size. The importance of interaction with the engineering field is also addressed. The results described here provide insights into the mechanism of machine learning, which will enable appropriate usage in the future.
Predicting The Exit Time Of Employees In An Organization Using Statistical Model

Directory of Open Access Journals (Sweden)

Ahmed Al Kuwaiti

2015-08-01

Full Text Available Employees are considered as an asset to any organization and each organization provide a better and flexible working environment to retain its best and resourceful workforce. As such continuous efforts are being taken to avoid or extend the exitwithdrawal of employees from the organization. Human resource managers are facing a challenge to predict the exit time of employees and there is no precise model existing at present in the literature. This study has been conducted to predict the probability of exit of an employee in an organization using appropriate statistical model. Accordingly authors designed a model using Additive Weibull distribution to predict the expected exit time of employee in an organization. In addition a Shock model approach is also executed to check how well the Additive Weibull distribution suits in an organization. The analytical results showed that when the inter-arrival time increases the expected time for the employees to exit also increases. This study concluded that Additive Weibull distribution can be considered as an alternative in the place of Shock model approach to predict the exit time of employee in an organization.
The Integrated Medical Model: Statistical Forecasting of Risks to Crew Health and Mission Success

Science.gov (United States)

Fitts, M. A.; Kerstman, E.; Butler, D. J.; Walton, M. E.; Minard, C. G.; Saile, L. G.; Toy, S.; Myers, J.

2008-01-01

The Integrated Medical Model (IMM) helps capture and use organizational knowledge across the space medicine, training, operations, engineering, and research domains. The IMM uses this domain knowledge in the context of a mission and crew profile to forecast crew health and mission success risks. The IMM is most helpful in comparing the risk of two or more mission profiles, not as a tool for predicting absolute risk. The process of building the IMM adheres to Probability Risk Assessment (PRA) techniques described in NASA Procedural Requirement (NPR) 8705.5, and uses current evidence-based information to establish a defensible position for making decisions that help ensure crew health and mission success. The IMM quantitatively describes the following input parameters: 1) medical conditions and likelihood, 2) mission duration, 3) vehicle environment, 4) crew attributes (e.g. age, sex), 5) crew activities (e.g. EVA's, Lunar excursions), 6) diagnosis and treatment protocols (e.g. medical equipment, consumables pharmaceuticals), and 7) Crew Medical Officer (CMO) training effectiveness. It is worth reiterating that the IMM uses the data sets above as inputs. Many other risk management efforts stop at determining only likelihood. The IMM is unique in that it models not only likelihood, but risk mitigations, as well as subsequent clinical outcomes based on those mitigations. Once the mathematical relationships among the above parameters are established, the IMM uses a Monte Carlo simulation technique (a random sampling of the inputs as described by their statistical distribution) to determine the probable outcomes. Because the IMM is a stochastic model (i.e. the input parameters are represented by various statistical distributions depending on the data type), when the mission is simulated 10-50,000 times with a given set of medical capabilities (risk mitigations), a prediction of the most probable outcomes can be generated. For each mission, the IMM tracks which conditions
Integrating community-based verbal autopsy into civil registration and vital statistics (CRVS): system-level considerations

Science.gov (United States)

de Savigny, Don; Riley, Ian; Chandramohan, Daniel; Odhiambo, Frank; Nichols, Erin; Notzon, Sam; AbouZahr, Carla; Mitra, Raj; Cobos Muñoz, Daniel; Firth, Sonja; Maire, Nicolas; Sankoh, Osman; Bronson, Gay; Setel, Philip; Byass, Peter; Jakob, Robert; Boerma, Ties; Lopez, Alan D.

2017-01-01

ABSTRACT Background: Reliable and representative cause of death (COD) statistics are essential to inform public health policy, respond to emerging health needs, and document progress towards Sustainable Development Goals. However, less than one-third of deaths worldwide are assigned a cause. Civil registration and vital statistics (CRVS) systems in low- and lower-middle-income countries are failing to provide timely, complete and accurate vital statistics, and it will still be some time before they can provide physician-certified COD for every death. Proposals: Verbal autopsy (VA) is a method to ascertain the probable COD and, although imperfect, it is the best alternative in the absence of medical certification. There is extensive experience with VA in research settings but only a few examples of its use on a large scale. Data collection using electronic questionnaires on mobile devices and computer algorithms to analyse responses and estimate probable COD have increased the potential for VA to be routinely applied in CRVS systems. However, a number of CRVS and health system integration issues should be considered in planning, piloting and implementing a system-wide intervention such as VA. These include addressing the multiplicity of stakeholders and sub-systems involved, integration with existing CRVS work processes and information flows, linking VA results to civil registration records, information technology requirements and data quality assurance. Conclusions: Integrating VA within CRVS systems is not simply a technical undertaking. It will have profound system-wide effects that should be carefully considered when planning for an effective implementation. This paper identifies and discusses the major system-level issues and emerging practices, provides a planning checklist of system-level considerations and proposes an overview for how VA can be integrated into routine CRVS systems. PMID:28137194
Solvability of a class of systems of infinite-dimensional integral equations and their application in statistical mechanics

International Nuclear Information System (INIS)

Gonchar, N.S.

1986-01-01

This paper presents a mathematical method developed for investigating a class of systems of infinite-dimensional integral equations which have application in statistical mechanics. Necessary and sufficient conditions are obtained for the uniqueness and bifurcation of the solution of this class of systems of equations. Problems of equilibrium statistical mechanics are considered on the basis of this method
Fast and General Method To Predict the Physicochemical Properties of Druglike Molecules Using the Integral Equation Theory of Molecular Liquids.

Science.gov (United States)

Palmer, David S; Mišin, Maksim; Fedorov, Maxim V; Llinas, Antonio

2015-09-08

We report a method to predict physicochemical properties of druglike molecules using a classical statistical mechanics based solvent model combined with machine learning. The RISM-MOL-INF method introduced here provides an accurate technique to characterize solvation and desolvation processes based on solute-solvent correlation functions computed by the 1D reference interaction site model of the integral equation theory of molecular liquids. These functions can be obtained in a matter of minutes for most small organic and druglike molecules using existing software (RISM-MOL) (Sergiievskyi, V. P.; Hackbusch, W.; Fedorov, M. V. J. Comput. Chem. 2011, 32, 1982-1992). Predictions of caco-2 cell permeability and hydration free energy obtained using the RISM-MOL-INF method are shown to be more accurate than the state-of-the-art tools for benchmark data sets. Due to the importance of solvation and desolvation effects in biological systems, it is anticipated that the RISM-MOL-INF approach will find many applications in biophysical and biomedical property prediction.
Statistical prediction of nanoparticle delivery: from culture media to cell

Science.gov (United States)

Rowan Brown, M.; Hondow, Nicole; Brydson, Rik; Rees, Paul; Brown, Andrew P.; Summers, Huw D.

2015-04-01

The application of nanoparticles (NPs) within medicine is of great interest; their innate physicochemical characteristics provide the potential to enhance current technology, diagnostics and therapeutics. Recently a number of NP-based diagnostic and therapeutic agents have been developed for treatment of various diseases, where judicious surface functionalization is exploited to increase efficacy of administered therapeutic dose. However, quantification of heterogeneity associated with absolute dose of a nanotherapeutic (NP number), how this is trafficked across biological barriers has proven difficult to achieve. The main issue being the quantitative assessment of NP number at the spatial scale of the individual NP, data which is essential for the continued growth and development of the next generation of nanotherapeutics. Recent advances in sample preparation and the imaging fidelity of transmission electron microscopy (TEM) platforms provide information at the required spatial scale, where individual NPs can be individually identified. High spatial resolution however reduces the sample frequency and as a result dynamic biological features or processes become opaque. However, the combination of TEM data with appropriate probabilistic models provide a means to extract biophysical information that imaging alone cannot. Previously, we demonstrated that limited cell sampling via TEM can be statistically coupled to large population flow cytometry measurements to quantify exact NP dose. Here we extended this concept to link TEM measurements of NP agglomerates in cell culture media to that encapsulated within vesicles in human osteosarcoma cells. By construction and validation of a data-driven transfer function, we are able to investigate the dynamic properties of NP agglomeration through endocytosis. In particular, we statistically predict how NP agglomerates may traverse a biological barrier, detailing inter-agglomerate merging events providing the basis for
Statistical prediction of nanoparticle delivery: from culture media to cell

International Nuclear Information System (INIS)

Brown, M Rowan; Rees, Paul; Summers, Huw D; Hondow, Nicole; Brydson, Rik; Brown, Andrew P

2015-01-01

The application of nanoparticles (NPs) within medicine is of great interest; their innate physicochemical characteristics provide the potential to enhance current technology, diagnostics and therapeutics. Recently a number of NP-based diagnostic and therapeutic agents have been developed for treatment of various diseases, where judicious surface functionalization is exploited to increase efficacy of administered therapeutic dose. However, quantification of heterogeneity associated with absolute dose of a nanotherapeutic (NP number), how this is trafficked across biological barriers has proven difficult to achieve. The main issue being the quantitative assessment of NP number at the spatial scale of the individual NP, data which is essential for the continued growth and development of the next generation of nanotherapeutics. Recent advances in sample preparation and the imaging fidelity of transmission electron microscopy (TEM) platforms provide information at the required spatial scale, where individual NPs can be individually identified. High spatial resolution however reduces the sample frequency and as a result dynamic biological features or processes become opaque. However, the combination of TEM data with appropriate probabilistic models provide a means to extract biophysical information that imaging alone cannot. Previously, we demonstrated that limited cell sampling via TEM can be statistically coupled to large population flow cytometry measurements to quantify exact NP dose. Here we extended this concept to link TEM measurements of NP agglomerates in cell culture media to that encapsulated within vesicles in human osteosarcoma cells. By construction and validation of a data-driven transfer function, we are able to investigate the dynamic properties of NP agglomeration through endocytosis. In particular, we statistically predict how NP agglomerates may traverse a biological barrier, detailing inter-agglomerate merging events providing the basis for
Multiple-point statistical prediction on fracture networks at Yucca Mountain

International Nuclear Information System (INIS)

Liu, X.Y; Zhang, C.Y.; Liu, Q.S.; Birkholzer, J.T.

2009-01-01

In many underground nuclear waste repository systems, such as at Yucca Mountain, water flow rate and amount of water seepage into the waste emplacement drifts are mainly determined by hydrological properties of fracture network in the surrounding rock mass. Natural fracture network system is not easy to describe, especially with respect to its connectivity which is critically important for simulating the water flow field. In this paper, we introduced a new method for fracture network description and prediction, termed multi-point-statistics (MPS). The process of the MPS method is to record multiple-point statistics concerning the connectivity patterns of a fracture network from a known fracture map, and to reproduce multiple-scale training fracture patterns in a stochastic manner, implicitly and directly. It is applied to fracture data to study flow field behavior at the Yucca Mountain waste repository system. First, the MPS method is used to create a fracture network with an original fracture training image from Yucca Mountain dataset. After we adopt a harmonic and arithmetic average method to upscale the permeability to a coarse grid, THM simulation is carried out to study near-field water flow in the surrounding waste emplacement drifts. Our study shows that connectivity or patterns of fracture networks can be grasped and reconstructed by MPS methods. In theory, it will lead to better prediction of fracture system characteristics and flow behavior. Meanwhile, we can obtain variance from flow field, which gives us a way to quantify model uncertainty even in complicated coupled THM simulations. It indicates that MPS can potentially characterize and reconstruct natural fracture networks in a fractured rock mass with advantages of quantifying connectivity of fracture system and its simulation uncertainty simultaneously.
An analysis of a large dataset on immigrant integration in Spain. The Statistical Mechanics perspective on Social Action

Science.gov (United States)

Barra, Adriano; Contucci, Pierluigi; Sandell, Rickard; Vernia, Cecilia

2014-02-01

How does immigrant integration in a country change with immigration density? Guided by a statistical mechanics perspective we propose a novel approach to this problem. The analysis focuses on classical integration quantifiers such as the percentage of jobs (temporary and permanent) given to immigrants, mixed marriages, and newborns with parents of mixed origin. We find that the average values of different quantifiers may exhibit either linear or non-linear growth on immigrant density and we suggest that social action, a concept identified by Max Weber, causes the observed non-linearity. Using the statistical mechanics notion of interaction to quantitatively emulate social action, a unified mathematical model for integration is proposed and it is shown to explain both growth behaviors observed. The linear theory instead, ignoring the possibility of interaction effects would underestimate the quantifiers up to 30% when immigrant densities are low, and overestimate them as much when densities are high. The capacity to quantitatively isolate different types of integration mechanisms makes our framework a suitable tool in the quest for more efficient integration policies.
Demonstrating the Effectiveness of an Integrated and Intensive Research Methods and Statistics Course Sequence

Science.gov (United States)

Pliske, Rebecca M.; Caldwell, Tracy L.; Calin-Jageman, Robert J.; Taylor-Ritzler, Tina

2015-01-01

We developed a two-semester series of intensive (six-contact hours per week) behavioral research methods courses with an integrated statistics curriculum. Our approach includes the use of team-based learning, authentic projects, and Excel and SPSS. We assessed the effectiveness of our approach by examining our students' content area scores on the…
Comparison of selected methods of prediction of wine exports and imports

Directory of Open Access Journals (Sweden)

Radka Šperková

2008-01-01

Full Text Available For prediction of future events, there exist a number of methods usable in managerial practice. Decision on which of them should be used in a particular situation depends not only on the amount and quality of input information, but also on a subjective managerial judgement. Paper performs a practical application and consequent comparison of results of two selected methods, which are statistical method and deductive method. Both methods were used for predicting wine exports and imports in (from the Czech Republic. Prediction was done in 2003 and it related to the economic years 2003/2004, 2004/2005, 2005/2006, and 2006/2007, within which it was compared with the real values of the given indicators.Within the deductive methods there were characterized the most important factors of external environment including the most important influence according to authors’ opinion, which was the integration of the Czech Republic into the EU from 1st May, 2004. On the contrary, the statistical method of time-series analysis did not regard the integration, which is comes out of its principle. Statistics only calculates based on data from the past, and cannot incorporate the influence of irregular future conditions, just as the EU integration. Because of this the prediction based on deductive method was more optimistic and more precise in terms of its difference from real development in the given field.
Wind gust estimation by combining numerical weather prediction model and statistical post-processing

Science.gov (United States)

Patlakas, Platon; Drakaki, Eleni; Galanis, George; Spyrou, Christos; Kallos, George

2017-04-01

The continuous rise of off-shore and near-shore activities as well as the development of structures, such as wind farms and various offshore platforms, requires the employment of state-of-the-art risk assessment techniques. Such analysis is used to set the safety standards and can be characterized as a climatologically oriented approach. Nevertheless, a reliable operational support is also needed in order to minimize cost drawbacks and human danger during the construction and the functioning stage as well as during maintenance activities. One of the most important parameters for this kind of analysis is the wind speed intensity and variability. A critical measure associated with this variability is the presence and magnitude of wind gusts as estimated in the reference level of 10m. The latter can be attributed to different processes that vary among boundary-layer turbulence, convection activities, mountain waves and wake phenomena. The purpose of this work is the development of a wind gust forecasting methodology combining a Numerical Weather Prediction model and a dynamical statistical tool based on Kalman filtering. To this end, the parameterization of Wind Gust Estimate method was implemented to function within the framework of the atmospheric model SKIRON/Dust. The new modeling tool combines the atmospheric model with a statistical local adaptation methodology based on Kalman filters. This has been tested over the offshore west coastline of the United States. The main purpose is to provide a useful tool for wind analysis and prediction and applications related to offshore wind energy (power prediction, operation and maintenance). The results have been evaluated by using observational data from the NOAA's buoy network. As it was found, the predicted output shows a good behavior that is further improved after the local adjustment post-process.
Integrated Computational Solution for Predicting Skin Sensitization Potential of Molecules.

Directory of Open Access Journals (Sweden)

Konda Leela Sarath Kumar

Full Text Available Skin sensitization forms a major toxicological endpoint for dermatology and cosmetic products. Recent ban on animal testing for cosmetics demands for alternative methods. We developed an integrated computational solution (SkinSense that offers a robust solution and addresses the limitations of existing computational tools i.e. high false positive rate and/or limited coverage.The key components of our solution include: QSAR models selected from a combinatorial set, similarity information and literature-derived sub-structure patterns of known skin protein reactive groups. Its prediction performance on a challenge set of molecules showed accuracy = 75.32%, CCR = 74.36%, sensitivity = 70.00% and specificity = 78.72%, which is better than several existing tools including VEGA (accuracy = 45.00% and CCR = 54.17% with 'High' reliability scoring, DEREK (accuracy = 72.73% and CCR = 71.44% and TOPKAT (accuracy = 60.00% and CCR = 61.67%. Although, TIMES-SS showed higher predictive power (accuracy = 90.00% and CCR = 92.86%, the coverage was very low (only 10 out of 77 molecules were predicted reliably.Owing to improved prediction performance and coverage, our solution can serve as a useful expert system towards Integrated Approaches to Testing and Assessment for skin sensitization. It would be invaluable to cosmetic/ dermatology industry for pre-screening their molecules, and reducing time, cost and animal testing.
Spatial Economics Model Predicting Transport Volume

Directory of Open Access Journals (Sweden)

Lu Bo

2016-10-01

Full Text Available It is extremely important to predict the logistics requirements in a scientific and rational way. However, in recent years, the improvement effect on the prediction method is not very significant and the traditional statistical prediction method has the defects of low precision and poor interpretation of the prediction model, which cannot only guarantee the generalization ability of the prediction model theoretically, but also cannot explain the models effectively. Therefore, in combination with the theories of the spatial economics, industrial economics, and neo-classical economics, taking city of Zhuanghe as the research object, the study identifies the leading industry that can produce a large number of cargoes, and further predicts the static logistics generation of the Zhuanghe and hinterlands. By integrating various factors that can affect the regional logistics requirements, this study established a logistics requirements potential model from the aspect of spatial economic principles, and expanded the way of logistics requirements prediction from the single statistical principles to an new area of special and regional economics.
Protein thermostability prediction within homologous families using temperature-dependent statistical potentials.

Directory of Open Access Journals (Sweden)

Fabrizio Pucci

Full Text Available The ability to rationally modify targeted physical and biological features of a protein of interest holds promise in numerous academic and industrial applications and paves the way towards de novo protein design. In particular, bioprocesses that utilize the remarkable properties of enzymes would often benefit from mutants that remain active at temperatures that are either higher or lower than the physiological temperature, while maintaining the biological activity. Many in silico methods have been developed in recent years for predicting the thermodynamic stability of mutant proteins, but very few have focused on thermostability. To bridge this gap, we developed an algorithm for predicting the best descriptor of thermostability, namely the melting temperature Tm, from the protein's sequence and structure. Our method is applicable when the Tm of proteins homologous to the target protein are known. It is based on the design of several temperature-dependent statistical potentials, derived from datasets consisting of either mesostable or thermostable proteins. Linear combinations of these potentials have been shown to yield an estimation of the protein folding free energies at low and high temperatures, and the difference of these energies, a prediction of the melting temperature. This particular construction, that distinguishes between the interactions that contribute more than others to the stability at high temperatures and those that are more stabilizing at low T, gives better performances compared to the standard approach based on T-independent potentials which predict the thermal resistance from the thermodynamic stability. Our method has been tested on 45 proteins of known Tm that belong to 11 homologous families. The standard deviation between experimental and predicted Tm's is equal to 13.6°C in cross validation, and decreases to 8.3°C if the 6 worst predicted proteins are excluded. Possible extensions of our approach are discussed.
Drivers and seasonal predictability of extreme wind speeds in the ECMWF System 4 and a statistical model

Science.gov (United States)

Walz, M. A.; Donat, M.; Leckebusch, G. C.

2017-12-01

As extreme wind speeds are responsible for large socio-economic losses in Europe, a skillful prediction would be of great benefit for disaster prevention as well as for the actuarial community. Here we evaluate patterns of large-scale atmospheric variability and the seasonal predictability of extreme wind speeds (e.g. >95th percentile) in the European domain in the dynamical seasonal forecast system ECMWF System 4, and compare to the predictability based on a statistical prediction model. The dominant patterns of atmospheric variability show distinct differences between reanalysis and ECMWF System 4, with most patterns in System 4 extended downstream in comparison to ERA-Interim. The dissimilar manifestations of the patterns within the two models lead to substantially different drivers associated with the occurrence of extreme winds in the respective model. While the ECMWF System 4 is shown to provide some predictive power over Scandinavia and the eastern Atlantic, only very few grid cells in the European domain have significant correlations for extreme wind speeds in System 4 compared to ERA-Interim. In contrast, a statistical model predicts extreme wind speeds during boreal winter in better agreement with the observations. Our results suggest that System 4 does not seem to capture the potential predictability of extreme winds that exists in the real world, and therefore fails to provide reliable seasonal predictions for lead months 2-4. This is likely related to the unrealistic representation of large-scale patterns of atmospheric variability. Hence our study points to potential improvements of dynamical prediction skill by improving the simulation of large-scale atmospheric dynamics.

Time integration and statistical regulation applied to mobile objects detection in a sequence of images

International Nuclear Information System (INIS)

Letang, Jean-Michel

1993-01-01

This PhD thesis deals with the detection of moving objects in monocular image sequences. The first section presents the inherent problems of motion analysis in real applications. We propose a method robust to perturbations frequently encountered during acquisition of outdoor scenes. It appears three main directions for investigations, all of them pointing out the importance of the temporal axis, which is a specific dimension for motion analysis. In the first part, the image sequence is considered as a set of temporal signals. The temporal multi-scale decomposition enables the characterization of various dynamical behaviors of the objects being in the scene at a given instant. A second module integrates motion information. This elementary trajectography of moving objects provides a temporal prediction map, giving a confidence level of motion presence. Interactions between both sets of data are expressed within a statistical regularization. Markov random field models supply a formal framework to convey a priori knowledge of the primitives to be evaluated. A calibration method with qualitative boxes is presented to estimate model parameters. Our approach requires only simple computations and leads to a rather fast algorithm, that we evaluate in the last section over various typical sequences. (author) [fr
Hydrogen-bond coordination in organic crystal structures: statistics, predictions and applications.

Science.gov (United States)

Galek, Peter T A; Chisholm, James A; Pidcock, Elna; Wood, Peter A

2014-02-01

Statistical models to predict the number of hydrogen bonds that might be formed by any donor or acceptor atom in a crystal structure have been derived using organic structures in the Cambridge Structural Database. This hydrogen-bond coordination behaviour has been uniquely defined for more than 70 unique atom types, and has led to the development of a methodology to construct hypothetical hydrogen-bond arrangements. Comparing the constructed hydrogen-bond arrangements with known crystal structures shows promise in the assessment of structural stability, and some initial examples of industrially relevant polymorphs, co-crystals and hydrates are described.
Environmental accounting and statistics

International Nuclear Information System (INIS)

Bartelmus, P.L.P.

1992-01-01

The objective of sustainable development is to integrate environmental concerns with mainstream socio-economic policies. Integrated policies need to be supported by integrated data. Environmental accounting achieves this integration by incorporating environmental costs and benefits into conventional national accounts. Modified accounting aggregates can thus be used in defining and measuring environmentally sound and sustainable economic growth. Further development objectives need to be assessed by more comprehensive, though necessarily less integrative, systems of environmental statistics and indicators. Integrative frameworks for the different statistical systems in the fields of economy, environment and population would facilitate the provision of comparable data for the analysis of integrated development. (author). 19 refs, 2 figs, 2 tabs
Generalized Hamiltonians, functional integration and statistics of continuous fluids and plasmas

International Nuclear Information System (INIS)

Tasso, H.

1985-05-01

Generalized Hamiltonian formalism including generalized Poisson brackets and Lie-Poisson brackets is presented in Section II. Gyroviscous magnetohydrodynamics is treated as a relevant example in Euler and Clebsch variables. Section III is devoted to a short review of functional integration containing the definition and a discussion of ambiguities and methods of evaluation. The main part of the contribution is given in Section IV, where some of the content of the previous sections is applied to Gibbs statistics of continuous fluids and plasmas. In particular, exact fluctuation spectra are calculated for relevant equations in fluids and plasmas. (orig.)
Predicting Statistical Response and Extreme Events in Uncertainty Quantification through Reduced-Order Models

Science.gov (United States)

Qi, D.; Majda, A.

2017-12-01

A low-dimensional reduced-order statistical closure model is developed for quantifying the uncertainty in statistical sensitivity and intermittency in principal model directions with largest variability in high-dimensional turbulent system and turbulent transport models. Imperfect model sensitivity is improved through a recent mathematical strategy for calibrating model errors in a training phase, where information theory and linear statistical response theory are combined in a systematic fashion to achieve the optimal model performance. The idea in the reduced-order method is from a self-consistent mathematical framework for general systems with quadratic nonlinearity, where crucial high-order statistics are approximated by a systematic model calibration procedure. Model efficiency is improved through additional damping and noise corrections to replace the expensive energy-conserving nonlinear interactions. Model errors due to the imperfect nonlinear approximation are corrected by tuning the model parameters using linear response theory with an information metric in a training phase before prediction. A statistical energy principle is adopted to introduce a global scaling factor in characterizing the higher-order moments in a consistent way to improve model sensitivity. Stringent models of barotropic and baroclinic turbulence are used to display the feasibility of the reduced-order methods. Principal statistical responses in mean and variance can be captured by the reduced-order models with accuracy and efficiency. Besides, the reduced-order models are also used to capture crucial passive tracer field that is advected by the baroclinic turbulent flow. It is demonstrated that crucial principal statistical quantities like the tracer spectrum and fat-tails in the tracer probability density functions in the most important large scales can be captured efficiently with accuracy using the reduced-order tracer model in various dynamical regimes of the flow field with
Supervisory Model Predictive Control of the Heat Integrated Distillation Column

DEFF Research Database (Denmark)

Meyer, Kristian; Bisgaard, Thomas; Huusom, Jakob Kjøbsted

2017-01-01

This paper benchmarks a centralized control system based on model predictive control for the operation of the heat integrated distillation column (HIDiC) against a fully decentralized control system using the most complete column model currently available in the literature. The centralized control...... system outperforms the decentralized system, because it handles the interactions in the HIDiC process better. The integral absolute error (IAE) is reduced by a factor of 2 and a factor of 4 for control of the top and bottoms compositions, respectively....
Integrating Crop Growth Models with Whole Genome Prediction through Approximate Bayesian Computation.

Directory of Open Access Journals (Sweden)

Frank Technow

Full Text Available Genomic selection, enabled by whole genome prediction (WGP methods, is revolutionizing plant breeding. Existing WGP methods have been shown to deliver accurate predictions in the most common settings, such as prediction of across environment performance for traits with additive gene effects. However, prediction of traits with non-additive gene effects and prediction of genotype by environment interaction (G×E, continues to be challenging. Previous attempts to increase prediction accuracy for these particularly difficult tasks employed prediction methods that are purely statistical in nature. Augmenting the statistical methods with biological knowledge has been largely overlooked thus far. Crop growth models (CGMs attempt to represent the impact of functional relationships between plant physiology and the environment in the formation of yield and similar output traits of interest. Thus, they can explain the impact of G×E and certain types of non-additive gene effects on the expressed phenotype. Approximate Bayesian computation (ABC, a novel and powerful computational procedure, allows the incorporation of CGMs directly into the estimation of whole genome marker effects in WGP. Here we provide a proof of concept study for this novel approach and demonstrate its use with synthetic data sets. We show that this novel approach can be considerably more accurate than the benchmark WGP method GBLUP in predicting performance in environments represented in the estimation set as well as in previously unobserved environments for traits determined by non-additive gene effects. We conclude that this proof of concept demonstrates that using ABC for incorporating biological knowledge in the form of CGMs into WGP is a very promising and novel approach to improving prediction accuracy for some of the most challenging scenarios in plant breeding and applied genetics.
Integrating statistical machine learning in a semantic sensor web for proactive monitoring and control

CSIR Research Space (South Africa)

Adeleke, Jude Adekunle

2017-04-01

Full Text Available in an indoor air quality monitoring case study. A sliding window approach that employs the Multilayer Perceptron model to predict short term PM 2 . 5 pollution situations is integrated into the proactive monitoring and control framework. Results show...
Integration of Attributes from Non-Linear Characterization of Cardiovascular Time-Series for Prediction of Defibrillation Outcomes.

Directory of Open Access Journals (Sweden)

Sharad Shandilya

Full Text Available The timing of defibrillation is mostly at arbitrary intervals during cardio-pulmonary resuscitation (CPR, rather than during intervals when the out-of-hospital cardiac arrest (OOH-CA patient is physiologically primed for successful countershock. Interruptions to CPR may negatively impact defibrillation success. Multiple defibrillations can be associated with decreased post-resuscitation myocardial function. We hypothesize that a more complete picture of the cardiovascular system can be gained through non-linear dynamics and integration of multiple physiologic measures from biomedical signals.Retrospective analysis of 153 anonymized OOH-CA patients who received at least one defibrillation for ventricular fibrillation (VF was undertaken. A machine learning model, termed Multiple Domain Integrative (MDI model, was developed to predict defibrillation success. We explore the rationale for non-linear dynamics and statistically validate heuristics involved in feature extraction for model development. Performance of MDI is then compared to the amplitude spectrum area (AMSA technique.358 defibrillations were evaluated (218 unsuccessful and 140 successful. Non-linear properties (Lyapunov exponent > 0 of the ECG signals indicate a chaotic nature and validate the use of novel non-linear dynamic methods for feature extraction. Classification using MDI yielded ROC-AUC of 83.2% and accuracy of 78.8%, for the model built with ECG data only. Utilizing 10-fold cross-validation, at 80% specificity level, MDI (74% sensitivity outperformed AMSA (53.6% sensitivity. At 90% specificity level, MDI had 68.4% sensitivity while AMSA had 43.3% sensitivity. Integrating available end-tidal carbon dioxide features into MDI, for the available 48 defibrillations, boosted ROC-AUC to 93.8% and accuracy to 83.3% at 80% sensitivity.At clinically relevant sensitivity thresholds, the MDI provides improved performance as compared to AMSA, yielding fewer unsuccessful defibrillations
Feasibility of real-time calculation of correlation integral derived statistics applied to EGG time series

NARCIS (Netherlands)

van den Broek, PLC; van Egmond, J; van Rijn, CM; Takens, F; Coenen, AML; Booij, LHDJ

2005-01-01

Background: This study assessed the feasibility of online calculation of the correlation integral (C(r)) aiming to apply C(r)derived statistics. For real-time application it is important to reduce calculation time. It is shown how our method works for EEG time series. Methods: To achieve online
Feasibility of real-time calculation of correlation integral derived statistics applied to EEG time series

NARCIS (Netherlands)

Broek, P.L.C. van den; Egmond, J. van; Rijn, C.M. van; Takens, F.; Coenen, A.M.L.; Booij, L.H.D.J.

2005-01-01

This study assessed the feasibility of online calculation of the correlation integral (C(r)) aiming to apply C(r)-derived statistics. For real-time application it is important to reduce calculation time. It is shown how our method works for EEG time series. Methods: To achieve online calculation of
Logistic Regression Modeling of Diminishing Manufacturing Sources for Integrated Circuits

National Research Council Canada - National Science Library

Gravier, Michael

1999-01-01

.... This thesis draws on available data from the electronics integrated circuit industry to attempt to assess whether statistical modeling offers a viable method for predicting the presence of DMSMS...
Development of an integrated method for long-term water quality prediction using seasonal climate forecast

Directory of Open Access Journals (Sweden)

J. Cho

2016-10-01

Full Text Available The APEC Climate Center (APCC produces climate prediction information utilizing a multi-climate model ensemble (MME technique. In this study, four different downscaling methods, in accordance with the degree of utilizing the seasonal climate prediction information, were developed in order to improve predictability and to refine the spatial scale. These methods include: (1 the Simple Bias Correction (SBC method, which directly uses APCC's dynamic prediction data with a 3 to 6 month lead time; (2 the Moving Window Regression (MWR method, which indirectly utilizes dynamic prediction data; (3 the Climate Index Regression (CIR method, which predominantly uses observation-based climate indices; and (4 the Integrated Time Regression (ITR method, which uses predictors selected from both CIR and MWR. Then, a sampling-based temporal downscaling was conducted using the Mahalanobis distance method in order to create daily weather inputs to the Soil and Water Assessment Tool (SWAT model. Long-term predictability of water quality within the Wecheon watershed of the Nakdong River Basin was evaluated. According to the Korean Ministry of Environment's Provisions of Water Quality Prediction and Response Measures, modeling-based predictability was evaluated by using 3-month lead prediction data issued in February, May, August, and November as model input of SWAT. Finally, an integrated approach, which takes into account various climate information and downscaling methods for water quality prediction, was presented. This integrated approach can be used to prevent potential problems caused by extreme climate in advance.
More Gamma More Predictions: Gamma-Synchronization as a Key Mechanism for Efficient Integration of Classical Receptive Field Inputs with Surround Predictions

Science.gov (United States)

Vinck, Martin; Bosman, Conrado A.

2016-01-01

During visual stimulation, neurons in visual cortex often exhibit rhythmic and synchronous firing in the gamma-frequency (30–90 Hz) band. Whether this phenomenon plays a functional role during visual processing is not fully clear and remains heavily debated. In this article, we explore the function of gamma-synchronization in the context of predictive and efficient coding theories. These theories hold that sensory neurons utilize the statistical regularities in the natural world in order to improve the efficiency of the neural code, and to optimize the inference of the stimulus causes of the sensory data. In visual cortex, this relies on the integration of classical receptive field (CRF) data with predictions from the surround. Here we outline two main hypotheses about gamma-synchronization in visual cortex. First, we hypothesize that the precision of gamma-synchronization reflects the extent to which CRF data can be accurately predicted by the surround. Second, we hypothesize that different cortical columns synchronize to the extent that they accurately predict each other’s CRF visual input. We argue that these two hypotheses can account for a large number of empirical observations made on the stimulus dependencies of gamma-synchronization. Furthermore, we show that they are consistent with the known laminar dependencies of gamma-synchronization and the spatial profile of intercolumnar gamma-synchronization, as well as the dependence of gamma-synchronization on experience and development. Based on our two main hypotheses, we outline two additional hypotheses. First, we hypothesize that the precision of gamma-synchronization shows, in general, a negative dependence on RF size. In support, we review evidence showing that gamma-synchronization decreases in strength along the visual hierarchy, and tends to be more prominent in species with small V1 RFs. Second, we hypothesize that gamma-synchronized network dynamics facilitate the emergence of spiking output that
Impact of statistical learning methods on the predictive power of multivariate normal tissue complication probability models

NARCIS (Netherlands)

Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A.; van t Veld, Aart A.

2012-01-01

PURPOSE: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. METHODS AND MATERIALS: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator
Predicting Athletes’ Pre-Exercise Fluid Intake: A Theoretical Integration Approach

Directory of Open Access Journals (Sweden)

Chunxiao Li

2018-05-01

Full Text Available Pre-exercise fluid intake is an important healthy behavior for maintaining athletes’ sports performances and health. However, athletes’ behavioral adherence to fluid intake and its underlying psychological mechanisms have not been investigated. This prospective study aimed to use a health psychology model that integrates the self-determination theory and the theory of planned behavior for understanding pre-exercise fluid intake among athletes. Participants (n = 179 were athletes from college sport teams who completed surveys at two time points. Baseline (Time 1 assessment comprised psychological variables of the integrated model (i.e., autonomous and controlled motivation, attitude, subjective norm, perceived behavioral control, and intention and fluid intake (i.e., behavior was measured prospectively at one month (Time 2. Path analysis showed that the positive association between autonomous motivation and intention was mediated by subjective norm and perceived behavioral control. Controlled motivation positively predicted the subjective norm. Intentions positively predicted pre-exercise fluid intake behavior. Overall, the pattern of results was generally consistent with the integrated model, and it was suggested that athletes’ pre-exercise fluid intake behaviors were associated with the motivational and social cognitive factors of the model. The research findings could be informative for coaches and sport scientists to promote athletes’ pre-exercise fluid intake behaviors.
Causal inference and temporal predictions in audiovisual perception of speech and music.

Science.gov (United States)

Noppeney, Uta; Lee, Hwee Ling

2018-03-31

To form a coherent percept of the environment, the brain must integrate sensory signals emanating from a common source but segregate those from different sources. Temporal regularities are prominent cues for multisensory integration, particularly for speech and music perception. In line with models of predictive coding, we suggest that the brain adapts an internal model to the statistical regularities in its environment. This internal model enables cross-sensory and sensorimotor temporal predictions as a mechanism to arbitrate between integration and segregation of signals from different senses. © 2018 New York Academy of Sciences.
The statistical background to proposed ASME/MPC fracture toughness reference curves

International Nuclear Information System (INIS)

Oldfield, W.

1981-01-01

The ASME Pressure Vessel Codes define, in Sec. 11, lower bound fracture toughness curves. These curves are used to predict the lower bound fracture toughness on the basis of the RT test procedure. This test is used to remove heat to heat differences, by permitting the lower bound (reference) curve to be moved along the temperature scale according to the measured RT. Numerous objections have been raised to the procedure, and a Subcommittee (the ASME/MPC Working Group on Reference Toughness) is currently revising the codified procedures for fracture toughness prediction. The task has required a substantial amount of statistical work, since the new procedure are to have a statistical basis. Using initiation fracture toughness (J-Integral R curve procedures in the ductile domain) it was shown that when CVN energy data is properly transformed it is highly correlated with valid fracture toughness measurements. A single functional relationship can be used to predict the mean fracture toughness for a sample of steel from a set of CVN energy measurements, and the coefficients of the function tabulated. More importantly, the approximate lower statistical bounds to the initiation fracture toughness behaviour can be similarly predicted, and coefficients for selected bounds have also been tabulated. (orig.)
A Weibull statistics-based lignocellulose saccharification model and a built-in parameter accurately predict lignocellulose hydrolysis performance.

Science.gov (United States)

Wang, Mingyu; Han, Lijuan; Liu, Shasha; Zhao, Xuebing; Yang, Jinghua; Loh, Soh Kheang; Sun, Xiaomin; Zhang, Chenxi; Fang, Xu

2015-09-01

Renewable energy from lignocellulosic biomass has been deemed an alternative to depleting fossil fuels. In order to improve this technology, we aim to develop robust mathematical models for the enzymatic lignocellulose degradation process. By analyzing 96 groups of previously published and newly obtained lignocellulose saccharification results and fitting them to Weibull distribution, we discovered Weibull statistics can accurately predict lignocellulose saccharification data, regardless of the type of substrates, enzymes and saccharification conditions. A mathematical model for enzymatic lignocellulose degradation was subsequently constructed based on Weibull statistics. Further analysis of the mathematical structure of the model and experimental saccharification data showed the significance of the two parameters in this model. In particular, the λ value, defined the characteristic time, represents the overall performance of the saccharification system. This suggestion was further supported by statistical analysis of experimental saccharification data and analysis of the glucose production levels when λ and n values change. In conclusion, the constructed Weibull statistics-based model can accurately predict lignocellulose hydrolysis behavior and we can use the λ parameter to assess the overall performance of enzymatic lignocellulose degradation. Advantages and potential applications of the model and the λ value in saccharification performance assessment were discussed. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A statistical rain attenuation prediction model with application to the advanced communication technology satellite project. 1: Theoretical development and application to yearly predictions for selected cities in the United States

Science.gov (United States)

Manning, Robert M.

1986-01-01

A rain attenuation prediction model is described for use in calculating satellite communication link availability for any specific location in the world that is characterized by an extended record of rainfall. Such a formalism is necessary for the accurate assessment of such availability predictions in the case of the small user-terminal concept of the Advanced Communication Technology Satellite (ACTS) Project. The model employs the theory of extreme value statistics to generate the necessary statistical rainrate parameters from rain data in the form compiled by the National Weather Service. These location dependent rain statistics are then applied to a rain attenuation model to obtain a yearly prediction of the occurrence of attenuation on any satellite link at that location. The predictions of this model are compared to those of the Crane Two-Component Rain Model and some empirical data and found to be very good. The model is then used to calculate rain attenuation statistics at 59 locations in the United States (including Alaska and Hawaii) for the 20 GHz downlinks and 30 GHz uplinks of the proposed ACTS system. The flexibility of this modeling formalism is such that it allows a complete and unified treatment of the temporal aspects of rain attenuation that leads to the design of an optimum stochastic power control algorithm, the purpose of which is to efficiently counter such rain fades on a satellite link.

Predicting Smoking Status Using Machine Learning Algorithms and Statistical Analysis

Directory of Open Access Journals (Sweden)

Charles Frank

2018-03-01

Full Text Available Smoking has been proven to negatively affect health in a multitude of ways. As of 2009, smoking has been considered the leading cause of preventable morbidity and mortality in the United States, continuing to plague the country’s overall health. This study aims to investigate the viability and effectiveness of some machine learning algorithms for predicting the smoking status of patients based on their blood tests and vital readings results. The analysis of this study is divided into two parts: In part 1, we use One-way ANOVA analysis with SAS tool to show the statistically significant difference in blood test readings between smokers and non-smokers. The results show that the difference in INR, which measures the effectiveness of anticoagulants, was significant in favor of non-smokers which further confirms the health risks associated with smoking. In part 2, we use five machine learning algorithms: Naïve Bayes, MLP, Logistic regression classifier, J48 and Decision Table to predict the smoking status of patients. To compare the effectiveness of these algorithms we use: Precision, Recall, F-measure and Accuracy measures. The results show that the Logistic algorithm outperformed the four other algorithms with Precision, Recall, F-Measure, and Accuracy of 83%, 83.4%, 83.2%, 83.44%, respectively.
Kalman-predictive-proportional-integral-derivative (KPPID)

International Nuclear Information System (INIS)

Fluerasu, A.; Sutton, M.

2004-01-01

With third generation synchrotron X-ray sources, it is possible to acquire detailed structural information about the system under study with time resolution orders of magnitude faster than was possible a few years ago. These advances have generated many new challenges for changing and controlling the state of the system on very short time scales, in a uniform and controlled manner. For our particular X-ray experiments on crystallization or order-disorder phase transitions in metallic alloys, we need to change the sample temperature by hundreds of degrees as fast as possible while avoiding over or under shooting. To achieve this, we designed and implemented a computer-controlled temperature tracking system which combines standard Proportional-Integral-Derivative (PID) feedback, thermal modeling and finite difference thermal calculations (feedforward), and Kalman filtering of the temperature readings in order to reduce the noise. The resulting Kalman-Predictive-Proportional-Integral-Derivative (KPPID) algorithm allows us to obtain accurate control, to minimize the response time and to avoid over/under shooting, even in systems with inherently noisy temperature readings and time delays. The KPPID temperature controller was successfully implemented at the Advanced Photon Source at Argonne National Laboratories and was used to perform coherent and time-resolved X-ray diffraction experiments.
Integrating Models of Diffusion and Behavior to Predict Innovation Adoption, Maintenance, and Social Diffusion.

Science.gov (United States)

Smith, Rachel A; Kim, Youllee; Zhu, Xun; Doudou, Dimi Théodore; Sternberg, Eleanore D; Thomas, Matthew B

2018-01-01

This study documents an investigation into the adoption and diffusion of eave tubes, a novel mosquito vector control, during a large-scale scientific field trial in West Africa. The diffusion of innovations (DOI) and the integrated model of behavior (IMB) were integrated (i.e., innovation attributes with attitudes and social pressures with norms) to predict participants' (N = 329) diffusion intentions. The findings showed that positive attitudes about the innovation's attributes were a consistent positive predictor of diffusion intentions: adopting it, maintaining it, and talking with others about it. As expected by the DOI and the IMB, the social pressure created by a descriptive norm positively predicted intentions to adopt and maintain the innovation. Drawing upon sharing research, we argued that the descriptive norm may dampen future talk about the innovation, because it may no longer be seen as a novel, useful topic to discuss. As predicted, the results showed that as the descriptive norm increased, the intention to talk about the innovation decreased. These results provide broad support for integrating the DOI and the IMB to predict diffusion and for efforts to draw on other research to understand motivations for social diffusion.
Optimal Prediction of Moving Sound Source Direction in the Owl.

Directory of Open Access Journals (Sweden)

Weston Cox

2015-07-01

Full Text Available Capturing nature's statistical structure in behavioral responses is at the core of the ability to function adaptively in the environment. Bayesian statistical inference describes how sensory and prior information can be combined optimally to guide behavior. An outstanding open question of how neural coding supports Bayesian inference includes how sensory cues are optimally integrated over time. Here we address what neural response properties allow a neural system to perform Bayesian prediction, i.e., predicting where a source will be in the near future given sensory information and prior assumptions. The work here shows that the population vector decoder will perform Bayesian prediction when the receptive fields of the neurons encode the target dynamics with shifting receptive fields. We test the model using the system that underlies sound localization in barn owls. Neurons in the owl's midbrain show shifting receptive fields for moving sources that are consistent with the predictions of the model. We predict that neural populations can be specialized to represent the statistics of dynamic stimuli to allow for a vector read-out of Bayes-optimal predictions.
Analysis Code - Data Analysis in 'Leveraging Multiple Statistical Methods for Inverse Prediction in Nuclear Forensics Applications' (LMSMIPNFA) v. 1.0

Energy Technology Data Exchange (ETDEWEB)

2018-03-19

R code that performs the analysis of a data set presented in the paper ‘Leveraging Multiple Statistical Methods for Inverse Prediction in Nuclear Forensics Applications’ by Lewis, J., Zhang, A., Anderson-Cook, C. It provides functions for doing inverse predictions in this setting using several different statistical methods. The data set is a publicly available data set from a historical Plutonium production experiment.
Prediction of interior noise due to random acoustic or turbulent boundary layer excitation using statistical energy analysis

Science.gov (United States)

Grosveld, Ferdinand W.

1990-01-01

The feasibility of predicting interior noise due to random acoustic or turbulent boundary layer excitation was investigated in experiments in which a statistical energy analysis model (VAPEPS) was used to analyze measurements of the acceleration response and sound transmission of flat aluminum, lucite, and graphite/epoxy plates exposed to random acoustic or turbulent boundary layer excitation. The noise reduction of the plate, when backed by a shallow cavity and excited by a turbulent boundary layer, was predicted using a simplified theory based on the assumption of adiabatic compression of the fluid in the cavity. The predicted plate acceleration response was used as input in the noise reduction prediction. Reasonable agreement was found between the predictions and the measured noise reduction in the frequency range 315-1000 Hz.
Fast integration-based prediction bands for ordinary differential equation models.

Science.gov (United States)

Hass, Helge; Kreutz, Clemens; Timmer, Jens; Kaschek, Daniel

2016-04-15

To gain a deeper understanding of biological processes and their relevance in disease, mathematical models are built upon experimental data. Uncertainty in the data leads to uncertainties of the model's parameters and in turn to uncertainties of predictions. Mechanistic dynamic models of biochemical networks are frequently based on nonlinear differential equation systems and feature a large number of parameters, sparse observations of the model components and lack of information in the available data. Due to the curse of dimensionality, classical and sampling approaches propagating parameter uncertainties to predictions are hardly feasible and insufficient. However, for experimental design and to discriminate between competing models, prediction and confidence bands are essential. To circumvent the hurdles of the former methods, an approach to calculate a profile likelihood on arbitrary observations for a specific time point has been introduced, which provides accurate confidence and prediction intervals for nonlinear models and is computationally feasible for high-dimensional models. In this article, reliable and smooth point-wise prediction and confidence bands to assess the model's uncertainty on the whole time-course are achieved via explicit integration with elaborate correction mechanisms. The corresponding system of ordinary differential equations is derived and tested on three established models for cellular signalling. An efficiency analysis is performed to illustrate the computational benefit compared with repeated profile likelihood calculations at multiple time points. The integration framework and the examples used in this article are provided with the software package Data2Dynamics, which is based on MATLAB and freely available at http://www.data2dynamics.org helge.hass@fdm.uni-freiburg.de Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e
Path integral molecular dynamics for exact quantum statistics of multi-electronic-state systems.

Science.gov (United States)

Liu, Xinzijian; Liu, Jian

2018-03-14

An exact approach to compute physical properties for general multi-electronic-state (MES) systems in thermal equilibrium is presented. The approach is extended from our recent progress on path integral molecular dynamics (PIMD), Liu et al. [J. Chem. Phys. 145, 024103 (2016)] and Zhang et al. [J. Chem. Phys. 147, 034109 (2017)], for quantum statistical mechanics when a single potential energy surface is involved. We first define an effective potential function that is numerically favorable for MES-PIMD and then derive corresponding estimators in MES-PIMD for evaluating various physical properties. Its application to several representative one-dimensional and multi-dimensional models demonstrates that MES-PIMD in principle offers a practical tool in either of the diabatic and adiabatic representations for studying exact quantum statistics of complex/large MES systems when the Born-Oppenheimer approximation, Condon approximation, and harmonic bath approximation are broken.
Integrative EEG biomarkers predict progression to Alzheimer's disease at the MCI stage

Directory of Open Access Journals (Sweden)

Simon-Shlomo ePoil

2013-10-01

Full Text Available Alzheimer's disease (AD is a devastating disorder of increasing prevalence in modern society. Mild cognitive impairment (MCI is considered a transitional stage between normal aging and AD; however, not all subjects with MCI progress to AD. Prediction of conversion to AD at an early stage would enable an earlier, and potentially more effective, treatment of AD. Electroencephalography (EEG biomarkers would provide a non-invasive and relatively cheap screening tool to predict conversion to AD; however, traditional EEG biomarkers have not been considered accurate enough to be useful in clinical practice. Here, we aim to combine the information from multiple EEG biomarkers into a diagnostic classification index in order to improve the accuracy of predicting conversion from MCI to AD within a two-year period. We followed 86 patients initially diagnosed with MCI for two years during which 25 patients converted to AD. We show that multiple EEG biomarkers mainly related to activity in the beta-frequency range (13–30 Hz can predict conversion from MCI to AD. Importantly, by integrating six EEG biomarkers into a diagnostic index using logistic regression the prediction improved compared with the classification using the individual biomarkers, with a sensitivity of 88% and specificity of 82%, compared with a sensitivity of 64% and specificity of 62% of the best individual biomarker in this index. In order to identify this diagnostic index we developed a data mining approach implemented in the Neurophysiological Biomarker Toolbox (http://www.nbtwiki.net/. We suggest that this approach can be used to identify optimal combinations of biomarkers (integrative biomarkers also in other modalities. Potentially, these integrative biomarkers could be more sensitive to disease progression and response to therapeutic intervention.
The statistical prediction of offshore winds from land-based data for wind-energy applications

DEFF Research Database (Denmark)

Walmsley, J.L.; Barthelmie, R.J.; Burrows, W.R.

2001-01-01

Land-based meteorological measurements at two locations on the Danish coast are used to predict offshore wind speeds. Offshore wind-speed data are used only for developing the statistical prediction algorithms and for verification. As a first step, the two datasets were separated into nine...... percentile-based bins, with a minimum of 30 data records in each bin. Next, the records were randomly selected with approximately 70% of the data in each bin being used as a training set for development of the prediction algorithms, and the remaining 30% being reserved as a test set for evaluation purposes....... The binning procedure ensured that both training and test sets fairly represented the overall data distribution. To base the conclusions on firmer ground, five permutations of these training and test sets were created. Thus, all calculations were based on five cases, each one representing a different random...
Spectral properties of a two dimensional photonic crystal with quasi-integrable geometry

International Nuclear Information System (INIS)

Cruz-Bueno, J J; Méndez-Bermúdez, J A; Arriaga, J

2013-01-01

In this paper we study the statistical properties of the allowed frequencies for electromagnetic waves propagating in two-dimensional photonic crystals with quasi-integrable geometry. We compute the level spacing, group velocity, and curvature distributions (P(s), P(v), and P(c), respectively) and compare them with the corresponding random matrix theory predictions. Due to the quasi-integrability of the crystal we observe signatures of intermediate statistics in P(s) and P(c) for high refractive index contrasts
Statistical significance of theoretical predictions: A new dimension in nuclear structure theories (I)

International Nuclear Information System (INIS)

DUDEK, J; SZPAK, B; FORNAL, B; PORQUET, M-G

2011-01-01

In this and the follow-up article we briefly discuss what we believe represents one of the most serious problems in contemporary nuclear structure: the question of statistical significance of parametrizations of nuclear microscopic Hamiltonians and the implied predictive power of the underlying theories. In the present Part I, we introduce the main lines of reasoning of the so-called Inverse Problem Theory, an important sub-field in the contemporary Applied Mathematics, here illustrated on the example of the Nuclear Mean-Field Approach.
Use of structure-activity landscape index curves and curve integrals to evaluate the performance of multiple machine learning prediction models.

Science.gov (United States)

Ledonne, Norman C; Rissolo, Kevin; Bulgarelli, James; Tini, Leonard

2011-02-07

Standard approaches to address the performance of predictive models that used common statistical measurements for the entire data set provide an overview of the average performance of the models across the entire predictive space, but give little insight into applicability of the model across the prediction space. Guha and Van Drie recently proposed the use of structure-activity landscape index (SALI) curves via the SALI curve integral (SCI) as a means to map the predictive power of computational models within the predictive space. This approach evaluates model performance by assessing the accuracy of pairwise predictions, comparing compound pairs in a manner similar to that done by medicinal chemists. The SALI approach was used to evaluate the performance of continuous prediction models for MDR1-MDCK in vitro efflux potential. Efflux models were built with ADMET Predictor neural net, support vector machine, kernel partial least squares, and multiple linear regression engines, as well as SIMCA-P+ partial least squares, and random forest from Pipeline Pilot as implemented by AstraZeneca, using molecular descriptors from SimulationsPlus and AstraZeneca. The results indicate that the choice of training sets used to build the prediction models is of great importance in the resulting model quality and that the SCI values calculated for these models were very similar to their Kendall τ values, leading to our suggestion of an approach to use this SALI/SCI paradigm to evaluate predictive model performance that will allow more informed decisions regarding model utility. The use of SALI graphs and curves provides an additional level of quality assessment for predictive models.
Predicting adverse drug reaction profiles by integrating protein interaction networks with drug structures.

Science.gov (United States)

Huang, Liang-Chin; Wu, Xiaogang; Chen, Jake Y

2013-01-01

The prediction of adverse drug reactions (ADRs) has become increasingly important, due to the rising concern on serious ADRs that can cause drugs to fail to reach or stay in the market. We proposed a framework for predicting ADR profiles by integrating protein-protein interaction (PPI) networks with drug structures. We compared ADR prediction performances over 18 ADR categories through four feature groups-only drug targets, drug targets with PPI networks, drug structures, and drug targets with PPI networks plus drug structures. The results showed that the integration of PPI networks and drug structures can significantly improve the ADR prediction performance. The median AUC values for the four groups were 0.59, 0.61, 0.65, and 0.70. We used the protein features in the best two models, "Cardiac disorders" (median-AUC: 0.82) and "Psychiatric disorders" (median-AUC: 0.76), to build ADR-specific PPI networks with literature supports. For validation, we examined 30 drugs withdrawn from the U.S. market to see if our approach can predict their ADR profiles and explain why they were withdrawn. Except for three drugs having ADRs in the categories we did not predict, 25 out of 27 withdrawn drugs (92.6%) having severe ADRs were successfully predicted by our approach. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Quantification of integrated HIV DNA by repetitive-sampling Alu-HIV PCR on the basis of poisson statistics.

Science.gov (United States)

De Spiegelaere, Ward; Malatinkova, Eva; Lynch, Lindsay; Van Nieuwerburgh, Filip; Messiaen, Peter; O'Doherty, Una; Vandekerckhove, Linos

2014-06-01

Quantification of integrated proviral HIV DNA by repetitive-sampling Alu-HIV PCR is a candidate virological tool to monitor the HIV reservoir in patients. However, the experimental procedures and data analysis of the assay are complex and hinder its widespread use. Here, we provide an improved and simplified data analysis method by adopting binomial and Poisson statistics. A modified analysis method on the basis of Poisson statistics was used to analyze the binomial data of positive and negative reactions from a 42-replicate Alu-HIV PCR by use of dilutions of an integration standard and on samples of 57 HIV-infected patients. Results were compared with the quantitative output of the previously described Alu-HIV PCR method. Poisson-based quantification of the Alu-HIV PCR was linearly correlated with the standard dilution series, indicating that absolute quantification with the Poisson method is a valid alternative for data analysis of repetitive-sampling Alu-HIV PCR data. Quantitative outputs of patient samples assessed by the Poisson method correlated with the previously described Alu-HIV PCR analysis, indicating that this method is a valid alternative for quantifying integrated HIV DNA. Poisson-based analysis of the Alu-HIV PCR data enables absolute quantification without the need of a standard dilution curve. Implementation of the CI estimation permits improved qualitative analysis of the data and provides a statistical basis for the required minimal number of technical replicates. © 2014 The American Association for Clinical Chemistry.
Post-fire debris flow prediction in Western United States: Advancements based on a nonparametric statistical technique

Science.gov (United States)

Nikolopoulos, E. I.; Destro, E.; Bhuiyan, M. A. E.; Borga, M., Sr.; Anagnostou, E. N.

2017-12-01

Fire disasters affect modern societies at global scale inducing significant economic losses and human casualties. In addition to their direct impacts they have various adverse effects on hydrologic and geomorphologic processes of a region due to the tremendous alteration of the landscape characteristics (vegetation, soil properties etc). As a consequence, wildfires often initiate a cascade of hazards such as flash floods and debris flows that usually follow the occurrence of a wildfire thus magnifying the overall impact in a region. Post-fire debris flows (PFDF) is one such type of hazards frequently occurring in Western United States where wildfires are a common natural disaster. Prediction of PDFD is therefore of high importance in this region and over the last years a number of efforts from United States Geological Survey (USGS) and National Weather Service (NWS) have been focused on the development of early warning systems that will help mitigate PFDF risk. This work proposes a prediction framework that is based on a nonparametric statistical technique (random forests) that allows predicting the occurrence of PFDF at regional scale with a higher degree of accuracy than the commonly used approaches that are based on power-law thresholds and logistic regression procedures. The work presented is based on a recently released database from USGS that reports a total of 1500 storms that triggered and did not trigger PFDF in a number of fire affected catchments in Western United States. The database includes information on storm characteristics (duration, accumulation, max intensity etc) and other auxiliary information of land surface properties (soil erodibility index, local slope etc). Results show that the proposed model is able to achieve a satisfactory prediction accuracy (threat score > 0.6) superior of previously published prediction frameworks highlighting the potential of nonparametric statistical techniques for development of PFDF prediction systems.
An integrated biochemical prediction model of all-cause mortality in patients undergoing lower extremity bypass surgery for advanced peripheral artery disease.

Science.gov (United States)

Owens, Christopher D; Kim, Ji Min; Hevelone, Nathanael D; Gasper, Warren J; Belkin, Michael; Creager, Mark A; Conte, Michael S

2012-09-01

Patients with advanced peripheral artery disease (PAD) have a high prevalence of cardiovascular (CV) risk factors and shortened life expectancy. However, CV risk factors poorly predict midterm (model was used to assess the main outcome of all-cause mortality. A clinical model was constructed with known CV risk factors, and the incremental value of the addition of clinical chemistry, lipid assessment, and a panel of 11 inflammatory parameters was investigated using the C statistic, the integrated discrimination improvement index, and Akaike information criterion. The study monitored 225 patients for a median of 893 days (interquartile range, 539-1315 days). In this study, 50 patients (22.22%) died during the follow-up period. By life-table analysis (expressed as percent surviving ± standard error), survival at 1, 2, 3, 4, and 5 years, respectively, was 90.5% ± 1.9%, 83.4% ± 2.5%, 77.5% ± 3.1%, 71.0% ± 3.8%, and 65.3% ± 6.5%. Compared with survivors, decedents were older, diabetic, had extant coronary artery disease, and were more likely to present with critical limb ischemia as their indication for bypass surgery (P model and produced a final C statistic of 0.82. A risk prediction model including traditional risk factors and parameters of inflammation, renal function, and nutrition had excellent discriminatory ability in predicting all-cause mortality in patients with clinically advanced PAD undergoing bypass surgery. Copyright © 2012 Society for Vascular Surgery. Published by Mosby, Inc. All rights reserved.
Neighborhood Integration and Connectivity Predict Cognitive Performance and Decline

Directory of Open Access Journals (Sweden)

Amber Watts PhD

2015-08-01

Full Text Available Objective: Neighborhood characteristics may be important for promoting walking, but little research has focused on older adults, especially those with cognitive impairment. We evaluated the role of neighborhood characteristics on cognitive function and decline over a 2-year period adjusting for measures of walking. Method: In a study of 64 older adults with and without mild Alzheimer’s disease (AD, we evaluated neighborhood integration and connectivity using geographical information systems data and space syntax analysis. In multiple regression analyses, we used these characteristics to predict 2-year declines in factor analytically derived cognitive scores (attention, verbal memory, mental status adjusting for age, sex, education, and self-reported walking. Results : Neighborhood integration and connectivity predicted cognitive performance at baseline, and changes in cognitive performance over 2 years. The relationships between neighborhood characteristics and cognitive performance were not fully explained by self-reported walking. Discussion : Clearer definitions of specific neighborhood characteristics associated with walkability are needed to better understand the mechanisms by which neighborhoods may impact cognitive outcomes. These results have implications for measuring neighborhood characteristics, design and maintenance of living spaces, and interventions to increase walking among older adults. We offer suggestions for future research measuring neighborhood characteristics and cognitive function.
Integrated predictive modelling simulations of burning plasma experiment designs

International Nuclear Information System (INIS)

Bateman, Glenn; Onjun, Thawatchai; Kritz, Arnold H

2003-01-01

Models for the height of the pedestal at the edge of H-mode plasmas (Onjun T et al 2002 Phys. Plasmas 9 5018) are used together with the Multi-Mode core transport model (Bateman G et al 1998 Phys. Plasmas 5 1793) in the BALDUR integrated predictive modelling code to predict the performance of the ITER (Aymar A et al 2002 Plasma Phys. Control. Fusion 44 519), FIRE (Meade D M et al 2001 Fusion Technol. 39 336), and IGNITOR (Coppi B et al 2001 Nucl. Fusion 41 1253) fusion reactor designs. The simulation protocol used in this paper is tested by comparing predicted temperature and density profiles against experimental data from 33 H-mode discharges in the JET (Rebut P H et al 1985 Nucl. Fusion 25 1011) and DIII-D (Luxon J L et al 1985 Fusion Technol. 8 441) tokamaks. The sensitivities of the predictions are evaluated for the burning plasma experimental designs by using variations of the pedestal temperature model that are one standard deviation above and below the standard model. Simulations of the fusion reactor designs are carried out for scans in which the plasma density and auxiliary heating power are varied
Development of an Integrated Moisture Index for predicting species composition

Science.gov (United States)

Louis R. Iverson; Charles T. Scott; Martin E. Dale; Anantha Prasad

1996-01-01

A geographic information system (GIS) approach was used to develop an Integrated Moisture Index (IMI), which was used to predict species composition for Ohio forests. Several landscape features (a slope-aspect shading index, cumulative flow of water downslope, curvature of the landscape, and the water-holding capacity of the soil) were derived from elevation and soils...

Integration of Fast Predictive Model and SLM Process Development Chamber, Phase I

Data.gov (United States)

National Aeronautics and Space Administration — This STTR project seeks to develop a fast predictive model for selective laser melting (SLM) processes and then integrate that model with an SLM chamber that allows...
Machine learning and statistical techniques : an application to the prediction of insolvency in Spanish non-life insurance companies

OpenAIRE

Díaz, Zuleyka; Segovia, María Jesús; Fernández, José

2005-01-01

Prediction of insurance companies insolvency has arisen as an important problem in the field of financial research. Most methods applied in the past to tackle this issue are traditional statistical techniques which use financial ratios as explicative variables. However, these variables often do not satisfy statistical assumptions, which complicates the application of the mentioned methods. In this paper, a comparative study of the performance of two non-parametric machine learning techniques ...
Comparing statistical and machine learning classifiers: alternatives for predictive modeling in human factors research.

Science.gov (United States)

Carnahan, Brian; Meyer, Gérard; Kuntz, Lois-Ann

2003-01-01

Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches--genetic programming and decision tree induction--were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.
Proposal for future diagnosis and management of vascular tumors by using automatic software for image processing and statistic prediction.

Science.gov (United States)

Popescu, M D; Draghici, L; Secheli, I; Secheli, M; Codrescu, M; Draghici, I

2015-01-01

Infantile Hemangiomas (IH) are the most frequent tumors of vascular origin, and the differential diagnosis from vascular malformations is difficult to establish. Specific types of IH due to the location, dimensions and fast evolution, can determine important functional and esthetic sequels. To avoid these unfortunate consequences it is necessary to establish the exact appropriate moment to begin the treatment and decide which the most adequate therapeutic procedure is. Based on clinical data collected by a serial clinical observations correlated with imaging data, and processed by a computer-aided diagnosis system (CAD), the study intended to develop a treatment algorithm to accurately predict the best final results, from the esthetical and functional point of view, for a certain type of lesion. The preliminary database was composed of 75 patients divided into 4 groups according to the treatment management they received: medical therapy, sclerotherapy, surgical excision and no treatment. The serial clinical observation was performed each month and all the data was processed by using CAD. The project goal was to create a software that incorporated advanced methods to accurately measure the specific IH lesions, integrated medical information, statistical methods and computational methods to correlate this information with that obtained from the processing of images. Based on these correlations, a prediction mechanism of the evolution of hemangioma, which helped determine the best method of therapeutic intervention to minimize further complications, was established.
Using the Student Research Project to Integrate Macroeconomics and Statistics in an Advanced Cost Accounting Course

Science.gov (United States)

Hassan, Mahamood M.; Schwartz, Bill N.

2014-01-01

This paper discusses a student research project that is part of an advanced cost accounting class. The project emphasizes active learning, integrates cost accounting with macroeconomics and statistics by "learning by doing" using real world data. Students analyze sales data for a publicly listed company by focusing on the company's…
Unbalanced Regressions and the Predictive Equation

DEFF Research Database (Denmark)

Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo

Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoreti......Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness...... in the theoretical predictive equation by suggesting a data generating process, where returns are generated as linear functions of a lagged latent I(0) risk process. The observed predictor is a function of this latent I(0) process, but it is corrupted by a fractionally integrated noise. Such a process may arise due...... to aggregation or unexpected level shifts. In this setup, the practitioner estimates a misspecified, unbalanced, and endogenous predictive regression. We show that the OLS estimate of this regression is inconsistent, but standard inference is possible. To obtain a consistent slope estimate, we then suggest...
An assessment of machine and statistical learning approaches to inferring networks of protein-protein interactions

Directory of Open Access Journals (Sweden)

Browne Fiona

2006-12-01

Full Text Available Protein-protein interactions (PPI play a key role in many biological systems. Over the past few years, an explosion in availability of functional biological data obtained from high-throughput technologies to infer PPI has been observed. However, results obtained from such experiments show high rates of false positives and false negatives predictions as well as systematic predictive bias. Recent research has revealed that several machine and statistical learning methods applied to integrate relatively weak, diverse sources of large-scale functional data may provide improved predictive accuracy and coverage of PPI. In this paper we describe the effects of applying different computational, integrative methods to predict PPI in Saccharomyces cerevisiae. We investigated the predictive ability of combining different sets of relatively strong and weak predictive datasets. We analysed several genomic datasets ranging from mRNA co-expression to marginal essentiality. Moreover, we expanded an existing multi-source dataset from S. cerevisiae by constructing a new set of putative interactions extracted from Gene Ontology (GO- driven annotations in the Saccharomyces Genome Database. Different classification techniques: Simple Naive Bayesian (SNB, Multilayer Perceptron (MLP and K-Nearest Neighbors (KNN were evaluated. Relatively simple classification methods (i.e. less computing intensive and mathematically complex, such as SNB, have been proven to be proficient at predicting PPI. SNB produced the “highest” predictive quality obtaining an area under Receiver Operating Characteristic (ROC curve (AUC value of 0.99. The lowest AUC value of 0.90 was obtained by the KNN classifier. This assessment also demonstrates the strong predictive power of GO-driven models, which offered predictive performance above 0.90 using the different machine learning and statistical techniques. As the predictive power of single-source datasets became weaker MLP and SNB performed
Brain systems for probabilistic and dynamic prediction: computational specificity and integration.

Directory of Open Access Journals (Sweden)

Jill X O'Reilly

2013-09-01

Full Text Available A computational approach to functional specialization suggests that brain systems can be characterized in terms of the types of computations they perform, rather than their sensory or behavioral domains. We contrasted the neural systems associated with two computationally distinct forms of predictive model: a reinforcement-learning model of the environment obtained through experience with discrete events, and continuous dynamic forward modeling. By manipulating the precision with which each type of prediction could be used, we caused participants to shift computational strategies within a single spatial prediction task. Hence (using fMRI we showed that activity in two brain systems (typically associated with reward learning and motor control could be dissociated in terms of the forms of computations that were performed there, even when both systems were used to make parallel predictions of the same event. A region in parietal cortex, which was sensitive to the divergence between the predictions of the models and anatomically connected to both computational networks, is proposed to mediate integration of the two predictive modes to produce a single behavioral output.
The Prediction-Focused Approach: An opportunity for hydrogeophysical data integration and interpretation

Science.gov (United States)

Hermans, Thomas; Nguyen, Frédéric; Klepikova, Maria; Dassargues, Alain; Caers, Jef

2017-04-01

Hydrogeophysics is an interdisciplinary field of sciences aiming at a better understanding of subsurface hydrological processes. If geophysical surveys have been successfully used to qualitatively characterize the subsurface, two important challenges remain for a better quantification of hydrological processes: (1) the inversion of geophysical data and (2) their integration in hydrological subsurface models. The classical inversion approach using regularization suffers from spatially and temporally varying resolution and yields geologically unrealistic solutions without uncertainty quantification, making their utilization for hydrogeological calibration less consistent. More advanced techniques such as coupled inversion allow for a direct use of geophysical data for conditioning groundwater and solute transport model calibration. However, the technique is difficult to apply in complex cases and remains computationally demanding to estimate uncertainty. In a recent study, we investigate a prediction-focused approach (PFA) to directly estimate subsurface physical properties from geophysical data, circumventing the need for classic inversions. In PFA, we seek a direct relationship between the data and the subsurface variables we want to predict (the forecast). This relationship is obtained through a prior set of subsurface models for which both data and forecast are computed. A direct relationship can often be derived through dimension reduction techniques. PFA offers a framework for both hydrogeophysical "inversion" and hydrogeophysical data integration. For hydrogeophysical "inversion", the considered forecast variable is the subsurface variable, such as the salinity. An ensemble of possible solutions is generated, allowing uncertainty quantification. For hydrogeophysical data integration, the forecast variable becomes the prediction we want to make with our subsurface models, such as the concentration of contaminant in a drinking water production well. Geophysical
Predicting Elementary Education Candidates' Technology Integration during Their Field Placement Instruction.

Science.gov (United States)

Negishi, Meiko; Elder, Anastasia D.; Hamil, J. Burnette; Mzoughi, Taha

A growing concern in teacher education programs is technology training. Research confirms that training positively affects perservice teachers' attitudes and technology proficiency. However, little is known about the kinds of factors that may predict preservice teachers' integration of technology into their own instruction. The goal of this study…
A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information.

Science.gov (United States)

Luo, Yunan; Zhao, Xinbin; Zhou, Jingtian; Yang, Jinglin; Zhang, Yanqing; Kuang, Wenhua; Peng, Jian; Chen, Ligong; Zeng, Jianyang

2017-09-18

The emergence of large-scale genomic, chemical and pharmacological data provides new opportunities for drug discovery and repositioning. In this work, we develop a computational pipeline, called DTINet, to predict novel drug-target interactions from a constructed heterogeneous network, which integrates diverse drug-related information. DTINet focuses on learning a low-dimensional vector representation of features, which accurately explains the topological properties of individual nodes in the heterogeneous network, and then makes prediction based on these representations via a vector space projection scheme. DTINet achieves substantial performance improvement over other state-of-the-art methods for drug-target interaction prediction. Moreover, we experimentally validate the novel interactions between three drugs and the cyclooxygenase proteins predicted by DTINet, and demonstrate the new potential applications of these identified cyclooxygenase inhibitors in preventing inflammatory diseases. These results indicate that DTINet can provide a practically useful tool for integrating heterogeneous information to predict new drug-target interactions and repurpose existing drugs.Network-based data integration for drug-target prediction is a promising avenue for drug repositioning, but performance is wanting. Here, the authors introduce DTINet, whose performance is enhanced in the face of noisy, incomplete and high-dimensional biological data by learning low-dimensional vector representations.
Use of structure-activity landscape index curves and curve integrals to evaluate the performance of multiple machine learning prediction models

Directory of Open Access Journals (Sweden)

LeDonne Norman C

2011-02-01

Full Text Available Abstract Background Standard approaches to address the performance of predictive models that used common statistical measurements for the entire data set provide an overview of the average performance of the models across the entire predictive space, but give little insight into applicability of the model across the prediction space. Guha and Van Drie recently proposed the use of structure-activity landscape index (SALI curves via the SALI curve integral (SCI as a means to map the predictive power of computational models within the predictive space. This approach evaluates model performance by assessing the accuracy of pairwise predictions, comparing compound pairs in a manner similar to that done by medicinal chemists. Results The SALI approach was used to evaluate the performance of continuous prediction models for MDR1-MDCK in vitro efflux potential. Efflux models were built with ADMET Predictor neural net, support vector machine, kernel partial least squares, and multiple linear regression engines, as well as SIMCA-P+ partial least squares, and random forest from Pipeline Pilot as implemented by AstraZeneca, using molecular descriptors from SimulationsPlus and AstraZeneca. Conclusion The results indicate that the choice of training sets used to build the prediction models is of great importance in the resulting model quality and that the SCI values calculated for these models were very similar to their Kendall τ values, leading to our suggestion of an approach to use this SALI/SCI paradigm to evaluate predictive model performance that will allow more informed decisions regarding model utility. The use of SALI graphs and curves provides an additional level of quality assessment for predictive models.
Predicting Statistical Distributions of Footbridge Vibrations

DEFF Research Database (Denmark)

Pedersen, Lars; Frier, Christian

2009-01-01

The paper considers vibration response of footbridges to pedestrian loading. Employing Newmark and Monte Carlo simulation methods, a statistical distribution of bridge vibration levels is calculated modelling walking parameters such as step frequency and stride length as random variables...
Predicting cycle time distributions for integrated processing workstations : an aggregate modeling approach

NARCIS (Netherlands)

Veeger, C.P.L.; Etman, L.F.P.; Lefeber, A.A.J.; Adan, I.J.B.F.; Herk, van J.; Rooda, J.E.

2011-01-01

To predict cycle time distributions of integrated processing workstations, detailed simulation models are almost exclusively used; these models require considerable development and maintenance effort. As an alternative, we propose an aggregate model that is a lumped-parameter representation of the
Integration of statistical modeling and high-content microscopy to systematically investigate cell-substrate interactions.

Science.gov (United States)

Chen, Wen Li Kelly; Likhitpanichkul, Morakot; Ho, Anthony; Simmons, Craig A

2010-03-01

Cell-substrate interactions are multifaceted, involving the integration of various physical and biochemical signals. The interactions among these microenvironmental factors cannot be facilely elucidated and quantified by conventional experimentation, and necessitate multifactorial strategies. Here we describe an approach that integrates statistical design and analysis of experiments with automated microscopy to systematically investigate the combinatorial effects of substrate-derived stimuli (substrate stiffness and matrix protein concentration) on mesenchymal stem cell (MSC) spreading, proliferation and osteogenic differentiation. C3H10T1/2 cells were grown on type I collagen- or fibronectin-coated polyacrylamide hydrogels with tunable mechanical properties. Experimental conditions, which were defined according to central composite design, consisted of specific permutations of substrate stiffness (3-144 kPa) and adhesion protein concentration (7-520 microg/mL). Spreading area, BrdU incorporation and Runx2 nuclear translocation were quantified using high-content microscopy and modeled as mathematical functions of substrate stiffness and protein concentration. The resulting response surfaces revealed distinct patterns of protein-specific, substrate stiffness-dependent modulation of MSC proliferation and differentiation, demonstrating the advantage of statistical modeling in the detection and description of higher-order cellular responses. In a broader context, this approach can be adapted to study other types of cell-material interactions and can facilitate the efficient screening and optimization of substrate properties for applications involving cell-material interfaces. Copyright 2009 Elsevier Ltd. All rights reserved.
Integration of Predictive Display and Aircraft Flight Control System

Directory of Open Access Journals (Sweden)

Efremov A.V.

2017-01-01

Full Text Available The synthesis of predictive display information and direct lift control system are considered for the path control tracking tasks (in particular landing task. The both solutions are based on pilot-vehicle system analysis and requirements to provide the highest accuracy and lowest pilot workload. The investigation was carried out for cases with and without time delay in aircraft dynamics. The efficiency of the both ways for the flying qualities improvement and their integration is tested by ground based simulation.
An electrically actuated imperfect microbeam: Dynamical integrity for interpreting and predicting the device response

KAUST Repository

Ruzziconi, Laura

2013-02-20

In this study we deal with a microelectromechanical system (MEMS) and develop a dynamical integrity analysis to interpret and predict the experimental response. The device consists of a clamped-clamped polysilicon microbeam, which is electrostatically and electrodynamically actuated. It has non-negligible imperfections, which are a typical consequence of the microfabrication process. A single-mode reduced-order model is derived and extensive numerical simulations are performed in a neighborhood of the first symmetric natural frequency, via frequency response diagrams and behavior chart. The typical softening behavior is observed and the overall scenario is explored, when both the frequency and the electrodynamic voltage are varied. We show that simulations based on direct numerical integration of the equation of motion in time yield satisfactory agreement with the experimental data. Nevertheless, these theoretical predictions are not completely fulfilled in some aspects. In particular, the range of existence of each attractor is smaller in practice than in the simulations. This is because these theoretical curves represent the ideal limit case where disturbances are absent, which never occurs under realistic conditions. A reliable prediction of the actual (and not only theoretical) range of existence of each attractor is essential in applications. To overcome this discrepancy and extend the results to the practical case where disturbances exist, a dynamical integrity analysis is developed. After introducing dynamical integrity concepts, integrity profiles and integrity charts are drawn. They are able to describe if each attractor is robust enough to tolerate the disturbances. Moreover, they detect the parameter range where each branch can be reliably observed in practice and where, instead, becomes vulnerable, i.e. they provide valuable information to operate the device in safe conditions according to the desired outcome and depending on the expected disturbances
Integration of HIV in the Human Genome: Which Sites Are Preferential? A Genetic and Statistical Assessment

Science.gov (United States)

Gonçalves, Juliana; Moreira, Elsa; Sequeira, Inês J.; Rodrigues, António S.; Rueff, José; Brás, Aldina

2016-01-01

Chromosomal fragile sites (FSs) are loci where gaps and breaks may occur and are preferential integration targets for some viruses, for example, Hepatitis B, Epstein-Barr virus, HPV16, HPV18, and MLV vectors. However, the integration of the human immunodeficiency virus (HIV) in Giemsa bands and in FSs is not yet completely clear. This study aimed to assess the integration preferences of HIV in FSs and in Giemsa bands using an in silico study. HIV integration positions from Jurkat cells were used and two nonparametric tests were applied to compare HIV integration in dark versus light bands and in FS versus non-FS (NFSs). The results show that light bands are preferential targets for integration of HIV-1 in Jurkat cells and also that it integrates with equal intensity in FSs and in NFSs. The data indicates that HIV displays different preferences for FSs compared to other viruses. The aim was to develop and apply an approach to predict the conditions and constraints of HIV insertion in the human genome which seems to adequately complement empirical data. PMID:27294106
Predicting the accumulated number of plugged tubes in a steam generator using statistical methodologies

International Nuclear Information System (INIS)

Ferng, Y.-M.; Fan, C.N.; Pei, B.S.; Li, H.-N.

2008-01-01

A steam generator (SG) plays a significant role not only with respect to the primary-to-secondary heat transfer but also as a fission product barrier to prevent the release of radionuclides. Tube plugging is an efficient way to avoid releasing radionuclides when SG tubes are severely degraded. However, this remedial action may cause the decrease of SG heat transfer capability, especially in transient or accident conditions. It is therefore crucial for the plant staff to understand the trend of plugged tubes for the SG operation and maintenance. Statistical methodologies are proposed in this paper to predict this trend. The accumulated numbers of SG plugged tubes versus the operation time are predicted using the Weibull and log-normal distributions, which correspond well with the plant measured data from a selected pressurized water reactor (PWR). With the help of these predictions, the accumulated number of SG plugged tubes can be reasonably extrapolated to the 40-year operation lifetime (or even longer than 40 years) of a PWR. This information can assist the plant policymakers to determine whether or when a SG must be replaced
Quantification of design margins and safety factors based on the prediction uncertainty in tritium production rate from fusion integral experiments of the USDOE/JAERI collaborative program on fusion blanket neutronics

International Nuclear Information System (INIS)

Youssef, M.Z.; Konno, C.; Maekawa, F.; Ikeda, Y.; Kosako, K.; Nakagawa, M.; Mori, T.; Maekawa, H.

1995-01-01

Several fusion integral experiments were performed within a collaboration between the USA and Japan on fusion breeder neutronics aimed at verifying the prediction accuracy of key neutronics parameters in a fusion reactor blanket based on current neutron transport codes and basic nuclear databases. The focus has been on the tritium production rate (TRP) as an important design parameter to resolve the issue of tritium self-sufficiency in a fusion reactor. In this paper, the calculational and experimental uncertainties (errors) in local TPR in each experiment performed i were interpolated and propagated to estimate the prediction uncertainty u i in the line-integrated TPR and its standard deviation σ i . The measured data are based on Li-glass and NE213 detectors. From the quantities u i and σ i , normalized density functions (NDFs) were constructed, considering all the experiments and their associated analyses performed independently by the UCLA and JAERI. Several statistical parameters were derived, including the mean prediction uncertainties u and the possible spread ±σ u around them. Design margins and safety factors were derived from these NDFs. Distinction was made between the results obtained by UCLA and JAERI and between calculational results based on the discrete ordinates and Monte Carlo methods. The prediction uncertainties, their standard deviations and the design margins and safety factors were derived for the line-integrated TPR from Li-6 T 6 , and Li-7 T 7 . These parameters were used to estimate the corresponding uncertainties and safety factor for the line-integrated TPR from natural lithium T n . (orig.)

Prediction of monthly average global solar radiation based on statistical distribution of clearness index

International Nuclear Information System (INIS)

Ayodele, T.R.; Ogunjuyigbe, A.S.O.

2015-01-01

In this paper, probability distribution of clearness index is proposed for the prediction of global solar radiation. First, the clearness index is obtained from the past data of global solar radiation, then, the parameters of the appropriate distribution that best fit the clearness index are determined. The global solar radiation is thereafter predicted from the clearness index using inverse transformation of the cumulative distribution function. To validate the proposed method, eight years global solar radiation data (2000–2007) of Ibadan, Nigeria are used to determine the parameters of appropriate probability distribution for clearness index. The calculated parameters are then used to predict the future monthly average global solar radiation for the following year (2008). The predicted values are compared with the measured values using four statistical tests: the Root Mean Square Error (RMSE), MAE (Mean Absolute Error), MAPE (Mean Absolute Percentage Error) and the coefficient of determination (R"2). The proposed method is also compared to the existing regression models. The results show that logistic distribution provides the best fit for clearness index of Ibadan and the proposed method is effective in predicting the monthly average global solar radiation with overall RMSE of 0.383 MJ/m"2/day, MAE of 0.295 MJ/m"2/day, MAPE of 2% and R"2 of 0.967. - Highlights: • Distribution of clearnes index is proposed for prediction of global solar radiation. • The clearness index is obtained from the past data of global solar radiation. • The parameters of distribution that best fit the clearness index are determined. • Solar radiation is predicted from the clearness index using inverse transformation. • The method is effective in predicting the monthly average global solar radiation.
Trends and Opportunities of BIM-GIS Integration in the Architecture, Engineering and Construction Industry: A Review from a Spatio-Temporal Statistical Perspective

Directory of Open Access Journals (Sweden)

Yongze Song

2017-12-01

Full Text Available The integration of building information modelling (BIM and geographic information system (GIS in construction management is a new and fast developing trend in recent years, from research to industrial practice. BIM has advantages on rich geometric and semantic information through the building life cycle, while GIS is a broad field covering geovisualization-based decision making and geospatial modelling. However, most current studies of BIM-GIS integration focus on the integration techniques but lack theories and methods for further data analysis and mathematic modelling. This paper reviews the applications and discusses future trends of BIM-GIS integration in the architecture, engineering and construction (AEC industry based on the studies of 96 high-quality research articles from a spatio-temporal statistical perspective. The analysis of these applications helps reveal the evolution progress of BIM-GIS integration. Results show that the utilization of BIM-GIS integration in the AEC industry requires systematic theories beyond integration technologies and deep applications of mathematical modeling methods, including spatio-temporal statistical modeling in GIS and 4D/nD BIM simulation and management. Opportunities of BIM-GIS integration are outlined as three hypotheses in the AEC industry for future research on the in-depth integration of BIM and GIS. BIM-GIS integration hypotheses enable more comprehensive applications through the life cycle of AEC projects.
Predicting losing and gaining river reaches in lowland New Zealand based on a statistical methodology

Science.gov (United States)

Yang, Jing; Zammit, Christian; Dudley, Bruce

2017-04-01

The phenomenon of losing and gaining in rivers normally takes place in lowland where often there are various, sometimes conflicting uses for water resources, e.g., agriculture, industry, recreation, and maintenance of ecosystem function. To better support water allocation decisions, it is crucial to understand the location and seasonal dynamics of these losses and gains. We present a statistical methodology to predict losing and gaining river reaches in New Zealand based on 1) information surveys with surface water and groundwater experts from regional government, 2) A collection of river/watershed characteristics, including climate, soil and hydrogeologic information, and 3) the random forests technique. The surveys on losing and gaining reaches were conducted face-to-face at 16 New Zealand regional government authorities, and climate, soil, river geometry, and hydrogeologic data from various sources were collected and compiled to represent river/watershed characteristics. The random forests technique was used to build up the statistical relationship between river reach status (gain and loss) and river/watershed characteristics, and then to predict for river reaches at Strahler order one without prior losing and gaining information. Results show that the model has a classification error of around 10% for "gain" and "loss". The results will assist further research, and water allocation decisions in lowland New Zealand.
At the Nexus of History, Ecology, and Hydrobiogeochemistry: Improved Predictions across Scales through Integration.

Science.gov (United States)

Stegen, James C

2018-01-01

To improve predictions of ecosystem function in future environments, we need to integrate the ecological and environmental histories experienced by microbial communities with hydrobiogeochemistry across scales. A key issue is whether we can derive generalizable scaling relationships that describe this multiscale integration. There is a strong foundation for addressing these challenges. We have the ability to infer ecological history with null models and reveal impacts of environmental history through laboratory and field experimentation. Recent developments also provide opportunities to inform ecosystem models with targeted omics data. A major next step is coupling knowledge derived from such studies with multiscale modeling frameworks that are predictive under non-steady-state conditions. This is particularly true for systems spanning dynamic interfaces, which are often hot spots of hydrobiogeochemical function. We can advance predictive capabilities through a holistic perspective focused on the nexus of history, ecology, and hydrobiogeochemistry.
Statistical Diagnosis Method of Conductor Motions in Superconducting Magnets to Predict their Quench Performance

CERN Document Server

Khomenko, B A; Rijllart, A; Sanfilippo, S; Siemko, A

2001-01-01

Premature training quenches are usually caused by the transient energy released within the magnet coil as it is energised. Two distinct varieties of disturbances exist. They are thought to be electrical and mechanical in origin. The first type of disturbance comes from non-uniform current distribution in superconducting cables whereas the second one usually originates from conductor motions or micro-fractures of insulating materials under the action of Lorentz forces. All of these mechanical events produce in general a rapid variation of the voltages in the so-called quench antennas and across the magnet coil, called spikes. A statistical method to treat the spatial localisation and the time occurrence of spikes will be presented. It allows identification of the mechanical weak points in the magnet without need to increase the current to provoke a quench. The prediction of the quench level from detailed analysis of the spike statistics can be expected.
Development and application of a statistical methodology to evaluate the predictive accuracy of building energy baseline models

Energy Technology Data Exchange (ETDEWEB)

Granderson, Jessica [Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). Energy Technologies Area Div.; Price, Phillip N. [Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). Energy Technologies Area Div.

2014-03-01

This paper documents the development and application of a general statistical methodology to assess the accuracy of baseline energy models, focusing on its application to Measurement and Verification (M&V) of whole-building energy savings. The methodology complements the principles addressed in resources such as ASHRAE Guideline 14 and the International Performance Measurement and Verification Protocol. It requires fitting a baseline model to data from a ``training period’’ and using the model to predict total electricity consumption during a subsequent ``prediction period.’’ We illustrate the methodology by evaluating five baseline models using data from 29 buildings. The training period and prediction period were varied, and model predictions of daily, weekly, and monthly energy consumption were compared to meter data to determine model accuracy. Several metrics were used to characterize the accuracy of the predictions, and in some cases the best-performing model as judged by one metric was not the best performer when judged by another metric.
Predicting the Pullout Capacity of Small Ground Anchors Using Nonlinear Integrated Computing Techniques

Directory of Open Access Journals (Sweden)

Mosbeh R. Kaloop

2017-01-01

Full Text Available This study investigates predicting the pullout capacity of small ground anchors using nonlinear computing techniques. The input-output prediction model for the nonlinear Hammerstein-Wiener (NHW and delay inputs for the adaptive neurofuzzy inference system (DANFIS are developed and utilized to predict the pullout capacity. The results of the developed models are compared with previous studies that used artificial neural networks and least square support vector machine techniques for the same case study. The in situ data collection and statistical performances are used to evaluate the models performance. Results show that the developed models enhance the precision of predicting the pullout capacity when compared with previous studies. Also, the DANFIS model performance is proven to be better than other models used to detect the pullout capacity of ground anchors.
The value of model averaging and dynamical climate model predictions for improving statistical seasonal streamflow forecasts over Australia

Science.gov (United States)

Pokhrel, Prafulla; Wang, Q. J.; Robertson, David E.

2013-10-01

Seasonal streamflow forecasts are valuable for planning and allocation of water resources. In Australia, the Bureau of Meteorology employs a statistical method to forecast seasonal streamflows. The method uses predictors that are related to catchment wetness at the start of a forecast period and to climate during the forecast period. For the latter, a predictor is selected among a number of lagged climate indices as candidates to give the "best" model in terms of model performance in cross validation. This study investigates two strategies for further improvement in seasonal streamflow forecasts. The first is to combine, through Bayesian model averaging, multiple candidate models with different lagged climate indices as predictors, to take advantage of different predictive strengths of the multiple models. The second strategy is to introduce additional candidate models, using rainfall and sea surface temperature predictions from a global climate model as predictors. This is to take advantage of the direct simulations of various dynamic processes. The results show that combining forecasts from multiple statistical models generally yields more skillful forecasts than using only the best model and appears to moderate the worst forecast errors. The use of rainfall predictions from the dynamical climate model marginally improves the streamflow forecasts when viewed over all the study catchments and seasons, but the use of sea surface temperature predictions provide little additional benefit.
Predictive Systems Toxicology

KAUST Repository

Kiani, Narsis A.; Shang, Ming-Mei; Zenil, Hector; Tegner, Jesper

2018-01-01

In this review we address to what extent computational techniques can augment our ability to predict toxicity. The first section provides a brief history of empirical observations on toxicity dating back to the dawn of Sumerian civilization. Interestingly, the concept of dose emerged very early on, leading up to the modern emphasis on kinetic properties, which in turn encodes the insight that toxicity is not solely a property of a compound but instead depends on the interaction with the host organism. The next logical step is the current conception of evaluating drugs from a personalized medicine point-of-view. We review recent work on integrating what could be referred to as classical pharmacokinetic analysis with emerging systems biology approaches incorporating multiple omics data. These systems approaches employ advanced statistical analytical data processing complemented with machine learning techniques and use both pharmacokinetic and omics data. We find that such integrated approaches not only provide improved predictions of toxicity but also enable mechanistic interpretations of the molecular mechanisms underpinning toxicity and drug resistance. We conclude the chapter by discussing some of the main challenges, such as how to balance the inherent tension between the predictive capacity of models, which in practice amounts to constraining the number of features in the models versus allowing for rich mechanistic interpretability, i.e. equipping models with numerous molecular features. This challenge also requires patient-specific predictions on toxicity, which in turn requires proper stratification of patients as regards how they respond, with or without adverse toxic effects. In summary, the transformation of the ancient concept of dose is currently successfully operationalized using rich integrative data encoded in patient-specific models.
Predictive Systems Toxicology

KAUST Repository

Kiani, Narsis A.

2018-01-15

In this review we address to what extent computational techniques can augment our ability to predict toxicity. The first section provides a brief history of empirical observations on toxicity dating back to the dawn of Sumerian civilization. Interestingly, the concept of dose emerged very early on, leading up to the modern emphasis on kinetic properties, which in turn encodes the insight that toxicity is not solely a property of a compound but instead depends on the interaction with the host organism. The next logical step is the current conception of evaluating drugs from a personalized medicine point-of-view. We review recent work on integrating what could be referred to as classical pharmacokinetic analysis with emerging systems biology approaches incorporating multiple omics data. These systems approaches employ advanced statistical analytical data processing complemented with machine learning techniques and use both pharmacokinetic and omics data. We find that such integrated approaches not only provide improved predictions of toxicity but also enable mechanistic interpretations of the molecular mechanisms underpinning toxicity and drug resistance. We conclude the chapter by discussing some of the main challenges, such as how to balance the inherent tension between the predictive capacity of models, which in practice amounts to constraining the number of features in the models versus allowing for rich mechanistic interpretability, i.e. equipping models with numerous molecular features. This challenge also requires patient-specific predictions on toxicity, which in turn requires proper stratification of patients as regards how they respond, with or without adverse toxic effects. In summary, the transformation of the ancient concept of dose is currently successfully operationalized using rich integrative data encoded in patient-specific models.
Integrated petrophysical and reservoir characterization workflow to enhance permeability and water saturation prediction

Science.gov (United States)

Al-Amri, Meshal; Mahmoud, Mohamed; Elkatatny, Salaheldin; Al-Yousef, Hasan; Al-Ghamdi, Tariq

2017-07-01

Accurate estimation of permeability is essential in reservoir characterization and in determining fluid flow in porous media which greatly assists optimize the production of a field. Some of the permeability prediction techniques such as Porosity-Permeability transforms and recently artificial intelligence and neural networks are encouraging but still show moderate to good match to core data. This could be due to limitation to homogenous media while the knowledge about geology and heterogeneity is indirectly related or absent. The use of geological information from core description as in Lithofacies which includes digenetic information show a link to permeability when categorized into rock types exposed to similar depositional environment. The objective of this paper is to develop a robust combined workflow integrating geology and petrophysics and wireline logs in an extremely heterogeneous carbonate reservoir to accurately predict permeability. Permeability prediction is carried out using pattern recognition algorithm called multi-resolution graph-based clustering (MRGC). We will bench mark the prediction results with hard data from core and well test analysis. As a result, we showed how much better improvements are achieved in the permeability prediction when geology is integrated within the analysis. Finally, we use the predicted permeability as an input parameter in J-function and correct for uncertainties in saturation calculation produced by wireline logs using the classical Archie equation. Eventually, high level of confidence in hydrocarbon volumes estimation is reached when robust permeability and saturation height functions are estimated in presence of important geological details that are petrophysically meaningful.
A RANS knock model to predict the statistical occurrence of engine knock

International Nuclear Information System (INIS)

D'Adamo, Alessandro; Breda, Sebastiano; Fontanesi, Stefano; Irimescu, Adrian; Merola, Simona Silvia; Tornatore, Cinzia

2017-01-01

Highlights: • Development of a new RANS model for SI engine knock probability. • Turbulence-derived transport equations for variances of mixture fraction and enthalpy. • Gasoline autoignition delay times calculated from detailed chemical kinetics. • Knock probability validated against experiments on optically accessible GDI unit. • PDF-based knock model accounting for the random nature of SI engine knock in RANS simulations. - Abstract: In the recent past engine knock emerged as one of the main limiting aspects for the achievement of higher efficiency targets in modern spark-ignition (SI) engines. To attain these requirements, engine operating points must be moved as close as possible to the onset of abnormal combustions, although the turbulent nature of flow field and SI combustion leads to possibly ample fluctuations between consecutive engine cycles. This forces engine designers to distance the target condition from its theoretical optimum in order to prevent abnormal combustion, which can potentially damage engine components because of few individual heavy-knocking cycles. A statistically based RANS knock model is presented in this study, whose aim is the prediction not only of the ensemble average knock occurrence, poorly meaningful in such a stochastic event, but also of a knock probability. The model is based on look-up tables of autoignition times from detailed chemistry, coupled with transport equations for the variance of mixture fraction and enthalpy. The transported perturbations around the ensemble average value are based on variable gradients and on a local turbulent time scale. A multi-variate cell-based Gaussian-PDF model is proposed for the unburnt mixture, resulting in a statistical distribution for the in-cell reaction rate. An average knock precursor and its variance are independently calculated and transported; this results in the prediction of an earliest knock probability preceding the ensemble average knock onset, as confirmed by
The duration of uncertain times: audiovisual information about intervals is integrated in a statistically optimal fashion.

Directory of Open Access Journals (Sweden)

Jess Hartcher-O'Brien

Full Text Available Often multisensory information is integrated in a statistically optimal fashion where each sensory source is weighted according to its precision. This integration scheme isstatistically optimal because it theoretically results in unbiased perceptual estimates with the highest precisionpossible.There is a current lack of consensus about how the nervous system processes multiple sensory cues to elapsed time.In order to shed light upon this, we adopt a computational approach to pinpoint the integration strategy underlying duration estimationof audio/visual stimuli. One of the assumptions of our computational approach is that the multisensory signals redundantly specify the same stimulus property. Our results clearly show that despite claims to the contrary, perceived duration is the result of an optimal weighting process, similar to that adopted for estimates of space. That is, participants weight the audio and visual information to arrive at the most precise, single duration estimate possible. The work also disentangles how different integration strategies - i.e. consideringthe time of onset/offset ofsignals - might alter the final estimate. As such we provide the first concrete evidence of an optimal integration strategy in human duration estimates.
Applied systems ecology: models, data, and statistical methods

Energy Technology Data Exchange (ETDEWEB)

Eberhardt, L L

1976-01-01

In this report, systems ecology is largely equated to mathematical or computer simulation modelling. The need for models in ecology stems from the necessity to have an integrative device for the diversity of ecological data, much of which is observational, rather than experimental, as well as from the present lack of a theoretical structure for ecology. Different objectives in applied studies require specialized methods. The best predictive devices may be regression equations, often non-linear in form, extracted from much more detailed models. A variety of statistical aspects of modelling, including sampling, are discussed. Several aspects of population dynamics and food-chain kinetics are described, and it is suggested that the two presently separated approaches should be combined into a single theoretical framework. It is concluded that future efforts in systems ecology should emphasize actual data and statistical methods, as well as modelling.
Statistical Viewer: a tool to upload and integrate linkage and association data as plots displayed within the Ensembl genome browser

Directory of Open Access Journals (Sweden)

Hauser Elizabeth R

2005-04-01

Full Text Available Abstract Background To facilitate efficient selection and the prioritization of candidate complex disease susceptibility genes for association analysis, increasingly comprehensive annotation tools are essential to integrate, visualize and analyze vast quantities of disparate data generated by genomic screens, public human genome sequence annotation and ancillary biological databases. We have developed a plug-in package for Ensembl called "Statistical Viewer" that facilitates the analysis of genomic features and annotation in the regions of interest defined by linkage analysis. Results Statistical Viewer is an add-on package to the open-source Ensembl Genome Browser and Annotation System that displays disease study-specific linkage and/or association data as 2 dimensional plots in new panels in the context of Ensembl's Contig View and Cyto View pages. An enhanced upload server facilitates the upload of statistical data, as well as additional feature annotation to be displayed in DAS tracts, in the form of Excel Files. The Statistical View panel, drawn directly under the ideogram, illustrates lod score values for markers from a study of interest that are plotted against their position in base pairs. A module called "Get Map" easily converts the genetic locations of markers to genomic coordinates. The graph is placed under the corresponding ideogram features a synchronized vertical sliding selection box that is seamlessly integrated into Ensembl's Contig- and Cyto- View pages to choose the region to be displayed in Ensembl's "Overview" and "Detailed View" panels. To resolve Association and Fine mapping data plots, a "Detailed Statistic View" plot corresponding to the "Detailed View" may be displayed underneath. Conclusion Features mapping to regions of linkage are accentuated when Statistic View is used in conjunction with the Distributed Annotation System (DAS to display supplemental laboratory information such as differentially expressed disease
Response statistics of rotating shaft with non-linear elastic restoring forces by path integration

Science.gov (United States)

Gaidai, Oleg; Naess, Arvid; Dimentberg, Michael

2017-07-01

Extreme statistics of random vibrations is studied for a Jeffcott rotor under uniaxial white noise excitation. Restoring force is modelled as elastic non-linear; comparison is done with linearized restoring force to see the force non-linearity effect on the response statistics. While for the linear model analytical solutions and stability conditions are available, it is not generally the case for non-linear system except for some special cases. The statistics of non-linear case is studied by applying path integration (PI) method, which is based on the Markov property of the coupled dynamic system. The Jeffcott rotor response statistics can be obtained by solving the Fokker-Planck (FP) equation of the 4D dynamic system. An efficient implementation of PI algorithm is applied, namely fast Fourier transform (FFT) is used to simulate dynamic system additive noise. The latter allows significantly reduce computational time, compared to the classical PI. Excitation is modelled as Gaussian white noise, however any kind distributed white noise can be implemented with the same PI technique. Also multidirectional Markov noise can be modelled with PI in the same way as unidirectional. PI is accelerated by using Monte Carlo (MC) estimated joint probability density function (PDF) as initial input. Symmetry of dynamic system was utilized to afford higher mesh resolution. Both internal (rotating) and external damping are included in mechanical model of the rotor. The main advantage of using PI rather than MC is that PI offers high accuracy in the probability distribution tail. The latter is of critical importance for e.g. extreme value statistics, system reliability, and first passage probability.
Prediction of noise in ships by the application of “statistical energy analysis.”

DEFF Research Database (Denmark)

Jensen, John Ødegaard

1979-01-01

If it will be possible effectively to reduce the noise level in the accomodation on board ships, by introducing appropriate noise abatement measures already at an early design stage, it is quite essential that sufficiently accurate prediction methods are available for the naval architects...... or for a special noise abatement measure, e.g., increased structural damping. The paper discusses whether it might be possible to derive an alternative calculation model based on the “statistical energy analysis” approach (SEA). By considering the hull of a ship to be constructed from plate elements connected...
Departure Queue Prediction for Strategic and Tactical Surface Scheduler Integration

Science.gov (United States)

Zelinski, Shannon; Windhorst, Robert

2016-01-01

A departure metering concept to be demonstrated at Charlotte Douglas International Airport (CLT) will integrate strategic and tactical surface scheduling components to enable the respective collaborative decision making and improved efficiency benefits these two methods of scheduling provide. This study analyzes the effect of tactical scheduling on strategic scheduler predictability. Strategic queue predictions and target gate pushback times to achieve a desired queue length are compared between fast time simulations of CLT surface operations with and without tactical scheduling. The use of variable departure rates as a strategic scheduler input was shown to substantially improve queue predictions over static departure rates. With target queue length calibration, the strategic scheduler can be tuned to produce average delays within one minute of the tactical scheduler. However, root mean square differences between strategic and tactical delays were between 12 and 15 minutes due to the different methods the strategic and tactical schedulers use to predict takeoff times and generate gate pushback clearances. This demonstrates how difficult it is for the strategic scheduler to predict tactical scheduler assigned gate delays on an individual flight basis as the tactical scheduler adjusts departure sequence to accommodate arrival interactions. Strategic/tactical scheduler compatibility may be improved by providing more arrival information to the strategic scheduler and stabilizing tactical scheduler changes to runway sequence in response to arrivals.
Energy-Efficient Integration of Continuous Context Sensing and Prediction into Smartwatches

Directory of Open Access Journals (Sweden)

Reza Rawassizadeh

2015-09-01

Full Text Available As the availability and use of wearables increases, they are becoming a promising platform for context sensing and context analysis. Smartwatches are a particularly interesting platform for this purpose, as they offer salient advantages, such as their proximity to the human body. However, they also have limitations associated with their small form factor, such as processing power and battery life, which makes it difficult to simply transfer smartphone-based context sensing and prediction models to smartwatches. In this paper, we introduce an energy-efficient, generic, integrated framework for continuous context sensing and prediction on smartwatches. Our work extends previous approaches for context sensing and prediction on wrist-mounted wearables that perform predictive analytics outside the device. We offer a generic sensing module and a novel energy-efficient, on-device prediction module that is based on a semantic abstraction approach to convert sensor data into meaningful information objects, similar to human perception of a behavior. Through six evaluations, we analyze the energy efficiency of our framework modules, identify the optimal file structure for data access and demonstrate an increase in accuracy of prediction through our semantic abstraction method. The proposed framework is hardware independent and can serve as a reference model for implementing context sensing and prediction on small wearable devices beyond smartwatches, such as body-mounted cameras.
Energy-Efficient Integration of Continuous Context Sensing and Prediction into Smartwatches.

Science.gov (United States)

Rawassizadeh, Reza; Tomitsch, Martin; Nourizadeh, Manouchehr; Momeni, Elaheh; Peery, Aaron; Ulanova, Liudmila; Pazzani, Michael

2015-09-08

As the availability and use of wearables increases, they are becoming a promising platform for context sensing and context analysis. Smartwatches are a particularly interesting platform for this purpose, as they offer salient advantages, such as their proximity to the human body. However, they also have limitations associated with their small form factor, such as processing power and battery life, which makes it difficult to simply transfer smartphone-based context sensing and prediction models to smartwatches. In this paper, we introduce an energy-efficient, generic, integrated framework for continuous context sensing and prediction on smartwatches. Our work extends previous approaches for context sensing and prediction on wrist-mounted wearables that perform predictive analytics outside the device. We offer a generic sensing module and a novel energy-efficient, on-device prediction module that is based on a semantic abstraction approach to convert sensor data into meaningful information objects, similar to human perception of a behavior. Through six evaluations, we analyze the energy efficiency of our framework modules, identify the optimal file structure for data access and demonstrate an increase in accuracy of prediction through our semantic abstraction method. The proposed framework is hardware independent and can serve as a reference model for implementing context sensing and prediction on small wearable devices beyond smartwatches, such as body-mounted cameras.

Shape-correlated deformation statistics for respiratory motion prediction in 4D lung

Science.gov (United States)

Liu, Xiaoxiao; Oguz, Ipek; Pizer, Stephen M.; Mageras, Gig S.

2010-02-01

4D image-guided radiation therapy (IGRT) for free-breathing lungs is challenging due to the complicated respiratory dynamics. Effective modeling of respiratory motion is crucial to account for the motion affects on the dose to tumors. We propose a shape-correlated statistical model on dense image deformations for patient-specic respiratory motion estimation in 4D lung IGRT. Using the shape deformations of the high-contrast lungs as the surrogate, the statistical model trained from the planning CTs can be used to predict the image deformation during delivery verication time, with the assumption that the respiratory motion at both times are similar for the same patient. Dense image deformation fields obtained by diffeomorphic image registrations characterize the respiratory motion within one breathing cycle. A point-based particle optimization algorithm is used to obtain the shape models of lungs with group-wise surface correspondences. Canonical correlation analysis (CCA) is adopted in training to maximize the linear correlation between the shape variations of the lungs and the corresponding dense image deformations. Both intra- and inter-session CT studies are carried out on a small group of lung cancer patients and evaluated in terms of the tumor location accuracies. The results suggest potential applications using the proposed method.
Fast Quantum Algorithm for Predicting Descriptive Statistics of Stochastic Processes

Science.gov (United States)

Williams Colin P.

1999-01-01

Stochastic processes are used as a modeling tool in several sub-fields of physics, biology, and finance. Analytic understanding of the long term behavior of such processes is only tractable for very simple types of stochastic processes such as Markovian processes. However, in real world applications more complex stochastic processes often arise. In physics, the complicating factor might be nonlinearities; in biology it might be memory effects; and in finance is might be the non-random intentional behavior of participants in a market. In the absence of analytic insight, one is forced to understand these more complex stochastic processes via numerical simulation techniques. In this paper we present a quantum algorithm for performing such simulations. In particular, we show how a quantum algorithm can predict arbitrary descriptive statistics (moments) of N-step stochastic processes in just O(square root of N) time. That is, the quantum complexity is the square root of the classical complexity for performing such simulations. This is a significant speedup in comparison to the current state of the art.
Relative effects of statistical preprocessing and postprocessing on a regional hydrological ensemble prediction system

Science.gov (United States)

Sharma, Sanjib; Siddique, Ridwan; Reed, Seann; Ahnert, Peter; Mendoza, Pablo; Mejia, Alfonso

2018-03-01

The relative roles of statistical weather preprocessing and streamflow postprocessing in hydrological ensemble forecasting at short- to medium-range forecast lead times (day 1-7) are investigated. For this purpose, a regional hydrologic ensemble prediction system (RHEPS) is developed and implemented. The RHEPS is comprised of the following components: (i) hydrometeorological observations (multisensor precipitation estimates, gridded surface temperature, and gauged streamflow); (ii) weather ensemble forecasts (precipitation and near-surface temperature) from the National Centers for Environmental Prediction 11-member Global Ensemble Forecast System Reforecast version 2 (GEFSRv2); (iii) NOAA's Hydrology Laboratory-Research Distributed Hydrologic Model (HL-RDHM); (iv) heteroscedastic censored logistic regression (HCLR) as the statistical preprocessor; (v) two statistical postprocessors, an autoregressive model with a single exogenous variable (ARX(1,1)) and quantile regression (QR); and (vi) a comprehensive verification strategy. To implement the RHEPS, 1 to 7 days weather forecasts from the GEFSRv2 are used to force HL-RDHM and generate raw ensemble streamflow forecasts. Forecasting experiments are conducted in four nested basins in the US Middle Atlantic region, ranging in size from 381 to 12 362 km2. Results show that the HCLR preprocessed ensemble precipitation forecasts have greater skill than the raw forecasts. These improvements are more noticeable in the warm season at the longer lead times (> 3 days). Both postprocessors, ARX(1,1) and QR, show gains in skill relative to the raw ensemble streamflow forecasts, particularly in the cool season, but QR outperforms ARX(1,1). The scenarios that implement preprocessing and postprocessing separately tend to perform similarly, although the postprocessing-alone scenario is often more effective. The scenario involving both preprocessing and postprocessing consistently outperforms the other scenarios. In some cases
An integrative approach to ortholog prediction for disease-focused and other functional studies.

Science.gov (United States)

Hu, Yanhui; Flockhart, Ian; Vinayagam, Arunachalam; Bergwitz, Clemens; Berger, Bonnie; Perrimon, Norbert; Mohr, Stephanie E

2011-08-31

Mapping of orthologous genes among species serves an important role in functional genomics by allowing researchers to develop hypotheses about gene function in one species based on what is known about the functions of orthologs in other species. Several tools for predicting orthologous gene relationships are available. However, these tools can give different results and identification of predicted orthologs is not always straightforward. We report a simple but effective tool, the Drosophila RNAi Screening Center Integrative Ortholog Prediction Tool (DIOPT; http://www.flyrnai.org/diopt), for rapid identification of orthologs. DIOPT integrates existing approaches, facilitating rapid identification of orthologs among human, mouse, zebrafish, C. elegans, Drosophila, and S. cerevisiae. As compared to individual tools, DIOPT shows increased sensitivity with only a modest decrease in specificity. Moreover, the flexibility built into the DIOPT graphical user interface allows researchers with different goals to appropriately 'cast a wide net' or limit results to highest confidence predictions. DIOPT also displays protein and domain alignments, including percent amino acid identity, for predicted ortholog pairs. This helps users identify the most appropriate matches among multiple possible orthologs. To facilitate using model organisms for functional analysis of human disease-associated genes, we used DIOPT to predict high-confidence orthologs of disease genes in Online Mendelian Inheritance in Man (OMIM) and genes in genome-wide association study (GWAS) data sets. The results are accessible through the DIOPT diseases and traits query tool (DIOPT-DIST; http://www.flyrnai.org/diopt-dist). DIOPT and DIOPT-DIST are useful resources for researchers working with model organisms, especially those who are interested in exploiting model organisms such as Drosophila to study the functions of human disease genes.
Controlling cyclic combustion timing variations using a symbol-statistics predictive approach in an HCCI engine

International Nuclear Information System (INIS)

Ghazimirsaied, Ahmad; Koch, Charles Robert

2012-01-01

Highlights: ► Misfire reduction in a combustion engine based on chaotic theory methods. ► Chaotic theory analysis of cyclic variation of a HCCI engine near misfire. ► Symbol sequence approach is used to predict ignition timing one cycle-ahead. ► Prediction is combined with feedback control to lower HCCI combustion variation. ► Feedback control extends the HCCI operating range into the misfire region. -- Abstract: Cyclic variation of a Homogeneous Charge Compression Ignition (HCCI) engine near misfire is analyzed using chaotic theory methods and feedback control is used to stabilize high cyclic variations. Variation of consecutive cycles of θ Pmax (the crank angle of maximum cylinder pressure over an engine cycle) for a Primary Reference Fuel engine is analyzed near misfire operation for five test points with similar conditions but different octane numbers. The return map of the time series of θ Pmax at each combustion cycle reveals the deterministic and random portions of the dynamics near misfire for this HCCI engine. A symbol-statistic approach is used to predict θ Pmax one cycle-ahead. Predicted θ Pmax has similar dynamical behavior to the experimental measurements. Based on this cycle ahead prediction, and using fuel octane as the input, feedback control is used to stabilize the instability of θ Pmax variations at this engine condition near misfire.
A generic statistical methodology to predict the maximum pit depth of a localized corrosion process

International Nuclear Information System (INIS)

Jarrah, A.; Bigerelle, M.; Guillemot, G.; Najjar, D.; Iost, A.; Nianga, J.-M.

2011-01-01

Highlights: → We propose a methodology to predict the maximum pit depth in a corrosion process. → Generalized Lambda Distribution and the Computer Based Bootstrap Method are combined. → GLD fit a large variety of distributions both in their central and tail regions. → Minimum thickness preventing perforation can be estimated with a safety margin. → Considering its applications, this new approach can help to size industrial pieces. - Abstract: This paper outlines a new methodology to predict accurately the maximum pit depth related to a localized corrosion process. It combines two statistical methods: the Generalized Lambda Distribution (GLD), to determine a model of distribution fitting with the experimental frequency distribution of depths, and the Computer Based Bootstrap Method (CBBM), to generate simulated distributions equivalent to the experimental one. In comparison with conventionally established statistical methods that are restricted to the use of inferred distributions constrained by specific mathematical assumptions, the major advantage of the methodology presented in this paper is that both the GLD and the CBBM enable a statistical treatment of the experimental data without making any preconceived choice neither on the unknown theoretical parent underlying distribution of pit depth which characterizes the global corrosion phenomenon nor on the unknown associated theoretical extreme value distribution which characterizes the deepest pits. Considering an experimental distribution of depths of pits produced on an aluminium sample, estimations of maximum pit depth using a GLD model are compared to similar estimations based on usual Gumbel and Generalized Extreme Value (GEV) methods proposed in the corrosion engineering literature. The GLD approach is shown having smaller bias and dispersion in the estimation of the maximum pit depth than the Gumbel approach both for its realization and mean. This leads to comparing the GLD approach to the GEV one
Statistical decay of giant resonances

International Nuclear Information System (INIS)

Dias, H.; Teruya, N.; Wolynec, E.

1986-01-01

Statistical calculations to predict the neutron spectrum resulting from the decay of Giant Resonances are discussed. The dependence of the resutls on the optical potential parametrization and on the level density of the residual nucleus is assessed. A Hauser-Feshbach calculation is performed for the decay of the monople giant resonance in 208 Pb using the experimental levels of 207 Pb from a recent compilation. The calculated statistical decay is in excelent agreement with recent experimental data, showing that the decay of this resonance is dominantly statistical, as predicted by continuum RPA calculations. (Author) [pt
Statistical decay of giant resonances

International Nuclear Information System (INIS)

Dias, H.; Teruya, N.; Wolynec, E.

1986-02-01

Statistical calculations to predict the neutron spectrum resulting from the decay of Giant Resonances are discussed. The dependence of the results on the optical potential parametrization and on the level density of the residual nucleus is assessed. A Hauser-Feshbach calculation is performed for the decay of the monopole giant resonance in 208 Pb using the experimental levels of 207 Pb from a recent compilation. The calculated statistical decay is in excellent agreement with recent experimental data, showing that decay of this resonance is dominantly statistical, as predicted by continuum RPA calculations. (Author) [pt
Early prediction of lung cancer recurrence after stereotactic radiotherapy using second order texture statistics

Science.gov (United States)

Mattonen, Sarah A.; Palma, David A.; Haasbeek, Cornelis J. A.; Senan, Suresh; Ward, Aaron D.

2014-03-01

Benign radiation-induced lung injury is a common finding following stereotactic ablative radiotherapy (SABR) for lung cancer, and is often difficult to differentiate from a recurring tumour due to the ablative doses and highly conformal treatment with SABR. Current approaches to treatment response assessment have shown limited ability to predict recurrence within 6 months of treatment. The purpose of our study was to evaluate the accuracy of second order texture statistics for prediction of eventual recurrence based on computed tomography (CT) images acquired within 6 months of treatment, and compare with the performance of first order appearance and lesion size measures. Consolidative and ground-glass opacity (GGO) regions were manually delineated on post-SABR CT images. Automatic consolidation expansion was also investigated to act as a surrogate for GGO position. The top features for prediction of recurrence were all texture features within the GGO and included energy, entropy, correlation, inertia, and first order texture (standard deviation of density). These predicted recurrence with 2-fold cross validation (CV) accuracies of 70-77% at 2- 5 months post-SABR, with energy, entropy, and first order texture having leave-one-out CV accuracies greater than 80%. Our results also suggest that automatic expansion of the consolidation region could eliminate the need for manual delineation, and produced reproducible results when compared to manually delineated GGO. If validated on a larger data set, this could lead to a clinically useful computer-aided diagnosis system for prediction of recurrence within 6 months of SABR and allow for early salvage therapy for patients with recurrence.
SOCR: Statistics Online Computational Resource

Directory of Open Access Journals (Sweden)

Ivo D. Dinov

2006-10-01

Full Text Available The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR. This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build student's intuition and enhance their learning.
Falling in the elderly: Do statistical models matter for performance criteria of fall prediction? Results from two large population-based studies.

Science.gov (United States)

Kabeshova, Anastasiia; Launay, Cyrille P; Gromov, Vasilii A; Fantino, Bruno; Levinoff, Elise J; Allali, Gilles; Beauchet, Olivier

2016-01-01

To compare performance criteria (i.e., sensitivity, specificity, positive predictive value, negative predictive value, area under receiver operating characteristic curve and accuracy) of linear and non-linear statistical models for fall risk in older community-dwellers. Participants were recruited in two large population-based studies, "Prévention des Chutes, Réseau 4" (PCR4, n=1760, cross-sectional design, retrospective collection of falls) and "Prévention des Chutes Personnes Agées" (PCPA, n=1765, cohort design, prospective collection of falls). Six linear statistical models (i.e., logistic regression, discriminant analysis, Bayes network algorithm, decision tree, random forest, boosted trees), three non-linear statistical models corresponding to artificial neural networks (multilayer perceptron, genetic algorithm and neuroevolution of augmenting topologies [NEAT]) and the adaptive neuro fuzzy interference system (ANFIS) were used. Falls ≥1 characterizing fallers and falls ≥2 characterizing recurrent fallers were used as outcomes. Data of studies were analyzed separately and together. NEAT and ANFIS had better performance criteria compared to other models. The highest performance criteria were reported with NEAT when using PCR4 database and falls ≥1, and with both NEAT and ANFIS when pooling data together and using falls ≥2. However, sensitivity and specificity were unbalanced. Sensitivity was higher than specificity when identifying fallers, whereas the converse was found when predicting recurrent fallers. Our results showed that NEAT and ANFIS were non-linear statistical models with the best performance criteria for the prediction of falls but their sensitivity and specificity were unbalanced, underscoring that models should be used respectively for the screening of fallers and the diagnosis of recurrent fallers. Copyright © 2015 European Federation of Internal Medicine. Published by Elsevier B.V. All rights reserved.
On the estimation of multiple random integrals and U-statistics

CERN Document Server

Major, Péter

2013-01-01

This work starts with the study of those limit theorems in probability theory for which classical methods do not work. In many cases some form of linearization can help to solve the problem, because the linearized version is simpler. But in order to apply such a method we have to show that the linearization causes a negligible error. The estimation of this error leads to some important large deviation type problems, and the main subject of this work is their investigation. We provide sharp estimates of the tail distribution of multiple integrals with respect to a normalized empirical measure and so-called degenerate U-statistics and also of the supremum of appropriate classes of such quantities. The proofs apply a number of useful techniques of modern probability that enable us to investigate the non-linear functionals of independent random variables. This lecture note yields insights into these methods, and may also be useful for those who only want some new tools to help them prove limit theorems when stand...
Statistical Model of Extreme Shear

DEFF Research Database (Denmark)

Larsen, Gunner Chr.; Hansen, Kurt Schaldemose

2004-01-01

In order to continue cost-optimisation of modern large wind turbines, it is important to continously increase the knowledge on wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... by a model that, on a statistically consistent basis, describe the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of high-sampled full-scale time series measurements...... are consistent, given the inevitabel uncertainties associated with model as well as with the extreme value data analysis. Keywords: Statistical model, extreme wind conditions, statistical analysis, turbulence, wind loading, statistical analysis, turbulence, wind loading, wind shear, wind turbines....
Predictive Coding and Multisensory Integration: An Attentional Account of the Multisensory Mind

Directory of Open Access Journals (Sweden)

Durk eTalsma

2015-03-01

Full Text Available Multisensory integration involves a host of different cognitive processes, occurring at different stages of sensory processing. Here I argue that, despite recent insights suggesting that multisensory interactions can occur at very early latencies, the actual integration of individual sensory traces into an internally consistent mental representation is dependent on both top-down and bottom-up processes. Moreover, I argue that this integration is not limited to just sensory inputs, but that internal cognitive processes also shape the resulting mental representation. Studies showing that memory recall is affected by the initial multisensory context in which the stimuli were presented will be discussed, as well as several studies showing that mental imagery can affect multisensory illusions. This empirical evidence will be discussed from a predictive coding perspective, in which a central top-down attentional process is proposed to play a central role in coordinating the integration of all these inputs into a coherent mental representation.
Machine learning and statistical methods for the prediction of maximal oxygen uptake: recent advances.

Science.gov (United States)

Abut, Fatih; Akay, Mehmet Fatih

2015-01-01

Maximal oxygen uptake (VO2max) indicates how many milliliters of oxygen the body can consume in a state of intense exercise per minute. VO2max plays an important role in both sport and medical sciences for different purposes, such as indicating the endurance capacity of athletes or serving as a metric in estimating the disease risk of a person. In general, the direct measurement of VO2max provides the most accurate assessment of aerobic power. However, despite a high level of accuracy, practical limitations associated with the direct measurement of VO2max, such as the requirement of expensive and sophisticated laboratory equipment or trained staff, have led to the development of various regression models for predicting VO2max. Consequently, a lot of studies have been conducted in the last years to predict VO2max of various target audiences, ranging from soccer athletes, nonexpert swimmers, cross-country skiers to healthy-fit adults, teenagers, and children. Numerous prediction models have been developed using different sets of predictor variables and a variety of machine learning and statistical methods, including support vector machine, multilayer perceptron, general regression neural network, and multiple linear regression. The purpose of this study is to give a detailed overview about the data-driven modeling studies for the prediction of VO2max conducted in recent years and to compare the performance of various VO2max prediction models reported in related literature in terms of two well-known metrics, namely, multiple correlation coefficient (R) and standard error of estimate. The survey results reveal that with respect to regression methods used to develop prediction models, support vector machine, in general, shows better performance than other methods, whereas multiple linear regression exhibits the worst performance.
Predicted range expansion of Chinese tallow tree (Triadica sebifera) in forestlands of the southern United States

Science.gov (United States)

Hsiao-Hsuan Wang; William Grant; Todd Swannack; Jianbang Gan; William Rogers; Tomasz Koralewski; James Miller; John W. Taylor Jr.

2011-01-01

We present an integrated approach for predicting future range expansion of an invasive species (Chinese tallow tree) that incorporates statistical forecasting and analytical techniques within a spatially explicit, agent-based, simulation framework.
Testing Predictive Models of Technology Integration in Mexico and the United States

Science.gov (United States)

Velazquez, Cesareo Morales

2008-01-01

Data from Mexico City, Mexico (N = 978) and from Texas, USA (N = 932) were used to test the predictive validity of the teacher professional development component of the Will, Skill, Tool Model of Technology Integration in a cross-cultural context. Structural equation modeling (SEM) was used to test the model. Analyses of these data yielded…
Guidelineness of the parameters using integrated test in down syndrome risk prediction

International Nuclear Information System (INIS)

Lee, Jin Won; Go, Sung Jin; Kang, Se Sik; Kim, Chang Soo

2016-01-01

This study was an evaluation of the significance of each parameter through aimed at pregnant women subjected to screening test(integrated test) in predicting risk of Down syndrome. We retrospectively analysed the correlation of risk of Down's syndrome with Nuchal Translucency(NT) images measured by ultrasound, Pregnancy Associated Plasma Protein A(PAPP-A), alpha-fetoprotein(AFP), unconjugated estriol(uE3), human chorionic gonadotrophin(hCG) and Inhibin A by maternal serum. As a result, a significant correlation with NT, uE3, hCG, Inhibin A is revealed with Down's syndrome risk(P<.001). In ROC analysis, AUC of Inhibin A is analysed as the biggest predictor of Down's syndrome(0.859). And the criterion for cut-off was inhibin A 1.4 MoM(sensitivity 81.8%, specificity 75.9%). In conclusion, Inhibin A was the most useful in parameters to predict Down's syndrome in the integrated test. If we make up for the weakness based on the cut-off value of parameters they will be able to be used as an independent indicator in the risk of Down's syndrome screening
Statistical modelling coupled with LC-MS analysis to predict human upper intestinal absorption of phytochemical mixtures.

Science.gov (United States)

Selby-Pham, Sophie N B; Howell, Kate S; Dunshea, Frank R; Ludbey, Joel; Lutz, Adrian; Bennett, Louise

2018-04-15

A diet rich in phytochemicals confers benefits for health by reducing the risk of chronic diseases via regulation of oxidative stress and inflammation (OSI). For optimal protective bio-efficacy, the time required for phytochemicals and their metabolites to reach maximal plasma concentrations (T max ) should be synchronised with the time of increased OSI. A statistical model has been reported to predict T max of individual phytochemicals based on molecular mass and lipophilicity. We report the application of the model for predicting the absorption profile of an uncharacterised phytochemical mixture, herein referred to as the 'functional fingerprint'. First, chemical profiles of phytochemical extracts were acquired using liquid chromatography mass spectrometry (LC-MS), then the molecular features for respective components were used to predict their plasma absorption maximum, based on molecular mass and lipophilicity. This method of 'functional fingerprinting' of plant extracts represents a novel tool for understanding and optimising the health efficacy of plant extracts. Copyright © 2017 Elsevier Ltd. All rights reserved.
Two sample Bayesian prediction intervals for order statistics based on the inverse exponential-type distributions using right censored sample

Directory of Open Access Journals (Sweden)

M.M. Mohie El-Din

2011-10-01

Full Text Available In this paper, two sample Bayesian prediction intervals for order statistics (OS are obtained. This prediction is based on a certain class of the inverse exponential-type distributions using a right censored sample. A general class of prior density functions is used and the predictive cumulative function is obtained in the two samples case. The class of the inverse exponential-type distributions includes several important distributions such the inverse Weibull distribution, the inverse Burr distribution, the loglogistic distribution, the inverse Pareto distribution and the inverse paralogistic distribution. Special cases of the inverse Weibull model such as the inverse exponential model and the inverse Rayleigh model are considered.

C-terminal motif prediction in eukaryotic proteomes using comparative genomics and statistical over-representation across protein families

Directory of Open Access Journals (Sweden)

Cutler Sean R

2007-06-01

Full Text Available Abstract Background The carboxy termini of proteins are a frequent site of activity for a variety of biologically important functions, ranging from post-translational modification to protein targeting. Several short peptide motifs involved in protein sorting roles and dependent upon their proximity to the C-terminus for proper function have already been characterized. As a limited number of such motifs have been identified, the potential exists for genome-wide statistical analysis and comparative genomics to reveal novel peptide signatures functioning in a C-terminal dependent manner. We have applied a novel methodology to the prediction of C-terminal-anchored peptide motifs involving a simple z-statistic and several techniques for improving the signal-to-noise ratio. Results We examined the statistical over-representation of position-specific C-terminal tripeptides in 7 eukaryotic proteomes. Sequence randomization models and simple-sequence masking were applied to the successful reduction of background noise. Similarly, as C-terminal homology among members of large protein families may artificially inflate tripeptide counts in an irrelevant and obfuscating manner, gene-family clustering was performed prior to the analysis in order to assess tripeptide over-representation across protein families as opposed to across all proteins. Finally, comparative genomics was used to identify tripeptides significantly occurring in multiple species. This approach has been able to predict, to our knowledge, all C-terminally anchored targeting motifs present in the literature. These include the PTS1 peroxisomal targeting signal (SKL*, the ER-retention signal (K/HDEL*, the ER-retrieval signal for membrane bound proteins (KKxx*, the prenylation signal (CC* and the CaaX box prenylation motif. In addition to a high statistical over-representation of these known motifs, a collection of significant tripeptides with a high propensity for biological function exists
Statistical Optics

Science.gov (United States)

Goodman, Joseph W.

2000-07-01

The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Robert G. Bartle The Elements of Integration and Lebesgue Measure George E. P. Box & Norman R. Draper Evolutionary Operation: A Statistical Method for Process Improvement George E. P. Box & George C. Tiao Bayesian Inference in Statistical Analysis R. W. Carter Finite Groups of Lie Type: Conjugacy Classes and Complex Characters R. W. Carter Simple Groups of Lie Type William G. Cochran & Gertrude M. Cox Experimental Designs, Second Edition Richard Courant Differential and Integral Calculus, Volume I RIchard Courant Differential and Integral Calculus, Volume II Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume I Richard Courant & D. Hilbert Methods of Mathematical Physics, Volume II D. R. Cox Planning of Experiments Harold S. M. Coxeter Introduction to Geometry, Second Edition Charles W. Curtis & Irving Reiner Representation Theory of Finite Groups and Associative Algebras Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume I Charles W. Curtis & Irving Reiner Methods of Representation Theory with Applications to Finite Groups and Orders, Volume II Cuthbert Daniel Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition Bruno de Finetti Theory of Probability, Volume I Bruno de Finetti Theory of Probability, Volume 2 W. Edwards Deming Sample Design in Business Research
The Integrated Medical Model: A Probabilistic Simulation Model Predicting In-Flight Medical Risks

Science.gov (United States)

Keenan, Alexandra; Young, Millennia; Saile, Lynn; Boley, Lynn; Walton, Marlei; Kerstman, Eric; Shah, Ronak; Goodenow, Debra A.; Myers, Jerry G., Jr.

2015-01-01

The Integrated Medical Model (IMM) is a probabilistic model that uses simulation to predict mission medical risk. Given a specific mission and crew scenario, medical events are simulated using Monte Carlo methodology to provide estimates of resource utilization, probability of evacuation, probability of loss of crew, and the amount of mission time lost due to illness. Mission and crew scenarios are defined by mission length, extravehicular activity (EVA) schedule, and crew characteristics including: sex, coronary artery calcium score, contacts, dental crowns, history of abdominal surgery, and EVA eligibility. The Integrated Medical Evidence Database (iMED) houses the model inputs for one hundred medical conditions using in-flight, analog, and terrestrial medical data. Inputs include incidence, event durations, resource utilization, and crew functional impairment. Severity of conditions is addressed by defining statistical distributions on the dichotomized best and worst-case scenarios for each condition. The outcome distributions for conditions are bounded by the treatment extremes of the fully treated scenario in which all required resources are available and the untreated scenario in which no required resources are available. Upon occurrence of a simulated medical event, treatment availability is assessed, and outcomes are generated depending on the status of the affected crewmember at the time of onset, including any pre-existing functional impairments or ongoing treatment of concurrent conditions. The main IMM outcomes, including probability of evacuation and loss of crew life, time lost due to medical events, and resource utilization, are useful in informing mission planning decisions. To date, the IMM has been used to assess mission-specific risks with and without certain crewmember characteristics, to determine the impact of eliminating certain resources from the mission medical kit, and to design medical kits that maximally benefit crew health while meeting
Integrating statistical and process-based models to produce probabilistic landslide hazard at regional scale

Science.gov (United States)

Strauch, R. L.; Istanbulluoglu, E.

2017-12-01

We develop a landslide hazard modeling approach that integrates a data-driven statistical model and a probabilistic process-based shallow landslide model for mapping probability of landslide initiation, transport, and deposition at regional scales. The empirical model integrates the influence of seven site attribute (SA) classes: elevation, slope, curvature, aspect, land use-land cover, lithology, and topographic wetness index, on over 1,600 observed landslides using a frequency ratio (FR) approach. A susceptibility index is calculated by adding FRs for each SA on a grid-cell basis. Using landslide observations we relate susceptibility index to an empirically-derived probability of landslide impact. This probability is combined with results from a physically-based model to produce an integrated probabilistic map. Slope was key in landslide initiation while deposition was linked to lithology and elevation. Vegetation transition from forest to alpine vegetation and barren land cover with lower root cohesion leads to higher frequency of initiation. Aspect effects are likely linked to differences in root cohesion and moisture controlled by solar insulation and snow. We demonstrate the model in the North Cascades of Washington, USA and identify locations of high and low probability of landslide impacts that can be used by land managers in their design, planning, and maintenance.
Integrating principal component analysis and vector quantization with support vector regression for sulfur content prediction in HDS process

Directory of Open Access Journals (Sweden)

Shokri Saeid

2015-01-01

Full Text Available An accurate prediction of sulfur content is very important for the proper operation and product quality control in hydrodesulfurization (HDS process. For this purpose, a reliable data- driven soft sensors utilizing Support Vector Regression (SVR was developed and the effects of integrating Vector Quantization (VQ with Principle Component Analysis (PCA were studied on the assessment of this soft sensor. First, in pre-processing step the PCA and VQ techniques were used to reduce dimensions of the original input datasets. Then, the compressed datasets were used as input variables for the SVR model. Experimental data from the HDS setup were employed to validate the proposed integrated model. The integration of VQ/PCA techniques with SVR model was able to increase the prediction accuracy of SVR. The obtained results show that integrated technique (VQ-SVR was better than (PCA-SVR in prediction accuracy. Also, VQ decreased the sum of the training and test time of SVR model in comparison with PCA. For further evaluation, the performance of VQ-SVR model was also compared to that of SVR. The obtained results indicated that VQ-SVR model delivered the best satisfactory predicting performance (AARE= 0.0668 and R2= 0.995 in comparison with investigated models.
An integrative approach to ortholog prediction for disease-focused and other functional studies

Directory of Open Access Journals (Sweden)

Perrimon Norbert

2011-08-01

Full Text Available Abstract Background Mapping of orthologous genes among species serves an important role in functional genomics by allowing researchers to develop hypotheses about gene function in one species based on what is known about the functions of orthologs in other species. Several tools for predicting orthologous gene relationships are available. However, these tools can give different results and identification of predicted orthologs is not always straightforward. Results We report a simple but effective tool, the Drosophila RNAi Screening Center Integrative Ortholog Prediction Tool (DIOPT; http://www.flyrnai.org/diopt, for rapid identification of orthologs. DIOPT integrates existing approaches, facilitating rapid identification of orthologs among human, mouse, zebrafish, C. elegans, Drosophila, and S. cerevisiae. As compared to individual tools, DIOPT shows increased sensitivity with only a modest decrease in specificity. Moreover, the flexibility built into the DIOPT graphical user interface allows researchers with different goals to appropriately 'cast a wide net' or limit results to highest confidence predictions. DIOPT also displays protein and domain alignments, including percent amino acid identity, for predicted ortholog pairs. This helps users identify the most appropriate matches among multiple possible orthologs. To facilitate using model organisms for functional analysis of human disease-associated genes, we used DIOPT to predict high-confidence orthologs of disease genes in Online Mendelian Inheritance in Man (OMIM and genes in genome-wide association study (GWAS data sets. The results are accessible through the DIOPT diseases and traits query tool (DIOPT-DIST; http://www.flyrnai.org/diopt-dist. Conclusions DIOPT and DIOPT-DIST are useful resources for researchers working with model organisms, especially those who are interested in exploiting model organisms such as Drosophila to study the functions of human disease genes.
REMAINING LIFE TIME PREDICTION OF BEARINGS USING K-STAR ALGORITHM – A STATISTICAL APPROACH

Directory of Open Access Journals (Sweden)

R. SATISHKUMAR

2017-01-01

Full Text Available The role of bearings is significant in reducing the down time of all rotating machineries. The increasing trend of bearing failures in recent times has triggered the need and importance of deployment of condition monitoring. There are multiple factors associated to a bearing failure while it is in operation. Hence, a predictive strategy is required to evaluate the current state of the bearings in operation. In past, predictive models with regression techniques were widely used for bearing lifetime estimations. The Objective of this paper is to estimate the remaining useful life of bearings through a machine learning approach. The ultimate objective of this study is to strengthen the predictive maintenance. The present study was done using classification approach following the concepts of machine learning and a predictive model was built to calculate the residual lifetime of bearings in operation. Vibration signals were acquired on a continuous basis from an experiment wherein the bearings are made to run till it fails naturally. It should be noted that the experiment was carried out with new bearings at pre-defined load and speed conditions until the bearing fails on its own. In the present work, statistical features were deployed and feature selection process was carried out using J48 decision tree and selected features were used to develop the prognostic model. The K-Star classification algorithm, a supervised machine learning technique is made use of in building a predictive model to estimate the lifetime of bearings. The performance of classifier was cross validated with distinct data. The result shows that the K-Star classification model gives 98.56% classification accuracy with selected features.
A Predictive Approach to Network Reverse-Engineering

Science.gov (United States)

Wiggins, Chris

2005-03-01

A central challenge of systems biology is the ``reverse engineering" of transcriptional networks: inferring which genes exert regulatory control over which other genes. Attempting such inference at the genomic scale has only recently become feasible, via data-intensive biological innovations such as DNA microrrays (``DNA chips") and the sequencing of whole genomes. In this talk we present a predictive approach to network reverse-engineering, in which we integrate DNA chip data and sequence data to build a model of the transcriptional network of the yeast S. cerevisiae capable of predicting the response of genes in unseen experiments. The technique can also be used to extract ``motifs,'' sequence elements which act as binding sites for regulatory proteins. We validate by a number of approaches and present comparison of theoretical prediction vs. experimental data, along with biological interpretations of the resulting model. En route, we will illustrate some basic notions in statistical learning theory (fitting vs. over-fitting; cross- validation; assessing statistical significance), highlighting ways in which physicists can make a unique contribution in data- driven approaches to reverse engineering.
Machine learning and statistical methods for the prediction of maximal oxygen uptake: recent advances

Directory of Open Access Journals (Sweden)

Abut F

2015-08-01

Full Text Available Fatih Abut, Mehmet Fatih AkayDepartment of Computer Engineering, Çukurova University, Adana, TurkeyAbstract: Maximal oxygen uptake (VO2max indicates how many milliliters of oxygen the body can consume in a state of intense exercise per minute. VO2max plays an important role in both sport and medical sciences for different purposes, such as indicating the endurance capacity of athletes or serving as a metric in estimating the disease risk of a person. In general, the direct measurement of VO2max provides the most accurate assessment of aerobic power. However, despite a high level of accuracy, practical limitations associated with the direct measurement of VO2max, such as the requirement of expensive and sophisticated laboratory equipment or trained staff, have led to the development of various regression models for predicting VO2max. Consequently, a lot of studies have been conducted in the last years to predict VO2max of various target audiences, ranging from soccer athletes, nonexpert swimmers, cross-country skiers to healthy-fit adults, teenagers, and children. Numerous prediction models have been developed using different sets of predictor variables and a variety of machine learning and statistical methods, including support vector machine, multilayer perceptron, general regression neural network, and multiple linear regression. The purpose of this study is to give a detailed overview about the data-driven modeling studies for the prediction of VO2max conducted in recent years and to compare the performance of various VO2max prediction models reported in related literature in terms of two well-known metrics, namely, multiple correlation coefficient (R and standard error of estimate. The survey results reveal that with respect to regression methods used to develop prediction models, support vector machine, in general, shows better performance than other methods, whereas multiple linear regression exhibits the worst performance
Statistics Anxiety and Instructor Immediacy

Science.gov (United States)

Williams, Amanda S.

2010-01-01

The purpose of this study was to investigate the relationship between instructor immediacy and statistics anxiety. It was predicted that students receiving immediacy would report lower levels of statistics anxiety. Using a pretest-posttest-control group design, immediacy was measured using the Instructor Immediacy scale. Statistics anxiety was…
Accurate diffraction data integration by the EVAL15 profile prediction method : Application in chemical and biological crystallography

NARCIS (Netherlands)

Xian, X.

2009-01-01

Accurate integration of reflection intensities plays an essential role in structure determination of the crystallized compound. A new diffraction data integration method, EVAL15, is presented in this thesis. This method uses the principle of general impacts to predict ab inito three-dimensional
First trimester prediction of maternal glycemic status.

Science.gov (United States)

Gabbay-Benziv, Rinat; Doyle, Lauren E; Blitzer, Miriam; Baschat, Ahmet A

2015-05-01

To predict gestational diabetes mellitus (GDM) or normoglycemic status using first trimester maternal characteristics. We used data from a prospective cohort study. First trimester maternal characteristics were compared between women with and without GDM. Association of these variables with sugar values at glucose challenge test (GCT) and subsequent GDM was tested to identify key parameters. A predictive algorithm for GDM was developed and receiver operating characteristics (ROC) statistics was used to derive the optimal risk score. We defined normoglycemic state, when GCT and all four sugar values at oral glucose tolerance test, whenever obtained, were normal. Using same statistical approach, we developed an algorithm to predict the normoglycemic state. Maternal age, race, prior GDM, first trimester BMI, and systolic blood pressure (SBP) were all significantly associated with GDM. Age, BMI, and SBP were also associated with GCT values. The logistic regression analysis constructed equation and the calculated risk score yielded sensitivity, specificity, positive predictive value, and negative predictive value of 85%, 62%, 13.8%, and 98.3% for a cut-off value of 0.042, respectively (ROC-AUC - area under the curve 0.819, CI - confidence interval 0.769-0.868). The model constructed for normoglycemia prediction demonstrated lower performance (ROC-AUC 0.707, CI 0.668-0.746). GDM prediction can be achieved during the first trimester encounter by integration of maternal characteristics and basic measurements while normoglycemic status prediction is less effective.
Dynamics and spike trains statistics in conductance-based integrate-and-fire neural networks with chemical and electric synapses

International Nuclear Information System (INIS)

Cofré, Rodrigo; Cessac, Bruno

2013-01-01

We investigate the effect of electric synapses (gap junctions) on collective neuronal dynamics and spike statistics in a conductance-based integrate-and-fire neural network, driven by Brownian noise, where conductances depend upon spike history. We compute explicitly the time evolution operator and show that, given the spike-history of the network and the membrane potentials at a given time, the further dynamical evolution can be written in a closed form. We show that spike train statistics is described by a Gibbs distribution whose potential can be approximated with an explicit formula, when the noise is weak. This potential form encompasses existing models for spike trains statistics analysis such as maximum entropy models or generalized linear models (GLM). We also discuss the different types of correlations: those induced by a shared stimulus and those induced by neurons interactions
Stock return predictability and market integration: The role of global and local information

Directory of Open Access Journals (Sweden)

David G. McMillan

2016-12-01

Full Text Available This paper examines the predictability of a range of international stock markets where we allow the presence of both local and global predictive factors. Recent research has argued that US returns have predictive power for international stock returns. We expand this line of research, following work on market integration, to include a more general definition of the global factor, based on principal components analysis. Results identify three global expected returns factors, one related to the major stock markets of the US, UK and Asia and one related to the other markets analysed. The third component is related to dividend growth. A single dominant realised returns factor is also noted. A forecasting exercise comparing the principal components based factors to a US return factor and local market only factors, as well as the historical mean benchmark finds supportive evidence for the former approach. It is hoped that the results from this paper will be informative on three counts. First, to academics interested in understanding the dynamics asset price movement. Second, to market participants who aim to time the market and engage in portfolio and risk management. Third, to those (policy makers and others who are interested in linkages across international markets and the nature and degree of integration.
Prediction of failure enthalpy and reliability of irradiated fuel rod under reactivity-initiated accidents by means of statistical approach

International Nuclear Information System (INIS)

Nam, Cheol; Choi, Byeong Kwon; Jeong, Yong Hwan; Jung, Youn Ho

2001-01-01

During the last decade, the failure behavior of high-burnup fuel rods under RIA has been an extensive concern since observations of fuel rod failures at low enthalpy. Of great importance is placed on failure prediction of fuel rod in the point of licensing criteria and safety in extending burnup achievement. To address the issue, a statistics-based methodology is introduced to predict failure probability of irradiated fuel rods. Based on RIA simulation results in literature, a failure enthalpy correlation for irradiated fuel rod is constructed as a function of oxide thickness, fuel burnup, and pulse width. From the failure enthalpy correlation, a single damage parameter, equivalent enthalpy, is defined to reflect the effects of the three primary factors as well as peak fuel enthalpy. Moreover, the failure distribution function with equivalent enthalpy is derived, applying a two-parameter Weibull statistical model. Using these equations, the sensitivity analysis is carried out to estimate the effects of burnup, corrosion, peak fuel enthalpy, pulse width and cladding materials used
Computational Efficient Upscaling Methodology for Predicting Thermal Conductivity of Nuclear Waste forms

International Nuclear Information System (INIS)

Li, Dongsheng; Sun, Xin; Khaleel, Mohammad A.

2011-01-01

This study evaluated different upscaling methods to predict thermal conductivity in loaded nuclear waste form, a heterogeneous material system. The efficiency and accuracy of these methods were compared. Thermal conductivity in loaded nuclear waste form is an important property specific to scientific researchers, in waste form Integrated performance and safety code (IPSC). The effective thermal conductivity obtained from microstructure information and local thermal conductivity of different components is critical in predicting the life and performance of waste form during storage. How the heat generated during storage is directly related to thermal conductivity, which in turn determining the mechanical deformation behavior, corrosion resistance and aging performance. Several methods, including the Taylor model, Sachs model, self-consistent model, and statistical upscaling models were developed and implemented. Due to the absence of experimental data, prediction results from finite element method (FEM) were used as reference to determine the accuracy of different upscaling models. Micrographs from different loading of nuclear waste were used in the prediction of thermal conductivity. Prediction results demonstrated that in term of efficiency, boundary models (Taylor and Sachs model) are better than self consistent model, statistical upscaling method and FEM. Balancing the computation resource and accuracy, statistical upscaling is a computational efficient method in predicting effective thermal conductivity for nuclear waste form.
Flow prediction models using macroclimatic variables and multivariate statistical techniques in the Cauca River Valley

International Nuclear Information System (INIS)

Carvajal Escobar Yesid; Munoz, Flor Matilde

2007-01-01

The project this centred in the revision of the state of the art of the ocean-atmospheric phenomena that you affect the Colombian hydrology especially The Phenomenon Enos that causes a socioeconomic impact of first order in our country, it has not been sufficiently studied; therefore it is important to approach the thematic one, including the variable macroclimates associated to the Enos in the analyses of water planning. The analyses include revision of statistical techniques of analysis of consistency of hydrological data with the objective of conforming a database of monthly flow of the river reliable and homogeneous Cauca. Statistical methods are used (Analysis of data multivariante) specifically The analysis of principal components to involve them in the development of models of prediction of flows monthly means in the river Cauca involving the Lineal focus as they are the model autoregressive AR, ARX and Armax and the focus non lineal Net Artificial Network.
Time series prediction: statistical and neural techniques

Science.gov (United States)

Zahirniak, Daniel R.; DeSimio, Martin P.

1996-03-01

In this paper we compare the performance of nonlinear neural network techniques to those of linear filtering techniques in the prediction of time series. Specifically, we compare the results of using the nonlinear systems, known as multilayer perceptron and radial basis function neural networks, with the results obtained using the conventional linear Wiener filter, Kalman filter and Widrow-Hoff adaptive filter in predicting future values of stationary and non- stationary time series. Our results indicate the performance of each type of system is heavily dependent upon the form of the time series being predicted and the size of the system used. In particular, the linear filters perform adequately for linear or near linear processes while the nonlinear systems perform better for nonlinear processes. Since the linear systems take much less time to be developed, they should be tried prior to using the nonlinear systems when the linearity properties of the time series process are unknown.
Prediction and reconstruction of future and missing unobservable modified Weibull lifetime based on generalized order statistics

Directory of Open Access Journals (Sweden)

Amany E. Aly

2016-04-01

Full Text Available When a system consisting of independent components of the same type, some appropriate actions may be done as soon as a portion of them have failed. It is, therefore, important to be able to predict later failure times from earlier ones. One of the well-known failure distributions commonly used to model component life, is the modified Weibull distribution (MWD. In this paper, two pivotal quantities are proposed to construct prediction intervals for future unobservable lifetimes based on generalized order statistics (gos from MWD. Moreover, a pivotal quantity is developed to reconstruct missing observations at the beginning of experiment. Furthermore, Monte Carlo simulation studies are conducted and numerical computations are carried out to investigate the efficiency of presented results. Finally, two illustrative examples for real data sets are analyzed.
Statistical prediction of parametric roll using FORM

DEFF Research Database (Denmark)

Jensen, Jørgen Juncher; Choi, Ju-hyuck; Nielsen, Ulrik Dam

2017-01-01

Previous research has shown that the First Order Reliability Method (FORM) can be an efficient method for estimation of outcrossing rates and extreme value statistics for stationary stochastic processes. This is so also for bifurcation type of processes like parametric roll of ships. The present...

Statistical prediction of immunity to placental malaria based on multi-assay antibody data for malarial antigens

DEFF Research Database (Denmark)

Siriwardhana, Chathura; Fang, Rui; Salanti, Ali

2017-01-01

Background Plasmodium falciparum infections are especially severe in pregnant women because infected erythrocytes (IE) express VAR2CSA, a ligand that binds to placental trophoblasts, causing IE to accumulate in the placenta. Resulting inflammation and pathology increases a woman’s risk of anemia...... to 28 malarial antigens and used the data to develop statistical models for predicting if a woman has sufficient immunity to prevent PM. Methods Archival plasma samples from 1377 women were screened in a bead-based multiplex assay for Ab to 17 VAR2CSA-associated antigens (full length VAR2CSA (FV2), DBL...... in the following seven statistical approaches: logistic regression full model, logistic regression reduced model, recursive partitioning, random forests, linear discriminant analysis, quadratic discriminant analysis, and support vector machine. Results The best and simplest model proved to be the logistic...
Accuracy statistics in predicting Independent Activities of Daily Living (IADL) capacity with comprehensive and brief neuropsychological test batteries.

Science.gov (United States)

Karzmark, Peter; Deutsch, Gayle K

2018-01-01

This investigation was designed to determine the predictive accuracy of a comprehensive neuropsychological and brief neuropsychological test battery with regard to the capacity to perform instrumental activities of daily living (IADLs). Accuracy statistics that included measures of sensitivity, specificity, positive and negative predicted power and positive likelihood ratio were calculated for both types of batteries. The sample was drawn from a general neurological group of adults (n = 117) that included a number of older participants (age >55; n = 38). Standardized neuropsychological assessments were administered to all participants and were comprised of the Halstead Reitan Battery and portions of the Wechsler Adult Intelligence Scale-III. A comprehensive test battery yielded a moderate increase over base-rate in predictive accuracy that generalized to older individuals. There was only limited support for using a brief battery, for although sensitivity was high, specificity was low. We found that a comprehensive neuropsychological test battery provided good classification accuracy for predicting IADL capacity.
Statistical model of natural stimuli predicts edge-like pooling of spatial frequency channels in V2

Directory of Open Access Journals (Sweden)

Gutmann Michael

2005-02-01

Full Text Available Abstract Background It has been shown that the classical receptive fields of simple and complex cells in the primary visual cortex emerge from the statistical properties of natural images by forcing the cell responses to be maximally sparse or independent. We investigate how to learn features beyond the primary visual cortex from the statistical properties of modelled complex-cell outputs. In previous work, we showed that a new model, non-negative sparse coding, led to the emergence of features which code for contours of a given spatial frequency band. Results We applied ordinary independent component analysis to modelled outputs of complex cells that span different frequency bands. The analysis led to the emergence of features which pool spatially coherent across-frequency activity in the modelled primary visual cortex. Thus, the statistically optimal way of processing complex-cell outputs abandons separate frequency channels, while preserving and even enhancing orientation tuning and spatial localization. As a technical aside, we found that the non-negativity constraint is not necessary: ordinary independent component analysis produces essentially the same results as our previous work. Conclusion We propose that the pooling that emerges allows the features to code for realistic low-level image features related to step edges. Further, the results prove the viability of statistical modelling of natural images as a framework that produces quantitative predictions of visual processing.
Saccadic gain adaptation is predicted by the statistics of natural fluctuations in oculomotor function

Directory of Open Access Journals (Sweden)

Mark V Albert

2012-12-01

Full Text Available Due to multiple factors such as fatigue, muscle strengthening, and neural plasticity, the responsiveness of the motor apparatus to neural commands changes over time. To enable precise movements the nervous system must adapt to compensate for these changes. Recent models of motor adaptation derive from assumptions about the way the motor apparatus changes. Characterizing these changes is difficult because motor adaptation happens at the same time, masking most of the effects of ongoing changes. Here, we analyze eye movements of monkeys with lesions to the posterior cerebellar vermis that impair adaptation. Their fluctuations better reveal the underlying changes of the motor system over time. When these measured, unadapted changes are used to derive optimal motor adaptation rules the prediction precision significantly improves. Among three models that similarly fit single-day adaptation results, the model that also matches the temporal correlations of the nonadapting saccades most accurately predicts multiple day adaptation. Saccadic gain adaptation is well matched to the natural statistics of fluctuations of the oculomotor plant.
Guidelineness of the parameters using integrated test in down syndrome risk prediction

Energy Technology Data Exchange (ETDEWEB)

Lee, Jin Won [Graduate School of Catholic University of Pusan, Busan (Korea, Republic of); Go, Sung Jin; Kang, Se Sik; Kim, Chang Soo [Dept. Radiological Science, College of Health Sciences, Catholic University of Pusan, Busan (Korea, Republic of)

2016-12-15

This study was an evaluation of the significance of each parameter through aimed at pregnant women subjected to screening test(integrated test) in predicting risk of Down syndrome. We retrospectively analysed the correlation of risk of Down's syndrome with Nuchal Translucency(NT) images measured by ultrasound, Pregnancy Associated Plasma Protein A(PAPP-A), alpha-fetoprotein(AFP), unconjugated estriol(uE3), human chorionic gonadotrophin(hCG) and Inhibin A by maternal serum. As a result, a significant correlation with NT, uE3, hCG, Inhibin A is revealed with Down's syndrome risk(P<.001). In ROC analysis, AUC of Inhibin A is analysed as the biggest predictor of Down's syndrome(0.859). And the criterion for cut-off was inhibin A 1.4 MoM(sensitivity 81.8%, specificity 75.9%). In conclusion, Inhibin A was the most useful in parameters to predict Down's syndrome in the integrated test. If we make up for the weakness based on the cut-off value of parameters they will be able to be used as an independent indicator in the risk of Down's syndrome screening.
In silico environmental chemical science: properties and processes from statistical and computational modelling

Energy Technology Data Exchange (ETDEWEB)

Tratnyek, P. G.; Bylaska, Eric J.; Weber, Eric J.

2017-01-01

Quantitative structure–activity relationships (QSARs) have long been used in the environmental sciences. More recently, molecular modeling and chemoinformatic methods have become widespread. These methods have the potential to expand and accelerate advances in environmental chemistry because they complement observational and experimental data with “in silico” results and analysis. The opportunities and challenges that arise at the intersection between statistical and theoretical in silico methods are most apparent in the context of properties that determine the environmental fate and effects of chemical contaminants (degradation rate constants, partition coefficients, toxicities, etc.). The main example of this is the calibration of QSARs using descriptor variable data calculated from molecular modeling, which can make QSARs more useful for predicting property data that are unavailable, but also can make them more powerful tools for diagnosis of fate determining pathways and mechanisms. Emerging opportunities for “in silico environmental chemical science” are to move beyond the calculation of specific chemical properties using statistical models and toward more fully in silico models, prediction of transformation pathways and products, incorporation of environmental factors into model predictions, integration of databases and predictive models into more comprehensive and efficient tools for exposure assessment, and extending the applicability of all the above from chemicals to biologicals and materials.
In silico environmental chemical science: properties and processes from statistical and computational modelling.

Science.gov (United States)

Tratnyek, Paul G; Bylaska, Eric J; Weber, Eric J

2017-03-22

Quantitative structure-activity relationships (QSARs) have long been used in the environmental sciences. More recently, molecular modeling and chemoinformatic methods have become widespread. These methods have the potential to expand and accelerate advances in environmental chemistry because they complement observational and experimental data with "in silico" results and analysis. The opportunities and challenges that arise at the intersection between statistical and theoretical in silico methods are most apparent in the context of properties that determine the environmental fate and effects of chemical contaminants (degradation rate constants, partition coefficients, toxicities, etc.). The main example of this is the calibration of QSARs using descriptor variable data calculated from molecular modeling, which can make QSARs more useful for predicting property data that are unavailable, but also can make them more powerful tools for diagnosis of fate determining pathways and mechanisms. Emerging opportunities for "in silico environmental chemical science" are to move beyond the calculation of specific chemical properties using statistical models and toward more fully in silico models, prediction of transformation pathways and products, incorporation of environmental factors into model predictions, integration of databases and predictive models into more comprehensive and efficient tools for exposure assessment, and extending the applicability of all the above from chemicals to biologicals and materials.
Improvement of Risk Prediction After Transcatheter Aortic Valve Replacement by Combining Frailty With Conventional Risk Scores.

Science.gov (United States)

Schoenenberger, Andreas W; Moser, André; Bertschi, Dominic; Wenaweser, Peter; Windecker, Stephan; Carrel, Thierry; Stuck, Andreas E; Stortecky, Stefan

2018-02-26

This study sought to evaluate whether frailty improves mortality prediction in combination with the conventional scores. European System for Cardiac Operative Risk Evaluation (EuroSCORE) or Society of Thoracic Surgeons (STS) score have not been evaluated in combined models with frailty for mortality prediction after transcatheter aortic valve replacement (TAVR). This prospective cohort comprised 330 consecutive TAVR patients ≥70 years of age. Conventional scores and a frailty index (based on assessment of cognition, mobility, nutrition, and activities of daily living) were evaluated to predict 1-year all-cause mortality using Cox proportional hazards regression (providing hazard ratios [HRs] with confidence intervals [CIs]) and measures of test performance (providing likelihood ratio [LR] chi-square test statistic and C-statistic [CS]). All risk scores were predictive of the outcome (EuroSCORE, HR: 1.90 [95% CI: 1.45 to 2.48], LR chi-square test statistic 19.29, C-statistic 0.67; STS score, HR: 1.51 [95% CI: 1.21 to 1.88], LR chi-square test statistic 11.05, C-statistic 0.64; frailty index, HR: 3.29 [95% CI: 1.98 to 5.47], LR chi-square test statistic 22.28, C-statistic 0.66). A combination of the frailty index with either EuroSCORE (LR chi-square test statistic 38.27, C-statistic 0.72) or STS score (LR chi-square test statistic 28.71, C-statistic 0.68) improved mortality prediction. The frailty index accounted for 58.2% and 77.6% of the predictive information in the combined model with EuroSCORE and STS score, respectively. Net reclassification improvement and integrated discrimination improvement confirmed that the added frailty index improved risk prediction. This is the first study showing that the assessment of frailty significantly enhances prediction of 1-year mortality after TAVR in combined risk models with conventional risk scores and relevantly contributes to this improvement. Copyright © 2018 American College of Cardiology Foundation
Analysis of predicted and measured performance of an integrated compound parabolic concentrator (ICPC)

Energy Technology Data Exchange (ETDEWEB)

Winston, R.; O' Gallagher, J.J.; Muschaweck, J.; Mahoney, A.R.; Dudley, V.

1999-07-01

A variety of configurations of evacuated Integrated Compound Parabolic Concentrator (ICPC) tubes have been under development for many years. A particularly favorable optical design corresponds to the unit concentration limit for a fin CPC solution which is then coupled to a practical, thin, wedge-shaped absorber. Prototype collector modules using tubes with two different fin orientations (horizontal and vertical) have been fabricated and tested. Comprehensive measurements of the optical characteristics of the reflector and absorber have been used together with a detailed ray trace analysis to predict the optical performance characteristics of these designs. The observed performance agrees well with the predicted performance.
A vision for an ultra-high resolution integrated water cycle observation and prediction system

Science.gov (United States)

Houser, P. R.

2013-05-01

Society's welfare, progress, and sustainable economic growth—and life itself—depend on the abundance and vigorous cycling and replenishing of water throughout the global environment. The water cycle operates on a continuum of time and space scales and exchanges large amounts of energy as water undergoes phase changes and is moved from one part of the Earth system to another. We must move toward an integrated observation and prediction paradigm that addresses broad local-to-global science and application issues by realizing synergies associated with multiple, coordinated observations and prediction systems. A central challenge of a future water and energy cycle observation strategy is to progress from single variable water-cycle instruments to multivariable integrated instruments in electromagnetic-band families. The microwave range in the electromagnetic spectrum is ideally suited for sensing the state and abundance of water because of water's dielectric properties. Eventually, a dedicated high-resolution water-cycle microwave-based satellite mission may be possible based on large-aperture antenna technology that can harvest the synergy that would be afforded by simultaneous multichannel active and passive microwave measurements. A partial demonstration of these ideas can even be realized with existing microwave satellite observations to support advanced multivariate retrieval methods that can exploit the totality of the microwave spectral information. The simultaneous multichannel active and passive microwave retrieval would allow improved-accuracy retrievals that are not possible with isolated measurements. Furthermore, the simultaneous monitoring of several of the land, atmospheric, oceanic, and cryospheric states brings synergies that will substantially enhance understanding of the global water and energy cycle as a system. The multichannel approach also affords advantages to some constituent retrievals—for instance, simultaneous retrieval of vegetation
Integration of RNA-Seq and RPPA data for survival time prediction in cancer patients.

Science.gov (United States)

Isik, Zerrin; Ercan, Muserref Ece

2017-10-01

Integration of several types of patient data in a computational framework can accelerate the identification of more reliable biomarkers, especially for prognostic purposes. This study aims to identify biomarkers that can successfully predict the potential survival time of a cancer patient by integrating the transcriptomic (RNA-Seq), proteomic (RPPA), and protein-protein interaction (PPI) data. The proposed method -RPBioNet- employs a random walk-based algorithm that works on a PPI network to identify a limited number of protein biomarkers. Later, the method uses gene expression measurements of the selected biomarkers to train a classifier for the survival time prediction of patients. RPBioNet was applied to classify kidney renal clear cell carcinoma (KIRC), glioblastoma multiforme (GBM), and lung squamous cell carcinoma (LUSC) patients based on their survival time classes (long- or short-term). The RPBioNet method correctly identified the survival time classes of patients with between 66% and 78% average accuracy for three data sets. RPBioNet operates with only 20 to 50 biomarkers and can achieve on average 6% higher accuracy compared to the closest alternative method, which uses only RNA-Seq data in the biomarker selection. Further analysis of the most predictive biomarkers highlighted genes that are common for both cancer types, as they may be driver proteins responsible for cancer progression. The novelty of this study is the integration of a PPI network with mRNA and protein expression data to identify more accurate prognostic biomarkers that can be used for clinical purposes in the future. Copyright © 2017 Elsevier Ltd. All rights reserved.
Predictors of Traditional Medical Practices in Illness Behavior in Northwestern Ethiopia: An Integrated Model of Behavioral Prediction Based Logistic Regression Analysis

Directory of Open Access Journals (Sweden)

Abenezer Yared

2017-01-01

Full Text Available This study aimed at investigating traditional medical beliefs and practices in illness behavior as well as predictors of the practices in Gondar city, northwestern Ethiopia, by using the integrated model of behavioral prediction. A cross-sectional quantitative survey was conducted to collect data through interviewer administered structured questionnaires from 496 individuals selected by probability proportional to size sampling technique. Unadjusted bivariate and adjusted multivariate logistic regression analyses were performed, and the results indicated that sociocultural predictors of normative response and attitude as well as psychosocial individual difference variables of traditional understanding of illness causation and perceived efficacy had statistically significant associations with traditional medical practices. Due to the influence of these factors, majority of the study population (85% thus relied on both herbal and spiritual varieties of traditional medicine to respond to their perceived illnesses, supporting the conclusion that characterized the illness behavior of the people as mainly involving traditional medical practices. The results implied two-way medicine needs to be developed with ongoing research, and health educations must take the traditional customs into consideration, for integrating interventions in the health care system in ways that the general public accepts yielding a better health outcome.
The Integrated Medical Model: A Probabilistic Simulation Model for Predicting In-Flight Medical Risks

Science.gov (United States)

Keenan, Alexandra; Young, Millennia; Saile, Lynn; Boley, Lynn; Walton, Marlei; Kerstman, Eric; Shah, Ronak; Goodenow, Debra A.; Myers, Jerry G.

2015-01-01

The Integrated Medical Model (IMM) is a probabilistic model that uses simulation to predict mission medical risk. Given a specific mission and crew scenario, medical events are simulated using Monte Carlo methodology to provide estimates of resource utilization, probability of evacuation, probability of loss of crew, and the amount of mission time lost due to illness. Mission and crew scenarios are defined by mission length, extravehicular activity (EVA) schedule, and crew characteristics including: sex, coronary artery calcium score, contacts, dental crowns, history of abdominal surgery, and EVA eligibility. The Integrated Medical Evidence Database (iMED) houses the model inputs for one hundred medical conditions using in-flight, analog, and terrestrial medical data. Inputs include incidence, event durations, resource utilization, and crew functional impairment. Severity of conditions is addressed by defining statistical distributions on the dichotomized best and worst-case scenarios for each condition. The outcome distributions for conditions are bounded by the treatment extremes of the fully treated scenario in which all required resources are available and the untreated scenario in which no required resources are available. Upon occurrence of a simulated medical event, treatment availability is assessed, and outcomes are generated depending on the status of the affected crewmember at the time of onset, including any pre-existing functional impairments or ongoing treatment of concurrent conditions. The main IMM outcomes, including probability of evacuation and loss of crew life, time lost due to medical events, and resource utilization, are useful in informing mission planning decisions. To date, the IMM has been used to assess mission-specific risks with and without certain crewmember characteristics, to determine the impact of eliminating certain resources from the mission medical kit, and to design medical kits that maximally benefit crew health while meeting
On the limits of statistical learning: Intertrial contextual cueing is confined to temporally close contingencies.

Science.gov (United States)

Thomas, Cyril; Didierjean, André; Maquestiaux, François; Goujon, Annabelle

2018-04-12

Since the seminal study by Chun and Jiang (Cognitive Psychology, 36, 28-71, 1998), a large body of research based on the contextual-cueing paradigm has shown that the cognitive system is capable of extracting statistical contingencies from visual environments. Most of these studies have focused on how individuals learn regularities found within an intratrial temporal window: A context predicts the target position within a given trial. However, Ono, Jiang, and Kawahara (Journal of Experimental Psychology, 31, 703-712, 2005) provided evidence of an intertrial implicit-learning effect when a distractor configuration in preceding trials N - 1 predicted the target location in trials N. The aim of the present study was to gain further insight into this effect by examining whether it occurs when predictive relationships are impeded by interfering task-relevant noise (Experiments 2 and 3) or by a long delay (Experiments 1, 4, and 5). Our results replicated the intertrial contextual-cueing effect, which occurred in the condition of temporally close contingencies. However, there was no evidence of integration across long-range spatiotemporal contingencies, suggesting a temporal limitation of statistical learning.
Power flow as a complement to statistical energy analysis and finite element analysis

Science.gov (United States)

Cuschieri, J. M.

1987-01-01

Present methods of analysis of the structural response and the structure-borne transmission of vibrational energy use either finite element (FE) techniques or statistical energy analysis (SEA) methods. The FE methods are a very useful tool at low frequencies where the number of resonances involved in the analysis is rather small. On the other hand SEA methods can predict with acceptable accuracy the response and energy transmission between coupled structures at relatively high frequencies where the structural modal density is high and a statistical approach is the appropriate solution. In the mid-frequency range, a relatively large number of resonances exist which make finite element method too costly. On the other hand SEA methods can only predict an average level form. In this mid-frequency range a possible alternative is to use power flow techniques, where the input and flow of vibrational energy to excited and coupled structural components can be expressed in terms of input and transfer mobilities. This power flow technique can be extended from low to high frequencies and this can be integrated with established FE models at low frequencies and SEA models at high frequencies to form a verification of the method. This method of structural analysis using power flo and mobility methods, and its integration with SEA and FE analysis is applied to the case of two thin beams joined together at right angles.
Characteristics of genomic signatures derived using univariate methods and mechanistically anchored functional descriptors for predicting drug- and xenobiotic-induced nephrotoxicity.

Science.gov (United States)

Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J

2008-01-01

ABSTRACT The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, Hotelling T-square test, and, finally out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps-sets of genes coordinately involved in key biological processes-with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of
Understanding advanced statistical methods

CERN Document Server

Westfall, Peter

2013-01-01

Introduction: Probability, Statistics, and ScienceReality, Nature, Science, and ModelsStatistical Processes: Nature, Design and Measurement, and DataModelsDeterministic ModelsVariabilityParametersPurely Probabilistic Statistical ModelsStatistical Models with Both Deterministic and Probabilistic ComponentsStatistical InferenceGood and Bad ModelsUses of Probability ModelsRandom Variables and Their Probability DistributionsIntroductionTypes of Random Variables: Nominal, Ordinal, and ContinuousDiscrete Probability Distribution FunctionsContinuous Probability Distribution FunctionsSome Calculus-Derivatives and Least SquaresMore Calculus-Integrals and Cumulative Distribution FunctionsProbability Calculation and SimulationIntroductionAnalytic Calculations, Discrete and Continuous CasesSimulation-Based ApproximationGenerating Random NumbersIdentifying DistributionsIntroductionIdentifying Distributions from Theory AloneUsing Data: Estimating Distributions via the HistogramQuantiles: Theoretical and Data-Based Estimate...
A perceptual space of local image statistics.

Science.gov (United States)

Victor, Jonathan D; Thengone, Daniel J; Rizvi, Syed M; Conte, Mary M

2015-12-01

Local image statistics are important for visual analysis of textures, surfaces, and form. There are many kinds of local statistics, including those that capture luminance distributions, spatial contrast, oriented segments, and corners. While sensitivity to each of these kinds of statistics have been well-studied, much less is known about visual processing when multiple kinds of statistics are relevant, in large part because the dimensionality of the problem is high and different kinds of statistics interact. To approach this problem, we focused on binary images on a square lattice - a reduced set of stimuli which nevertheless taps many kinds of local statistics. In this 10-parameter space, we determined psychophysical thresholds to each kind of statistic (16 observers) and all of their pairwise combinations (4 observers). Sensitivities and isodiscrimination contours were consistent across observers. Isodiscrimination contours were elliptical, implying a quadratic interaction rule, which in turn determined ellipsoidal isodiscrimination surfaces in the full 10-dimensional space, and made predictions for sensitivities to complex combinations of statistics. These predictions, including the prediction of a combination of statistics that was metameric to random, were verified experimentally. Finally, check size had only a mild effect on sensitivities over the range from 2.8 to 14min, but sensitivities to second- and higher-order statistics was substantially lower at 1.4min. In sum, local image statistics form a perceptual space that is highly stereotyped across observers, in which different kinds of statistics interact according to simple rules. Copyright © 2015 Elsevier Ltd. All rights reserved.
Generalized interpolative quantum statistics

International Nuclear Information System (INIS)

Ramanathan, R.

1992-01-01

A generalized interpolative quantum statistics is presented by conjecturing a certain reordering of phase space due to the presence of possible exotic objects other than bosons and fermions. Such an interpolation achieved through a Bose-counting strategy predicts the existence of an infinite quantum Boltzmann-Gibbs statistics akin to the one discovered by Greenberg recently
The non-equilibrium statistical mechanics of a simple geophysical fluid dynamics model

Science.gov (United States)

Verkley, Wim; Severijns, Camiel

2014-05-01

Lorenz [1] has devised a dynamical system that has proved to be very useful as a benchmark system in geophysical fluid dynamics. The system in its simplest form consists of a periodic array of variables that can be associated with an atmospheric field on a latitude circle. The system is driven by a constant forcing, is damped by linear friction and has a simple advection term that causes the model to behave chaotically if the forcing is large enough. Our aim is to predict the statistics of Lorenz' model on the basis of a given average value of its total energy - obtained from a numerical integration - and the assumption of statistical stationarity. Our method is the principle of maximum entropy [2] which in this case reads: the information entropy of the system's probability density function shall be maximal under the constraints of normalization, a given value of the average total energy and statistical stationarity. Statistical stationarity is incorporated approximately by using `stationarity constraints', i.e., by requiring that the average first and possibly higher-order time-derivatives of the energy are zero in the maximization of entropy. The analysis [3] reveals that, if the first stationarity constraint is used, the resulting probability density function rather accurately reproduces the statistics of the individual variables. If the second stationarity constraint is used as well, the correlations between the variables are also reproduced quite adequately. The method can be generalized straightforwardly and holds the promise of a viable non-equilibrium statistical mechanics of the forced-dissipative systems of geophysical fluid dynamics. [1] E.N. Lorenz, 1996: Predictability - A problem partly solved, in Proc. Seminar on Predictability (ECMWF, Reading, Berkshire, UK), Vol. 1, pp. 1-18. [2] E.T. Jaynes, 2003: Probability Theory - The Logic of Science (Cambridge University Press, Cambridge). [3] W.T.M. Verkley and C.A. Severijns, 2014: The maximum entropy

Microgravity Disturbance Predictions in the Combustion Integrated Rack

Science.gov (United States)

Just, M.; Grodsinsky, Carlos M.

2002-01-01

This paper will focus on the approach used to characterize microgravity disturbances in the Combustion Integrated Rack (CIR), currently scheduled for launch to the International Space Station (ISS) in 2005. Microgravity experiments contained within the CIR are extremely sensitive to vibratory and transient disturbances originating on-board and off-board the rack. Therefore, several techniques are implemented to isolate the critical science locations from external vibration. A combined testing and analysis approach is utilized to predict the resulting microgravity levels at the critical science location. The major topics to be addressed are: 1) CIR Vibration Isolation Approaches, 2) Disturbance Sources and Characterization, 3) Microgravity Predictive Modeling, 4) Science Microgravity Requirements, 6) Microgravity Control, and 7) On-Orbit Disturbance Measurement. The CIR is using the Passive Rack Isolation System (PaRIS) to isolate the rack from offboard rack disturbances. By utilizing this system, CIR is connected to the U.S. Lab module structure by either 13 or 14 umbilical lines and 8 spring / damper isolators. Some on-board CIR disturbers are locally isolated by grommets or wire ropes. CIR's environmental and science on board support equipment such as air circulation fans, pumps, water flow, air flow, solenoid valves, and computer hard drives cause disturbances within the rack. These disturbers along with the rack structure must be characterized to predict whether the on-orbit vibration levels during experimentation exceed the specified science microgravity vibration level requirements. Both vibratory and transient disturbance conditions are addressed. Disturbance levels/analytical inputs are obtained for each individual disturber in a "free floating" condition in the Glenn Research Center (GRC) Microgravity Emissions Lab (MEL). Flight spare hardware is tested on an Orbital Replacement Unit (ORU) basis. Based on test and analysis, maximum disturbance level
Crystal plasticity assisted prediction on the yield locus evolution and forming limit curves

Science.gov (United States)

Lian, Junhe; Liu, Wenqi; Shen, Fuhui; Münstermann, Sebastian

2017-10-01

The aim of this study is to predict the plastic anisotropy evolution and its associated forming limit curves of bcc steels purely based on their microstructural features by establishing an integrated multiscale modelling approach. Crystal plasticity models are employed to describe the micro deformation mechanism and correlate the microstructure with mechanical behaviour on micro and mesoscale. Virtual laboratory is performed considering the statistical information of the microstructure, which serves as the input for the phenomenological plasticity model on the macroscale. For both scales, the microstructure evolution induced evolving features, such as the anisotropic hardening, r-value and yield locus evolution are seamlessly integrated. The predicted plasticity behaviour by the numerical simulations are compared with experiments. These evolutionary features of the material deformation behaviour are eventually considered for the prediction of formability.
The Balance-Scale Task Revisited: A Comparison of Statistical Models for Rule-Based and Information-Integration Theories of Proportional Reasoning.

Directory of Open Access Journals (Sweden)

Abe D Hofman

Full Text Available We propose and test three statistical models for the analysis of children's responses to the balance scale task, a seminal task to study proportional reasoning. We use a latent class modelling approach to formulate a rule-based latent class model (RB LCM following from a rule-based perspective on proportional reasoning and a new statistical model, the Weighted Sum Model, following from an information-integration approach. Moreover, a hybrid LCM using item covariates is proposed, combining aspects of both a rule-based and information-integration perspective. These models are applied to two different datasets, a standard paper-and-pencil test dataset (N = 779, and a dataset collected within an online learning environment that included direct feedback, time-pressure, and a reward system (N = 808. For the paper-and-pencil dataset the RB LCM resulted in the best fit, whereas for the online dataset the hybrid LCM provided the best fit. The standard paper-and-pencil dataset yielded more evidence for distinct solution rules than the online data set in which quantitative item characteristics are more prominent in determining responses. These results shed new light on the discussion on sequential rule-based and information-integration perspectives of cognitive development.
IMP 2.0: a multi-species functional genomics portal for integration, visualization and prediction of protein functions and networks.

Science.gov (United States)

Wong, Aaron K; Krishnan, Arjun; Yao, Victoria; Tadych, Alicja; Troyanskaya, Olga G

2015-07-01

IMP (Integrative Multi-species Prediction), originally released in 2012, is an interactive web server that enables molecular biologists to interpret experimental results and to generate hypotheses in the context of a large cross-organism compendium of functional predictions and networks. The system provides biologists with a framework to analyze their candidate gene sets in the context of functional networks, expanding or refining their sets using functional relationships predicted from integrated high-throughput data. IMP 2.0 integrates updated prior knowledge and data collections from the last three years in the seven supported organisms (Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Danio rerio, Caenorhabditis elegans, and Saccharomyces cerevisiae) and extends function prediction coverage to include human disease. IMP identifies homologs with conserved functional roles for disease knowledge transfer, allowing biologists to analyze disease contexts and predictions across all organisms. Additionally, IMP 2.0 implements a new flexible platform for experts to generate custom hypotheses about biological processes or diseases, making sophisticated data-driven methods easily accessible to researchers. IMP does not require any registration or installation and is freely available for use at http://imp.princeton.edu. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Machine remaining useful life prediction: An integrated adaptive neuro-fuzzy and high-order particle filtering approach

Science.gov (United States)

Chen, Chaochao; Vachtsevanos, George; Orchard, Marcos E.

2012-04-01

Machine prognosis can be considered as the generation of long-term predictions that describe the evolution in time of a fault indicator, with the purpose of estimating the remaining useful life (RUL) of a failing component/subsystem so that timely maintenance can be performed to avoid catastrophic failures. This paper proposes an integrated RUL prediction method using adaptive neuro-fuzzy inference systems (ANFIS) and high-order particle filtering, which forecasts the time evolution of the fault indicator and estimates the probability density function (pdf) of RUL. The ANFIS is trained and integrated in a high-order particle filter as a model describing the fault progression. The high-order particle filter is used to estimate the current state and carry out p-step-ahead predictions via a set of particles. These predictions are used to estimate the RUL pdf. The performance of the proposed method is evaluated via the real-world data from a seeded fault test for a UH-60 helicopter planetary gear plate. The results demonstrate that it outperforms both the conventional ANFIS predictor and the particle-filter-based predictor where the fault growth model is a first-order model that is trained via the ANFIS.
Frontoparietal white matter integrity predicts haptic performance in chronic stroke

Directory of Open Access Journals (Sweden)

Alexandra L. Borstad

2016-01-01

. Age strongly correlated with the shared variance across tracts in the control, but not in the poststroke participants. A moderate to good relationship was found between ipsilesional T–M1 MD and affected hand HASTe score (r = −0.62, p = 0.006 and less affected hand HASTe score (r = −0.53, p = 0.022. Regression analysis revealed approximately 90% of the variance in affected hand HASTe score was predicted by the white matter integrity in the frontoparietal network (as indexed by MD in poststroke participants while 87% of the variance in HASTe score was predicted in control participants. This study demonstrates the importance of frontoparietal white matter in mediating haptic performance and specifically identifies that T–M1 and precuneus interhemispheric tracts may be appropriate targets for piloting rehabilitation interventions, such as noninvasive brain stimulation, when the goal is to improve poststroke haptic performance.
Frontoparietal white matter integrity predicts haptic performance in chronic stroke.

Science.gov (United States)

Borstad, Alexandra L; Choi, Seongjin; Schmalbrock, Petra; Nichols-Larsen, Deborah S

2016-01-01

strongly correlated with the shared variance across tracts in the control, but not in the poststroke participants. A moderate to good relationship was found between ipsilesional T-M1 MD and affected hand HASTe score (r = - 0.62, p = 0.006) and less affected hand HASTe score (r = - 0.53, p = 0.022). Regression analysis revealed approximately 90% of the variance in affected hand HASTe score was predicted by the white matter integrity in the frontoparietal network (as indexed by MD) in poststroke participants while 87% of the variance in HASTe score was predicted in control participants. This study demonstrates the importance of frontoparietal white matter in mediating haptic performance and specifically identifies that T-M1 and precuneus interhemispheric tracts may be appropriate targets for piloting rehabilitation interventions, such as noninvasive brain stimulation, when the goal is to improve poststroke haptic performance.
SOCR: Statistics Online Computational Resource

OpenAIRE

Dinov, Ivo D.

2006-01-01

The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis...
RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

Energy Technology Data Exchange (ETDEWEB)

Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S.; Arkin, Adam P.; Mironov, Andrey A.; Dubchak, Inna

2010-05-26

RegPredict web server is designed to provide comparative genomics tools for reconstruction and analysis of microbial regulons using comparative genomics approach. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences, ortholog and operon predictions from the MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.
Reproducing tailing in breakthrough curves: Are statistical models equally representative and predictive?

Science.gov (United States)

Pedretti, Daniele; Bianchi, Marco

2018-03-01

Breakthrough curves (BTCs) observed during tracer tests in highly heterogeneous aquifers display strong tailing. Power laws are popular models for both the empirical fitting of these curves, and the prediction of transport using upscaling models based on best-fitted estimated parameters (e.g. the power law slope or exponent). The predictive capacity of power law based upscaling models can be however questioned due to the difficulties to link model parameters with the aquifers' physical properties. This work analyzes two aspects that can limit the use of power laws as effective predictive tools: (a) the implication of statistical subsampling, which often renders power laws undistinguishable from other heavily tailed distributions, such as the logarithmic (LOG); (b) the difficulties to reconcile fitting parameters obtained from models with different formulations, such as the presence of a late-time cutoff in the power law model. Two rigorous and systematic stochastic analyses, one based on benchmark distributions and the other on BTCs obtained from transport simulations, are considered. It is found that a power law model without cutoff (PL) results in best-fitted exponents (αPL) falling in the range of typical experimental values reported in the literature (1.5 tailing becomes heavier. Strong fluctuations occur when the number of samples is limited, due to the effects of subsampling. On the other hand, when the power law model embeds a cutoff (PLCO), the best-fitted exponent (αCO) is insensitive to the degree of tailing and to the effects of subsampling and tends to a constant αCO ≈ 1. In the PLCO model, the cutoff rate (λ) is the parameter that fully reproduces the persistence of the tailing and is shown to be inversely correlated to the LOG scale parameter (i.e. with the skewness of the distribution). The theoretical results are consistent with the fitting analysis of a tracer test performed during the MADE-5 experiment. It is shown that a simple
Progress in space weather predictions and applications

Science.gov (United States)

Lundstedt, H.

The methods of today's predictions of space weather and effects are so much more advanced and yesterday's statistical methods are now replaced by integrated knowledge-based neuro-computing models and MHD methods. Within the ESA Space Weather Programme Study a real-time forecast service has been developed for space weather and effects. This prototype is now being implemented for specific users. Today's applications are not only so many more but also so much more advanced and user-oriented. A scientist needs real-time predictions of a global index as input for an MHD model calculating the radiation dose for EVAs. A power company system operator needs a prediction of the local value of a geomagnetically induced current. A science tourist needs to know whether or not aurora will occur. Soon we might even be able to predict the tropospheric climate changes and weather caused by the space weather.
Integrated Data Collection Analysis (IDCA) Program - Statistical Analysis of RDX Standard Data Sets

Energy Technology Data Exchange (ETDEWEB)

Sandstrom, Mary M. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Brown, Geoffrey W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Preston, Daniel N. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Pollard, Colin J. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Warner, Kirstin F. [Naval Surface Warfare Center (NSWC), Indian Head, MD (United States). Indian Head Division; Sorensen, Daniel N. [Naval Surface Warfare Center (NSWC), Indian Head, MD (United States). Indian Head Division; Remmers, Daniel L. [Naval Surface Warfare Center (NSWC), Indian Head, MD (United States). Indian Head Division; Phillips, Jason J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Shelley, Timothy J. [Air Force Research Lab. (AFRL), Tyndall AFB, FL (United States); Reyes, Jose A. [Applied Research Associates, Tyndall AFB, FL (United States); Hsu, Peter C. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Reynolds, John G. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2015-10-30

The Integrated Data Collection Analysis (IDCA) program is conducting a Proficiency Test for Small- Scale Safety and Thermal (SSST) testing of homemade explosives (HMEs). Described here are statistical analyses of the results for impact, friction, electrostatic discharge, and differential scanning calorimetry analysis of the RDX Type II Class 5 standard. The material was tested as a well-characterized standard several times during the proficiency study to assess differences among participants and the range of results that may arise for well-behaved explosive materials. The analyses show that there are detectable differences among the results from IDCA participants. While these differences are statistically significant, most of them can be disregarded for comparison purposes to assess potential variability when laboratories attempt to measure identical samples using methods assumed to be nominally the same. The results presented in this report include the average sensitivity results for the IDCA participants and the ranges of values obtained. The ranges represent variation about the mean values of the tests of between 26% and 42%. The magnitude of this variation is attributed to differences in operator, method, and environment as well as the use of different instruments that are also of varying age. The results appear to be a good representation of the broader safety testing community based on the range of methods, instruments, and environments included in the IDCA Proficiency Test.
Statistical Model of Extreme Shear

DEFF Research Database (Denmark)

Hansen, Kurt Schaldemose; Larsen, Gunner Chr.

2005-01-01

In order to continue cost-optimisation of modern large wind turbines, it is important to continuously increase the knowledge of wind field parameters relevant to design loads. This paper presents a general statistical model that offers site-specific prediction of the probability density function...... by a model that, on a statistically consistent basis, describes the most likely spatial shape of an extreme wind shear event. Predictions from the model have been compared with results from an extreme value data analysis, based on a large number of full-scale measurements recorded with a high sampling rate...
Integrated predictive modeling simulations of the Mega-Amp Spherical Tokamak

International Nuclear Information System (INIS)

Nguyen, Canh N.; Bateman, Glenn; Kritz, Arnold H.; Akers, Robert; Byrom, Calum; Sykes, Alan

2002-01-01

Integrated predictive modeling simulations are carried out using the BALDUR transport code [Singer et al., Comput. Phys. Commun. 49, 275 (1982)] for high confinement mode (H-mode) and low confinement mode (L-mode) discharges in the Mega-Amp Spherical Tokamak (MAST) [Sykes et al., Phys. Plasmas 8, 2101 (2001)]. Simulation results, obtained using either the Multi-Mode transport model (MMM95) or, alternatively, the mixed-Bohm/gyro-Bohm transport model, are compared with experimental data. In addition to the anomalous transport, neoclassical transport is included in the simulations and the ion thermal diffusivity in the inner third of the plasma is found to be predominantly neoclassical. The sawtooth oscillations in the simulations radially spread the neutral beam injection heating profiles across a broad sawtooth mixing region. The broad sawtooth oscillations also flatten the central temperature and electron density profiles. Simulation results for the electron temperature and density profiles are compared with experimental data to test the applicability of these models and the BALDUR integrated modeling code in the limit of low aspect ratio toroidal plasmas
Integrated circuit manufacture and tuning of subassemblies of a statistical analyzer of voltage oscillations (AKON). Izgotovleniye na integral'nykh skhemakh i nastroyka uzlov statisticheskogo analizatora kolebaniy napryazheniya (AKON)

Energy Technology Data Exchange (ETDEWEB)

Yermakov, V.F.; Oleynik, V.I.; Sambarov, Yu.M.

1982-01-01

The basic circuits and instructions for tuning subassemblies of a statistical analyzer of voltage oscillation are described. The device is intended for monitoring quality of voltage in electric networks in accordance with GOST13109-67. The component base of the device includes integrated circuits of the series 140, 155, 218 and 228.
A practical model-based statistical approach for generating functional test cases: application in the automotive industry

OpenAIRE

Awédikian , Roy; Yannou , Bernard

2012-01-01

International audience; With the growing complexity of industrial software applications, industrials are looking for efficient and practical methods to validate the software. This paper develops a model-based statistical testing approach that automatically generates online and offline test cases for embedded software. It discusses an integrated framework that combines solutions for three major software testing research questions: (i) how to select test inputs; (ii) how to predict the expected...
PROSPER: an integrated feature-based tool for predicting protease substrate cleavage sites.

Directory of Open Access Journals (Sweden)

Jiangning Song

Full Text Available The ability to catalytically cleave protein substrates after synthesis is fundamental for all forms of life. Accordingly, site-specific proteolysis is one of the most important post-translational modifications. The key to understanding the physiological role of a protease is to identify its natural substrate(s. Knowledge of the substrate specificity of a protease can dramatically improve our ability to predict its target protein substrates, but this information must be utilized in an effective manner in order to efficiently identify protein substrates by in silico approaches. To address this problem, we present PROSPER, an integrated feature-based server for in silico identification of protease substrates and their cleavage sites for twenty-four different proteases. PROSPER utilizes established specificity information for these proteases (derived from the MEROPS database with a machine learning approach to predict protease cleavage sites by using different, but complementary sequence and structure characteristics. Features used by PROSPER include local amino acid sequence profile, predicted secondary structure, solvent accessibility and predicted native disorder. Thus, for proteases with known amino acid specificity, PROSPER provides a convenient, pre-prepared tool for use in identifying protein substrates for the enzymes. Systematic prediction analysis for the twenty-four proteases thus far included in the database revealed that the features we have included in the tool strongly improve performance in terms of cleavage site prediction, as evidenced by their contribution to performance improvement in terms of identifying known cleavage sites in substrates for these enzymes. In comparison with two state-of-the-art prediction tools, PoPS and SitePrediction, PROSPER achieves greater accuracy and coverage. To our knowledge, PROSPER is the first comprehensive server capable of predicting cleavage sites of multiple proteases within a single substrate
Predicting the risk of cucurbit downy mildew in the eastern United States using an integrated aerobiological model

Science.gov (United States)

Neufeld, K. N.; Keinath, A. P.; Gugino, B. K.; McGrath, M. T.; Sikora, E. J.; Miller, S. A.; Ivey, M. L.; Langston, D. B.; Dutta, B.; Keever, T.; Sims, A.; Ojiambo, P. S.

2017-11-01

Cucurbit downy mildew caused by the obligate oomycete, Pseudoperonospora cubensis, is considered one of the most economically important diseases of cucurbits worldwide. In the continental United States, the pathogen overwinters in southern Florida and along the coast of the Gulf of Mexico. Outbreaks of the disease in northern states occur annually via long-distance aerial transport of sporangia from infected source fields. An integrated aerobiological modeling system has been developed to predict the risk of disease occurrence and to facilitate timely use of fungicides for disease management. The forecasting system, which combines information on known inoculum sources, long-distance atmospheric spore transport and spore deposition modules, was tested to determine its accuracy in predicting risk of disease outbreak. Rainwater samples at disease monitoring sites in Alabama, Georgia, Louisiana, New York, North Carolina, Ohio, Pennsylvania and South Carolina were collected weekly from planting to the first appearance of symptoms at the field sites during the 2013, 2014, and 2015 growing seasons. A conventional PCR assay with primers specific to P. cubensis was used to detect the presence of sporangia in rain water samples. Disease forecasts were monitored and recorded for each site after each rain event until initial disease symptoms appeared. The pathogen was detected in 38 of the 187 rainwater samples collected during the study period. The forecasting system correctly predicted the risk of disease outbreak based on the presence of sporangia or appearance of initial disease symptoms with an overall accuracy rate of 66 and 75%, respectively. In addition, the probability that the forecasting system correctly classified the presence or absence of disease was ≥ 73%. The true skill statistic calculated based on the appearance of disease symptoms in cucurbit field plantings ranged from 0.42 to 0.58, indicating that the disease forecasting system had an acceptable to good
Statistical and Machine Learning Models to Predict Programming Performance

OpenAIRE

Bergin, Susan

2006-01-01

This thesis details a longitudinal study on factors that influence introductory programming success and on the development of machine learning models to predict incoming student performance. Although numerous studies have developed models to predict programming success, the models struggled to achieve high accuracy in predicting the likely performance of incoming students. Our approach overcomes this by providing a machine learning technique, using a set of three significant...
Integration of Tuyere, Raceway and Shaft Models for Predicting Blast Furnace Process

Science.gov (United States)

Fu, Dong; Tang, Guangwu; Zhao, Yongfu; D'Alessio, John; Zhou, Chenn Q.

2018-06-01

A novel modeling strategy is presented for simulating the blast furnace iron making process. Such physical and chemical phenomena are taking place across a wide range of length and time scales, and three models are developed to simulate different regions of the blast furnace, i.e., the tuyere model, the raceway model and the shaft model. This paper focuses on the integration of the three models to predict the entire blast furnace process. Mapping output and input between models and an iterative scheme are developed to establish communications between models. The effects of tuyere operation and burden distribution on blast furnace fuel efficiency are investigated numerically. The integration of different models provides a way to realistically simulate the blast furnace by improving the modeling resolution on local phenomena and minimizing the model assumptions.

iSPUW: integrated sensing and prediction of urban water for sustainable cities

Science.gov (United States)

Noh, S. J.; Nazari, B.; Habibi, H.; Norouzi, A.; Nabatian, M.; Seo, D. J.; Bartos, M. D.; Kerkez, B.; Lakshman, L.; Zink, M.; Lee, J.

2016-12-01

Many cities face tremendous water-related challenges in this Century of the City. Urban areas are particularly susceptible not only to excesses and shortages of water but also to impaired water quality. To addresses these challenges, we synergistically integrate advances in computing and cyber-infrastructure, environmental modeling, geoscience, and information science to develop integrative solutions for urban water challenges. In this presentation, we describe the various efforts that are currently ongoing in the Dallas-Fort Worth Metroplex (DFW) area for iSPUW: real-time high-resolution flash flood forecasting, inundation mapping for large urban areas, crowdsourcing of water observations in urban areas, real-time assimilation of crowdsourced observations for street and river flooding, integrated control of lawn irrigation and rainwater harvesting for water conservation and stormwater management, feature mining with causal discovery for flood prediction, and development of the Arlington Urban Hydroinformatics Testbed. Analyzed is the initial data of sensor network for water level and lawn monitoring, and cellphone applications for crowdsourcing flood reports. New data assimilation approaches to deal with categorical and continuous observations are also evaluated via synthetic experiments.
Statistics: a Bayesian perspective

National Research Council Canada - National Science Library

Berry, Donald A

1996-01-01

...: it is the only introductory textbook based on Bayesian ideas, it combines concepts and methods, it presents statistics as a means of integrating data into the significant process, it develops ideas...
Intuitive introductory statistics

CERN Document Server

Wolfe, Douglas A

2017-01-01

This textbook is designed to give an engaging introduction to statistics and the art of data analysis. The unique scope includes, but also goes beyond, classical methodology associated with the normal distribution. What if the normal model is not valid for a particular data set? This cutting-edge approach provides the alternatives. It is an introduction to the world and possibilities of statistics that uses exercises, computer analyses, and simulations throughout the core lessons. These elementary statistical methods are intuitive. Counting and ranking features prominently in the text. Nonparametric methods, for instance, are often based on counts and ranks and are very easy to integrate into an introductory course. The ease of computation with advanced calculators and statistical software, both of which factor into this text, allows important techniques to be introduced earlier in the study of statistics. This book's novel scope also includes measuring symmetry with Walsh averages, finding a nonp...
Statistical mechanics of Fermi-Pasta-Ulam chains with the canonical ensemble

Science.gov (United States)

Demirel, Melik C.; Sayar, Mehmet; Atılgan, Ali R.

1997-03-01

Low-energy vibrations of a Fermi-Pasta-Ulam-Β (FPU-Β) chain with 16 repeat units are analyzed with the aid of numerical experiments and the statistical mechanics equations of the canonical ensemble. Constant temperature numerical integrations are performed by employing the cubic coupling scheme of Kusnezov et al. [Ann. Phys. 204, 155 (1990)]. Very good agreement is obtained between numerical results and theoretical predictions for the probability distributions of the generalized coordinates and momenta both of the chain and of the thermal bath. It is also shown that the average energy of the chain scales linearly with the bath temperature.
Sensory processing patterns predict the integration of information held in visual working memory.

Science.gov (United States)

Lowe, Matthew X; Stevenson, Ryan A; Wilson, Kristin E; Ouslis, Natasha E; Barense, Morgan D; Cant, Jonathan S; Ferber, Susanne

2016-02-01

Given the limited resources of visual working memory, multiple items may be remembered as an averaged group or ensemble. As a result, local information may be ill-defined, but these ensemble representations provide accurate diagnostics of the natural world by combining gist information with item-level information held in visual working memory. Some neurodevelopmental disorders are characterized by sensory processing profiles that predispose individuals to avoid or seek-out sensory stimulation, fundamentally altering their perceptual experience. Here, we report such processing styles will affect the computation of ensemble statistics in the general population. We identified stable adult sensory processing patterns to demonstrate that individuals with low sensory thresholds who show a greater proclivity to engage in active response strategies to prevent sensory overstimulation are less likely to integrate mean size information across a set of similar items and are therefore more likely to be biased away from the mean size representation of an ensemble display. We therefore propose the study of ensemble processing should extend beyond the statistics of the display, and should also consider the statistics of the observer. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Statistical downscaling of historical monthly mean winds over a coastal region of complex terrain. II. Predicting wind components

Energy Technology Data Exchange (ETDEWEB)

Kamp, Derek van der [University of Victoria, Pacific Climate Impacts Consortium, Victoria, BC (Canada); University of Victoria, School of Earth and Ocean Sciences, Victoria, BC (Canada); Curry, Charles L. [Environment Canada University of Victoria, Canadian Centre for Climate Modelling and Analysis, Victoria, BC (Canada); University of Victoria, School of Earth and Ocean Sciences, Victoria, BC (Canada); Monahan, Adam H. [University of Victoria, School of Earth and Ocean Sciences, Victoria, BC (Canada)

2012-04-15

A regression-based downscaling technique was applied to monthly mean surface wind observations from stations throughout western Canada as well as from buoys in the Northeast Pacific Ocean over the period 1979-2006. A predictor set was developed from principal component analysis of the three wind components at 500 hPa and mean sea-level pressure taken from the NCEP Reanalysis II. Building on the results of a companion paper, Curry et al. (Clim Dyn 2011), the downscaling was applied to both wind speed and wind components, in an effort to evaluate the utility of each type of predictand. Cross-validated prediction skill varied strongly with season, with autumn and summer displaying the highest and lowest skill, respectively. In most cases wind components were predicted with better skill than wind speeds. The predictive ability of wind components was found to be strongly related to their orientation. Wind components with the best predictions were often oriented along topographically significant features such as constricted valleys, mountain ranges or ocean channels. This influence of directionality on predictive ability is most prominent during autumn and winter at inland sites with complex topography. Stations in regions with relatively flat terrain (where topographic steering is minimal) exhibit inter-station consistencies including region-wide seasonal shifts in the direction of the best predicted wind component. The conclusion that wind components can be skillfully predicted only over a limited range of directions at most stations limits the scope of statistically downscaled wind speed predictions. It seems likely that such limitations apply to other regions of complex terrain as well. (orig.)
Motif statistics and spike correlations in neuronal networks

International Nuclear Information System (INIS)

Hu, Yu; Shea-Brown, Eric; Trousdale, James; Josić, Krešimir

2013-01-01

Motifs are patterns of subgraphs of complex networks. We studied the impact of such patterns of connectivity on the level of correlated, or synchronized, spiking activity among pairs of cells in a recurrent network of integrate and fire neurons. For a range of network architectures, we find that the pairwise correlation coefficients, averaged across the network, can be closely approximated using only three statistics of network connectivity. These are the overall network connection probability and the frequencies of two second order motifs: diverging motifs, in which one cell provides input to two others, and chain motifs, in which two cells are connected via a third intermediary cell. Specifically, the prevalence of diverging and chain motifs tends to increase correlation. Our method is based on linear response theory, which enables us to express spiking statistics using linear algebra, and a resumming technique, which extrapolates from second order motifs to predict the overall effect of coupling on network correlation. Our motif-based results seek to isolate the effect of network architecture perturbatively from a known network state. (paper)
Impact of statistical learning methods on the predictive power of multivariate normal tissue complication probability models.

Science.gov (United States)

Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A; van't Veld, Aart A

2012-03-15

To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended. Copyright Â© 2012 Elsevier Inc. All rights reserved.
Impact of Statistical Learning Methods on the Predictive Power of Multivariate Normal Tissue Complication Probability Models

Energy Technology Data Exchange (ETDEWEB)

Xu Chengjian, E-mail: c.j.xu@umcg.nl [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands); Schaaf, Arjen van der; Schilstra, Cornelis; Langendijk, Johannes A.; Veld, Aart A. van' t [Department of Radiation Oncology, University of Groningen, University Medical Center Groningen, Groningen (Netherlands)

2012-03-15

Purpose: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. Methods and Materials: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. Results: It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. Conclusions: The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended.
Impact of Statistical Learning Methods on the Predictive Power of Multivariate Normal Tissue Complication Probability Models

International Nuclear Information System (INIS)

Xu Chengjian; Schaaf, Arjen van der; Schilstra, Cornelis; Langendijk, Johannes A.; Veld, Aart A. van’t

2012-01-01

Purpose: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. Methods and Materials: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. Results: It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. Conclusions: The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended.
Computational Performance Optimisation for Statistical Analysis of the Effect of Nano-CMOS Variability on Integrated Circuits

Directory of Open Access Journals (Sweden)

Zheng Xie

2013-01-01

Full Text Available The intrinsic variability of nanoscale VLSI technology must be taken into account when analyzing circuit designs to predict likely yield. Monte-Carlo- (MC- and quasi-MC- (QMC- based statistical techniques do this by analysing many randomised or quasirandomised copies of circuits. The randomisation must model forms of variability that occur in nano-CMOS technology, including “atomistic” effects without intradie correlation and effects with intradie correlation between neighbouring devices. A major problem is the computational cost of carrying out sufficient analyses to produce statistically reliable results. The use of principal components analysis, behavioural modeling, and an implementation of “Statistical Blockade” (SB is shown to be capable of achieving significant reduction in the computational costs. A computation time reduction of 98.7% was achieved for a commonly used asynchronous circuit element. Replacing MC by QMC analysis can achieve further computation reduction, and this is illustrated for more complex circuits, with the results being compared with those of transistor-level simulations. The “yield prediction” analysis of SRAM arrays is taken as a case study, where the arrays contain up to 1536 transistors modelled using parameters appropriate to 35 nm technology. It is reported that savings of up to 99.85% in computation time were obtained.
Statistical model to predict dry sliding wear behaviour of Aluminium-Jute bast ash particulate composite produced by stir-casting

Directory of Open Access Journals (Sweden)

Gambo Anthony VICTOR

2017-06-01

Full Text Available A model to predict the dry sliding wear behaviour of Aluminium-Jute bast ash particulate composites produced by double stir-casting method was developed in terms of weight fraction of jute bast ash (JBA. Experiments were designed on the basis of the Design of Experiments (DOE technique. A 2k factorial, where k is the number of variables, with central composite second-order rotatable design was used to improve the reliability of results and to reduce the size of experimentation without loss of accuracy. The factors considered in this study were sliding velocity, sliding distance, normal load and mass fraction of JBA reinforcement in the matrix. The developed regression model was validated by statistical software MINITAB-R14 and statistical tool such as analysis of variance (ANOVA. It was found that the developed regression model could be effectively used to predict the wear rate at 95% confidence level. The wear rate of cast Al-JBAp composite decreased with an increase in the mass fraction of JBA and increased with an increase of the sliding velocity, sliding distance and normal load acting on the composite specimen.
Predicting Examination Performance Using an Expanded Integrated Hierarchical Model of Test Emotions and Achievement Goals

Science.gov (United States)

Putwain, Dave; Deveney, Carolyn

2009-01-01

The aim of this study was to examine an expanded integrative hierarchical model of test emotions and achievement goal orientations in predicting the examination performance of undergraduate students. Achievement goals were theorised as mediating the relationship between test emotions and performance. 120 undergraduate students completed…
Sentinel node status prediction by four statistical models: results from a large bi-institutional series (n = 1132).

Science.gov (United States)

Mocellin, Simone; Thompson, John F; Pasquali, Sandro; Montesco, Maria C; Pilati, Pierluigi; Nitti, Donato; Saw, Robyn P; Scolyer, Richard A; Stretch, Jonathan R; Rossi, Carlo R

2009-12-01

To improve selection for sentinel node (SN) biopsy (SNB) in patients with cutaneous melanoma using statistical models predicting SN status. About 80% of patients currently undergoing SNB are node negative. In the absence of conclusive evidence of a SNBassociated survival benefit, these patients may be over-treated. Here, we tested the efficiency of 4 different models in predicting SN status. The clinicopathologic data (age, gender, tumor thickness, Clark level, regression, ulceration, histologic subtype, and mitotic index) of 1132 melanoma patients who had undergone SNB at institutions in Italy and Australia were analyzed. Logistic regression, classification tree, random forest, and support vector machine models were fitted to the data. The predictive models were built with the aim of maximizing the negative predictive value (NPV) and reducing the rate of SNB procedures though minimizing the error rate. After cross-validation logistic regression, classification tree, random forest, and support vector machine predictive models obtained clinically relevant NPV (93.6%, 94.0%, 97.1%, and 93.0%, respectively), SNB reduction (27.5%, 29.8%, 18.2%, and 30.1%, respectively), and error rates (1.8%, 1.8%, 0.5%, and 2.1%, respectively). Using commonly available clinicopathologic variables, predictive models can preoperatively identify a proportion of patients ( approximately 25%) who might be spared SNB, with an acceptable (1%-2%) error. If validated in large prospective series, these models might be implemented in the clinical setting for improved patient selection, which ultimately would lead to better quality of life for patients and optimization of resource allocation for the health care system.
Vitamin D and ferritin correlation with chronic neck pain using standard statistics and a novel artificial neural network prediction model.

Science.gov (United States)

Eloqayli, Haytham; Al-Yousef, Ali; Jaradat, Raid

2018-02-15

Despite the high prevalence of chronic neck pain, there is limited consensus about the primary etiology, risk factors, diagnostic criteria and therapeutic outcome. Here, we aimed to determine if Ferritin and Vitamin D are modifiable risk factors with chronic neck pain using slandered statistics and artificial intelligence neural network (ANN). Fifty-four patients with chronic neck pain treated between February 2016 and August 2016 in King Abdullah University Hospital and 54 patients age matched controls undergoing outpatient or minor procedures were enrolled. Patients and control demographic parameters, height, weight and single measurement of serum vitamin D, Vitamin B12, ferritin, calcium, phosphorus, zinc were obtained. An ANN prediction model was developed. The statistical analysis reveals that patients with chronic neck pain have significantly lower serum Vitamin D and Ferritin (p-value artificial neural network can be of future benefit in classification and prediction models for chronic neck pain. We hope this initial work will encourage a future larger cohort study addressing vitamin D and iron correction as modifiable factors and the application of artificial intelligence models in clinical practice.
Using NCAP to predict RFI effects in linear bipolar integrated circuits

Science.gov (United States)

Fang, T.-F.; Whalen, J. J.; Chen, G. K. C.

1980-11-01

Applications of the Nonlinear Circuit Analysis Program (NCAP) to calculate RFI effects in electronic circuits containing discrete semiconductor devices have been reported upon previously. The objective of this paper is to demonstrate that the computer program NCAP also can be used to calcuate RFI effects in linear bipolar integrated circuits (IC's). The IC's reported upon are the microA741 operational amplifier (op amp) which is one of the most widely used IC's, and a differential pair which is a basic building block in many linear IC's. The microA741 op amp was used as the active component in a unity-gain buffer amplifier. The differential pair was used in a broad-band cascode amplifier circuit. The computer program NCAP was used to predict how amplitude-modulated RF signals are demodulated in the IC's to cause undesired low-frequency responses. The predicted and measured results for radio frequencies in the 0.050-60-MHz range are in good agreement.
General quadrupolar statistical anisotropy: Planck limits

Energy Technology Data Exchange (ETDEWEB)

Ramazanov, S. [Gran Sasso Science Institute (INFN), Viale Francesco Crispi 7, I-67100 L' Aquila (Italy); Rubtsov, G. [Institute for Nuclear Research of the Russian Academy of Sciences, Prospect of the 60th Anniversary of October 7a, 117312 Moscow (Russian Federation); Thorsrud, M. [Faculty of Engineering, Østfold University College, P.O. Box 700, 1757 Halden (Norway); Urban, F.R., E-mail: sabir.ramazanov@gssi.infn.it, E-mail: grisha@ms2.inr.ac.ru, E-mail: mikjel.thorsrud@hiof.no, E-mail: federico.urban@kbfi.ee [National Institute of Chemical Physics and Biophysics, Rävala 10, 10143 Tallinn (Estonia)

2017-03-01

Several early Universe scenarios predict a direction-dependent spectrum of primordial curvature perturbations. This translates into the violation of the statistical isotropy of cosmic microwave background radiation. Previous searches for statistical anisotropy mainly focussed on a quadrupolar direction-dependence characterised by a single multipole vector and an overall amplitude g {sub *}. Generically, however, the quadrupole has a more complicated geometry described by two multipole vectors and g {sub *}. This is the subject of the present work. In particular, we limit the amplitude g {sub *} for different shapes of the quadrupole by making use of Planck 2015 maps. We also constrain certain inflationary scenarios which predict this kind of more general quadrupolar statistical anisotropy.
Monthly to seasonal low flow prediction: statistical versus dynamical models

Science.gov (United States)

Ionita-Scholz, Monica; Klein, Bastian; Meissner, Dennis; Rademacher, Silke

2016-04-01

the Alfred Wegener Institute a purely statistical scheme to generate streamflow forecasts for several months ahead. Instead of directly using teleconnection indices (e.g. NAO, AO) the idea is to identify regions with stable teleconnections between different global climate information (e.g. sea surface temperature, geopotential height etc.) and streamflow at different gauges relevant for inland waterway transport. So-called stability (correlation) maps are generated showing regions where streamflow and climate variable from previous months are significantly correlated in a 21 (31) years moving window. Finally, the optimal forecast model is established based on a multiple regression analysis of the stable predictors. We will present current results of the aforementioned approaches with focus on the River Rhine (being one of the world's most frequented waterways and the backbone of the European inland waterway network) and the Elbe River. Overall, our analysis reveals the existence of a valuable predictability of the low flows at monthly and seasonal time scales, a result that may be useful to water resources management. Given that all predictors used in the models are available at the end of each month, the forecast scheme can be used operationally to predict extreme events and to provide early warnings for upcoming low flows.
EFFECT OF MEASUREMENT ERRORS ON PREDICTED COSMOLOGICAL CONSTRAINTS FROM SHEAR PEAK STATISTICS WITH LARGE SYNOPTIC SURVEY TELESCOPE

Energy Technology Data Exchange (ETDEWEB)

Bard, D.; Chang, C.; Kahn, S. M.; Gilmore, K.; Marshall, S. [KIPAC, Stanford University, 452 Lomita Mall, Stanford, CA 94309 (United States); Kratochvil, J. M.; Huffenberger, K. M. [Department of Physics, University of Miami, Coral Gables, FL 33124 (United States); May, M. [Physics Department, Brookhaven National Laboratory, Upton, NY 11973 (United States); AlSayyad, Y.; Connolly, A.; Gibson, R. R.; Jones, L.; Krughoff, S. [Department of Astronomy, University of Washington, Seattle, WA 98195 (United States); Ahmad, Z.; Bankert, J.; Grace, E.; Hannel, M.; Lorenz, S. [Department of Physics, Purdue University, West Lafayette, IN 47907 (United States); Haiman, Z.; Jernigan, J. G., E-mail: djbard@slac.stanford.edu [Department of Astronomy and Astrophysics, Columbia University, New York, NY 10027 (United States); and others

2013-09-01

We study the effect of galaxy shape measurement errors on predicted cosmological constraints from the statistics of shear peak counts with the Large Synoptic Survey Telescope (LSST). We use the LSST Image Simulator in combination with cosmological N-body simulations to model realistic shear maps for different cosmological models. We include both galaxy shape noise and, for the first time, measurement errors on galaxy shapes. We find that the measurement errors considered have relatively little impact on the constraining power of shear peak counts for LSST.
Statistical mechanics in the context of special relativity.

Science.gov (United States)

Kaniadakis, G

2002-11-01

In Ref. [Physica A 296, 405 (2001)], starting from the one parameter deformation of the exponential function exp(kappa)(x)=(sqrt[1+kappa(2)x(2)]+kappax)(1/kappa), a statistical mechanics has been constructed which reduces to the ordinary Boltzmann-Gibbs statistical mechanics as the deformation parameter kappa approaches to zero. The distribution f=exp(kappa)(-beta E+betamu) obtained within this statistical mechanics shows a power law tail and depends on the nonspecified parameter beta, containing all the information about the temperature of the system. On the other hand, the entropic form S(kappa)= integral d(3)p(c(kappa) f(1+kappa)+c(-kappa) f(1-kappa)), which after maximization produces the distribution f and reduces to the standard Boltzmann-Shannon entropy S0 as kappa-->0, contains the coefficient c(kappa) whose expression involves, beside the Boltzmann constant, another nonspecified parameter alpha. In the present effort we show that S(kappa) is the unique existing entropy obtained by a continuous deformation of S0 and preserving unaltered its fundamental properties of concavity, additivity, and extensivity. These properties of S(kappa) permit to determine unequivocally the values of the above mentioned parameters beta and alpha. Subsequently, we explain the origin of the deformation mechanism introduced by kappa and show that this deformation emerges naturally within the Einstein special relativity. Furthermore, we extend the theory in order to treat statistical systems in a time dependent and relativistic context. Then, we show that it is possible to determine in a self consistent scheme within the special relativity the values of the free parameter kappa which results to depend on the light speed c and reduces to zero as c--> infinity recovering in this way the ordinary statistical mechanics and thermodynamics. The statistical mechanics here presented, does not contain free parameters, preserves unaltered the mathematical and epistemological structure of

A statistical-based material and process guidelines for design of carbon nanotube field-effect transistors in gigascale integrated circuits.

Science.gov (United States)

Ghavami, Behnam; Raji, Mohsen; Pedram, Hossein

2011-08-26

Carbon nanotube field-effect transistors (CNFETs) show great promise as building blocks of future integrated circuits. However, synthesizing single-walled carbon nanotubes (CNTs) with accurate chirality and exact positioning control has been widely acknowledged as an exceedingly complex task. Indeed, density and chirality variations in CNT growth can compromise the reliability of CNFET-based circuits. In this paper, we present a novel statistical compact model to estimate the failure probability of CNFETs to provide some material and process guidelines for the design of CNFETs in gigascale integrated circuits. We use measured CNT spacing distributions within the framework of detailed failure analysis to demonstrate that both the CNT density and the ratio of metallic to semiconducting CNTs play dominant roles in defining the failure probability of CNFETs. Besides, it is argued that the large-scale integration of these devices within an integrated circuit will be feasible only if a specific range of CNT density with an acceptable ratio of semiconducting to metallic CNTs can be adjusted in a typical synthesis process.
Statistical learning and prejudice.

Science.gov (United States)

Madison, Guy; Ullén, Fredrik

2012-12-01

Human behavior is guided by evolutionarily shaped brain mechanisms that make statistical predictions based on limited information. Such mechanisms are important for facilitating interpersonal relationships, avoiding dangers, and seizing opportunities in social interaction. We thus suggest that it is essential for analyses of prejudice and prejudice reduction to take the predictive accuracy and adaptivity of the studied prejudices into account.
Modern applied U-statistics

CERN Document Server

Kowalski, Jeanne

2008-01-01

A timely and applied approach to the newly discovered methods and applications of U-statisticsBuilt on years of collaborative research and academic experience, Modern Applied U-Statistics successfully presents a thorough introduction to the theory of U-statistics using in-depth examples and applications that address contemporary areas of study including biomedical and psychosocial research. Utilizing a "learn by example" approach, this book provides an accessible, yet in-depth, treatment of U-statistics, as well as addresses key concepts in asymptotic theory by integrating translational and cross-disciplinary research.The authors begin with an introduction of the essential and theoretical foundations of U-statistics such as the notion of convergence in probability and distribution, basic convergence results, stochastic Os, inference theory, generalized estimating equations, as well as the definition and asymptotic properties of U-statistics. With an emphasis on nonparametric applications when and where applic...
Geostatistical integration and uncertainty in pollutant concentration surface under preferential sampling

Directory of Open Access Journals (Sweden)

Laura Grisotto

2016-04-01

Full Text Available In this paper the focus is on environmental statistics, with the aim of estimating the concentration surface and related uncertainty of an air pollutant. We used air quality data recorded by a network of monitoring stations within a Bayesian framework to overcome difficulties in accounting for prediction uncertainty and to integrate information provided by deterministic models based on emissions meteorology and chemico-physical characteristics of the atmosphere. Several authors have proposed such integration, but all the proposed approaches rely on representativeness and completeness of existing air pollution monitoring networks. We considered the situation in which the spatial process of interest and the sampling locations are not independent. This is known in the literature as the preferential sampling problem, which if ignored in the analysis, can bias geostatistical inferences. We developed a Bayesian geostatistical model to account for preferential sampling with the main interest in statistical integration and uncertainty. We used PM10 data arising from the air quality network of the Environmental Protection Agency of Lombardy Region (Italy and numerical outputs from the deterministic model. We specified an inhomogeneous Poisson process for the sampling locations intensities and a shared spatial random component model for the dependence between the spatial location of monitors and the pollution surface. We found greater predicted standard deviation differences in areas not properly covered by the air quality network. In conclusion, in this context inferences on prediction uncertainty may be misleading when geostatistical modelling does not take into account preferential sampling.
Multivariate Statistics and Supervised Learning for Predictive Detection of Unintentional Islanding in Grid-Tied Solar PV Systems

Directory of Open Access Journals (Sweden)

Shashank Vyas

2016-01-01

Full Text Available Integration of solar photovoltaic (PV generation with power distribution networks leads to many operational challenges and complexities. Unintentional islanding is one of them which is of rising concern given the steady increase in grid-connected PV power. This paper builds up on an exploratory study of unintentional islanding on a modeled radial feeder having large PV penetration. Dynamic simulations, also run in real time, resulted in exploration of unique potential causes of creation of accidental islands. The resulting voltage and current data underwent dimensionality reduction using principal component analysis (PCA which formed the basis for the application of Q statistic control charts for detecting the anomalous currents that could island the system. For reducing the false alarm rate of anomaly detection, Kullback-Leibler (K-L divergence was applied on the principal component projections which concluded that Q statistic based approach alone is not reliable for detection of the symptoms liable to cause unintentional islanding. The obtained data was labeled and a K-nearest neighbor (K-NN binomial classifier was then trained for identification and classification of potential islanding precursors from other power system transients. The three-phase short-circuit fault case was successfully identified as statistically different from islanding symptoms.
Overview of statistical methods, models and analysis for predicting equipment end of life

Energy Technology Data Exchange (ETDEWEB)

NONE

2009-07-01

Utility equipment can be operated and maintained for many years following installation. However, as the equipment ages, utility operators must decide whether to extend the service life or replace the equipment. Condition assessment modelling is used by many utilities to determine the condition of equipment and to prioritize the maintenance or repair. Several factors are weighted and combined in assessment modelling, which gives a single index number to rate the equipment. There is speculation that this index alone may not be adequate for a business case to rework or replace an asset because it only ranks an asset into a particular category. For that reason, a new methodology was developed to determine the economic end of life of an asset. This paper described the different statistical methods available and their use in determining the remaining service life of electrical equipment. A newly developed Excel-based demonstration computer tool is also an integral part of the deliverables of this project.
Factors Influencing College Women's Contraceptive Behavior: An Application of the Integrative Model of Behavioral Prediction

Science.gov (United States)

Sutton, Jazmyne A.; Walsh-Buhi, Eric R.

2017-01-01

Objective: This study investigated variables within the Integrative Model of Behavioral Prediction (IMBP) as well as differences across socioeconomic status (SES) levels within the context of inconsistent contraceptive use among college women. Participants: A nonprobability sample of 515 female college students completed an Internet-based survey…
An integrated prediction and optimization model of biogas production system at a wastewater treatment facility.

Science.gov (United States)

Akbaş, Halil; Bilgen, Bilge; Turhan, Aykut Melih

2015-11-01

This study proposes an integrated prediction and optimization model by using multi-layer perceptron neural network and particle swarm optimization techniques. Three different objective functions are formulated. The first one is the maximization of methane percentage with single output. The second one is the maximization of biogas production with single output. The last one is the maximization of biogas quality and biogas production with two outputs. Methane percentage, carbon dioxide percentage, and other contents' percentage are used as the biogas quality criteria. Based on the formulated models and data from a wastewater treatment facility, optimal values of input variables and their corresponding maximum output values are found out for each model. It is expected that the application of the integrated prediction and optimization models increases the biogas production and biogas quality, and contributes to the quantity of electricity production at the wastewater treatment facility. Copyright © 2015 Elsevier Ltd. All rights reserved.
Prediction of thrombophilia in patients with unexplained recurrent pregnancy loss using a statistical model.

Science.gov (United States)

Wang, Tongfei; Kang, Xiaomin; He, Liying; Liu, Zhilan; Xu, Haijing; Zhao, Aimin

2017-09-01

To establish a statistical model to predict thrombophilia in patients with unexplained recurrent pregnancy loss (URPL). A retrospective case-control study was conducted at Ren Ji Hospital, Shanghai, China, from March 2014 to October 2016. The levels of D-dimer (DD), fibrinogen degradation products (FDP), activated partial thromboplastin time (APTT), prothrombin time (PT), thrombin time (TT), fibrinogen (Fg), and platelet aggregation in response to arachidonic acid (AA) and adenosine diphosphate (ADP) were collected. Receiver operating characteristic curve analysis was used to analyze data from 158 UPRL patients (≥3 previous first trimester pregnancy losses with unexplained etiology) and 131 non-RPL patients (no history of recurrent pregnancy loss). A logistic regression model (LRM) was built and the model was externally validated in another group of patients. The LRM included AA, DD, FDP, TT, APTT, and PT. The overall accuracy of the LRM was 80.9%, with sensitivity and specificity of 78.5% and 78.3%, respectively. The diagnostic threshold of the possibility of the LRM was 0.6492, with a sensitivity of 78.5% and a specificity of 78.3%. Subsequently, the LRM was validated with an overall accuracy of 83.6%. The LRM is a valuable model for prediction of thrombophilia in URPL patients. © 2017 International Federation of Gynecology and Obstetrics.
Integration of research infrastructures and ecosystem models toward development of predictive ecology

Science.gov (United States)

Luo, Y.; Huang, Y.; Jiang, J.; MA, S.; Saruta, V.; Liang, G.; Hanson, P. J.; Ricciuto, D. M.; Milcu, A.; Roy, J.

2017-12-01

The past two decades have witnessed rapid development in sensor technology. Built upon the sensor development, large research infrastructure facilities, such as National Ecological Observatory Network (NEON) and FLUXNET, have been established. Through networking different kinds of sensors and other data collections at many locations all over the world, those facilities generate large volumes of ecological data every day. The big data from those facilities offer an unprecedented opportunity for advancing our understanding of ecological processes, educating teachers and students, supporting decision-making, and testing ecological theory. The big data from the major research infrastructure facilities also provides foundation for developing predictive ecology. Indeed, the capability to predict future changes in our living environment and natural resources is critical to decision making in a world where the past is no longer a clear guide to the future. We are living in a period marked by rapid climate change, profound alteration of biogeochemical cycles, unsustainable depletion of natural resources, and deterioration of air and water quality. Projecting changes in future ecosystem services to the society becomes essential not only for science but also for policy making. We will use this panel format to outline major opportunities and challenges in integrating research infrastructure and ecosystem models toward developing predictive ecology. Meanwhile, we will also show results from an interactive model-experiment System - Ecological Platform for Assimilating Data into models (EcoPAD) - that have been implemented at the Spruce and Peatland Responses Under Climatic and Environmental change (SPRUCE) experiment in Northern Minnesota and Montpellier Ecotron, France. EcoPAD is developed by integrating web technology, eco-informatics, data assimilation techniques, and ecosystem modeling. EcoPAD is designed to streamline data transfer seamlessly from research infrastructure
Statistical analysis in the design of nuclear fuel cells and training of a neural network to predict safety parameters for reactors BWR

International Nuclear Information System (INIS)

Jauregui Ch, V.

2013-01-01

In this work the obtained results for a statistical analysis are shown, with the purpose of studying the performance of the fuel lattice, taking into account the frequency of the pins that were used. For this objective, different statistical distributions were used; one approximately to normal, another type X 2 but in an inverse form and a random distribution. Also, the prediction of some parameters of the nuclear reactor in a fuel reload was made through a neuronal network, which was trained. The statistical analysis was made using the parameters of the fuel lattice, which was generated through three heuristic techniques: Ant Colony Optimization System, Neuronal Networks and a hybrid among Scatter Search and Path Re linking. The behavior of the local power peak factor was revised in the fuel lattice with the use of different frequencies of enrichment uranium pines, using the three techniques mentioned before, in the same way the infinite multiplication factor of neutrons was analyzed (k..), to determine within what range this factor in the reactor is. Taking into account all the information, which was obtained through the statistical analysis, a neuronal network was trained; that will help to predict the behavior of some parameters of the nuclear reactor, considering a fixed fuel reload with their respective control rods pattern. In the same way, the quality of the training was evaluated using different fuel lattices. The neuronal network learned to predict the next parameters: Shutdown Margin (SDM), the pin burn peaks for two different fuel batches, Thermal Limits and the Effective Neutron Multiplication Factor (k eff ). The results show that the fuel lattices in which the frequency, which the inverted form of the X 2 distribution, was used revealed the best values of local power peak factor. Additionally it is shown that the performance of a fuel lattice could be enhanced controlling the frequency of the uranium enrichment rods and the variety of the gadolinium
Understanding Eating Behaviors through Parental Communication and the Integrative Model of Behavioral Prediction.

Science.gov (United States)

Scheinfeld, Emily; Shim, Minsun

2017-05-01

Emerging adulthood (EA) is an important yet overlooked period for developing long-term health behaviors. During these years, emerging adults adopt health behaviors that persist throughout life. This study applies the Integrative Model of Behavioral Prediction (IMBP) to examine the role of childhood parental communication in predicting engagement in healthful eating during EA. Participants included 239 college students, ages 18 to 25, from a large university in the southern United States. Participants were recruited and data collection occurred spring 2012. Participants responded to measures to assess perceived parental communication, eating behaviors, attitudes, subjective norms, and behavioral control over healthful eating. SEM and mediation analyses were used to address the hypotheses posited. Data demonstrated that perceived parent-child communication - specifically, its quality and target-specific content - significantly predicted emerging adults' eating behaviors, mediated through subjective norm and perceived behavioral control. This study sets the stage for further exploration and understanding of different ways parental communication influences emerging adults' healthy behavior enactment.
Predictive capacity of a non-radioisotopic local lymph node assay using flow cytometry, LLNA:BrdU-FCM: Comparison of a cutoff approach and inferential statistics.

Science.gov (United States)

Kim, Da-Eun; Yang, Hyeri; Jang, Won-Hee; Jung, Kyoung-Mi; Park, Miyoung; Choi, Jin Kyu; Jung, Mi-Sook; Jeon, Eun-Young; Heo, Yong; Yeo, Kyung-Wook; Jo, Ji-Hoon; Park, Jung Eun; Sohn, Soo Jung; Kim, Tae Sung; Ahn, Il Young; Jeong, Tae-Cheon; Lim, Kyung-Min; Bae, SeungJin

2016-01-01

In order for a novel test method to be applied for regulatory purposes, its reliability and relevance, i.e., reproducibility and predictive capacity, must be demonstrated. Here, we examine the predictive capacity of a novel non-radioisotopic local lymph node assay, LLNA:BrdU-FCM (5-bromo-2'-deoxyuridine-flow cytometry), with a cutoff approach and inferential statistics as a prediction model. 22 reference substances in OECD TG429 were tested with a concurrent positive control, hexylcinnamaldehyde 25%(PC), and the stimulation index (SI) representing the fold increase in lymph node cells over the vehicle control was obtained. The optimal cutoff SI (2.7≤cutoff <3.5), with respect to predictive capacity, was obtained by a receiver operating characteristic curve, which produced 90.9% accuracy for the 22 substances. To address the inter-test variability in responsiveness, SI values standardized with PC were employed to obtain the optimal percentage cutoff (42.6≤cutoff <57.3% of PC), which produced 86.4% accuracy. A test substance may be diagnosed as a sensitizer if a statistically significant increase in SI is elicited. The parametric one-sided t-test and non-parametric Wilcoxon rank-sum test produced 77.3% accuracy. Similarly, a test substance could be defined as a sensitizer if the SI means of the vehicle control, and of the low, middle, and high concentrations were statistically significantly different, which was tested using ANOVA or Kruskal-Wallis, with post hoc analysis, Dunnett, or DSCF (Dwass-Steel-Critchlow-Fligner), respectively, depending on the equal variance test, producing 81.8% accuracy. The absolute SI-based cutoff approach produced the best predictive capacity, however the discordant decisions between prediction models need to be examined further. Copyright © 2015 Elsevier Inc. All rights reserved.
Direct integration of intensity-level data from Affymetrix and Illumina microarrays improves statistical power for robust reanalysis

Directory of Open Access Journals (Sweden)

Turnbull Arran K

2012-08-01

Full Text Available Abstract Background Affymetrix GeneChips and Illumina BeadArrays are the most widely used commercial single channel gene expression microarrays. Public data repositories are an extremely valuable resource, providing array-derived gene expression measurements from many thousands of experiments. Unfortunately many of these studies are underpowered and it is desirable to improve power by combining data from more than one study; we sought to determine whether platform-specific bias precludes direct integration of probe intensity signals for combined reanalysis. Results Using Affymetrix and Illumina data from the microarray quality control project, from our own clinical samples, and from additional publicly available datasets we evaluated several approaches to directly integrate intensity level expression data from the two platforms. After mapping probe sequences to Ensembl genes we demonstrate that, ComBat and cross platform normalisation (XPN, significantly outperform mean-centering and distance-weighted discrimination (DWD in terms of minimising inter-platform variance. In particular we observed that DWD, a popular method used in a number of previous studies, removed systematic bias at the expense of genuine biological variability, potentially reducing legitimate biological differences from integrated datasets. Conclusion Normalised and batch-corrected intensity-level data from Affymetrix and Illumina microarrays can be directly combined to generate biologically meaningful results with improved statistical power for robust, integrated reanalysis.
SU-F-BRB-10: A Statistical Voxel Based Normal Organ Dose Prediction Model for Coplanar and Non-Coplanar Prostate Radiotherapy

Energy Technology Data Exchange (ETDEWEB)

Tran, A; Yu, V; Nguyen, D; Woods, K; Low, D; Sheng, K [UCLA, Los Angeles, CA (United States)

2015-06-15

Purpose: Knowledge learned from previous plans can be used to guide future treatment planning. Existing knowledge-based treatment planning methods study the correlation between organ geometry and dose volume histogram (DVH), which is a lossy representation of the complete dose distribution. A statistical voxel dose learning (SVDL) model was developed that includes the complete dose volume information. Its accuracy of predicting volumetric-modulated arc therapy (VMAT) and non-coplanar 4π radiotherapy was quantified. SVDL provided more isotropic dose gradients and may improve knowledge-based planning. Methods: 12 prostate SBRT patients originally treated using two full-arc VMAT techniques were re-planned with 4π using 20 intensity-modulated non-coplanar fields to a prescription dose of 40 Gy. The bladder and rectum voxels were binned based on their distances to the PTV. The dose distribution in each bin was resampled by convolving to a Gaussian kernel, resulting in 1000 data points in each bin that predicted the statistical dose information of a voxel with unknown dose in a new patient without triaging information that may be collectively important to a particular patient. We used this method to predict the DVHs, mean and max doses in a leave-one-out cross validation (LOOCV) test and compared its performance against lossy estimators including mean, median, mode, Poisson and Rayleigh of the voxelized dose distributions. Results: SVDL predicted the bladder and rectum doses more accurately than other estimators, giving mean percentile errors ranging from 13.35–19.46%, 4.81–19.47%, 22.49–28.69%, 23.35–30.5%, 21.05–53.93% for predicting mean, max dose, V20, V35, and V40 respectively, to OARs in both planning techniques. The prediction errors were generally lower for 4π than VMAT. Conclusion: By employing all dose volume information in the SVDL model, the OAR doses were more accurately predicted. 4π plans are better suited for knowledge-based planning than
Calculation of solar irradiation prediction intervals combining volatility and kernel density estimates

International Nuclear Information System (INIS)

Trapero, Juan R.

2016-01-01

In order to integrate solar energy into the grid it is important to predict the solar radiation accurately, where forecast errors can lead to significant costs. Recently, the increasing statistical approaches that cope with this problem is yielding a prolific literature. In general terms, the main research discussion is centred on selecting the “best” forecasting technique in accuracy terms. However, the need of the users of such forecasts require, apart from point forecasts, information about the variability of such forecast to compute prediction intervals. In this work, we will analyze kernel density estimation approaches, volatility forecasting models and combination of both of them in order to improve the prediction intervals performance. The results show that an optimal combination in terms of prediction interval statistical tests can achieve the desired confidence level with a lower average interval width. Data from a facility located in Spain are used to illustrate our methodology. - Highlights: • This work explores uncertainty forecasting models to build prediction intervals. • Kernel density estimators, exponential smoothing and GARCH models are compared. • An optimal combination of methods provides the best results. • A good compromise between coverage and average interval width is shown.
Bootstrap prediction and Bayesian prediction under misspecified models

OpenAIRE

Fushiki, Tadayoshi

2005-01-01

We consider a statistical prediction problem under misspecified models. In a sense, Bayesian prediction is an optimal prediction method when an assumed model is true. Bootstrap prediction is obtained by applying Breiman's `bagging' method to a plug-in prediction. Bootstrap prediction can be considered to be an approximation to the Bayesian prediction under the assumption that the model is true. However, in applications, there are frequently deviations from the assumed model. In this paper, bo...
Integrated GIS and multivariate statistical analysis for regional scale assessment of heavy metal soil contamination: A critical review

International Nuclear Information System (INIS)

Hou, Deyi; O'Connor, David; Nathanail, Paul; Tian, Li; Ma, Yan

2017-01-01

Heavy metal soil contamination is associated with potential toxicity to humans or ecotoxicity. Scholars have increasingly used a combination of geographical information science (GIS) with geostatistical and multivariate statistical analysis techniques to examine the spatial distribution of heavy metals in soils at a regional scale. A review of such studies showed that most soil sampling programs were based on grid patterns and composite sampling methodologies. Many programs intended to characterize various soil types and land use types. The most often used sampling depth intervals were 0–0.10 m, or 0–0.20 m, below surface; and the sampling densities used ranged from 0.0004 to 6.1 samples per km 2 , with a median of 0.4 samples per km 2 . The most widely used spatial interpolators were inverse distance weighted interpolation and ordinary kriging; and the most often used multivariate statistical analysis techniques were principal component analysis and cluster analysis. The review also identified several determining and correlating factors in heavy metal distribution in soils, including soil type, soil pH, soil organic matter, land use type, Fe, Al, and heavy metal concentrations. The major natural and anthropogenic sources of heavy metals were found to derive from lithogenic origin, roadway and transportation, atmospheric deposition, wastewater and runoff from industrial and mining facilities, fertilizer application, livestock manure, and sewage sludge. This review argues that the full potential of integrated GIS and multivariate statistical analysis for assessing heavy metal distribution in soils on a regional scale has not yet been fully realized. It is proposed that future research be conducted to map multivariate results in GIS to pinpoint specific anthropogenic sources, to analyze temporal trends in addition to spatial patterns, to optimize modeling parameters, and to expand the use of different multivariate analysis tools beyond principal component
Prediction of dimethyl disulfide levels from biosolids using statistical modeling.

Science.gov (United States)

Gabriel, Steven A; Vilalai, Sirapong; Arispe, Susanna; Kim, Hyunook; McConnell, Laura L; Torrents, Alba; Peot, Christopher; Ramirez, Mark

2005-01-01

Two statistical models were used to predict the concentration of dimethyl disulfide (DMDS) released from biosolids produced by an advanced wastewater treatment plant (WWTP) located in Washington, DC, USA. The plant concentrates sludge from primary sedimentation basins in gravity thickeners (GT) and sludge from secondary sedimentation basins in dissolved air flotation (DAF) thickeners. The thickened sludge is pumped into blending tanks and then fed into centrifuges for dewatering. The dewatered sludge is then conditioned with lime before trucking out from the plant. DMDS, along with other volatile sulfur and nitrogen-containing chemicals, is known to contribute to biosolids odors. These models identified oxidation/reduction potential (ORP) values of a GT and DAF, the amount of sludge dewatered by centrifuges, and the blend ratio between GT thickened sludge and DAF thickened sludge in blending tanks as control variables. The accuracy of the developed regression models was evaluated by checking the adjusted R2 of the regression as well as the signs of coefficients associated with each variable. In general, both models explained observed DMDS levels in sludge headspace samples. The adjusted R2 value of the regression models 1 and 2 were 0.79 and 0.77, respectively. Coefficients for each regression model also had the correct sign. Using the developed models, plant operators can adjust the controllable variables to proactively decrease this odorant. Therefore, these models are a useful tool in biosolids management at WWTPs.
Ultimate compression after impact load prediction in graphite/epoxy coupons using neural network and multivariate statistical analyses

Science.gov (United States)

Gregoire, Alexandre David

2011-07-01

The goal of this research was to accurately predict the ultimate compressive load of impact damaged graphite/epoxy coupons using a Kohonen self-organizing map (SOM) neural network and multivariate statistical regression analysis (MSRA). An optimized use of these data treatment tools allowed the generation of a simple, physically understandable equation that predicts the ultimate failure load of an impacted damaged coupon based uniquely on the acoustic emissions it emits at low proof loads. Acoustic emission (AE) data were collected using two 150 kHz resonant transducers which detected and recorded the AE activity given off during compression to failure of thirty-four impacted 24-ply bidirectional woven cloth laminate graphite/epoxy coupons. The AE quantification parameters duration, energy and amplitude for each AE hit were input to the Kohonen self-organizing map (SOM) neural network to accurately classify the material failure mechanisms present in the low proof load data. The number of failure mechanisms from the first 30% of the loading for twenty-four coupons were used to generate a linear prediction equation which yielded a worst case ultimate load prediction error of 16.17%, just outside of the +/-15% B-basis allowables, which was the goal for this research. Particular emphasis was placed upon the noise removal process which was largely responsible for the accuracy of the results.

Archaeology Through Computational Linguistics: Inscription Statistics Predict Excavation Sites of Indus Valley Artifacts.

Science.gov (United States)

Recchia, Gabriel L; Louwerse, Max M

2016-11-01

Computational techniques comparing co-occurrences of city names in texts allow the relative longitudes and latitudes of cities to be estimated algorithmically. However, these techniques have not been applied to estimate the provenance of artifacts with unknown origins. Here, we estimate the geographic origin of artifacts from the Indus Valley Civilization, applying methods commonly used in cognitive science to the Indus script. We show that these methods can accurately predict the relative locations of archeological sites on the basis of artifacts of known provenance, and we further apply these techniques to determine the most probable excavation sites of four sealings of unknown provenance. These findings suggest that inscription statistics reflect historical interactions among locations in the Indus Valley region, and they illustrate how computational methods can help localize inscribed archeological artifacts of unknown origin. The success of this method offers opportunities for the cognitive sciences in general and for computational anthropology specifically. Copyright © 2015 Cognitive Science Society, Inc.
Development of the statistical ARIMA model: an application for predicting the upcoming of MJO index

Science.gov (United States)

Hermawan, Eddy; Nurani Ruchjana, Budi; Setiawan Abdullah, Atje; Gede Nyoman Mindra Jaya, I.; Berliana Sipayung, Sinta; Rustiana, Shailla

2017-10-01

This study is mainly concerned in development one of the most important equatorial atmospheric phenomena that we call as the Madden Julian Oscillation (MJO) which having strong impacts to the extreme rainfall anomalies over the Indonesian Maritime Continent (IMC). In this study, we focused to the big floods over Jakarta and surrounded area that suspecting caused by the impacts of MJO. We concentrated to develop the MJO index using the statistical model that we call as Box-Jenkis (ARIMA) ini 1996, 2002, and 2007, respectively. They are the RMM (Real Multivariate MJO) index as represented by RMM1 and RMM2, respectively. There are some steps to develop that model, starting from identification of data, estimated, determined model, before finally we applied that model for investigation some big floods that occurred at Jakarta in 1996, 2002, and 2007 respectively. We found the best of estimated model for the RMM1 and RMM2 prediction is ARIMA (2,1,2). Detailed steps how that model can be extracted and applying to predict the rainfall anomalies over Jakarta for 3 to 6 months later is discussed at this paper.
Predicting the distribution of four species of raptors (Aves: Accipitridae) in southern Spain: statistical models work better than existing maps

OpenAIRE

Bustamante, Javier; Seoane, Javier

2004-01-01

Aim To test the effectiveness of statistical models based on explanatory environmental variables vs. existing distribution information (maps and breeding atlas), for predicting the distribution of four species of raptors (family Accipitridae): common buzzard Buteo buteo (Linnaeus, 1758), short-toed eagle Circaetus gallicus (Gmelin, 1788), booted eagle Hieraaetus pennatus (Gmelin, 1788) and black kite Milvus migrans (Boddaert, 1783). Location Andalusia, southe...
TRAN-STAT: statistics for environmental studies

International Nuclear Information System (INIS)

Gilbert, R.O.

1984-09-01

This issue of TRAN-STAT discusses statistical methods for assessing the uncertainty in predictions of pollutant transport models, particularly for radionuclides. Emphasis is placed on radionuclide transport models but the statistical assessment techniques also apply in general to other types of pollutants. The report begins with an outline of why an assessment of prediction uncertainties is important. This is followed by an introduction to several methods currently used in these assessments. This in turn is followed by more detailed discussion of the methods, including examples. 43 references, 2 figures
Comparison of experimental pulse-height distributions in germanium detectors with integrated-tiger-series-code predictions

International Nuclear Information System (INIS)

Beutler, D.E.; Halbleib, J.A.; Knott, D.P.

1989-01-01

This paper reports pulse-height distributions in two different types of Ge detectors measured for a variety of medium-energy x-ray bremsstrahlung spectra. These measurements have been compared to predictions using the integrated tiger series (ITS) Monte Carlo electron/photon transport code. In general, the authors find excellent agreement between experiments and predictions using no free parameters. These results demonstrate that the ITS codes can predict the combined bremsstrahlung production and energy deposition with good precision (within measurement uncertainties). The one region of disagreement observed occurs for low-energy (<50 keV) photons using low-energy bremsstrahlung spectra. In this case the ITS codes appear to underestimate the produced and/or absorbed radiation by almost an order of magnitude
A statistical forecast model using the time-scale decomposition technique to predict rainfall during flood period over the middle and lower reaches of the Yangtze River Valley

Science.gov (United States)

Hu, Yijia; Zhong, Zhong; Zhu, Yimin; Ha, Yao

2018-04-01

In this paper, a statistical forecast model using the time-scale decomposition method is established to do the seasonal prediction of the rainfall during flood period (FPR) over the middle and lower reaches of the Yangtze River Valley (MLYRV). This method decomposites the rainfall over the MLYRV into three time-scale components, namely, the interannual component with the period less than 8 years, the interdecadal component with the period from 8 to 30 years, and the interdecadal component with the period larger than 30 years. Then, the predictors are selected for the three time-scale components of FPR through the correlation analysis. At last, a statistical forecast model is established using the multiple linear regression technique to predict the three time-scale components of the FPR, respectively. The results show that this forecast model can capture the interannual and interdecadal variation of FPR. The hindcast of FPR during 14 years from 2001 to 2014 shows that the FPR can be predicted successfully in 11 out of the 14 years. This forecast model performs better than the model using traditional scheme without time-scale decomposition. Therefore, the statistical forecast model using the time-scale decomposition technique has good skills and application value in the operational prediction of FPR over the MLYRV.
Meta-analysis of prediction model performance across multiple studies: Which scale helps ensure between-study normality for the C-statistic and calibration measures?

Science.gov (United States)

Snell, Kym Ie; Ensor, Joie; Debray, Thomas Pa; Moons, Karel Gm; Riley, Richard D

2017-01-01

If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and therefore we recommend these scales to be used for meta-analysis. An illustrated example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.
Fractional statistics and quantum scaling properties of the integrable Penson-Kolb-Hubbard chain

Science.gov (United States)

Vitoriano, Carlindo; Coutinho-Filho, M. D.

2010-09-01

We investigate the ground-state and low-temperature properties of the integrable version of the Penson-Kolb-Hubbard chain. The model obeys fractional statistical properties, which give rise to fractional elementary excitations and manifest differently in the four regions of the phase diagram U/t versus n , where U is the Coulomb coupling, t is the correlated hopping amplitude, and n is the particle density. In fact, we can find local pair formation, fractionalization of the average occupation number per orbital k , or U - and n -dependent average electric charge per orbital k . We also study the scaling behavior near the U -driven quantum phase transitions and characterize their universality classes. Finally, it is shown that in the regime of parameters where local pair formation is energetically more favorable, the ground state exhibits power-law superconductivity; we also stress that above half filling the pair-hopping term stabilizes local Cooper pairs in the repulsive- U regime for U
Integrating environmental and genetic effects to predict responses of tree populations to climate.

Science.gov (United States)

Wang, Tongli; O'Neill, Gregory A; Aitken, Sally N

2010-01-01

Climate is a major environmental factor affecting the phenotype of trees and is also a critical agent of natural selection that has molded among-population genetic variation. Population response functions describe the environmental effect of planting site climates on the performance of a single population, whereas transfer functions describe among-population genetic variation molded by natural selection for climate. Although these approaches are widely used to predict the responses of trees to climate change, both have limitations. We present a novel approach that integrates both genetic and environmental effects into a single "universal response function" (URF) to better predict the influence of climate on phenotypes. Using a large lodgepole pine (Pinus contorta Dougl. ex Loud.) field transplant experiment composed of 140 populations planted on 62 sites to demonstrate the methodology, we show that the URF makes full use of data from provenance trials to: (1) improve predictions of climate change impacts on phenotypes; (2) reduce the size and cost of future provenance trials without compromising predictive power; (3) more fully exploit existing, less comprehensive provenance tests; (4) quantify and compare environmental and genetic effects of climate on population performance; and (5) predict the performance of any population growing in any climate. Finally, we discuss how the last attribute allows the URF to be used as a mechanistic model to predict population and species ranges for the future and to guide assisted migration of seed for reforestation, restoration, or afforestation and genetic conservation in a changing climate.
Predicting sugar consumption: Application of an integrated dual-process, dual-phase model.

Science.gov (United States)

Hagger, Martin S; Trost, Nadine; Keech, Jacob J; Chan, Derwin K C; Hamilton, Kyra

2017-09-01

Excess consumption of added dietary sugars is related to multiple metabolic problems and adverse health conditions. Identifying the modifiable social cognitive and motivational constructs that predict sugar consumption is important to inform behavioral interventions aimed at reducing sugar intake. We tested the efficacy of an integrated dual-process, dual-phase model derived from multiple theories to predict sugar consumption. Using a prospective design, university students (N = 90) completed initial measures of the reflective (autonomous and controlled motivation, intentions, attitudes, subjective norm, perceived behavioral control), impulsive (implicit attitudes), volitional (action and coping planning), and behavioral (past sugar consumption) components of the proposed model. Self-reported sugar consumption was measured two weeks later. A structural equation model revealed that intentions, implicit attitudes, and, indirectly, autonomous motivation to reduce sugar consumption had small, significant effects on sugar consumption. Attitudes, subjective norm, and, indirectly, autonomous motivation to reduce sugar consumption predicted intentions. There were no effects of the planning constructs. Model effects were independent of the effects of past sugar consumption. The model identified the relative contribution of reflective and impulsive components in predicting sugar consumption. Given the prominent role of the impulsive component, interventions that assist individuals in managing cues-to-action and behavioral monitoring are likely to be effective in regulating sugar consumption. Copyright © 2017 Elsevier Ltd. All rights reserved.
An integrated multi-label classifier with chemical-chemical interactions for prediction of chemical toxicity effects.

Science.gov (United States)

Liu, Tao; Chen, Lei; Pan, Xiaoyong

2018-05-31

Chemical toxicity effect is one of the major reasons for declining candidate drugs. Detecting the toxicity effects of all chemicals can accelerate the procedures of drug discovery. However, it is time-consuming and expensive to identify the toxicity effects of a given chemical through traditional experiments. Designing quick, reliable and non-animal-involved computational methods is an alternative way. In this study, a novel integrated multi-label classifier was proposed. First, based on five types of chemical-chemical interactions retrieved from STITCH, each of which is derived from one aspect of chemicals, five individual classifiers were built. Then, several integrated classifiers were built by integrating some or all individual classifiers. By testing the integrated classifiers on a dataset with chemicals and their toxicity effects in Accelrys Toxicity database and non-toxic chemicals with their performance evaluated by jackknife test, an optimal integrated classifier was selected as the proposed classifier, which provided quite high prediction accuracies and wide applications. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Analysis of gene expression profiles of soft tissue sarcoma using a combination of knowledge-based filtering with integration of multiple statistics.

Directory of Open Access Journals (Sweden)

Anna Takahashi

Full Text Available The diagnosis and treatment of soft tissue sarcomas (STS have been difficult. Of the diverse histological subtypes, undifferentiated pleomorphic sarcoma (UPS is particularly difficult to diagnose accurately, and its classification per se is still controversial. Recent advances in genomic technologies provide an excellent way to address such problems. However, it is often difficult, if not impossible, to identify definitive disease-associated genes using genome-wide analysis alone, primarily because of multiple testing problems. In the present study, we analyzed microarray data from 88 STS patients using a combination method that used knowledge-based filtering and a simulation based on the integration of multiple statistics to reduce multiple testing problems. We identified 25 genes, including hypoxia-related genes (e.g., MIF, SCD1, P4HA1, ENO1, and STAT1 and cell cycle- and DNA repair-related genes (e.g., TACC3, PRDX1, PRKDC, and H2AFY. These genes showed significant differential expression among histological subtypes, including UPS, and showed associations with overall survival. STAT1 showed a strong association with overall survival in UPS patients (logrank p = 1.84 × 10(-6 and adjusted p value 2.99 × 10(-3 after the permutation test. According to the literature, the 25 genes selected are useful not only as markers of differential diagnosis but also as prognostic/predictive markers and/or therapeutic targets for STS. Our combination method can identify genes that are potential prognostic/predictive factors and/or therapeutic targets in STS and possibly in other cancers. These disease-associated genes deserve further preclinical and clinical validation.
Predictive analysis of beer quality by correlating sensory evaluation with higher alcohol and ester production using multivariate statistics methods.

Science.gov (United States)

Dong, Jian-Jun; Li, Qing-Liang; Yin, Hua; Zhong, Cheng; Hao, Jun-Guang; Yang, Pan-Fei; Tian, Yu-Hong; Jia, Shi-Ru

2014-10-15

Sensory evaluation is regarded as a necessary procedure to ensure a reproducible quality of beer. Meanwhile, high-throughput analytical methods provide a powerful tool to analyse various flavour compounds, such as higher alcohol and ester. In this study, the relationship between flavour compounds and sensory evaluation was established by non-linear models such as partial least squares (PLS), genetic algorithm back-propagation neural network (GA-BP), support vector machine (SVM). It was shown that SVM with a Radial Basis Function (RBF) had a better performance of prediction accuracy for both calibration set (94.3%) and validation set (96.2%) than other models. Relatively lower prediction abilities were observed for GA-BP (52.1%) and PLS (31.7%). In addition, the kernel function of SVM played an essential role of model training when the prediction accuracy of SVM with polynomial kernel function was 32.9%. As a powerful multivariate statistics method, SVM holds great potential to assess beer quality. Copyright © 2014 Elsevier Ltd. All rights reserved.
A Standardized Evaluation System for Decadal Climate Prediction

Science.gov (United States)

Kadow, C.; Cubasch, U.

2012-12-01

The evaluation of decadal prediction systems is a scientific challenge as well as a technical challenge in the climate research. The major project MiKlip (www.fona-miklip.de) for medium-term climate prediction funded by the Federal Ministry of Education and Research in Germany (BMBF) has the aim to create a model system that can provide reliable decadal forecasts on climate and weather. The model system to be developed will be novel in several aspects, with great challenges for the methodology development. This concerns especially the determination of the initial conditions, the inclusion into the model of processes relevant to decadal predictions, the increase of the spatial resolution through regionalisation, the improvement or adjustment of statistical post-processing, and finally the synthesis and validation of the entire model system. Therefore, a standardized evaluation system will be part of the MiKlip system to validate it - developed by the project 'Integrated data and evaluation system for decadal scale prediction' (INTEGRATION). The presentation gives an overview of the different linkages of such a project, shows the different development stages and gives an outlook for users and possible end users in climate service. The technical interface combines all projects inside of MiKlip and invites them to participate in a common evaluation system. The system design and the validation strategy from a standalone tool in the beginning to a user friendly web based system using GRID technologies to an integrated part of the operational MiKlip system for industry and society will give the opportunity to enhance the MiKlip strategy. First results of different possibilities of such a system will be shown to present the scientific background through Taylor diagrams, ensemble skill scores and e.g. climatological means to show the usability and possibilities of MiKlip and the INTEGRATION project.
Predicting Self-Management Behaviors in Familial Hypercholesterolemia Using an Integrated Theoretical Model: the Impact of Beliefs About Illnesses and Beliefs About Behaviors.

Science.gov (United States)

Hagger, Martin S; Hardcastle, Sarah J; Hingley, Catherine; Strickland, Ella; Pang, Jing; Watts, Gerald F

2016-06-01

Patients with familial hypercholesterolemia (FH) are at markedly increased risk of coronary artery disease. Regular participation in three self-management behaviors, physical activity, healthy eating, and adherence to medication, can significantly reduce this risk in FH patients. We aimed to predict intentions to engage in these self-management behaviors in FH patients using a multi-theory, integrated model that makes the distinction between beliefs about illness and beliefs about self-management behaviors. Using a cross-sectional, correlational design, patients (N = 110) diagnosed with FH from a clinic in Perth, Western Australia, self-completed a questionnaire that measured constructs from three health behavior theories: the common sense model of illness representations (serious consequences, timeline, personal control, treatment control, illness coherence, emotional representations); theory of planned behavior (attitudes, subjective norms, perceived behavioral control); and social cognitive theory (self-efficacy). Structural equation models for each self-management behavior revealed consistent and statistically significant effects of attitudes on intentions across the three behaviors. Subjective norms predicted intentions for health eating only and self-efficacy predicted intentions for physical activity only. There were no effects for the perceived behavioral control and common sense model constructs in any model. Attitudes feature prominently in determining intentions to engage in self-management behaviors in FH patients. The prominence of these attitudinal beliefs about self-management behaviors, as opposed to illness beliefs, suggest that addressing these beliefs may be a priority in the management of FH.
Integration of association statistics over genomic regions using Bayesian adaptive regression splines

Directory of Open Access Journals (Sweden)

Zhang Xiaohua

2003-11-01

Full Text Available Abstract In the search for genetic determinants of complex disease, two approaches to association analysis are most often employed, testing single loci or testing a small group of loci jointly via haplotypes for their relationship to disease status. It is still debatable which of these approaches is more favourable, and under what conditions. The former has the advantage of simplicity but suffers severely when alleles at the tested loci are not in linkage disequilibrium (LD with liability alleles; the latter should capture more of the signal encoded in LD, but is far from simple. The complexity of haplotype analysis could be especially troublesome for association scans over large genomic regions, which, in fact, is becoming the standard design. For these reasons, the authors have been evaluating statistical methods that bridge the gap between single-locus and haplotype-based tests. In this article, they present one such method, which uses non-parametric regression techniques embodied by Bayesian adaptive regression splines (BARS. For a set of markers falling within a common genomic region and a corresponding set of single-locus association statistics, the BARS procedure integrates these results into a single test by examining the class of smooth curves consistent with the data. The non-parametric BARS procedure generally finds no signal when no liability allele exists in the tested region (ie it achieves the specified size of the test and it is sensitive enough to pick up signals when a liability allele is present. The BARS procedure provides a robust and potentially powerful alternative to classical tests of association, diminishes the multiple testing problem inherent in those tests and can be applied to a wide range of data types, including genotype frequencies estimated from pooled samples.
Global identification predicts gay-male identity integration and well-being among Turkish gay men.

Science.gov (United States)

Koc, Yasin; Vignoles, Vivian L

2016-12-01

In most parts of the world, hegemonic masculinity requires men to endorse traditional masculine ideals, one of which is rejection of homosexuality. Wherever hegemonic masculinity favours heterosexuality over homosexuality, gay males may feel under pressure to negotiate their conflicting male gender and gay sexual identities to maintain positive self-perceptions. However, globalization, as a source of intercultural interaction, might provide a beneficial context for people wishing to create alternative masculinities in the face of hegemonic masculinity. Hence, we tested if global identification would predict higher levels of gay-male identity integration, and indirectly subjective well-being, via alternative masculinity representations for gay and male identities. A community sample of 219 gay and bisexual men from Turkey completed the study. Structural equation modelling revealed that global identification positively predicted gay-male identity integration, and indirectly subjective well-being; however, alternative masculinity representations did not mediate this relationship. Our findings illustrate how identity categories in different domains can intersect and affect each other in complex ways. Moreover, we discuss mental health and well-being implications for gay men living in cultures where they experience high levels of prejudice and stigma. © 2016 The British Psychological Society.
An integrated numerical model for the prediction of Gaussian and billet shapes

DEFF Research Database (Denmark)

Hattel, Jesper; Pryds, Nini; Pedersen, Trine Bjerre

2004-01-01

Separate models for the atomisation and the deposition stages were recently integrated by the authors to form a unified model describing the entire spray-forming process. In the present paper, the focus is on describing the shape of the deposited material during the spray-forming process, obtained...... by this model. After a short review of the models and their coupling, the important factors which influence the resulting shape, i.e. Gaussian or billet, are addressed. The key parameters, which are utilized to predict the geometry and dimension of the deposited material, are the sticking efficiency...
An integrated computational validation approach for potential novel miRNA prediction

Directory of Open Access Journals (Sweden)

Pooja Viswam

2017-12-01

Full Text Available MicroRNAs (miRNAs are short, non-coding RNAs between 17bp-24bp length that regulate gene expression by targeting mRNA molecules. The regulatory functions of miRNAs are known to be majorly associated with disease phenotypes such as cancer, cell signaling, cell division, growth and other metabolisms. Novel miRNAs are defined as sequences which does not have any similarity with the existing known sequences and void of any experimental evidences. In recent decades, the advent of next-generation sequencing allows us to capture the small RNA molecules form the cells and developing methods to estimate their expression levels. Several computational algorithms are available to predict the novel miRNAs from the deep sequencing data. In this work, we integrated three novel miRNA prediction programs miRDeep, miRanalyzer and miRPRo to compare and validate their prediction efficiency. The dicer cleavage sites, alignment density, seed conservation, minimum free energy, AU-GC percentage, secondary loop scores, false discovery rates and confidence scores will be considered for comparison and evaluation. Efficiency to identify isomiRs and base pair mismatches in a strand specific manner will also be considered for the computational validation. Further, the criteria and parameters for the identification of the best possible novel miRNA with minimal false positive rates were deduced.
Integration of relational and hierarchical network information for protein function prediction

Directory of Open Access Journals (Sweden)

Jiang Xiaoyu

2008-08-01

Full Text Available Abstract Background In the current climate of high-throughput computational biology, the inference of a protein's function from related measurements, such as protein-protein interaction relations, has become a canonical task. Most existing technologies pursue this task as a classification problem, on a term-by-term basis, for each term in a database, such as the Gene Ontology (GO database, a popular rigorous vocabulary for biological functions. However, ontology structures are essentially hierarchies, with certain top to bottom annotation rules which protein function predictions should in principle follow. Currently, the most common approach to imposing these hierarchical constraints on network-based classifiers is through the use of transitive closure to predictions. Results We propose a probabilistic framework to integrate information in relational data, in the form of a protein-protein interaction network, and a hierarchically structured database of terms, in the form of the GO database, for the purpose of protein function prediction. At the heart of our framework is a factorization of local neighborhood information in the protein-protein interaction network across successive ancestral terms in the GO hierarchy. We introduce a classifier within this framework, with computationally efficient implementation, that produces GO-term predictions that naturally obey a hierarchical 'true-path' consistency from root to leaves, without the need for further post-processing. Conclusion A cross-validation study, using data from the yeast Saccharomyces cerevisiae, shows our method offers substantial improvements over both standard 'guilt-by-association' (i.e., Nearest-Neighbor and more refined Markov random field methods, whether in their original form or when post-processed to artificially impose 'true-path' consistency. Further analysis of the results indicates that these improvements are associated with increased predictive capabilities (i.e., increased

Detailed comparison of next-to-leading order predictions for jet photoproduction at HERA.

Energy Technology Data Exchange (ETDEWEB)

Harris, B. W.; Klassen, M.; Vossebeld, J.

1999-06-02

The precision of new HERA data on jet photoproduction opens up the possibility to discriminate between different models of the photon structure. This requires equally precise theoretical predictions from perturbative QCD calculations. In the past years, next-to-leading order calculations for the photoproduction of jets at HERA have become available. Using the kinematic cuts of recent ZEUS analyses, we compare the predictions of three calculations for different dijet and three-jet distributions. We find that in general all three calculations agree within the statistical accuracy of the Monte Carlo integration yielding reliable theoretical predictions. In certain restricted regions of phase space, the calculations differ by up to 5%.
Statistical sampling approaches for soil monitoring

NARCIS (Netherlands)

Brus, D.J.

2014-01-01

This paper describes three statistical sampling approaches for regional soil monitoring, a design-based, a model-based and a hybrid approach. In the model-based approach a space-time model is exploited to predict global statistical parameters of interest such as the space-time mean. In the hybrid
Population activity statistics dissect subthreshold and spiking variability in V1.

Science.gov (United States)

Bányai, Mihály; Koman, Zsombor; Orbán, Gergő

2017-07-01

Response variability, as measured by fluctuating responses upon repeated performance of trials, is a major component of neural responses, and its characterization is key to interpret high dimensional population recordings. Response variability and covariability display predictable changes upon changes in stimulus and cognitive or behavioral state, providing an opportunity to test the predictive power of models of neural variability. Still, there is little agreement on which model to use as a building block for population-level analyses, and models of variability are often treated as a subject of choice. We investigate two competing models, the doubly stochastic Poisson (DSP) model assuming stochasticity at spike generation, and the rectified Gaussian (RG) model tracing variability back to membrane potential variance, to analyze stimulus-dependent modulation of both single-neuron and pairwise response statistics. Using a pair of model neurons, we demonstrate that the two models predict similar single-cell statistics. However, DSP and RG models have contradicting predictions on the joint statistics of spiking responses. To test the models against data, we build a population model to simulate stimulus change-related modulations in pairwise response statistics. We use single-unit data from the primary visual cortex (V1) of monkeys to show that while model predictions for variance are qualitatively similar to experimental data, only the RG model's predictions are compatible with joint statistics. These results suggest that models using Poisson-like variability might fail to capture important properties of response statistics. We argue that membrane potential-level modeling of stochasticity provides an efficient strategy to model correlations. NEW & NOTEWORTHY Neural variability and covariability are puzzling aspects of cortical computations. For efficient decoding and prediction, models of information encoding in neural populations hinge on an appropriate model of
Statistical Modelling of Wind Proles - Data Analysis and Modelling

DEFF Research Database (Denmark)

Jónsson, Tryggvi; Pinson, Pierre

The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles.......The aim of the analysis presented in this document is to investigate whether statistical models can be used to make very short-term predictions of wind profiles....
Integrative miRNA-Gene Expression Analysis Enables Refinement of Associated Biology and Prediction of Response to Cetuximab in Head and Neck Squamous Cell Cancer

Directory of Open Access Journals (Sweden)

Loris De Cecco

2017-01-01

Full Text Available This paper documents the process by which we, through gene and miRNA expression profiling of the same samples of head and neck squamous cell carcinomas (HNSCC and an integrative miRNA-mRNA expression analysis, were able to identify candidate biomarkers of progression-free survival (PFS in patients treated with cetuximab-based approaches. Through sparse partial least square–discriminant analysis (sPLS-DA and supervised analysis, 36 miRNAs were identified in two components that clearly separated long- and short-PFS patients. Gene set enrichment analysis identified a significant correlation between the miRNA first-component and EGFR signaling, keratinocyte differentiation, and p53. Another significant correlation was identified between the second component and RAS, NOTCH, immune/inflammatory response, epithelial–mesenchymal transition (EMT, and angiogenesis pathways. Regularized canonical correlation analysis of sPLS-DA miRNA and gene data combined with the MAGIA2 web-tool highlighted 16 miRNAs and 84 genes that were interconnected in a total of 245 interactions. After feature selection by a smoothed t-statistic support vector machine, we identified three miRNAs and five genes in the miRNA-gene network whose expression result was the most relevant in predicting PFS (Area Under the Curve, AUC = 0.992. Overall, using a well-defined clinical setting and up-to-date bioinformatics tools, we are able to give the proof of principle that an integrative miRNA-mRNA expression could greatly contribute to the refinement of the biology behind a predictive model.
PGT: A Statistical Approach to Prediction and Mechanism Design

Science.gov (United States)

Wolpert, David H.; Bono, James W.

One of the biggest challenges facing behavioral economics is the lack of a single theoretical framework that is capable of directly utilizing all types of behavioral data. One of the biggest challenges of game theory is the lack of a framework for making predictions and designing markets in a manner that is consistent with the axioms of decision theory. An approach in which solution concepts are distribution-valued rather than set-valued (i.e. equilibrium theory) has both capabilities. We call this approach Predictive Game Theory (or PGT). This paper outlines a general Bayesian approach to PGT. It also presents one simple example to illustrate the way in which this approach differs from equilibrium approaches in both prediction and mechanism design settings.
Multivariate statistical models for disruption prediction at ASDEX Upgrade

International Nuclear Information System (INIS)

Aledda, R.; Cannas, B.; Fanni, A.; Sias, G.; Pautasso, G.

2013-01-01

In this paper, a disruption prediction system for ASDEX Upgrade has been proposed that does not require disruption terminated experiments to be implemented. The system consists of a data-based model, which is built using only few input signals coming from successfully terminated pulses. A fault detection and isolation approach has been used, where the prediction is based on the analysis of the residuals of an auto regressive exogenous input model. The prediction performance of the proposed system is encouraging when it is applied to the same set of campaigns used to implement the model. However, the false alarms significantly increase when we tested the system on discharges coming from experimental campaigns temporally far from those used to train the model. This is due to the well know aging effect inherent in the data-based models. The main advantage of the proposed method, with respect to other data-based approaches in literature, is that it does not need data on experiments terminated with a disruption, as it uses a normal operating conditions model. This is a big advantage in the prospective of a prediction system for ITER, where a limited number of disruptions can be allowed
Statistical model based gender prediction for targeted NGS clinical panels

Directory of Open Access Journals (Sweden)

Palani Kannan Kandavel

2017-12-01

The reference test dataset are being used to test the model. The sensitivity on predicting the gender has been increased from the current “genotype composition in ChrX” based approach. In addition, the prediction score given by the model can be used to evaluate the quality of clinical dataset. The higher prediction score towards its respective gender indicates the higher quality of sequenced data.
Manufacturing Squares: An Integrative Statistical Process Control Exercise

Science.gov (United States)

Coy, Steven P.

2016-01-01

In the exercise, students in a junior-level operations management class are asked to manufacture a simple product. Given product specifications, they must design a production process, create roles and design jobs for each team member, and develop a statistical process control plan that efficiently and effectively controls quality during…
An Intelligent Model for Stock Market Prediction

Directory of Open Access Journals (Sweden)

IbrahimM. Hamed

2012-08-01

Full Text Available This paper presents an intelligent model for stock market signal prediction using Multi-Layer Perceptron (MLP Artificial Neural Networks (ANN. Blind source separation technique, from signal processing, is integrated with the learning phase of the constructed baseline MLP ANN to overcome the problems of prediction accuracy and lack of generalization. Kullback Leibler Divergence (KLD is used, as a learning algorithm, because it converges fast and provides generalization in the learning mechanism. Both accuracy and efficiency of the proposed model were confirmed through the Microsoft stock, from wall-street market, and various data sets, from different sectors of the Egyptian stock market. In addition, sensitivity analysis was conducted on the various parameters of the model to ensure the coverage of the generalization issue. Finally, statistical significance was examined using ANOVA test.
Predicting spatial and temporal gene expression using an integrative model of transcription factor occupancy and chromatin state.

Directory of Open Access Journals (Sweden)

Bartek Wilczynski

Full Text Available Precise patterns of spatial and temporal gene expression are central to metazoan complexity and act as a driving force for embryonic development. While there has been substantial progress in dissecting and predicting cis-regulatory activity, our understanding of how information from multiple enhancer elements converge to regulate a gene's expression remains elusive. This is in large part due to the number of different biological processes involved in mediating regulation as well as limited availability of experimental measurements for many of them. Here, we used a Bayesian approach to model diverse experimental regulatory data, leading to accurate predictions of both spatial and temporal aspects of gene expression. We integrated whole-embryo information on transcription factor recruitment to multiple cis-regulatory modules, insulator binding and histone modification status in the vicinity of individual gene loci, at a genome-wide scale during Drosophila development. The model uses Bayesian networks to represent the relation between transcription factor occupancy and enhancer activity in specific tissues and stages. All parameters are optimized in an Expectation Maximization procedure providing a model capable of predicting tissue- and stage-specific activity of new, previously unassayed genes. Performing the optimization with subsets of input data demonstrated that neither enhancer occupancy nor chromatin state alone can explain all gene expression patterns, but taken together allow for accurate predictions of spatio-temporal activity. Model predictions were validated using the expression patterns of more than 600 genes recently made available by the BDGP consortium, demonstrating an average 15-fold enrichment of genes expressed in the predicted tissue over a naïve model. We further validated the model by experimentally testing the expression of 20 predicted target genes of unknown expression, resulting in an accuracy of 95% for temporal
Integrating sequence stratigraphy and rock-physics to interpret seismic amplitudes and predict reservoir quality

Science.gov (United States)

Dutta, Tanima

This dissertation focuses on the link between seismic amplitudes and reservoir properties. Prediction of reservoir properties, such as sorting, sand/shale ratio, and cement-volume from seismic amplitudes improves by integrating knowledge from multiple disciplines. The key contribution of this dissertation is to improve the prediction of reservoir properties by integrating sequence stratigraphy and rock physics. Sequence stratigraphy has been successfully used for qualitative interpretation of seismic amplitudes to predict reservoir properties. Rock physics modeling allows quantitative interpretation of seismic amplitudes. However, often there is uncertainty about selecting geologically appropriate rock physics model and its input parameters, away from the wells. In the present dissertation, we exploit the predictive power of sequence stratigraphy to extract the spatial trends of sedimentological parameters that control seismic amplitudes. These spatial trends of sedimentological parameters can serve as valuable constraints in rock physics modeling, especially away from the wells. Consequently, rock physics modeling, integrated with the trends from sequence stratigraphy, become useful for interpreting observed seismic amplitudes away from the wells in terms of underlying sedimentological parameters. We illustrate this methodology using a comprehensive dataset from channelized turbidite systems, deposited in minibasin settings in the offshore Equatorial Guinea, West Africa. First, we present a practical recipe for using closed-form expressions of effective medium models to predict seismic velocities in unconsolidated sandstones. We use an effective medium model that combines perfectly rough and smooth grains (the extended Walton model), and use that model to derive coordination number, porosity, and pressure relations for P and S wave velocities from experimental data. Our recipe provides reasonable fits to other experimental and borehole data, and specifically
An integrated Modelling framework to monitor and predict trends of agricultural management (iMSoil)

Science.gov (United States)

Keller, Armin; Della Peruta, Raneiro; Schaepman, Michael; Gomez, Marta; Mann, Stefan; Schulin, Rainer

2014-05-01

Agricultural systems lay at the interface between natural ecosystems and the anthroposphere. Various drivers induce pressures on the agricultural systems, leading to changes in farming practice. The limitation of available land and the socio-economic drivers are likely to result in further intensification of agricultural land management, with implications on fertilization practices, soil and pest management, as well as crop and livestock production. In order to steer the development into desired directions, tools are required by which the effects of these pressures on agricultural management and resulting impacts on soil functioning can be detected as early as possible, future scenarios predicted and suitable management options and policies defined. In this context, the use of integrated models can play a major role in providing long-term predictions of soil quality and assessing the sustainability of agricultural soil management. Significant progress has been made in this field over the last decades. Some of these integrated modelling frameworks include biophysical parameters, but often the inherent characteristics and detailed processes of the soil system have been very simplified. The development of such tools has been hampered in the past by a lack of spatially explicit soil and land management information at regional scale. The iMSoil project, funded by the Swiss National Science Foundation in the national research programme NRP68 "soil as a resource" (www.nrp68.ch) aims at developing and implementing an integrated modeling framework (IMF) which can overcome the limitations mentioned above, by combining socio-economic, agricultural land management, and biophysical models, in order to predict the long-term impacts of different socio-economic scenarios on the soil quality. In our presentation we briefly outline the approach that is based on an interdisciplinary modular framework that builds on already existing monitoring tools and model components that are
A comparison of artificial neural networks with other statistical approaches for the prediction of true metabolizable energy of meat and bone meal.

Science.gov (United States)

Perai, A H; Nassiri Moghaddam, H; Asadpour, S; Bahrampour, J; Mansoori, Gh

2010-07-01

There has been a considerable and continuous interest to develop equations for rapid and accurate prediction of the ME of meat and bone meal. In this study, an artificial neural network (ANN), a partial least squares (PLS), and a multiple linear regression (MLR) statistical method were used to predict the TME(n) of meat and bone meal based on its CP, ether extract, and ash content. The accuracy of the models was calculated by R(2) value, MS error, mean absolute percentage error, mean absolute deviation, bias, and Theil's U. The predictive ability of an ANN was compared with a PLS and a MLR model using the same training data sets. The squared regression coefficients of prediction for the MLR, PLS, and ANN models were 0.38, 0.36, and 0.94, respectively. The results revealed that ANN produced more accurate predictions of TME(n) as compared with PLS and MLR methods. Based on the results of this study, ANN could be used as a promising approach for rapid prediction of nutritive value of meat and bone meal.
Thermodynamics and statistical mechanics an integrated approach

CERN Document Server

Hardy, Robert J

2014-01-01

This textbook brings together the fundamentals of the macroscopic and microscopic aspects of thermal physics by presenting thermodynamics and statistical mechanics as complementary theories based on small numbers of postulates. The book is designed to give the instructor flexibility in structuring courses for advanced undergraduates and/or beginning graduate students and is written on the principle that a good text should also be a good reference. The presentation of thermodynamics follows the logic of Clausius and Kelvin while relating the concepts involved to familiar phenomena and the mod
Wind power prediction models

Science.gov (United States)

Levy, R.; Mcginness, H.

1976-01-01

Investigations were performed to predict the power available from the wind at the Goldstone, California, antenna site complex. The background for power prediction was derived from a statistical evaluation of available wind speed data records at this location and at nearby locations similarly situated within the Mojave desert. In addition to a model for power prediction over relatively long periods of time, an interim simulation model that produces sample wind speeds is described. The interim model furnishes uncorrelated sample speeds at hourly intervals that reproduce the statistical wind distribution at Goldstone. A stochastic simulation model to provide speed samples representative of both the statistical speed distributions and correlations is also discussed.
PSPP: a protein structure prediction pipeline for computing clusters.

Directory of Open Access Journals (Sweden)

Michael S Lee

2009-07-01

Full Text Available Protein structures are critical for understanding the mechanisms of biological systems and, subsequently, for drug and vaccine design. Unfortunately, protein sequence data exceed structural data by a factor of more than 200 to 1. This gap can be partially filled by using computational protein structure prediction. While structure prediction Web servers are a notable option, they often restrict the number of sequence queries and/or provide a limited set of prediction methodologies. Therefore, we present a standalone protein structure prediction software package suitable for high-throughput structural genomic applications that performs all three classes of prediction methodologies: comparative modeling, fold recognition, and ab initio. This software can be deployed on a user's own high-performance computing cluster.The pipeline consists of a Perl core that integrates more than 20 individual software packages and databases, most of which are freely available from other research laboratories. The query protein sequences are first divided into domains either by domain boundary recognition or Bayesian statistics. The structures of the individual domains are then predicted using template-based modeling or ab initio modeling. The predicted models are scored with a statistical potential and an all-atom force field. The top-scoring ab initio models are annotated by structural comparison against the Structural Classification of Proteins (SCOP fold database. Furthermore, secondary structure, solvent accessibility, transmembrane helices, and structural disorder are predicted. The results are generated in text, tab-delimited, and hypertext markup language (HTML formats. So far, the pipeline has been used to study viral and bacterial proteomes.The standalone pipeline that we introduce here, unlike protein structure prediction Web servers, allows users to devote their own computing assets to process a potentially unlimited number of queries as well as perform
Significant Statistics: Viewed with a Contextual Lens

Science.gov (United States)

Tait-McCutcheon, Sandi

2010-01-01

This paper examines the pedagogical and organisational changes three lead teachers made to their statistics teaching and learning programs. The lead teachers posed the research question: What would the effect of contextually integrating statistical investigations and literacies into other curriculum areas be on student achievement? By finding the…
Statistical learning in social action contexts.

Science.gov (United States)

Monroy, Claire; Meyer, Marlene; Gerson, Sarah; Hunnius, Sabine

2017-01-01

Sensitivity to the regularities and structure contained within sequential, goal-directed actions is an important building block for generating expectations about the actions we observe. Until now, research on statistical learning for actions has solely focused on individual action sequences, but many actions in daily life involve multiple actors in various interaction contexts. The current study is the first to investigate the role of statistical learning in tracking regularities between actions performed by different actors, and whether the social context characterizing their interaction influences learning. That is, are observers more likely to track regularities across actors if they are perceived as acting jointly as opposed to in parallel? We tested adults and toddlers to explore whether social context guides statistical learning and-if so-whether it does so from early in development. In a between-subjects eye-tracking experiment, participants were primed with a social context cue between two actors who either shared a goal of playing together ('Joint' condition) or stated the intention to act alone ('Parallel' condition). In subsequent videos, the actors performed sequential actions in which, for certain action pairs, the first actor's action reliably predicted the second actor's action. We analyzed predictive eye movements to upcoming actions as a measure of learning, and found that both adults and toddlers learned the statistical regularities across actors when their actions caused an effect. Further, adults with high statistical learning performance were sensitive to social context: those who observed actors with a shared goal were more likely to correctly predict upcoming actions. In contrast, there was no effect of social context in the toddler group, regardless of learning performance. These findings shed light on how adults and toddlers perceive statistical regularities across actors depending on the nature of the observed social situation and the
Baseline Assessment and Prioritization Framework for IVHM Integrity Assurance Enabling Capabilities

Science.gov (United States)

Cooper, Eric G.; DiVito, Benedetto L.; Jacklin, Stephen A.; Miner, Paul S.

2009-01-01

Fundamental to vehicle health management is the deployment of systems incorporating advanced technologies for predicting and detecting anomalous conditions in highly complex and integrated environments. Integrated structural integrity health monitoring, statistical algorithms for detection, estimation, prediction, and fusion, and diagnosis supporting adaptive control are examples of advanced technologies that present considerable verification and validation challenges. These systems necessitate interactions between physical and software-based systems that are highly networked with sensing and actuation subsystems, and incorporate technologies that are, in many respects, different from those employed in civil aviation today. A formidable barrier to deploying these advanced technologies in civil aviation is the lack of enabling verification and validation tools, methods, and technologies. The development of new verification and validation capabilities will not only enable the fielding of advanced vehicle health management systems, but will also provide new assurance capabilities for verification and validation of current generation aviation software which has been implicated in anomalous in-flight behavior. This paper describes the research focused on enabling capabilities for verification and validation underway within NASA s Integrated Vehicle Health Management project, discusses the state of the art of these capabilities, and includes a framework for prioritizing activities.

Lies, damn lies and statistics

International Nuclear Information System (INIS)

Jones, M.D.

2001-01-01

Statistics are widely employed within archaeological research. This is becoming increasingly so as user friendly statistical packages make increasingly sophisticated analyses available to non statisticians. However, all statistical techniques are based on underlying assumptions of which the end user may be unaware. If statistical analyses are applied in ignorance of the underlying assumptions there is the potential for highly erroneous inferences to be drawn. This does happen within archaeology and here this is illustrated with the example of 'date pooling', a technique that has been widely misused in archaeological research. This misuse may have given rise to an inevitable and predictable misinterpretation of New Zealand's archaeological record. (author). 10 refs., 6 figs., 1 tab
MEGADOCK-Web: an integrated database of high-throughput structure-based protein-protein interaction predictions.

Science.gov (United States)

Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka

2018-05-08

Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structures because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database to gather prediction results and their predicted 3D complex structures and to make them easily accessible. Although several databases exist that provide predicted PPIs, the previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times the number of PPI predictions than previous databases and enables users to conduct PPI predictions that cannot be found in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. MEGADOCK-Web provides a huge amount of comprehensive PPI predictions based on
Energy optimization and prediction of complex petrochemical industries using an improved artificial neural network approach integrating data envelopment analysis

International Nuclear Information System (INIS)

Han, Yong-Ming; Geng, Zhi-Qiang; Zhu, Qun-Xiong

2016-01-01

Graphical abstract: This paper proposed an energy optimization and prediction of complex petrochemical industries based on a DEA-integrated ANN approach (DEA-ANN). The proposed approach utilizes the DEA model with slack variables for sensitivity analysis to determine the effective decision making units (DMUs) and indicate the optimized direction of the ineffective DMUs. Compared with the traditional ANN approach, the DEA-ANN prediction model is effectively verified by executing a linear comparison between all DMUs and the effective DMUs through the standard data source from the UCI (University of California at Irvine) repository. Finally, the proposed model is validated through an application in a complex ethylene production system of China petrochemical industry. Meanwhile, the optimization result and the prediction value are obtained to reduce energy consumption of the ethylene production system, guide ethylene production and improve energy efficiency. - Highlights: • The DEA-integrated ANN approach is proposed. • The DEA-ANN prediction model is effectively verified through the standard data source from the UCI repository. • The energy optimization and prediction framework of complex petrochemical industries based on the proposed method is obtained. • The proposed method is valid and efficient in improvement of energy efficiency in complex petrochemical plants. - Abstract: Since the complex petrochemical data have characteristics of multi-dimension, uncertainty and noise, it is difficult to accurately optimize and predict the energy usage of complex petrochemical systems. Therefore, this paper proposes a data envelopment analysis (DEA) integrated artificial neural network (ANN) approach (DEA-ANN). The proposed approach utilizes the DEA model with slack variables for sensitivity analysis to determine the effective decision making units (DMUs) and indicate the optimized direction of the ineffective DMUs. Compared with the traditional ANN approach, the DEA
Statistics of return intervals between long heartbeat intervals and their usability for online prediction of disorders

International Nuclear Information System (INIS)

Bogachev, Mikhail I; Bunde, Armin; Kireenkov, Igor S; Nifontov, Eugene M

2009-01-01

We study the statistics of return intervals between large heartbeat intervals (above a certain threshold Q) in 24 h records obtained from healthy subjects. We find that both the linear and the nonlinear long-term memory inherent in the heartbeat intervals lead to power-laws in the probability density function P Q (r) of the return intervals. As a consequence, the probability W Q (t; Δt) that at least one large heartbeat interval will occur within the next Δt heartbeat intervals, with an increasing elapsed number of intervals t after the last large heartbeat interval, follows a power-law. Based on these results, we suggest a method of obtaining a priori information about the occurrence of the next large heartbeat interval, and thus to predict it. We show explicitly that the proposed method, which exploits long-term memory, is superior to the conventional precursory pattern recognition technique, which focuses solely on short-term memory. We believe that our results can be straightforwardly extended to obtain more reliable predictions in other physiological signals like blood pressure, as well as in other complex records exhibiting multifractal behaviour, e.g. turbulent flow, precipitation, river flows and network traffic.
Joint probability of statistical success of multiple phase III trials.

Science.gov (United States)

Zhang, Jianliang; Zhang, Jenny J

2013-01-01

In drug development, after completion of phase II proof-of-concept trials, the sponsor needs to make a go/no-go decision to start expensive phase III trials. The probability of statistical success (PoSS) of the phase III trials based on data from earlier studies is an important factor in that decision-making process. Instead of statistical power, the predictive power of a phase III trial, which takes into account the uncertainty in the estimation of treatment effect from earlier studies, has been proposed to evaluate the PoSS of a single trial. However, regulatory authorities generally require statistical significance in two (or more) trials for marketing licensure. We show that the predictive statistics of two future trials are statistically correlated through use of the common observed data from earlier studies. Thus, the joint predictive power should not be evaluated as a simplistic product of the predictive powers of the individual trials. We develop the relevant formulae for the appropriate evaluation of the joint predictive power and provide numerical examples. Our methodology is further extended to the more complex phase III development scenario comprising more than two (K > 2) trials, that is, the evaluation of the PoSS of at least k₀ (k₀≤ K) trials from a program of K total trials. Copyright © 2013 John Wiley & Sons, Ltd.
Statistical Engineering in Air Traffic Management Research

Science.gov (United States)

Wilson, Sara R.

2015-01-01

NASA is working to develop an integrated set of advanced technologies to enable efficient arrival operations in high-density terminal airspace for the Next Generation Air Transportation System. This integrated arrival solution is being validated and verified in laboratories and transitioned to a field prototype for an operational demonstration at a major U.S. airport. Within NASA, this is a collaborative effort between Ames and Langley Research Centers involving a multi-year iterative experimentation process. Designing and analyzing a series of sequential batch computer simulations and human-in-the-loop experiments across multiple facilities and simulation environments involves a number of statistical challenges. Experiments conducted in separate laboratories typically have different limitations and constraints, and can take different approaches with respect to the fundamental principles of statistical design of experiments. This often makes it difficult to compare results from multiple experiments and incorporate findings into the next experiment in the series. A statistical engineering approach is being employed within this project to support risk-informed decision making and maximize the knowledge gained within the available resources. This presentation describes a statistical engineering case study from NASA, highlights statistical challenges, and discusses areas where existing statistical methodology is adapted and extended.
Developing and validating a new precise risk-prediction model for new-onset hypertension: The Jichi Genki hypertension prediction model (JG model).

Science.gov (United States)

Kanegae, Hiroshi; Oikawa, Takamitsu; Suzuki, Kenji; Okawara, Yukie; Kario, Kazuomi

2018-03-31

No integrated risk assessment tools that include lifestyle factors and uric acid have been developed. In accordance with the Industrial Safety and Health Law in Japan, a follow-up examination of 63 495 normotensive individuals (mean age 42.8 years) who underwent a health checkup in 2010 was conducted every year for 5 years. The primary endpoint was new-onset hypertension (systolic blood pressure [SBP]/diastolic blood pressure [DBP] ≥ 140/90 mm Hg and/or the initiation of antihypertensive medications with self-reported hypertension). During the mean 3.4 years of follow-up, 7402 participants (11.7%) developed hypertension. The prediction model included age, sex, body mass index (BMI), SBP, DBP, low-density lipoprotein cholesterol, uric acid, proteinuria, current smoking, alcohol intake, eating rate, DBP by age, and BMI by age at baseline and was created by using Cox proportional hazards models to calculate 3-year absolute risks. The derivation analysis confirmed that the model performed well both with respect to discrimination and calibration (n = 63 495; C-statistic = 0.885, 95% confidence interval [CI], 0.865-0.903; χ 2 statistic = 13.6, degree of freedom [df] = 7). In the external validation analysis, moreover, the model performed well both in its discrimination and calibration characteristics (n = 14 168; C-statistic = 0.846; 95%CI, 0.775-0.905; χ 2 statistic = 8.7, df = 7). Adding LDL cholesterol, uric acid, proteinuria, alcohol intake, eating rate, and BMI by age to the base model yielded a significantly higher C-statistic, net reclassification improvement (NRI), and integrated discrimination improvement, especially NRI non-event (NRI = 0.127, 95%CI = 0.100-0.152; NRI non-event = 0.108, 95%CI = 0.102-0.117). In conclusion, a highly precise model with good performance was developed for predicting incident hypertension using the new parameters of eating rate, uric acid, proteinuria, and BMI by age. ©2018 Wiley Periodicals, Inc.
Predicting protein-protein interactions in Arabidopsis thaliana through integration of orthology, gene ontology and co-expression

Directory of Open Access Journals (Sweden)

Vandepoele Klaas

2009-06-01

Full Text Available Abstract Background Large-scale identification of the interrelationships between different components of the cell, such as the interactions between proteins, has recently gained great interest. However, unraveling large-scale protein-protein interaction maps is laborious and expensive. Moreover, assessing the reliability of the interactions can be cumbersome. Results In this study, we have developed a computational method that exploits the existing knowledge on protein-protein interactions in diverse species through orthologous relations on the one hand, and functional association data on the other hand to predict and filter protein-protein interactions in Arabidopsis thaliana. A highly reliable set of protein-protein interactions is predicted through this integrative approach making use of existing protein-protein interaction data from yeast, human, C. elegans and D. melanogaster. Localization, biological process, and co-expression data are used as powerful indicators for protein-protein interactions. The functional repertoire of the identified interactome reveals interactions between proteins functioning in well-conserved as well as plant-specific biological processes. We observe that although common mechanisms (e.g. actin polymerization and components (e.g. ARPs, actin-related proteins exist between different lineages, they are active in specific processes such as growth, cancer metastasis and trichome development in yeast, human and Arabidopsis, respectively. Conclusion We conclude that the integration of orthology with functional association data is adequate to predict protein-protein interactions. Through this approach, a high number of novel protein-protein interactions with diverse biological roles is discovered. Overall, we have predicted a reliable set of protein-protein interactions suitable for further computational as well as experimental analyses.
Addressing the statistical mechanics of planet orbits in the solar system

Science.gov (United States)

Mogavero, Federico

2017-10-01

The chaotic nature of planet dynamics in the solar system suggests the relevance of a statistical approach to planetary orbits. In such a statistical description, the time-dependent position and velocity of the planets are replaced by the probability density function (PDF) of their orbital elements. It is natural to set up this kind of approach in the framework of statistical mechanics. In the present paper, I focus on the collisionless excitation of eccentricities and inclinations via gravitational interactions in a planetary system. The future planet trajectories in the solar system constitute the prototype of this kind of dynamics. I thus address the statistical mechanics of the solar system planet orbits and try to reproduce the PDFs numerically constructed by Laskar (2008, Icarus, 196, 1). I show that the microcanonical ensemble of the Laplace-Lagrange theory accurately reproduces the statistics of the giant planet orbits. To model the inner planets I then investigate the ansatz of equiprobability in the phase space constrained by the secular integrals of motion. The eccentricity and inclination PDFs of Earth and Venus are reproduced with no free parameters. Within the limitations of a stationary model, the predictions also show a reasonable agreement with Mars PDFs and that of Mercury inclination. The eccentricity of Mercury demands in contrast a deeper analysis. I finally revisit the random walk approach of Laskar to the time dependence of the inner planet PDFs. Such a statistical theory could be combined with direct numerical simulations of planet trajectories in the context of planet formation, which is likely to be a chaotic process.
A Network Integration Approach to Predict Conserved Regulators Related to Pathogenicity of Influenza and SARS-CoV Respiratory Viruses

Energy Technology Data Exchange (ETDEWEB)

Mitchell, Hugh D.; Eisfeld, Amie J.; Sims, Amy; McDermott, Jason E.; Matzke, Melissa M.; Webb-Robertson, Bobbie-Jo M.; Tilton, Susan C.; Tchitchek, Nicholas; Josset, Laurence; Li, Chengjun; Ellis, Amy L.; Chang, Jean H.; Heegel, Robert A.; Luna, Maria L.; Schepmoes, Athena A.; Shukla, Anil K.; Metz, Thomas O.; Neumann, Gabriele; Benecke, Arndt; Smith, Richard D.; Baric, Ralph; Kawaoka, Yoshihiro; Katze, Michael G.; Waters, Katrina M.

2013-07-25

Respiratory infections stemming from influenza viruses and the Severe Acute Respiratory Syndrome corona virus (SARS-CoV) represent a serious public health threat as emerging pandemics. Despite efforts to identify the critical interactions of these viruses with host machinery, the key regulatory events that lead to disease pathology remain poorly targeted with therapeutics. Here we implement an integrated network interrogation approach, in which proteome and transcriptome datasets from infection of both viruses in human lung epithelial cells are utilized to predict regulatory genes involved in the host response. We take advantage of a novel “crowd-based” approach to identify and combine ranking metrics that isolate genes/proteins likely related to the pathogenicity of SARS-CoV and influenza virus. Subsequently, a multivariate regression model is used to compare predicted lung epithelial regulatory influences with data derived from other respiratory virus infection models. We predicted a small set of regulatory factors with conserved behavior for consideration as important components of viral pathogenesis that might also serve as therapeutic targets for intervention. Our results demonstrate the utility of integrating diverse ‘omic datasets to predict and prioritize regulatory features conserved across multiple pathogen infection models.
The Statistical Mechanics of Ideal MHD Turbulence

Science.gov (United States)

Shebalin, John V.

2003-01-01

Turbulence is a universal, nonlinear phenomenon found in all energetic fluid and plasma motion. In particular. understanding magneto hydrodynamic (MHD) turbulence and incorporating its effects in the computation and prediction of the flow of ionized gases in space, for example, are great challenges that must be met if such computations and predictions are to be meaningful. Although a general solution to the "problem of turbulence" does not exist in closed form, numerical integrations allow us to explore the phase space of solutions for both ideal and dissipative flows. For homogeneous, incompressible turbulence, Fourier methods are appropriate, and phase space is defined by the Fourier coefficients of the physical fields. In the case of ideal MHD flows, a fairly robust statistical mechanics has been developed, in which the symmetry and ergodic properties of phase space is understood. A discussion of these properties will illuminate our principal discovery: Coherent structure and randomness co-exist in ideal MHD turbulence. For dissipative flows, as opposed to ideal flows, progress beyond the dimensional analysis of Kolmogorov has been difficult. Here, some possible future directions that draw on the ideal results will also be discussed. Our conclusion will be that while ideal turbulence is now well understood, real turbulence still presents great challenges.
[Prediction of regional soil quality based on mutual information theory integrated with decision tree algorithm].

Science.gov (United States)

Lin, Fen-Fang; Wang, Ke; Yang, Ning; Yan, Shi-Guang; Zheng, Xin-Yu

2012-02-01

In this paper, some main factors such as soil type, land use pattern, lithology type, topography, road, and industry type that affect soil quality were used to precisely obtain the spatial distribution characteristics of regional soil quality, mutual information theory was adopted to select the main environmental factors, and decision tree algorithm See 5.0 was applied to predict the grade of regional soil quality. The main factors affecting regional soil quality were soil type, land use, lithology type, distance to town, distance to water area, altitude, distance to road, and distance to industrial land. The prediction accuracy of the decision tree model with the variables selected by mutual information was obviously higher than that of the model with all variables, and, for the former model, whether of decision tree or of decision rule, its prediction accuracy was all higher than 80%. Based on the continuous and categorical data, the method of mutual information theory integrated with decision tree could not only reduce the number of input parameters for decision tree algorithm, but also predict and assess regional soil quality effectively.
Abnormal white matter integrity in the corpus callosum among smokers: tract-based spatial statistics.

Directory of Open Access Journals (Sweden)

Wakako Umene-Nakano

Full Text Available In the present study, we aimed to investigate the difference in white matter between smokers and nonsmokers. In addition, we examined relationships between white matter integrity and nicotine dependence parameters in smoking subjects. Nineteen male smokers were enrolled in this study. Eighteen age-matched non-smokers with no current or past psychiatric history were included as controls. Diffusion tensor imaging scans were performed, and the analysis was conducted using a tract-based special statistics approach. Compared with nonsmokers, smokers exhibited a significant decrease in fractional anisotropy (FA throughout the whole corpus callosum. There were no significant differences in radial diffusivity or axial diffusivity between the two groups. There was a significant negative correlation between FA in the whole corpus callosum and the amount of tobacco use (cigarettes/day; R = - 0.580, p = 0.023. These results suggest that the corpus callosum may be one of the key areas influenced by chronic smoking.
Generalized Majority Logic Criterion to Analyze the Statistical Strength of S-Boxes

Science.gov (United States)

Hussain, Iqtadar; Shah, Tariq; Gondal, Muhammad Asif; Mahmood, Hasan

2012-05-01

The majority logic criterion is applicable in the evaluation process of substitution boxes used in the advanced encryption standard (AES). The performance of modified or advanced substitution boxes is predicted by processing the results of statistical analysis by the majority logic criteria. In this paper, we use the majority logic criteria to analyze some popular and prevailing substitution boxes used in encryption processes. In particular, the majority logic criterion is applied to AES, affine power affine (APA), Gray, Lui J, residue prime, S8 AES, Skipjack, and Xyi substitution boxes. The majority logic criterion is further extended into a generalized majority logic criterion which has a broader spectrum of analyzing the effectiveness of substitution boxes in image encryption applications. The integral components of the statistical analyses used for the generalized majority logic criterion are derived from results of entropy analysis, contrast analysis, correlation analysis, homogeneity analysis, energy analysis, and mean of absolute deviation (MAD) analysis.
Global discriminative learning for higher-accuracy computational gene prediction.

Directory of Open Access Journals (Sweden)

Axel Bernal

2007-03-01

Full Text Available Most ab initio gene predictors use a probabilistic sequence model, typically a hidden Markov model, to combine separately trained models of genomic signals and content. By combining separate models of relevant genomic features, such gene predictors can exploit small training sets and incomplete annotations, and can be trained fairly efficiently. However, that type of piecewise training does not optimize prediction accuracy and has difficulty in accounting for statistical dependencies among different parts of the gene model. With genomic information being created at an ever-increasing rate, it is worth investigating alternative approaches in which many different types of genomic evidence, with complex statistical dependencies, can be integrated by discriminative learning to maximize annotation accuracy. Among discriminative learning methods, large-margin classifiers have become prominent because of the success of support vector machines (SVM in many classification tasks. We describe CRAIG, a new program for ab initio gene prediction based on a conditional random field model with semi-Markov structure that is trained with an online large-margin algorithm related to multiclass SVMs. Our experiments on benchmark vertebrate datasets and on regions from the ENCODE project show significant improvements in prediction accuracy over published gene predictors that use intrinsic features only, particularly at the gene level and on genes with long introns.
Quantum local asymptotic normality and other questions of quantum statistics

NARCIS (Netherlands)

Kahn, Jonas

2008-01-01

This thesis is entitled Quantum Local Asymptotic Normality and other questions of Quantum Statistics ,. Quantum statistics are statistics on quantum objects. In classical statistics, we usually start from the data. Indeed, if we want to predict the weather, and can measure the wind or the
Internal representations of temporal statistics and feedback calibrate motor-sensory interval timing.

Directory of Open Access Journals (Sweden)

Luigi Acerbi

Full Text Available Humans have been shown to adapt to the temporal statistics of timing tasks so as to optimize the accuracy of their responses, in agreement with the predictions of Bayesian integration. This suggests that they build an internal representation of both the experimentally imposed distribution of time intervals (the prior and of the error (the loss function. The responses of a Bayesian ideal observer depend crucially on these internal representations, which have only been previously studied for simple distributions. To study the nature of these representations we asked subjects to reproduce time intervals drawn from underlying temporal distributions of varying complexity, from uniform to highly skewed or bimodal while also varying the error mapping that determined the performance feedback. Interval reproduction times were affected by both the distribution and feedback, in good agreement with a performance-optimizing Bayesian observer and actor model. Bayesian model comparison highlighted that subjects were integrating the provided feedback and represented the experimental distribution with a smoothed approximation. A nonparametric reconstruction of the subjective priors from the data shows that they are generally in agreement with the true distributions up to third-order moments, but with systematically heavier tails. In particular, higher-order statistical features (kurtosis, multimodality seem much harder to acquire. Our findings suggest that humans have only minor constraints on learning lower-order statistical properties of unimodal (including peaked and skewed distributions of time intervals under the guidance of corrective feedback, and that their behavior is well explained by Bayesian decision theory.
Statistical and extra-statistical considerations in differential item functioning analyses

Directory of Open Access Journals (Sweden)

G. K. Huysamen

2004-10-01

Full Text Available This article briefly describes the main procedures for performing differential item functioning (DIF analyses and points out some of the statistical and extra-statistical implications of these methods. Research findings on the sources of DIF, including those associated with translated tests, are reviewed. As DIF analyses are oblivious of correlations between a test and relevant criteria, the elimination of differentially functioning items does not necessarily improve predictive validity or reduce any predictive bias. The implications of the results of past DIF research for test development in the multilingual and multi-cultural South African society are considered. Opsomming Hierdie artikel beskryf kortliks die hoofprosedures vir die ontleding van differensiële itemfunksionering (DIF en verwys na sommige van die statistiese en buite-statistiese implikasies van hierdie metodes. ’n Oorsig word verskaf van navorsingsbevindings oor die bronne van DIF, insluitend dié by vertaalde toetse. Omdat DIF-ontledings nie die korrelasies tussen ’n toets en relevante kriteria in ag neem nie, sal die verwydering van differensieel-funksionerende items nie noodwendig voorspellingsgeldigheid verbeter of voorspellingsydigheid verminder nie. Die implikasies van vorige DIF-navorsingsbevindings vir toetsontwikkeling in die veeltalige en multikulturele Suid-Afrikaanse gemeenskap word oorweeg.
Digital immunohistochemistry platform for the staining variation monitoring based on integration of image and statistical analyses with laboratory information system.

Science.gov (United States)

Laurinaviciene, Aida; Plancoulaine, Benoit; Baltrusaityte, Indra; Meskauskas, Raimundas; Besusparis, Justinas; Lesciute-Krilaviciene, Daiva; Raudeliunas, Darius; Iqbal, Yasir; Herlin, Paulette; Laurinavicius, Arvydas

2014-01-01

Digital immunohistochemistry (IHC) is one of the most promising applications brought by new generation image analysis (IA). While conventional IHC staining quality is monitored by semi-quantitative visual evaluation of tissue controls, IA may require more sensitive measurement. We designed an automated system to digitally monitor IHC multi-tissue controls, based on SQL-level integration of laboratory information system with image and statistical analysis tools. Consecutive sections of TMA containing 10 cores of breast cancer tissue were used as tissue controls in routine Ki67 IHC testing. Ventana slide label barcode ID was sent to the LIS to register the serial section sequence. The slides were stained and scanned (Aperio ScanScope XT), IA was performed by the Aperio/Leica Colocalization and Genie Classifier/Nuclear algorithms. SQL-based integration ensured automated statistical analysis of the IA data by the SAS Enterprise Guide project. Factor analysis and plot visualizations were performed to explore slide-to-slide variation of the Ki67 IHC staining results in the control tissue. Slide-to-slide intra-core IHC staining analysis revealed rather significant variation of the variables reflecting the sample size, while Brown and Blue Intensity were relatively stable. To further investigate this variation, the IA results from the 10 cores were aggregated to minimize tissue-related variance. Factor analysis revealed association between the variables reflecting the sample size detected by IA and Blue Intensity. Since the main feature to be extracted from the tissue controls was staining intensity, we further explored the variation of the intensity variables in the individual cores. MeanBrownBlue Intensity ((Brown+Blue)/2) and DiffBrownBlue Intensity (Brown-Blue) were introduced to better contrast the absolute intensity and the colour balance variation in each core; relevant factor scores were extracted. Finally, tissue-related factors of IHC staining variance were
Lipoprotein metabolism indicators improve cardiovascular risk prediction.

Directory of Open Access Journals (Sweden)

Daniël B van Schalkwijk

Full Text Available BACKGROUND: Cardiovascular disease risk increases when lipoprotein metabolism is dysfunctional. We have developed a computational model able to derive indicators of lipoprotein production, lipolysis, and uptake processes from a single lipoprotein profile measurement. This is the first study to investigate whether lipoprotein metabolism indicators can improve cardiovascular risk prediction and therapy management. METHODS AND RESULTS: We calculated lipoprotein metabolism indicators for 1981 subjects (145 cases, 1836 controls from the Framingham Heart Study offspring cohort in which NMR lipoprotein profiles were measured. We applied a statistical learning algorithm using a support vector machine to select conventional risk factors and lipoprotein metabolism indicators that contributed to predicting risk for general cardiovascular disease. Risk prediction was quantified by the change in the Area-Under-the-ROC-Curve (ΔAUC and by risk reclassification (Net Reclassification Improvement (NRI and Integrated Discrimination Improvement (IDI. Two VLDL lipoprotein metabolism indicators (VLDLE and VLDLH improved cardiovascular risk prediction. We added these indicators to a multivariate model with the best performing conventional risk markers. Our method significantly improved both CVD prediction and risk reclassification. CONCLUSIONS: Two calculated VLDL metabolism indicators significantly improved cardiovascular risk prediction. These indicators may help to reduce prescription of unnecessary cholesterol-lowering medication, reducing costs and possible side-effects. For clinical application, further validation is required.

PREVAIL: Predicting Recovery through Estimation and Visualization of Active and Incident Lesions.

Science.gov (United States)

Dworkin, Jordan D; Sweeney, Elizabeth M; Schindler, Matthew K; Chahin, Salim; Reich, Daniel S; Shinohara, Russell T

2016-01-01

The goal of this study was to develop a model that integrates imaging and clinical information observed at lesion incidence for predicting the recovery of white matter lesions in multiple sclerosis (MS) patients. Demographic, clinical, and magnetic resonance imaging (MRI) data were obtained from 60 subjects with MS as part of a natural history study at the National Institute of Neurological Disorders and Stroke. A total of 401 lesions met the inclusion criteria and were used in the study. Imaging features were extracted from the intensity-normalized T1-weighted (T1w) and T2-weighted sequences as well as magnetization transfer ratio (MTR) sequence acquired at lesion incidence. T1w and MTR signatures were also extracted from images acquired one-year post-incidence. Imaging features were integrated with clinical and demographic data observed at lesion incidence to create statistical prediction models for long-term damage within the lesion. The performance of the T1w and MTR predictions was assessed in two ways: first, the predictive accuracy was measured quantitatively using leave-one-lesion-out cross-validated (CV) mean-squared predictive error. Then, to assess the prediction performance from the perspective of expert clinicians, three board-certified MS clinicians were asked to individually score how similar the CV model-predicted one-year appearance was to the true one-year appearance for a random sample of 100 lesions. The cross-validated root-mean-square predictive error was 0.95 for normalized T1w and 0.064 for MTR, compared to the estimated measurement errors of 0.48 and 0.078 respectively. The three expert raters agreed that T1w and MTR predictions closely resembled the true one-year follow-up appearance of the lesions in both degree and pattern of recovery within lesions. This study demonstrates that by using only information from a single visit at incidence, we can predict how a new lesion will recover using relatively simple statistical techniques. The
Summer drought predictability over Europe: empirical versus dynamical forecasts

Science.gov (United States)

Turco, Marco; Ceglar, Andrej; Prodhomme, Chloé; Soret, Albert; Toreti, Andrea; Doblas-Reyes Francisco, J.

2017-08-01

Seasonal climate forecasts could be an important planning tool for farmers, government and insurance companies that can lead to better and timely management of seasonal climate risks. However, climate seasonal forecasts are often under-used, because potential users are not well aware of the capabilities and limitations of these products. This study aims at assessing the merits and caveats of a statistical empirical method, the ensemble streamflow prediction system (ESP, an ensemble based on reordering historical data) and an operational dynamical forecast system, the European Centre for Medium-Range Weather Forecasts—System 4 (S4) in predicting summer drought in Europe. Droughts are defined using the Standardized Precipitation Evapotranspiration Index for the month of August integrated over 6 months. Both systems show useful and mostly comparable deterministic skill. We argue that this source of predictability is mostly attributable to the observed initial conditions. S4 shows only higher skill in terms of ability to probabilistically identify drought occurrence. Thus, currently, both approaches provide useful information and ESP represents a computationally fast alternative to dynamical prediction applications for drought prediction.
Improving Multi-Sensor Drought Monitoring, Prediction and Recovery Assessment Using Gravimetry Information

Science.gov (United States)

Aghakouchak, Amir; Tourian, Mohammad J.

2015-04-01

Development of reliable drought monitoring, prediction and recovery assessment tools are fundamental to water resources management. This presentation focuses on how gravimetry information can improve drought assessment. First, we provide an overview of the Global Integrated Drought Monitoring and Prediction System (GIDMaPS) which offers near real-time drought information using remote sensing observations and model simulations. Then, we present a framework for integration of satellite gravimetry information for improving drought prediction and recovery assessment. The input data include satellite-based and model-based precipitation, soil moisture estimates and equivalent water height. Previous studies show that drought assessment based on one single indicator may not be sufficient. For this reason, GIDMaPS provides drought information based on multiple drought indicators including Standardized Precipitation Index (SPI), Standardized Soil Moisture Index (SSI) and the Multivariate Standardized Drought Index (MSDI) which combines SPI and SSI probabilistically. MSDI incorporates the meteorological and agricultural drought conditions and provides composite multi-index drought information for overall characterization of droughts. GIDMaPS includes a seasonal prediction component based on a statistical persistence-based approach. The prediction component of GIDMaPS provides the empirical probability of drought for different severity levels. In this presentation we present a new component in which the drought prediction information based on SPI, SSI and MSDI are conditioned on equivalent water height obtained from the Gravity Recovery and Climate Experiment (GRACE). Using a Bayesian approach, GRACE information is used to evaluate persistence of drought. Finally, the deficit equivalent water height based on GRACE is used for assessing drought recovery. In this presentation, both monitoring and prediction components of GIDMaPS will be discussed, and the results from 2014
Framework for Infectious Disease Analysis: A comprehensive and integrative multi-modeling approach to disease prediction and management.

Science.gov (United States)

Erraguntla, Madhav; Zapletal, Josef; Lawley, Mark

2017-12-01

The impact of infectious disease on human populations is a function of many factors including environmental conditions, vector dynamics, transmission mechanics, social and cultural behaviors, and public policy. A comprehensive framework for disease management must fully connect the complete disease lifecycle, including emergence from reservoir populations, zoonotic vector transmission, and impact on human societies. The Framework for Infectious Disease Analysis is a software environment and conceptual architecture for data integration, situational awareness, visualization, prediction, and intervention assessment. Framework for Infectious Disease Analysis automatically collects biosurveillance data using natural language processing, integrates structured and unstructured data from multiple sources, applies advanced machine learning, and uses multi-modeling for analyzing disease dynamics and testing interventions in complex, heterogeneous populations. In the illustrative case studies, natural language processing from social media, news feeds, and websites was used for information extraction, biosurveillance, and situation awareness. Classification machine learning algorithms (support vector machines, random forests, and boosting) were used for disease predictions.
Statistical models to predict flows at monthly level in Salvajina

International Nuclear Information System (INIS)

Gonzalez, Harold O

1994-01-01

It thinks about and models of lineal regression evaluate at monthly level that they allow to predict flows in Salvajina, with base in predictions variable, like the difference of pressure between Darwin and Tahiti, precipitation in Piendamo Cauca), temperature in Port Chicama (Peru) and pressure in Tahiti
What's statistical about learning? Insights from modelling statistical learning as a set of memory processes.

Science.gov (United States)

Thiessen, Erik D

2017-01-05

Statistical learning has been studied in a variety of different tasks, including word segmentation, object identification, category learning, artificial grammar learning and serial reaction time tasks (e.g. Saffran et al. 1996 Science 274: , 1926-1928; Orban et al. 2008 Proceedings of the National Academy of Sciences 105: , 2745-2750; Thiessen & Yee 2010 Child Development 81: , 1287-1303; Saffran 2002 Journal of Memory and Language 47: , 172-196; Misyak & Christiansen 2012 Language Learning 62: , 302-331). The difference among these tasks raises questions about whether they all depend on the same kinds of underlying processes and computations, or whether they are tapping into different underlying mechanisms. Prior theoretical approaches to statistical learning have often tried to explain or model learning in a single task. However, in many cases these approaches appear inadequate to explain performance in multiple tasks. For example, explaining word segmentation via the computation of sequential statistics (such as transitional probability) provides little insight into the nature of sensitivity to regularities among simultaneously presented features. In this article, we will present a formal computational approach that we believe is a good candidate to provide a unifying framework to explore and explain learning in a wide variety of statistical learning tasks. This framework suggests that statistical learning arises from a set of processes that are inherent in memory systems, including activation, interference, integration of information and forgetting (e.g. Perruchet & Vinter 1998 Journal of Memory and Language 39: , 246-263; Thiessen et al. 2013 Psychological Bulletin 139: , 792-814). From this perspective, statistical learning does not involve explicit computation of statistics, but rather the extraction of elements of the input into memory traces, and subsequent integration across those memory traces that emphasize consistent information (Thiessen and Pavlik
Statistical downscaling of IPCC sea surface wind and wind energy predictions for U.S. east coastal ocean, Gulf of Mexico and Caribbean Sea

Science.gov (United States)

Yao, Zhigang; Xue, Zuo; He, Ruoying; Bao, Xianwen; Song, Jun

2016-08-01

A multivariate statistical downscaling method is developed to produce regional, high-resolution, coastal surface wind fields based on the IPCC global model predictions for the U.S. east coastal ocean, the Gulf of Mexico (GOM), and the Caribbean Sea. The statistical relationship is built upon linear regressions between the empirical orthogonal function (EOF) spaces of a cross- calibrated, multi-platform, multi-instrument ocean surface wind velocity dataset (predictand) and the global NCEP wind reanalysis (predictor) over a 10 year period from 2000 to 2009. The statistical relationship is validated before applications and its effectiveness is confirmed by the good agreement between downscaled wind fields based on the NCEP reanalysis and in-situ surface wind measured at 16 National Data Buoy Center (NDBC) buoys in the U.S. east coastal ocean and the GOM during 1992-1999. The predictand-predictor relationship is applied to IPCC GFDL model output (2.0°×2.5°) of downscaled coastal wind at 0.25°×0.25° resolution. The temporal and spatial variability of future predicted wind speeds and wind energy potential over the study region are further quantified. It is shown that wind speed and power would significantly be reduced in the high CO2 climate scenario offshore of the mid-Atlantic and northeast U.S., with the speed falling to one quarter of its original value.
Evaluation of burst pressure prediction models for line pipes

Energy Technology Data Exchange (ETDEWEB)

Zhu, Xian-Kui, E-mail: zhux@battelle.org [Battelle Memorial Institute, 505 King Avenue, Columbus, OH 43201 (United States); Leis, Brian N. [Battelle Memorial Institute, 505 King Avenue, Columbus, OH 43201 (United States)

2012-01-15

Accurate prediction of burst pressure plays a central role in engineering design and integrity assessment of oil and gas pipelines. Theoretical and empirical solutions for such prediction are evaluated in this paper relative to a burst pressure database comprising more than 100 tests covering a variety of pipeline steel grades and pipe sizes. Solutions considered include three based on plasticity theory for the end-capped, thin-walled, defect-free line pipe subjected to internal pressure in terms of the Tresca, von Mises, and ZL (or Zhu-Leis) criteria, one based on a cylindrical instability stress (CIS) concept, and a large group of analytical and empirical models previously evaluated by Law and Bowie (International Journal of Pressure Vessels and Piping, 84, 2007: 487-492). It is found that these models can be categorized into either a Tresca-family or a von Mises-family of solutions, except for those due to Margetson and Zhu-Leis models. The viability of predictions is measured via statistical analyses in terms of a mean error and its standard deviation. Consistent with an independent parallel evaluation using another large database, the Zhu-Leis solution is found best for predicting burst pressure, including consideration of strain hardening effects, while the Tresca strength solutions including Barlow, Maximum shear stress, Turner, and the ASME boiler code provide reasonably good predictions for the class of line-pipe steels with intermediate strain hardening response. - Highlights: Black-Right-Pointing-Pointer This paper evaluates different burst pressure prediction models for line pipes. Black-Right-Pointing-Pointer The existing models are categorized into two major groups of Tresca and von Mises solutions. Black-Right-Pointing-Pointer Prediction quality of each model is assessed statistically using a large full-scale burst test database. Black-Right-Pointing-Pointer The Zhu-Leis solution is identified as the best predictive model.
Evaluation of burst pressure prediction models for line pipes

International Nuclear Information System (INIS)

Zhu, Xian-Kui; Leis, Brian N.

2012-01-01

Accurate prediction of burst pressure plays a central role in engineering design and integrity assessment of oil and gas pipelines. Theoretical and empirical solutions for such prediction are evaluated in this paper relative to a burst pressure database comprising more than 100 tests covering a variety of pipeline steel grades and pipe sizes. Solutions considered include three based on plasticity theory for the end-capped, thin-walled, defect-free line pipe subjected to internal pressure in terms of the Tresca, von Mises, and ZL (or Zhu-Leis) criteria, one based on a cylindrical instability stress (CIS) concept, and a large group of analytical and empirical models previously evaluated by Law and Bowie (International Journal of Pressure Vessels and Piping, 84, 2007: 487–492). It is found that these models can be categorized into either a Tresca-family or a von Mises-family of solutions, except for those due to Margetson and Zhu-Leis models. The viability of predictions is measured via statistical analyses in terms of a mean error and its standard deviation. Consistent with an independent parallel evaluation using another large database, the Zhu-Leis solution is found best for predicting burst pressure, including consideration of strain hardening effects, while the Tresca strength solutions including Barlow, Maximum shear stress, Turner, and the ASME boiler code provide reasonably good predictions for the class of line-pipe steels with intermediate strain hardening response. - Highlights: ► This paper evaluates different burst pressure prediction models for line pipes. ► The existing models are categorized into two major groups of Tresca and von Mises solutions. ► Prediction quality of each model is assessed statistically using a large full-scale burst test database. ► The Zhu-Leis solution is identified as the best predictive model.
Alexander Technique Training Coupled With an Integrative Model of Behavioral Prediction in Teachers With Low Back Pain.

Science.gov (United States)

Kamalikhah, Tahereh; Morowatisharifabad, Mohammad Ali; Rezaei-Moghaddam, Farid; Ghasemi, Mohammad; Gholami-Fesharaki, Mohammad; Goklani, Salma

2016-09-01

Individuals suffering from chronic low back pain (CLBP) experience major physical, social, and occupational disruptions. Strong evidence confirms the effectiveness of Alexander technique (AT) training for CLBP. The present study applied an integrative model (IM) of behavioral prediction for improvement of AT training. This was a quasi-experimental study of female teachers with nonspecific LBP in southern Tehran in 2014. Group A contained 42 subjects and group B had 35 subjects. In group A, AT lessons were designed based on IM constructs, while in group B, AT lessons only were taught. The validity and reliability of the AT questionnaire were confirmed using content validity (CVR 0.91, CVI 0.96) and Cronbach's α (0.80). The IM constructs of both groups were measured after the completion of training. Statistical analysis used independent and paired samples t-tests and the univariate generalized linear model (GLM). Significant differences were recorded before and after intervention (P < 0.001) for the model constructs of intention, perceived risk, direct attitude, behavioral beliefs, and knowledge in both groups. Direct attitude and behavioral beliefs in group A were higher than in group B after the intervention (P < 0.03). The educational framework provided by IM for AT training improved attitude and behavioral beliefs that can facilitate the adoption of AT behavior and decreased CLBP.
Crack Growth-Based Predictive Methodology for the Maintenance of the Structural Integrity of Repaired and Nonrepaired Aging Engine Stationary Components

National Research Council Canada - National Science Library

Barron, Michael

1999-01-01

.... Specifically, the FAA's goal was to develop "Crack Growth-Based Predictive Methodologies for the Maintenance of the Structural Integrity of Repaired and Nonrepaired Aging Engine Stationary Components...
Six sigma for organizational excellence a statistical approach

CERN Document Server

Muralidharan, K

2015-01-01

This book discusses the integrated concepts of statistical quality engineering and management tools. It will help readers to understand and apply the concepts of quality through project management and technical analysis, using statistical methods. Prepared in a ready-to-use form, the text will equip practitioners to implement the Six Sigma principles in projects. The concepts discussed are all critically assessed and explained, allowing them to be practically applied in managerial decision-making, and in each chapter, the objectives and connections to the rest of the work are clearly illustrated. To aid in understanding, the book includes a wealth of tables, graphs, descriptions and checklists, as well as charts and plots, worked-out examples and exercises. Perhaps the most unique feature of the book is its approach, using statistical tools, to explain the science behind Six Sigma project management and integrated in engineering concepts. The material on quality engineering and statistical management tools of...
An integrated user-friendly ArcMAP tool for bivariate statistical modeling in geoscience applications

Science.gov (United States)

Jebur, M. N.; Pradhan, B.; Shafri, H. Z. M.; Yusof, Z.; Tehrany, M. S.

2014-10-01

Modeling and classification difficulties are fundamental issues in natural hazard assessment. A geographic information system (GIS) is a domain that requires users to use various tools to perform different types of spatial modeling. Bivariate statistical analysis (BSA) assists in hazard modeling. To perform this analysis, several calculations are required and the user has to transfer data from one format to another. Most researchers perform these calculations manually by using Microsoft Excel or other programs. This process is time consuming and carries a degree of uncertainty. The lack of proper tools to implement BSA in a GIS environment prompted this study. In this paper, a user-friendly tool, BSM (bivariate statistical modeler), for BSA technique is proposed. Three popular BSA techniques such as frequency ratio, weights-of-evidence, and evidential belief function models are applied in the newly proposed ArcMAP tool. This tool is programmed in Python and is created by a simple graphical user interface, which facilitates the improvement of model performance. The proposed tool implements BSA automatically, thus allowing numerous variables to be examined. To validate the capability and accuracy of this program, a pilot test area in Malaysia is selected and all three models are tested by using the proposed program. Area under curve is used to measure the success rate and prediction rate. Results demonstrate that the proposed program executes BSA with reasonable accuracy. The proposed BSA tool can be used in numerous applications, such as natural hazard, mineral potential, hydrological, and other engineering and environmental applications.
An integrated user-friendly ArcMAP tool for bivariate statistical modelling in geoscience applications

Science.gov (United States)

Jebur, M. N.; Pradhan, B.; Shafri, H. Z. M.; Yusoff, Z. M.; Tehrany, M. S.

2015-03-01

Modelling and classification difficulties are fundamental issues in natural hazard assessment. A geographic information system (GIS) is a domain that requires users to use various tools to perform different types of spatial modelling. Bivariate statistical analysis (BSA) assists in hazard modelling. To perform this analysis, several calculations are required and the user has to transfer data from one format to another. Most researchers perform these calculations manually by using Microsoft Excel or other programs. This process is time-consuming and carries a degree of uncertainty. The lack of proper tools to implement BSA in a GIS environment prompted this study. In this paper, a user-friendly tool, bivariate statistical modeler (BSM), for BSA technique is proposed. Three popular BSA techniques, such as frequency ratio, weight-of-evidence (WoE), and evidential belief function (EBF) models, are applied in the newly proposed ArcMAP tool. This tool is programmed in Python and created by a simple graphical user interface (GUI), which facilitates the improvement of model performance. The proposed tool implements BSA automatically, thus allowing numerous variables to be examined. To validate the capability and accuracy of this program, a pilot test area in Malaysia is selected and all three models are tested by using the proposed program. Area under curve (AUC) is used to measure the success rate and prediction rate. Results demonstrate that the proposed program executes BSA with reasonable accuracy. The proposed BSA tool can be used in numerous applications, such as natural hazard, mineral potential, hydrological, and other engineering and environmental applications.
An Integrated Ensemble-Based Operational Framework to Predict Urban Flooding: A Case Study of Hurricane Sandy in the Passaic and Hackensack River Basins

Science.gov (United States)

Saleh, F.; Ramaswamy, V.; Georgas, N.; Blumberg, A. F.; Wang, Y.

2016-12-01

Advances in computational resources and modeling techniques are opening the path to effectively integrate existing complex models. In the context of flood prediction, recent extreme events have demonstrated the importance of integrating components of the hydrosystem to better represent the interactions amongst different physical processes and phenomena. As such, there is a pressing need to develop holistic and cross-disciplinary modeling frameworks that effectively integrate existing models and better represent the operative dynamics. This work presents a novel Hydrologic-Hydraulic-Hydrodynamic Ensemble (H3E) flood prediction framework that operationally integrates existing predictive models representing coastal (New York Harbor Observing and Prediction System, NYHOPS), hydrologic (US Army Corps of Engineers Hydrologic Modeling System, HEC-HMS) and hydraulic (2-dimensional River Analysis System, HEC-RAS) components. The state-of-the-art framework is forced with 125 ensemble meteorological inputs from numerical weather prediction models including the Global Ensemble Forecast System, the European Centre for Medium-Range Weather Forecasts (ECMWF), the Canadian Meteorological Centre (CMC), the Short Range Ensemble Forecast (SREF) and the North American Mesoscale Forecast System (NAM). The framework produces, within a 96-hour forecast horizon, on-the-fly Google Earth flood maps that provide critical information for decision makers and emergency preparedness managers. The utility of the framework was demonstrated by retrospectively forecasting an extreme flood event, hurricane Sandy in the Passaic and Hackensack watersheds (New Jersey, USA). Hurricane Sandy caused significant damage to a number of critical facilities in this area including the New Jersey Transit's main storage and maintenance facility. The results of this work demonstrate that ensemble based frameworks provide improved flood predictions and useful information about associated uncertainties, thus
Statistical Angles on the Lattice QCD Signal-to-Noise Problem

Science.gov (United States)

Wagman, Michael L.

The theory of quantum chromodynamics (QCD) encodes the strong interactions that bind quarks and gluons into nucleons and that bind nucleons into nuclei. Predictive control of QCD would allow nuclear structure and reactions as well as properties of supernovae and neutron stars to be theoretically studied from first principles. Lattice QCD (LQCD) can represent generic QCD predictions in terms of well-defined path integrals, but the sign and signal-to-noise problems have obstructed LQCD calculations of large nuclei and nuclear matter in practice. This thesis presents a statistical study of LQCD correlation functions, with a particular focus on characterizing the structure of the noise associated with quantum fluctuations. The signal-to-noise problem in baryon correlation functions is demonstrated to arise from a sign problem associated with Monte Carlo sampling of complex correlation functions. Properties of circular statistics are used to understand the emergence of a large time noise region where standard energy measurements are unreliable. Power-law tails associated with stable distributions and Levy flights are found to play a central role in the time evolution of baryon correlation functions. Building on these observations, a new statistical analysis technique called phase reweighting is introduced that allow energy levels to be extracted from large-time correlation functions with time-independent signal-to-noise ratios. Phase reweighting effectively includes dynamical refinement of source magnitudes but introduces a bias associated with the phase. This bias can be removed by performing an extrapolation, but at the expense of re-introducing a signal-to-noise problem. Lattice QCD calculations of the ρ+ and nucleon masses and of the ΞΞ(1S0) binding energy show consistency between standard results obtained using smaller-time correlation functions and phase-reweighted results using large-time correlation functions inaccessible to standard statistical analysis
Statistical Techniques for Project Control

CERN Document Server

Badiru, Adedeji B

2012-01-01

A project can be simple or complex. In each case, proven project management processes must be followed. In all cases of project management implementation, control must be exercised in order to assure that project objectives are achieved. Statistical Techniques for Project Control seamlessly integrates qualitative and quantitative tools and techniques for project control. It fills the void that exists in the application of statistical techniques to project control. The book begins by defining the fundamentals of project management then explores how to temper quantitative analysis with qualitati
Using the Integrative Model of Behavioral Prediction to Understand College Students' STI Testing Beliefs, Intentions, and Behaviors.

Science.gov (United States)

Wombacher, Kevin; Dai, Minhao; Matig, Jacob J; Harrington, Nancy Grant

2018-03-22

To identify salient behavioral determinants related to STI testing among college students by testing a model based on the integrative model of behavioral (IMBP) prediction. 265 undergraduate students from a large university in the Southeastern US. Formative and survey research to test an IMBP-based model that explores the relationships between determinants and STI testing intention and behavior. Results of path analyses supported a model in which attitudinal beliefs predicted intention and intention predicted behavior. Normative beliefs and behavioral control beliefs were not significant in the model; however, select individual normative and control beliefs were significantly correlated with intention and behavior. Attitudinal beliefs are the strongest predictor of STI testing intention and behavior. Future efforts to increase STI testing rates should identify and target salient attitudinal beliefs.
Integrated Design Software Predicts the Creep Life of Monolithic Ceramic Components

Science.gov (United States)

1996-01-01

Significant improvements in propulsion and power generation for the next century will require revolutionary advances in high-temperature materials and structural design. Advanced ceramics are candidate materials for these elevated-temperature applications. As design protocols emerge for these material systems, designers must be aware of several innate features, including the degrading ability of ceramics to carry sustained load. Usually, time-dependent failure in ceramics occurs because of two different, delayedfailure mechanisms: slow crack growth and creep rupture. Slow crack growth initiates at a preexisting flaw and continues until a critical crack length is reached, causing catastrophic failure. Creep rupture, on the other hand, occurs because of bulk damage in the material: void nucleation and coalescence that eventually leads to macrocracks which then propagate to failure. Successful application of advanced ceramics depends on proper characterization of material behavior and the use of an appropriate design methodology. The life of a ceramic component can be predicted with the NASA Lewis Research Center's Ceramics Analysis and Reliability Evaluation of Structures (CARES) integrated design programs. CARES/CREEP determines the expected life of a component under creep conditions, and CARES/LIFE predicts the component life due to fast fracture and subcritical crack growth. The previously developed CARES/LIFE program has been used in numerous industrial and Government applications.
Application of Ontology Technology in Health Statistic Data Analysis.

Science.gov (United States)

Guo, Minjiang; Hu, Hongpu; Lei, Xingyun

2017-01-01

Research Purpose: establish health management ontology for analysis of health statistic data. Proposed Methods: this paper established health management ontology based on the analysis of the concepts in China Health Statistics Yearbook, and used protégé to define the syntactic and semantic structure of health statistical data. six classes of top-level ontology concepts and their subclasses had been extracted and the object properties and data properties were defined to establish the construction of these classes. By ontology instantiation, we can integrate multi-source heterogeneous data and enable administrators to have an overall understanding and analysis of the health statistic data. ontology technology provides a comprehensive and unified information integration structure of the health management domain and lays a foundation for the efficient analysis of multi-source and heterogeneous health system management data and enhancement of the management efficiency.

ESB-Based Sensor Web Integration for the Prediction of Electric Power Supply System Vulnerability

Directory of Open Access Journals (Sweden)

Milos Bogdanovic

2013-08-01

Full Text Available Electric power supply companies increasingly rely on enterprise IT systems to provide them with a comprehensive view of the state of the distribution network. Within a utility-wide network, enterprise IT systems collect data from various metering devices. Such data can be effectively used for the prediction of power supply network vulnerability. The purpose of this paper is to present the Enterprise Service Bus (ESB-based Sensor Web integration solution that we have developed with the purpose of enabling prediction of power supply network vulnerability, in terms of a prediction of defect probability for a particular network element. We will give an example of its usage and demonstrate our vulnerability prediction model on data collected from two different power supply companies. The proposed solution is an extension of the GinisSense Sensor Web-based architecture for collecting, processing, analyzing, decision making and alerting based on the data received from heterogeneous data sources. In this case, GinisSense has been upgraded to be capable of operating in an ESB environment and combine Sensor Web and GIS technologies to enable prediction of electric power supply system vulnerability. Aside from electrical values, the proposed solution gathers ambient values from additional sensors installed in the existing power supply network infrastructure. GinisSense aggregates gathered data according to an adapted Omnibus data fusion model and applies decision-making logic on the aggregated data. Detected vulnerabilities are visualized to end-users through means of a specialized Web GIS application.
ESB-based Sensor Web integration for the prediction of electric power supply system vulnerability.

Science.gov (United States)

Stoimenov, Leonid; Bogdanovic, Milos; Bogdanovic-Dinic, Sanja

2013-08-15

Electric power supply companies increasingly rely on enterprise IT systems to provide them with a comprehensive view of the state of the distribution network. Within a utility-wide network, enterprise IT systems collect data from various metering devices. Such data can be effectively used for the prediction of power supply network vulnerability. The purpose of this paper is to present the Enterprise Service Bus (ESB)-based Sensor Web integration solution that we have developed with the purpose of enabling prediction of power supply network vulnerability, in terms of a prediction of defect probability for a particular network element. We will give an example of its usage and demonstrate our vulnerability prediction model on data collected from two different power supply companies. The proposed solution is an extension of the GinisSense Sensor Web-based architecture for collecting, processing, analyzing, decision making and alerting based on the data received from heterogeneous data sources. In this case, GinisSense has been upgraded to be capable of operating in an ESB environment and combine Sensor Web and GIS technologies to enable prediction of electric power supply system vulnerability. Aside from electrical values, the proposed solution gathers ambient values from additional sensors installed in the existing power supply network infrastructure. GinisSense aggregates gathered data according to an adapted Omnibus data fusion model and applies decision-making logic on the aggregated data. Detected vulnerabilities are visualized to end-users through means of a specialized Web GIS application.
ESB-Based Sensor Web Integration for the Prediction of Electric Power Supply System Vulnerability

Science.gov (United States)

Stoimenov, Leonid; Bogdanovic, Milos; Bogdanovic-Dinic, Sanja

2013-01-01

Electric power supply companies increasingly rely on enterprise IT systems to provide them with a comprehensive view of the state of the distribution network. Within a utility-wide network, enterprise IT systems collect data from various metering devices. Such data can be effectively used for the prediction of power supply network vulnerability. The purpose of this paper is to present the Enterprise Service Bus (ESB)-based Sensor Web integration solution that we have developed with the purpose of enabling prediction of power supply network vulnerability, in terms of a prediction of defect probability for a particular network element. We will give an example of its usage and demonstrate our vulnerability prediction model on data collected from two different power supply companies. The proposed solution is an extension of the GinisSense Sensor Web-based architecture for collecting, processing, analyzing, decision making and alerting based on the data received from heterogeneous data sources. In this case, GinisSense has been upgraded to be capable of operating in an ESB environment and combine Sensor Web and GIS technologies to enable prediction of electric power supply system vulnerability. Aside from electrical values, the proposed solution gathers ambient values from additional sensors installed in the existing power supply network infrastructure. GinisSense aggregates gathered data according to an adapted Omnibus data fusion model and applies decision-making logic on the aggregated data. Detected vulnerabilities are visualized to end-users through means of a specialized Web GIS application. PMID:23955435
Stochastic or statistic? Comparing flow duration curve models in ungauged basins and changing climates

Science.gov (United States)

Müller, M. F.; Thompson, S. E.

2015-09-01

The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drives of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by a strong wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are strongly favored over statistical models.
The new statistics: why and how.

Science.gov (United States)

Cumming, Geoff

2014-01-01

We need to make substantial changes to how we conduct research. First, in response to heightened concern that our published research literature is incomplete and untrustworthy, we need new requirements to ensure research integrity. These include prespecification of studies whenever possible, avoidance of selection and other inappropriate data-analytic practices, complete reporting, and encouragement of replication. Second, in response to renewed recognition of the severe flaws of null-hypothesis significance testing (NHST), we need to shift from reliance on NHST to estimation and other preferred techniques. The new statistics refers to recommended practices, including estimation based on effect sizes, confidence intervals, and meta-analysis. The techniques are not new, but adopting them widely would be new for many researchers, as well as highly beneficial. This article explains why the new statistics are important and offers guidance for their use. It describes an eight-step new-statistics strategy for research with integrity, which starts with formulation of research questions in estimation terms, has no place for NHST, and is aimed at building a cumulative quantitative discipline.
Statistical physics of interacting neural networks

Science.gov (United States)

Kinzel, Wolfgang; Metzler, Richard; Kanter, Ido

2001-12-01

Recent results on the statistical physics of time series generation and prediction are presented. A neural network is trained on quasi-periodic and chaotic sequences and overlaps to the sequence generator as well as the prediction errors are calculated numerically. For each network there exists a sequence for which it completely fails to make predictions. Two interacting networks show a transition to perfect synchronization. A pool of interacting networks shows good coordination in the minority game-a model of competition in a closed market. Finally, as a demonstration, a perceptron predicts bit sequences produced by human beings.
Model Predictive Control of Integrated Gasification Combined Cycle Power Plants

Energy Technology Data Exchange (ETDEWEB)

B. Wayne Bequette; Priyadarshi Mahapatra

2010-08-31

The primary project objectives were to understand how the process design of an integrated gasification combined cycle (IGCC) power plant affects the dynamic operability and controllability of the process. Steady-state and dynamic simulation models were developed to predict the process behavior during typical transients that occur in plant operation. Advanced control strategies were developed to improve the ability of the process to follow changes in the power load demand, and to improve performance during transitions between power levels. Another objective of the proposed work was to educate graduate and undergraduate students in the application of process systems and control to coal technology. Educational materials were developed for use in engineering courses to further broaden this exposure to many students. ASPENTECH software was used to perform steady-state and dynamic simulations of an IGCC power plant. Linear systems analysis techniques were used to assess the steady-state and dynamic operability of the power plant under various plant operating conditions. Model predictive control (MPC) strategies were developed to improve the dynamic operation of the power plants. MATLAB and SIMULINK software were used for systems analysis and control system design, and the SIMULINK functionality in ASPEN DYNAMICS was used to test the control strategies on the simulated process. Project funds were used to support a Ph.D. student to receive education and training in coal technology and the application of modeling and simulation techniques.
EUROPEAN INTEGRATION: A MULTILEVEL PROCESS THAT REQUIRES A MULTILEVEL STATISTICAL ANALYSIS

Directory of Open Access Journals (Sweden)

Roxana-Otilia-Sonia HRITCU

2015-11-01

Full Text Available A process of market regulation and a system of multi-level governance and several supranational, national and subnational levels of decision making, European integration subscribes to being a multilevel phenomenon. The individual characteristics of citizens, as well as the environment where the integration process takes place, are important. To understand the European integration and its consequences it is important to develop and test multi-level theories that consider individual-level characteristics, as well as the overall context where individuals act and express their characteristics. A central argument of this paper is that support for European integration is influenced by factors operating at different levels. We review and present theories and related research on the use of multilevel analysis in the European area. This paper draws insights on various aspects and consequences of the European integration to take stock of what we know about how and why to use multilevel modeling.
Statistical evaluation of diagnostic performance topics in ROC analysis

CERN Document Server

Zou, Kelly H; Bandos, Andriy I; Ohno-Machado, Lucila; Rockette, Howard E

2016-01-01

Statistical evaluation of diagnostic performance in general and Receiver Operating Characteristic (ROC) analysis in particular are important for assessing the performance of medical tests and statistical classifiers, as well as for evaluating predictive models or algorithms. This book presents innovative approaches in ROC analysis, which are relevant to a wide variety of applications, including medical imaging, cancer research, epidemiology, and bioinformatics. Statistical Evaluation of Diagnostic Performance: Topics in ROC Analysis covers areas including monotone-transformation techniques in parametric ROC analysis, ROC methods for combined and pooled biomarkers, Bayesian hierarchical transformation models, sequential designs and inferences in the ROC setting, predictive modeling, multireader ROC analysis, and free-response ROC (FROC) methodology. The book is suitable for graduate-level students and researchers in statistics, biostatistics, epidemiology, public health, biomedical engineering, radiology, medi...
Statistics of resonances for a class of billiards on the Poincare half-plane

International Nuclear Information System (INIS)

Howard, P J; Mota-Furtado, F; O'Mahony, P F; Uski, V

2005-01-01

The lower boundary of Artin's billiard on the Poincare half-plane is continuously deformed to generate a class of billiards with classical dynamics varying from fully integrable to completely chaotic. The quantum scattering problem in these open billiards is described and the statistics of both real and imaginary parts of the resonant momenta are investigated. The evolution of the resonance positions is followed as the boundary is varied which leads to large changes in their distribution. The transition to arithmetic chaos in Artin's billiard, which is responsible for the Poissonian level-spacing statistics of the bound states in the continuum (cusp forms) at the same time as the formation of a set of resonances all with width 1/4 and real parts determined by the zeros of Riemann's zeta function, is closely examined. Regimes are found which obey the universal predictions of random matrix theory (RMT) as well as exhibiting non-universal long-range correlations. The Brody parameter is used to describe the transitions between different regimes
A Bayesian reliability evaluation method with integrated accelerated degradation testing and field information

International Nuclear Information System (INIS)

Wang, Lizhi; Pan, Rong; Li, Xiaoyang; Jiang, Tongmin

2013-01-01

Accelerated degradation testing (ADT) is a common approach in reliability prediction, especially for products with high reliability. However, oftentimes the laboratory condition of ADT is different from the field condition; thus, to predict field failure, one need to calibrate the prediction made by using ADT data. In this paper a Bayesian evaluation method is proposed to integrate the ADT data from laboratory with the failure data from field. Calibration factors are introduced to calibrate the difference between the lab and the field conditions so as to predict a product's actual field reliability more accurately. The information fusion and statistical inference procedure are carried out through a Bayesian approach and Markov chain Monte Carlo methods. The proposed method is demonstrated by two examples and the sensitivity analysis to prior distribution assumption
Exploring the Associations Among Nutrition, Science, and Mathematics Knowledge for an Integrative, Food-Based Curriculum.

Science.gov (United States)

Stage, Virginia C; Kolasa, Kathryn M; Díaz, Sebastián R; Duffrin, Melani W

2018-01-01

Explore associations between nutrition, science, and mathematics knowledge to provide evidence that integrating food/nutrition education in the fourth-grade curriculum may support gains in academic knowledge. Secondary analysis of a quasi-experimental study. Sample included 438 students in 34 fourth-grade classrooms across North Carolina and Ohio; mean age 10 years old; gender (I = 53.2% female; C = 51.6% female). Dependent variable = post-test-nutrition knowledge; independent variables = baseline-nutrition knowledge, and post-test science and mathematics knowledge. Analyses included descriptive statistics and multiple linear regression. The hypothesized model predicted post-nutrition knowledge (F(437) = 149.4, p mathematics knowledge were predictive of nutrition knowledge indicating use of an integrative science and mathematics curriculum to improve academic knowledge may also simultaneously improve nutrition knowledge among fourth-grade students. Teachers can benefit from integration by meeting multiple academic standards, efficiently using limited classroom time, and increasing nutrition education provided in the classroom. © 2018, American School Health Association.
Combination of statistical and physically based methods to assess shallow slide susceptibility at the basin scale

Science.gov (United States)

Oliveira, Sérgio C.; Zêzere, José L.; Lajas, Sara; Melo, Raquel

2017-07-01

Approaches used to assess shallow slide susceptibility at the basin scale are conceptually different depending on the use of statistical or physically based methods. The former are based on the assumption that the same causes are more likely to produce the same effects, whereas the latter are based on the comparison between forces which tend to promote movement along the slope and the counteracting forces that are resistant to motion. Within this general framework, this work tests two hypotheses: (i) although conceptually and methodologically distinct, the statistical and deterministic methods generate similar shallow slide susceptibility results regarding the model's predictive capacity and spatial agreement; and (ii) the combination of shallow slide susceptibility maps obtained with statistical and physically based methods, for the same study area, generate a more reliable susceptibility model for shallow slide occurrence. These hypotheses were tested at a small test site (13.9 km2) located north of Lisbon (Portugal), using a statistical method (the information value method, IV) and a physically based method (the infinite slope method, IS). The landslide susceptibility maps produced with the statistical and deterministic methods were combined into a new landslide susceptibility map. The latter was based on a set of integration rules defined by the cross tabulation of the susceptibility classes of both maps and analysis of the corresponding contingency tables. The results demonstrate a higher predictive capacity of the new shallow slide susceptibility map, which combines the independent results obtained with statistical and physically based models. Moreover, the combination of the two models allowed the identification of areas where the results of the information value and the infinite slope methods are contradictory. Thus, these areas were classified as uncertain and deserve additional investigation at a more detailed scale.
A study on the fatigue life prediction of tire belt-layers using probabilistic method

International Nuclear Information System (INIS)

Lee, Dong Woo; Park, Jong Sang; Lee, Tae Won; Kim, Seong Rae; Sung, Ki Deug; Huh, Sun Chul

2013-01-01

Tire belt separation failure is occurred by internal cracks generated in *1 and *2 belt layers and by its growth. And belt failure seriously affects tire endurance. Therefore, to improve the tire endurance, it is necessary to analyze tire crack growth behavior and predict fatigue life. Generally, the prediction of tire endurance is performed by the experimental method using tire test machine. But it takes much cost and time to perform experiment. In this paper, to predict tire fatigue life, we applied deterministic fracture mechanics approach, based on finite element analysis. Also, probabilistic analysis method based on statistics using Monte Carlo simulation is presented. Above mentioned two methods include a global-local finite element analysis to provide the detail necessary to model explicitly an internal crack and calculate the J-integral for tire life prediction.
Integrating Non-Tidal Sea Level data from altimetry and tide gauges for coastal sea level prediction

DEFF Research Database (Denmark)

Cheng, Yongcun; Andersen, Ole Baltazar; Knudsen, Per

2012-01-01

The main objective of this paper is to integrate Non-Tidal Sea Level (NSL) from the joint TOPEX, Jason-1 and Jason-2 satellite altimetry with tide gauge data at the west and north coast of the United Kingdom for coastal sea level prediction. The temporal correlation coefficient between altimetric...... NSLs and tide gauge data reaches a maximum higher than 90% for each gauge. The results show that the multivariate regression approach can efficiently integrate the two types of data in the coastal waters of the area. The Multivariate Regression Model is established by integrating the along-track NSL...... from the joint TOPEX/Jason-1/Jason-2 altimeters with that from eleven tide gauges. The model results give a maximum hindcast skill of 0.95, which means maximum 95% of NSL variance can be explained by the model. The minimum Root Mean Square Error (RMSe) between altimetric observations and model...
Statistical prediction of seasonal discharge in Central Asia for water resources management: development of a generic (pre-)operational modeling tool

Science.gov (United States)

Apel, Heiko; Baimaganbetov, Azamat; Kalashnikova, Olga; Gavrilenko, Nadejda; Abdykerimova, Zharkinay; Agalhanova, Marina; Gerlitz, Lars; Unger-Shayesteh, Katy; Vorogushyn, Sergiy; Gafurov, Abror

2017-04-01

for robustness by a leave-one-out cross validation. Based on the cross validation the predictive uncertainty was quantified for every prediction model. According to the official procedures of the hydromet services forecasts of the mean seasonal discharge of the period April to September are derived every month starting from January until June. The application of the model for several catchments in Central Asia - ranging from small to the largest rivers - for the period 2000-2015 provided skillful forecasts for most catchments already in January. The skill of the prediction increased every month, with R2 values often in the range 0.8 - 0.9 in April just before the prediction period. The forecasts further improve in the following months, most likely due to the integration of spring precipitation, which is not included in the predictors before May, or spring discharge, which contains indicative information for the overall seasonal discharge. In summary, the proposed generic automatic forecast model development tool provides robust predictions for seasonal water availability in Central Asia, which will be tested against the official forecasts in the upcoming years, with the vision of eventual operational implementation.
On predicting quantal cross sections by interpolation: Surprisal analysis of j/sub z/CCS and statistical j/sub z/ results

International Nuclear Information System (INIS)

Goldflam, R.; Kouri, D.J.

1976-01-01

New methods for predicting the full matrix of integral cross sections are developed by combining the surprisal analysis of Bernstein and Levine with the j/sub z/-conserving coupled states method (j/sub z/CCS) of McGuire, Kouri, and Pack and with the statistical j/sub z/ approximation (Sj/sub z/) of Kouri, Shimoni, and Heil. A variety of approaches is possible and only three are studied in the present work. These are (a) a surprisal fit of the j=0→j' column of the j/sub z/CCS cross section matrix (thereby requiring only a solution of the lambda=0 set of j/sub z/CCS equations), (b) a surprisal fit of the lambda-bar=0 Sj/sub z/ cross section matrix (again requiring solution of the lambda=0 set of j/sub z/CCS equations only), and (c) a surprisal fit of a lambda-bar not equal to 0 Sj/sub z/ submatrix (involving input cross sections for j,j'> or =lambda-bar transitions only). The last approach requires the solution of the lambda=lambda-bar set of j/sub z/CCS equations only, which requires less computation effort than the effective potential method. We explore three different choices for the prior and two-parameter (i.e., linear) and three-parameter (i.e., parabolic) fits as applied to Ar--N 2 collisions. The results are in general very encouraging and for one choice of prior give results which are within 20% of the exact j/sub z/CCS results
Semiclassical analysis, Witten Laplacians, and statistical mechanis

CERN Document Server

Helffer, Bernard

2002-01-01

This important book explains how the technique of Witten Laplacians may be useful in statistical mechanics. It considers the problem of analyzing the decay of correlations, after presenting its origin in statistical mechanics. In addition, it compares the Witten Laplacian approach with other techniques, such as the transfer matrix approach and its semiclassical analysis. The author concludes by providing a complete proof of the uniform Log-Sobolev inequality. Contents: Witten Laplacians Approach; Problems in Statistical Mechanics with Discrete Spins; Laplace Integrals and Transfer Operators; S
Internally-directed cognition and mindfulness:An integrative perspective derived from reactive versus predictive control systems theory

Directory of Open Access Journals (Sweden)

Mattie eTops

2014-05-01

Full Text Available In the present paper we will apply the Predictive And Reactive Control Systems (PARCS theory as a framework that integrates competing theories of neural substrates of awareness by describing the default mode network (DMN and anterior insula (AI as parts of two different behavioral and homeostatic control systems. The DMN, a network that becomes active at rest when there is no external stimulation or task to perform, has been implicated in self-reflective awareness and prospection. By contrast, the AI is associated with awareness and task-related attention. This has led to competing theories stressing the role of the DMN in self-awareness versus the role of interoceptive and emotional information integration in the AI in awareness of the emotional moment. In PARCS, the respective functions of the DMN and AI in a specific control system explains their association with different qualities of awareness, and how mental states can shift from one state (e.g., prospective self-reflection to the other (e.g., awareness of the emotional moment depending on the relative dominance of control systems. These shifts between reactive and predictive control are part of processes that enable the intake of novel information, integration of this novel information within existing knowledge structures, and the creation of a continuous personal context in which novel information can be integrated and understood. As such, PARCS can explain key characteristics of mental states, such as their temporal and spatial focus (e.g., a focus on the here and now vs. the future; a 1st person vs. a 3rd person perspective. PARCS further relates mental states to brain states and functions, such as activation of the DMN or hemispheric asymmetry in frontal cortical functions. Together, PARCS deepens the understanding of a broad range of mental states, including mindfulness, mind wandering, rumination, autobiographical memory, imagery, and the experience of self.
Study on integrated approach of Nuclear Accident Hazard Predicting, Warning, and Optimized Controlling System based on GIS

International Nuclear Information System (INIS)

Tang Lijuan; Huang Shunxiang; Wang Xinming

2012-01-01

The issue of nuclear safety becomes the attention focus of international society after the nuclear accident happened in Fukushima. Aiming at the requirements of the prevention and controlling of Nuclear Accident establishment of Nuclear Accident Hazard Predicting, Warning and optimized Controlling System (NAPWS) is a imperative project that our country and army are desiderating, which includes multiple fields of subject as nuclear physics, atmospheric science, security science, computer science and geographical information technology, etc. Multiplatform, multi-system and multi-mode are integrated effectively based on GIS, accordingly the Predicting, Warning, and Optimized Controlling technology System of Nuclear Accident Hazard is established. (authors)

Some links on this page may take you to non-federal websites. Their policies may differ from this site.