Statistical criteria for characterizing irradiance time series.
Energy Technology Data Exchange (ETDEWEB)
Stein, Joshua S.; Ellis, Abraham; Hansen, Clifford W.
2010-10-01
We propose and examine several statistical criteria for characterizing time series of solar irradiance. Time series of irradiance are used in analyses that seek to quantify the performance of photovoltaic (PV) power systems over time. Time series of irradiance are either measured or are simulated using models. Simulations of irradiance are often calibrated to or generated from statistics for observed irradiance and simulations are validated by comparing the simulation output to the observed irradiance. Criteria used in this comparison should derive from the context of the analyses in which the simulated irradiance is to be used. We examine three statistics that characterize time series and their use as criteria for comparing time series. We demonstrate these statistics using observed irradiance data recorded in August 2007 in Las Vegas, Nevada, and in June 2009 in Albuquerque, New Mexico.
Trottini, Mario; Vigo, Isabel; Belda, Santiago
2015-01-01
Given a time series, running trends analysis (RTA) involves evaluating least squares trends over overlapping time windows of L consecutive time points, with overlap by all but one observation. This produces a new series called the “running trends series,” which is used as summary statistics of the original series for further analysis. In recent years, RTA has been widely used in climate applied research as summary statistics for time series and time series association. There is no doubt that ...
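The windowing step described in this abstract can be sketched as follows. This is a minimal illustration, assuming equally spaced observations and ordinary least squares against the time index; the function names `ols_slope` and `running_trends` are hypothetical and do not come from the paper.

```python
def ols_slope(y):
    """Least-squares slope of y regressed on the time index 0, 1, ..., len(y)-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def running_trends(series, window):
    """Slopes over every window of `window` consecutive points.

    Consecutive windows are shifted by one observation, so they
    overlap by all but one point, as in the description above.
    """
    if window < 2 or window > len(series):
        raise ValueError("window must satisfy 2 <= window <= len(series)")
    return [ols_slope(series[i:i + window])
            for i in range(len(series) - window + 1)]


# Example usage on a short, made-up series
trends = running_trends([1.0, 2.0, 3.0, 5.0, 8.0, 13.0], window=3)
```

A series of N points yields N - L + 1 running trends for window length L, and neighbouring trends share L - 1 observations, so the running trends series is strongly autocorrelated by construction.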
The Statistical Analysis of Time Series
Anderson, T W
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences George
Time Series Analysis Based on Running Mann Whitney Z Statistics
A sensitive and objective time series analysis method based on the calculation of Mann Whitney U statistics is described. This method samples data rankings over moving time windows, converts those samples to Mann-Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte-...
Inverse statistical approach in heartbeat time series
International Nuclear Information System (INIS)
Ebadi, H; Shirazi, A H; Mani, Ali R; Jafari, G R
2011-01-01
We present an investigation of heart cycle time series using inverse statistical analysis, a concept borrowed from the study of turbulence. Using this approach, we studied the distribution of the exit times needed to achieve a predefined level of heart rate alteration. Such analysis uncovers the most likely waiting time needed to reach a certain change in the rate of heart beat. This analysis showed a significant difference between the raw data and shuffled data when the heart rate accelerates or decelerates to a rare event. We also report that inverse statistical analysis can distinguish between the electrocardiograms taken from healthy volunteers and patients with heart failure.
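The exit-time idea can be illustrated with a small sketch. It assumes that the "alteration" is the absolute change relative to the starting value and that a start point with no later crossing is simply dropped; both are assumptions for illustration, as are the names `exit_times` and `most_likely_exit_time`.

```python
from collections import Counter


def exit_times(series, level):
    """For each start index, the smallest lag at which the change reaches `level`.

    A positive level looks for a rise of at least `level`; a negative
    level looks for a fall of at least |level|. Start points that never
    reach the level within the record are skipped.
    """
    times = []
    n = len(series)
    for start in range(n):
        for lag in range(1, n - start):
            change = series[start + lag] - series[start]
            if (level > 0 and change >= level) or (level < 0 and change <= level):
                times.append(lag)
                break
    return times


def most_likely_exit_time(series, level):
    """Mode of the exit-time distribution, or None if the level is never reached."""
    times = exit_times(series, level)
    if not times:
        return None
    return Counter(times).most_common(1)[0][0]
```

Comparing the exit-time histogram of the raw series with that of a shuffled copy is one way to see whether temporal ordering matters for reaching rare changes.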
Weighted statistical parameters for irregularly sampled time series
Rimoldini, Lorenzo
2014-01-01
Unevenly spaced time series are common in astronomy because of the day-night cycle, weather conditions, dependence on the source position in the sky, allocated telescope time and corrupt measurements, for example, or inherent to the scanning law of satellites like Hipparcos and the forthcoming Gaia. Irregular sampling often causes clumps of measurements and gaps with no data which can severely disrupt the values of estimators. This paper aims at improving the accuracy of common statistical parameters when linear interpolation (in time or phase) can be considered an acceptable approximation of a deterministic signal. A pragmatic solution is formulated in terms of a simple weighting scheme, adapting to the sampling density and noise level, applicable to large data volumes at minimal computational cost. Tests on time series from the Hipparcos periodic catalogue led to significant improvements in the overall accuracy and precision of the estimators with respect to the unweighted counterparts and those weighted by inverse-squared uncertainties. Automated classification procedures employing statistical parameters weighted by the suggested scheme confirmed the benefits of the improved input attributes. The classification of eclipsing binaries, Mira, RR Lyrae, Delta Cephei and Alpha2 Canum Venaticorum stars employing exclusively weighted descriptive statistics achieved an overall accuracy of 92 per cent, about 6 per cent higher than with unweighted estimators.
Interrupted Time Series Versus Statistical Process Control in Quality Improvement Projects.
Andersson Hagiwara, Magnus; Andersson Gäre, Boel; Elg, Mattias
2016-01-01
To measure the effect of quality improvement interventions, it is appropriate to use analysis methods that measure data over time. Examples of such methods include statistical process control analysis and interrupted time series with segmented regression analysis. This article compares the use of statistical process control analysis and interrupted time series with segmented regression analysis for evaluating the longitudinal effects of quality improvement interventions, using an example study on an evaluation of a computerized decision support system.
A general statistical test for correlations in a finite-length time series.
Hanson, Jeffery A; Yang, Haw
2008-06-07
The statistical properties of the autocorrelation function from a time series composed of independently and identically distributed stochastic variables has been studied. Analytical expressions for the autocorrelation function's variance have been derived. It has been found that two common ways of calculating the autocorrelation, moving-average and Fourier transform, exhibit different uncertainty characteristics. For periodic time series, the Fourier transform method is preferred because it gives smaller uncertainties that are uniform through all time lags. Based on these analytical results, a statistically robust method has been proposed to test the existence of correlations in a time series. The statistical test is verified by computer simulations and an application to single-molecule fluorescence spectroscopy is discussed.
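The two ways of computing an autocorrelation can be contrasted with a short sketch. It assumes that the direct estimator averages lagged products over the available pairs and that the Fourier estimator is the circular autocorrelation obtained from the power spectrum; the paper's exact definitions may differ. A naive O(n^2) discrete Fourier transform is used so that the example needs only the standard library.

```python
import cmath


def autocorr_direct(x, lag):
    """Lagged-product estimate, averaged over the n - lag available pairs."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / (n - lag)
    return cov / var


def autocorr_fourier(x):
    """Circular autocorrelation at all lags via the power spectrum.

    Uses the Wiener-Khinchin relation: the inverse transform of
    |X_k|^2 is the circular autocovariance of the centred series.
    """
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    spectrum = [sum(d[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
                for k in range(n)]
    power = [abs(s) ** 2 for s in spectrum]
    acov = [sum(power[k] * cmath.exp(2j * cmath.pi * k * lag / n) for k in range(n)).real / n
            for lag in range(n)]
    return [a / acov[0] for a in acov]
```

The Fourier version treats the series as periodic, so every lag uses all n products. The direct version uses fewer products as the lag grows, which is one intuitive reason why its uncertainty is not uniform across lags.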
On statistical inference in time series analysis of the evolution of road safety.
Commandeur, Jacques J F; Bijleveld, Frits D; Bergel-Hayat, Ruth; Antoniou, Constantinos; Yannis, George; Papadimitriou, Eleonora
2013-11-01
Data collected for building a road safety observatory usually include observations made sequentially through time. Examples of such data, called time series data, include annual (or monthly) number of road traffic accidents, traffic fatalities or vehicle kilometers driven in a country, as well as the corresponding values of safety performance indicators (e.g., data on speeding, seat belt use, alcohol use, etc.). Some commonly used statistical techniques imply assumptions that are often violated by the special properties of time series data, namely serial dependency among disturbances associated with the observations. The first objective of this paper is to demonstrate the impact of such violations to the applicability of standard methods of statistical inference, which leads to an under or overestimation of the standard error and consequently may produce erroneous inferences. Moreover, having established the adverse consequences of ignoring serial dependency issues, the paper aims to describe rigorous statistical techniques used to overcome them. In particular, appropriate time series analysis techniques of varying complexity are employed to describe the development over time, relating the accident-occurrences to explanatory factors such as exposure measures or safety performance indicators, and forecasting the development into the near future. Traditional regression models (whether they are linear, generalized linear or nonlinear) are shown not to naturally capture the inherent dependencies in time series data. Dedicated time series analysis techniques, such as the ARMA-type and DRAG approaches are discussed next, followed by structural time series models, which are a subclass of state space methods. The paper concludes with general recommendations and practice guidelines for the use of time series models in road safety research. Copyright © 2012 Elsevier Ltd. All rights reserved.
Time series analysis: methods and applications
Rao, Tata Subba; Rao, C R
2012-01-01
The field of statistics not only affects all areas of scientific activity, but also many other matters such as public policy. It is branching rapidly into so many different subjects that a series of handbooks is the only way of comprehensively presenting the various aspects of statistical methodology, applications, and recent developments. The Handbook of Statistics is a series of self-contained reference books. Each volume is devoted to a particular topic in statistics, with Volume 30 dealing with time series. The series is addressed to the entire community of statisticians and scientists in various disciplines who use statistical methodology in their work. At the same time, special emphasis is placed on applications-oriented techniques, with the applied statistician in mind as the primary audience. Comprehensively presents the various aspects of statistical methodology. Discusses a wide variety of diverse applications and recent developments. Contributors are internationally renowned experts in their respect...
ESTIMATING RELIABILITY OF DISTURBANCES IN SATELLITE TIME SERIES DATA BASED ON STATISTICAL ANALYSIS
Directory of Open Access Journals (Sweden)
Z.-G. Zhou
2016-06-01
Normally, the status of land cover is inherently dynamic and changes continuously on a temporal scale. However, disturbances or abnormal changes of land cover, caused by events such as forest fire, flood, deforestation, and plant diseases, occur worldwide at unknown times and locations. Timely detection and characterization of these disturbances is important for land cover monitoring. Recently, many time-series-analysis methods have been developed for near real-time or online disturbance detection using satellite image time series. However, most of the present methods only label the detection results as “Change/No change”, while few methods focus on estimating the reliability (or confidence level) of the detected disturbances in image time series. To this end, this paper proposes a statistical analysis method for estimating the reliability of disturbances in newly available remote sensing image time series, through analysis of the full temporal information contained in the time series data. The method consists of three main steps: (1) segmenting and modelling historical time series data based on Breaks for Additive Seasonal and Trend (BFAST); (2) forecasting and detecting disturbances in new time series data; (3) estimating the reliability of each detected disturbance using statistical analysis based on Confidence Interval (CI) and Confidence Levels (CL). The method was validated by estimating the reliability of disturbance regions caused by a recent severe flood that occurred around the border of Russia and China. Results demonstrated that the method can estimate the reliability of disturbances detected in satellite images with an estimation error of less than 5% and an overall accuracy of up to 90%.
van den Broek, PLC; van Egmond, J; van Rijn, CM; Takens, F; Coenen, AML; Booij, LHDJ
2005-01-01
Background: This study assessed the feasibility of online calculation of the correlation integral (C(r)) aiming to apply C(r)-derived statistics. For real-time application it is important to reduce calculation time. It is shown how our method works for EEG time series. Methods: To achieve online
Broek, P.L.C. van den; Egmond, J. van; Rijn, C.M. van; Takens, F.; Coenen, A.M.L.; Booij, L.H.D.J.
2005-01-01
This study assessed the feasibility of online calculation of the correlation integral (C(r)) aiming to apply C(r)-derived statistics. For real-time application it is important to reduce calculation time. It is shown how our method works for EEG time series. Methods: To achieve online calculation of
Nonlinear Fluctuation Behavior of Financial Time Series Model by Statistical Physics System
Directory of Open Access Journals (Sweden)
Wuyang Cheng
2014-01-01
We develop a random financial time series model of the stock market using one of the statistical physics systems, the stochastic contact interacting system. The contact process is a continuous time Markov process; one interpretation of this model is as a model for the spread of an infection, where the epidemic spreading mimics the interplay of local infections and recovery of individuals. From this financial model, we study the statistical behaviors of return time series, and the corresponding behaviors of returns for the Shanghai Stock Exchange Composite Index (SSECI) and Hang Seng Index (HSI) are also comparatively studied. Further, we investigate the Zipf distribution and multifractal phenomenon of returns and price changes. Zipf analysis and MF-DFA analysis are applied to investigate the nature of fluctuations for the stock market.
Statistical methods of parameter estimation for deterministically chaotic time series
Pisarenko, V. F.; Sornette, D.
2004-03-01
We discuss the possibility of applying some standard statistical methods (the least-square method, the maximum likelihood method, and the method of statistical moments for estimation of parameters) to deterministically chaotic low-dimensional dynamic system (the logistic map) containing an observational noise. A “segmentation fitting” maximum likelihood (ML) method is suggested to estimate the structural parameter of the logistic map along with the initial value x1 considered as an additional unknown parameter. The segmentation fitting method, called “piece-wise” ML, is similar in spirit but simpler and has smaller bias than the “multiple shooting” previously proposed. Comparisons with different previously proposed techniques on simulated numerical examples give favorable results (at least, for the investigated combinations of sample size N and noise level). Besides, unlike some suggested techniques, our method does not require the a priori knowledge of the noise variance. We also clarify the nature of the inherent difficulties in the statistical analysis of deterministically chaotic time series and the status of previously proposed Bayesian approaches. We note the trade off between the need of using a large number of data points in the ML analysis to decrease the bias (to guarantee consistency of the estimation) and the unstable nature of dynamical trajectories with exponentially fast loss of memory of the initial condition. The method of statistical moments for the estimation of the parameter of the logistic map is discussed. This method seems to be the unique method whose consistency for deterministically chaotic time series is proved so far theoretically (not only numerically).
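To make the estimation problem concrete, here is a minimal sketch of the simplest of the approaches mentioned, a one-step least-squares fit of the logistic map parameter from noisy observations. It is not the piece-wise ML method of the paper. The map is assumed to have the standard form x(t+1) = r x(t) (1 - x(t)), and the names `simulate_logistic`, `add_noise` and `least_squares_r` are hypothetical.

```python
import random


def simulate_logistic(r, x0, n):
    """Noise-free trajectory of x(t+1) = r * x(t) * (1 - x(t))."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs


def add_noise(xs, sigma, seed=0):
    """Add independent Gaussian observational noise with standard deviation sigma."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, sigma) for x in xs]


def least_squares_r(ys):
    """One-step least-squares estimate of r from observations ys.

    Minimises sum (y[t+1] - r * y[t] * (1 - y[t]))**2 over r, which
    has the closed form below. Noise in the regressor biases this
    estimate, which is one of the difficulties the abstract refers to.
    """
    g = [y * (1 - y) for y in ys[:-1]]
    num = sum(gi * y_next for gi, y_next in zip(g, ys[1:]))
    den = sum(gi * gi for gi in g)
    return num / den


# Example usage with made-up settings
clean = simulate_logistic(3.8, 0.3, 200)
noisy = add_noise(clean, sigma=0.01)
r_hat = least_squares_r(noisy)
```

On noise-free data the estimate recovers r up to rounding. With observational noise the regressor itself is contaminated, so the estimate is biased even for long records.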
Time series prediction: statistical and neural techniques
Zahirniak, Daniel R.; DeSimio, Martin P.
1996-03-01
In this paper we compare the performance of nonlinear neural network techniques to those of linear filtering techniques in the prediction of time series. Specifically, we compare the results of using the nonlinear systems, known as multilayer perceptron and radial basis function neural networks, with the results obtained using the conventional linear Wiener filter, Kalman filter and Widrow-Hoff adaptive filter in predicting future values of stationary and non-stationary time series. Our results indicate the performance of each type of system is heavily dependent upon the form of the time series being predicted and the size of the system used. In particular, the linear filters perform adequately for linear or near linear processes while the nonlinear systems perform better for nonlinear processes. Since the linear systems take much less time to be developed, they should be tried prior to using the nonlinear systems when the linearity properties of the time series process are unknown.
Hybrid perturbation methods based on statistical time series models
San-Juan, Juan Félix; San-Martín, Montserrat; Pérez, Iván; López, Rosario
2016-04-01
In this work we present a new methodology for orbit propagation, the hybrid perturbation theory, based on the combination of an integration method and a prediction technique. The former, which can be a numerical, analytical or semianalytical theory, generates an initial approximation that contains some inaccuracies derived from the fact that, in order to simplify the expressions and subsequent computations, not all the involved forces are taken into account and only low-order terms are considered, not to mention the fact that mathematical models of perturbations not always reproduce physical phenomena with absolute precision. The prediction technique, which can be based on either statistical time series models or computational intelligence methods, is aimed at modelling and reproducing missing dynamics in the previously integrated approximation. This combination results in the precision improvement of conventional numerical, analytical and semianalytical theories for determining the position and velocity of any artificial satellite or space debris object. In order to validate this methodology, we present a family of three hybrid orbit propagators formed by the combination of three different orders of approximation of an analytical theory and a statistical time series model, and analyse their capability to process the effect produced by the flattening of the Earth. The three considered analytical components are the integration of the Kepler problem, a first-order and a second-order analytical theories, whereas the prediction technique is the same in the three cases, namely an additive Holt-Winters method.
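The prediction component named at the end of this abstract, an additive Holt-Winters method, can be sketched in a few lines. This is a generic textbook variant, not the authors' implementation. The initialisation from the first two seasons and the parameter names `alpha`, `beta`, `gamma` and `m` (season length) are assumptions for illustration.

```python
def holt_winters_additive(x, m, alpha, beta, gamma, horizon):
    """Additive Holt-Winters smoothing followed by an h-step forecast.

    x: observed series, m: season length, alpha/beta/gamma: smoothing
    weights for level, trend and seasonal terms (each in [0, 1]).
    """
    if len(x) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    # Simple initialisation from the first two seasons (an assumption).
    level = sum(x[:m]) / m
    trend = (sum(x[m:2 * m]) - sum(x[:m])) / (m * m)
    season = [x[i] - level for i in range(m)]
    for t in range(m, len(x)):
        s_old = season[t % m]
        prev_level = level
        level = alpha * (x[t] - s_old) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % m] = gamma * (x[t] - level) + (1 - gamma) * s_old
    n = len(x)
    # Forecast for time n + h - 1 uses the seasonal term of the same phase.
    return [level + h * trend + season[(n + h - 1) % m]
            for h in range(1, horizon + 1)]
```

In a hybrid propagator of the kind described, the input series would presumably be the difference between the approximate analytical solution and a more accurate reference, so that the forecast supplies the missing dynamics.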
Introduction to Time Series Modeling
Kitagawa, Genshiro
2010-01-01
In time series modeling, the behavior of a certain phenomenon is expressed in relation to the past values of itself and other covariates. Since many important phenomena in statistical analysis are actually time series and the identification of conditional distribution of the phenomenon is an essential part of the statistical modeling, it is very important and useful to learn fundamental methods of time series modeling. Illustrating how to build models for time series using basic methods, "Introduction to Time Series Modeling" covers numerous time series models and the various tools f
CROSAT: A digital computer program for statistical-spectral analysis of two discrete time series
International Nuclear Information System (INIS)
Antonopoulos Domis, M.
1978-03-01
The program CROSAT computes directly from two discrete time series auto- and cross-spectra, transfer and coherence functions, using a Fast Fourier Transform subroutine. Statistical analysis of the time series is optional. While of general use the program is constructed to be immediately compatible with the ICL 4-70 and H316 computers at AEE Winfrith, and perhaps with minor modifications, with any other hardware system. (author)
Record statistics of financial time series and geometric random walks.
Sabir, Behlool; Santhanam, M S
2014-09-01
The study of record statistics of correlated series in physics, such as random walks, is gaining momentum, and several analytical results have been obtained in the past few years. In this work, we study the record statistics of correlated empirical data for which random walk models have relevance. We obtain results for the records statistics of select stock market data and the geometric random walk, primarily through simulations. We show that the distribution of the age of records is a power law with the exponent α lying in the range 1.5≤α≤1.8. Further, the longest record ages follow the Fréchet distribution of extreme value theory. The records statistics of geometric random walk series is in good agreement with that obtained from empirical stock data.
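The quantities studied here can be computed with a short sketch. It assumes that a record is an observation strictly larger than all earlier ones, that the age of a record is the number of steps until it is broken, and that the geometric random walk has Gaussian log-increments; the function names are hypothetical.

```python
import math
import random


def geometric_random_walk(n, sigma=0.01, x0=1.0, seed=0):
    """x(t+1) = x(t) * exp(xi), with xi ~ N(0, sigma^2) (an assumed form)."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(n - 1):
        xs.append(xs[-1] * math.exp(rng.gauss(0.0, sigma)))
    return xs


def record_indices(series):
    """Indices of upper records; the first observation counts as a record."""
    idx = [0]
    best = series[0]
    for i in range(1, len(series)):
        if series[i] > best:
            best = series[i]
            idx.append(i)
    return idx


def record_ages(series):
    """Steps each record survived before being broken.

    The last record is still standing at the end of the series, so its
    incomplete age is left out.
    """
    idx = record_indices(series)
    return [b - a for a, b in zip(idx, idx[1:])]


# Example usage with made-up settings
walk = geometric_random_walk(1000)
ages = record_ages(walk)
```

A histogram of `ages` pooled over many simulated walks is the kind of empirical distribution that would be compared with a power law.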
A statistical approach for segregating cognitive task stages from multivariate fMRI BOLD time series
Directory of Open Access Journals (Sweden)
Charmaine Demanuele
2015-10-01
Multivariate pattern analysis can reveal new information from neuroimaging data to illuminate human cognition and its disturbances. Here, we develop a methodological approach, based on multivariate statistical/machine learning and time series analysis, to discern cognitive processing stages from fMRI blood oxygenation level dependent (BOLD) time series. We apply this method to data recorded from a group of healthy adults whilst performing a virtual reality version of the delayed win-shift radial arm maze task. This task has been frequently used to study working memory and decision making in rodents. Using linear classifiers and multivariate test statistics in conjunction with time series bootstraps, we show that different cognitive stages of the task, as defined by the experimenter, namely, the encoding/retrieval, choice, reward and delay stages, can be statistically discriminated from the BOLD time series in brain areas relevant for decision making and working memory. Discrimination of these task stages was significantly reduced during poor behavioral performance in dorsolateral prefrontal cortex (DLPFC), but not in the primary visual cortex (V1). Experimenter-defined dissection of time series into class labels based on task structure was confirmed by an unsupervised, bottom-up approach based on Hidden Markov Models. Furthermore, we show that different groupings of recorded time points into cognitive event classes can be used to test hypotheses about the specific cognitive role of a given brain region during task execution. We found that whilst the DLPFC strongly differentiated between task stages associated with different memory loads, but not between different visual-spatial aspects, the reverse was true for V1. Our methodology illustrates how different aspects of cognitive information processing during one and the same task can be separated and attributed to specific brain regions based on information contained in multivariate patterns of voxel
A new Markov-chain-related statistical approach for modelling synthetic wind power time series
International Nuclear Information System (INIS)
Pesch, T; Hake, J F; Schröders, S; Allelein, H J
2015-01-01
The integration of rising shares of volatile wind power in the generation mix is a major challenge for the future energy system. To address the uncertainties involved in wind power generation, models analysing and simulating the stochastic nature of this energy source are becoming increasingly important. One statistical approach that has been frequently used in the literature is the Markov chain approach. Recently, the method was identified as being of limited use for generating wind time series with time steps shorter than 15–40 min as it is not capable of reproducing the autocorrelation characteristics accurately. This paper presents a new Markov-chain-related statistical approach that is capable of solving this problem by introducing a variable second lag. Furthermore, additional features are presented that allow for the further adjustment of the generated synthetic time series. The influences of the model parameter settings are examined by meaningful parameter variations. The suitability of the approach is demonstrated by an application analysis with the example of the wind feed-in in Germany. It shows that—in contrast to conventional Markov chain approaches—the generated synthetic time series do not systematically underestimate the required storage capacity to balance wind power fluctuation. (paper)
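For orientation, the conventional first-order Markov chain approach that this paper improves upon can be sketched as below. The variable second lag introduced by the authors is not reproduced here. Equal-width binning into states and the names `discretize`, `transition_matrix` and `simulate_chain` are assumptions for illustration.

```python
import random


def discretize(values, n_states):
    """Map continuous values to equal-width state indices 0 .. n_states-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_states or 1.0  # guard against a constant series
    return [min(int((v - lo) / width), n_states - 1) for v in values]


def transition_matrix(states, n_states):
    """Row-normalised counts of observed one-step transitions."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # Unvisited state: fall back to a uniform row (an assumption).
            matrix.append([1.0 / n_states] * n_states)
        else:
            matrix.append([c / total for c in row])
    return matrix


def simulate_chain(matrix, start, length, seed=0):
    """Draw a synthetic state sequence from the fitted chain."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(length - 1):
        weights = matrix[path[-1]]
        path.append(rng.choices(range(len(matrix)), weights=weights)[0])
    return path
```

Because each step depends only on the current state, a chain like this tends to lose memory too quickly at short time steps, which is the autocorrelation shortcoming the abstract describes.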
Brenčič, Mihael
2016-01-01
Northern hemisphere elementary circulation mechanisms, defined with the Dzerdzeevski classification and published on a daily basis from 1899 to 2012, are analysed with statistical methods as continuous categorical time series. Classification consists of 41 elementary circulation mechanisms (ECM), which are assigned to calendar days. Empirical marginal probabilities of each ECM were determined. Seasonality and the periodicity effect were investigated with moving dispersion filters and randomisation procedure on the ECM categories as well as with the time analyses of the ECM mode. The time series were determined as being non-stationary with strong time-dependent trends. During the investigated period, periodicity interchanges with periods when no seasonality is present. In the time series structure, the strongest division is visible at the milestone of 1986, showing that the atmospheric circulation pattern reflected in the ECM has significantly changed. This change is the result of the change in the frequency of ECM categories; before 1986, the appearance of ECM was more diverse, and afterwards fewer ECMs appear. The statistical approach applied to the categorical climatic time series opens up new potential insight into climate variability and change studies that have to be performed in the future.
How to statistically analyze nano exposure measurement results: using an ARIMA time series approach
International Nuclear Information System (INIS)
Klein Entink, Rinke H.; Fransman, Wouter; Brouwer, Derk H.
2011-01-01
Measurement strategies for exposure to nano-sized particles differ from traditional integrated sampling methods for exposure assessment by the use of real-time instruments. The resulting measurement series is a time series, where typically the sequential measurements are not independent from each other but show a pattern of autocorrelation. This article addresses the statistical difficulties when analyzing real-time measurements for exposure assessment to manufactured nano objects. To account for autocorrelation patterns, Autoregressive Integrated Moving Average (ARIMA) models are proposed. A simulation study shows the pitfalls of using a standard t-test and the application of ARIMA models is illustrated with three real-data examples. Some practical suggestions for the data analysis of real-time exposure measurements conclude this article.
Detecting nonlinear structure in time series
International Nuclear Information System (INIS)
Theiler, J.
1991-01-01
We describe an approach for evaluating the statistical significance of evidence for nonlinearity in a time series. The formal application of our method requires the careful statement of a null hypothesis which characterizes a candidate linear process, the generation of an ensemble of "surrogate" data sets which are similar to the original time series but consistent with the null hypothesis, and the computation of a discriminating statistic for the original and for each of the surrogate data sets. The idea is to test the original time series against the null hypothesis by checking whether the discriminating statistic computed for the original time series differs significantly from the statistics computed for each of the surrogate sets. While some data sets very cleanly exhibit low-dimensional chaos, there are many cases where the evidence is sketchy and difficult to evaluate. We hope to provide a framework within which such claims of nonlinearity can be evaluated. 5 refs., 4 figs
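The three ingredients listed above (null hypothesis, surrogate ensemble, discriminating statistic) can be sketched in their simplest form. The sketch assumes the most basic null hypothesis, temporally independent noise, for which random shuffles are valid surrogates. Richer linear nulls need other surrogate generators that are not shown here. The lag-1 autocorrelation is only a placeholder statistic, and all names are hypothetical.

```python
import random


def lag1_autocorr(x):
    """Lag-1 autocorrelation, used here as a placeholder discriminating statistic."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return sum(a * b for a, b in zip(d, d[1:])) / sum(v * v for v in d)


def surrogate_test(x, statistic, n_surrogates=99, seed=0):
    """Compare the statistic of x with that of shuffled surrogates.

    Returns the observed statistic and a two-sided rank-based p-value
    (count of surrogates at least as extreme, plus one, over
    n_surrogates plus one).
    """
    rng = random.Random(seed)
    observed = statistic(x)
    at_least_as_extreme = 0
    for _ in range(n_surrogates):
        surrogate = list(x)
        rng.shuffle(surrogate)  # destroys temporal order, keeps the values
        if abs(statistic(surrogate)) >= abs(observed):
            at_least_as_extreme += 1
    p_value = (at_least_as_extreme + 1) / (n_surrogates + 1)
    return observed, p_value
```

Swapping in a different surrogate generator and a statistic sensitive to nonlinearity changes the hypothesis being tested while leaving the comparison logic the same.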
Shirota, Yukari; Hashimoto, Takako; Fitri Sari, Riri
2018-03-01
Visualizing big time series data has become very important. In this paper we discuss a new analysis method called “statistical shape analysis” or “geometry driven statistics” applied to time series statistical data in economics. We analyse the changes in agriculture, value added and industry, value added (percentage of GDP) from 2000 to 2010 in Asia. We handle the data as a set of landmarks on a two-dimensional image to see the deformation using the principal components. The key point of the analysis method is the principal components of the given formation, which are eigenvectors of its bending energy matrix. The local deformation can be expressed as a set of non-affine transformations. The transformations give us information about the local differences between 2000 and 2010. Because the non-affine transformation can be decomposed into a set of partial warps, we present the partial warps visually. Statistical shape analysis is widely used in biology but, in economics, no application can be found. In this paper, we investigate its potential to analyse economic data.
Stochastic models for time series
Doukhan, Paul
2018-01-01
This book presents essential tools for modelling non-linear time series. The first part of the book describes the main standard tools of probability and statistics that directly apply to the time series context to obtain a wide range of modelling possibilities. Functional estimation and bootstrap are discussed, and stationarity is reviewed. The second part describes a number of tools from Gaussian chaos and proposes a tour of linear time series models. It goes on to address nonlinearity from polynomial or chaotic models for which explicit expansions are available, then turns to Markov and non-Markov linear models and discusses Bernoulli shifts time series models. Finally, the volume focuses on the limit theory, starting with the ergodic theorem, which is seen as the first step for statistics of time series. It defines the distributional range to obtain generic tools for limit theory under long or short-range dependences (LRD/SRD) and explains examples of LRD behaviours. More general techniques (central limit ...
Duality between Time Series and Networks
Campanharo, Andriana S. L. O.; Sirer, M. Irmak; Malmgren, R. Dean; Ramos, Fernando M.; Amaral, Luís A. Nunes.
2011-01-01
Studying the interaction between a system's components and the temporal evolution of the system are two common ways to uncover and characterize its internal workings. Recently, several maps from a time series to a network have been proposed with the intent of using network metrics to characterize time series. Although these maps demonstrate that different time series result in networks with distinct topological properties, it remains unclear how these topological properties relate to the original time series. Here, we propose a map from a time series to a network with an approximate inverse operation, making it possible to use network statistics to characterize time series and time series statistics to characterize networks. As a proof of concept, we generate an ensemble of time series ranging from periodic to random and confirm that application of the proposed map retains much of the information encoded in the original time series (or networks) after application of the map (or its inverse). Our results suggest that network analysis can be used to distinguish different dynamic regimes in time series and, perhaps more importantly, time series analysis can provide a powerful set of tools that augment the traditional network analysis toolkit to quantify networks in new and useful ways. PMID:21858093
Berti, Matteo; Corsini, Alessandro; Franceschini, Silvia; Iannacone, Jean Pascal
2013-04-01
The application of space borne synthetic aperture radar interferometry has progressed, over the last two decades, from the pioneering use of single interferograms for analyzing changes on the earth's surface to the development of advanced multi-interferogram techniques to analyze any sort of natural phenomenon which involves movements of the ground. The success of multi-interferogram techniques in the analysis of natural hazards such as landslides and subsidence is widely documented in the scientific literature and demonstrated by the consensus among the end-users. Despite the great potential of this technique, radar interpretation of slope movements is generally based on the sole analysis of average displacement velocities, while the information contained in multi-interferogram time series is often overlooked if not completely neglected. The underuse of PS time series is probably due to the detrimental effect of residual atmospheric errors, which cause the PS time series to be characterized by erratic, irregular fluctuations that are often difficult to interpret, and also to the difficulty of performing a visual, supervised analysis of the time series for a large dataset. In this work we present a procedure for automatic classification of PS time series based on a series of statistical characterization tests. The procedure makes it possible to classify the time series into six distinctive target trends (0=uncorrelated; 1=linear; 2=quadratic; 3=bilinear; 4=discontinuous without constant velocity; 5=discontinuous with change in velocity) and to retrieve for each trend a series of descriptive parameters which can be efficiently used to characterize the temporal changes of ground motion. The classification algorithms were developed and tested using an ENVISAT dataset available in the frame of the EPRS-E project (Extraordinary Plan of Environmental Remote Sensing) of the Italian Ministry of Environment (track "Modena", Northern Apennines). This dataset was generated using standard processing, then the
A Course in Time Series Analysis
Peña, Daniel; Tsay, Ruey S
2011-01-01
New statistical methods and future directions of research in time series. A Course in Time Series Analysis demonstrates how to build time series models for univariate and multivariate time series data. It brings together material previously available only in the professional literature and presents a unified view of the most advanced procedures available for time series model building. The authors begin with basic concepts in univariate time series, providing an up-to-date presentation of ARIMA models, including the Kalman filter, outlier analysis, automatic methods for building ARIMA models, a
van den Akker, R.
2007-01-01
This thesis addresses statistical problems in econometrics. The first part contributes statistical methodology for nonnegative integer-valued time series. The second part of this thesis discusses semiparametric estimation in copula models and develops semiparametric lower bounds for a large class of
Neural Network Models for Time Series Forecasts
Tim Hill; Marcus O'Connor; William Remus
1996-01-01
Neural networks have been advocated as an alternative to traditional statistical forecasting methods. In the present experiment, time series forecasts produced by neural networks are compared with forecasts from six statistical time series methods generated in a major forecasting competition (Makridakis et al. [Makridakis, S., A. Anderson, R. Carbone, R. Fildes, M. Hibon, R. Lewandowski, J. Newton, E. Parzen, R. Winkler. 1982. The accuracy of extrapolation (time series) methods: Results of a ...
International Nuclear Information System (INIS)
Kopsaftopoulos, Fotis P; Fassois, Spilios D
2011-01-01
A comparative assessment of several vibration based statistical time series methods for Structural Health Monitoring (SHM) is presented via their application to a scale aircraft skeleton laboratory structure. A brief overview of the methods, which are either scalar or vector type, non-parametric or parametric, and pertain to either the response-only or excitation-response cases, is provided. Damage diagnosis, including both the detection and identification subproblems, is tackled via scalar or vector vibration signals. The methods' effectiveness is assessed via repeated experiments under various damage scenarios, with each scenario corresponding to the loosening of one or more selected bolts. The results of the study confirm the 'global' damage detection capability and effectiveness of statistical time series methods for SHM.
西埜, 晴久
2004-01-01
The paper investigates an application of long-memory processes to economic time series. We show properties of long-memory processes, which are motivated to model a long-memory phenomenon in economic time series. An FARIMA model is described as an example of long-memory model in statistical terms. The paper explains basic limit theorems and estimation methods for long-memory processes in order to apply long-memory models to economic time series.
Loredo, Thomas; Budavari, Tamas; Scargle, Jeffrey D.
2018-01-01
This presentation provides an overview of open-source software packages addressing two challenging classes of astrostatistics problems. (1) CUDAHM is a C++ framework for hierarchical Bayesian modeling of cosmic populations, leveraging graphics processing units (GPUs) to enable applying this computationally challenging paradigm to large datasets. CUDAHM is motivated by measurement error problems in astronomy, where density estimation and linear and nonlinear regression must be addressed for populations of thousands to millions of objects whose features are measured with possibly complex uncertainties, potentially including selection effects. An example calculation demonstrates accurate GPU-accelerated luminosity function estimation for simulated populations of $10^6$ objects in about two hours using a single NVIDIA Tesla K40c GPU. (2) Time Series Explorer (TSE) is a collection of software in Python and MATLAB for exploratory analysis and statistical modeling of astronomical time series. It comprises a library of stand-alone functions and classes, as well as an application environment for interactive exploration of time series data. The presentation will summarize key capabilities of this emerging project, including new algorithms for analysis of irregularly-sampled time series.
Record statistics of a strongly correlated time series: random walks and Lévy flights
Godrèche, Claude; Majumdar, Satya N.; Schehr, Grégory
2017-08-01
We review recent advances on the record statistics of strongly correlated time series, whose entries denote the positions of a random walk or a Lévy flight on a line. After a brief survey of the theory of records for independent and identically distributed random variables, we focus on random walks. During the last few years, it was indeed realized that random walks are a very useful ‘laboratory’ to test the effects of correlations on the record statistics. We start with the simple one-dimensional random walk with symmetric jumps (both continuous and discrete) and discuss in detail the statistics of the number of records, as well as of the ages of the records, i.e. the lapses of time between two successive record breaking events. Then we review the results that were obtained for a wide variety of random walk models, including random walks with a linear drift, continuous time random walks, constrained random walks (like the random walk bridge) and the case of multiple independent random walkers. Finally, we discuss further observables related to records, like the record increments, as well as some questions raised by physical applications of record statistics, like the effects of measurement error and noise.
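To make the notion of a record concrete, the following is a minimal sketch that counts upper records in a simulated symmetric random walk and, for comparison, in an i.i.d. sample. The Gaussian jump distribution, the series length and the seed are illustrative assumptions, not values taken from the review.

```python
import random


def count_records(series):
    """Count upper records: entries strictly larger than every earlier entry."""
    records = 0
    running_max = float("-inf")
    for value in series:
        if value > running_max:
            records += 1
            running_max = value
    return records


def random_walk(n_steps, rng):
    """Positions of a walk with symmetric, continuous (Gaussian) jumps, starting at 0."""
    position = 0.0
    path = [position]
    for _ in range(n_steps):
        position += rng.gauss(0.0, 1.0)
        path.append(position)
    return path


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
walk = random_walk(1000, rng)
iid = [rng.gauss(0.0, 1.0) for _ in range(len(walk))]

print("records in the random walk: ", count_records(walk))
print("records in the i.i.d. sample:", count_records(iid))
```

The first entry always counts as a record under this convention. The ages of the records mentioned in the abstract could be obtained from the same loop by storing the index at which each record occurs.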
Mendoza-Rosas, Ana Teresa; De la Cruz-Reyna, Servando
2008-09-01
The probabilistic analysis of volcanic eruption time series is an essential step for the assessment of volcanic hazard and risk. Such series describe complex processes involving different types of eruptions over different time scales. A statistical method linking geological and historical eruption time series is proposed for calculating the probabilities of future eruptions. The first step of the analysis is to characterize the eruptions by their magnitudes. As is the case in most natural phenomena, lower magnitude events are more frequent, and the behavior of the eruption series may be biased by such events. On the other hand, eruptive series are commonly studied using conventional statistics and treated as homogeneous Poisson processes. However, time-dependent series, or sequences including rare or extreme events, represented by very few data of large eruptions require special methods of analysis, such as the extreme-value theory applied to non-homogeneous Poisson processes. Here we propose a general methodology for analyzing such processes attempting to obtain better estimates of the volcanic hazard. This is done in three steps: Firstly, the historical eruptive series is complemented with the available geological eruption data. The linking of these series is done assuming an inverse relationship between the eruption magnitudes and the occurrence rate of each magnitude class. Secondly, we perform a Weibull analysis of the distribution of repose time between successive eruptions. Thirdly, the linked eruption series are analyzed as a non-homogeneous Poisson process with a generalized Pareto distribution as intensity function. As an application, the method is tested on the eruption series of five active polygenetic Mexican volcanoes: Colima, Citlaltépetl, Nevado de Toluca, Popocatépetl and El Chichón, to obtain hazard estimates.
International Work-Conference on Time Series
Pomares, Héctor
2016-01-01
This volume presents selected peer-reviewed contributions from The International Work-Conference on Time Series, ITISE 2015, held in Granada, Spain, July 1-3, 2015. It discusses topics in time series analysis and forecasting, advanced methods and online learning in time series, high-dimensional and complex/big data time series as well as forecasting in real problems. The International Work-Conferences on Time Series (ITISE) provide a forum for scientists, engineers, educators and students to discuss the latest ideas and implementations in the foundations, theory, models and applications in the field of time series analysis and forecasting. It focuses on interdisciplinary and multidisciplinary research encompassing the disciplines of computer science, mathematics, statistics and econometrics.
Watanabe, Hayafumi; Sano, Yukie; Takayasu, Hideki; Takayasu, Misako
2016-11-01
To elucidate the nontrivial empirical statistical properties of fluctuations of a typical nonsteady time series representing the appearance of words in blogs, we investigated approximately 3×10^{9} Japanese blog articles over a period of six years and analyzed some corresponding mathematical models. First, we introduce a solvable nonsteady extension of the random diffusion model, which can be deduced by modeling the behavior of heterogeneous random bloggers. Next, we deduce theoretical expressions for both the temporal and ensemble fluctuation scalings of this model, and demonstrate that these expressions can reproduce all empirical scalings over eight orders of magnitude. Furthermore, we show that the model can reproduce other statistical properties of time series representing the appearance of words in blogs, such as functional forms of the probability density and correlations in the total number of blogs. As an application, we quantify the abnormality of special nationwide events by measuring the fluctuation scalings of 1771 basic adjectives.
Quantifying memory in complex physiological time-series.
Shirazi, Amir H; Raoufy, Mohammad R; Ebadi, Haleh; De Rui, Michele; Schiff, Sami; Mazloom, Roham; Hajizadeh, Sohrab; Gharibzadeh, Shahriar; Dehpour, Ahmad R; Amodio, Piero; Jafari, G Reza; Montagnese, Sara; Mani, Ali R
2013-01-01
In a time-series, memory is a statistical feature that lasts for a period of time and distinguishes the time-series from a random, or memory-less, process. In the present study, the concept of "memory length" was used to define the time period, or scale over which rare events within a physiological time-series do not appear randomly. The method is based on inverse statistical analysis and provides empiric evidence that rare fluctuations in cardio-respiratory time-series are 'forgotten' quickly in healthy subjects while the memory for such events is significantly prolonged in pathological conditions such as asthma (respiratory time-series) and liver cirrhosis (heart-beat time-series). The memory length was significantly higher in patients with uncontrolled asthma compared to healthy volunteers. Likewise, it was significantly higher in patients with decompensated cirrhosis compared to those with compensated cirrhosis and healthy volunteers. We also observed that the cardio-respiratory system has simple low order dynamics and short memory around its average, and high order dynamics around rare fluctuations.
Introduction to time series analysis and forecasting
Montgomery, Douglas C; Kulahci, Murat
2015-01-01
Praise for the First Edition "…[t]he book is great for readers who need to apply the methods and models presented but have little background in mathematics and statistics." -MAA Reviews Thoroughly updated throughout, Introduction to Time Series Analysis and Forecasting, Second Edition presents the underlying theories of time series analysis that are needed to analyze time-oriented data and construct real-world short- to medium-term statistical forecasts. Authored by highly-experienced academics and professionals in engineering statistics, the Second Edition features discussions on both
Models for dependent time series
Tunnicliffe Wilson, Granville; Haywood, John
2015-01-01
Models for Dependent Time Series addresses the issues that arise and the methodology that can be applied when the dependence between time series is described and modeled. Whether you work in the economic, physical, or life sciences, the book shows you how to draw meaningful, applicable, and statistically valid conclusions from multivariate (or vector) time series data. The first four chapters discuss the two main pillars of the subject that have been developed over the last 60 years: vector autoregressive modeling and multivariate spectral analysis. These chapters provide the foundational mater
Markovic, Gabriela; Schult, Marie-Louise; Bartfai, Aniko; Elg, Mattias
2017-01-31
Progress in early cognitive recovery after acquired brain injury is uneven and unpredictable, and thus the evaluation of rehabilitation is complex. The use of time-series measurements is susceptible to statistical change due to process variation. To evaluate the feasibility of using a time-series method, statistical process control, in early cognitive rehabilitation. Participants were 27 patients with acquired brain injury undergoing interdisciplinary rehabilitation of attention within 4 months post-injury. The outcome measure, the Paced Auditory Serial Addition Test, was analysed using statistical process control. Statistical process control identifies if and when change occurs in the process according to 3 patterns: rapid, steady or stationary performers. The statistical process control method was adjusted, in terms of constructing the baseline and the total number of measurement points, in order to measure a process in change. Statistical process control methodology is feasible for use in early cognitive rehabilitation, since it provides information about change in a process, thus enabling adjustment of the individual treatment response. Together with the results indicating discernible subgroups that respond differently to rehabilitation, statistical process control could be a valid tool in clinical decision-making. This study is a starting-point in understanding the rehabilitation process using a real-time-measurements approach.
Woodward, Wayne A; Elliott, Alan C
2011-01-01
"There is scarcely a standard technique that the reader will find left out … this book is highly recommended for those requiring a ready introduction to applicable methods in time series and serves as a useful resource for pedagogical purposes." -International Statistical Review (2014), 82 "Current time series theory for practice is well summarized in this book." -Emmanuel Parzen, Texas A&M University "What an extraordinary range of topics covered, all very insightfully. I like [the authors'] innovations very much, such as the AR factor table." -David Findley, U.S. Census Bureau (retired) "…
Forecasting Cryptocurrencies Financial Time Series
Catania, Leopoldo; Grassi, Stefano; Ravazzolo, Francesco
2018-01-01
This paper studies the predictability of cryptocurrencies time series. We compare several alternative univariate and multivariate models in point and density forecasting of four of the most capitalized series: Bitcoin, Litecoin, Ripple and Ethereum. We apply a set of crypto–predictors and rely on Dynamic Model Averaging to combine a large set of univariate Dynamic Linear Models and several multivariate Vector Autoregressive models with different forms of time variation. We find statistical si...
Time-series-analysis techniques applied to nuclear-material accounting
International Nuclear Information System (INIS)
Pike, D.H.; Morrison, G.W.; Downing, D.J.
1982-05-01
This document is designed to introduce the reader to the applications of Time Series Analysis techniques to Nuclear Material Accountability data. Time series analysis techniques are designed to extract information from a collection of random variables ordered by time by seeking to identify any trends, patterns, or other structure in the series. Since nuclear material accountability data is a time series, one can extract more information using time series analysis techniques than by using other statistical techniques. Specifically, the objective of this document is to examine the applicability of time series analysis techniques to enhance loss detection of special nuclear materials. An introductory section examines the current industry approach which utilizes inventory differences. The error structure of inventory differences is presented. Time series analysis techniques discussed include the Shewhart Control Chart, the Cumulative Summation of Inventory Differences Statistics (CUSUM) and the Kalman Filter and Linear Smoother
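As a rough illustration of two of the techniques named above, the sketch below applies a Shewhart-style 3-sigma rule and a one-sided tabular CUSUM to a short made-up sequence of inventory differences. The data, the slack value and the decision threshold are all hypothetical; the report's own parameter choices are not given in this abstract.

```python
from statistics import mean, stdev


def shewhart_alarms(values, baseline, k_sigma=3.0):
    """Indices whose value lies more than k_sigma baseline standard deviations from the baseline mean."""
    center = mean(baseline)
    sigma = stdev(baseline)
    return [i for i, v in enumerate(values) if abs(v - center) > k_sigma * sigma]


def cusum_alarms(values, target, slack, threshold):
    """One-sided (upper) tabular CUSUM: S_t = max(0, S_{t-1} + x_t - target - slack).

    An alarm is raised when S_t exceeds the threshold, after which S is reset to 0.
    """
    s = 0.0
    alarms = []
    for i, v in enumerate(values):
        s = max(0.0, s + (v - target - slack))
        if s > threshold:
            alarms.append(i)
            s = 0.0
    return alarms


# Hypothetical inventory differences: eight in-control periods, then a small persistent loss.
baseline = [0.1, -0.2, 0.0, 0.2, -0.1, 0.1, -0.1, 0.0]
observed = baseline + [0.3] * 6

print("Shewhart alarms:", shewhart_alarms(observed, baseline))
print("CUSUM alarms:   ", cusum_alarms(observed, target=0.0, slack=0.1, threshold=0.5))
```

In this constructed example the shift of 0.3 stays inside the 3-sigma band, so the Shewhart rule raises no alarm, while the CUSUM accumulates the small excess and flags it after a few periods. That is the usual motivation for cumulative statistics when losses are small but sustained.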
Introduction to time series analysis and forecasting
Montgomery, Douglas C; Kulahci, Murat
2008-01-01
An accessible introduction to the most current thinking in and practicality of forecasting techniques in the context of time-oriented data. Analyzing time-oriented data and forecasting are among the most important problems that analysts face across many fields, ranging from finance and economics to production operations and the natural sciences. As a result, there is a widespread need for large groups of people in a variety of fields to understand the basic concepts of time series analysis and forecasting. Introduction to Time Series Analysis and Forecasting presents the time series analysis branch of applied statistics as the underlying methodology for developing practical forecasts, and it also bridges the gap between theory and practice by equipping readers with the tools needed to analyze time-oriented data and construct useful, short- to medium-term, statistically based forecasts.
Frontiers in Time Series and Financial Econometrics
Ling, S.; McAleer, M.J.; Tong, H.
2015-01-01
__Abstract__ Two of the fastest growing frontiers in econometrics and quantitative finance are time series and financial econometrics. Significant theoretical contributions to financial econometrics have been made by experts in statistics, econometrics, mathematics, and time series analysis. The purpose of this special issue of the journal on “Frontiers in Time Series and Financial Econometrics” is to highlight several areas of research by leading academics in which novel methods have contrib...
Time series modeling, computation, and inference
Prado, Raquel
2010-01-01
The authors systematically develop a state-of-the-art analysis and modeling of time series. … this book is well organized and well written. The authors present various statistical models for engineers to solve problems in time series analysis. Readers no doubt will learn state-of-the-art techniques from this book. -Hsun-Hsien Chang, Computing Reviews, March 2012 My favorite chapters were on dynamic linear models and vector AR and vector ARMA models. -William Seaver, Technometrics, August 2011 … a very modern entry to the field of time-series modelling, with a rich reference list of the current lit
Time averaging, ageing and delay analysis of financial time series
Cherstvy, Andrey G.; Vinod, Deepak; Aghion, Erez; Chechkin, Aleksei V.; Metzler, Ralf
2017-06-01
We introduce three strategies for the analysis of financial time series based on time averaged observables. These comprise the time averaged mean squared displacement (MSD) as well as the ageing and delay time methods for varying fractions of the financial time series. We explore these concepts via statistical analysis of historic time series for several Dow Jones Industrial indices for the period from the 1960s to 2015. Remarkably, we discover a simple universal law for the delay time averaged MSD. The observed features of the financial time series dynamics agree well with our analytical results for the time averaged measurables for geometric Brownian motion, underlying the famed Black-Scholes-Merton model. The concepts we promote here are shown to be useful for financial data analysis and enable one to unveil new universal features of stock market dynamics.
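The time averaged MSD mentioned above can be sketched for a discretely sampled series as the average of squared increments over a sliding lag. The discrete estimator below, the use of a simulated log-price random walk, and all parameter values are assumptions made for illustration; the paper's exact definitions for the ageing and delay time variants are not reproduced here.

```python
import random


def time_averaged_msd(x, lag):
    """Time averaged MSD at a given lag: mean of (x[t + lag] - x[t])**2 over all admissible t."""
    n = len(x)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(x)")
    return sum((x[t + lag] - x[t]) ** 2 for t in range(n - lag)) / (n - lag)


# Hypothetical log-price series: a random walk with small Gaussian increments.
rng = random.Random(42)
log_price = [0.0]
for _ in range(5000):
    log_price.append(log_price[-1] + rng.gauss(0.0, 0.01))

for lag in (1, 10, 100):
    print(f"lag {lag:4d}: TAMSD = {time_averaged_msd(log_price, lag):.6f}")
```

For a series with independent increments the printed values should grow roughly in proportion to the lag, which gives a quick sanity check on the estimator.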
Compounding approach for univariate time series with nonstationary variances
Schäfer, Rudi; Barkhofen, Sonja; Guhr, Thomas; Stöckmann, Hans-Jürgen; Kuhl, Ulrich
2015-12-01
A defining feature of nonstationary systems is the time dependence of their statistical parameters. Measured time series may exhibit Gaussian statistics on short time horizons, due to the central limit theorem. The sample statistics for long time horizons, however, averages over the time-dependent variances. To model the long-term statistical behavior, we compound the local distribution with the distribution of its parameters. Here, we consider two concrete, but diverse, examples of such nonstationary systems: the turbulent air flow of a fan and a time series of foreign exchange rates. Our main focus is to empirically determine the appropriate parameter distribution for the compounding approach. To this end, we extract the relevant time scales by decomposing the time signals into windows and determine the distribution function of the thus obtained local variances.
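The windowing step described above can be sketched as follows: split the signal into windows and compute the variance within each. Non-overlapping windows, the population variance, and the synthetic two-regime signal are assumptions for illustration; the abstract does not specify how the authors chose the window length.

```python
import random
from statistics import pvariance


def local_variances(x, window):
    """Population variance of each consecutive, non-overlapping window of the given length."""
    return [
        pvariance(x[start:start + window])
        for start in range(0, len(x) - window + 1, window)
    ]


# Synthetic nonstationary signal: Gaussian noise whose standard deviation switches between regimes.
rng = random.Random(7)
signal = []
for block in range(20):
    sigma = 0.5 if block % 2 == 0 else 2.0  # hypothetical alternating regimes
    signal.extend(rng.gauss(0.0, sigma) for _ in range(200))

variances = local_variances(signal, window=200)
print("first four local variances:", [round(v, 2) for v in variances[:4]])
```

The empirical distribution of these local variances is what the compounding approach would then combine with the locally Gaussian statistics to model the long-horizon distribution.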
Onisko, Agnieszka; Druzdzel, Marek J; Austin, R Marshall
2016-01-01
Classical statistics is a well-established approach in the analysis of medical data. While the medical community seems to be familiar with the concept of a statistical analysis and its interpretation, the Bayesian approach, argued by many of its proponents to be superior to the classical frequentist approach, is still not well-recognized in the analysis of medical data. The goal of this study is to encourage data analysts to use the Bayesian approach, such as modeling with graphical probabilistic networks, as an insightful alternative to classical statistical analysis of medical data. This paper offers a comparison of two approaches to analysis of medical time series data: (1) classical statistical approach, such as the Kaplan-Meier estimator and the Cox proportional hazards regression model, and (2) dynamic Bayesian network modeling. Our comparison is based on time series cervical cancer screening data collected at Magee-Womens Hospital, University of Pittsburgh Medical Center over 10 years. The main outcomes of our comparison are cervical cancer risk assessments produced by the three approaches. However, our analysis discusses also several aspects of the comparison, such as modeling assumptions, model building, dealing with incomplete data, individualized risk assessment, results interpretation, and model validation. Our study shows that the Bayesian approach is (1) much more flexible in terms of modeling effort, and (2) it offers an individualized risk assessment, which is more cumbersome for classical statistical approaches.
Xia, Li C; Ai, Dongmei; Cram, Jacob A; Liang, Xiaoyi; Fuhrman, Jed A; Sun, Fengzhu
2015-09-21
Local trend (i.e. shape) analysis of time series data reveals co-changing patterns in dynamics of biological systems. However, slow permutation procedures to evaluate the statistical significance of local trend scores have limited its applications to high-throughput time series data analysis, e.g., data from the next generation sequencing technology based studies. By extending the theories for the tail probability of the range of sum of Markovian random variables, we propose formulae for approximating the statistical significance of local trend scores. Using simulations and real data, we show that the approximate p-value is close to that obtained using a large number of permutations (starting at time points >20 with no delay and >30 with delay of at most three time steps) in that the non-zero decimals of the p-values obtained by the approximation and the permutations are mostly the same when the approximate p-value is less than 0.05. In addition, the approximate p-value is slightly larger than that based on permutations making hypothesis testing based on the approximate p-value conservative. The approximation enables efficient calculation of p-values for pairwise local trend analysis, making large scale all-versus-all comparisons possible. We also propose a hybrid approach by integrating the approximation and permutations to obtain accurate p-values for significantly associated pairs. We further demonstrate its use with the analysis of the Plymouth Marine Laboratory (PML) microbial community time series from high-throughput sequencing data and found interesting organism co-occurrence dynamic patterns. The software tool is integrated into the eLSA software package that now provides accelerated local trend and similarity analysis pipelines for time series data. The package is freely available from the eLSA website: http://bitbucket.org/charade/elsa.
Directory of Open Access Journals (Sweden)
Alessandro Chiaudani
2017-11-01
In this research, univariate and bivariate statistical methods were applied to rainfall, river and piezometric level datasets belonging to a 24-year time series (1986–2009). These methods, which are often used to understand the effects of precipitation on river and karstic spring discharge, have been used to assess the piezometric level response to rainfall and river level fluctuations in a porous aquifer. A rain gauge, a river level gauge and three wells, located in Central Italy along the lower Pescara River valley in correspondence with its important alluvial aquifer, provided the data. Statistical analysis has been used within a known hydrogeological framework, which has been refined by means of photo-interpretation and a GPS survey. Water–groundwater relationships were identified following the autocorrelation and cross-correlation analyses. Spectral analysis and mono-fractal features of time series were assessed to provide information on multi-year variability, data distributions, their fractal dimension and the distribution return time within the historical time series. The statistical–mathematical results were interpreted through fieldwork that identified distinct groundwater flowpaths within the aquifer and enabled the implementation of a conceptual model, improving the knowledge on water resources management tools.
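A minimal sketch of the cross-correlation step is given below: it computes the sample correlation between an input series and a lagged output series and picks the lag at which it peaks. The synthetic rainfall and the three-step delayed "level" series are invented for illustration and have no connection to the Pescara data.

```python
import random
from math import sqrt
from statistics import mean


def cross_correlation(x, y, lag):
    """Sample correlation between x[t] and y[t + lag], for lag >= 0 and equal-length series."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sx = sqrt(sum((v - mx) ** 2 for v in x))
    sy = sqrt(sum((v - my) ** 2 for v in y))
    cov = sum((x[t] - mx) * (y[t + lag] - my) for t in range(n - lag))
    return cov / (sx * sy)


def best_lag(x, y, max_lag):
    """Lag in 0..max_lag at which the cross-correlation is largest."""
    return max(range(max_lag + 1), key=lambda k: cross_correlation(x, y, k))


# Hypothetical data: the "level" series repeats the "rain" series three time steps later.
rng = random.Random(3)
rain = [rng.expovariate(1.0) for _ in range(200)]
level = [0.0, 0.0, 0.0] + rain[:-3]

print("lag of peak cross-correlation:", best_lag(rain, level, max_lag=10))
print("autocorrelation of rain at lag 1:", round(cross_correlation(rain, rain, 1), 3))
```

Passing the same series twice gives the autocorrelation, so one function covers both analyses named in the abstract. In practice the peak lag is read as a response time of the output to the input, subject to the usual caveats about shared seasonality.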
Reconstruction of ensembles of coupled time-delay systems from time series.
Sysoev, I V; Prokhorov, M D; Ponomarenko, V I; Bezruchko, B P
2014-06-01
We propose a method to recover from time series the parameters of coupled time-delay systems and the architecture of couplings between them. The method is based on a reconstruction of model delay-differential equations and estimation of statistical significance of couplings. It can be applied to networks composed of nonidentical nodes with an arbitrary number of unidirectional and bidirectional couplings. We test our method on chaotic and periodic time series produced by model equations of ensembles of diffusively coupled time-delay systems in the presence of noise, and apply it to experimental time series obtained from electronic oscillators with delayed feedback coupled by resistors.
International Nuclear Information System (INIS)
Sanchez Merino, G.; Cortes Rpdicio, J.; Lope Lope, R.; Martin Gonzalez, T.; Garcia Fidalgo, M. A.
2013-01-01
The aim of the present work is to study the dependence of temporal resolution on activity, using statistical techniques applied to the time series of temporal resolution values measured during daily equipment checks. (Author)
Studies on time series applications in environmental sciences
Bărbulescu, Alina
2016-01-01
Time series analysis and modelling represent a large field of study, approached from the perspectives of both time and frequency, with applications in different domains. Modelling hydro-meteorological time series is difficult due to the characteristics of these series, such as long-range dependence, spatial dependence, and correlation with other series. Continuous spatial data play an important role in planning, risk assessment and decision making in environmental management. In this context, this book presents various statistical tests and modelling techniques used for time series analysis, as well as applications to hydro-meteorological series from Dobrogea, a region situated in the south-eastern part of Romania that has been little studied until now. Part of the results are accompanied by their R code.
Modeling financial time series with S-plus
Zivot, Eric
2003-01-01
The field of financial econometrics has exploded over the last decade. This book represents an integration of theory, methods, and examples using the S-PLUS statistical modeling language and the S+FinMetrics module to facilitate the practice of financial econometrics. This is the first book to show the power of S-PLUS for the analysis of time series data. It is written for researchers and practitioners in the finance industry, academic researchers in economics and finance, and advanced MBA and graduate students in economics and finance. Readers are assumed to have a basic knowledge of S-PLUS and a solid grounding in basic statistics and time series concepts. Eric Zivot is an associate professor and Gary Waterman Distinguished Scholar in the Economics Department at the University of Washington, and is co-director of the nascent Professional Master's Program in Computational Finance. He regularly teaches courses on econometric theory, financial econometrics and time series econometrics, and is the recipient of the He...
Correlation and multifractality in climatological time series
International Nuclear Information System (INIS)
Pedron, I T
2010-01-01
Climate can be described by statistical analysis of mean values of atmospheric variables over a period. It is possible to detect correlations in climatological time series and to classify their behavior. In this work the Hurst exponent, which can characterize correlation and persistence in time series, is obtained by using the Detrended Fluctuation Analysis (DFA) method. Data series of temperature, precipitation, humidity, solar radiation, wind speed, maximum squall, atmospheric pressure and random series are studied. Furthermore, the multifractality of such series is analyzed applying the Multifractal Detrended Fluctuation Analysis (MF-DFA) method. The results indicate the presence of correlation (persistent character) in all climatological series, and multifractality as well. A larger and longer data set could provide better results indicating the universality of the exponents.
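The DFA procedure can be sketched in a few steps: integrate the mean-removed series into a profile, detrend the profile linearly within windows of length n, and read the scaling exponent from the slope of log F(n) against log n. The version below uses first-order (linear) detrending and non-overlapping windows; the window sizes, the series length and the synthetic test series are assumptions, not settings from the paper.

```python
import math
import random


def detrended_sum_of_squares(segment):
    """Sum of squared residuals after removing a least-squares line from the segment."""
    n = len(segment)
    t_mean = (n - 1) / 2
    y_mean = sum(segment) / n
    stt = sum((t - t_mean) ** 2 for t in range(n))
    sty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sty / stt
    return sum((y - y_mean - slope * (t - t_mean)) ** 2 for t, y in enumerate(segment))


def dfa_exponent(x, scales):
    """Slope of log F(n) versus log n, with F(n) the root-mean-square detrended fluctuation."""
    x_mean = sum(x) / len(x)
    profile, total = [], 0.0
    for value in x:  # cumulative sum of the mean-removed series
        total += value - x_mean
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        n_windows = len(profile) // n
        ss = sum(detrended_sum_of_squares(profile[i * n:(i + 1) * n]) for i in range(n_windows))
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(ss / (n_windows * n)))
    mn, mf = sum(log_n) / len(log_n), sum(log_f) / len(log_f)
    return sum((a - mn) * (b - mf) for a, b in zip(log_n, log_f)) / sum((a - mn) ** 2 for a in log_n)


rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # uncorrelated reference series
walk, position = [], 0.0
for step in noise:  # strongly persistent series: cumulative sum of the noise
    position += step
    walk.append(position)

scales = [16, 32, 64, 128, 256]
alpha_noise = dfa_exponent(noise, scales)
alpha_walk = dfa_exponent(walk, scales)
print("DFA exponent, white noise:", round(alpha_noise, 2))
print("DFA exponent, random walk:", round(alpha_walk, 2))
```

An exponent near 0.5 is the usual signature of an uncorrelated series, and larger values indicate persistence, which is the sense in which the abstract reports a persistent character. A constant input would give zero fluctuation and break the logarithm, so real data should be checked for that case first.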
Time Series Analysis of Wheat Futures Reward in China
Institute of Scientific and Technical Information of China (English)
(no author listed)
2005-01-01
Whereas most research focuses on a single futures contract and lacks comparison across different periods, this paper describes the statistical characteristics of the wheat futures reward time series of the Zhengzhou Commodity Exchange over the most recent three years. Besides basic statistical analysis, the paper uses GARCH and EGARCH models to describe the time series that exhibit the ARCH effect, and analyzes the persistence of volatility shocks and the leverage effect. The results show that, compared with a normal distribution, the wheat futures reward series are non-normal, leptokurtic and thick-tailed. The study also finds that two parts of the reward series have no autocorrelation. Among the six correlative series, three present the ARCH effect. Using the Auto-regressive Distributed Lag Model, the GARCH model and the EGARCH model, the paper demonstrates the persistence of volatility shocks and the leverage effect in the wheat futures reward time series. The results reveal that, on the one hand, the statistical characteristics of the wheat futures reward are, as a whole, similar to those of mature futures markets abroad. On the other hand, the results reflect some shortcomings of the Chinese futures market, such as its immaturity and over-control by the government.
Statistical Analysis of fMRI Time-Series: A Critical Review of the GLM Approach
Directory of Open Access Journals (Sweden)
Martin M Monti
2011-03-01
Functional Magnetic Resonance Imaging (fMRI) is one of the most widely used tools to study the neural underpinnings of human cognition. Standard analysis of fMRI data relies on a General Linear Model (GLM) approach to separate stimulus induced signals from noise. Crucially, this approach relies on a number of assumptions about the data which, for inferences to be valid, must be met. The current paper reviews the GLM approach to analysis of fMRI time-series, focusing in particular on the degree to which such data abide by the assumptions of the GLM framework, and on the methods that have been developed to correct for any violation of those assumptions. Rather than biasing estimates of effect size, the major consequence of non-conformity to the assumptions is to introduce bias into estimates of the variance, thus affecting test statistics, power and false positive rates. Furthermore, this bias can have pervasive effects on both individual subject and group-level statistics, potentially yielding qualitatively different results across replications, especially after the thresholding procedures commonly used for inference-making.
A Review of Some Aspects of Robust Inference for Time Series.
1984-09-01
A REVIEW OF SOME ASPECTS OF ROBUST INFERENCE FOR TIME SERIES, by R. Douglas Martin. Technical Report No. 63, September 1984, Department of Statistics, University of ... clear. One cannot hope to have a good method for dealing with outliers in time series by using only an instantaneous nonlinear transformation of the data ... AI.49 716 A REVIEW OF SOME ASPECTS OF ROBUST INFERENCE FOR TIME SERIES (U), WASHINGTON UNIV SEATTLE DEPT OF STATISTICS, R D MARTIN, SEP 84, TR-53
Time Series, Stochastic Processes and Completeness of Quantum Theory
International Nuclear Information System (INIS)
Kupczynski, Marian
2011-01-01
Most physical experiments are described as repeated measurements of some random variables. Experimental data registered by on-line computers form time series of outcomes. The frequencies of different outcomes are compared with the probabilities provided by the algorithms of quantum theory (QT). In spite of the statistical predictions of QT, a claim was made that it provided the most complete description of the data and of the underlying physical phenomena. This claim could be easily rejected if some fine structures, averaged out in the standard descriptive statistical analysis, were found in time series of experimental data. To search for these structures one has to use more subtle statistical tools which were developed to study time series produced by various stochastic processes. In this talk we review some of these tools. As an example we show how the standard descriptive statistical analysis of the data is unable to reveal a fine structure in a simulated sample of an AR(2) stochastic process. We emphasize once again that the violation of Bell inequalities gives no information on the completeness or the nonlocality of QT. The appropriate way to test the completeness of quantum theory is to search for fine structures in time series of the experimental data by means of purity tests or by studying the autocorrelation and partial autocorrelation functions.
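The autocorrelation and partial autocorrelation functions mentioned above can be estimated from a sample as in the following minimal sketch. The function names are illustrative and not from the talk; the partial autocorrelations are obtained from the autocorrelations with the Durbin-Levinson recursion.

```python
def autocorrelation(x, max_lag):
    """Sample autocorrelations r[0], ..., r[max_lag] of the series x."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    if c0 == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


def partial_autocorrelation(r):
    """Partial autocorrelations at lags 1..K from autocorrelations r[0..K],
    computed with the Durbin-Levinson recursion."""
    pacf = []
    prev = []  # prev[j - 1] holds the coefficient phi_{k-1, j}
    for k in range(1, len(r)):
        num = r[k] - sum(prev[j - 1] * r[k - j] for j in range(1, k))
        den = 1.0 - sum(prev[j - 1] * r[j] for j in range(1, k))
        phi_kk = num / den
        cur = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)]
        cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf
```

For an AR(p) process the partial autocorrelation is expected to cut off after lag p, which is the kind of structure that simple summary statistics such as the mean and variance cannot show.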
Introduction to time series and forecasting
Brockwell, Peter J
2016-01-01
This book is aimed at the reader who wishes to gain a working knowledge of time series and forecasting methods as applied to economics, engineering and the natural and social sciences. It assumes knowledge only of basic calculus, matrix algebra and elementary statistics. This third edition contains detailed instructions for the use of the professional version of the Windows-based computer package ITSM2000, now available as a free download from the Springer Extras website. The logic and tools of time series model-building are developed in detail. Numerous exercises are included and the software can be used to analyze and forecast data sets of the user's own choosing. The book can also be used in conjunction with other time series packages such as those included in R. The programs in ITSM2000 however are menu-driven and can be used with minimal investment of time in the computational details. The core of the book covers stationary processes, ARMA and ARIMA processes, multivariate time series and state-space mod...
Faes, Luca; Zhao, He; Chon, Ki H; Nollo, Giandomenico
2009-03-01
We propose a method to extend to time-varying (TV) systems the procedure for generating typical surrogate time series, in order to test the presence of nonlinear dynamics in potentially nonstationary signals. The method is based on fitting a TV autoregressive (AR) model to the original series and then regressing the model coefficients with random replacements of the model residuals to generate TV AR surrogate series. The proposed surrogate series were used in combination with a TV sample entropy (SE) discriminating statistic to assess nonlinearity in both simulated and experimental time series, in comparison with traditional time-invariant (TIV) surrogates combined with the TIV SE discriminating statistic. Analysis of simulated time series showed that using TIV surrogates, linear nonstationary time series may be erroneously regarded as nonlinear and weak TV nonlinearities may remain unrevealed, while the use of TV AR surrogates markedly increases the probability of a correct interpretation. Application to short (500 beats) heart rate variability (HRV) time series recorded at rest (R), after head-up tilt (T), and during paced breathing (PB) showed: 1) modifications of the SE statistic that were well interpretable with the known cardiovascular physiology; 2) significant contribution of nonlinear dynamics to HRV in all conditions, with significant increase during PB at 0.2 Hz respiration rate; and 3) a disagreement between TV AR surrogates and TIV surrogates in about a quarter of the series, suggesting that nonstationarity may affect HRV recordings and bias the outcome of the traditional surrogate-based nonlinearity test.
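For orientation, a time-invariant sample entropy statistic of the kind used as the discriminating statistic above can be sketched as follows. This is a plain textbook-style version, not the authors' time-varying implementation; the tolerance r is taken here as an absolute value (in practice it is often set to a fraction of the series' standard deviation), and the function name is an assumption.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy SampEn(m, r) = -ln(A / B).

    B counts pairs of length-m templates whose Chebyshev distance is <= r,
    A counts the same for length m + 1; self-matches are excluded.
    """
    n = len(x)

    def count_matches(length):
        count = 0
        # The same n - m template start points are used for both lengths.
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # no matches: entropy is undefined / unbounded
    return -math.log(a / b)
```

Perfectly regular series give a value of zero, while irregular series give larger values. A surrogate-based nonlinearity test compares the statistic of the original series with its distribution over many surrogate series.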
On robust forecasting of autoregressive time series under censoring
Kharin, Y.; Badziahin, I.
2009-01-01
Problems of robust statistical forecasting are considered for autoregressive time series observed under distortions generated by interval censoring. Three types of robust forecasting statistics are developed; mean-square risk is evaluated for the developed forecasting statistics. Numerical results are given.
Capturing Structure Implicitly from Time-Series having Limited Data
Emaasit, Daniel; Johnson, Matthew
2018-01-01
Scientific fields such as insider-threat detection and highway-safety planning often lack sufficient amounts of time-series data to estimate statistical models for the purpose of scientific discovery. Moreover, the limited data that are available are quite noisy. This presents a major challenge when estimating time-series models that are robust to overfitting and have well-calibrated uncertainty estimates. Most of the current literature in these fields involves visualizing the time-series for noticeabl...
Ma, Chuang; Chen, Han-Shuang; Lai, Ying-Cheng; Zhang, Hai-Feng
2018-02-01
Complex networks hosting binary-state dynamics arise in a variety of contexts. In spite of previous works, to fully reconstruct the network structure from observed binary data remains challenging. We articulate a statistical inference based approach to this problem. In particular, exploiting the expectation-maximization (EM) algorithm, we develop a method to ascertain the neighbors of any node in the network based solely on binary data, thereby recovering the full topology of the network. A key ingredient of our method is the maximum-likelihood estimation of the probabilities associated with actual or nonexistent links, and we show that the EM algorithm can distinguish the two kinds of probability values without any ambiguity, insofar as the length of the available binary time series is reasonably long. Our method does not require any a priori knowledge of the detailed dynamical processes, is parameter-free, and is capable of accurate reconstruction even in the presence of noise. We demonstrate the method using combinations of distinct types of binary dynamical processes and network topologies, and provide a physical understanding of the underlying reconstruction mechanism. Our statistical inference based reconstruction method contributes an additional piece to the rapidly expanding "toolbox" of data based reverse engineering of complex networked systems.
Homogenising time series: beliefs, dogmas and facts
Domonkos, P.
2011-06-01
In recent decades various homogenisation methods have been developed, but the real effects of their application on time series are still not sufficiently known. The ongoing COST action HOME (COST ES0601) is devoted to revealing the real impacts of homogenisation methods in more detail and with higher confidence than before. As a part of the COST activity, a benchmark dataset was built whose characteristics closely approximate those of real networks of observed time series. This dataset offers a much better opportunity than ever before to test the wide variety of homogenisation methods and to analyse the real effects of selected theoretical recommendations. Empirical results show that real observed time series usually include several inhomogeneities of different sizes. Small inhomogeneities often have statistical characteristics similar to those of natural changes caused by climatic variability; thus the pure application of the classic theory, that change-points of observed time series can be found and corrected one by one, is impossible. However, after homogenisation the linear trends, seasonal changes and long-term fluctuations of time series are usually much closer to reality than in raw time series. Some problems around detecting multiple structures of inhomogeneities, as well as problems of time series comparisons within homogenisation procedures, are discussed briefly in the study.
Characterization of time series via Rényi complexity-entropy curves
Jauregui, M.; Zunino, L.; Lenzi, E. K.; Mendes, R. S.; Ribeiro, H. V.
2018-05-01
One of the most useful tools for distinguishing between chaotic and stochastic time series is the so-called complexity-entropy causality plane. This diagram involves two complexity measures: the Shannon entropy and the statistical complexity. Recently, this idea has been generalized by considering the Tsallis monoparametric generalization of the Shannon entropy, yielding complexity-entropy curves. These curves have proven to enhance the discrimination among different time series related to stochastic and chaotic processes of numerical and experimental nature. Here we further explore these complexity-entropy curves in the context of the Rényi entropy, which is another monoparametric generalization of the Shannon entropy. By combining the Rényi entropy with the proper generalization of the statistical complexity, we associate a parametric curve (the Rényi complexity-entropy curve) with a given time series. We explore this approach in a series of numerical and experimental applications, demonstrating the usefulness of this new technique for time series analysis. We show that the Rényi complexity-entropy curves enable the differentiation among time series of chaotic, stochastic, and periodic nature. In particular, time series of stochastic nature are associated with curves displaying positive curvature in a neighborhood of their initial points, whereas curves related to chaotic phenomena have a negative curvature; finally, periodic time series are represented by vertical straight lines.
Long-memory time series theory and methods
Palma, Wilfredo
2007-01-01
Wilfredo Palma, PhD, is Chairman and Professor of Statistics in the Department of Statistics at Pontificia Universidad Católica de Chile. Dr. Palma has published several refereed articles and has received over a dozen academic honors and awards. His research interests include time series analysis, prediction theory, state space systems, linear models, and econometrics.
On clustering fMRI time series
DEFF Research Database (Denmark)
Goutte, Cyril; Toft, Peter Aundal; Rostrup, E.
1999-01-01
Analysis of fMRI time series is often performed by extracting one or more parameters for the individual voxels. Methods based, e.g., on various statistical tests are then used to yield parameters corresponding to probability of activation or activation strength. However, these methods do...
Non-linear time series extreme events and integer value problems
Turkman, Kamil Feridun; Zea Bermudez, Patrícia
2014-01-01
This book offers a useful combination of probabilistic and statistical tools for analyzing nonlinear time series. Key features of the book include a study of the extremal behavior of nonlinear time series and a comprehensive list of nonlinear models that address different aspects of nonlinearity. Several inferential methods, including quasi likelihood methods, sequential Markov Chain Monte Carlo Methods and particle filters, are also included so as to provide an overall view of the available tools for parameter estimation for nonlinear models. A chapter on integer time series models based on several thinning operations, which brings together all recent advances made in this area, is also included. Readers should have attended a prior course on linear time series, and a good grasp of simulation-based inferential methods is recommended. This book offers a valuable resource for second-year graduate students and researchers in statistics and other scientific areas who need a basic understanding of nonlinear time ...
Permutation entropy of finite-length white-noise time series.
Little, Douglas J; Kane, Deb M
2016-08-01
Permutation entropy (PE) is commonly used to discriminate complex structure from white noise in a time series. While the PE of white noise is well understood in the long time-series limit, analysis in the general case is currently lacking. Here the expectation value and variance of white-noise PE are derived as functions of the number of ordinal pattern trials, N, and the embedding dimension, D. It is demonstrated that the probability distribution of the white-noise PE converges to a χ^{2} distribution with D!-1 degrees of freedom as N becomes large. It is further demonstrated that the PE variance for an arbitrary time series can be estimated as the variance of a related metric, the Kullback-Leibler entropy (KLE), allowing the qualitative N≫D! condition to be recast as a quantitative estimate of the N required to achieve a desired PE calculation precision. Application of this theory to statistical inference is demonstrated in the case of an experimentally obtained noise series, where the probability of obtaining the observed PE value was calculated assuming a white-noise time series. Standard statistical inference can be used to draw conclusions whether the white-noise null hypothesis can be accepted or rejected. This methodology can be applied to other null hypotheses, such as discriminating whether two time series are generated from different complex system states.
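A minimal sketch of how permutation entropy is computed from ordinal patterns may help in reading this abstract. Here d plays the role of the embedding dimension D, and the number of ordinal patterns extracted from the series plays the role of N; the function name and the tie-breaking rule (ties resolved by order of appearance) are assumptions of this sketch.

```python
import math
from collections import Counter


def permutation_entropy(x, d=3, normalize=True):
    """Permutation entropy of x for embedding dimension d (unit delay).

    Each window of d consecutive values is mapped to the permutation that
    sorts it; the entropy of the pattern frequencies is returned, divided
    by ln(d!) when normalize is True so that the result lies in [0, 1].
    """
    n_patterns = len(x) - d + 1
    if d < 2 or n_patterns < 1:
        raise ValueError("need d >= 2 and len(x) >= d")
    counts = Counter(
        tuple(sorted(range(d), key=lambda k: x[i + k])) for i in range(n_patterns)
    )
    h = -sum((c / n_patterns) * math.log(c / n_patterns) for c in counts.values())
    return h / math.log(math.factorial(d)) if normalize else h
```

For white noise all d! patterns are equally likely, so the normalized value approaches 1 as the number of patterns grows; the abstract's results describe how the estimate fluctuates around that limit for finite N.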
Zhang, Qian; Harman, Ciaran J.; Kirchner, James W.
2018-02-01
River water-quality time series often exhibit fractal scaling, which here refers to autocorrelation that decays as a power law over some range of scales. Fractal scaling presents challenges to the identification of deterministic trends because (1) fractal scaling has the potential to lead to false inference about the statistical significance of trends and (2) the abundance of irregularly spaced data in water-quality monitoring networks complicates efforts to quantify fractal scaling. Traditional methods for estimating fractal scaling - in the form of spectral slope (β) or other equivalent scaling parameters (e.g., Hurst exponent) - are generally inapplicable to irregularly sampled data. Here we consider two types of estimation approaches for irregularly sampled data and evaluate their performance using synthetic time series. These time series were generated such that (1) they exhibit a wide range of prescribed fractal scaling behaviors, ranging from white noise (β = 0) to Brown noise (β = 2) and (2) their sampling gap intervals mimic the sampling irregularity (as quantified by both the skewness and mean of gap-interval lengths) in real water-quality data. The results suggest that none of the existing methods fully account for the effects of sampling irregularity on β estimation. First, the results illustrate the danger of using interpolation for gap filling when examining autocorrelation, as the interpolation methods consistently underestimate or overestimate β under a wide range of prescribed β values and gap distributions. Second, the widely used Lomb-Scargle spectral method also consistently underestimates β. A previously published modified form, using only the lowest 5 % of the frequencies for spectral slope estimation, has very poor precision, although the overall bias is small. Third, a recent wavelet-based method, coupled with an aliasing filter, generally has the smallest bias and root-mean-squared error among all methods for a wide range of
Kolokythas, Kostantinos; Vasileios, Salamalikis; Athanassios, Argiriou; Kazantzidis, Andreas
2015-04-01
Wind is the result of complex interactions of numerous mechanisms taking place at small or large scales, so better knowledge of its behavior is essential in a variety of applications, especially in the field of power production from wind turbines. In the literature there is a considerable number of models, either physical or statistical, dealing with the problem of simulating and predicting wind speed. Among others, Artificial Neural Networks (ANNs) are widely used for wind forecasting and, in the great majority of cases, outperform other conventional statistical models. In this study, a number of ANNs with different architectures, created and applied to a dataset of wind time series, are compared to Auto Regressive Integrated Moving Average (ARIMA) statistical models. The data consist of mean hourly wind speeds from a wind farm in a hilly Greek region and cover a period of one year (2013). The main goal is to evaluate the models' ability to successfully simulate the wind speed at a significant point (target). Goodness-of-fit statistics are computed to compare the different methods. In general, the ANN showed the best performance in the estimation of wind speed, prevailing over the ARIMA models.
Non-parametric characterization of long-term rainfall time series
Tiwari, Harinarayan; Pandey, Brij Kishor
2018-03-01
The statistical study of rainfall time series is one of the approaches for efficient hydrological system design. Identifying and characterizing long-term rainfall time series could aid in improving hydrological systems forecasting. In the present study, eventual statistics was applied for the long-term (1851-2006) rainfall time series under seven meteorological regions of India. Linear trend analysis was carried out using the Mann-Kendall test for the observed rainfall series. The trend observed with this approach has been confirmed using the innovative trend analysis method. Innovative trend analysis has been found to be a strong tool to detect the general trend of rainfall time series. The sequential Mann-Kendall test has also been carried out to examine nonlinear trends of the series. The partial sum of cumulative deviation test is also found to be suitable to detect the nonlinear trend. Innovative trend analysis, the sequential Mann-Kendall test and the partial cumulative deviation test have the potential to detect the general as well as the nonlinear trend of the rainfall time series. Annual rainfall analysis suggests that the maximum change in mean rainfall is 11.53% for West Peninsular India, whereas the maximum fall in mean rainfall is 7.8% for the North Mountainous Indian region. The innovative trend analysis method is also capable of finding the number of change points present in the time series. Additionally, we have performed the von Neumann ratio test and the cumulative deviation test to estimate the departure from homogeneity. Singular spectrum analysis has been applied in this study to evaluate the order of departure from homogeneity in the rainfall time series. The monsoon season (JS) of the North Mountainous India and West Peninsular India zones has a higher departure from homogeneity, and singular spectrum analysis gives results consistent with this finding.
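The basic Mann-Kendall trend test used in this study can be sketched as below. This is the standard textbook form without the variance correction for tied values, so it is only an approximation for rainfall records that contain many ties; the function name is an assumption.

```python
import math


def mann_kendall(x):
    """Mann-Kendall trend test (no tie correction).

    Returns (S, Z, p): the S statistic, the continuity-corrected normal
    score, and the two-sided p-value under the no-trend null hypothesis.
    """
    n = len(x)
    # S sums the signs of all pairwise differences x[j] - x[i] with j > i.
    s = sum(
        (x[j] > x[i]) - (x[j] < x[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return s, z, p
```

A positive S with a small p-value indicates a monotonic upward trend, a negative S a downward one. The sequential variant mentioned in the abstract applies the same idea progressively along the series.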
Evolutionary Algorithms for the Detection of Structural Breaks in Time Series
DEFF Research Database (Denmark)
Doerr, Benjamin; Fischer, Paul; Hilbert, Astrid
2013-01-01
Detecting structural breaks is an essential task for the statistical analysis of time series, for example, for fitting parametric models to it. In short, structural breaks are points in time at which the behavior of the time series changes. Typically, no solid background knowledge of the time...
How to statistically analyze nano exposure measurement results: Using an ARIMA time series approach
Klein Entink, R.H.; Fransman, W.; Brouwer, D.H.
2011-01-01
Measurement strategies for exposure to nano-sized particles differ from traditional integrated sampling methods for exposure assessment by the use of real-time instruments. The resulting measurement series is a time series, where typically the sequential measurements are not independent from each
Time series modeling in traffic safety research.
Lavrenz, Steven M; Vlahogianni, Eleni I; Gkritza, Konstantina; Ke, Yue
2018-08-01
The use of statistical models for analyzing traffic safety (crash) data has been well-established. However, time series techniques have traditionally been underrepresented in the corresponding literature, due to challenges in data collection, along with a limited knowledge of proper methodology. In recent years, new types of high-resolution traffic safety data, especially in measuring driver behavior, have made time series modeling techniques an increasingly salient topic of study. Yet there remains a dearth of information to guide analysts in their use. This paper provides an overview of the state of the art in using time series models in traffic safety research, and discusses some of the fundamental techniques and considerations in classic time series modeling. It also presents ongoing and future opportunities for expanding the use of time series models, and explores newer modeling techniques, including computational intelligence models, which hold promise in effectively handling ever-larger data sets. The information contained herein is meant to guide safety researchers in understanding this broad area of transportation data analysis, and provide a framework for understanding safety trends that can influence policy-making. Copyright © 2017 Elsevier Ltd. All rights reserved.
A Hybrid Joint Moment Ratio Test for Financial Time Series
P.A. Groenendijk (Patrick); A. Lucas (André); C.G. de Vries (Casper)
1998-01-01
We advocate the use of absolute moment ratio statistics in conjunction with standard variance ratio statistics in order to disentangle linear dependence, non-linear dependence, and leptokurtosis in financial time series. Both statistics are computed for multiple return horizons
Incremental fuzzy C medoids clustering of time series data using dynamic time warping distance.
Liu, Yongli; Chen, Jingli; Wu, Shuai; Liu, Zhizhong; Chao, Hao
2018-01-01
Clustering time series data is of great significance since it could extract meaningful statistics and other characteristics. Especially in biomedical engineering, outstanding clustering algorithms for time series may help improve the health level of people. Considering data scale and time shifts of time series, in this paper, we introduce two incremental fuzzy clustering algorithms based on a Dynamic Time Warping (DTW) distance. For recruiting Single-Pass and Online patterns, our algorithms could handle large-scale time series data by splitting it into a set of chunks which are processed sequentially. Besides, our algorithms select DTW to measure distance of pair-wise time series and encourage higher clustering accuracy because DTW could determine an optimal match between any two time series by stretching or compressing segments of temporal data. Our new algorithms are compared to some existing prominent incremental fuzzy clustering algorithms on 12 benchmark time series datasets. The experimental results show that the proposed approaches could yield high quality clusters and were better than all the competitors in terms of clustering accuracy.
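The DTW distance that these clustering algorithms rely on can be illustrated with the classic dynamic-programming formulation. The sketch below uses the absolute difference as the local cost and no warping window, both of which are assumptions; it is not the authors' clustering code.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences.

    cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j];
    each step may advance in a, in b, or in both.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because one point may be matched to several points of the other series, a series and a locally stretched copy of it have distance zero, which is what makes DTW tolerant of time shifts.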
Incremental fuzzy C medoids clustering of time series data using dynamic time warping distance
Chen, Jingli; Wu, Shuai; Liu, Zhizhong; Chao, Hao
2018-01-01
Clustering time series data is of great significance since it could extract meaningful statistics and other characteristics. Especially in biomedical engineering, outstanding clustering algorithms for time series may help improve the health level of people. Considering data scale and time shifts of time series, in this paper, we introduce two incremental fuzzy clustering algorithms based on a Dynamic Time Warping (DTW) distance. For recruiting Single-Pass and Online patterns, our algorithms could handle large-scale time series data by splitting it into a set of chunks which are processed sequentially. Besides, our algorithms select DTW to measure distance of pair-wise time series and encourage higher clustering accuracy because DTW could determine an optimal match between any two time series by stretching or compressing segments of temporal data. Our new algorithms are compared to some existing prominent incremental fuzzy clustering algorithms on 12 benchmark time series datasets. The experimental results show that the proposed approaches could yield high quality clusters and were better than all the competitors in terms of clustering accuracy. PMID:29795600
Stochastic time series analysis of hydrology data for water resources
Sathish, S.; Khadar Babu, S. K.
2017-11-01
This article concerns stochastic time series analysis in hydrology, with a focus on the prediction of seasonal periods. Different statistical tests are used for predicting hydrological time series with the Thomas-Fiering model. Time series of flood flow have received a great deal of attention worldwide. Interest in stochastic-process methods of time series analysis is expanding along with growing concerns about seasonal periods and global warming. A recent trend among researchers is to test for seasonal periods in hydrologic flow series using a stochastic process based on the Thomas-Fiering model. The present article proposes to predict the seasonal periods in hydrology using the Thomas-Fiering model.
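For readers unfamiliar with it, the Thomas-Fiering model generates the flow of the next season from the current one by a seasonal lag-one regression plus a random component. The sketch below shows a single step in its commonly cited form; the function and argument names are hypothetical, and the random shock is passed in explicitly so that the step is reproducible.

```python
import math


def thomas_fiering_next(q_prev, mean_prev, mean_next, sd_prev, sd_next, r, z):
    """One step of the Thomas-Fiering seasonal flow recursion.

    q_prev      flow in the current season j
    mean_prev   historical mean flow of season j
    mean_next   historical mean flow of season j + 1
    sd_prev     historical standard deviation of season j
    sd_next     historical standard deviation of season j + 1
    r           correlation between flows of seasons j and j + 1
    z           standard normal random shock
    """
    b = r * sd_next / sd_prev  # regression coefficient of season j + 1 on season j
    return mean_next + b * (q_prev - mean_prev) + z * sd_next * math.sqrt(1.0 - r * r)
```

To simulate a synthetic flow sequence, one would call this step repeatedly while cycling through the seasonal statistics and drawing z from a standard normal generator such as random.gauss(0.0, 1.0).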
Time domain series system definition and gear set reliability modeling
International Nuclear Information System (INIS)
Xie, Liyang; Wu, Ningxiang; Qian, Wenxue
2016-01-01
Time-dependent multi-configuration is a typical feature for mechanical systems such as gear trains and chain drives. As a series system, a gear train is distinct from a traditional series system, such as a chain, in load transmission path, system-component relationship, system functioning manner, as well as time-dependent system configuration. Firstly, the present paper defines time-domain series system to which the traditional series system reliability model is not adequate. Then, system specific reliability modeling technique is proposed for gear sets, including component (tooth) and subsystem (tooth-pair) load history description, material priori/posterior strength expression, time-dependent and system specific load-strength interference analysis, as well as statistically dependent failure events treatment. Consequently, several system reliability models are developed for gear sets with different tooth numbers in the scenario of tooth root material ultimate tensile strength failure. The application of the models is discussed in the last part, and the differences between the system specific reliability model and the traditional series system reliability model are illustrated by virtue of several numerical examples. - Highlights: • A new type of series system, i.e. time-domain multi-configuration series system is defined, that is of great significance to reliability modeling. • Multi-level statistical analysis based reliability modeling method is presented for gear transmission system. • Several system specific reliability models are established for gear set reliability estimation. • The differences between the traditional series system reliability model and the new model are illustrated.
Time series analysis of nuclear instrumentation in EBR-II
International Nuclear Information System (INIS)
Imel, G.R.
1996-01-01
Results of a time series analysis of the scaler count data from the 3 wide range nuclear detectors in the Experimental Breeder Reactor-II are presented. One of the channels was replaced, and it was desired to determine if there was any statistically significant change (i.e., improvement) in the channel's response after the replacement. Data were collected from all 3 channels for 16-day periods before and after detector replacement. Time series analysis and statistical tests showed that there was no significant change after the detector replacement. Also, there were no statistically significant differences among the 3 channels, either before or after the replacement. Finally, it was determined that errors in the reactivity change inferred from subcritical count monitoring during fuel handling would be on the order of 20-30 cents for single count intervals
Bootstrap Power of Time Series Goodness of fit tests
Directory of Open Access Journals (Sweden)
Sohail Chand
2013-10-01
In this article, we looked at the power of various versions of the Box and Pierce statistic and the Cramer von Mises test. An extensive simulation study has been conducted to compare the power of these tests. Algorithms have been provided for the power calculations, and a comparison has also been made between the semi parametric bootstrap methods used for time series. Results show that the Box-Pierce statistic and its various versions have good power against linear time series models but poor power against non linear models, while the situation reverses for the Cramer von Mises test. Moreover, we found that the dynamic bootstrap method is better than the fixed design bootstrap method.
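The Box-Pierce statistic and one of its best-known versions, the Ljung-Box statistic, can be computed from the sample autocorrelations as in the following sketch. The function name is an assumption, and the bootstrap procedures compared in the article are not reproduced here.

```python
def portmanteau_statistics(x, max_lag):
    """Box-Pierce and Ljung-Box statistics for lags 1..max_lag.

    Box-Pierce: Q  = n * sum(r_k**2)
    Ljung-Box:  Q* = n * (n + 2) * sum(r_k**2 / (n - k))
    where r_k is the lag-k sample autocorrelation of x (typically the
    residuals of a fitted time series model).
    """
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    r = [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]
    q_bp = n * sum(rk * rk for rk in r)
    q_lb = n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))
    return q_bp, q_lb
```

Under the null hypothesis of an adequate model both statistics are usually referred to a chi-squared distribution; a bootstrap approach instead approximates the null distribution by recomputing the statistic on resampled series.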
Directory of Open Access Journals (Sweden)
Q. Zhang
2018-02-01
River water-quality time series often exhibit fractal scaling, which here refers to autocorrelation that decays as a power law over some range of scales. Fractal scaling presents challenges to the identification of deterministic trends because (1) fractal scaling has the potential to lead to false inference about the statistical significance of trends and (2) the abundance of irregularly spaced data in water-quality monitoring networks complicates efforts to quantify fractal scaling. Traditional methods for estimating fractal scaling – in the form of spectral slope (β) or other equivalent scaling parameters (e.g., Hurst exponent) – are generally inapplicable to irregularly sampled data. Here we consider two types of estimation approaches for irregularly sampled data and evaluate their performance using synthetic time series. These time series were generated such that (1) they exhibit a wide range of prescribed fractal scaling behaviors, ranging from white noise (β = 0) to Brown noise (β = 2) and (2) their sampling gap intervals mimic the sampling irregularity (as quantified by both the skewness and mean of gap-interval lengths) in real water-quality data. The results suggest that none of the existing methods fully account for the effects of sampling irregularity on β estimation. First, the results illustrate the danger of using interpolation for gap filling when examining autocorrelation, as the interpolation methods consistently underestimate or overestimate β under a wide range of prescribed β values and gap distributions. Second, the widely used Lomb–Scargle spectral method also consistently underestimates β. A previously published modified form, using only the lowest 5 % of the frequencies for spectral slope estimation, has very poor precision, although the overall bias is small. Third, a recent wavelet-based method, coupled with an aliasing filter, generally has the smallest bias and root-mean-squared error among
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui
2014-07-01
The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
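The idea of turning sliding-window regression results into a weighted directed network can be sketched in a much simplified form. In the sketch below a pattern is defined only by the interval (bin) into which the window's regression slope falls, whereas the abstract also uses significance test results; the function names and the bin_width parameter are hypothetical.

```python
import math
from collections import Counter


def ols_slope(x, y):
    """Least-squares slope of y regressed on x (x must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    return sxy / sxx


def pattern_transitions(x, y, window, bin_width=0.5):
    """Weighted directed edges between regression patterns of adjacent windows.

    Each sliding window is assigned a pattern (the index of the slope bin);
    the returned Counter maps (pattern, next_pattern) to its frequency.
    """
    patterns = []
    for start in range(len(x) - window + 1):
        slope = ols_slope(x[start:start + window], y[start:start + window])
        patterns.append(math.floor(slope / bin_width))
    return Counter(zip(patterns, patterns[1:]))
```

The keys of the returned Counter are the edges and the counts are the edge weights, so quantities such as weighted out-degree can be derived by summing the counts over edges that share the same source pattern.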
Characterizing time series via complexity-entropy curves
Ribeiro, Haroldo V.; Jauregui, Max; Zunino, Luciano; Lenzi, Ervin K.
2017-06-01
The search for patterns in time series is a very common task when dealing with complex systems. This is usually accomplished by employing a complexity measure such as entropies and fractal dimensions. However, such measures usually only capture a single aspect of the system dynamics. Here, we propose a family of complexity measures for time series based on a generalization of the complexity-entropy causality plane. By replacing the Shannon entropy by a monoparametric entropy (Tsallis q entropy) and after considering the proper generalization of the statistical complexity (q complexity), we build up a parametric curve (the q-complexity-entropy curve) that is used for characterizing and classifying time series. Based on simple exact results and numerical simulations of stochastic processes, we show that these curves can distinguish among different long-range, short-range, and oscillating correlated behaviors. Also, we verify that simulated chaotic and stochastic time series can be distinguished based on whether these curves are open or closed. We further test this technique in experimental scenarios related to chaotic laser intensity, stock price, sunspot, and geomagnetic dynamics, confirming its usefulness. Finally, we prove that these curves enhance the automatic classification of time series with long-range correlations and interbeat intervals of healthy subjects and patients with heart disease.
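The entropy half of such a curve can be sketched as follows, assuming the ordinal-pattern (permutation) probabilities usually associated with the complexity-entropy causality plane and the standard definition of the Tsallis q-entropy. The embedding dimension `d`, the function names, and the omission of the q-complexity term are choices made here for illustration only.

```python
import math
from collections import Counter

def ordinal_probabilities(series, d=3):
    """Relative frequencies of the ordinal patterns of d consecutive values."""
    counts = Counter(
        tuple(sorted(range(d), key=lambda k: series[i + k]))
        for i in range(len(series) - d + 1)
    )
    total = sum(counts.values())
    return [c / total for c in counts.values()]

def log_q(x, q):
    """Tsallis q-logarithm; reduces to the natural logarithm at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def normalized_q_entropy(series, q, d=3):
    """Tsallis q-entropy of the ordinal distribution, scaled to [0, 1]."""
    probs = ordinal_probabilities(series, d)
    s_q = sum(p * log_q(1.0 / p, q) for p in probs)
    # The uniform distribution over d! patterns attains the maximum log_q(d!).
    return s_q / log_q(math.factorial(d), q)
```

Evaluating `normalized_q_entropy` over a grid of positive `q` values gives one coordinate of the parametric curve; the other coordinate (the q-complexity) requires a generalized divergence that is not reproduced here.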
Frontiers in Time Series and Financial Econometrics: An Overview
S. Ling (Shiqing); M.J. McAleer (Michael); H. Tong (Howell)
2015-01-01
Two of the fastest growing frontiers in econometrics and quantitative finance are time series and financial econometrics. Significant theoretical contributions to financial econometrics have been made by experts in statistics, econometrics, mathematics, and time
Lecture notes for Advanced Time Series Analysis
DEFF Research Database (Denmark)
Madsen, Henrik; Holst, Jan
1997-01-01
A first version of these notes was used at the lectures in Grenoble; they have since been extended and improved (together with Jan Holst) and used in Ph.D. courses on Advanced Time Series Analysis at IMM and at the Department of Mathematical Statistics, University of Lund, 1994, 1997, ...
Predicting long-term catchment nutrient export: the use of nonlinear time series models
Valent, Peter; Howden, Nicholas J. K.; Szolgay, Jan; Komornikova, Magda
2010-05-01
After the Second World War the nitrate concentrations in European water bodies changed significantly as the result of increased nitrogen fertilizer use and changes in land use. However, in recent decades, as a consequence of the implementation of nitrate-reducing measures in Europe, nitrate concentrations in water bodies have been slowly decreasing. As a result, the mean and variance of the observed time series also change with time (nonstationarity and heteroscedasticity). To detect changes and properly describe the behaviour of such time series by time series analysis, linear models (such as autoregressive (AR), moving average (MA) and autoregressive moving average (ARMA) models) are no longer suitable. Time series with sudden changes in statistical characteristics can cause various problems in the calibration of traditional water quality models and thus give biased predictions. Proper statistical analysis of these non-stationary and heteroscedastic time series with the aim of detecting and subsequently explaining the variations in their statistical characteristics requires the use of nonlinear time series models. This information can then be used to improve the model building and calibration of conceptual water quality models or to select the right calibration periods in order to produce reliable predictions. The objective of this contribution is to analyze two long time series of nitrate concentrations of the rivers Ouse and Stour with advanced nonlinear statistical modelling techniques and compare their performance with traditional linear models of the ARMA class in order to identify changes in the time series characteristics. The time series were analysed with nonlinear models with multiple regimes represented by self-exciting threshold autoregressive (SETAR) and Markov-switching models (MSW). The analysis showed that, based on the value of residual sum of squares (RSS) in both datasets, SETAR and MSW models described the time-series better than models of the
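The RSS comparison between a linear model and a two-regime threshold model can be illustrated with a minimal sketch. It assumes first-order dynamics, a single threshold on the lagged value chosen by grid search, and synthetic data; the function names, the `min_regime` guard, and the simulated series are all hypothetical and unrelated to the Ouse and Stour data.

```python
import random

def fit_line(x, y):
    """OLS fit of y = a + b*x; returns (a, b, residual sum of squares)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss

def ar1_rss(series):
    """RSS of a single AR(1) line fitted to the whole series."""
    return fit_line(series[:-1], series[1:])[2]

def fit_setar(series, min_regime=5):
    """Grid-search a threshold r; fit separate AR(1) lines for lag <= r and lag > r."""
    lagged, current = series[:-1], series[1:]
    best = None
    for r in sorted(set(lagged)):
        low = [(x, y) for x, y in zip(lagged, current) if x <= r]
        high = [(x, y) for x, y in zip(lagged, current) if x > r]
        if len(low) < min_regime or len(high) < min_regime:
            continue
        rss = fit_line(*zip(*low))[2] + fit_line(*zip(*high))[2]
        if best is None or rss < best[1]:
            best = (r, rss)
    return best  # (threshold, RSS), or None if no admissible split exists

# Synthetic two-regime series, for illustration only.
random.seed(0)
series = [0.0]
for _ in range(300):
    prev = series[-1]
    mean = 1.0 + 0.6 * prev if prev <= 0 else -1.0 + 0.3 * prev
    series.append(mean + random.gauss(0, 0.5))

threshold, setar_rss = fit_setar(series)
linear_rss = ar1_rss(series)
```

Because the single AR(1) line is a special case of the two-regime fit, the in-sample RSS of the threshold model can never exceed that of the linear model, so a comparison on RSS alone should normally be complemented by a penalized criterion or out-of-sample check.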
Soni, Kirti; Parmar, Kulwinder Singh; Kapoor, Sangeeta; Kumar, Nishant
2016-05-15
Many studies of Aerosol Optical Depth (AOD) in the literature use data derived from the Moderate Resolution Imaging Spectroradiometer (MODIS), but the accuracy of satellite data in comparison with ground data derived from the AErosol RObotic NETwork (AERONET) has always been questionable. To address this, a comparative study of comprehensive ground-based and satellite data for the period 2001-2012 is modeled. The time series model is used for the accurate prediction of AOD, and statistical variability is compared to assess the performance of the model in both cases. Root mean square error (RMSE), mean absolute percentage error (MAPE), stationary R-squared, R-squared, maximum absolute percentage error (MaxAPE), normalized Bayesian information criterion (NBIC) and Ljung-Box methods are used to check the applicability and validity of the developed ARIMA models, revealing significant precision in the model performance. It was found that it is possible to predict the AOD by statistical modeling using time series obtained from past data of MODIS and AERONET as input data. Moreover, the result shows that MODIS data can be formed from AERONET data by adding 0.251627 ± 0.133589 and vice versa by subtracting. From the forecast of AOD for the next four years (2013-2017) obtained with the developed ARIMA model, it is concluded that the forecasted ground AOD shows an increasing trend. Copyright © 2016 Elsevier B.V. All rights reserved.
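Three of the error measures named in this abstract have simple closed forms, sketched below for paired observed and predicted values. The function names and the sample AOD-like numbers are made up for illustration and are not taken from the study.

```python
import math

def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def absolute_percentage_errors(observed, predicted):
    """Per-point absolute percentage errors; observed values must be nonzero."""
    return [abs((o - p) / o) * 100.0 for o, p in zip(observed, predicted)]

def mape(observed, predicted):
    """Mean absolute percentage error."""
    errors = absolute_percentage_errors(observed, predicted)
    return sum(errors) / len(errors)

def max_ape(observed, predicted):
    """Maximum absolute percentage error."""
    return max(absolute_percentage_errors(observed, predicted))

# Hypothetical values, for illustration only.
observed = [0.50, 0.40, 0.80]
predicted = [0.45, 0.44, 0.80]
```

MAPE and MaxAPE are undefined when an observed value is zero, which is worth checking before applying them to quantities that can vanish.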
A high-fidelity weather time series generator using the Markov Chain process on a piecewise level
Hersvik, K.; Endrerud, O.-E. V.
2017-12-01
A method is developed for generating a set of unique weather time-series based on an existing weather series. The method allows statistically valid weather variations to take place within repeated simulations of offshore operations. The numerous generated time series need to share the same statistical qualities as the original time series. Statistical qualities here refer mainly to the distribution of weather windows available for work, including durations and frequencies of such weather windows, and seasonal characteristics. The method is based on the Markov chain process. The core new development lies in how the Markov Process is used, specifically by joining small pieces of random length time series together rather than joining individual weather states, each from a single time step, which is a common solution found in the literature. This new Markov model shows favorable characteristics with respect to the requirements set forth and all aspects of the validation performed.
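One possible reading of "joining small pieces of random length" is sketched below: a piece is copied from the observed series, and the next piece starts at a position that, in the observed data, directly followed the state on which the previous piece ended. This is an assumption about the mechanism, not the authors' algorithm; the function name, the piece-length bounds, and the toy state sequence are hypothetical.

```python
import random

def generate_series(states, length, min_piece=2, max_piece=5, seed=None):
    """Build a synthetic state series by chaining random-length observed pieces."""
    rng = random.Random(seed)
    # For every state, record the positions that directly follow it in the data.
    followers = {}
    for i in range(len(states) - 1):
        followers.setdefault(states[i], []).append(i + 1)
    out = []
    pos = rng.randrange(len(states))
    while len(out) < length:
        piece_len = rng.randint(min_piece, max_piece)
        out.extend(states[pos:pos + piece_len])
        # Continue from a position that followed the current state in the data;
        # fall back to a random restart if that state only occurs at the very end.
        candidates = followers.get(out[-1])
        pos = rng.choice(candidates) if candidates else rng.randrange(len(states))
    return out[:length]

# Toy weather states (C = calm, R = rough, S = storm), for illustration only.
observed = list("CCRRSRCCRSSRCC")
generated = generate_series(observed, 50, seed=1)
```

In this sketch every transition in the generated series also occurs in the observed one (as long as no random restart is needed), while runs longer than one step are copied intact, which is the property that distinguishes piece-wise joining from sampling single steps.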
A new non-parametric stationarity test of time series in the time domain
Jin, Lei
2014-11-07
© 2015 The Royal Statistical Society and Blackwell Publishing Ltd. We propose a new double-order selection test for checking second-order stationarity of a time series. To develop the test, a sequence of systematic samples is defined via Walsh functions. Then the deviations of the autocovariances based on these systematic samples from the corresponding autocovariances of the whole time series are calculated and the uniform asymptotic joint normality of these deviations over different systematic samples is obtained. With a double-order selection scheme, our test statistic is constructed by combining the deviations at different lags in the systematic samples. The null asymptotic distribution of the statistic proposed is derived and the consistency of the test is shown under fixed and local alternatives. Simulation studies demonstrate well-behaved finite sample properties of the method proposed. Comparisons with some existing tests in terms of power are given both analytically and empirically. In addition, the method proposed is applied to check the stationarity assumption of a chemical process viscosity readings data set.
A new non-parametric stationarity test of time series in the time domain
Jin, Lei; Wang, Suojin; Wang, Haiyan
2014-01-01
© 2015 The Royal Statistical Society and Blackwell Publishing Ltd. We propose a new double-order selection test for checking second-order stationarity of a time series. To develop the test, a sequence of systematic samples is defined via Walsh
Statistical models and time series forecasting of sulfur dioxide: a case study Tehran.
Hassanzadeh, S; Hosseinibalam, F; Alizadeh, R
2009-08-01
This study performed a time-series analysis, frequency distribution and prediction of SO2 levels for five stations (Pardisan, Vila, Azadi, Gholhak and Bahman) in Tehran for the period 2000-2005. Most sites show quite similar characteristics, with the highest pollution in autumn-winter and the least pollution in spring-summer. The frequency distributions show higher peaks at two residential sites. The potential for SO2 problems is high because of high emissions and the close geographical proximity of the major industrial and urban centers. The ACF and PACF are nonzero for several lags, indicating a mixed (ARMA) model; an ARMA model was therefore used for forecasting SO2 at Bahman station. The partial autocorrelations become close to 0 after about 5 lags, while the autocorrelations remain strong through all the lags shown. The results showed that the ARMA(2,2) model can provide reliable, satisfactory predictions for the time series.
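The sample ACF and PACF used for this kind of model identification can be computed as in the sketch below, which uses the standard biased autocorrelation estimator and the Durbin-Levinson recursion. The function names are chosen here for illustration; no SO2 data are involved.

```python
def acf(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]

def pacf(x, max_lag):
    """Sample partial autocorrelations for lags 1..max_lag (Durbin-Levinson)."""
    rho = acf(x, max_lag)
    phi_prev = []  # coefficients of the AR(k-1) fit
    result = []
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the AR coefficients from order k-1 to order k.
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_prev.append(phi_kk)
        result.append(phi_kk)
    return result
```

As a rule of thumb, a PACF that cuts off after lag p with a slowly decaying ACF points to an AR(p) component, while both decaying gradually suggests a mixed ARMA model, which is the reasoning the abstract applies.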
The Use of Computer-Assisted Identification of ARIMA Time-Series.
Brown, Roger L.
This study was conducted to determine the effects of using various levels of tutorial statistical software for the tentative identification of nonseasonal ARIMA models, a statistical technique proposed by Box and Jenkins for the interpretation of time-series data. The Box-Jenkins approach is an iterative process encompassing several stages of…
Pavlos, G. P.; Malandraki, O.; Khabarova, O.; Livadiotis, G.; Pavlos, E.; Karakatsanis, L. P.; Iliopoulos, A. C.; Parisis, K.
2017-12-01
In this work we study the non-extensivity of solar wind space plasma using electric-magnetic field data obtained by in situ spacecraft observations at different dynamical states of the solar wind system, especially in interplanetary coronal mass ejections (ICMEs), interplanetary shocks, magnetic islands, or near the Earth's bow shock. In particular, we study the non-extensive fractional acceleration mechanism of energetic particles, which produces kappa distributions, as well as the intermittent turbulence mechanism, which produces multifractal structures related to the Tsallis q-entropy principle. We present some new and significant results concerning the dynamics of ICMEs observed in the near-Earth solar wind environment at L1, as well as its effect on Earth's magnetosphere and magnetic islands. In-situ measurements of energetic particles at L1 are analyzed, in response to major solar eruptive events at the Sun (intense flares, fast CMEs). The statistical characteristics are obtained and compared for the Solar Energetic Particles (SEPs) originating at the Sun, the energetic particle enhancements associated with local acceleration during the CME-driven shock passage over the spacecraft (Energetic Particle Enhancements, ESPs), and the energetic particle signatures observed during the passage of the ICME. The results are referred to Tsallis non-extensive statistics and in particular to the estimation of the Tsallis q-triplet (qstat, qsen, qrel) of the electric-magnetic field and the kappa distributions of solar energetic particle time series of the ICME and magnetic islands resulting from the solar eruptive activity or the internal solar wind dynamics. Our results reveal significant differences in statistical and dynamical features, indicating important variations of the magnetic field dynamics in both the time and space domains during the shock event, in terms of rate of entropy production, relaxation dynamics and non-equilibrium meta-stable stationary states.
Data imputation analysis for Cosmic Rays time series
Fernandes, R. C.; Lucio, P. S.; Fernandez, J. H.
2017-05-01
The occurrence of missing data in Galactic Cosmic Rays (GCR) time series is inevitable, since data are lost through mechanical and human failure, technical problems, and different periods of operation of GCR stations. The aim of this study was to perform multiple dataset imputation in order to depict the observational dataset. The study used the monthly time series of GCR Climax (CLMX) and Roma (ROME) from 1960 to 2004 to simulate scenarios of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% and 90% of missing data compared to the observed ROME series, with 50 replicates. The CLMX station was then used as a proxy for allocation of these scenarios. Three different methods for monthly dataset imputation were selected: Amelia II, which runs the bootstrap Expectation Maximization algorithm; MICE, which runs an algorithm via Multivariate Imputation by Chained Equations; and MTSDI, an Expectation Maximization algorithm-based method for imputation of missing values in multivariate normal time series. The synthetic time series were also compared with the observed ROME series using several skill measures such as RMSE, NRMSE, Agreement Index, R, R2, F-test and t-test. The results showed that for CLMX and ROME, the R2 and R statistics were equal to 0.98 and 0.96, respectively. It was observed that increases in the number of gaps generate loss of quality of the time series. Data imputation was more efficient with the MTSDI method, with negligible errors and the best skill coefficients. The results suggest a limit of about 60% of missing data for imputation of monthly averages. It is noteworthy that CLMX, ROME and KIEL stations present no missing data in the target period. This methodology allowed reconstructing 43 time series.
GPS Position Time Series @ JPL
Owen, Susan; Moore, Angelyn; Kedar, Sharon; Liu, Zhen; Webb, Frank; Heflin, Mike; Desai, Shailen
2013-01-01
Different flavors of GPS time series analysis at JPL: all use the same GPS Precise Point Positioning Analysis raw time series; variations in time series analysis/post-processing are driven by different users. JPL Global Time Series/Velocities: researchers studying reference frame, combining with VLBI/SLR/DORIS. JPL/SOPAC Combined Time Series/Velocities: crustal deformation for tectonic, volcanic, ground water studies. ARIA Time Series/Coseismic Data Products: hazard monitoring and response focused. ARIA data system designed to integrate GPS and InSAR: GPS tropospheric delay used for correcting InSAR; Caltech's GIANT time series analysis uses GPS to correct orbital errors in InSAR. Zhen Liu's talking tomorrow on InSAR Time Series analysis.
STATIONARITY OF ANNUAL MAXIMUM DAILY STREAMFLOW TIME SERIES IN SOUTH-EAST BRAZILIAN RIVERS
Directory of Open Access Journals (Sweden)
Jorge Machado Damázio
2015-08-01
DOI: 10.12957/cadest.2014.18302. The paper presents a statistical analysis of annual maximum daily streamflow between 1931 and 2013 in South-East Brazil, focused on detecting and modelling non-stationarity aspects. Flood protection for the large valleys in South-East Brazil is provided by multiple-purpose reservoir systems built during the 20th century, whose design and operation plans were made assuming stationarity of historical flood time series. Land cover changes and the rapidly increasing level of atmospheric greenhouse gases over the last century may be affecting flood regimes in these valleys, so that nonstationary modelling may need to be applied to reassess dam safety and flood control operation rules at the existing reservoir systems. Six annual maximum daily streamflow time series are analysed. The time series were plotted together with fitted smooth loess functions, and non-parametric statistical tests were performed to check the significance of apparent trends shown by the plots. Non-stationarity is modelled by fitting univariate extreme value distribution functions whose location varies linearly with time. Stationary and non-stationary models are compared with the likelihood ratio statistic. In four of the six analyzed time series non-stationary modelling outperformed stationary modelling. Keywords: Stationarity; Extreme Value Distributions; Flood Frequency Analysis; Maximum Likelihood Method.
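The likelihood ratio comparison mentioned in this abstract can be sketched as follows, assuming that the non-stationary model adds exactly one parameter (a linear trend in the location) to the stationary one, so that the deviance is referred to a chi-square distribution with one degree of freedom. The function name and the idea of passing in precomputed maximized log-likelihoods are assumptions made for illustration.

```python
import math

def likelihood_ratio_test(loglik_stationary, loglik_nonstationary):
    """Deviance and p-value for one extra parameter (chi-square, 1 d.o.f.)."""
    deviance = 2.0 * (loglik_nonstationary - loglik_stationary)
    # For 1 degree of freedom, P(chi2 > D) = erfc(sqrt(D / 2)).
    p_value = math.erfc(math.sqrt(max(deviance, 0.0) / 2.0))
    return deviance, p_value
```

A small p-value indicates that the trend in the location parameter improves the fit by more than would be expected by chance under the stationary model.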
Ramler, Ivan P.; Chapman, Jessica L.
2011-01-01
In this article we describe a semester-long project, based on the popular video game series Guitar Hero, designed to introduce upper-level undergraduate statistics students to statistical research. Some of the goals of this project are to help students develop statistical thinking that allows them to approach and answer open-ended research…
Evaluation of scaling invariance embedded in short time series.
Pan, Xue; Hou, Lei; Stephen, Mutua; Yang, Huijie; Zhu, Chenping
2014-01-01
Scaling invariance of time series has made great contributions in diverse research fields, but how to evaluate the scaling exponent from a real-world series is still an open problem. The finite length of a time series may induce unacceptable fluctuation and bias in statistical quantities and consequently invalidate currently used standard methods. In this paper a new concept called correlation-dependent balanced estimation of diffusion entropy is developed to evaluate scale-invariance in very short time series with length ~10^2. Calculations with specified Hurst exponent values of 0.2, 0.3, ..., 0.9 show that, by using the standard central moving average de-trending procedure, this method can evaluate the scaling exponents for short time series with negligible bias (≤0.03) and a sharp confidence interval (standard deviation ≤0.05). Considering the stride series from ten volunteers along an approximate oval path of a specified length, we observe that though the averages and deviations of scaling exponents are close, their evolutionary behaviors display rich patterns. The method has potential use in analyzing physiological signals, detecting early warning signals, and so on. We emphasize that our core contribution is that, by means of the proposed method, one can precisely estimate Shannon entropy from limited records.
Highly comparative time-series analysis: the empirical structure of time series and their methods.
Fulcher, Ben D; Little, Max A; Jones, Nick S
2013-06-06
The process of collecting and organizing sets of observations represents a common theme throughout the history of science. However, despite the ubiquity of scientists measuring, recording and analysing the dynamics of different processes, an extensive organization of scientific time-series data and analysis methods has never been performed. Addressing this, annotated collections of over 35 000 real-world and model-generated time series, and over 9000 time-series analysis algorithms are analysed in this work. We introduce reduced representations of both time series, in terms of their properties measured by diverse scientific methods, and of time-series analysis methods, in terms of their behaviour on empirical time series, and use them to organize these interdisciplinary resources. This new approach to comparing across diverse scientific data and methods allows us to organize time-series datasets automatically according to their properties, retrieve alternatives to particular analysis methods developed in other scientific disciplines and automate the selection of useful methods for time-series classification and regression tasks. The broad scientific utility of these tools is demonstrated on datasets of electroencephalograms, self-affine time series, heartbeat intervals, speech signals and others, in each case contributing novel analysis techniques to the existing literature. Highly comparative techniques that compare across an interdisciplinary literature can thus be used to guide more focused research in time-series analysis for applications across the scientific disciplines.
"Observation Obscurer" - Time Series Viewer, Editor and Processor
Andronov, I. L.
The program is described, which contains a set of subroutines suitable for fast viewing and interactive filtering and processing of regularly and irregularly spaced time series. Being a 32-bit DOS application, it may be used as a default fast viewer/editor of time series in any computer shell ("commander") or in Windows. It allows one to view the data in the "time" or "phase" mode; to remove ("obscure") or filter outstanding bad points; to make scale transformations and smoothing using a few methods (e.g. mean with phase binning, determination of the statistically optimal number of phase bins; "running parabola" (Andronov, 1997, As. Ap. Suppl, 125, 207) fit); and to make time series analysis using some methods, e.g. correlation, autocorrelation and histogram analysis, determination of extrema etc. Some features have been developed specially for variable star observers, e.g. the barycentric correction, the creation and fast analysis of "OC" diagrams etc. The manual for "hot keys" is presented. The computer code was compiled with a 32-bit Free Pascal (www.freepascal.org).
Costa, Marco; A. Manuela Gonçalves
2012-01-01
This work discusses some statistical approaches that combine multivariate statistical techniques and time series analysis in order to describe and model spatial patterns and temporal evolution by observing hydrological series of water quality variables recorded in time and space. These approaches are illustrated with a data set collected in the River Ave hydrological basin located in the Northwest region of Portugal.
Time series analysis and its applications with R examples
Shumway, Robert H
2017-01-01
The fourth edition of this popular graduate textbook, like its predecessors, presents a balanced and comprehensive treatment of both time and frequency domain methods with accompanying theory. Numerous examples using nontrivial data illustrate solutions to problems such as discovering natural and anthropogenic climate change, evaluating pain perception experiments using functional magnetic resonance imaging, and monitoring a nuclear test ban treaty. The book is designed as a textbook for graduate level students in the physical, biological, and social sciences and as a graduate level text in statistics. Some parts may also serve as an undergraduate introductory course. Theory and methodology are separated to allow presentations on different levels. In addition to coverage of classical methods of time series regression, ARIMA models, spectral analysis and state-space models, the text includes modern developments including categorical time series analysis, multivariate spectral methods, long memory series, nonli...
Kriging Methodology and Its Development in Forecasting Econometric Time Series
Directory of Open Access Journals (Sweden)
Andrej Gajdoš
2017-03-01
One of the approaches for forecasting future values of a time series or unknown spatial data is kriging. The main objective of the paper is to introduce a general scheme of kriging in forecasting econometric time series using a family of linear regression time series models (shortly named FDSLRM) which apply regression not only to a trend but also to a random component of the observed time series. Simultaneously performing a Monte Carlo simulation study with a real electricity consumption dataset in the R computational language and environment, we investigate the well-known problem of “negative” estimates of variance components when kriging predictions fail. Our following theoretical analysis, including also the modern apparatus of advanced multivariate statistics, gives us the formulation and proof of a general theorem about the explicit form of moments (up to sixth order) for a Gaussian time series observation. This result provides a basis for further theoretical and computational research in the kriging methodology development.
A neuro-fuzzy computing technique for modeling hydrological time series
Nayak, P. C.; Sudheer, K. P.; Rangan, D. M.; Ramasastri, K. S.
2004-05-01
Intelligent computing tools such as artificial neural network (ANN) and fuzzy logic approaches are proven to be efficient when applied individually to a variety of problems. Recently there has been a growing interest in combining both these approaches, and as a result, neuro-fuzzy computing techniques have evolved. This approach has been tested and evaluated in the field of signal processing and related areas, but researchers have only begun evaluating the potential of this neuro-fuzzy hybrid approach in hydrologic modeling studies. This paper presents the application of an adaptive neuro fuzzy inference system (ANFIS) to hydrologic time series modeling, and is illustrated by an application to model the river flow of Baitarani River in Orissa state, India. An introduction to the ANFIS modeling approach is also presented. The advantage of the method is that it does not require the model structure to be known a priori, in contrast to most of the time series modeling techniques. The results showed that the ANFIS forecasted flow series preserves the statistical properties of the original flow series. The model showed good performance in terms of various statistical indices. The results are highly promising, and a comparative analysis suggests that the proposed modeling approach outperforms ANNs and other traditional time series models in terms of computational speed, forecast errors, efficiency, peak flow estimation etc. It was observed that the ANFIS model preserves the potential of the ANN approach fully, and eases the model building process.
Directory of Open Access Journals (Sweden)
Ana-Maria CALOMFIR (METESCU)
2015-12-01
In recent years, research in capital markets and portfolio management has been producing more questions than it has been answering: the need for a new paradigm, or a new way of looking at things, has become more and more evident. The existing, classical view of capital markets, based on the efficient market hypothesis, has been the defining theory for the last six decades, but it is still not capable of significantly increasing the understanding of how capital markets function. The purpose of this article is to describe theoretically a less used statistical coefficient with a vast area of applicability due to its robustness, which can easily separate a random series from a non-random series even if the random series is non-Gaussian: the Hurst exponent.
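One classical way to estimate the Hurst exponent is rescaled range (R/S) analysis, sketched below under simple assumptions: non-overlapping windows whose sizes double from `min_size`, and an ordinary least-squares slope of log(R/S) against log(window size). The function names and defaults are illustrative, and the series needs at least four times `min_size` points for the fit to have two window sizes; the article itself may rely on a different estimator.

```python
import math

def rescaled_range(chunk):
    """R/S statistic of one window: range of cumulative deviations over std."""
    n = len(chunk)
    mean = sum(chunk) / n
    cum = lo = hi = 0.0
    for v in chunk:
        cum += v - mean
        lo, hi = min(lo, cum), max(hi, cum)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return (hi - lo) / std if std > 0 else None

def hurst_rs(series, min_size=8):
    """Estimate the Hurst exponent as the slope of log(R/S) vs log(window size)."""
    points = []
    size = min_size
    while size <= len(series) // 2:
        values = [
            rescaled_range(series[i:i + size])
            for i in range(0, len(series) - size + 1, size)
        ]
        values = [v for v in values if v]  # drop constant windows
        if values:
            points.append((math.log(size), math.log(sum(values) / len(values))))
        size *= 2
    mx = sum(a for a, _ in points) / len(points)
    my = sum(b for _, b in points) / len(points)
    sxx = sum((a - mx) ** 2 for a, _ in points)
    return sum((a - mx) * (b - my) for a, b in points) / sxx
```

Values near 0.5 are consistent with an uncorrelated series, values above 0.5 with persistence, and values below 0.5 with anti-persistence; the plain R/S estimate is known to be biased upward for short windows, so small departures from 0.5 should be read with caution.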
Angeler, David G; Viedma, Olga; Moreno, José M
2009-11-01
Time lag analysis (TLA) is a distance-based approach used to study temporal dynamics of ecological communities by measuring community dissimilarity over increasing time lags. Despite its increased use in recent years, its performance in comparison with other more direct methods (i.e., canonical ordination) has not been evaluated. This study fills this gap using extensive simulations and real data sets from experimental temporary ponds (true zooplankton communities) and landscape studies (landscape categories as pseudo-communities) that differ in community structure and anthropogenic stress history. Modeling time with a principal coordinate of neighborhood matrices (PCNM) approach, the canonical ordination technique (redundancy analysis; RDA) consistently outperformed the other statistical tests (i.e., TLAs, Mantel test, and RDA based on linear time trends) using all real data. In addition, the RDA-PCNM revealed different patterns of temporal change, and the strength of each individual time pattern, in terms of adjusted variance explained, could be evaluated. It also identified species contributions to these patterns of temporal change. This additional information is not provided by distance-based methods. The simulation study revealed better Type I error properties of the canonical ordination techniques compared with the distance-based approaches when no deterministic component of change was imposed on the communities. The simulation also revealed that strong emphasis on uniform deterministic change and low variability at other temporal scales is needed to result in decreased statistical power of the RDA-PCNM approach relative to the other methods. Based on the statistical performance of and information content provided by RDA-PCNM models, this technique serves ecologists as a powerful tool for modeling temporal change of ecological (pseudo-) communities.
A KST framework for correlation network construction from time series signals
Qi, Jin-Peng; Gu, Quan; Zhu, Ying; Zhang, Ping
2018-04-01
A KST (Kolmogorov-Smirnov test and T statistic) method is used for construction of a correlation network based on the fluctuation of each time series within the multivariate time signals. In this method, each time series is divided equally into multiple segments, and the maximal data fluctuation in each segment is calculated by a KST change detection procedure. Connections between each time series are derived from the data fluctuation matrix, and are used for construction of the fluctuation correlation network (FCN). The method was tested with synthetic simulations and the result was compared with those obtained using KS or T only for detection of data fluctuation. The novelty of this study is that the correlation analysis was based on the data fluctuation in each segment of each time series rather than on the original time signals, which would be more meaningful for many real-world applications and for analysis of large-scale time signals where prior knowledge is uncertain.
feets: feATURE eXTRACTOR for tIME sERIES
Cabral, Juan; Sanchez, Bruno; Ramos, Felipe; Gurovich, Sebastián; Granitto, Pablo; VanderPlas, Jake
2018-06-01
feets characterizes and analyzes light-curves from astronomical photometric databases for modelling, classification, data cleaning, outlier detection and data analysis. It uses machine learning algorithms to determine the numerical descriptors that characterize and distinguish the different variability classes of light-curves; these range from basic statistical measures such as the mean or standard deviation to complex time-series characteristics such as the autocorrelation function. The library is not restricted to the astronomical field and could also be applied to any kind of time series. This project is a derivative work of FATS (ascl:1711.017).
Complex network approach to characterize the statistical features of the sunspot series
International Nuclear Information System (INIS)
Zou, Yong; Liu, Zonghua; Small, Michael; Kurths, Jürgen
2014-01-01
Complex network approaches have been recently developed as an alternative framework to study the statistical features of time-series data. We perform a visibility-graph analysis on both the daily and monthly sunspot series. Based on the data, we propose two ways to construct the network: one is from the original observable measurements and the other is from a negative-inverse-transformed series. The degree distribution of the derived networks for the strong maxima has clear non-Gaussian properties, while the degree distribution for minima is bimodal. The long-term variation of the cycles is reflected by hubs in the network that span relatively large time intervals. Based on standard network structural measures, we propose to characterize the long-term correlations by waiting times between two subsequent events. The persistence range of the solar cycles has been identified over 15–1000 days by a power-law regime with scaling exponent γ = 2.04 of the occurrence time of two subsequent strong minima. In contrast, a persistent trend is not present in the maximal numbers, although maxima do have significant deviations from an exponential form. Our results suggest some new insights for evaluating existing models. (paper)
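As a rough illustration of the visibility-graph construction mentioned in this abstract, the following Python sketch computes node degrees for a short series. It assumes the natural visibility criterion with equally spaced observations; the function name `visibility_degrees` and the toy input are hypothetical and are not taken from the paper.

```python
def visibility_degrees(y):
    """Degree of each node in a natural visibility graph (assumed criterion).

    Points a and b are linked when every intermediate point c lies strictly
    below the straight line joining (a, y[a]) and (b, y[b]).
    Simple O(n^2) pair scan, meant for short illustrative series only.
    """
    n = len(y)
    degree = [0] * n
    for a in range(n - 1):
        for b in range(a + 1, n):
            visible = all(
                y[c] < y[b] + (y[a] - y[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                degree[a] += 1
                degree[b] += 1
    return degree


# Toy series: the central peak blocks the view between the two end points.
print(visibility_degrees([1, 3, 1]))
```

Large values in the series tend to see many other points, which is one way to read the abstract's remark that hubs span relatively large time intervals.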
International Nuclear Information System (INIS)
Gao Zhong-Ke; Hu Li-Dan; Jin Ning-De
2013-01-01
We generate a directed weighted complex network by a method based on Markov transition probability to represent an experimental two-phase flow. We first systematically carry out gas-liquid two-phase flow experiments for measuring the time series of flow signals. Then we construct directed weighted complex networks from various time series in terms of a network generation method based on Markov transition probability. We find that the generated network inherits the main features of the time series in the network structure. In particular, the networks from time series with different dynamics exhibit distinct topological properties. Finally, we construct two-phase flow directed weighted networks from experimental signals and associate the dynamic behavior of gas-liquid two-phase flow with the topological statistics of the generated networks. The results suggest that the topological statistics of two-phase flow networks allow quantitative characterization of the dynamic flow behavior in the transitions among different gas-liquid flow patterns. (general)
Stochastic generation of hourly wind speed time series
International Nuclear Information System (INIS)
Shamshad, A.; Wan Mohd Ali Wan Hussin; Bawadi, M.A.; Mohd Sanusi, S.A.
2006-01-01
In the present study, hourly wind speed data of Kuala Terengganu in Peninsular Malaysia are simulated using the transition matrix approach of a Markovian process. The wind speed time series is divided into various states based on certain criteria. The next wind speed states are selected based on the previous states. The cumulative probability transition matrix has been formed, in which each row ends with 1. Using uniform random numbers between 0 and 1, a series of future states is generated. These states have been converted to the corresponding wind speed values using another uniform random number generator. The accuracy of the model has been determined by comparing the statistical characteristics, such as average, standard deviation, root mean square error, probability density function and autocorrelation function, of the generated data to those of the original data. The generated wind speed time series is able to preserve the wind speed characteristics of the observed data.
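The procedure described in this abstract can be sketched in Python as follows. This is a minimal sketch under stated assumptions: a first-order chain, states defined by fixed-width speed bins, and a uniform draw inside the selected bin to recover a speed value. The function names, the bin edges and the uniform fallback for unvisited states are hypothetical choices, not details from the study.

```python
import numpy as np


def fit_transition_matrix(states, n_states):
    """Estimate first-order transition probabilities from a state sequence."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    safe = np.where(row_sums > 0, row_sums, 1.0)
    # Assumption: states that were never left fall back to a uniform row.
    return np.where(row_sums > 0, counts / safe, 1.0 / n_states)


def simulate_wind(speeds, bin_edges, n_steps, seed=0):
    """Generate a synthetic speed series from observed speeds."""
    rng = np.random.default_rng(seed)
    n_states = len(bin_edges) - 1
    states = np.clip(np.digitize(speeds, bin_edges) - 1, 0, n_states - 1)
    # Cumulative transition matrix: each row ends with 1.
    cum = np.cumsum(fit_transition_matrix(states, n_states), axis=1)
    cum[:, -1] = 1.0
    state = int(states[0])
    out = np.empty(n_steps)
    for t in range(n_steps):
        # First uniform number selects the next state.
        state = int(np.searchsorted(cum[state], rng.random(), side="right"))
        # Second uniform number converts the state to a speed within its bin.
        lo, hi = bin_edges[state], bin_edges[state + 1]
        out[t] = lo + (hi - lo) * rng.random()
    return out
```

A call such as `simulate_wind(observed, np.arange(0.0, 6.0, 1.0), 1000)` would then give a series whose mean, standard deviation and autocorrelation can be compared with those of `observed`, in the spirit of the validation described above.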
International Work-Conference on Time Series
Pomares, Héctor; Valenzuela, Olga
2017-01-01
This volume of selected and peer-reviewed contributions on the latest developments in time series analysis and forecasting updates the reader on topics such as analysis of irregularly sampled time series, multi-scale analysis of univariate and multivariate time series, linear and non-linear time series models, advanced time series forecasting methods, applications in time series analysis and forecasting, advanced methods and online learning in time series and high-dimensional and complex/big data time series. The contributions were originally presented at the International Work-Conference on Time Series, ITISE 2016, held in Granada, Spain, June 27-29, 2016. The series of ITISE conferences provides a forum for scientists, engineers, educators and students to discuss the latest ideas and implementations in the foundations, theory, models and applications in the field of time series analysis and forecasting. It focuses on interdisciplinary and multidisciplinary research encompassing the disciplines of comput...
Jain, Lakhmi
2012-01-01
Data mining is one of the most rapidly growing research areas in computer science and statistics. In Volume 2 of this three volume series, we have brought together contributions from some of the most prestigious researchers in theoretical data mining. Each of the chapters is self contained. Statisticians and applied scientists/ engineers will find this volume valuable. Additionally, it provides a sourcebook for graduate students interested in the current direction of research in data mining.
Characteristics of the transmission of autoregressive sub-patterns in financial time series
Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong
2014-09-01
There are many types of autoregressive patterns in financial time series, and they form a transmission process. Here, we define autoregressive patterns quantitatively through an econometrical regression model. We present a computational algorithm that sets the autoregressive patterns as nodes and transmissions between patterns as edges, and then converts the transmission process of autoregressive patterns in a time series into a network. We utilised daily Shanghai (securities) composite index time series to study the transmission characteristics of autoregressive patterns. We found statistically significant evidence that the financial market is not random and that there are similar characteristics between parts and whole time series. A few types of autoregressive sub-patterns and transmission patterns drive the oscillations of the financial market. A clustering effect on fluctuations appears in the transmission process, and certain non-major autoregressive sub-patterns have high media capabilities in the financial time series. Different stock indexes exhibit similar characteristics in the transmission of fluctuation information. This work not only proposes a distinctive perspective for analysing financial time series but also provides important information for investors.
Pardo-Igúzquiza, Eulogio; Rodríguez-Tovar, Francisco J.
2012-12-01
Many spectral analysis techniques have been designed assuming sequences taken with a constant sampling interval. However, there are empirical time series in the geosciences (sediment cores, fossil abundance data, isotope analysis, …) that do not follow regular sampling because of missing data, gapped data, random sampling or incomplete sequences, among other reasons. In general, interpolating an uneven series in order to obtain a succession with a constant sampling interval alters the spectral content of the series. In such cases it is preferable to follow an approach that works with the uneven data directly, avoiding the need for an explicit interpolation step. The Lomb-Scargle periodogram is a popular choice in such circumstances, as there are programs available in the public domain for its computation. One new computer program for spectral analysis improves the standard Lomb-Scargle periodogram approach in two ways: (1) It explicitly adjusts the statistical significance to any bias introduced by variance reduction smoothing, and (2) it uses a permutation test to evaluate confidence levels, which is better suited than parametric methods when neighbouring frequencies are highly correlated. Another novel program for cross-spectral analysis offers the advantage of estimating the Lomb-Scargle cross-periodogram of two uneven time series defined on the same interval, and it evaluates the confidence levels of the estimated cross-spectra by a non-parametric computer intensive permutation test. Thus, the cross-spectrum, the squared coherence spectrum, the phase spectrum, and the Monte Carlo statistical significance of the cross-spectrum and the squared-coherence spectrum can be obtained. Both of the programs are written in ANSI Fortran 77, in view of its simplicity and compatibility. The program code is of public domain, provided on the website of the journal (http://www.iamg.org/index.php/publisher/articleview/frmArticleID/112/). Different examples (with simulated and
Time Series Factor Analysis with an Application to Measuring Money
Gilbert, Paul D.; Meijer, Erik
2005-01-01
Time series factor analysis (TSFA) and its associated statistical theory is developed. Unlike dynamic factor analysis (DFA), TSFA obviates the need for explicitly modeling the process dynamics of the underlying phenomena. It also differs from standard factor analysis (FA) in important respects: the
Shimada, Yutaka; Ikeguchi, Tohru; Shigehara, Takaomi
2012-10-01
In this Letter, we propose a framework to transform a complex network to a time series. The transformation from complex networks to time series is realized by the classical multidimensional scaling. Applying the transformation method to a model proposed by Watts and Strogatz [Nature (London) 393, 440 (1998)], we show that ring lattices are transformed to periodic time series, small-world networks to noisy periodic time series, and random networks to random time series. We also show that these relationships are analytically held by using the circulant-matrix theory and the perturbation theory of linear operators. The results are generalized to several high-dimensional lattices.
The Timeseries Toolbox - A Web Application to Enable Accessible, Reproducible Time Series Analysis
Veatch, W.; Friedman, D.; Baker, B.; Mueller, C.
2017-12-01
The vast majority of data analyzed by climate researchers are repeated observations of physical processes, or time series data. These data lend themselves to a common set of statistical techniques and models designed to determine trends and variability (e.g., seasonality) of these repeated observations. Often, these same techniques and models can be applied to a wide variety of different time series data. The Timeseries Toolbox is a web application designed to standardize and streamline these common approaches to time series analysis and modeling, with particular attention to hydrologic time series used in climate preparedness and resilience planning and design by the U.S. Army Corps of Engineers. The application performs much of the pre-processing of time series data necessary for more complex techniques (e.g., interpolation, aggregation). With this tool, users can upload any dataset that conforms to a standard template and immediately begin applying these techniques to analyze their time series data.
Normalization methods in time series of platelet function assays
Van Poucke, Sven; Zhang, Zhongheng; Roest, Mark; Vukicevic, Milan; Beran, Maud; Lauwereins, Bart; Zheng, Ming-Hua; Henskens, Yvonne; Lancé, Marcus; Marcus, Abraham
2016-01-01
Abstract Platelet function can be quantitatively assessed by specific assays such as light-transmission aggregometry, multiple-electrode aggregometry measuring the response to adenosine diphosphate (ADP), arachidonic acid, collagen, and thrombin-receptor activating peptide and viscoelastic tests such as rotational thromboelastometry (ROTEM). The task of extracting meaningful statistical and clinical information from high-dimensional data spaces in temporal multivariate clinical data represented in multivariate time series is complex. Building insightful visualizations for multivariate time series demands adequate usage of normalization techniques. In this article, various methods for data normalization (z-transformation, range transformation, proportion transformation, and interquartile range) are presented and visualized discussing the most suited approach for platelet function data series. Normalization was calculated per assay (test) for all time points and per time point for all tests. Interquartile range, range transformation, and z-transformation demonstrated the correlation as calculated by the Spearman correlation test, when normalized per assay (test) for all time points. When normalizing per time point for all tests, no correlation could be abstracted from the charts as was the case when using all data as 1 dataset for normalization. PMID:27428217
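The four normalization methods named in this abstract can be sketched in Python as below. The exact formulas used in the article are not given here, so these are the common textbook definitions and should be read as assumptions: in particular, the interquartile-range variant is taken to be centring on the median and dividing by the IQR. The array layout (rows as time points, columns as assays) is also hypothetical.

```python
import numpy as np


def z_transform(x, axis=0):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, ddof=1, keepdims=True)
    return (x - mean) / std


def range_transform(x, axis=0):
    """Rescale linearly to the interval [0, 1]."""
    lo = x.min(axis=axis, keepdims=True)
    hi = x.max(axis=axis, keepdims=True)
    return (x - lo) / (hi - lo)


def proportion_transform(x, axis=0):
    """Express each value as a share of the total."""
    return x / x.sum(axis=axis, keepdims=True)


def iqr_transform(x, axis=0):
    """Centre on the median and scale by the interquartile range (assumed form)."""
    q1, med, q3 = np.percentile(x, [25, 50, 75], axis=axis, keepdims=True)
    return (x - med) / (q3 - q1)
```

With rows as time points and columns as assays, `axis=0` normalizes per assay over all time points and `axis=1` normalizes per time point over all assays, which are the two schemes the abstract compares.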
New significance test methods for Fourier analysis of geophysical time series
Directory of Open Access Journals (Sweden)
Z. Zhang
2011-09-01
When one applies the discrete Fourier transform to analyze finite-length time series, discontinuities at the data boundaries will distort its Fourier power spectrum. In this paper, based on a rigid statistics framework, we present a new significance test method which can extract the intrinsic feature of a geophysical time series very well. We show the difference in significance level compared with traditional Fourier tests by analyzing the Arctic Oscillation (AO) and the Nino3.4 time series. In the AO, we find significant peaks at about 2.8, 4.3, and 5.7 yr periods and in Nino3.4 at about 12 yr period in tests against red noise. These peaks are not significant in traditional tests.
Xia, Li C; Steele, Joshua A; Cram, Jacob A; Cardon, Zoe G; Simmons, Sheri L; Vallino, Joseph J; Fuhrman, Jed A; Sun, Fengzhu
2011-01-01
The increasing availability of time series microbial community data from metagenomics and other molecular biological studies has enabled the analysis of large-scale microbial co-occurrence and association networks. Among the many analytical techniques available, the Local Similarity Analysis (LSA) method is unique in that it captures local and potentially time-delayed co-occurrence and association patterns in time series data that cannot otherwise be identified by ordinary correlation analysis. However LSA, as originally developed, does not consider time series data with replicates, which hinders the full exploitation of available information. With replicates, it is possible to understand the variability of local similarity (LS) score and to obtain its confidence interval. We extended our LSA technique to time series data with replicates and termed it extended LSA, or eLSA. Simulations showed the capability of eLSA to capture subinterval and time-delayed associations. We implemented the eLSA technique into an easy-to-use analytic software package. The software pipeline integrates data normalization, statistical correlation calculation, statistical significance evaluation, and association network construction steps. We applied the eLSA technique to microbial community and gene expression datasets, where unique time-dependent associations were identified. The extended LSA analysis technique was demonstrated to reveal statistically significant local and potentially time-delayed association patterns in replicated time series data beyond that of ordinary correlation analysis. These statistically significant associations can provide insights to the real dynamics of biological systems. The newly designed eLSA software efficiently streamlines the analysis and is freely available from the eLSA homepage, which can be accessed at http://meta.usc.edu/softs/lsa.
Acute ischaemic stroke prediction from physiological time series patterns
Directory of Open Access Journals (Sweden)
Qing Zhang,
2013-05-01
Background: Stroke is one of the major diseases contributing to human mortality. Recent clinical research has indicated that early changes in common physiological variables represent a potential therapeutic target; thus the manipulation of these variables may eventually yield an effective way to optimise stroke recovery. Aims: We examined correlations between physiological parameters of patients during the first 48 hours after a stroke and their stroke outcomes after 3 months. We wanted to discover physiological determinants that could be used to improve health outcomes by supporting the medical decisions that need to be made early in a patient's stroke experience. Method: We applied regression-based machine learning techniques to build a prediction algorithm that can forecast 3-month outcomes from initial physiological time series data during the first 48 hours after stroke. In our method, we not only used statistical characteristics as traditional prediction features but also adopted trend patterns of time series data as new key features. Results: We tested our prediction method on a real physiological data set of stroke patients. The experimental results revealed a high average precision rate of 90%. We also tested prediction methods considering only statistical characteristics of physiological data and obtained an average precision rate of 71%. Conclusion: We demonstrated that using trend pattern features in prediction methods improved the accuracy of stroke outcome prediction. Therefore, trend patterns of physiological time series data have an important role in the early treatment of patients with acute ischaemic stroke.
Synthetic river flow time series generator for dispatch and spot price forecast
International Nuclear Information System (INIS)
Flores, R.A.
2007-01-01
Decision-making in electricity markets is complicated by uncertainties in demand growth, power supplies and fuel prices. In Peru, where the electrical power system is highly dependent on water resources at dams and river flows, hydrological uncertainties play a primary role in planning, price and dispatch forecast. This paper proposed a signal processing method for generating new synthetic river flow time series as a support for planning and spot market price forecasting. River flow time series are natural phenomena representing a continuous-time domain process. As an alternative synthetic representation of the original river flow time series, this proposed signal processing method preserves correlations, basic statistics and seasonality. It takes into account deterministic, periodic and non periodic components such as those due to the El Nino Southern Oscillation phenomenon. The new synthetic time series has many correlations with the original river flow time series, rendering it suitable for possible replacement of the classical method of sorting historical river flow time series. As a dispatch and planning approach to spot pricing, the proposed method offers higher accuracy modeling by decomposing the signal into deterministic, periodic, non periodic and stochastic sub signals. 4 refs., 4 tabs., 13 figs
Clustering Multivariate Time Series Using Hidden Markov Models
Directory of Open Access Journals (Sweden)
Shima Ghassempour
2014-03-01
In this paper we describe an algorithm for clustering multivariate time series with variables taking both categorical and continuous values. Time series of this type are frequent in health care, where they represent the health trajectories of individuals. The problem is challenging because categorical variables make it difficult to define a meaningful distance between trajectories. We propose an approach based on Hidden Markov Models (HMMs), where we first map each trajectory into an HMM, then define a suitable distance between HMMs and finally proceed to cluster the HMMs with a method based on a distance matrix. We test our approach on a simulated, but realistic, data set of 1,255 trajectories of individuals of age 45 and over, on a synthetic validation set with known clustering structure, and on a smaller set of 268 trajectories extracted from the longitudinal Health and Retirement Survey. The proposed method can be implemented quite simply using standard packages in R and Matlab and may be a good candidate for solving the difficult problem of clustering multivariate time series with categorical variables using tools that do not require advanced statistical knowledge, and therefore are accessible to a wide range of researchers.
DEFF Research Database (Denmark)
Hisdal, H.; Holmqvist, E.; Hyvärinen, V.
Awareness that emission of greenhouse gases will raise the global temperature and change the climate has led to studies trying to identify such changes in long-term climate and hydrologic time series. This report, written by the...
Statistics without Tears: Complex Statistics with Simple Arithmetic
Smith, Brian
2011-01-01
One of the often overlooked aspects of modern statistics is the analysis of time series data. Modern introductory statistics courses tend to rush to probabilistic applications involving risk and confidence. Rarely does the first level course linger on such useful and fascinating topics as time series decomposition, with its practical applications…
Time series analysis of barometric pressure data
International Nuclear Information System (INIS)
La Rocca, Paola; Riggi, Francesco; Riggi, Daniele
2010-01-01
Time series of atmospheric pressure data, collected over a period of several years, were analysed to provide undergraduate students with educational examples of application of simple statistical methods of analysis. In addition to basic methods for the analysis of periodicities, a comparison of two forecast models, one based on autoregression algorithms, and the other making use of an artificial neural network, was made. Results show that the application of artificial neural networks may give slightly better results compared to traditional methods.
Interpretation of a compositional time series
Tolosana-Delgado, R.; van den Boogaart, K. G.
2012-04-01
Common methods for multivariate time series analysis use linear operations, from the definition of a time-lagged covariance/correlation to the prediction of new outcomes. However, when the time series response is a composition (a vector of positive components showing the relative importance of a set of parts in a total, like percentages and proportions), linear operations are afflicted by several problems. For instance, it has long been recognised that (auto/cross-)correlations between raw percentages are spurious, more dependent on which other components are being considered than on any natural link between the components of interest. Also, a long-term forecast of a composition in models with a linear trend will ultimately predict negative components. In general terms, compositional data should not be treated in a raw scale, but after a log-ratio transformation (Aitchison, 1986: The statistical analysis of compositional data. Chapman and Hall). This is so because the information conveyed by compositional data is relative, as stated in their definition. The principle of working in coordinates allows one to apply any sort of multivariate analysis to a log-ratio transformed composition, as long as this transformation is invertible. This principle is fully applicable to time series analysis. We will discuss how results (both auto/cross-correlation functions and predictions) can be back-transformed, viewed and interpreted in a meaningful way. One view is to use the exhaustive set of all possible pairwise log-ratios, which allows one to express the results as D(D - 1)/2 separate, interpretable sets of one-dimensional models showing the behaviour of each possible pairwise log-ratio. Another view is the interpretation of estimated coefficients or correlations back-transformed in terms of compositions. These two views are compatible and complementary. These issues are illustrated with time series of seasonal precipitation patterns at different rain gauges of the USA.
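To make the idea of an invertible log-ratio transformation concrete, here is a minimal Python sketch. It uses the additive log-ratio (alr) transform as one example of such a transformation; the abstract does not say which transform the authors use, so this choice, the function names and the sample compositions are assumptions for illustration.

```python
import numpy as np
from itertools import combinations


def alr(x):
    """Additive log-ratio: log of each part relative to the last part."""
    x = np.asarray(x, dtype=float)
    return np.log(x[..., :-1] / x[..., -1:])


def alr_inv(z):
    """Back-transform alr coordinates to compositions that sum to 1."""
    z = np.asarray(z, dtype=float)
    zeros = np.zeros(z.shape[:-1] + (1,))
    expz = np.exp(np.concatenate([z, zeros], axis=-1))
    return expz / expz.sum(axis=-1, keepdims=True)


def pairwise_logratios(x):
    """All D(D - 1)/2 pairwise log-ratios, keyed by the pair of part indices."""
    x = np.asarray(x, dtype=float)
    d = x.shape[-1]
    return {(i, j): np.log(x[..., i] / x[..., j])
            for i, j in combinations(range(d), 2)}


# Two time steps of a three-part composition (hypothetical values).
comp = np.array([[0.2, 0.3, 0.5],
                 [0.1, 0.6, 0.3]])
coords = alr(comp)          # unconstrained coordinates for linear modelling
restored = alr_inv(coords)  # back-transformed, positive and summing to 1
```

Any linear forecast made in the `coords` space maps back through `alr_inv` to strictly positive parts, which avoids the negative components that the abstract warns about for raw-scale models.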
Using entropy to cut complex time series
Mertens, David; Poncela Casasnovas, Julia; Spring, Bonnie; Amaral, L. A. N.
2013-03-01
Using techniques from statistical physics, physicists have modeled and analyzed human phenomena varying from academic citation rates to disease spreading to vehicular traffic jams. The last decade's explosion of digital information and the growing ubiquity of smartphones have led to a wealth of human self-reported data. This wealth of data comes at a cost, including non-uniform sampling and statistically significant but physically insignificant correlations. In this talk I present our work using entropy to identify stationary sub-sequences of self-reported human weight from a weight management web site. Our entropic approach, inspired by the infomap network community detection algorithm, is far less biased by rare fluctuations than more traditional time series segmentation techniques. Supported by the Howard Hughes Medical Institute
A Non-standard Empirical Likelihood for Time Series
DEFF Research Database (Denmark)
Nordman, Daniel J.; Bunzel, Helle; Lahiri, Soumendra N.
Standard blockwise empirical likelihood (BEL) for stationary, weakly dependent time series requires specifying a fixed block length as a tuning parameter for setting confidence regions. This aspect can be difficult and impacts coverage accuracy. As an alternative, this paper proposes a new version of BEL based on a simple, though non-standard, data-blocking rule which uses a data block of every possible length. Consequently, the method involves no block selection and is also anticipated to exhibit better coverage performance. Its non-standard blocking scheme, however, induces non-standard asymptotics and requires a significantly different development compared to standard BEL. We establish the large-sample distribution of log-ratio statistics from the new BEL method for calibrating confidence regions for mean or smooth function parameters of time series. This limit law is not the usual chi...
A comment on measuring the Hurst exponent of financial time series
Couillard, Michel; Davison, Matt
2005-03-01
A fundamental hypothesis of quantitative finance is that stock price variations are independent and can be modeled using Brownian motion. In recent years, it was proposed to use rescaled range analysis and its characteristic value, the Hurst exponent, to test for independence in financial time series. Theoretically, independent time series should be characterized by a Hurst exponent of 1/2. However, finite Brownian motion data sets will always give a value of the Hurst exponent larger than 1/2 and without an appropriate statistical test such a value can mistakenly be interpreted as evidence of long term memory. We obtain a more precise statistical significance test for the Hurst exponent and apply it to real financial data sets. Our empirical analysis shows no long-term memory in some financial returns, suggesting that Brownian motion cannot be rejected as a model for price dynamics.
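For readers unfamiliar with rescaled range analysis, the following Python sketch shows one common way to estimate a Hurst exponent from a series of increments. It is the classical R/S estimator, not the refined significance test proposed in the paper; the window sizes, function names and the use of non-overlapping windows are assumptions made for illustration.

```python
import numpy as np


def rescaled_range(x):
    """R/S statistic of one window: range of cumulative deviations over std."""
    dev = np.cumsum(x - x.mean())
    return (dev.max() - dev.min()) / x.std()


def hurst_rs(x, window_sizes):
    """Slope of log(mean R/S) against log(window size)."""
    x = np.asarray(x, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        # Non-overlapping windows of length n (assumed blocking scheme).
        chunks = [x[i:i + n] for i in range(0, len(x) - n + 1, n)]
        rs = np.mean([rescaled_range(c) for c in chunks])
        log_n.append(np.log(n))
        log_rs.append(np.log(rs))
    slope, _intercept = np.polyfit(log_n, log_rs, 1)
    return slope


# Independent Gaussian increments, as in the Brownian motion null model.
rng = np.random.default_rng(0)
increments = rng.standard_normal(4096)
h = hurst_rs(increments, [16, 32, 64, 128, 256])
```

On finite independent samples such as this one, the estimate tends to come out somewhat above 1/2, which is the bias the abstract cautions against reading as long-term memory.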
A simple and fast representation space for classifying complex time series
International Nuclear Information System (INIS)
Zunino, Luciano; Olivares, Felipe; Bariviera, Aurelio F.; Rosso, Osvaldo A.
2017-01-01
In the context of time series analysis considerable effort has been directed towards the implementation of efficient discriminating statistical quantifiers. Very recently, a simple and fast representation space has been introduced, namely the number of turning points versus the Abbe value. It is able to separate time series from stationary and non-stationary processes with long-range dependences. In this work we show that this bidimensional approach is useful for distinguishing complex time series: different sets of financial and physiological data are efficiently discriminated. Additionally, a multiscale generalization that takes into account the multiple time scales often involved in complex systems has been also proposed. This multiscale analysis is essential to reach a higher discriminative power between physiological time series in health and disease. - Highlights: • A bidimensional scheme has been tested for classification purposes. • A multiscale generalization is introduced. • Several practical applications confirm its usefulness. • Different sets of financial and physiological data are efficiently distinguished. • This multiscale bidimensional approach has high potential as discriminative tool.
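The two coordinates of the representation space named in this abstract can be sketched in Python as follows. The Abbe value is taken here to be half the ratio of the mean-square successive difference to the variance, which is its usual definition; the exact normalization used by the authors is an assumption, as are the function names and the rule that flat steps do not count as turning points.

```python
import numpy as np


def turning_points(x):
    """Count local maxima and minima (sign changes of successive differences)."""
    d = np.diff(np.asarray(x, dtype=float))
    return int(np.sum(d[:-1] * d[1:] < 0))


def abbe_value(x):
    """Half the mean-square successive difference divided by the variance."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    msd = np.sum(np.diff(x) ** 2) / (n - 1)
    var = np.sum((x - x.mean()) ** 2) / n
    return 0.5 * msd / var


def representation_point(x):
    """Map a series to a (turning points, Abbe value) pair."""
    return turning_points(x), abbe_value(x)
```

Each series becomes a single point in this plane, so sets of series can be compared with a scatter plot or any standard two-dimensional classifier.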
A simple and fast representation space for classifying complex time series
Energy Technology Data Exchange (ETDEWEB)
Zunino, Luciano, E-mail: lucianoz@ciop.unlp.edu.ar [Centro de Investigaciones Ópticas (CONICET La Plata – CIC), C.C. 3, 1897 Gonnet (Argentina); Departamento de Ciencias Básicas, Facultad de Ingeniería, Universidad Nacional de La Plata (UNLP), 1900 La Plata (Argentina); Olivares, Felipe, E-mail: olivaresfe@gmail.com [Instituto de Física, Pontificia Universidad Católica de Valparaíso (PUCV), 23-40025 Valparaíso (Chile); Bariviera, Aurelio F., E-mail: aurelio.fernandez@urv.cat [Department of Business, Universitat Rovira i Virgili, Av. Universitat 1, 43204 Reus (Spain); Rosso, Osvaldo A., E-mail: oarosso@gmail.com [Instituto de Física, Universidade Federal de Alagoas (UFAL), BR 104 Norte km 97, 57072-970, Maceió, Alagoas (Brazil); Instituto Tecnológico de Buenos Aires (ITBA) and CONICET, C1106ACD, Av. Eduardo Madero 399, Ciudad Autónoma de Buenos Aires (Argentina); Complex Systems Group, Facultad de Ingeniería y Ciencias Aplicadas, Universidad de los Andes, Av. Mons. Álvaro del Portillo 12.455, Las Condes, Santiago (Chile)
2017-03-18
In the context of time series analysis considerable effort has been directed towards the implementation of efficient discriminating statistical quantifiers. Very recently, a simple and fast representation space has been introduced, namely the number of turning points versus the Abbe value. It is able to separate time series from stationary and non-stationary processes with long-range dependences. In this work we show that this bidimensional approach is useful for distinguishing complex time series: different sets of financial and physiological data are efficiently discriminated. Additionally, a multiscale generalization that takes into account the multiple time scales often involved in complex systems has been also proposed. This multiscale analysis is essential to reach a higher discriminative power between physiological time series in health and disease. - Highlights: • A bidimensional scheme has been tested for classification purposes. • A multiscale generalization is introduced. • Several practical applications confirm its usefulness. • Different sets of financial and physiological data are efficiently distinguished. • This multiscale bidimensional approach has high potential as discriminative tool.
Kolmogorov Space in Time Series Data
Kanjamapornkul, K.; Pinčák, R.
2016-01-01
We provide the proof that the space of time series data is a Kolmogorov space with $T_{0}$-separation axiom using the loop space of time series data. In our approach we define a cyclic coordinate of intrinsic time scale of time series data after empirical mode decomposition. A spinor field of time series data comes from the rotation of data around price and time axis by defining a new extradimension to time series data. We show that there exist hidden eight dimensions in Kolmogorov space for ...
Assessing Coupling Dynamics from an Ensemble of Time Series
Directory of Open Access Journals (Sweden)
Germán Gómez-Herrero
2015-04-01
Finding interdependency relations between time series provides valuable knowledge about the processes that generated the signals. Information theory sets a natural framework for important classes of statistical dependencies. However, a reliable estimation from information-theoretic functionals is hampered when the dependency to be assessed is brief or evolves in time. Here, we show that these limitations can be partly alleviated when we have access to an ensemble of independent repetitions of the time series. In particular, we gear a data-efficient estimator of probability densities to make use of the full structure of trial-based measures. By doing so, we can obtain time-resolved estimates for a family of entropy combinations (including mutual information, transfer entropy and their conditional counterparts), which are more accurate than the simple average of individual estimates over trials. We show with simulated and real data generated by coupled electronic circuits that the proposed approach allows one to recover the time-resolved dynamics of the coupling between different subsystems.
Multiple Indicator Stationary Time Series Models.
Sivo, Stephen A.
2001-01-01
Discusses the propriety and practical advantages of specifying multivariate time series models in the context of structural equation modeling for time series and longitudinal panel data. For time series data, the multiple indicator model specification improves on classical time series analysis. For panel data, the multiple indicator model…
Model-based Clustering of Categorical Time Series with Multinomial Logit Classification
Frühwirth-Schnatter, Sylvia; Pamminger, Christoph; Winter-Ebmer, Rudolf; Weber, Andrea
2010-09-01
A common problem in many areas of applied statistics is to identify groups of similar time series in a panel of time series. However, distance-based clustering methods cannot easily be extended to time series data, where an appropriate distance-measure is rather difficult to define, particularly for discrete-valued time series. Markov chain clustering, proposed by Pamminger and Frühwirth-Schnatter [6], is an approach for clustering discrete-valued time series obtained by observing a categorical variable with several states. This model-based clustering method is based on finite mixtures of first-order time-homogeneous Markov chain models. In order to further explain group membership we present an extension to the approach of Pamminger and Frühwirth-Schnatter [6] by formulating a probabilistic model for the latent group indicators within the Bayesian classification rule by using a multinomial logit model. The parameters are estimated for a fixed number of clusters within a Bayesian framework using a Markov chain Monte Carlo (MCMC) sampling scheme representing a (full) Gibbs-type sampler which involves only draws from standard distributions. Finally, an application to a panel of Austrian wage mobility data is presented which leads to an interesting segmentation of the Austrian labour market.
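The building block of the mixture model described above is a first-order time-homogeneous Markov chain on a categorical series. The sketch below only estimates one chain's transition probabilities by counting transitions (the maximum-likelihood estimate); it does not implement the Bayesian mixture, the multinomial logit extension, or the Gibbs sampler. The function name and the toy state labels are hypothetical.

```python
from collections import Counter


def transition_matrix(sequence, states):
    """Row-normalized transition frequencies of a categorical sequence."""
    counts = Counter(zip(sequence, sequence[1:]))
    matrix = {}
    for s in states:
        row_total = sum(counts[(s, t)] for t in states)
        # A state that is never left gets an all-zero row in this sketch.
        matrix[s] = {
            t: (counts[(s, t)] / row_total if row_total else 0.0) for t in states
        }
    return matrix


# Hypothetical toy series with two categories.
seq = ["low", "low", "high", "high", "high", "low"]
P = transition_matrix(seq, ["low", "high"])
print(P)
```

In a clustering setting, each group would have its own matrix of this kind, and series would be assigned to the group whose chain explains them best.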
DEFF Research Database (Denmark)
Moskowitz, Tobias J.; Ooi, Yao Hua; Heje Pedersen, Lasse
2012-01-01
We document significant “time series momentum” in equity index, currency, commodity, and bond futures for each of the 58 liquid instruments we consider. We find persistence in returns for one to 12 months that partially reverses over longer horizons, consistent with sentiment theories of initial...... under-reaction and delayed over-reaction. A diversified portfolio of time series momentum strategies across all asset classes delivers substantial abnormal returns with little exposure to standard asset pricing factors and performs best during extreme markets. Examining the trading activities...
Statistical modeling of isoform splicing dynamics from RNA-seq time series data.
Huang, Yuanhua; Sanguinetti, Guido
2016-10-01
Isoform quantification is an important goal of RNA-seq experiments, yet it remains problematic for genes with low expression or several isoforms. These difficulties may in principle be ameliorated by exploiting correlated experimental designs, such as time series or dosage response experiments. Time series RNA-seq experiments, in particular, are becoming increasingly popular, yet there are no methods that explicitly leverage the experimental design to improve isoform quantification. Here, we present DICEseq, the first isoform quantification method tailored to correlated RNA-seq experiments. DICEseq explicitly models the correlations between different RNA-seq experiments to aid the quantification of isoforms across experiments. Numerical experiments on simulated datasets show that DICEseq yields more accurate results than state-of-the-art methods, an advantage that can become considerable at low coverage levels. On real datasets, our results show that DICEseq provides substantially more reproducible and robust quantifications, increasing the correlation of estimates from replicate datasets by up to 10% on genes with low or moderate expression levels (bottom third of all genes). Furthermore, DICEseq makes it possible to quantify the trade-off between temporal sampling of RNA and depth of sequencing, frequently an important choice when planning experiments. Our results have strong implications for the design of RNA-seq experiments, and offer a novel tool for improved analysis of such datasets. Python code is freely available at http://diceseq.sf.net. Contact: G.Sanguinetti@ed.ac.uk. Supplementary data are available at Bioinformatics online.
Michael Weber; Michaela Denk
2011-01-01
International organizations collect data from national authorities to create multivariate cross-sectional time series for their analyses. As data from countries with not yet well-established statistical systems may be incomplete, the bridging of data gaps is a crucial challenge. This paper investigates data structures and missing data patterns in the cross-sectional time series framework, reviews missing value imputation techniques used for micro data in official statistics, and discusses the...
Constructing networks from a dynamical system perspective for multivariate nonlinear time series.
Nakamura, Tomomichi; Tanizawa, Toshihiro; Small, Michael
2016-03-01
We describe a method for constructing networks for multivariate nonlinear time series. We approach the interaction between the various scalar time series from a deterministic dynamical system perspective and provide a generic and algorithmic test for whether the interaction between two measured time series is statistically significant. The method can be applied even when the data exhibit no obvious qualitative similarity: a situation in which the naive method utilizing the cross correlation function directly cannot correctly identify connectivity. To establish the connectivity between nodes we apply the previously proposed small-shuffle surrogate (SSS) method, which can investigate whether there are correlation structures in short-term variabilities (irregular fluctuations) between two data sets from the viewpoint of deterministic dynamical systems. The procedure to construct networks based on this idea is composed of three steps: (i) each time series is considered as a basic node of a network, (ii) the SSS method is applied to verify the connectivity between each pair of time series taken from the whole multivariate time series, and (iii) the pair of nodes is connected with an undirected edge when the null hypothesis cannot be rejected. The network constructed by the proposed method indicates the intrinsic (essential) connectivity of the elements included in the system or the underlying (assumed) system. The method is demonstrated for numerical data sets generated by known systems and applied to several experimental time series.
Student understanding of Taylor series expansions in statistical mechanics
Directory of Open Access Journals (Sweden)
Trevor I. Smith
2013-08-01
One goal of physics instruction is to have students learn to make physical meaning of specific mathematical expressions, concepts, and procedures in different physical settings. As part of research investigating student learning in statistical physics, we are developing curriculum materials that guide students through a derivation of the Boltzmann factor using a Taylor series expansion of entropy. Using results from written surveys, classroom observations, and both individual think-aloud and teaching interviews, we present evidence that many students can recognize and interpret series expansions, but they often lack fluency in creating and using a Taylor series appropriately, despite previous exposures in both calculus and physics courses.
Student understanding of Taylor series expansions in statistical mechanics
Smith, Trevor I.; Thompson, John R.; Mountcastle, Donald B.
2013-12-01
One goal of physics instruction is to have students learn to make physical meaning of specific mathematical expressions, concepts, and procedures in different physical settings. As part of research investigating student learning in statistical physics, we are developing curriculum materials that guide students through a derivation of the Boltzmann factor using a Taylor series expansion of entropy. Using results from written surveys, classroom observations, and both individual think-aloud and teaching interviews, we present evidence that many students can recognize and interpret series expansions, but they often lack fluency in creating and using a Taylor series appropriately, despite previous exposures in both calculus and physics courses.
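For orientation, the standard textbook form of the derivation mentioned in this abstract can be written out as follows. This is a generic sketch, not a reproduction of the authors' curriculum materials; the symbols (a small system in state $i$ with energy $E_i$, a reservoir with entropy $S_R$ and multiplicity $\Omega_R$, total energy $E$) are assumptions introduced here.

```latex
% Probability of the system state i is proportional to the reservoir multiplicity.
P(E_i) \propto \Omega_R(E - E_i) = \exp\!\left[\frac{S_R(E - E_i)}{k_B}\right]

% First-order Taylor expansion of the reservoir entropy about E, valid for E_i \ll E.
S_R(E - E_i) \approx S_R(E) - E_i \left.\frac{\partial S_R}{\partial E}\right|_{E}
             = S_R(E) - \frac{E_i}{T}

% Substituting back, the constant factor \exp[S_R(E)/k_B] cancels on normalization.
P(E_i) \propto \exp\!\left(-\frac{E_i}{k_B T}\right)
```

The key step is truncating the expansion at first order, which is justified because the reservoir is much larger than the system.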
Emerging properties of financial time series in the ``Game of Life''
Hernández-Montoya, A. R.; Coronel-Brizio, H. F.; Stevens-Ramírez, G. A.; Rodríguez-Achach, M.; Politi, M.; Scalas, E.
2011-12-01
We explore the spatial complexity of Conway’s “Game of Life,” a prototypical cellular automaton, by means of a geometrical procedure generating a two-dimensional random walk from a bidimensional lattice with periodic boundaries. The one-dimensional projection of this process is analyzed and it turns out that some of its statistical properties resemble the so-called stylized facts observed in financial time series. The scope and meaning of this result are discussed from the viewpoint of complex systems. In particular, we stress how the supposed peculiarities of financial time series are often overrated in their importance.
A multiscale view on inverse statistics and gain/loss asymmetry in financial time series
International Nuclear Information System (INIS)
Siven, Johannes; Lins, Jeffrey; Hansen, Jonas Lundbek
2009-01-01
Researchers have studied the first-passage time of financial time series and observed that the smallest time interval needed for a stock index to move a given distance is typically shorter for negative than for positive price movements. The same is not observed for the index constituents, the individual stocks. We use the discrete wavelet transform to show that this is a long, rather than short, timescale phenomenon—if enough low frequency content of the price process is removed, the asymmetry disappears. We also propose a model which explains the asymmetry in terms of prolonged, correlated downward movements of individual stocks
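The first-passage quantity behind the gain/loss asymmetry can be illustrated with a short sketch. It assumes the usual inverse-statistics setup: for each starting time, find the smallest waiting time after which the log-price has moved by at least a chosen level rho (up for gains, down for losses). The wavelet filtering step of the abstract is not shown, and the function name and toy prices are hypothetical.

```python
import math


def first_passage_times(prices, rho):
    """Waiting times until the log-return first reaches rho.

    rho > 0 looks for gains of at least rho, rho < 0 for losses of at least |rho|.
    Start times whose level is never reached are skipped.
    """
    logp = [math.log(p) for p in prices]
    times = []
    for t, start in enumerate(logp):
        for tau in range(1, len(logp) - t):
            change = logp[t + tau] - start
            if (rho > 0 and change >= rho) or (rho < 0 and change <= rho):
                times.append(tau)
                break
    return times


prices = [100, 102, 101, 105, 99, 100]  # made-up values for illustration
print(first_passage_times(prices, 0.035))   # waiting times for gains
print(first_passage_times(prices, -0.035))  # waiting times for losses
```

Comparing the distributions of the two lists over a long index series is what reveals the asymmetry: the typical waiting time for a loss of a given size is shorter than for a gain of the same size.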
Detecting structural breaks in time series via genetic algorithms
DEFF Research Database (Denmark)
Doerr, Benjamin; Fischer, Paul; Hilbert, Astrid
2016-01-01
of the time series under consideration is available. Therefore, a black-box optimization approach is our method of choice for detecting structural breaks. We describe a genetic algorithm framework which easily adapts to a large number of statistical settings. To evaluate the usefulness of different crossover...... and mutation operations for this problem, we conduct extensive experiments to determine good choices for the parameters and operators of the genetic algorithm. One surprising observation is that use of uniform and one-point crossover together gave significantly better results than using either crossover...... operator alone. Moreover, we present a specific fitness function which exploits the sparse structure of the break points and which can be evaluated particularly efficiently. The experiments on artificial and real-world time series show that the resulting algorithm detects break points with high precision...
Rodgers, Joseph Lee; Beasley, William Howard; Schuelke, Matthew
2014-01-01
Many data structures, particularly time series data, are naturally seasonal, cyclical, or otherwise circular. Past graphical methods for time series have focused on linear plots. In this article, we move graphical analysis onto the circle. We focus on 2 particular methods, one old and one new. Rose diagrams are circular histograms and can be produced in several different forms using the RRose software system. In addition, we propose, develop, illustrate, and provide software support for a new circular graphical method, called Wrap-Around Time Series Plots (WATS Plots), which is a graphical method useful to support time series analyses in general but in particular in relation to interrupted time series designs. We illustrate the use of WATS Plots with an interrupted time series design evaluating the effect of the Oklahoma City bombing on birthrates in Oklahoma County during the 10 years surrounding the bombing of the Murrah Building in Oklahoma City. We compare WATS Plots with linear time series representations and overlay them with smoothing and error bands. Each method is shown to have advantages in relation to the other; in our example, the WATS Plots more clearly show the existence and effect size of the fertility differential.
Zhang, J.; Ives, A. R.; Turner, M. G.; Kucharik, C. J.
2017-12-01
Previous studies have identified global agricultural regions where "stagnation" of long-term crop yield increases has occurred. These studies have used a variety of simple statistical methods that often ignore important aspects of time series regression modeling. These methods can lead to differing and contradictory results, which creates uncertainty regarding food security given rapid global population growth. Here, we present a new statistical framework incorporating time series-based algorithms into standard regression models to quantify spatiotemporal yield trends of US maize, soybean, and winter wheat from 1970-2016. Our primary goal was to quantify spatial differences in yield trends for these three crops using USDA county level data. This information was used to identify regions experiencing the largest changes in the rate of yield increases over time, and to determine whether abrupt shifts in the rate of yield increases have occurred. Although crop yields continue to increase in most maize-, soybean-, and winter wheat-growing areas, yield increases have stagnated in some key agricultural regions during the most recent 15 to 16 years: some maize-growing areas, except for the northern Great Plains, have shown a significant trend towards smaller annual yield increases for maize; soybean has maintained consistent long-term yield gains in the Northern Great Plains, the Midwest, and southeast US, but has experienced a shift to smaller annual increases in other regions; winter wheat maintained a moderate annual increase in eastern South Dakota and eastern US locations, but showed a decline in the magnitude of annual increases across the central Great Plains and western US regions. Our results suggest that there were abrupt shifts in the rate of annual yield increases in a variety of US regions among the three crops. The framework presented here can be broadly applied to additional yield trend analyses for different crops and regions of the Earth.
Time-series modeling: applications to long-term finfish monitoring data
International Nuclear Information System (INIS)
Bireley, L.E.
1985-01-01
The growing concern and awareness that developed during the 1970s over the effects that industry had on the environment caused the electric utility industry in particular to develop monitoring programs. These programs generate long-term series of data that are not very amenable to classical normal-theory statistical analysis. The monitoring data collected from three finfish programs (impingement, trawl and seine) at the Millstone Nuclear Power Station were typical of such series and thus were used to develop methodology that used the full extent of the information in the series. The basis of the methodology was classic Box-Jenkins time-series modeling; however, the models also included deterministic components that involved flow, season and time as predictor variables. Time entered into the models as harmonic regression terms. Of the 32 models fitted to finfish catch data, 19 were found to account for more than 70% of the historical variation. The models were then used to forecast finfish catches a year in advance and comparisons were made to actual data. Usually the confidence intervals associated with the forecasts encompassed most of the observed data. The technique can provide the basis for intervention analysis in future impact assessments.
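The phrase "harmonic regression terms" can be made concrete with a minimal sketch. It fits only a mean plus one cosine/sine pair, y(t) = a0 + a cos(2πt/P) + b sin(2πt/P), and assumes evenly spaced data covering a whole number of periods, in which case the least-squares coefficients have a closed form. It is not the Box-Jenkins model of the study; the function name, the period of 12 and the synthetic data are assumptions.

```python
import math


def fit_single_harmonic(y, period):
    """Least-squares fit of a mean plus one harmonic.

    Assumes evenly spaced observations spanning a whole number of periods
    (period >= 3), so the cosine and sine regressors are orthogonal.
    """
    n = len(y)
    if period < 3 or n == 0 or n % period != 0:
        raise ValueError("length must be a positive multiple of a period >= 3")
    angles = [2.0 * math.pi * t / period for t in range(n)]
    mean = sum(y) / n
    a = 2.0 / n * sum(v * math.cos(w) for v, w in zip(y, angles))
    b = 2.0 / n * sum(v * math.sin(w) for v, w in zip(y, angles))
    return mean, a, b


# Synthetic monthly series with a known seasonal cycle (made-up numbers).
monthly = [
    5.0 + 2.0 * math.cos(2.0 * math.pi * t / 12) + 3.0 * math.sin(2.0 * math.pi * t / 12)
    for t in range(24)
]
print(fit_single_harmonic(monthly, 12))
```

In a fuller model, the fitted seasonal component would be combined with other predictors and an ARIMA error structure.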
Visibility Graph Based Time Series Analysis.
Stephen, Mutua; Gu, Changgui; Yang, Huijie
2015-01-01
Network based time series analysis has made considerable achievements in recent years. By mapping mono/multivariate time series into networks, one can investigate both its microscopic and macroscopic behaviors. However, most proposed approaches lead to the construction of static networks, consequently providing limited information on evolutionary behaviors. In the present paper we propose a method called visibility graph based time series analysis, in which series segments are mapped to visibility graphs as being descriptions of the corresponding states and the successively occurring states are linked. This procedure converts a time series to a temporal network and at the same time a network of networks. Findings from empirical records for stock markets in USA (S&P500 and Nasdaq) and artificial series generated by means of fractional Gaussian motions show that the method can provide rich information benefiting short-term and long-term predictions. Theoretically, we propose a method to investigate time series from the viewpoint of network of networks.
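The mapping from a series segment to a visibility graph can be sketched as follows. The sketch assumes the standard natural visibility criterion: two observations are linked if every observation between them lies strictly below the straight line joining them. It covers a single segment only, not the linking of successive states described in the abstract, and the function name is hypothetical.

```python
def visibility_edges(y):
    """Edges (a, b) of the natural visibility graph of an evenly sampled series."""
    n = len(y)
    edges = []
    for a in range(n - 1):
        for b in range(a + 1, n):
            # Every intermediate point must lie strictly below the line from a to b.
            if all(
                y[c] < y[a] + (y[b] - y[a]) * (c - a) / (b - a)
                for c in range(a + 1, b)
            ):
                edges.append((a, b))
    return edges


print(visibility_edges([1, 3, 2, 4]))
```

This brute-force version costs on the order of n cubed operations, which is acceptable for short segments but not for long records.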
Visibility Graph Based Time Series Analysis.
Directory of Open Access Journals (Sweden)
Mutua Stephen
Network based time series analysis has made considerable achievements in recent years. By mapping mono/multivariate time series into networks, one can investigate both its microscopic and macroscopic behaviors. However, most proposed approaches lead to the construction of static networks, consequently providing limited information on evolutionary behaviors. In the present paper we propose a method called visibility graph based time series analysis, in which series segments are mapped to visibility graphs as being descriptions of the corresponding states and the successively occurring states are linked. This procedure converts a time series to a temporal network and at the same time a network of networks. Findings from empirical records for stock markets in USA (S&P500 and Nasdaq) and artificial series generated by means of fractional Gaussian motions show that the method can provide rich information benefiting short-term and long-term predictions. Theoretically, we propose a method to investigate time series from the viewpoint of network of networks.
Trend analysis using non-stationary time series clustering based on the finite element method
Gorji Sefidmazgi, M.; Sayemuzzaman, M.; Homaifar, A.; Jha, M. K.; Liess, S.
2014-05-01
In order to analyze low-frequency variability of climate, it is useful to model the climatic time series with multiple linear trends and locate the times of significant changes. In this paper, we have used non-stationary time series clustering to find change points in the trends. Clustering in a multi-dimensional non-stationary time series is challenging, since the problem is mathematically ill-posed. Clustering based on the finite element method (FEM) is one of the methods that can analyze multidimensional time series. One important attribute of this method is that it is not dependent on any statistical assumption and does not need local stationarity in the time series. In this paper, it is shown how the FEM-clustering method can be used to locate change points in the trend of temperature time series from in situ observations. This method is applied to the temperature time series of North Carolina (NC) and the results represent region-specific climate variability despite higher frequency harmonics in climatic time series. Next, we investigated the relationship between the climatic indices with the clusters/trends detected based on this clustering method. It appears that the natural variability of climate change in NC during 1950-2009 can be explained mostly by AMO and solar activity.
The Usage of Time Series Control Charts for Financial Process Analysis
Directory of Open Access Journals (Sweden)
Kovářík Martin
2012-09-01
We deal with the financial proceedings of a company using methods of SPC (Statistical Process Control), specifically time series control charts. The paper outlines the intersection of two disciplines, econometrics and statistical process control. The theoretical part discusses the methodology of time series control charts, and the research part demonstrates this methodology in three case studies. The first study focuses on the regulation of simulated financial flows for a company by a CUSUM control chart. The second study involves the regulation of financial flows for a heteroskedastic financial process by an EWMA control chart. The last case study is devoted to applications of ARIMA, EWMA and CUSUM control charts to financial data that are sensitive to shifts in the mean, while calculating the autocorrelation in the data. In this paper, we highlight the versatility of control charts not only in manufacturing but also in managing the financial stability of cash flows.
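As a rough illustration of one of the charts named above, the sketch below implements a basic two-sided tabular CUSUM. It is not taken from the paper: the target value, the allowance k and the decision interval h are illustrative parameters, the data are made up, and no correction for autocorrelation is applied.

```python
def cusum_alarms(x, target, k, h):
    """Indices at which a two-sided tabular CUSUM exceeds the decision interval h.

    upper accumulates deviations above target + k, lower those below target - k.
    """
    upper = lower = 0.0
    alarms = []
    for i, value in enumerate(x):
        upper = max(0.0, upper + (value - target) - k)
        lower = max(0.0, lower + (target - value) - k)
        if upper > h or lower > h:
            alarms.append(i)
    return alarms


# Made-up cash-flow series whose mean shifts from 10 to 12 at index 3.
flows = [10, 10, 10, 12, 12, 12, 12]
print(cusum_alarms(flows, target=10, k=0.5, h=4))
```

Because the statistic accumulates small deviations, the shift is flagged a few observations after it starts rather than at the first shifted value.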
Ratio-based lengths of intervals to improve fuzzy time series forecasting.
Huarng, Kunhuang; Yu, Tiffany Hui-Kuang
2006-04-01
The objective of this study is to explore ways of determining the useful lengths of intervals in fuzzy time series. It is suggested that ratios, instead of equal lengths of intervals, can more properly represent the intervals among observations. Ratio-based lengths of intervals are, therefore, proposed to improve fuzzy time series forecasting. Algebraic growth data, such as enrollments and the stock index, and exponential growth data, such as inventory demand, are chosen as the forecasting targets, before forecasting based on the various lengths of intervals is performed. Furthermore, sensitivity analyses are also carried out for various percentiles. The ratio-based lengths of intervals are found to outperform the effective lengths of intervals, as well as the arbitrary ones in regard to the different statistical measures. The empirical analysis suggests that the ratio-based lengths of intervals can also be used to improve fuzzy time series forecasting.
He, Yuning
2015-01-01
Safety of unmanned aerial systems (UAS) is paramount, but the large number of dynamically changing controller parameters makes it hard to determine if the system is currently stable, and the time before loss of control if not. We propose a hierarchical statistical model using Treed Gaussian Processes to predict (i) whether a flight will be stable (success) or become unstable (failure), (ii) the time-to-failure if unstable, and (iii) time series outputs for flight variables. We first classify the current flight input into success or failure types, and then use separate models for each class to predict the time-to-failure and time series outputs. As different inputs may cause failures at different times, we have to model variable length output curves. We use a basis representation for curves and learn the mappings from input to basis coefficients. We demonstrate the effectiveness of our prediction methods on a NASA neuro-adaptive flight control system.
Energy Technology Data Exchange (ETDEWEB)
Sanchez Merino, G.; Cortes Rpdicio, J.; Lope Lope, R.; Martin Gonzalez, T.; Garcia Fidalgo, M. A.
2013-07-01
The aim of the present work is to study the dependence of temporal resolution on activity, using statistical techniques applied to the time series of temporal resolution measurements taken during daily equipment checks. (Author)
Liu, Wensong; Yang, Jie; Zhao, Jinqi; Shi, Hongtao; Yang, Le
2018-02-12
The traditional unsupervised change detection methods based on the pixel level can only detect the changes between two different times with the same sensor, and the results are easily affected by speckle noise. In this paper, a novel method is proposed to detect change based on time-series data from different sensors. Firstly, the overall difference image of the time-series PolSAR is calculated by omnibus test statistics, and difference images between any two images at different times are acquired by $R_j$ test statistics. Secondly, the difference images are segmented with a Generalized Statistical Region Merging (GSRM) algorithm which can suppress the effect of speckle noise. A Generalized Gaussian Mixture Model (GGMM) is then used to obtain the time-series change detection maps in the final step of the proposed method. To verify the effectiveness of the proposed method, we carried out a change detection experiment using time-series PolSAR images acquired by Radarsat-2 and Gaofen-3 over the city of Wuhan, in China. Results show that the proposed method can not only detect the time-series change from different sensors, but it can also better suppress the influence of speckle noise and improve the overall accuracy and Kappa coefficient.
The application of complex network time series analysis in turbulent heated jets
International Nuclear Information System (INIS)
Charakopoulos, A. K.; Karakasidis, T. E.; Liakopoulos, A.; Papanicolaou, P. N.
2014-01-01
In the present study, we applied the methodology of the complex network-based time series analysis to experimental temperature time series from a vertical turbulent heated jet. More specifically, we approach the hydrodynamic problem of discriminating time series corresponding to various regions relative to the jet axis, i.e., time series corresponding to regions that are close to the jet axis from time series originating at regions with a different dynamical regime based on the constructed network properties. Applying the transformation phase space method (k nearest neighbors) and also the visibility algorithm, we transformed time series into networks and evaluated the topological properties of the networks such as degree distribution, average path length, diameter, modularity, and clustering coefficient. The results show that the complex network approach allows distinguishing, identifying, and exploring in detail various dynamical regions of the jet flow, and associating them with the corresponding physical behavior. In addition, in order to reject the hypothesis that the studied networks originate from a stochastic process, we generated random networks and compared their statistical properties with those of the networks originating from the experimental data. As far as the efficiency of the two methods for network construction is concerned, we conclude that both methodologies lead to network properties that present almost the same qualitative behavior and allow us to reveal the underlying system dynamics.
Recurrence Density Enhanced Complex Networks for Nonlinear Time Series Analysis
Costa, Diego G. De B.; Reis, Barbara M. Da F.; Zou, Yong; Quiles, Marcos G.; Macau, Elbert E. N.
We introduce a new method, entitled Recurrence Density Enhanced Complex Network (RDE-CN), to properly analyze nonlinear time series. Our method first transforms a recurrence plot into a figure with a reduced number of points that still preserves the main and fundamental recurrence properties of the original plot. This resulting figure is then reinterpreted as a complex network, which is further characterized by network statistical measures. We illustrate the computational power of the RDE-CN approach on time series from both the logistic map and experimental fluid flows, which show that our method distinguishes different dynamics as well as traditional recurrence analysis. Therefore, the proposed methodology characterizes the recurrence matrix adequately, while using a reduced set of points from the original recurrence plots.
Network structure of multivariate time series.
Lacasa, Lucas; Nicosia, Vincenzo; Latora, Vito
2015-10-21
Our understanding of a variety of phenomena in physics, biology and economics crucially depends on the analysis of multivariate time series. While a wide range of tools and techniques for time series analysis already exist, the increasing availability of massive data structures calls for new approaches for multidimensional signal processing. We present here a non-parametric method to analyse multivariate time series, based on the mapping of a multidimensional time series into a multilayer network, which allows one to extract information on a high dimensional dynamical system through the analysis of the structure of the associated multiplex network. The method is simple to implement, general, scalable, does not require ad hoc phase space partitioning, and is thus suitable for the analysis of large, heterogeneous and non-stationary time series. We show that simple structural descriptors of the associated multiplex networks allow one to extract and quantify nontrivial properties of coupled chaotic maps, including the transition between different dynamical phases and the onset of various types of synchronization. As a concrete example we then study financial time series, showing that a multiplex network analysis can efficiently discriminate crises from periods of financial stability, where standard methods based on time-series symbolization often fail.
Horváth, Csilla; Kornelis, Marcel; Leeflang, Peter S.H.
2002-01-01
In this review, we give a comprehensive summary of time series techniques in marketing, and discuss a variety of time series analysis (TSA) techniques and models. We classify them in the sets (i) univariate TSA, (ii) multivariate TSA, and (iii) multiple TSA. We provide relevant marketing
Self-potential time series analysis in a seismic area of the Southern Apennines: preliminary results
Di Bello, G.; Lapenna, V.; Satriano, C.; Tramutoli, V.
1994-01-01
The self-potential time series recorded during the period May 1991 - August 1992 by an automatic station, located in a seismic area of Southern Apennines, is analyzed. We deal with the spectral and the statistical features of the electrotelluric precursors: they can play a major role in the approach to seismic prediction. The time-dynamics of the experimental time series is investigated, the cyclic components and the time trends are removed. In particular we consider the influence of external...
RankExplorer: Visualization of Ranking Changes in Large Time Series Data.
Shi, Conglei; Cui, Weiwei; Liu, Shixia; Xu, Panpan; Chen, Wei; Qu, Huamin
2012-12-01
For many applications involving time series data, people are often interested in the changes of item values over time as well as their ranking changes. For example, people search many words via search engines like Google and Bing every day. Analysts are interested in both the absolute searching number for each word as well as their relative rankings. Both sets of statistics may change over time. For very large time series data with thousands of items, how to visually present ranking changes is an interesting challenge. In this paper, we propose RankExplorer, a novel visualization method based on ThemeRiver to reveal the ranking changes. Our method consists of four major components: 1) a segmentation method which partitions a large set of time series curves into a manageable number of ranking categories; 2) an extended ThemeRiver view with embedded color bars and changing glyphs to show the evolution of aggregation values related to each ranking category over time as well as the content changes in each ranking category; 3) a trend curve to show the degree of ranking changes over time; 4) rich user interactions to support interactive exploration of ranking changes. We have applied our method to some real time series data and the case studies demonstrate that our method can reveal the underlying patterns related to ranking changes which might otherwise be obscured in traditional visualizations.
International Nuclear Information System (INIS)
Bolzani, M.J.A.; Guarnieri, F.L.; Vieira, Paulo Cesar
2009-01-01
Wavelet analysis of turbulent flows has become increasingly popular. However, the geometric characteristics of wavelet functions are still poorly explored. In this work we compare the performance of two wavelet functions in extracting the coherent structures from solar wind velocity time series. The data series are from the years 1996 to 2002 (except 1998 and 1999). The wavelet algorithm decomposes the annual time series into two components, the coherent part and the non-coherent part, using the Daubechies-4 and Haar wavelet functions. The threshold assumed is based on a percentage of the maximum variance found in each dyadic scale. After the extraction procedure, we computed the power spectral density of the original time series and of the coherent time series to obtain spectral indices. The spectral indices show higher values for the coherent part obtained with Daubechies-4 than for that obtained with the Haar wavelet function. Using the kurtosis of the coherent and non-coherent time series, it was possible to conjecture that the differences found between the two wavelet functions may be associated with their geometric forms. (author)
Arevalo, P. A.; Olofsson, P.; Woodcock, C. E.
2017-12-01
Unbiased estimation of the areas of conversion between land categories ("activity data") and their uncertainty is crucial for providing more robust calculations of carbon emissions to the atmosphere, as well as their removals. This is particularly important for the REDD+ mechanism of UNFCCC where an economic compensation is tied to the magnitude and direction of such fluxes. Dense time series of Landsat data and statistical protocols are becoming an integral part of forest monitoring efforts, but there are relatively few studies in the tropics focused on using these methods to advance operational MRV systems (Monitoring, Reporting and Verification). We present the results of a prototype methodology for continuous monitoring and unbiased estimation of activity data that is compliant with the IPCC Approach 3 for representation of land. We used a break detection algorithm (Continuous Change Detection and Classification, CCDC) to fit pixel-level temporal segments to time series of Landsat data in the Colombian Amazon. The segments were classified using a Random Forest classifier to obtain annual maps of land categories between 2001 and 2016. Using these maps, a biannual stratified sampling approach was implemented and unbiased stratified estimators constructed to calculate area estimates with confidence intervals for each of the stable and change classes. Our results provide evidence of a decrease in primary forest as a result of conversion to pastures, as well as increase in secondary forest as pastures are abandoned and the forest allowed to regenerate. Estimating areas of other land transitions proved challenging because of their very small mapped areas compared to stable classes like forest, which corresponds to almost 90% of the study area. Implications on remote sensing data processing, sample allocation and uncertainty reduction are also discussed.
Data mining in time series databases
Kandel, Abraham; Bunke, Horst
2004-01-01
Adding the time dimension to real-world databases produces Time Series Databases (TSDB) and introduces new aspects and difficulties to data mining and knowledge discovery. This book covers the state-of-the-art methodology for mining time series databases. The novel data mining methods presented in the book include techniques for efficient segmentation, indexing, and classification of noisy and dynamic time series. A graph-based method for anomaly detection in time series is described and the book also studies the implications of a novel and potentially useful representation of time series as strings. The problem of detecting changes in data mining models that are induced from temporal databases is additionally discussed.
Volatility behavior of visibility graph EMD financial time series from Ising interacting system
Zhang, Bo; Wang, Jun; Fang, Wen
2015-08-01
A financial market dynamics model is developed and investigated using a stochastic Ising system, where the Ising model is the most popular ferromagnetic model in statistical physics. Applying two graph-based analyses and the multiscale entropy method, we investigate and compare the statistical volatility behavior of return time series and the corresponding IMF series derived from the empirical mode decomposition (EMD) method. Real stock market indices are studied for comparison with the simulation data of the proposed model. Further, we find that the degree distribution of the visibility graph for the simulation series has power-law tails, and the assortative network exhibits the mixing pattern property. All these features are in agreement with the real market data, which confirms that the financial model established by the Ising system is reasonable.
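The visibility-graph construction mentioned above can be illustrated with a minimal sketch. It assumes the standard natural visibility criterion (two samples are linked when every sample between them lies strictly below the straight line joining them), which the abstract does not spell out. The function names are hypothetical, and the brute-force loop is only meant for short series.

```python
def natural_visibility_graph(series):
    """Return the edge set of the natural visibility graph of a series.

    Nodes are sample indices. Samples a < b are linked when every
    intermediate sample c lies strictly below the line from (a, y_a)
    to (b, y_b). Adjacent samples are always linked.
    """
    n = len(series)
    edges = set()
    for a in range(n - 1):
        for b in range(a + 1, n):
            ya, yb = series[a], series[b]
            visible = all(
                series[c] < yb + (ya - yb) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.add((a, b))
    return edges


def degree_sequence(n, edges):
    """Node degrees, the input to a degree-distribution analysis."""
    degrees = [0] * n
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees
```

A histogram of `degree_sequence(len(x), natural_visibility_graph(x))` for a return series `x` would give the empirical degree distribution whose tail behavior the abstract discusses.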
Wu, Zi Yi; Xie, Ping; Sang, Yan Fang; Gu, Hai Ting
2018-04-01
The phenomenon of jump is one of the important external forms of hydrological variability under environmental changes, representing the adaptation of hydrological nonlinear systems to the influence of external disturbances. Presently, the related studies mainly focus on the methods for identifying the jump positions and jump times in hydrological time series. In contrast, few studies have focused on the quantitative description and classification of jump degree in hydrological time series, which makes it difficult to understand the environmental changes and evaluate their potential impacts. Here, we propose a theoretically reliable and easy-to-apply method for the classification of jump degree in hydrological time series, using the correlation coefficient as a basic index. Statistical tests verified the accuracy, reasonability, and applicability of this method. The relationship between the correlation coefficient and the jump degree of a series was described by a mathematical equation obtained by derivation. After that, several thresholds of the correlation coefficient under different statistical significance levels were chosen, based on which the jump degree could be classified into five levels: no, weak, moderate, strong and very strong. Finally, our method was applied to five different observed hydrological time series, with diverse geographic and hydrological conditions in China. The classification of jump degrees in those series closely accorded with their physical hydrological mechanisms, indicating the practicability of our method.
Ghosh, Sayantan; Manimaran, P.; Panigrahi, Prasanta K.
2011-11-01
We make use of the wavelet transform to study the multi-scale, self-similar behavior, and deviations thereof, in the stock prices of large companies belonging to different economic sectors. The stock market returns exhibit multi-fractal characteristics, with some of the companies showing deviations at small and large scales. The fact that wavelets belonging to the Daubechies (Db) basis enable one to isolate local polynomial trends of different degrees plays the key role in isolating fluctuations at different scales. One of the primary motivations of this work is to study the emergence of the k^{-3} behavior [X. Gabaix, P. Gopikrishnan, V. Plerou, H. Stanley, A theory of power law distributions in financial market fluctuations, Nature 423 (2003) 267-270] of the fluctuations, starting with high-frequency fluctuations. We make use of Db4 and Db6 basis sets to isolate local linear and quadratic trends, respectively, at different scales in order to study the statistical characteristics of these financial time series. The fluctuations reveal fat-tailed non-Gaussian behavior and unstable periodic modulations at finer scales, from which the characteristic k^{-3} power law behavior emerges at sufficiently large scales. We further identify stable periodic behavior through the continuous Morlet wavelet.
DEFF Research Database (Denmark)
Fischer, Paul; Hilbert, Astrid
2012-01-01
We introduce a platform which supplies an easy-to-handle, interactive, extendable, and fast analysis tool for time series analysis. In contrast to other software suites like Maple, Matlab, or R, which use a command-line-like interface and where the user has to memorize/look-up the appropriate...... commands, our application is select-and-click-driven. It allows to derive many different sequences of deviations for a given time series and to visualize them in different ways in order to judge their expressive power and to reuse the procedure found. For many transformations or model-fits, the user may...... choose between manual and automated parameter selection. The user can define new transformations and add them to the system. The application contains efficient implementations of advanced and recent techniques for time series analysis including techniques related to extreme value analysis and filtering...
Meshram, Sarita Gajbhiye; Singh, Sudhir Kumar; Meshram, Chandrashekhar; Deo, Ravinesh C.; Ambade, Balram
2017-12-01
Trend analysis of long-term rainfall records can be used to facilitate better agricultural water management decisions and climate risk studies. The main objective of this study was to identify the existing trends in the long-term rainfall time series over the period 1901-2010 utilizing 12 hydrological stations located in the Ken River basin (KRB) in Madhya Pradesh, India. To investigate the different trends, the rainfall time series data were divided into annual and seasonal (i.e., pre-monsoon, monsoon, post-monsoon, and winter season) sub-sets, and a statistical analysis of the data using the non-parametric Mann-Kendall (MK) test and the Sen's slope approach was applied to identify the nature of the existing trends in rainfall series for the Ken River basin. The obtained results were further interpolated with the aid of the Quantum Geographic Information System (GIS) approach employing the inverse distance weighted approach. The results showed that the monsoon and the winter season exhibited a negative trend in rainfall changes over the period of study, and this was true for all stations, although the changes during the pre- and the post-monsoon seasons were less significant. The outcomes of this research study also suggest significant decreases in the seasonal and annual trends of rainfall amounts in the study period. These findings, showing a clear signature of climate change impacts on the KRB region, potentially have implications for the climate risk management strategies to be developed during major growing and harvesting seasons, and can aid the appropriate water resource management strategies that must be implemented in the decision-making process.
A Review of Subsequence Time Series Clustering
Directory of Open Access Journals (Sweden)
Zolhavarieh, Seyedjamal; Aghabozorgi, Saeed; Teh, Ying Wah
2014-01-01
Clustering of subsequence time series remains an open issue in time series clustering. Subsequence time series clustering is used in different fields, such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. One of the useful fields in the domain of subsequence time series clustering is pattern recognition. To improve this field, a sequence of time series data is used. This paper reviews some definitions and backgrounds related to subsequence time series clustering. The categorization of the literature reviews is divided into three groups: preproof, interproof, and postproof period. Moreover, various state-of-the-art approaches in performing subsequence time series clustering are discussed under each of the following categories. The strengths and weaknesses of the employed methods are evaluated as potential issues for future studies. PMID:25140332
Analysis of Heavy-Tailed Time Series
DEFF Research Database (Denmark)
Xie, Xiaolei
This thesis is about analysis of heavy-tailed time series. We discuss tail properties of real-world equity return series and investigate the possibility that a single tail index is shared by all return series of actively traded equities in a market. Conditions for this hypothesis to be true...... are identified. We study the eigenvalues and eigenvectors of sample covariance and sample auto-covariance matrices of multivariate heavy-tailed time series, and particularly for time series with very high dimensions. Asymptotic approximations of the eigenvalues and eigenvectors of such matrices are found...... and expressed in terms of the parameters of the dependence structure, among others. Furthermore, we study an importance sampling method for estimating rare-event probabilities of multivariate heavy-tailed time series generated by matrix recursion. We show that the proposed algorithm is efficient in the sense...
Adaptive time-variant models for fuzzy-time-series forecasting.
Wong, Wai-Keung; Bai, Enjian; Chu, Alice Wai-Ching
2010-12-01
A fuzzy time series has been applied to the prediction of enrollment, temperature, stock indices, and other domains. Related studies mainly focus on three factors, namely, the partition of discourse, the content of forecasting rules, and the methods of defuzzification, all of which greatly influence the prediction accuracy of forecasting models. These studies use fixed analysis window sizes for forecasting. In this paper, an adaptive time-variant fuzzy-time-series forecasting model (ATVF) is proposed to improve forecasting accuracy. The proposed model automatically adapts the analysis window size of fuzzy time series based on the prediction accuracy in the training phase and uses heuristic rules to generate forecasting values in the testing phase. The performance of the ATVF model is tested using both simulated and actual time series including the enrollments at the University of Alabama, Tuscaloosa, and the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX). The experiment results show that the proposed ATVF model achieves a significant improvement in forecasting accuracy as compared to other fuzzy-time-series forecasting models.
Determining the Points of Change in Time Series of Polarimetric SAR Data
DEFF Research Database (Denmark)
Conradsen, Knut; Nielsen, Allan Aasbjerg; Skriver, Henning
2016-01-01
We present the likelihood ratio test statistic for the homogeneity of several complex variance–covariance matrices that may be used in order to assess whether at least one change has taken place in a time series of SAR data. Furthermore, we give a factorization of this test statistic into a produ....... The pixelwise analyses are applied on homogeneous subareas covered with different vegetation types using the distribution of the observed p-values....
Energy Technology Data Exchange (ETDEWEB)
Heyen, H. [GKSS-Forschungszentrum Geesthacht GmbH (Germany). Inst. fuer Gewaesserphysik
1998-12-31
A multivariate statistical approach is presented that allows a systematic search for relationships between the interannual variability in climate records and ecological time series. Statistical models are built between climatological predictor fields and the variables of interest. Relationships are sought on different temporal scales and for different seasons and time lags. The possibilities and limitations of this approach are discussed in four case studies dealing with salinity in the German Bight, abundance of zooplankton at Helgoland Roads, macrofauna communities off Norderney and the arrival of migratory birds on Helgoland. (orig.) [German] A statistical multivariate model is presented that allows a systematic search for potential relationships between variability in climate and ecological time series. Using four application examples, the influence of climate on salinity in the German Bight, zooplankton off Helgoland, macrofauna off Norderney, and the arrival of migratory birds on Helgoland is examined. (orig.)
Topological data analysis of financial time series: Landscapes of crashes
Gidea, Marian; Katz, Yuri
2018-02-01
We explore the evolution of daily returns of four major US stock market indices during the technology crash of 2000 and the financial crisis of 2007-2009. Our methodology is based on topological data analysis (TDA). We use persistent homology to detect and quantify topological patterns that appear in multidimensional time series. Using a sliding window, we extract time-dependent point cloud data sets, to which we associate a topological space. We detect transient loops that appear in this space, and we measure their persistence. This is encoded in real-valued functions referred to as 'persistence landscapes'. We quantify the temporal changes in persistence landscapes via their Lp-norms. We test this procedure on multidimensional time series generated by various non-linear and non-equilibrium models. We find that, in the vicinity of financial meltdowns, the Lp-norms exhibit strong growth prior to the primary peak, which ascends during a crash. Remarkably, the average spectral density at low frequencies of the time series of Lp-norms of the persistence landscapes demonstrates a strong rising trend for 250 trading days prior to either the dotcom crash on 03/10/2000 or the Lehman bankruptcy on 09/15/2008. Our study suggests that TDA provides a new type of econometric analysis, which complements the standard statistical measures. The method can be used to detect early warning signals of imminent market crashes. We believe that this approach can be used beyond the analysis of financial time series presented here.
Time Series Analysis and Forecasting by Example
Bisgaard, Soren
2011-01-01
An intuition-based approach enables you to master time series analysis with ease. Time Series Analysis and Forecasting by Example provides the fundamental techniques in time series analysis using various examples. By introducing necessary theory through examples that showcase the discussed topics, the authors successfully help readers develop an intuitive understanding of seemingly complicated time series models and their implications. The book presents methodologies for time series analysis in a simplified, example-based approach. Using graphics, the authors discuss each presented example in
Time series with tailored nonlinearities
Räth, C.; Laut, I.
2015-10-01
It is demonstrated how to generate time series with tailored nonlinearities by inducing well-defined constraints on the Fourier phases. Correlations between the phase information of adjacent phases and (static and dynamic) measures of nonlinearities are established and their origin is explained. By applying a set of simple constraints on the phases of an originally linear and uncorrelated Gaussian time series, the observed scaling behavior of the intensity distribution of empirical time series can be reproduced. The power law character of the intensity distributions being typical for, e.g., turbulence and financial data can thus be explained in terms of phase correlations.
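The general idea of acting only on the Fourier phases can be sketched as follows. The specific constraint used here (each phase equals the previous one plus a fixed step and a small Gaussian jitter) is a hypothetical stand-in, since the abstract does not state the authors' constraints. The point of the sketch is only that the amplitudes, and hence the power spectrum and all linear correlations, are left untouched. A naive O(N^2) transform keeps the example dependency-free.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def constrain_phases(x, step, jitter, rng):
    """Keep Fourier amplitudes of x, replace phases by correlated ones.

    Hypothetical constraint: phase[k + 1] = phase[k] + step + noise.
    """
    n = len(x)
    spectrum = dft(x)
    out = [0j] * n
    out[0] = spectrum[0]  # zero-frequency term (the mean) is kept
    phase = rng.uniform(-math.pi, math.pi)
    for k in range(1, (n + 1) // 2):
        out[k] = abs(spectrum[k]) * cmath.exp(1j * phase)
        out[n - k] = out[k].conjugate()  # Hermitian symmetry gives a real series
        phase += step + rng.gauss(0.0, jitter)
    if n % 2 == 0:
        out[n // 2] = spectrum[n // 2]  # Nyquist term is kept
    return [value.real for value in idft(out)]
```

Starting from Gaussian white noise, a small `jitter` yields strongly correlated adjacent phases, while a large one approaches an ordinary phase-randomized surrogate.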
Clustering of financial time series
D'Urso, Pierpaolo; Cappelli, Carmela; Di Lallo, Dario; Massari, Riccardo
2013-05-01
This paper addresses the topic of classifying financial time series in a fuzzy framework, proposing two fuzzy clustering models, both based on GARCH models. In general, clustering of financial time series, due to their peculiar features, needs the definition of suitable distance measures. To this end, the first fuzzy clustering model exploits the autoregressive representation of GARCH models and employs, in the framework of a partitioning around medoids algorithm, the classical autoregressive metric. The second fuzzy clustering model, also based on the partitioning around medoids algorithm, uses the Caiado distance, a Mahalanobis-like distance based on estimated GARCH parameters and covariances that takes into account the information about the volatility structure of the time series. In order to illustrate the merits of the proposed fuzzy approaches, an application to the problem of classifying 29 time series of Euro exchange rates against international currencies is presented and discussed, also comparing the fuzzy models with their crisp versions.
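The autoregressive metric in the first model can be illustrated for the simplest case. The sketch below assumes GARCH(1,1) processes, for which the squared series has an AR(infinity) representation with weights pi_j = alpha * beta^(j-1), and it takes the Euclidean distance between two truncated weight sequences. The truncation length and function names are assumptions made for illustration, and the paper's exact formulation may differ.

```python
import math


def garch11_ar_weights(alpha, beta, n_terms=200):
    """AR(infinity) weights of the squared series of a GARCH(1,1) model.

    From sigma_t^2 = omega + alpha * e_{t-1}^2 + beta * sigma_{t-1}^2 it
    follows that pi_j = alpha * beta**(j - 1) for j = 1, 2, ...
    The sequence is truncated at n_terms (an illustrative choice).
    """
    return [alpha * beta ** (j - 1) for j in range(1, n_terms + 1)]


def ar_metric(params_a, params_b, n_terms=200):
    """Euclidean distance between the AR weights of two GARCH(1,1) models.

    Each argument is an (alpha, beta) pair of estimated parameters.
    """
    wa = garch11_ar_weights(*params_a, n_terms=n_terms)
    wb = garch11_ar_weights(*params_b, n_terms=n_terms)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(wa, wb)))
```

A matrix of such pairwise distances between fitted models is the kind of input a partitioning around medoids procedure would consume.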
Multivariate stochastic analysis for Monthly hydrological time series at Cuyahoga River Basin
Zhang, L.
2011-12-01
Copulas have become a very powerful statistical and stochastic methodology for multivariate analysis in environmental and water resources engineering. In recent years, the popular one-parameter Archimedean copulas (e.g., the Gumbel-Hougaard, Cook-Johnson, and Frank copulas) and the meta-elliptical copulas (e.g., the Gaussian and Student-t copulas) have been applied in multivariate hydrological analyses, e.g. multivariate rainfall (rainfall intensity, duration and depth), flood (peak discharge, duration and volume), and drought analyses (drought length, mean and minimum SPI values, and drought mean areal extent). Copulas have also been applied in flood frequency analysis at the confluences of river systems by taking into account the dependence among upstream gauge stations rather than by using the hydrological routing technique. In most of the studies above, the annual time series have been considered as stationary signals and assumed to be independent identically distributed (i.i.d.) random variables. But in reality, hydrological time series, especially daily and monthly hydrological time series, cannot be considered i.i.d. random variables due to the periodicity in the data structure. The stationarity assumption is also in question due to climate change and Land Use and Land Cover (LULC) change in the past years. To this end, it is necessary to re-evaluate the classic approach to the study of hydrological time series by relaxing the stationarity assumption through a nonstationary approach. As for the study of the dependence structure of the hydrological time series, the assumption of the same type of univariate distribution also needs to be relaxed by adopting copula theory. In this paper, the univariate monthly hydrological time series will be studied through the nonstationary time series analysis approach. The dependence structure of the multivariate monthly hydrological time series will be
A Seasonal Time-Series Model Based on Gene Expression Programming for Predicting Financial Distress.
Cheng, Ching-Hsue; Chan, Chia-Pang; Yang, Jun-He
2018-01-01
Financial distress prediction is an important and challenging research topic in the financial field. Many methods currently exist for predicting firm bankruptcy and financial crisis, including artificial intelligence and traditional statistical methods, and past studies have shown that the prediction results of the artificial intelligence methods are better than those of the traditional statistical methods. Financial statements are quarterly reports; hence, the financial crisis of companies is seasonal time-series data, and the attribute data affecting the financial distress of companies is nonlinear and nonstationary time-series data with fluctuations. Therefore, this study employed a nonlinear attribute selection method to build a nonlinear financial distress prediction model: that is, this paper proposed a novel seasonal time-series gene expression programming model for predicting the financial distress of companies. The proposed model has several advantages, including the following: (i) the proposed model differs from previous models, which lack the concept of time series; (ii) the proposed integrated attribute selection method can find the core attributes and reduce high-dimensional data; and (iii) the proposed model can generate the rules and mathematical formulas of financial distress to provide references for investors and decision makers. The results show that the proposed method is better than the listed classifiers under three criteria; hence, the proposed model has competitive advantages in predicting the financial distress of companies.
Detecting switching and intermittent causalities in time series
Zanin, Massimiliano; Papo, David
2017-04-01
During the last decade, complex network representations have emerged as a powerful instrument for describing the cross-talk between different brain regions both at rest and as subjects are carrying out cognitive tasks, in healthy brains and neurological pathologies. The transient nature of such cross-talk has nevertheless by and large been neglected, mainly due to the inherent limitations of some metrics, e.g., causality ones, which require a long time series in order to yield statistically significant results. Here, we present a methodology to account for intermittent causal coupling in neural activity, based on the identification of non-overlapping windows within the original time series in which the causality is strongest. The result is a less coarse-grained assessment of the time-varying properties of brain interactions, which can be used to create a high temporal resolution time-varying network. We apply the proposed methodology to the analysis of the brain activity of control subjects and alcoholic patients performing an image recognition task. Our results show that short-lived, intermittent, local-scale causality is better at discriminating both groups than global network metrics. These results highlight the importance of the transient nature of brain activity, at least under some pathological conditions.
Donges, Jonathan; Heitzig, Jobst; Beronov, Boyan; Wiedermann, Marc; Runge, Jakob; Feng, Qing Yi; Tupikina, Liubov; Stolbova, Veronika; Donner, Reik; Marwan, Norbert; Dijkstra, Henk; Kurths, Jürgen
2016-04-01
We introduce the pyunicorn (Pythonic unified complex network and recurrence analysis toolbox) open source software package for applying and combining modern methods of data analysis and modeling from complex network theory and nonlinear time series analysis. pyunicorn is a fully object-oriented and easily parallelizable package written in the language Python. It allows for the construction of functional networks such as climate networks in climatology or functional brain networks in neuroscience representing the structure of statistical interrelationships in large data sets of time series and, subsequently, investigating this structure using advanced methods of complex network theory such as measures and models for spatial networks, networks of interacting networks, node-weighted statistics, or network surrogates. Additionally, pyunicorn provides insights into the nonlinear dynamics of complex systems as recorded in uni- and multivariate time series from a non-traditional perspective by means of recurrence quantification analysis, recurrence networks, visibility graphs, and construction of surrogate time series. The range of possible applications of the library is outlined, drawing on several examples mainly from the field of climatology. pyunicorn is available online at https://github.com/pik-copan/pyunicorn. Reference: J.F. Donges, J. Heitzig, B. Beronov, M. Wiedermann, J. Runge, Q.-Y. Feng, L. Tupikina, V. Stolbova, R.V. Donner, N. Marwan, H.A. Dijkstra, and J. Kurths, Unified functional network and nonlinear time series analysis for complex systems science: The pyunicorn package, Chaos 25, 113101 (2015), DOI: 10.1063/1.4934554, Preprint: arxiv.org:1507.01571 [physics.data-an].
Signs over time: Statistical and visual analysis of a longitudinal signed network
de Nooy, W.
2008-01-01
This paper presents the design and results of a statistical and visual analysis of a dynamic signed network. In addition to prevalent approaches to longitudinal networks, which analyze series of cross-sectional data, this paper focuses on network data measured in continuous time in order to explain
Evaluation of Interpolants in Their Ability to Fit Seismometric Time Series
Directory of Open Access Journals (Sweden)
Kanadpriya Basu
2015-08-01
This article is devoted to the study of the ASARCO demolition seismic data. Two different classes of modeling techniques are explored: first, mathematical interpolation methods, and second, statistical smoothing approaches for curve fitting. We estimate the characteristic parameters of the propagation medium for seismic waves with multiple mathematical and statistical techniques, and provide the relative advantages of each approach to address fitting of such data. We conclude that mathematical interpolation techniques and statistical curve fitting techniques complement each other and can add value to the study of one-dimensional time series seismographic data: they can be used to add more data to the system in case the data set is not large enough to perform standard statistical tests.
OceanXtremes: Scalable Anomaly Detection in Oceanographic Time-Series
Wilson, B. D.; Armstrong, E. M.; Chin, T. M.; Gill, K. M.; Greguska, F. R., III; Huang, T.; Jacob, J. C.; Quach, N.
2016-12-01
The oceanographic community must meet the challenge to rapidly identify features and anomalies in complex and voluminous observations to further science and improve decision support. Given this data-intensive reality, we are developing an anomaly detection system, called OceanXtremes, powered by an intelligent, elastic Cloud-based analytic service backend that enables execution of domain-specific, multi-scale anomaly and feature detection algorithms across the entire archive of 15 to 30-year ocean science datasets.Our parallel analytics engine is extending the NEXUS system and exploits multiple open-source technologies: Apache Cassandra as a distributed spatial "tile" cache, Apache Spark for in-memory parallel computation, and Apache Solr for spatial search and storing pre-computed tile statistics and other metadata. OceanXtremes provides these key capabilities: Parallel generation (Spark on a compute cluster) of 15 to 30-year Ocean Climatologies (e.g. sea surface temperature or SST) in hours or overnight, using simple pixel averages or customizable Gaussian-weighted "smoothing" over latitude, longitude, and time; Parallel pre-computation, tiling, and caching of anomaly fields (daily variables minus a chosen climatology) with pre-computed tile statistics; Parallel detection (over the time-series of tiles) of anomalies or phenomena by regional area-averages exceeding a specified threshold (e.g. high SST in El Nino or SST "blob" regions), or more complex, custom data mining algorithms; Shared discovery and exploration of ocean phenomena and anomalies (facet search using Solr), along with unexpected correlations between key measured variables; Scalable execution for all capabilities on a hybrid Cloud, using our on-premise OpenStack Cloud cluster or at Amazon. 
The key idea is that the parallel data-mining operations will be run "near" the ocean data archives (a local "network" hop) so that we can efficiently access the thousands of files making up a three decade time-series
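The anomaly-field and threshold-detection steps described in this abstract (daily variable minus a chosen climatology, then a regional area-average compared against a threshold) can be illustrated with a minimal sketch. It uses plain Python lists rather than Spark or NEXUS tiles, and the function names and toy values are hypothetical, not part of the OceanXtremes system.

```python
from statistics import mean

def daily_climatology(values_by_year):
    """Average each day-of-year position across years (a simple pixel average)."""
    return [mean(day_values) for day_values in zip(*values_by_year)]

def anomalies(series, climatology):
    """Anomaly = observed daily value minus the climatological value."""
    return [obs - clim for obs, clim in zip(series, climatology)]

def exceedance_days(regional_mean_anomaly, threshold):
    """Indices of days whose regional area-average anomaly exceeds the threshold."""
    return [i for i, a in enumerate(regional_mean_anomaly) if a > threshold]

# Toy data: three "years" of four daily regional-mean SST values (deg C).
sst_by_year = [
    [20.0, 21.0, 22.0, 21.0],
    [20.0, 21.0, 22.0, 21.0],
    [20.0, 24.0, 25.0, 21.0],  # hypothetical warm event in the last year
]
clim = daily_climatology(sst_by_year)
anom = anomalies(sst_by_year[-1], clim)
hot_days = exceedance_days(anom, threshold=1.5)
```

In a real deployment the climatology would normally exclude or down-weight the period being tested, and the averaging would run per tile in parallel; the sketch only shows the arithmetic.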
High-order fuzzy time-series based on multi-period adaptation model for forecasting stock markets
Chen, Tai-Liang; Cheng, Ching-Hsue; Teoh, Hia-Jong
2008-02-01
Stock investors usually make their short-term investment decisions according to recent stock information such as the latest market news, technical analysis reports, and price fluctuations. To reflect these short-term factors that affect stock prices, this paper proposes a comprehensive fuzzy time-series model, which factors linear relationships between recent periods of stock prices and fuzzy logical relationships (nonlinear relationships) mined from time-series into forecasting processes. In the empirical analysis, the TAIEX (Taiwan Stock Exchange Capitalization Weighted Stock Index) and HSI (Hang Seng Index) are employed as experimental datasets, and four recent fuzzy time-series models, Chen’s (1996), Yu’s (2005), Cheng’s (2006) and Chen’s (2007), are used as comparison models. In addition, for comparison with a conventional statistical method, the method of least squares is used to estimate auto-regressive models for the testing periods within the databases. The performance comparisons indicate that the multi-period adaptation model proposed in this paper can effectively improve the forecasting performance of conventional fuzzy time-series models, which only factor fuzzy logical relationships into forecasting processes. In the empirical study, the traditional statistical method and the proposed model both reveal that stock price patterns in the Taiwan and Hong Kong stock markets are short-term.
Data Mining Smart Energy Time Series
Directory of Open Access Journals (Sweden)
Janina POPEANGA
2015-07-01
With the advent of smart metering technology, the amount of energy data will increase significantly, and the utilities industry will face another big challenge: finding relationships within time-series data and, beyond that, analyzing huge numbers of time series to find useful patterns and trends with fast or even real-time response. This study gives a short review of the literature in the field, aiming to demonstrate how essential the application of data mining techniques to time series is for making the best use of this large quantity of data, despite all the difficulties. The most important time series data mining techniques are also presented, highlighting their applicability in the energy domain.
Predicting chaotic time series
International Nuclear Information System (INIS)
Farmer, J.D.; Sidorowich, J.J.
1987-01-01
We present a forecasting technique for chaotic data. After embedding a time series in a state space using delay coordinates, we "learn" the induced nonlinear mapping using local approximation. This allows us to make short-term predictions of the future behavior of a time series, using information based only on past values. We present an error estimate for this technique, and demonstrate its effectiveness by applying it to several examples, including data from the Mackey-Glass delay differential equation, Rayleigh-Benard convection, and Taylor-Couette flow.
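The two steps named in this abstract, delay-coordinate embedding followed by local approximation, can be sketched as follows. This is a minimal illustration, assuming the simplest (zeroth-order) local approximation: the forecast is the average of what followed the k nearest past states. The function names, the embedding parameters, and the logistic-map test series are choices made here for illustration and are not taken from the paper.

```python
import math

def delay_embed(x, dim, tau):
    """Delay vectors (x[t], x[t-tau], ..., x[t-(dim-1)*tau]) and their time indices."""
    start = (dim - 1) * tau
    times = list(range(start, len(x)))
    vectors = [[x[t - j * tau] for j in range(dim)] for t in times]
    return vectors, times

def local_predict(x, dim=3, tau=1, k=3, horizon=1):
    """Forecast the value `horizon` steps after the last sample of x by
    averaging the futures of the k nearest past states."""
    vectors, times = delay_embed(x, dim, tau)
    query = vectors[-1]
    # Only states whose future value is already known can serve as neighbors.
    candidates = [
        (math.dist(v, query), t)
        for v, t in zip(vectors, times)
        if t + horizon < len(x)
    ]
    nearest = sorted(candidates)[:k]
    return sum(x[t + horizon] for _, t in nearest) / len(nearest)

# Toy chaotic series: logistic map x -> 4x(1-x).
series = [0.3]
for _ in range(500):
    series.append(4.0 * series[-1] * (1.0 - series[-1]))
train, actual = series[:-1], series[-1]
forecast = local_predict(train, dim=2, tau=1, k=3)
```

Higher-order variants fit a local linear map to the neighbors instead of averaging; the neighbor search and embedding stay the same.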
Nonlinear Prediction Model for Hydrologic Time Series Based on Wavelet Decomposition
Kwon, H.; Khalil, A.; Brown, C.; Lall, U.; Ahn, H.; Moon, Y.
2005-12-01
Traditionally, forecasting and characterization of hydrologic systems are performed using many techniques. Stochastic linear methods such as AR and ARIMA and nonlinear ones such as tools based on statistical learning theory have been extensively used. The difficulty common to all methods is determining the sufficient and necessary information and predictors for a successful prediction. Relationships between hydrologic variables are often highly nonlinear and interrelated across the temporal scale. A new hybrid approach is proposed for the simulation of hydrologic time series, combining the wavelet transform with a nonlinear model. The present model draws on the merits of both the wavelet transform and the nonlinear time series model. The wavelet transform is adopted to decompose a hydrologic nonlinear process into a set of mono-component signals, which are simulated by the nonlinear model. The hybrid methodology is formulated to improve the accuracy of long-term forecasting. The proposed hybrid model yields much better results in terms of capturing and reproducing the time-frequency properties of the system at hand. Prediction results are promising when compared to traditional univariate time series models. An application demonstrating the plausibility of the proposed methodology is provided, and the results indicate that the wavelet-based time series model can be used for simulating and forecasting hydrologic variables reasonably well. This will ultimately serve the purpose of integrated water resources planning and management.
EEG Eye State Identification Using Incremental Attribute Learning with Time-Series Classification
Directory of Open Access Journals (Sweden)
Ting Wang
2014-01-01
Eye state identification is a common time-series classification problem and a hot spot in recent research. Electroencephalography (EEG) is widely used in eye state classification to detect a person's cognitive state. Previous research has validated the feasibility of machine learning and statistical approaches for EEG eye state classification. This paper proposes a novel approach for EEG eye state identification using incremental attribute learning (IAL) based on neural networks. IAL is a novel machine learning strategy which gradually imports and trains features one by one. Previous studies have verified that such an approach is applicable to a number of pattern recognition problems. However, little of this previous research on IAL focused on its application to time-series problems. Therefore, it is still unknown whether IAL can be employed to cope with time-series problems like EEG eye state classification. Experimental results in this study demonstrate that, with proper feature extraction and feature ordering, IAL can not only efficiently cope with time-series classification problems, but also exhibit better classification performance in terms of classification error rates in comparison with conventional and some other approaches.
Measuring multiscaling in financial time-series
International Nuclear Information System (INIS)
Buonocore, R.J.; Aste, T.; Di Matteo, T.
2016-01-01
We discuss the origin of multiscaling in financial time-series and investigate how to best quantify it. Our methodology consists in separating the different sources of measured multifractality by analyzing the multi/uni-scaling behavior of synthetic time-series with known properties. We use the results from the synthetic time-series to interpret the measure of multifractality of real log-returns time-series. The main finding is that the aggregation horizon of the returns can introduce a strong bias effect on the measure of multifractality. This effect can become especially important when returns distributions have power law tails with exponents in the range (2, 5). We discuss the right aggregation horizon to mitigate this bias.
Physics constrained nonlinear regression models for time series
International Nuclear Information System (INIS)
Majda, Andrew J; Harlim, John
2013-01-01
A central issue in contemporary science is the development of data driven statistical nonlinear dynamical models for time series of partial observations of nature or a complex physical model. It has been established recently that ad hoc quadratic multi-level regression (MLR) models can have finite-time blow up of statistical solutions and/or pathological behaviour of their invariant measure. Here a new class of physics constrained multi-level quadratic regression models are introduced, analysed and applied to build reduced stochastic models from data of nonlinear systems. These models have the advantages of incorporating memory effects in time as well as the nonlinear noise from energy conserving nonlinear interactions. The mathematical guidelines for the performance and behaviour of these physics constrained MLR models as well as filtering algorithms for their implementation are developed here. Data driven applications of these new multi-level nonlinear regression models are developed for test models involving a nonlinear oscillator with memory effects and the difficult test case of the truncated Burgers–Hopf model. These new physics constrained quadratic MLR models are proposed here as process models for Bayesian estimation through Markov chain Monte Carlo algorithms of low frequency behaviour in complex physical data.
A Gaussian Process Based Online Change Detection Algorithm for Monitoring Periodic Time Series
Energy Technology Data Exchange (ETDEWEB)
Chandola, Varun [ORNL; Vatsavai, Raju [ORNL
2011-01-01
Online time series change detection is a critical component of many monitoring systems, such as space and air-borne remote sensing instruments, cardiac monitors, and network traffic profilers, which continuously analyze observations recorded by sensors. Data collected by such sensors typically has a periodic (seasonal) component. Most existing time series change detection methods are not directly applicable to handle such data, either because they are not designed to handle periodic time series or because they cannot operate in an online mode. We propose an online change detection algorithm which can handle periodic time series. The algorithm uses a Gaussian process based non-parametric time series prediction model and monitors the difference between the predictions and actual observations within a statistically principled control chart framework to identify changes. A key challenge in using Gaussian process in an online mode is the need to solve a large system of equations involving the associated covariance matrix which grows with every time step. The proposed algorithm exploits the special structure of the covariance matrix and can analyze a time series of length T in O(T^2) time while maintaining a O(T) memory footprint, compared to O(T^4) time and O(T^2) memory requirement of standard matrix manipulation methods. We experimentally demonstrate the superiority of the proposed algorithm over several existing time series change detection algorithms on a set of synthetic and real time series. Finally, we illustrate the effectiveness of the proposed algorithm for identifying land use land cover changes using Normalized Difference Vegetation Index (NDVI) data collected for an agricultural region in Iowa state, USA. Our algorithm is able to detect different types of changes in a NDVI validation data set (with ~80% accuracy) which occur due to crop type changes as well as disruptive changes (e.g., natural disasters).
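The monitoring idea in this abstract (predict the next value of a periodic series, then check the prediction residual against control limits) can be sketched in a few lines. The sketch below deliberately swaps the Gaussian process predictor for a much simpler stand-in, the mean of past values at the same phase of the cycle, so it illustrates only the control-chart logic and not the authors' algorithm or its complexity bounds. All names, the 3-sigma limit, and the toy series are assumptions.

```python
from statistics import mean, stdev

def seasonal_forecast(history, period):
    """Stand-in predictor: mean of past values at the same phase of the cycle."""
    phase = len(history) % period
    return mean(history[phase::period])

def detect_changes(series, period, warmup, n_sigma=3.0):
    """Flag time steps whose prediction residual lies outside +/- n_sigma limits."""
    alarms, residuals = [], []
    for t in range(warmup, len(series)):
        resid = series[t] - seasonal_forecast(series[:t], period)
        if len(residuals) >= 2:
            limit = n_sigma * stdev(residuals)
            if abs(resid) > limit:
                alarms.append(t)
                continue  # keep flagged points out of the in-control residual pool
        residuals.append(resid)
    return alarms

# Toy periodic series with small deterministic jitter and a level shift at t = 40.
base = [0.0, 1.0, 0.0, -1.0]
series = [base[t % 4] + 0.01 * ((7 * t) % 5 - 2) for t in range(48)]
series = [v + (5.0 if t >= 40 else 0.0) for t, v in enumerate(series)]
alarms = detect_changes(series, period=4, warmup=8)
```

Because the control limits are estimated only from residuals that were not flagged, a sustained shift keeps raising alarms until the predictor adapts.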
Linear and nonlinear dynamic systems in financial time series prediction
Directory of Open Access Journals (Sweden)
Salim Lahmiri
2012-10-01
The autoregressive moving average (ARMA) process and dynamic neural networks, namely the nonlinear autoregressive moving average with exogenous inputs (NARX), are compared by evaluating their ability to predict financial time series, for instance the S&P500 returns. Two classes of ARMA are considered. The first is the standard ARMA model, which is a linear static system. The second uses a Kalman filter (KF) to estimate and predict the ARMA coefficients; this model is a linear dynamic system. The forecasting ability of each system is evaluated by means of the mean absolute error (MAE) and mean absolute deviation (MAD) statistics. Simulation results indicate that the ARMA-KF system performs better than the standard ARMA alone. Thus, introducing dynamics into the ARMA process improves the forecasting accuracy. In addition, the ARMA-KF outperformed the NARX. This result may suggest that the linear component found in the S&P500 return series is more dominant than the nonlinear part. In sum, we conclude that introducing dynamics into the ARMA process provides an effective system for S&P500 time series prediction.
Self-potential time series analysis in a seismic area of the Southern Apennines: preliminary results
Directory of Open Access Journals (Sweden)
V. Tramutoli
1994-06-01
The self-potential time series recorded during the period May 1991 - August 1992 by an automatic station located in a seismic area of the Southern Apennines is analyzed. We deal with the spectral and statistical features of the electrotelluric precursors, which can play a major role in the approach to seismic prediction. The time dynamics of the experimental time series is investigated, and the cyclic components and time trends are removed. In particular, we consider the influence of external noise related to anthropic activities and meteoclimatic parameters, and pick out the anomalies from the residual series. Finally, we show the preliminary results of the correlation between the anomalies in the time patterns of self-potential data and the earthquakes which occurred in the area.
Directory of Open Access Journals (Sweden)
Eulogio Pardo-Igúzquiza
2015-08-01
Many studies have revealed the cyclicity of past ocean/atmosphere dynamics at a wide range of time scales (from decadal to millennial time scales), based on the spectral analysis of time series of climate proxies obtained from deep sea sediment cores. Among the many techniques available for spectral analysis, the maximum entropy method and the Thomson multitaper approach have frequently been used because of their good statistical properties and high resolution with short time series. The novelty of the present study is that we compared the two methods according to the performance of the statistical tests used to assess the significance of their power spectrum estimates. The statistical significance of maximum entropy estimates was assessed by a random permutation test (Pardo-Igúzquiza and Rodríguez-Tovar, 2000), while the statistical significance of the Thomson multitaper method was assessed by an F-test (Thomson, 1982). We compared the results obtained in a case study using simulated data where the spectral content of the time series was known and in a case study with real data. In both cases the results are similar: while the cycles identified as significant by maximum entropy and the permutation test have a clear physical interpretation, the F-test with the Thomson multitaper estimator tends to find the low-frequency peaks not significant and tends to flag more spurious peaks as significant in the middle and high frequencies. Nevertheless, the best strategy is to use both techniques and to exploit the advantages of each of them.
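The permutation-test idea mentioned in this abstract can be sketched generically: shuffle the series many times to destroy its temporal structure, recompute the spectrum each time, and ask how often the shuffled power at a frequency reaches the observed power. The sketch below substitutes a raw periodogram for the maximum entropy estimator used in the study, so it shows the testing logic only; the function names, the number of permutations, and the toy signal are assumptions.

```python
import cmath
import math
import random

def periodogram(x):
    """Raw periodogram |DFT|^2 / N at Fourier frequencies k = 1 .. N//2."""
    n = len(x)
    mu = sum(x) / n
    centered = [v - mu for v in x]
    power = []
    for k in range(1, n // 2 + 1):
        coef = sum(
            v * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        power.append(abs(coef) ** 2 / n)
    return power

def permutation_pvalues(x, n_perm=199, seed=0):
    """Per-frequency p-value: share of shuffles whose power reaches the observed power."""
    rng = random.Random(seed)
    observed = periodogram(x)
    exceed = [0] * len(observed)
    shuffled = list(x)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        for i, p in enumerate(periodogram(shuffled)):
            if p >= observed[i]:
                exceed[i] += 1
    return [(c + 1) / (n_perm + 1) for c in exceed]

# Toy signal: an 8-sample cycle plus noise, 64 samples (8 full cycles, so k = 8).
noise = random.Random(1)
signal = [math.sin(2 * math.pi * t / 8) + noise.gauss(0.0, 0.3) for t in range(64)]
pvals = permutation_pvalues(signal)
p_at_cycle = pvals[8 - 1]  # list index 0 corresponds to k = 1
```

The smallest attainable p-value is 1 / (n_perm + 1), so the number of permutations sets the resolution of the test.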
Entropic Analysis of Electromyography Time Series
Kaufman, Miron; Sung, Paul
2005-03-01
We are assessing the effectiveness of fractal and entropic measures for the diagnosis of low back pain from surface electromyography (EMG) time series. Surface EMG is used to assess patients with low back pain. In a typical EMG measurement, the voltage is measured every millisecond. We observed back muscle fatiguing during one minute, which results in a time series with 60,000 entries. We characterize the complexity of time series by computing the Shannon entropy time dependence. The analysis of the time series from different relevant muscles from healthy and low back pain (LBP) individuals provides evidence that the level of variability of back muscle activities is much larger for healthy individuals than for individuals with LBP. In general the time dependence of the entropy shows a crossover from a diffusive regime to a regime characterized by long time correlations (self organization) at about 0.01s.
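One generic way to obtain a time-dependent Shannon entropy from a single series is to bin the increments x(t + lag) - x(t) and compute the entropy of that histogram for each lag. The abstract does not say how the entropy time dependence was computed, so the construction below, the bin width, and the random-walk test signal are all assumptions made purely for illustration.

```python
import math
import random
from collections import Counter

def shannon_entropy(values, bin_width):
    """Shannon entropy (in nats) of values after binning with a fixed bin width."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def entropy_vs_lag(x, lags, bin_width):
    """Entropy of the increments x[t + lag] - x[t] for each requested lag."""
    return {
        lag: shannon_entropy(
            [x[t + lag] - x[t] for t in range(len(x) - lag)], bin_width
        )
        for lag in lags
    }

# Toy diffusive signal: a Gaussian random walk (not EMG data).
rng = random.Random(1)
walk, position = [], 0.0
for _ in range(5000):
    position += rng.gauss(0.0, 1.0)
    walk.append(position)
entropy_by_lag = entropy_vs_lag(walk, lags=[1, 4, 16], bin_width=0.5)
```

For a purely diffusive signal the increments spread as the square root of the lag, so the entropy grows steadily with lag; a departure from that growth would be one way to see a crossover like the one the abstract reports.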
Application of the Allan Variance to Time Series Analysis in Astrometry and Geodesy: A Review.
Malkin, Zinovy
2016-04-01
The Allan variance (AVAR) was introduced 50 years ago as a statistical tool for assessing the frequency standards deviations. For the past decades, AVAR has increasingly been used in geodesy and astrometry to assess the noise characteristics in geodetic and astrometric time series. A specific feature of astrometric and geodetic measurements, as compared with clock measurements, is that they are generally associated with uncertainties; thus, an appropriate weighting should be applied during data analysis. In addition, some physically connected scalar time series naturally form series of multidimensional vectors. For example, three station coordinates time series X, Y, and Z can be combined to analyze 3-D station position variations. The classical AVAR is not intended for processing unevenly weighted and/or multidimensional data. Therefore, AVAR modifications, namely weighted AVAR (WAVAR), multidimensional AVAR (MAVAR), and weighted multidimensional AVAR (WMAVAR), were introduced to overcome these deficiencies. In this paper, a brief review is given of the experience of using AVAR and its modifications in processing astrogeodetic time series.
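For equally spaced, equally weighted measurements, the classical Allan variance at the basic sampling interval is half the mean squared difference of successive values. The sketch below implements that estimator and, as a hedged illustration of the weighted variant, one that weights each successive difference by the inverse of the summed squared uncertainties of the two points involved. That particular weight definition and the function names are assumptions for illustration; the review itself should be consulted for the exact WAVAR and WMAVAR definitions.

```python
def allan_variance(y):
    """Classical AVAR at the basic interval: sum((y[i+1]-y[i])^2) / (2*(n-1))."""
    n = len(y)
    return sum((y[i + 1] - y[i]) ** 2 for i in range(n - 1)) / (2 * (n - 1))

def weighted_allan_variance(y, s):
    """Weighted AVAR sketch; assumed weights p_i = 1 / (s[i]^2 + s[i+1]^2)."""
    weights = [1.0 / (s[i] ** 2 + s[i + 1] ** 2) for i in range(len(y) - 1)]
    num = sum(w * (y[i + 1] - y[i]) ** 2 for i, w in enumerate(weights))
    return num / (2 * sum(weights))

# Toy series with one poorly determined point (large uncertainty on the outlier).
values = [1.0, 1.1, 0.9, 5.0, 1.0, 1.05]
sigmas = [0.1, 0.1, 0.1, 3.0, 0.1, 0.1]
avar = allan_variance(values)
wavar = weighted_allan_variance(values, sigmas)
```

With equal uncertainties the weighted form reduces to the classical one; with the toy data above, the uncertain outlier contributes far less to the weighted estimate.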
Directory of Open Access Journals (Sweden)
Ivan Arismendi
2017-12-01
Intermittent and ephemeral streams represent more than half of the length of the global river network. Dryland freshwater ecosystems are especially vulnerable to changes in human-related water uses as well as shifts in terrestrial climates. Yet, the description and quantification of patterns of flow permanence in these systems is challenging mostly due to difficulties in instrumentation. Here, we took advantage of existing stream temperature datasets in dryland streams in the northwest Great Basin desert, USA, to extract critical information on climate-sensitive patterns of flow permanence. We used a signal detection technique, Hidden Markov Models (HMMs), to extract information from daily time series of stream temperature to diagnose patterns of stream drying. Specifically, we applied HMMs to time series of daily standard deviation (SD) of stream temperature (i.e., dry stream channels typically display highly variable daily temperature records compared to wet stream channels) between April and August (2015–2016). We used information from paired stream and air temperature data loggers as well as co-located stream temperature data loggers with electrical resistors as confirmatory sources of the timing of stream drying. We expanded our approach to an entire stream network to illustrate the utility of the method to detect patterns of flow permanence over a broader spatial extent. We successfully identified and separated signals characteristic of wet and dry stream conditions and their shifts over time. Most of our study sites within the entire stream network exhibited a single state over the entire season (80%), but a portion of them showed one or more shifts among states (17%). We provide recommendations to use this approach based on a series of simple steps. Our findings illustrate a successful method that can be used to rigorously quantify flow permanence regimes in streams using existing records of stream temperature.
Arismendi, Ivan; Dunham, Jason B.; Heck, Michael; Schultz, Luke; Hockman-Wert, David
2017-01-01
Intermittent and ephemeral streams represent more than half of the length of the global river network. Dryland freshwater ecosystems are especially vulnerable to changes in human-related water uses as well as shifts in terrestrial climates. Yet, the description and quantification of patterns of flow permanence in these systems is challenging mostly due to difficulties in instrumentation. Here, we took advantage of existing stream temperature datasets in dryland streams in the northwest Great Basin desert, USA, to extract critical information on climate-sensitive patterns of flow permanence. We used a signal detection technique, Hidden Markov Models (HMMs), to extract information from daily time series of stream temperature to diagnose patterns of stream drying. Specifically, we applied HMMs to time series of daily standard deviation (SD) of stream temperature (i.e., dry stream channels typically display highly variable daily temperature records compared to wet stream channels) between April and August (2015–2016). We used information from paired stream and air temperature data loggers as well as co-located stream temperature data loggers with electrical resistors as confirmatory sources of the timing of stream drying. We expanded our approach to an entire stream network to illustrate the utility of the method to detect patterns of flow permanence over a broader spatial extent. We successfully identified and separated signals characteristic of wet and dry stream conditions and their shifts over time. Most of our study sites within the entire stream network exhibited a single state over the entire season (80%), but a portion of them showed one or more shifts among states (17%). We provide recommendations to use this approach based on a series of simple steps. Our findings illustrate a successful method that can be used to rigorously quantify flow permanence regimes in streams using existing records of stream temperature.
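The workflow in this abstract has two computational pieces: reduce sub-daily temperature logs to a daily standard deviation, then decode a hidden wet/dry state sequence from that daily series. The sketch below shows both under strong simplifications: it uses a two-state Gaussian HMM with fixed, hand-picked means, spreads, and transition probability, and decodes it with the Viterbi algorithm, whereas a real analysis would estimate those parameters from the data. All names and numbers are hypothetical.

```python
import math
from statistics import pstdev

def daily_sd(temps, per_day):
    """Standard deviation of each complete block of `per_day` readings."""
    return [
        pstdev(temps[i:i + per_day])
        for i in range(0, len(temps) - per_day + 1, per_day)
    ]

def log_gauss(x, mu, sigma):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def viterbi(obs, means, sigmas, stay_prob=0.95):
    """Most likely state path for a Gaussian HMM with symmetric transitions."""
    n_states = len(means)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1 - stay_prob) / (n_states - 1))
    score = [
        math.log(1 / n_states) + log_gauss(obs[0], means[s], sigmas[s])
        for s in range(n_states)
    ]
    back = []
    for x in obs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor of state s, including the transition cost.
            best_prev = max(
                range(n_states),
                key=lambda r: score[r] + (log_stay if r == s else log_switch),
            )
            trans = log_stay if best_prev == s else log_switch
            new_score.append(score[best_prev] + trans + log_gauss(x, means[s], sigmas[s]))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Toy daily SD series (deg C): low variability (wet) then high variability (dry).
sd_series = [0.5, 0.6, 0.4, 0.5, 3.0, 3.2, 2.9, 3.1]
states = viterbi(sd_series, means=[0.5, 3.0], sigmas=[0.3, 0.5])  # 0 = wet, 1 = dry
```

The high stay probability encodes the expectation that a stream does not flip between wet and dry from one day to the next, which is what lets the decoder ignore isolated noisy days.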
Lopez, Benjamin; Croiset, Nolwenn; Laurence, Gourcy
2014-05-01
The Water Framework Directive 2006/11/CE (WFD) on the protection of groundwater against pollution and deterioration asks Member States to identify significant and sustained upward trends in all bodies or groups of bodies of groundwater that are characterised as being at risk in accordance with Annex II to Directive 2000/60/EC. The Directive indicates that the procedure for the identification of significant and sustained upward trends must be based on a statistical method. Moreover, for significant increases of concentrations of pollutants, trend reversals are identified as being necessary. This means to be able to identify significant trend reversals. A specific tool, named HYPE, has been developed in order to help stakeholders working on groundwater trend assessment. The R encoded tool HYPE provides statistical analysis of groundwater time series. It follows several studies on the relevancy of the use of statistical tests on groundwater data series (Lopez et al., 2011) and other case studies on the thematic (Bourgine et al., 2012). It integrates the most powerful and robust statistical tests for hydrogeological applications. HYPE is linked to the French national database on groundwater data (ADES). So monitoring data gathered by the Water Agencies can be directly processed. HYPE has two main modules: - a characterisation module, which allows to visualize time series. HYPE calculates the main statistical characteristics and provides graphical representations; - a trend module, which identifies significant breaks, trends and trend reversals in time series, providing result table and graphical representation (cf figure). Additional modules are also implemented to identify regional and seasonal trends and to sample time series in a relevant way. HYPE has been used successfully in 2012 by the French Water Agencies to satisfy requirements of the WFD, concerning characterization of groundwater bodies' qualitative status and evaluation of the risk of non-achievement of
A Seasonal Time-Series Model Based on Gene Expression Programming for Predicting Financial Distress
2018-01-01
Financial distress prediction is an important and challenging research topic in the financial field. There are currently many methods for predicting firm bankruptcy and financial crisis, including artificial intelligence and traditional statistical methods, and past studies have shown that the prediction results of artificial intelligence methods are better than those of traditional statistical methods. Financial statements are quarterly reports; hence, the financial crisis of companies is seasonal time-series data, and the attribute data affecting the financial distress of companies is nonlinear and nonstationary time-series data with fluctuations. Therefore, this study employed a nonlinear attribute selection method to build a nonlinear financial distress prediction model: that is, this paper proposed a novel seasonal time-series gene expression programming model for predicting the financial distress of companies. The proposed model has several advantages, including the following: (i) the proposed model is different from previous models lacking the concept of time series; (ii) the proposed integrated attribute selection method can find the core attributes and reduce high dimensional data; and (iii) the proposed model can generate the rules and mathematical formulas of financial distress to provide references for investors and decision makers. The results show that the proposed method is better than the listed classifiers under three criteria; hence, the proposed model has competitive advantages in predicting the financial distress of companies. PMID:29765399
A Seasonal Time-Series Model Based on Gene Expression Programming for Predicting Financial Distress
Directory of Open Access Journals (Sweden)
Ching-Hsue Cheng
2018-01-01
Financial distress prediction is an important and challenging research topic in the financial field. There are currently many methods for predicting firm bankruptcy and financial crisis, including artificial intelligence and traditional statistical methods, and past studies have shown that the prediction results of artificial intelligence methods are better than those of traditional statistical methods. Financial statements are quarterly reports; hence, the financial crisis of companies is seasonal time-series data, and the attribute data affecting the financial distress of companies is nonlinear and nonstationary time-series data with fluctuations. Therefore, this study employed a nonlinear attribute selection method to build a nonlinear financial distress prediction model: that is, this paper proposed a novel seasonal time-series gene expression programming model for predicting the financial distress of companies. The proposed model has several advantages, including the following: (i) the proposed model is different from previous models lacking the concept of time series; (ii) the proposed integrated attribute selection method can find the core attributes and reduce high dimensional data; and (iii) the proposed model can generate the rules and mathematical formulas of financial distress to provide references for investors and decision makers. The results show that the proposed method is better than the listed classifiers under three criteria; hence, the proposed model has competitive advantages in predicting the financial distress of companies.
Hansen, J V; Nelson, R D
1997-01-01
Ever since the initial planning for the 1997 Utah legislative session, neural-network forecasting techniques have provided valuable insights for analysts forecasting tax revenues. These revenue estimates are critically important since agency budgets, support for education, and improvements to infrastructure all depend on their accuracy. Underforecasting generates windfalls that concern taxpayers, whereas overforecasting produces budget shortfalls that cause inadequately funded commitments. The pattern-finding ability of neural networks gives insightful and alternative views of the seasonal and cyclical components commonly found in economic time series data. Two applications of neural networks to revenue forecasting clearly demonstrate how these models complement traditional time series techniques. In the first, preoccupation with a potential downturn in the economy distracts analysis based on traditional time series methods so that it overlooks an emerging new phenomenon in the data. In this case, neural networks identify the new pattern, which then allows modification of the time series models and finally gives more accurate forecasts. In the second application, data structure found by traditional statistical tools allows analysts to provide neural networks with important information that the networks then use to create more accurate models. In summary, for the Utah revenue outlook, the insights that result from a portfolio of forecasts that includes neural networks exceed the understanding generated from strictly statistical forecasting techniques. In this case, the synergy clearly results in the whole of the portfolio of forecasts being more accurate than the sum of the individual parts.
Schaarup-Jensen, K; Rasmussen, M R; Thorndahl, S
2009-01-01
In urban drainage modelling long-term extreme statistics has become an important basis for decision-making e.g. in connection with renovation projects. Therefore it is of great importance to minimize the uncertainties with regards to long-term prediction of maximum water levels and combined sewer overflow (CSO) in drainage systems. These uncertainties originate from large uncertainties regarding rainfall inputs, parameters, and assessment of return periods. This paper investigates how the choice of rainfall time series influences the extreme events statistics of max water levels in manholes and CSO volumes. Traditionally, long-term rainfall series, from a local rain gauge, are unavailable. In the present case study, however, long and local rain series are available. 2 rainfall gauges have recorded events for approximately 9 years at 2 locations within the catchment. Beside these 2 gauges another 7 gauges are located at a distance of max 20 kilometers from the catchment. All gauges are included in the Danish national rain gauge system which was launched in 1976. The paper describes to what extent the extreme events statistics based on these 9 series diverge from each other and how this diversity can be handled, e.g. by introducing an "averaging procedure" based on the variability within the set of statistics. All simulations are performed by means of the MOUSE LTS model.
Effective Feature Preprocessing for Time Series Forecasting
DEFF Research Database (Denmark)
Zhao, Junhua; Dong, Zhaoyang; Xu, Zhao
2006-01-01
Time series forecasting is an important area in data mining research. Feature preprocessing techniques have significant influence on forecasting accuracy, therefore are essential in a forecasting model. Although several feature preprocessing techniques have been applied in time series forecasting...... performance in time series forecasting. It is demonstrated in our experiment that, effective feature preprocessing can significantly enhance forecasting accuracy. This research can be a useful guidance for researchers on effectively selecting feature preprocessing techniques and integrating them with time...... series forecasting models....
DEFF Research Database (Denmark)
Lopes Antunes, Ana Carolina; Jensen, Dan; Hisham Beshara Halasa, Tariq
2017-01-01
Disease monitoring and surveillance play a crucial role in control and eradication programs, as it is important to track implemented strategies in order to reduce and/or eliminate a specific disease. The objectives of this study were to assess the performance of different statistical monitoring......, decreases and constant sero-prevalence levels (referred to as events). Two state-space models were used to model the time series, and different statistical monitoring methods (such as univariate process control algorithms–Shewhart Control Chart, Tabular Cumulative Sums, and the V-mask- and monitoring...... of noise in the baseline was greater for the Shewhart Control Chart and Tabular Cumulative Sums than for the V-Mask and trend-based methods. The performance of the different statistical monitoring methods varied when monitoring increases and decreases in disease sero-prevalence. Combining two or more...
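Two of the process control algorithms named in this abstract, the Shewhart control chart and the tabular cumulative sum (CUSUM), are simple enough to sketch directly. The version below applies them to raw values with a known in-control mean and standard deviation, whereas the study applied its monitoring methods to state-space model output; the reference value k, decision interval h, and the toy sero-prevalence numbers are conventional or invented choices, not values from the paper.

```python
def shewhart_alarms(x, mu0, sigma, n_sigma=3.0):
    """Indices where a single observation lies outside mu0 +/- n_sigma * sigma."""
    return [i for i, v in enumerate(x) if abs(v - mu0) > n_sigma * sigma]

def tabular_cusum(x, mu0, sigma, k=0.5, h=5.0):
    """Two-sided tabular CUSUM; k and h are given in units of sigma."""
    K, H = k * sigma, h * sigma
    c_plus = c_minus = 0.0
    alarms = []
    for i, v in enumerate(x):
        c_plus = max(0.0, v - (mu0 + K) + c_plus)     # accumulates upward drift
        c_minus = max(0.0, (mu0 - K) - v + c_minus)   # accumulates downward drift
        if c_plus > H or c_minus > H:
            alarms.append(i)
            c_plus = c_minus = 0.0  # restart the sums after a signal
    return alarms

# Toy sero-prevalence series in percent: stable at 10, then a sustained rise to 12.
prevalence = [10.0] * 10 + [12.0] * 10
shewhart = shewhart_alarms(prevalence, mu0=10.0, sigma=1.0)
cusum = tabular_cusum(prevalence, mu0=10.0, sigma=1.0)
```

The toy case shows the usual trade-off: a sustained two-sigma shift never crosses the three-sigma Shewhart limit, while the CUSUM accumulates the small excesses and signals a few steps after the change.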
Time series models of environmental exposures: Good predictions or good understanding.
Barnett, Adrian G; Stephen, Dimity; Huang, Cunrui; Wolkewitz, Martin
2017-04-01
Time series data are popular in environmental epidemiology as they make use of the natural experiment of how changes in exposure over time might impact on disease. Many published time series papers have used parameter-heavy models that fully explained the second order patterns in disease to give residuals that have no short-term autocorrelation or seasonality. This is often achieved by including predictors of past disease counts (autoregression) or seasonal splines with many degrees of freedom. These approaches give great residuals, but add little to our understanding of cause and effect. We argue that modelling approaches should rely more on good epidemiology and less on statistical tests. This includes thinking about causal pathways, making potential confounders explicit, fitting a limited number of models, and not over-fitting at the cost of under-estimating the true association between exposure and disease. Copyright © 2017 Elsevier Inc. All rights reserved.
Adaptive Sampling of Time Series During Remote Exploration
Thompson, David R.
2012-01-01
This work deals with the challenge of online adaptive data collection in a time series. A remote sensor or explorer agent adapts its rate of data collection in order to track anomalous events while obeying constraints on time and power. This problem is challenging because the agent has limited visibility (all its datapoints lie in the past) and limited control (it can only decide when to collect its next datapoint). This problem is treated from an information-theoretic perspective, fitting a probabilistic model to collected data and optimizing the future sampling strategy to maximize information gain. The performance characteristics of stationary and nonstationary Gaussian process models are compared. Self-throttling sensors could benefit environmental sensor networks and monitoring as well as robotic exploration. Explorer agents can improve performance by adjusting their data collection rate, preserving scarce power or bandwidth resources during uninteresting times while fully covering anomalous events of interest. For example, a remote earthquake sensor could conserve power by limiting its measurements during normal conditions and increasing its cadence during rare earthquake events. A similar capability could improve sensor platforms traversing a fixed trajectory, such as an exploration rover transect or a deep space flyby. These agents can adapt observation times to improve sample coverage during moments of rapid change. An adaptive sampling approach couples sensor autonomy, instrument interpretation, and sampling. The challenge is addressed as an active learning problem, which already has extensive theoretical treatment in the statistics and machine learning literature. A statistical Gaussian process (GP) model is employed to guide sample decisions that maximize information gain. Nonstationary (e.g., time-varying) covariance relationships permit the system to represent and track local anomalies, in contrast with current GP approaches. Most common GP models
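As a rough illustration of GP-guided sampling, the sketch below picks the next observation time from a set of candidates by maximizing the GP posterior variance, which for a Gaussian predictive distribution is a monotone proxy for information gain. It assumes a stationary squared-exponential kernel with fixed, hypothetical hyperparameters; the nonstationary covariance discussed in the abstract is not modelled, and all function names are illustrative.

```python
import math


def sq_exp(t1, t2, length=1.0, var=1.0):
    """Stationary squared-exponential covariance (assumed kernel)."""
    return var * math.exp(-0.5 * ((t1 - t2) / length) ** 2)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][j] * x[j] for j in range(r + 1, n))) / m[r][r]
    return x


def posterior_variance(t_new, t_obs, noise=1e-3):
    """GP posterior variance at t_new given observation times t_obs."""
    K = [[sq_exp(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(t_obs)] for i, a in enumerate(t_obs)]
    k = [sq_exp(t_new, a) for a in t_obs]
    w = solve(K, k)                       # w = K^{-1} k
    return sq_exp(t_new, t_new) - sum(ki * wi for ki, wi in zip(k, w))


def next_sample_time(t_obs, candidates):
    """Candidate time with the largest posterior variance."""
    return max(candidates, key=lambda t: posterior_variance(t, t_obs))
```

With a stationary kernel the variance depends only on the sampling times, so this rule simply favours the candidate farthest from existing data; in practice the candidate set would be bounded by the time and power constraints, and a nonstationary kernel would let observed values influence where uncertainty is high.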
Ocean time-series near Bermuda: Hydrostation S and the US JGOFS Bermuda Atlantic time-series study
Michaels, Anthony F.; Knap, Anthony H.
1992-01-01
Bermuda is the site of two ocean time-series programs. At Hydrostation S, the ongoing biweekly profiles of temperature, salinity and oxygen now span 37 years. This is one of the longest open-ocean time-series data sets and provides a view of decadal scale variability in ocean processes. In 1988, the U.S. JGOFS Bermuda Atlantic Time-series Study began a wide range of measurements at a frequency of 14-18 cruises each year to understand temporal variability in ocean biogeochemistry. On each cruise, the data range from chemical analyses of discrete water samples to data from electronic packages of hydrographic and optics sensors. In addition, a range of biological and geochemical rate measurements are conducted that integrate over time-periods of minutes to days. This sampling strategy yields a reasonable resolution of the major seasonal patterns and of decadal scale variability. The Sargasso Sea also has a variety of episodic production events on scales of days to weeks and these are only poorly resolved. In addition, there is a substantial amount of mesoscale variability in this region and some of the perceived temporal patterns are caused by the intersection of the biweekly sampling with the natural spatial variability. In the Bermuda time-series programs, we have added a series of additional cruises to begin to assess these other sources of variation and their impacts on the interpretation of the main time-series record. However, the adequate resolution of higher frequency temporal patterns will probably require the introduction of new sampling strategies and some emerging technologies such as biogeochemical moorings and autonomous underwater vehicles.
Gábor Hatvani, István; Kern, Zoltán; Leél-Őssy, Szabolcs; Demény, Attila
2018-01-01
Uneven spacing is a common feature of sedimentary paleoclimate records, in many cases causing difficulties in the application of classical statistical and time series methods. Although special statistical tools do exist to assess unevenly spaced data directly, the transformation of such data into a temporally equidistant time series which may then be examined using commonly employed statistical tools remains, however, an unachieved goal. The present paper, therefore, introduces an approach to obtain evenly spaced time series (using cubic spline fitting) from unevenly spaced speleothem records with the application of a spectral guidance to avoid the spectral bias caused by interpolation and retain the original spectral characteristics of the data. The methodology was applied to stable carbon and oxygen isotope records derived from two stalagmites from the Baradla Cave (NE Hungary) dating back to the late 18th century. To show the benefit of the equally spaced records to climate studies, their coherence with climate parameters is explored using wavelet transform coherence and discussed. The obtained equally spaced time series are available at PANGAEA (https://doi.org/10.1594/PANGAEA.875917).
Identification of two-phase flow regimes by time-series modeling
International Nuclear Information System (INIS)
King, C.H.; Ouyang, M.S.; Pei, B.S.
1987-01-01
The identification of two-phase flow patterns in pipes or ducts is important to the design and operation of thermal-hydraulic systems, especially in the nuclear reactor cores of boiling water reactors or in the steam generators of pressurized water reactors. Basically, two-phase flow shows some fluctuating characteristics even at steady-state conditions. These fluctuating characteristics can be analyzed by statistical methods for obtaining flow signatures. There have been a number of experimental studies conducted that are concerned with the statistical properties of void fraction or pressure pulsation in two-phase flow. In this study, the authors propose a new technique of identifying the patterns of air-water two-phase flow in a vertical pipe. This technique is based on analyzing the statistical characteristics of the pressure signals of the test loop by time-series modeling.
International Nuclear Information System (INIS)
Corana, A.; Bortolan, G.; Casaleggio, A.
2004-01-01
We present and compare two automatic methods for dimension estimation from time series. Both methods, based on conceptually different approaches, work on the derivative of the bi-logarithmic plot of the correlation integral versus the correlation length (log-log plot). The first method searches for the most probable dimension values (MPDV) and associates to each of them a possible scaling region. The second one searches for the most flat intervals (MFI) in the derivative of the log-log plot. The automatic procedures include the evaluation of the candidate scaling regions using two reliability indices. The data set used to test the methods consists of time series from known model attractors with and without the addition of noise, structured time series, and electrocardiographic signals from the MIT-BIH ECG database. Statistical analysis of results was carried out by means of paired t-test, and no statistically significant differences were found in the large majority of the trials. Consistent results are also obtained dealing with 'difficult' time series. In general for a more robust and reliable estimate, the use of both methods may represent a good solution when time series from complex systems are analyzed. Although we present results for the correlation dimension only, the procedures can also be used for the automatic estimation of generalized q-order dimensions and pointwise dimension. We think that the proposed methods, eliminating the need of operator intervention, allow a faster and more objective analysis, thus improving the usefulness of dimension analysis for the characterization of time series obtained from complex dynamical systems
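Both methods in this abstract operate on the derivative of the log-log plot of the correlation integral. A minimal sketch of that quantity follows; it is not the MPDV or MFI procedure itself. The delay embedding, the maximum norm, and the function names are assumptions made for illustration.

```python
import math


def delay_embed(x, dim, lag):
    """Time-delay embedding of a scalar series into dim-dimensional points."""
    n = len(x) - (dim - 1) * lag
    return [[x[i + j * lag] for j in range(dim)] for i in range(n)]


def correlation_integral(points, r):
    """Fraction of point pairs closer than r (maximum norm)."""
    n = len(points)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if max(abs(a - b) for a, b in zip(points[i], points[j])) < r:
                count += 1
    return 2.0 * count / (n * (n - 1))


def local_slopes(points, radii):
    """Finite-difference slopes of log C(r) versus log r.

    Assumes radii are increasing and large enough that C(r) > 0.
    """
    logs = [(math.log(r), math.log(correlation_integral(points, r)))
            for r in radii]
    return [(c2 - c1) / (r2 - r1)
            for (r1, c1), (r2, c2) in zip(logs, logs[1:])]
```

A range of radii over which these slopes stay roughly constant is a candidate scaling region, and the slope value there is the dimension estimate; the automatic methods described above differ in how they search for such regions.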
Multivariate Time Series Decomposition into Oscillation Components.
Matsuda, Takeru; Komaki, Fumiyasu
2017-08-01
Many time series are considered to be a superposition of several oscillation components. We have proposed a method for decomposing univariate time series into oscillation components and estimating their phases (Matsuda & Komaki, 2017). In this study, we extend that method to multivariate time series. We assume that several oscillators underlie the given multivariate time series and that each variable corresponds to a superposition of the projections of the oscillators. Thus, the oscillators superpose on each variable with amplitude and phase modulation. Based on this idea, we develop Gaussian linear state-space models and use them to decompose the given multivariate time series. The model parameters are estimated from data using the empirical Bayes method, and the number of oscillators is determined using the Akaike information criterion. Therefore, the proposed method extracts underlying oscillators in a data-driven manner and enables investigation of phase dynamics in a given multivariate time series. Numerical results show the effectiveness of the proposed method. From monthly mean north-south sunspot number data, the proposed method reveals an interesting phase relationship.
Forecasting Enrollments with Fuzzy Time Series.
Song, Qiang; Chissom, Brad S.
The concept of fuzzy time series is introduced and used to forecast the enrollment of a university. Fuzzy time series, an aspect of fuzzy set theory, forecasts enrollment using a first-order time-invariant model. To evaluate the model, the conventional linear regression technique is applied and the predicted values obtained are compared to the…
DEFF Research Database (Denmark)
Schaarup-Jensen, Kjeld; Rasmussen, Michael R.; Thorndahl, Søren
2008-01-01
In urban drainage modeling long term extreme statistics has become an important basis for decision-making e.g. in connection with renovation projects. Therefore it is of great importance to minimize the uncertainties concerning long term prediction of maximum water levels and combined sewer...... overflow (CSO) in drainage systems. These uncertainties originate from large uncertainties regarding rainfall inputs, parameters, and assessment of return periods. This paper investigates how the choice of rainfall time series influences the extreme events statistics of max water levels in manholes and CSO...... gauges are located at a distance of max 20 kilometers from the catchment. All gauges are included in the Danish national rain gauge system which was launched in 1976. The paper describes to what extent the extreme events statistics based on these 9 series diverge from each other and how this diversity...
DEFF Research Database (Denmark)
Schaarup-Jensen, Kjeld; Rasmussen, Michael R.; Thorndahl, Søren
2009-01-01
In urban drainage modelling long term extreme statistics has become an important basis for decision-making e.g. in connection with renovation projects. Therefore it is of great importance to minimize the uncertainties concerning long term prediction of maximum water levels and combined sewer...... overflow (CSO) in drainage systems. These uncertainties originate from large uncertainties regarding rainfall inputs, parameters, and assessment of return periods. This paper investigates how the choice of rainfall time series influences the extreme events statistics of max water levels in manholes and CSO...... gauges are located at a distance of max 20 kilometers from the catchment. All gauges are included in the Danish national rain gauge system which was launched in 1976. The paper describes to what extent the extreme events statistics based on these 9 series diverge from each other and how this diversity...
A robust interrupted time series model for analyzing complex health care intervention data
Cruz, Maricela
2017-08-29
Current health policy calls for greater use of evidence-based care delivery services to improve patient quality and safety outcomes. Care delivery is complex, with interacting and interdependent components that challenge traditional statistical analytic techniques, in particular, when modeling a time series of outcomes data that might be
A robust interrupted time series model for analyzing complex health care intervention data
Cruz, Maricela; Bender, Miriam; Ombao, Hernando
2017-01-01
Current health policy calls for greater use of evidence-based care delivery services to improve patient quality and safety outcomes. Care delivery is complex, with interacting and interdependent components that challenge traditional statistical analytic techniques, in particular, when modeling a time series of outcomes data that might be
Financial Time Series Prediction Using Elman Recurrent Random Neural Networks
Directory of Open Access Journals (Sweden)
Jie Wang
2016-01-01
(ERNN, the empirical results show that the proposed neural network displays the best performance among these neural networks in financial time series forecasting. Further, the empirical research is performed in testing the predictive effects of SSE, TWSE, KOSPI, and Nikkei225 with the established model, and the corresponding statistical comparisons of the above market indices are also exhibited. The experimental results show that this approach gives good performance in predicting the values from the stock market indices.
Directory of Open Access Journals (Sweden)
Jan Dempewolf
2014-10-01
Policy makers, government planners and agricultural market participants in Pakistan require accurate and timely information about wheat yield and production. Punjab Province is by far the most important wheat producing region in the country. The manual collection of field data and data processing for crop forecasting by the provincial government requires significant amounts of time before official reports can be released. Several studies have shown that wheat yield can be effectively forecast using satellite remote sensing data. In this study, we developed a methodology for estimating wheat yield and area for Punjab Province from freely available Landsat and MODIS satellite imagery approximately six weeks before harvest. Wheat yield was derived by regressing reported yield values against time series of four different peak-season MODIS-derived vegetation indices. We also tested deriving wheat area from the same MODIS time series using a regression-tree approach. Among the four evaluated indices, WDRVI provided more consistent and accurate yield forecasts compared to NDVI, EVI2 and saturation-adjusted normalized difference vegetation index (SANDVI). The lowest RMSE values at the district level for forecast versus reported yield were found when using six or more years of training data. Forecast yield for the 2007/2008 to 2012/2013 growing seasons were within 0.2% and 11.5% of final reported values. Absolute deviations of wheat area and production forecasts from reported values were slightly greater compared to using the previous year's or the three- or six-year moving average values, implying that 250-m MODIS data does not provide sufficient spatial resolution for providing improved wheat area and production forecasts.
Hermosilla, Txomin; Wulder, Michael A.; White, Joanne C.; Coops, Nicholas C.; Hobart, Geordie W.
2017-12-01
The use of time series satellite data allows for the temporally dense, systematic, transparent, and synoptic capture of land dynamics over time. Subsequent to the opening of the Landsat archive, several time series approaches for characterizing landscape change have been developed, often representing a particular analytical time window. The information richness and widespread utility of these time series data have created a need to maintain the currency of time series information via the addition of new data, as it becomes available. When an existing time series is temporally extended, it is critical that previously generated change information remains consistent, thereby not altering reported change statistics or science outcomes based on that change information. In this research, we investigate the impacts and implications of adding additional years to an existing 29-year annual Landsat time series for forest change. To do so, we undertook a spatially explicit comparison of the 29 overlapping years of a time series representing 1984-2012, with a time series representing 1984-2016. Surface reflectance values, and presence, year, and type of change were compared. We found that the addition of years to extend the time series had minimal effect on the annual surface reflectance composites, with slight band-specific differences (r ≥ 0.1) in the final years of the original time series being updated. The area of stand replacing disturbances and determination of change year are virtually unchanged for the overlapping period between the two time-series products. Over the overlapping temporal period (1984-2012), the total area of change differs by 0.53%, equating to an annual difference in change area of 0.019%. Overall, the spatial and temporal agreement of the changes detected by both time series was 96%. Further, our findings suggest that the entire pre-existing historic time series does not need to be re-processed during the update process. Critically, given the time
Forecasting Cryptocurrencies Financial Time Series
DEFF Research Database (Denmark)
Catania, Leopoldo; Grassi, Stefano; Ravazzolo, Francesco
2018-01-01
This paper studies the predictability of cryptocurrencies time series. We compare several alternative univariate and multivariate models in point and density forecasting of four of the most capitalized series: Bitcoin, Litecoin, Ripple and Ethereum. We apply a set of crypto–predictors and rely...
Detecting macroeconomic phases in the Dow Jones Industrial Average time series
Wong, Jian Cheng; Lian, Heng; Cheong, Siew Ann
2009-11-01
In this paper, we perform statistical segmentation and clustering analysis of the Dow Jones Industrial Average (DJI) time series between January 1997 and August 2008. Modeling the index movements and log-index movements as stationary Gaussian processes, we find a total of 116 and 119 statistically stationary segments respectively. These can then be grouped into between five and seven clusters, each representing a different macroeconomic phase. The macroeconomic phases are distinguished primarily by their volatilities. We find that the US economy, as measured by the DJI, spends most of its time in a low-volatility phase and a high-volatility phase. The former can be roughly associated with economic expansion, while the latter contains the economic contraction phase in the standard economic cycle. Both phases are interrupted by a moderate-volatility market correction phase, but extremely-high-volatility market crashes are found mostly within the high-volatility phase. From the temporal distribution of various phases, we see a high-volatility phase from mid-1998 to mid-2003, and another starting mid-2007 (the current global financial crisis). Transitions from the low-volatility phase to the high-volatility phase are preceded by a series of precursor shocks, whereas the transition from the high-volatility phase to the low-volatility phase is preceded by a series of inverted shocks. The time scale for both types of transitions is about a year. We also identify the July 1997 Asian Financial Crisis to be the trigger for the mid-1998 transition, and an unnamed May 2006 market event related to corrections in the Chinese markets to be the trigger for the mid-2007 transition.
Time Series Analysis Forecasting and Control
Box, George E P; Reinsel, Gregory C
2011-01-01
A modernized new edition of one of the most trusted books on time series analysis. Since publication of the first edition in 1970, Time Series Analysis has served as one of the most influential and prominent works on the subject. This new edition maintains its balanced presentation of the tools for modeling and analyzing time series and also introduces the latest developments that have occurred in the field over the past decade through applications from areas such as business, finance, and engineering. The Fourth Edition provides a clearly written exploration of the key methods for building, cl
Bayesian dynamic modeling of time series of dengue disease case counts.
Martínez-Bello, Daniel Adyro; López-Quílez, Antonio; Torres-Prieto, Alexander
2017-07-01
The aim of this study is to model the association between weekly time series of dengue case counts and meteorological variables, in a high-incidence city of Colombia, applying Bayesian hierarchical dynamic generalized linear models over the period January 2008 to August 2015. Additionally, we evaluate the model's short-term performance for predicting dengue cases. The methodology shows dynamic Poisson log link models including constant or time-varying coefficients for the meteorological variables. Calendar effects were modeled using constant or first- or second-order random walk time-varying coefficients. The meteorological variables were modeled using constant coefficients and first-order random walk time-varying coefficients. We applied Markov Chain Monte Carlo simulations for parameter estimation, and deviance information criterion statistic (DIC) for model selection. We assessed the short-term predictive performance of the selected final model, at several time points within the study period using the mean absolute percentage error. The results showed the best model including first-order random walk time-varying coefficients for calendar trend and first-order random walk time-varying coefficients for the meteorological variables. Besides the computational challenges, interpreting the results implies a complete analysis of the time series of dengue with respect to the parameter estimates of the meteorological effects. We found small values of the mean absolute percentage errors at one or two weeks out-of-sample predictions for most prediction points, associated with low volatility periods in the dengue counts. We discuss the advantages and limitations of the dynamic Poisson models for studying the association between time series of dengue disease and meteorological variables. The key conclusion of the study is that dynamic Poisson models account for the dynamic nature of the variables involved in the modeling of time series of dengue disease, producing useful
R/S method for evaluation of pollutant time series in environmental quality assessment
Directory of Open Access Journals (Sweden)
Bu Quanmin
2008-12-01
The significance of the fluctuation and randomness of the time series of each pollutant in environmental quality assessment is described for the first time in this paper. A comparative study was made of three different computing methods: the same starting point method, the striding averaging method, and the stagger phase averaging method. All of them can be used to calculate the Hurst index, which quantifies fluctuation and randomness. This study used real water quality data from Shazhu monitoring station on Taihu Lake in Wuxi, Jiangsu Province. The results show that, of the three methods, the stagger phase averaging method is best for calculating the Hurst index of a pollutant time series from the perspective of statistical regularity.
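For orientation, a minimal sketch of a plain rescaled-range (R/S) estimate of the Hurst index is given below. It averages R/S over non-overlapping blocks of each window size and fits a log-log slope; this is a generic textbook variant and should not be read as any of the three computing methods compared in the paper, whose details are not given in the abstract. Function names and window sizes are assumptions.

```python
import math
from statistics import mean, pstdev


def rescaled_range(block):
    """R/S of one block: range of cumulative deviations over the block's sd."""
    m = mean(block)
    cum = lo = hi = 0.0
    for x in block:
        cum += x - m
        lo, hi = min(lo, cum), max(hi, cum)
    s = pstdev(block)
    return (hi - lo) / s if s > 0 else None   # undefined for constant blocks


def hurst_rs(series, window_sizes):
    """Slope of log(mean R/S) against log(window size), by least squares."""
    xs, ys = [], []
    for n in window_sizes:
        vals = [rescaled_range(series[i:i + n])
                for i in range(0, len(series) - n + 1, n)]
        vals = [v for v in vals if v is not None]
        if vals:
            xs.append(math.log(n))
            ys.append(math.log(mean(vals)))
    mx, my = mean(xs), mean(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
```

Values near 0.5 suggest an uncorrelated series, values toward 1 suggest persistence, and values toward 0 suggest anti-persistence; a steadily trending series gives an estimate close to 1.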
Costationarity of Locally Stationary Time Series Using costat
Cardinali, Alessandro; Nason, Guy P.
2013-01-01
This article describes the R package costat. This package enables a user to (i) perform a test for time series stationarity; (ii) compute and plot time-localized autocovariances, and (iii) to determine and explore any costationary relationship between two locally stationary time series. Two locally stationary time series are said to be costationary if there exist two time-varying combination functions such that the linear combination of the two series with the functions produces another time...
Empirical intrinsic geometry for nonlinear modeling and time series filtering.
Talmon, Ronen; Coifman, Ronald R
2013-07-30
In this paper, we present a method for time series analysis based on empirical intrinsic geometry (EIG). EIG enables one to reveal the low-dimensional parametric manifold as well as to infer the underlying dynamics of high-dimensional time series. By incorporating concepts of information geometry, this method extends existing geometric analysis tools to support stochastic settings and parametrizes the geometry of empirical distributions. However, the statistical models are not required as priors; hence, EIG may be applied to a wide range of real signals without existing definitive models. We show that the inferred model is noise-resilient and invariant under different observation and instrumental modalities. In addition, we show that it can be extended efficiently to newly acquired measurements in a sequential manner. These two advantages enable us to revisit the Bayesian approach and incorporate empirical dynamics and intrinsic geometry into a nonlinear filtering framework. We show applications to nonlinear and non-Gaussian tracking problems as well as to acoustic signal localization.
International Nuclear Information System (INIS)
Veronesi, F; Grassi, S
2016-01-01
Wind resource assessment is a key aspect of wind farm planning since it allows the long-term electricity production to be estimated. Moreover, wind speed time-series at high resolution are helpful to estimate the temporal changes of the electricity generation and indispensable to design stand-alone systems, which are affected by the mismatch of supply and demand. In this work, we present a new generalized statistical methodology to generate the spatial distribution of wind speed time-series, using Switzerland as a case study. This research is based upon a machine learning model and demonstrates that statistical wind resource assessment can successfully be used for estimating wind speed time-series. In fact, this method is able to obtain reliable wind speed estimates and propagate all the sources of uncertainty (from the measurements to the mapping process) in an efficient way, i.e. minimizing computational time and load. This allows not only an accurate estimation, but the creation of precise confidence intervals to map the stochasticity of the wind resource for a particular site. The validation shows that machine learning can minimize the bias of the wind speed hourly estimates. Moreover, for each mapped location this method delivers not only the mean wind speed, but also its confidence interval, which are crucial data for planners. (paper)
Veronesi, F.; Grassi, S.
2016-09-01
Wind resource assessment is a key aspect of wind farm planning since it allows the long-term electricity production to be estimated. Moreover, wind speed time-series at high resolution are helpful to estimate the temporal changes of the electricity generation and indispensable to design stand-alone systems, which are affected by the mismatch of supply and demand. In this work, we present a new generalized statistical methodology to generate the spatial distribution of wind speed time-series, using Switzerland as a case study. This research is based upon a machine learning model and demonstrates that statistical wind resource assessment can successfully be used for estimating wind speed time-series. In fact, this method is able to obtain reliable wind speed estimates and propagate all the sources of uncertainty (from the measurements to the mapping process) in an efficient way, i.e. minimizing computational time and load. This allows not only an accurate estimation, but the creation of precise confidence intervals to map the stochasticity of the wind resource for a particular site. The validation shows that machine learning can minimize the bias of the wind speed hourly estimates. Moreover, for each mapped location this method delivers not only the mean wind speed, but also its confidence interval, which are crucial data for planners.
Statistical inference for classification of RRIM clone series using near IR reflectance properties
Ismail, Faridatul Aima; Madzhi, Nina Korlina; Hashim, Hadzli; Abdullah, Noor Ezan; Khairuzzaman, Noor Aishah; Azmi, Azrie Faris Mohd; Sampian, Ahmad Faiz Mohd; Harun, Muhammad Hafiz
2015-08-01
RRIM clone is a rubber breeding series produced by RRIM (Rubber Research Institute of Malaysia) through a "rubber breeding program" to improve latex yield and produce clones attractive to farmers. The objective of this work is to analyse measurements from an optical sensing device on latex of selected clone series. The device transmits NIR light, and the measured reflectance is converted to a voltage. The reflectance index values obtained via voltage were analyzed using statistical techniques in order to determine the discrimination among the clones. From the statistical results using error plots and a one-way ANOVA test, there is overwhelming evidence of discrimination among the RRIM 2002, RRIM 2007 and RRIM 3001 clone series, with p value = 0.000. RRIM 2008 cannot be discriminated from RRIM 2014; however, both of these groups are distinct from the other clones.
First and second order Markov chain models for synthetic generation of wind speed time series
International Nuclear Information System (INIS)
Shamshad, A.; Bawadi, M.A.; Wan Hussin, W.M.A.; Majid, T.A.; Sanusi, S.A.M.
2005-01-01
Hourly wind speed time series data of two meteorological stations in Malaysia have been used for stochastic generation of wind speed data using the transition matrix approach of the Markov chain process. The transition probability matrices have been formed using two different approaches: the first approach involves the use of the first order transition probability matrix of a Markov chain, and the second involves the use of a second order transition probability matrix that uses the current and preceding values to describe the next wind speed value. The algorithm to generate the wind speed time series from the transition probability matrices is described. Uniform random number generators have been used for transition between successive time states and within state wind speed values. The ability of each approach to retain the statistical properties of the generated speed is compared with the observed ones. The main statistical properties used for this purpose are mean, standard deviation, median, percentiles, Weibull distribution parameters, autocorrelations and spectral density of wind speed values. The comparison of the observed wind speed and the synthetically generated ones shows that the statistical characteristics are satisfactorily preserved
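The first-order transition matrix approach described here can be sketched as follows. This is a minimal illustration under assumed choices (equal-width speed bins, uniform sampling within a state, hypothetical function names), not the authors' algorithm; the second-order variant would index the matrix by the pair of current and preceding states instead.

```python
import random


def to_states(speeds, width):
    """Map each wind speed to the index of its equal-width bin."""
    return [int(v // width) for v in speeds]


def transition_matrix(states, n_states):
    """Row-normalized counts of observed state-to-state transitions."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    probs = []
    for row in counts:
        total = sum(row)
        # Unvisited states get a uniform row (an arbitrary fallback).
        probs.append([c / total for c in row] if total
                     else [1.0 / n_states] * n_states)
    return probs


def draw_state(row, u):
    """Invert the cumulative transition probabilities at uniform draw u."""
    cum, last = 0.0, 0
    for idx, p in enumerate(row):
        if p > 0:
            last = idx
            cum += p
            if u < cum:
                return idx
    return last  # guards against rounding when cum ends slightly below 1


def generate(probs, width, n, start_state, rng):
    """Synthetic speeds: one uniform draw picks the next state,
    a second places the speed uniformly inside that state's bin."""
    state, out = start_state, []
    for _ in range(n):
        state = draw_state(probs[state], rng.random())
        out.append((state + rng.random()) * width)
    return out
```

A generated series can then be compared with the observed one on the statistics listed in the abstract, for example mean, standard deviation, percentiles, and autocorrelations.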
TIME SERIES ANALYSIS USING A UNIQUE MODEL OF TRANSFORMATION
Directory of Open Access Journals (Sweden)
Goran Klepac
2007-12-01
The REFII model is an authorial mathematical model for time series data mining. The main purpose of the model is to automate time series analysis through a unique transformation model of time series. An advantage of this approach is that it links different methods for time series analysis, connects traditional data mining tools to time series, and supports the construction of new algorithms for analyzing time series. It is worth mentioning that the REFII model is not a closed system, which means that we have a finite set of methods. First and foremost, it is a model for transforming the values of a time series, which prepares data used by different sets of methods based on the same model of transformation in a domain of problem space. The REFII model gives a new approach to time series analysis based on a unique model of transformation, which is a base for all kinds of time series analysis. A further advantage of the REFII model is its possible application in many different areas such as finance, medicine, voice recognition, face recognition and text mining.
Trend Change Detection in NDVI Time Series: Effects of Inter-Annual Variability and Methodology
Forkel, Matthias; Carvalhais, Nuno; Verbesselt, Jan; Mahecha, Miguel D.; Neigh, Christopher S.R.; Reichstein, Markus
2013-01-01
Changing trends in ecosystem productivity can be quantified using satellite observations of Normalized Difference Vegetation Index (NDVI). However, the estimation of trends from NDVI time series differs substantially depending on analyzed satellite dataset, the corresponding spatiotemporal resolution, and the applied statistical method. Here we compare the performance of a wide range of trend estimation methods and demonstrate that performance decreases with increasing inter-annual variability in the NDVI time series. Trend slope estimates based on annual aggregated time series or based on a seasonal-trend model show better performances than methods that remove the seasonal cycle of the time series. A breakpoint detection analysis reveals that an overestimation of breakpoints in NDVI trends can result in wrong or even opposite trend estimates. Based on our results, we give practical recommendations for the application of trend methods on long-term NDVI time series. Particularly, we apply and compare different methods on NDVI time series in Alaska, where both greening and browning trends have been previously observed. Here, the multi-method uncertainty of NDVI trends is quantified through the application of the different trend estimation methods. Our results indicate that greening NDVI trends in Alaska are more spatially and temporally prevalent than browning trends. We also show that detected breakpoints in NDVI trends tend to coincide with large fires. Overall, our analyses demonstrate that seasonal trend methods need to be improved against inter-annual variability to quantify changing trends in ecosystem productivity with higher accuracy.
Scale-dependent intrinsic entropies of complex time series.
Yeh, Jia-Rong; Peng, Chung-Kang; Huang, Norden E
2016-04-13
Multi-scale entropy (MSE) was developed as a measure of complexity for complex time series, and it has been applied widely in recent years. The MSE algorithm is based on the assumption that biological systems possess the ability to adapt and function in an ever-changing environment, and these systems need to operate across multiple temporal and spatial scales, such that their complexity is also multi-scale and hierarchical. Here, we present a systematic approach to apply the empirical mode decomposition algorithm, which can detrend time series on various time scales, prior to analysing a signal's complexity by measuring the irregularity of its dynamics on multiple time scales. Simulated time series of fractal Gaussian noise and human heartbeat time series were used to study the performance of this new approach. We show that our method can successfully quantify the fractal properties of the simulated time series and can accurately distinguish modulations in human heartbeat time series in health and disease. © 2016 The Author(s).
Elements of nonlinear time series analysis and forecasting
De Gooijer, Jan G
2017-01-01
This book provides an overview of the current state-of-the-art of nonlinear time series analysis, richly illustrated with examples, pseudocode algorithms and real-world applications. Avoiding a “theorem-proof” format, it shows concrete applications on a variety of empirical time series. The book can be used in graduate courses in nonlinear time series and at the same time also includes interesting material for more advanced readers. Though it is largely self-contained, readers require an understanding of basic linear time series concepts, Markov chains and Monte Carlo simulation methods. The book covers time-domain and frequency-domain methods for the analysis of both univariate and multivariate (vector) time series. It makes a clear distinction between parametric models on the one hand, and semi- and nonparametric models/methods on the other. This offers the reader the option of concentrating exclusively on one of these nonlinear time series analysis methods. To make the book as user friendly as possible...
An Energy-Based Similarity Measure for Time Series
Directory of Open Access Journals (Sweden)
Pierre Brunagel
2007-11-01
A new similarity measure, called SimilB, for time series analysis, based on the cross-ΨB-energy operator (2004), is introduced. ΨB is a nonlinear measure which quantifies the interaction between two time series. Compared to Euclidean distance (ED) or the Pearson correlation coefficient (CC), SimilB includes the temporal information and relative changes of the time series using the first and second derivatives of the time series. SimilB is well suited for both nonstationary and stationary time series and particularly those presenting discontinuities. Some new properties of ΨB are presented. Particularly, we show that ΨB as a similarity measure is robust to both scale and time shift. SimilB is illustrated with synthetic time series and an artificial dataset and compared to the CC and the ED measures.
O'Shaughnessy, Patrick; Cavanaugh, Joseph E
2015-01-01
Industrial hygienists now commonly use direct-reading instruments to evaluate hazards in the workplace. The stored values over time from these instruments constitute a time series of measurements that are often autocorrelated. Given the need to statistically compare two occupational scenarios using values from a direct-reading instrument, a t-test must consider measurement autocorrelation or the resulting test will have a largely inflated type-1 error probability (false rejection of the null hypothesis). A method is described for both the one-sample and two-sample cases which properly adjusts for autocorrelation. This method involves the computation of an "equivalent sample size" that effectively decreases the actual sample size when determining the standard error of the mean for the time series. An example is provided for the one-sample case, and an example is given where a two-sample t-test is conducted for two autocorrelated time series comprised of lognormally distributed measurements.
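The idea of an "equivalent sample size" can be illustrated with a short sketch. The abstract does not give the authors' formula, so the code below assumes the common first-order autoregressive approximation n_eff = n(1 - r1)/(1 + r1), where r1 is the lag-1 sample autocorrelation; the function names and the choice to ignore negative autocorrelation are likewise assumptions.

```python
import math


def lag1_autocorrelation(x):
    """Lag-1 sample autocorrelation of a series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def equivalent_sample_size(x):
    """AR(1) approximation n_eff = n * (1 - r1) / (1 + r1) (assumed formula)."""
    # Negative autocorrelation is clipped to zero so n_eff never exceeds n
    # (a conservative choice made for this sketch).
    r1 = max(lag1_autocorrelation(x), 0.0)
    return len(x) * (1.0 - r1) / (1.0 + r1)


def one_sample_t(x, mu0):
    """One-sample t statistic with the standard error based on n_eff."""
    n = len(x)
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / (n - 1)
    n_eff = equivalent_sample_size(x)
    standard_error = math.sqrt(variance / n_eff)
    # The statistic would be compared against a t distribution with
    # roughly n_eff - 1 degrees of freedom.
    return (mean - mu0) / standard_error, n_eff
```

Because n_eff is smaller than n for positively autocorrelated data, the standard error grows and the t statistic shrinks, which is what counters the inflated type-1 error described above.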
Detecting chaos in irregularly sampled time series.
Kulp, C W
2013-09-01
Recently, Wiebe and Virgin [Chaos 22, 013136 (2012)] developed an algorithm which detects chaos by analyzing a time series' power spectrum which is computed using the Discrete Fourier Transform (DFT). Their algorithm, like other time series characterization algorithms, requires that the time series be regularly sampled. Real-world data, however, are often irregularly sampled, thus, making the detection of chaotic behavior difficult or impossible with those methods. In this paper, a characterization algorithm is presented, which effectively detects chaos in irregularly sampled time series. The work presented here is a modification of Wiebe and Virgin's algorithm and uses the Lomb-Scargle Periodogram (LSP) to compute a series' power spectrum instead of the DFT. The DFT is not appropriate for irregularly sampled time series. However, the LSP is capable of computing the frequency content of irregularly sampled data. Furthermore, a new method of analyzing the power spectrum is developed, which can be useful for differentiating between chaotic and non-chaotic behavior. The new characterization algorithm is successfully applied to irregularly sampled data generated by a model as well as data consisting of observations of variable stars.
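To make the role of the Lomb-Scargle Periodogram concrete, the sketch below computes the classical (unnormalized) periodogram directly from sample times that need not be evenly spaced. It illustrates only the spectrum estimation step, not the chaos detection algorithm of the paper; the function name and the choice of angular frequencies as input are assumptions.

```python
import math


def lomb_scargle(times, values, angular_freqs):
    """Classical Lomb-Scargle power at each angular frequency (rad per time unit)."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]  # work with the mean-subtracted signal
    power = []
    for w in angular_freqs:
        # The time offset tau makes the sine and cosine terms orthogonal
        # on the actual (irregular) sample times.
        tau = math.atan2(
            sum(math.sin(2 * w * t) for t in times),
            sum(math.cos(2 * w * t) for t in times),
        ) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        cc = sum(ci * ci for ci in c)
        ss = sum(si * si for si in s)
        yc = sum(yi * ci for yi, ci in zip(y, c))
        ys = sum(yi * si for yi, si in zip(y, s))
        # Guard against a degenerate zero denominator.
        cos_term = yc * yc / cc if cc > 0 else 0.0
        sin_term = ys * ys / ss if ss > 0 else 0.0
        power.append(0.5 * (cos_term + sin_term))
    return power
```

For a sinusoid sampled at random times, the largest value of the returned list should fall at the grid frequency closest to the true angular frequency, provided the record is long enough to resolve it.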
Building Chaotic Model From Incomplete Time Series
Siek, Michael; Solomatine, Dimitri
2010-05-01
This paper presents a number of novel techniques for building a predictive chaotic model from incomplete time series. A predictive chaotic model is built by reconstructing the time-delayed phase space from observed time series, and the prediction is made by a global model or adaptive local models based on the dynamical neighbors found in the reconstructed phase space. In general, the building of any data-driven model depends on the completeness and quality of the data itself. However, the completeness of the data cannot always be guaranteed, since measurement or data transmission intermittently fails. We propose two main solutions for dealing with incomplete time series: imputing and non-imputing methods. For imputing methods, we used interpolation methods (weighted sum of linear interpolations, Bayesian principal component analysis and cubic spline interpolation) and predictive models (neural network, kernel machine, chaotic model) for estimating the missing values. After imputing the missing values, the phase space reconstruction and chaotic model prediction are executed as a standard procedure. For non-imputing methods, we reconstructed the time-delayed phase space from observed time series with missing values. This reconstruction results in non-continuous trajectories. However, the local model prediction can still be made from the other dynamical neighbors reconstructed from non-missing values. We implemented and tested these methods to construct a chaotic model for predicting storm surges at Hoek van Holland, the entrance of Rotterdam Port. The hourly surge time series is available for the period 1990-1996. For measuring the performance of the proposed methods, a synthetic time series with missing values generated by a particular random variable to the original (complete) time series is utilized. There exist two main performance measures used in this work: (1) error measures between the actual
Directory of Open Access Journals (Sweden)
Xudong Guan
2016-01-01
Normalized Difference Vegetation Index (NDVI) derived from Moderate Resolution Imaging Spectroradiometer (MODIS) time-series data has been widely used in the fields of crop and rice classification. The cloudy and rainy weather characteristics of the monsoon season greatly reduce the likelihood of obtaining high-quality optical remote sensing images. In addition, the diverse crop-planting system in Vietnam also hinders the comparison of NDVI among different crop stages. To address these problems, we apply a Dynamic Time Warping (DTW) distance-based similarity measure approach and use the entire yearly NDVI time series to reduce the inaccuracy of classification using a single image. We first de-noise the NDVI time series using Savitzky-Golay (S-G) filtering based on the TIMESAT software. Then, a standard NDVI time-series base for rice growth is established based on field survey data and Google Earth sample data. NDVI time-series data for each pixel are constructed and the DTW distance with the standard rice growth NDVI time series is calculated. Then, we apply thresholds to extract rice growth areas. A qualitative assessment using statistical data and a spatial assessment using sampled data from the rice-cropping map reveal a high mapping accuracy at the national scale between the statistical data, with the corresponding R2 being as high as 0.809; however, the mapped rice accuracy decreased at the provincial scale due to the reduced number of rice planting areas per province. An analysis of the results indicates that the 500-m resolution MODIS data are limited in terms of mapping scattered rice parcels. The results demonstrate that the DTW-based similarity measure of the NDVI time series can be effectively used to map large-area rice cropping systems with diverse cultivation processes.
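The per-pixel DTW comparison and thresholding step can be sketched as below. This is the textbook dynamic-programming form of DTW with an absolute-difference local cost, not necessarily the variant used in the paper; the function names, the reference profile and the threshold value in the usage note are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j - 1] is reused
                cost[i][j - 1],      # b advances, a[i - 1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify_pixels(pixel_series, reference, threshold):
    """Flag each pixel whose series lies within `threshold` of the reference."""
    return [dtw_distance(series, reference) <= threshold for series in pixel_series]
```

For example, with a hypothetical reference curve `[0.2, 0.5, 0.8, 0.5, 0.2]`, a pixel series that follows the same shape one step later still has distance zero, which is why DTW tolerates the shifted planting dates mentioned above, whereas a flat series of `0.1` values is rejected at a threshold of `0.5`.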
Multivariate Time Series Search
National Aeronautics and Space Administration — Multivariate Time-Series (MTS) are ubiquitous, and are generated in areas as disparate as sensor recordings in aerospace systems, music and video streams, medical...
National Research Council Canada - National Science Library
Adler, Robert
1997-01-01
We describe how to take a stable, ARMA, time series through the various stages of model identification, parameter estimation, and diagnostic checking, and accompany the discussion with a goodly number...
Year Ahead Demand Forecast of City Natural Gas Using Seasonal Time Series Methods
Directory of Open Access Journals (Sweden)
Mustafa Akpinar
2016-09-01
Consumption of natural gas, a major clean energy source, increases as energy demand increases. We studied specifically the Turkish natural gas market. Turkey’s natural gas consumption has increased in parallel with the world’s over the last decade. This consumption growth in Turkey has led to the formation of a market structure for the natural gas industry. This significant increase requires additional investments since a rise in consumption capacity is expected. One of the reasons for the consumption increase is the user-based natural gas consumption influence. This effect yields imbalances in demand forecasts and if the error rates are out of bounds, penalties may occur. In this paper, three univariate statistical methods, which have not been previously investigated for mid-term year-ahead monthly natural gas forecasting, are used to forecast natural gas demand in Turkey’s Sakarya province. Residential and low-consumption commercial data are used, which may contain seasonality. The goal of this paper is minimizing more or less gas tractions on mid-term consumption while improving the accuracy of demand forecasting. In forecasting models, seasonality and single variable impacts reinforce forecasts. This paper studies time series decomposition, Holt-Winters exponential smoothing and autoregressive integrated moving average (ARIMA) methods. Here, 2011–2014 monthly data were prepared and divided into two series. The first series is 2011–2013 monthly data used for finding seasonal effects and model requirements. The second series is 2014 monthly data used for forecasting. For the ARIMA method, a stationary series was prepared and a transformation process was applied prior to forecasting. Forecasting results confirmed that as the computational complexity of the model increases, forecasting accuracy increases with lower error rates. Also, forecasting errors and the coefficients of determination values give more consistent results. Consequently
Chattopadhyay, Surajit; Chattopadhyay, Goutami
The present paper reports studies on the association between the mean annual sunspot numbers and the summer monsoon rainfall over India. The cross correlations have been studied. After Box-Cox transformation, spectral analysis was carried out, and it was found that both time series have an important spectral component at the fifth harmonic. An artificial neural network (ANN) model has been developed on the data series averaged continuously by five years, and the neural network could establish a predictor-predictand relationship between the sunspot numbers and the mean yearly summer monsoon rainfall over India.
Autocorrelation and cross-correlation in time series of homicide and attempted homicide
Machado Filho, A.; da Silva, M. F.; Zebende, G. F.
2014-04-01
We propose in this paper to establish the relationship between homicides and attempted homicides by a non-stationary time-series analysis. This analysis will be carried out by Detrended Fluctuation Analysis (DFA), Detrended Cross-Correlation Analysis (DCCA), and DCCA cross-correlation coefficient, ρ(n). Through this analysis we can identify a positive cross-correlation between homicides and attempted homicides. At the same time, looked at from the point of view of autocorrelation (DFA), this analysis can be more informative depending on time scale. For short scale (days), we cannot identify auto-correlations, on the scale of weeks DFA presents anti-persistent behavior, and for long time scales (n>90 days) DFA presents a persistent behavior. Finally, the application of this new type of statistical analysis proved to be efficient and, in this sense, this paper can contribute to a more accurate descriptive statistics of crime.
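The DCCA cross-correlation coefficient ρ(n) is the detrended covariance of the two integrated series divided by the product of their DFA fluctuation functions at box size n. The sketch below is a simplified illustration of that ratio: it uses non-overlapping boxes and a straight-line fit in each box, which are assumptions made here for brevity, and the function names are hypothetical.

```python
import math


def profile(x):
    """Integrated (cumulative-sum) series of the mean-subtracted data."""
    mean = sum(x) / len(x)
    total, out = 0.0, []
    for v in x:
        total += v - mean
        out.append(total)
    return out


def detrend(segment):
    """Residuals of a least-squares straight line fitted to one box."""
    n = len(segment)
    t_mean = (n - 1) / 2
    y_mean = sum(segment) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(segment)]


def rho_dcca(x, y, n):
    """DCCA coefficient at box size n (n >= 3), using non-overlapping boxes."""
    px, py = profile(x), profile(y)
    f_xy = f_xx = f_yy = 0.0
    for start in range(0, len(px) - n + 1, n):
        rx = detrend(px[start:start + n])
        ry = detrend(py[start:start + n])
        f_xy += sum(a * b for a, b in zip(rx, ry))  # detrended covariance
        f_xx += sum(a * a for a in rx)              # DFA variance of x
        f_yy += sum(b * b for b in ry)              # DFA variance of y
    # Common normalization factors cancel in the ratio.
    return f_xy / math.sqrt(f_xx * f_yy)
```

The coefficient lies between -1 and 1; evaluating it over a range of box sizes n gives the scale dependence that the abstract reports for days, weeks and longer horizons.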
Yan, Ying; Zhang, Shen; Tang, Jinjun; Wang, Xiaofei
2017-07-01
Discovering dynamic characteristics in traffic flow is a significant step in designing effective traffic management and control strategies for relieving traffic congestion in urban areas. A new method based on complex network theory is proposed to study multivariate traffic flow time series. The data were collected from loop detectors on a freeway over one year. In order to construct a complex network from the original traffic flow, a weighted Frobenius norm is adopted to estimate the similarity between multivariate time series, and Principal Component Analysis is implemented to determine the weights. We discuss how to select the optimal critical threshold for networks at different hours in terms of the cumulative probability distribution of degree. Furthermore, two statistical properties of networks, normalized network structure entropy and cumulative probability of degree, are utilized to explore hourly variation in traffic flow. The results demonstrate that these two statistical quantities show a pattern similar to that of the traffic flow parameters, with morning and evening peak hours. Accordingly, we detect three traffic states: trough, peak and transitional hours, according to the correlation between the two aforementioned properties. The classification of states can represent hourly fluctuation in traffic flow, as confirmed by analyzing annual average hourly values of traffic volume, occupancy and speed in the corresponding hours.
Time Series Observations in the North Indian Ocean
Digital Repository Service at National Institute of Oceanography (India)
Shenoy, D.M.; Naik, H.; Kurian, S.; Naqvi, S.W.A.; Khare, N.
Ocean and the ongoing time series study (Candolim Time Series; CaTS) off Goa. In addition, this article also focuses on the new time series initiative in the Arabian Sea and the Bay of Bengal under Sustained Indian Ocean Biogeochemistry and Ecosystem...
Time series analysis of soil Radon-222 recorded at Kutch region, Gujarat, India
International Nuclear Information System (INIS)
Madhusudan Rao, K.; Rastogi, B.K.; Barman, Chiranjib; Chaudhuri, Hirok
2013-01-01
The Kutch region in Gujarat lies in a seismically vulnerable zone (seismic zone V). After the devastating Bhuj earthquake (7.7M) of January 26, 2001 in the Kutch region, several researchers focused their attention on monitoring geophysical and geochemical precursors for earthquakes in the region. In order to find possible geochemical precursory signals for earthquake events, we monitored the radioactive gas radon-222 in subsurface soil gas in the Kutch region. We have analysed the recorded soil radon-222 time series by means of nonlinear techniques such as FFT power spectral analysis, empirical mode decomposition and multi-fractal analysis, along with other linear statistical methods. Some results obtained from the nonlinear analysis of this time series are discussed in the present paper. The analysis helped us to recognize the nature and pattern of the soil radon-222 emanation process. Moreover, the recording and the statistical and nonlinear analysis of soil radon data in the Kutch region will assist us in understanding the preparation phase of an imminent seismic event in the region. (author)
Studies in Astronomical Time Series Analysis. VI. Bayesian Block Representations
Scargle, Jeffrey D.; Norris, Jay P.; Jackson, Brad; Chiang, James
2013-01-01
This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it, an improved and generalized version of Bayesian Blocks [Scargle 1998], that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piecewise linear and piecewise exponential representations, multivariate time series data, analysis of variance, data on the circle, other data modes, and dispersed data. Simulations provide evidence that the detection efficiency for weak signals is close to a theoretical asymptotic limit derived by [Arias-Castro, Donoho and Huo 2003]. In the spirit of Reproducible Research [Donoho et al. (2008)] all of the code and data necessary to reproduce all of the figures in this paper are included as auxiliary material.
STUDIES IN ASTRONOMICAL TIME SERIES ANALYSIS. VI. BAYESIAN BLOCK REPRESENTATIONS
Energy Technology Data Exchange (ETDEWEB)
Scargle, Jeffrey D. [Space Science and Astrobiology Division, MS 245-3, NASA Ames Research Center, Moffett Field, CA 94035-1000 (United States); Norris, Jay P. [Physics Department, Boise State University, 2110 University Drive, Boise, ID 83725-1570 (United States); Jackson, Brad [The Center for Applied Mathematics and Computer Science, Department of Mathematics, San Jose State University, One Washington Square, MH 308, San Jose, CA 95192-0103 (United States); Chiang, James, E-mail: jeffrey.d.scargle@nasa.gov [W. W. Hansen Experimental Physics Laboratory, Kavli Institute for Particle Astrophysics and Cosmology, Department of Physics and SLAC National Accelerator Laboratory, Stanford University, Stanford, CA 94305 (United States)
2013-02-20
This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it, an improved and generalized version of Bayesian Blocks, that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piecewise linear and piecewise exponential representations, multivariate time series data, analysis of variance, data on the circle, other data modes, and dispersed data. Simulations provide evidence that the detection efficiency for weak signals is close to a theoretical asymptotic limit derived by Arias-Castro et al. In the spirit of Reproducible Research all of the code and data necessary to reproduce all of the figures in this paper are included as supplementary material.
STUDIES IN ASTRONOMICAL TIME SERIES ANALYSIS. VI. BAYESIAN BLOCK REPRESENTATIONS
International Nuclear Information System (INIS)
Scargle, Jeffrey D.; Norris, Jay P.; Jackson, Brad; Chiang, James
2013-01-01
This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it—an improved and generalized version of Bayesian Blocks—that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piecewise linear and piecewise exponential representations, multivariate time series data, analysis of variance, data on the circle, other data modes, and dispersed data. Simulations provide evidence that the detection efficiency for weak signals is close to a theoretical asymptotic limit derived by Arias-Castro et al. In the spirit of Reproducible Research all of the code and data necessary to reproduce all of the figures in this paper are included as supplementary material.
Zhu, Zhe
2017-08-01
The free and open access to all archived Landsat images in 2008 has completely changed the way Landsat data are used. Many novel change detection algorithms based on Landsat time series have been developed. We present a comprehensive review of four important aspects of change detection studies based on Landsat time series, including frequencies, preprocessing, algorithms, and applications. We observed the trend that the more recent the study, the higher the frequency of Landsat time series used. We reviewed a series of image preprocessing steps, including atmospheric correction, cloud and cloud shadow detection, and composite/fusion/metrics techniques. We divided all change detection algorithms into six categories, including thresholding, differencing, segmentation, trajectory classification, statistical boundary, and regression. Within each category, six major characteristics of different algorithms, such as frequency, change index, univariate/multivariate, online/offline, abrupt/gradual change, and sub-pixel/pixel/spatial were analyzed. Moreover, some of the widely-used change detection algorithms were also discussed. Finally, we reviewed different change detection applications by dividing these applications into two categories, change target and change agent detection.
UPPAAL-SMC: Statistical Model Checking for Priced Timed Automata
DEFF Research Database (Denmark)
Bulychev, Petr; David, Alexandre; Larsen, Kim Guldstrand
2012-01-01
on a series of extensions of the statistical model checking approach generalized to handle real-time systems and estimate undecidable problems. UPPAAL-SMC comes together with a friendly user interface that allows a user to specify complex problems in an efficient manner as well as to get feedback...... in the form of probability distributions and compare probabilities to analyze performance aspects of systems. The focus of the survey is on the evolution of the tool – including modeling and specification formalisms as well as techniques applied – together with applications of the tool to case studies....
Geometric noise reduction for multivariate time series.
Mera, M Eugenia; Morán, Manuel
2006-03-01
We propose an algorithm for the reduction of observational noise in chaotic multivariate time series. The algorithm is based on a maximum likelihood criterion, and its goal is to reduce the mean distance of the points of the cleaned time series to the attractor. We give evidence of the convergence of the empirical measure associated with the cleaned time series to the underlying invariant measure, implying the possibility to predict the long run behavior of the true dynamics.
BRITS: Bidirectional Recurrent Imputation for Time Series
Cao, Wei; Wang, Dong; Li, Jian; Zhou, Hao; Li, Lei; Li, Yitan
2018-01-01
Time series are widely used as signals in many classification/regression tasks. It is ubiquitous that time series contains many missing values. Given multiple correlated time series data, how to fill in missing values and to predict their class labels? Existing imputation methods often impose strong assumptions of the underlying data generating process, such as linear dynamics in the state space. In this paper, we propose BRITS, a novel method based on recurrent neural networks for missing va...
Efficient Algorithms for Segmentation of Item-Set Time Series
Chundi, Parvathi; Rosenkrantz, Daniel J.
We propose a special type of time series, which we call an item-set time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an item-set time series, each observed data value is a set of discrete items. We formalize the concept of an item-set time series and present efficient algorithms for segmenting a given item-set time series. Segmentation of a time series partitions the time series into a sequence of segments where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic programming based scheme to construct an optimal segmentation of the given item-set time series. We use the item-set time series segmentation techniques to analyze the temporal content of three different data sets—Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of item-set time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.
Global Population Density Grid Time Series Estimates
National Aeronautics and Space Administration — Global Population Density Grid Time Series Estimates provide a back-cast time series of population density grids based on the year 2000 population grid from SEDAC's...
Prediction and Geometry of Chaotic Time Series
National Research Council Canada - National Science Library
Leonardi, Mary
1997-01-01
This thesis examines the topic of chaotic time series. An overview of chaos, dynamical systems, and traditional approaches to time series analysis is provided, followed by an examination of state space reconstruction...
Forootan, Ehsan; Kusche, Jürgen
2016-04-01
Geodetic/geophysical observations, such as the time series of global terrestrial water storage change or sea level and temperature change, represent samples of physical processes and therefore contain information about complex physical interactions with many inherent time scales. Extracting relevant information from these samples, for example quantifying the seasonality of a physical process or its variability due to large-scale ocean-atmosphere interactions, is not possible using simple time series approaches. In the last decades, decomposition techniques have found increasing interest for extracting patterns from geophysical observations. Traditionally, principal component analysis (PCA) and more recently independent component analysis (ICA) are common techniques to extract statistically orthogonal (uncorrelated) and independent modes that represent the maximum variance of observations, respectively. PCA and ICA can be classified as stationary signal decomposition techniques since they are based on decomposing the auto-covariance matrix or diagonalizing higher (than two)-order statistical tensors from centered time series. However, the stationarity assumption is obviously not justifiable for many geophysical and climate variables even after removing cyclic components, e.g., the seasonal cycles. In this paper, we present a new decomposition method, the complex independent component analysis (CICA, Forootan, PhD-2014), which can be applied to extract non-stationary (changing in space and time) patterns from geophysical time series. Here, CICA is derived as an extension of real-valued ICA (Forootan and Kusche, JoG-2012), where we (i) define a new complex data set using a Hilbert transformation. The complex time series contain the observed values in their real part, and the temporal rate of variability in their imaginary part. (ii) An ICA algorithm based on diagonalization of fourth-order cumulants is then applied to decompose the new complex data set in (i
International Nuclear Information System (INIS)
Jafri, Y.Z.; Kamal, L.
2007-01-01
Various statistical techniques were used on five-year data (1998-2002) of average humidity, rainfall, and maximum and minimum temperatures. Relationships for regression analysis time series (RATS) were developed to determine the overall trend of these climate parameters, on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit for our polynomial regression analysis time series (PRATS). Correlations for multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, and in both cases indicated homoscedasticity. We also employed Bartlett's test for homogeneity of variances on the five-year data of rainfall and humidity, which showed that the variances in the rainfall data were not homogeneous, while those in the humidity data were homogeneous. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)
Sensor-Generated Time Series Events: A Definition Language
Anguera, Aurea; Lara, Juan A.; Lizcano, David; Martínez, Maria Aurora; Pazos, Juan
2012-01-01
There are now a great many domains where information is recorded by sensors over a limited time period or on a permanent basis. This data flow leads to sequences of data known as time series. In many domains, like seismography or medicine, time series analysis focuses on particular regions of interest, known as events, whereas the remainder of the time series contains hardly any useful information. In these domains, there is a need for mechanisms to identify and locate such events. In this paper, we propose an events definition language that is general enough to be used to easily and naturally define events in time series recorded by sensors in any domain. The proposed language has been applied to the definition of time series events generated within the branch of medicine dealing with balance-related functions in human beings. A device called a posturograph is used to study balance-related functions. The platform has four sensors that record the pressure intensity being exerted on the platform, generating four interrelated time series. As opposed to the existing ad hoc proposals, the results confirm that the proposed language is valid, that is, generally applicable and accurate, for identifying the events contained in the time series.
Energy Technology Data Exchange (ETDEWEB)
Gallego, C. J.
2010-03-08
Abstract: This technical report is focused on the analysis of stochastic processes that switch between different dynamics (also called regimes or mechanisms) over time. The so-called switching-regime models consider several underlying functions instead of one. In this case, a classification problem arises, as the current regime has to be assessed at each time-step. The identification of the regimes allows regime-switching models to be used for short-term forecasting purposes. Within this framework, identifying the different regimes shown by time series is the aim of this work. The proposed approach is based on a statistical tool called the Gamma-test. One of the main advantages of this methodology is that it does not require a mathematical definition of the different underlying functions. Applications with both simulated and real wind power data have been considered. Results on simulated time series show that regimes can be successfully identified under certain hypotheses. Nevertheless, this work highlights that further research has to be done when considering real wind power time series, which usually show different behaviours (e.g. fluctuations or ramps, followed by low variance periods). A better understanding of these events will eventually improve wind power forecasting. (Author) 15 refs.
Gender inequality and economic growth: a time series analysis for Pakistan
Pervaiz, Zahid; Chani, Muhammad Irfan; Jan, Sajjad Ahmad; Chaudhary, Amatul R.
2011-01-01
This paper attempts to analyze the impact of gender inequality on economic growth of Pakistan. An annual time series data for the period of 1972-2009 has been used in this study. We have regressed growth rate of real gross domestic product (GDP) per capita on labour force growth, investment, trade openness and a composite index of gender inequality. The results reveal that labour force growth, investment and trade openness have statistically significant and positive impact whereas gender ineq...
Time Series Forecasting with Missing Values
Directory of Open Access Journals (Sweden)
Shin-Fu Wu
2015-11-01
Time series prediction has become more popular in various kinds of applications such as weather prediction, control engineering, financial analysis, industrial monitoring, etc. To deal with real-world problems, we are often faced with missing values in the data due to sensor malfunctions or human errors. Traditionally, the missing values are simply omitted or replaced by means of imputation methods. However, omitting those missing values may cause temporal discontinuity. Imputation methods, on the other hand, may alter the original time series. In this study, we propose a novel forecasting method based on the least squares support vector machine (LSSVM). We employ input patterns with temporal information, which is defined as the local time index (LTI). Time series data as well as local time indexes are fed to the LSSVM for forecasting without imputation. We compare the forecasting performance of our method with other imputation methods. Experimental results show that the proposed method is promising and is worth further investigation.
Carleton, W Christopher; Campbell, David; Collard, Mark
2018-01-01
Statistical time-series analysis has the potential to improve our understanding of human-environment interaction in deep time. However, radiocarbon dating (the most common chronometric technique in archaeological and palaeoenvironmental research) creates challenges for established statistical methods. The methods assume that observations in a time-series are precisely dated, but this assumption is often violated when calibrated radiocarbon dates are used because they usually have highly irregular uncertainties. As a result, it is unclear whether the methods can be reliably used on radiocarbon-dated time-series. With this in mind, we conducted a large simulation study to investigate the impact of chronological uncertainty on a potentially useful time-series method. The method is a type of regression involving a prediction algorithm called the Poisson Exponentially Weighted Moving Average (PEWMA). It is designed for use with count time-series data, which makes it applicable to a wide range of questions about human-environment interaction in deep time. Our simulations suggest that the PEWMA method can often correctly identify relationships between time-series despite chronological uncertainty. When two time-series are correlated with a coefficient of 0.25, the method is able to identify that relationship correctly 20-30% of the time, provided the time-series contain low noise levels. With correlations of around 0.5, it is capable of correctly identifying correlations despite chronological uncertainty more than 90% of the time. While further testing is desirable, these findings indicate that the method can be used to test hypotheses about long-term human-environment interaction with a reasonable degree of confidence.
The analysis of time series: an introduction
National Research Council Canada - National Science Library
Chatfield, Christopher
1989-01-01
.... A variety of practical examples are given to support the theory. The book covers a wide range of time-series topics, including probability models for time series, Box-Jenkins forecasting, spectral analysis, linear systems and system identification...
Remote-Sensing Time Series Analysis, a Vegetation Monitoring Tool
McKellip, Rodney; Prados, Donald; Ryan, Robert; Ross, Kenton; Spruce, Joseph; Gasser, Gerald; Greer, Randall
2008-01-01
The Time Series Product Tool (TSPT) is software, developed in MATLAB, which creates and displays high signal-to-noise Vegetation Indices imagery and other higher-level products derived from remotely sensed data. This tool enables automated, rapid, large-scale regional surveillance of crops, forests, and other vegetation. TSPT temporally processes high-revisit-rate satellite imagery produced by the Moderate Resolution Imaging Spectroradiometer (MODIS) and by other remote-sensing systems. Although MODIS imagery is acquired daily, cloudiness and other sources of noise can greatly reduce the effective temporal resolution. To improve cloud statistics, the TSPT combines MODIS data from multiple satellites (Aqua and Terra). The TSPT produces MODIS products as single time-frame and multitemporal change images, as time-series plots at a selected location, or as temporally processed image videos. Using the TSPT program, MODIS metadata is used to remove and/or correct bad and suspect data. Bad pixel removal, multiple satellite data fusion, and temporal processing techniques create high-quality plots and animated image video sequences that depict changes in vegetation greenness. This tool provides several temporal processing options not found in other comparable imaging software tools. Because the framework to generate and use other algorithms is established, small modifications to this tool will enable the use of a large range of remotely sensed data types. An effective remote-sensing crop monitoring system must be able to detect subtle changes in plant health in the earliest stages, before the effects of a disease outbreak or other adverse environmental conditions can become widespread and devastating. The integration of the time series analysis tool with ground-based information, soil types, crop types, meteorological data, and crop growth models in a Geographic Information System could provide the foundation for a large-area crop-surveillance system that could identify
Effectiveness of Multivariate Time Series Classification Using Shapelets
Directory of Open Access Journals (Sweden)
A. P. Karpenko
2015-01-01
Typically, time series classifiers require signal pre-processing (filtering signals from noise, artifact removal, etc.), enhancement of signal features (amplitude, frequency, spectrum, etc.), and classification of signal features in space using the classical techniques and classification algorithms for multivariate data. We consider a method of classifying time series that does not require enhancement of the signal features. The method uses the shapelets of time series (time series shapelets), i.e. small fragments of the series that best reflect the properties of one of its classes. Despite the significant number of publications on shapelet theory and applications for the classification of time series, the task of evaluating the effectiveness of this technique remains relevant. The objective of this publication is to study the effectiveness of a number of modifications of the original shapelet method as applied to multivariate series classification, which is a little-studied problem. The paper presents the problem statement of multivariate time series classification using shapelets and describes the basic shapelet-based method of binary classification, as well as various generalizations and a proposed modification of the method. It also offers software that implements the modified method and results of computational experiments confirming the effectiveness of the algorithmic and software solutions. The paper shows that the modified method and the software implementing it allow us to reach a classification accuracy of about 85%, at best. The shapelet search time increases in proportion to the input data dimension.
Interglacial climate dynamics and advanced time series analysis
Mudelsee, Manfred; Bermejo, Miguel; Köhler, Peter; Lohmann, Gerrit
2013-04-01
Studying the climate dynamics of past interglacials (IGs) helps to better assess the anthropogenically influenced dynamics of the current IG, the Holocene. We select the IG portions from the EPICA Dome C ice core archive, which covers the past 800 ka, to apply methods of statistical time series analysis (Mudelsee 2010). The analysed variables are deuterium/H (indicating temperature) (Jouzel et al. 2007), greenhouse gases (Siegenthaler et al. 2005, Loulergue et al. 2008, Lüthi et al. 2008) and a model-co-derived climate radiative forcing (Köhler et al. 2010). We additionally select high-resolution sea-surface-temperature records from the marine sedimentary archive. The first statistical method, persistence time estimation (Mudelsee 2002), lets us infer the 'climate memory' property of IGs. Second, linear regression informs about long-term climate trends during IGs. Third, ramp function regression (Mudelsee 2000) is adapted to look at abrupt climate changes during IGs. We compare the Holocene with previous IGs in terms of these mathematical approaches, interpret results in a climate context, and assess uncertainties and the requirements on data from old IGs for yielding results of 'acceptable' accuracy. This work receives financial support from the Deutsche Forschungsgemeinschaft (Project ClimSens within the DFG Research Priority Program INTERDYNAMIK) and the European Commission (Marie Curie Initial Training Network LINC, No. 289447, within the 7th Framework Programme). References Jouzel J, Masson-Delmotte V, Cattani O, Dreyfus G, Falourd S, Hoffmann G, Minster B, Nouet J, Barnola JM, Chappellaz J, Fischer H, Gallet JC, Johnsen S, Leuenberger M, Loulergue L, Luethi D, Oerter H, Parrenin F, Raisbeck G, Raynaud D, Schilt A, Schwander J, Selmo E, Souchez R, Spahni R, Stauffer B, Steffensen JP, Stenni B, Stocker TF, Tison JL, Werner M, Wolff EW (2007) Orbital and millennial Antarctic climate variability over the past 800,000 years. Science 317:793. Köhler P, Bintanja R
Wavelet based correlation coefficient of time series of Saudi Meteorological Data
International Nuclear Information System (INIS)
Rehman, S.; Siddiqi, A.H.
2009-01-01
In this paper, wavelet concepts are used to study the correlation between pairs of time series of meteorological parameters such as pressure, temperature, rainfall, relative humidity and wind speed. The study utilized the daily average values of meteorological parameters from nine meteorological stations in Saudi Arabia located at different strategic locations. The data used in this study cover a period of 16 years between 1990 and 2005. Besides obtaining wavelet spectra, we also computed the wavelet correlation coefficients for the same parameter at two different locations and show that strong correlation or strong anti-correlation depends on scale. The cross-correlation coefficients of meteorological parameters between two stations were also calculated using a statistical function. For coastal-to-coastal pairs of stations, the pressure time series were found to be strongly correlated. In general, the temperature data were found to be strongly correlated for all pairs of stations and the rainfall data the least.
Energy Technology Data Exchange (ETDEWEB)
Pardo-Iguzquiza, E.; Rodriguez-Tovar, F. J.
2013-06-01
In geosciences the sampling of a time series tends to afford uneven results, sometimes because the sampling itself is random or because of hiatuses or even completely missing data or due to difficulties involved in the conversion of data from a spatial to a time scale when the sedimentation rate was not constant. Whatever the case, the best solution does not lie in interpolation but rather in resorting to a method that deals with the irregular data. We show here how the use of the smoothed Lomb-Scargle periodogram is both a practical and efficient choice. We describe the effects on the estimated power spectrum of the type of irregular sampling, the number of data, interpolation, and the presence of drift. We propose the permutation test as being an efficient way of calculating statistical confidence levels. By applying the Lomb-Scargle periodogram to a synthetic series with a known spectral content we are able to confirm the validity of this method in the face of the difficulties mentioned above. A case study with real data, including hiatuses, representing the thickness of the annual banding in a stalagmite, is chosen to demonstrate an application using the statistical and physical interpretation of spectral peaks. (Author)
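As an illustration of the approach summarized in the abstract above, the following is a minimal sketch of the classical Lomb-Scargle periodogram for unevenly sampled data, together with a permutation estimate of a confidence level for the highest spectral peak. It is not the authors' implementation: the smoothing step they describe is omitted, the function names `lomb_scargle` and `permutation_level`, the synthetic series, and all parameter values are assumptions chosen for the example.

```python
import math
import random


def lomb_scargle(times, values, freqs):
    """Classical (unsmoothed) Lomb-Scargle power at each frequency in freqs.

    freqs are in cycles per time unit and must be strictly positive.
    Power is normalised by twice the sample variance of the data.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    power = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # The offset tau makes the sine and cosine terms orthogonal
        # on the actual (uneven) sampling times.
        s2 = sum(math.sin(2.0 * w * t) for t in times)
        c2 = sum(math.cos(2.0 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * w)
        cc = ss = yc = ys = 0.0
        for t, v in zip(times, values):
            c = math.cos(w * (t - tau))
            s = math.sin(w * (t - tau))
            cc += c * c
            ss += s * s
            yc += (v - mean) * c
            ys += (v - mean) * s
        power.append((yc * yc / cc + ys * ys / ss) / (2.0 * var))
    return power


def permutation_level(times, values, freqs, n_perm=200, quantile=0.95, seed=0):
    """Quantile of the highest peak obtained after shuffling the values.

    Shuffling keeps the sampling times and the value distribution but
    destroys any periodicity, so the result approximates the peak height
    expected from noise alone.
    """
    rng = random.Random(seed)
    shuffled = list(values)
    peaks = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        peaks.append(max(lomb_scargle(times, shuffled, freqs)))
    peaks.sort()
    return peaks[int(quantile * (n_perm - 1))]


# Synthetic test series (assumed for illustration): a sinusoid with
# frequency 0.1 sampled at 120 random times, plus Gaussian noise.
rng = random.Random(1)
times = sorted(rng.uniform(0.0, 100.0) for _ in range(120))
values = [math.sin(2.0 * math.pi * 0.1 * t) + rng.gauss(0.0, 0.5) for t in times]
freqs = [0.01 * k for k in range(1, 51)]

power = lomb_scargle(times, values, freqs)
peak_freq = freqs[power.index(max(power))]
level = permutation_level(times, values, freqs, n_perm=50)
print("peak frequency:", peak_freq, "significant:", max(power) > level)
```

No interpolation onto a regular grid is needed: the sums run directly over the irregular sampling times, which is the property the abstract emphasizes.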
Estimation of Hurst Exponent for the Financial Time Series
Kumar, J.; Manchanda, P.
2009-07-01
Until recently, statistical methods and Fourier analysis were employed to study fluctuations in stock markets in general and the Indian stock market in particular. However, the current trend is to apply the concepts of wavelet methodology and the Hurst exponent; see for example the work of Manchanda, J. Kumar and Siddiqi, Journal of the Franklin Institute 144 (2007), 613-636 and the paper of Cajueiro and B. M. Tabak. Cajueiro and Tabak, Physica A, 2003, have checked the efficiency of emerging markets by computing the Hurst exponent over a time window of 4 years of data. Our goal in the present paper is to understand the dynamics of the Indian stock market. We look for persistency in the stock market through the Hurst exponent and fractal dimension of the time series data of BSE 100 and NIFTY 50.
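The abstract above does not state which estimator was used for the Hurst exponent. As one common possibility, the sketch below uses rescaled-range (R/S) analysis: the series is cut into windows of doubling length, the average R/S ratio is computed for each length, and the exponent H is taken as the slope of log(R/S) against log(window length). The function names, the window schedule, and the simulated inputs are assumptions for illustration; for a self-affine series the fractal dimension is often approximated as D = 2 - H.

```python
import math
import random


def rescaled_range(chunk):
    """Range of cumulative mean-adjusted sums divided by the standard deviation."""
    n = len(chunk)
    mean = sum(chunk) / n
    cum = lo = hi = sq = 0.0
    for x in chunk:
        d = x - mean
        cum += d
        lo = min(lo, cum)
        hi = max(hi, cum)
        sq += d * d
    sd = math.sqrt(sq / n)
    return (hi - lo) / sd if sd > 0.0 else None


def hurst_rs(series, min_window=8):
    """Estimate the Hurst exponent as the log-log slope of mean R/S vs window size."""
    n = len(series)
    xs, ys = [], []
    window = min_window
    while window <= n // 2:
        ratios = []
        # Non-overlapping windows of the current length.
        for start in range(0, n - window + 1, window):
            rs = rescaled_range(series[start:start + window])
            if rs is not None and rs > 0.0:
                ratios.append(rs)
        if ratios:
            xs.append(math.log(window))
            ys.append(math.log(sum(ratios) / len(ratios)))
        window *= 2
    if len(xs) < 2:
        raise ValueError("series too short for the chosen window sizes")
    # Ordinary least-squares slope of log(R/S) on log(window).
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Simulated inputs (assumed): uncorrelated noise, and its running sum,
# which is strongly persistent when treated as the input series.
rng = random.Random(3)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
walk = []
total = 0.0
for step in noise:
    total += step
    walk.append(total)

h_noise = hurst_rs(noise)
h_walk = hurst_rs(walk)
print("H (noise):", round(h_noise, 3), "H (running sum):", round(h_walk, 3))
print("fractal dimension estimate for noise, D = 2 - H:", round(2.0 - h_noise, 3))
```

For price data such as an index series, the estimator would normally be applied to returns (for example log-returns) rather than to the price levels; values of H above 0.5 are then read as persistence and values below 0.5 as anti-persistence. The R/S estimate is known to be biased upward for short windows, so small departures from 0.5 should be interpreted with care.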
Clinical and epidemiological rounds. Time series
Directory of Open Access Journals (Sweden)
León-Álvarez, Alba Luz
2016-07-01
Analysis of time series is a technique that involves the study of individuals or groups observed at successive moments in time. This type of analysis allows the study of potential causal relationships between different variables that change over time and relate to each other. It is the most important technique for making inferences about the future, predicting on the basis of what has happened in the past, and it is applied in different disciplines of knowledge. Here we discuss the different components of time series, the analysis technique and specific examples in health research.
Acker, James G.; Shen, Suhung; Leptoukh, Gregory G.; Lee, Zhongping
2012-01-01
Oceanographic time-series stations provide vital data for the monitoring of oceanic processes, particularly those associated with trends over time and interannual variability. There are likely numerous locations where the establishment of a time-series station would be desirable, but for reasons of funding or logistics, such establishment may not be feasible. An alternative to an operational time-series station is monitoring of sites via remote sensing. In this study, the NASA Giovanni data system is employed to simulate the establishment of two time-series stations near the outflow region of California's Eel River, which carries a high sediment load. Previous time-series analysis of this location (Acker et al. 2009) indicated that remotely-sensed chl a exhibits a statistically significant increasing trend during summer (low flow) months, but no apparent trend during winter (high flow) months. Examination of several newly-available ocean data parameters in Giovanni, including 8-day resolution data, demonstrates the differences in ocean parameter trends at the two locations compared to regionally-averaged time-series. The hypothesis that the increased summer chl a values are related to increasing SST is evaluated, and the signature of the Eel River plume is defined with ocean optical parameters.
Directory of Open Access Journals (Sweden)
Farshad Fathian
2017-02-01
Introduction: Time series models are one of the most important tools for investigating and modeling hydrological processes in order to solve problems related to water resources management. Many hydrological time series show nonstationary and nonlinear behaviors. One of the important hydrological modeling tasks is determining whether nonstationarity exists and how stationarity can be achieved. On the other hand, streamflow processes are usually considered nonlinear mechanisms, while in many studies linear time series models are used to model streamflow time series. However, it is not clear what kind of nonlinearity underlies the streamflow processes and how intensive it is. Materials and Methods: Streamflow time series of 6 hydro-gauge stations located in the upstream basin rivers of the ZarrinehRoud dam (located in the southern part of the Urmia Lake basin) have been considered to investigate stationarity and nonlinearity. All data series used here start on January 1, 1997, and end on December 31, 2011. In this study, stationarity is tested by the ADF and KPSS tests and nonlinearity is tested by the BDS, Keenan and TLRT tests. The stationarity test is carried out with two methods. The first method is the augmented Dickey-Fuller (ADF) unit root test, first proposed by Dickey and Fuller (1979) and modified by Said and Dickey (1984), which examines the presence of unit roots in time series. The second method is the KPSS test, proposed by Kwiatkowski et al. (1992), which examines stationarity around a deterministic trend (trend stationarity) and stationarity around a fixed level (level stationarity). The BDS test (Brock et al., 1996) is a nonparametric method for testing serial independence and nonlinear structure in time series based on the correlation integral of the series. The null hypothesis is that the time series sample comes from an independent identically distributed (i.i.d.) process. The alternative hypothesis
Robust Forecasting of Non-Stationary Time Series
Croux, C.; Fried, R.; Gijbels, I.; Mahieu, K.
2010-01-01
This paper proposes a robust forecasting method for non-stationary time series. The time series is modelled using non-parametric heteroscedastic regression, and fitted by a localized MM-estimator, combining high robustness and large efficiency. The proposed method is shown to produce reliable
Time series analysis in road safety research using state space methods
BIJLEVELD, FD
2008-01-01
In this thesis we present a comprehensive study into novel time series models for aggregated road safety data. The models are mainly intended for analysis of indicators relevant to road safety, with a particular focus on how to measure these factors. Such developments may need to be related to or explained by external influences. It is also possible to make forecasts using the models. Relevant indicators include the number of persons killed per month or year. These statistics are closely watch...
Forecasting the Reference Evapotranspiration Using Time Series Model
Directory of Open Access Journals (Sweden)
H. Zare Abyaneh
2016-10-01
evapotranspiration were obtained. The mean values of evapotranspiration in the study period were 4.42, 3.93, 5.05, 5.49, and 5.60 mm day−1 in Esfahan, Semnan, Shiraz, Kerman, and Yazd, respectively. The Augmented Dickey-Fuller (ADF) test was applied to the time series. The results showed that in all stations except Shiraz, the time series had a unit root and were non-stationary. The non-stationary time series became stationary at the 1st difference. Using the EViews 7 software, seasonal ARIMA models were applied to the evapotranspiration time series, and the coefficient of determination (R2), the Durbin–Watson statistic (DW), and the Hannan-Quinn (HQ), Schwarz (SC) and Akaike (AIC) information criteria were used to select the best model for each station. The selected models are listed in Table 2. Moreover, the information criteria (AIC, SC, and HQ) were used to assess model parsimony. The independence assumption of the model residuals was confirmed by a sensitive diagnostic check. Furthermore, the homoscedasticity and normality assumptions were tested using other diagnostic tests.
Table 2- The selected time series models for the stations (SC, HQ and AIC are the information criteria)
Station   Seasonal ARIMA model           SC      HQ      AIC     R2      DW
Esfahan   ARIMA(1, 1, 1)×(1, 0, 1)₁₂     1.2571  1.2840  1.2396  0.8800  1.9987
Semnan    ARIMA(5, 1, 2)×(1, 0, 1)₁₂     1.5665  1.5122  1.4770  0.8543  1.9911
Shiraz    ARIMA(2, 0, 3)×(1, 0, 1)₁₂     1.3312  1.2881  1.2601  0.9665  1.9873
Kerman    ARIMA(5, 1, 1)×(1, 0, 1)₁₂     1.8097  1.7608  1.8097  0.8557  2.0042
Yazd      ARIMA(2, 1, 3)×(1, 1, 1)₁₂     1.7472  1.7032  1.6746  0.5264  1.9943
The seasonal ARIMA models presented in Table 2 were used at the 12-month (2004-2005) forecasting horizon. The results showed that the models produce good out-of-sample forecasts: across all stations, the lowest correlation coefficient and the highest root mean square error were 0.988 and 0.515 mm day−1, respectively. Conclusion: In the presented paper, reference evapotranspiration in the five synoptic
Comparison of time-series registration methods in breast dynamic infrared imaging
Riyahi-Alam, S.; Agostini, V.; Molinari, F.; Knaflitz, M.
2015-03-01
Automated motion reduction in dynamic infrared imaging is in demand in clinical applications, since movement disarranges the time-temperature series of each pixel, thus originating thermal artifacts that might bias the clinical decision. All previously proposed registration methods are feature-based algorithms requiring manual intervention. The aim of this work is to optimize the registration strategy specifically for Breast Dynamic Infrared Imaging and to make it user-independent. We implemented and evaluated 3 different 3D time-series registration methods (1. linear affine, 2. non-linear B-spline, 3. Demons), applied to 12 datasets of healthy breast thermal images. The results are evaluated through normalized mutual information, with average values of 0.70 ±0.03, 0.74 ±0.03 and 0.81 ±0.09 (out of 1) for Affine, B-spline and Demons registration, respectively, as well as breast boundary overlap and the Jacobian determinant of the deformation field. The statistical analysis of the results showed that the symmetric diffeomorphic Demons registration method outperforms the others, with the best breast alignment and non-negative Jacobian values, which guarantee image similarity and anatomical consistency of the transformation, due to homologous forces enforcing the pixel geometric disparities to be shortened on all the frames. We propose Demons registration as an effective technique for time-series dynamic infrared registration, to stabilize the local temperature oscillation.
Detecting method for crude oil price fluctuation mechanism under different periodic time series
International Nuclear Information System (INIS)
Gao, Xiangyun; Fang, Wei; An, Feng; Wang, Yue
2017-01-01
Highlights: • We proposed the concept of autoregressive modes to indicate the fluctuation patterns. • We constructed transmission networks for studying the fluctuation mechanism. • There are different fluctuation mechanisms under different periodic time series. • Only a few types of autoregressive modes control the fluctuations in crude oil price. • There are cluster effects in the fluctuation mechanism of autoregressive modes. - Abstract: The existing literature can characterize the long-term fluctuation of crude oil price time series; however, it is difficult to detect the fluctuation mechanism specifically in the short term. This is because the fluctuation patterns of the short periods contained in a long-term crude oil price time series are dynamically diverse; in other words, various fluctuation patterns appear in different short periods and transmit to each other, which reflects the reputedly complicated and chaotic oil market. Thus, we proposed an incorporated method to detect the fluctuation mechanism, which is the evolution of the different fluctuation patterns over time, from the complex network perspective. We divided the crude oil price time series into segments using sliding time windows, and defined autoregressive modes based on regression models to indicate the fluctuation pattern of each segment. Hence, the transmissions between different types of autoregressive modes over time form a transmission network that contains rich dynamic information. We then captured the transmission characteristics of autoregressive modes under different periodic time series through the structural features of the transmission networks. The results indicate that there are various autoregressive modes with significantly different statistical characteristics under different periodic time series. However, only a few types of autoregressive modes and transmission patterns play a major role in the fluctuation mechanism of the crude oil price, and these
Time series analysis of the behavior of brazilian natural rubber
Directory of Open Access Journals (Sweden)
Antônio Donizette de Oliveira
2009-03-01
Natural rubber is a non-wood product obtained from the coagulation of the latex of some forest species, Hevea brasiliensis being the main one. Native to the Amazon Region, this species was already known by the Indians before the discovery of America. Natural rubber became a globally valued product due to its multiple applications in the economy, its almost perfect substitute being synthetic rubber derived from petroleum. As with countless other products, the forecast of future prices of natural rubber has been the object of many studies. The use of univariate time series forecasting models stands out as the most accurate and useful way to reduce uncertainty in the economic decision making process. This study analyzed the historical series of prices of Brazilian natural rubber (R$/kg) in the Jan/99 - Jun/2006 period, in order to characterize the rubber price behavior in the domestic market; estimated a model for the time series of monthly natural rubber prices; and forecast the domestic prices of natural rubber in the Jul/2006 - Jun/2007 period, based on the estimated models. The studied models were the ones belonging to the ARIMA family. The main results were: the domestic market of natural rubber is expanding due to the growth of the world economy; among the adjusted models, the ARIMA (1,1,1) model provided the best fit to the time series of prices of natural rubber (R$/kg); and the forecasts made for the series were statistically adequate.
Directory of Open Access Journals (Sweden)
David Afolabi
2017-11-01
The importance of an interference-less machine learning scheme in time series prediction is crucial, as an oversight can have a negative cumulative effect, especially when predicting many steps ahead of the currently available data. The on-going research on noise elimination in time series forecasting has led to a successful approach of decomposing the data sequence into component trends to identify noise-inducing information. The empirical mode decomposition method separates the time series/signal into a set of intrinsic mode functions ranging from high to low frequencies, which can be summed up to reconstruct the original data. The usual assumption that random noises are only contained in the high-frequency component has been shown not to be the case, as observed in our previous findings. The results from that experiment reveal that noise can be present in a low frequency component, and this motivates the newly-proposed algorithm. Additionally, to prevent the erosion of periodic trends and patterns within the series, we perform the learning of local and global trends separately in a hierarchical manner which succeeds in detecting and eliminating short/long term noise. The algorithm is tested on four datasets from financial market data and physical science data. The simulation results are compared with the conventional and state-of-the-art approaches for time series machine learning, such as the non-linear autoregressive neural network and the long short-term memory recurrent neural network, respectively. Statistically significant performance gains are recorded when the meta-learning algorithm for noise reduction is used in combination with these artificial neural networks. For time series data which cannot be decomposed into meaningful trends, applying the moving average method to create meta-information for guiding the learning process is still better than the traditional approach. Therefore, this new approach is applicable to the forecasting
Evaluation of the autoregression time-series model for analysis of a noisy signal
International Nuclear Information System (INIS)
Allen, J.W.
1977-01-01
The autoregression (AR) time-series model of a continuous noisy signal was statistically evaluated to determine quantitatively the uncertainties of the model order, the model parameters, and the model's power spectral density (PSD). The result of such a statistical evaluation enables an experimenter to decide whether an AR model can adequately represent a continuous noisy signal and be consistent with the signal's frequency spectrum, and whether it can be used for on-line monitoring. Although evaluations of other types of signals have been reported in the literature, no direct reference has been found to AR model's uncertainties for continuous noisy signals; yet the evaluation is necessary to decide the usefulness of AR models of typical reactor signals (e.g., neutron detector output or thermocouple output) and the potential of AR models for on-line monitoring applications. AR and other time-series models for noisy data representation are being investigated by others since such models require fewer parameters than the traditional PSD model. For this study, the AR model was selected for its simplicity and conduciveness to uncertainty analysis, and controlled laboratory bench signals were used for continuous noisy data. (author)
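The abstract above turns on three quantities of an AR model: its order, its parameters, and its power spectral density. As a minimal sketch only, the following fits an AR(p) model with the Yule-Walker equations (solved by the Levinson-Durbin recursion) and evaluates the resulting PSD. The record does not say which estimator the study used, so the estimator, the function names, and the simulated test signal are all assumptions made for illustration.

```python
import math
import random


def autocovariance(x, max_lag):
    """Biased sample autocovariance for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def yule_walker(x, order):
    """Fit x[t] = sum_k a[k-1] * x[t-k] + e[t]; return (a, noise variance).

    Assumption: Yule-Walker via Levinson-Durbin, chosen for illustration.
    """
    r = autocovariance(x, order)
    a = []        # AR coefficients of the current order
    err = r[0]    # one-step prediction error variance
    for m in range(1, order + 1):
        # Reflection coefficient for order m.
        k = (r[m] - sum(a[j] * r[m - 1 - j] for j in range(m - 1))) / err
        a = [a[j] - k * a[m - 2 - j] for j in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return a, err


def ar_psd(coeffs, noise_var, freq):
    """PSD of the AR model at `freq` (cycles per sample, 0 to 0.5)."""
    w = 2.0 * math.pi * freq
    re = 1.0 - sum(a * math.cos(w * (k + 1)) for k, a in enumerate(coeffs))
    im = sum(a * math.sin(w * (k + 1)) for k, a in enumerate(coeffs))
    return noise_var / (re * re + im * im)


def simulate_ar(coeffs, n, seed=0, burn_in=500):
    """Hypothetical test signal: AR process driven by unit Gaussian noise."""
    rng = random.Random(seed)
    p = len(coeffs)
    x = [0.0] * p
    for _ in range(n + burn_in):
        x.append(sum(c * x[-1 - k] for k, c in enumerate(coeffs))
                 + rng.gauss(0.0, 1.0))
    return x[p + burn_in:]


signal = simulate_ar([0.6, -0.3], 20000, seed=1)
est_coeffs, est_var = yule_walker(signal, 2)
spectrum = [ar_psd(est_coeffs, est_var, f / 100.0) for f in range(51)]
```

Repeating the fit over many independently simulated signals would give an empirical spread for each coefficient and for the PSD, which is one simple way to approach the uncertainties the abstract refers to.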
International Nuclear Information System (INIS)
Suzuki, Kiyotaka; Matsuzawa, Hitoshi; Watanabe, Masaki; Nakada, Tsutomu; Nakayama, Naoki; Kwee, I.L.
2003-01-01
Dynamic contrast enhanced magnetic resonance imaging (dynamic MRI) represents an MRI version of non-diffusible tracer methods, the main clinical use of which is the physiological construction of what is conventionally referred to as perfusion images. The raw data utilized for constructing MRI perfusion images are time series of pixel signal alterations associated with the passage of a gadolinium-containing contrast agent. Such time series are highly compatible with independent component analysis (ICA), a novel statistical signal processing technique capable of effectively separating a single mixture of multiple signals into their original independent source signals (blind separation). Accordingly, we applied ICA to dynamic MRI time series. The technique was found to be powerful, allowing for hitherto unobtainable assessment of regional cerebral hemodynamics in vivo. (author)
Complex network approach to fractional time series
Energy Technology Data Exchange (ETDEWEB)
Manshour, Pouya [Physics Department, Persian Gulf University, Bushehr 75169 (Iran, Islamic Republic of)
2015-10-15
In order to extract correlation information inherited in stochastic time series, the visibility graph algorithm has been recently proposed, by which a time series can be mapped onto a complex network. We demonstrate that the visibility algorithm is not an appropriate one to study the correlation aspects of a time series. We then employ the horizontal visibility algorithm, as a much simpler one, to map fractional processes onto complex networks. The degree distributions are shown to have parabolic exponential forms with Hurst dependent fitting parameter. Further, we take into account other topological properties such as maximum eigenvalue of the adjacency matrix and the degree assortativity, and show that such topological quantities can also be used to predict the Hurst exponent, with an exception for anti-persistent fractional Gaussian noises. To solve this problem, we take into account the Spearman correlation coefficient between nodes' degrees and their corresponding data values in the original time series.
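The horizontal visibility mapping mentioned above can be illustrated with a short sketch. It uses the standard horizontal visibility criterion (two points are linked when every value strictly between them lies below both of them) and returns only the node degrees, since the abstract works with degree distributions. The function name and the toy input are illustrative choices, not taken from the paper.

```python
def horizontal_visibility_degrees(x):
    """Degree of each node in the horizontal visibility graph of series x.

    Nodes i < j are connected when x[k] < min(x[i], x[j]) for all i < k < j.
    """
    n = len(x)
    degree = [0] * n
    for i in range(n - 1):
        highest_between = float("-inf")  # max of x[i+1 .. j-1]
        for j in range(i + 1, n):
            if highest_between < min(x[i], x[j]):
                degree[i] += 1
                degree[j] += 1
            highest_between = max(highest_between, x[j])
            if highest_between >= x[i]:
                break  # x[i] is blocked from seeing anything further right
    return degree


degrees = horizontal_visibility_degrees([3, 1, 2, 4])
```

A histogram of `degrees` for a long series is the empirical degree distribution; pairing each degree with its data value gives the inputs for a rank correlation such as the Spearman coefficient the abstract mentions.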
Event-sequence time series analysis in ground-based gamma-ray astronomy
International Nuclear Information System (INIS)
Barres de Almeida, U.; Chadwick, P.; Daniel, M.; Nolan, S.; McComb, L.
2008-01-01
The recent, extreme episodes of variability detected from Blazars by the leading atmospheric Cerenkov experiments motivate the development and application of specialized statistical techniques that enable the study of this rich data set to its furthest extent. The identification of the shortest variability timescales supported by the data and the actual variability structure observed in the light curves of these sources are some of the fundamental aspects being studied, whose answers can bring new developments in the understanding of the physics of these objects and of the mechanisms of production of VHE gamma-rays in the Universe. Some of our efforts in studying the time variability of VHE sources involve the application of dynamic programming algorithms to the problem of detecting change-points in a Poisson sequence. In this particular paper we concentrate on the more primary issue of the applicability of counting statistics to the analysis of time series in VHE gamma-ray astronomy.
Directory of Open Access Journals (Sweden)
Matthew Perry
2017-06-01
A tool has been developed to statistically increase the temporal resolution of solar irradiance time series. Fine temporal resolution time series are an important input into the planning process for solar power plants, and lead to increased understanding of the likely short-term variability of solar energy. The approach makes use of the spatial variability of hourly gridded datasets around a location of interest to make inferences about the temporal variability within the hour. The unique characteristics of solar irradiance data are modelled by classifying each hour into a typical weather situation. Low variability situations are modelled using an autoregressive process which is applied to ramps of clear-sky index. High variability situations are modelled as a transition between states of clear sky conditions and different levels of cloud opacity. The methods have been calibrated to Australian conditions using 1 min data from four ground stations for a 10 year period. These stations, together with an independent dataset, have also been used to verify the quality of the results using a number of relevant metrics. The results show that the method generates realistic fine resolution synthetic time series. The synthetic time series correlate well with observed data on monthly and annual timescales as they are constrained to the nearest grid-point value on each hour. The probability distributions of the synthetic and observed global irradiance data are similar, with Kolmogorov-Smirnov test statistic less than 0.04 at each station. The tool could be useful for the estimation of solar power output for integration studies.
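The Kolmogorov-Smirnov statistic quoted above is the largest absolute gap between two empirical distribution functions. A minimal sketch of the two-sample version follows; the record does not describe how the authors computed it, so the function name and the toy samples are assumptions for illustration.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(v) - F_b(v)|."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Advance both empirical CDFs past every observation equal to v.
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


# Hypothetical stand-ins for observed and synthetic irradiance values.
observed = [1, 2, 3, 4]
synthetic = [3, 4, 5, 6]
gap = ks_statistic(observed, synthetic)
```

A value near 0 means the two samples have nearly the same distribution; a value of 1 means they do not overlap at all.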
The foundations of modern time series analysis
Mills, Terence C
2011-01-01
This book develops the analysis of Time Series from its formal beginnings in the 1890s through to the publication of Box and Jenkins' watershed publication in 1970, showing how these methods laid the foundations for the modern techniques of Time Series analysis that are in use today.
HOMPRA Europe - A gridded precipitation data set from European homogenized time series
Rustemeier, Elke; Kapala, Alice; Meyer-Christoffer, Anja; Finger, Peter; Schneider, Udo; Venema, Victor; Ziese, Markus; Simmer, Clemens; Becker, Andreas
2017-04-01
Reliable monitoring data are essential for robust analyses of climate variability and, in particular, long-term trends. In this regard, a gridded, homogenized data set of monthly precipitation totals, HOMPRA Europe (HOMogenized PRecipitation Analysis of European in-situ data), is presented. The data base consists of 5373 homogenized monthly time series, a carefully selected subset held by the Global Precipitation Climatology Centre (GPCC). The chosen series cover the period 1951-2005 and contain less than 10% missing values. Due to the large number of data, an automatic algorithm had to be developed for the homogenization of these precipitation series. In principle, the algorithm is based on three steps: (1) Selection of overlapping station networks in the same precipitation regime, based on rank correlation and Ward's method of minimal variance. Since the underlying time series should be as homogeneous as possible, the station selection is carried out by deterministic first derivation in order to reduce artificial influences. (2) The natural variability and trends were temporally removed by means of highly correlated neighboring time series to detect artificial break-points in the annual totals. This ensures that only artificial changes can be detected. The method is based on the algorithm of Caussinus and Mestre (2004). (3) In the last step, the detected breaks are corrected monthly by means of a multiple linear regression (Mestre, 2003). Due to the automation of the homogenization, the validation of the algorithm is essential. Therefore, the method was tested on artificial data sets. Additionally, the sensitivity of the method was tested by varying the neighborhood series. If available in digitized form, the station history was also used to search for systematic errors in the jump detection. Finally, the actual HOMPRA Europe product is produced by interpolation of the homogenized series onto a 1° grid using one of the interpolation schemes used operationally at GPCC.
Time series clustering in large data sets
Directory of Open Access Journals (Sweden)
Jiří Fejfar
2011-01-01
The clustering of time series is a widely researched area. There are many methods for dealing with this task. We use the self-organizing map (SOM) with an unsupervised learning algorithm for clustering of time series. After the first experiment (Fejfar, Weinlichová, Šťastný, 2009), it seems that the whole concept of the clustering algorithm is correct but that we have to perform time series clustering on a much larger dataset to obtain more accurate results and to find the correlation between configured parameters and results more precisely. The second requirement arose from the need for a well-defined evaluation of results. It seems useful to use sound recordings as instances of time series again. There are many recordings to use in digital libraries, and many interesting features and patterns can be found in this area. In this experiment we search for recordings with a similar development of information density. This can be used for musical form investigation, cover song detection and many other applications. The objective of the presented paper is to compare clustering results obtained with different parameters of the feature vectors and of the SOM itself. We describe time series in a simplistic way, evaluating standard deviations for separate parts of recordings. The resulting feature vectors are clustered with the SOM in batch training mode with different topologies, varying from a few neurons to large maps. Other algorithms usable for finding similarities between time series are discussed, and finally conclusions for further research are presented. We also present an overview of the related current literature and projects.
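The feature extraction described above, one standard deviation per part of a recording, can be sketched as follows. The number of segments, the equal-length split, and the use of the population standard deviation are assumptions; the abstract does not give these details, and the SOM clustering step is not shown.

```python
import statistics


def segment_std_features(series, n_segments):
    """Split `series` into n_segments contiguous parts of near-equal length
    and return the population standard deviation of each part."""
    n = len(series)
    if n_segments < 1 or n < n_segments:
        raise ValueError("need at least one sample per segment")
    # Integer boundaries; every segment holds at least one sample.
    bounds = [k * n // n_segments for k in range(n_segments + 1)]
    return [statistics.pstdev(series[bounds[k]:bounds[k + 1]])
            for k in range(n_segments)]


# Hypothetical recording: a quiet first half, a fluctuating second half.
features = segment_std_features([1, 1, 1, 1, 0, 2, 0, 2], 2)
```

Each recording then becomes a fixed-length vector regardless of its duration, which is what a SOM or any other vector-based clustering method needs as input.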
Umphrey, Gary; Carter, Richard; McLeod, A; Ullah, Aman
1987-01-01
On May 27-31, 1985, a series of symposia was held at The University of Western Ontario, London, Canada, to celebrate the 70th birthday of Professor V. M. Joshi. These symposia were chosen to reflect Professor Joshi's research interests as well as areas of expertise in statistical science among faculty in the Departments of Statistical and Actuarial Sciences, Economics, Epidemiology and Biostatistics, and Philosophy. From these symposia, the six volumes which comprise the "Joshi Festschrift" have arisen. The 117 articles in this work reflect the broad interests and high quality of research of those who attended our conference. We would like to thank all of the contributors for their superb cooperation in helping us to complete this project. Our deepest gratitude must go to the three people who have spent so much of their time in the past year typing these volumes: Jackie Bell, Lise Constant, and Sandy Tarnowski. This work has been printed from "camera ready" copy produced by our Vax 785 computer and QMS Laserg...
Monitoring Forest Regrowth Using a Multi-Platform Time Series
Sabol, Donald E., Jr.; Smith, Milton O.; Adams, John B.; Gillespie, Alan R.; Tucker, Compton J.
1996-01-01
Over the past 50 years, the forests of western Washington and Oregon have been extensively harvested for timber. This has resulted in a heterogeneous mosaic of remaining mature forests, clear-cuts, new plantations, and second-growth stands that now occur in areas that formerly were dominated by extensive old-growth forests and younger forests resulting from fire disturbance. Traditionally, determination of seral stage and stand condition have been made using aerial photography and spot field observations, a methodology that is not only time- and resource-intensive, but falls short of providing current information on a regional scale. These limitations may be solved, in part, through the use of multispectral images which can cover large areas at spatial resolutions on the order of tens of meters. The use of multiple images comprising a time series potentially can be used to monitor land use (e.g. cutting and replanting), and to observe natural processes such as regeneration, maturation and phenologic change. These processes are more likely to be spectrally observed in a time series composed of images taken during different seasons over a long period of time. Therefore, for many areas, it may be necessary to use a variety of images taken with different imaging systems. A common framework for interpretation is needed that reduces topographic, atmospheric, and instrumental effects as well as differences in lighting geometry between images. The present state of remote-sensing technology in general use does not realize the full potential of the multispectral data in areas of high topographic relief. For example, the primary method for analyzing images of forested landscapes in the Northwest has been with statistical classifiers (e.g. parallelepiped, nearest-neighbor, maximum likelihood, etc.), often applied to uncalibrated multispectral data. Although this approach has produced useful information from individual images in some areas, landcover classes defined by these
Earthquake forecasting studies using radon time series data in Taiwan
Walia, Vivek; Kumar, Arvind; Fu, Ching-Chou; Lin, Shih-Jung; Chou, Kuang-Wu; Wen, Kuo-Liang; Chen, Cheng-Hong
2017-04-01
For a few decades, a growing number of studies have shown the usefulness of seismogeochemical data interpreted as geochemical precursory signals for impending earthquakes, and radon has been identified as one of the most reliable geochemical precursors. Radon is recognized as a short-term precursor and is being monitored in many countries. This study aims at developing an effective earthquake forecasting system by inspecting long-term radon time series data. The data are obtained from a network of radon monitoring stations established along different faults of Taiwan. Continuous radon time series data for earthquake studies have been recorded, and some significant variations associated with strong earthquakes have been observed. The data are also examined to evaluate earthquake precursory signals against environmental factors. An automated real-time database operating system has been developed recently to improve the data processing for earthquake precursory studies. In addition, the study aims at the appraisal and filtering of these environmental parameters, in order to create a real-time database that helps our earthquake precursory study. In recent years, an automatically operating real-time database has been developed using R, an open source programming language, to carry out statistical computation on the data. To integrate our data with our working procedure, we use the popular open source web application solution AMP (Apache, MySQL, and PHP), creating a website that can effectively display the real-time database and help us manage it.
DEFF Research Database (Denmark)
Lindström, Erik; Madsen, Henrik; Nielsen, Jan Nygaard
Statistics for Finance develops students’ professional skills in statistics with applications in finance. Developed from the authors’ courses at the Technical University of Denmark and Lund University, the text bridges the gap between classical, rigorous treatments of financial mathematics...... that rarely connect concepts to data and books on econometrics and time series analysis that do not cover specific problems related to option valuation. The book discusses applications of financial derivatives pertaining to risk assessment and elimination. The authors cover various statistical...... and mathematical techniques, including linear and nonlinear time series analysis, stochastic calculus models, stochastic differential equations, Itō’s formula, the Black–Scholes model, the generalized method-of-moments, and the Kalman filter. They explain how these tools are used to price financial derivatives...
Exploratory joint and separate tracking of geographically related time series
Balasingam, Balakumar; Willett, Peter; Levchuk, Georgiy; Freeman, Jared
2012-05-01
Target tracking techniques have usually been applied to physical systems via radar, sonar or imaging modalities. But the same techniques - filtering, association, classification, track management - can be applied to nontraditional data such as one might find in other fields such as economics, business and national defense. In this paper we explore a particular data set. The measurements are time series collected at various sites; but other than that little is known about it. We shall refer to the data as representing the Megawatt hour (MWH) output of various power plants located in Afghanistan. We pose such questions as: 1. Which power plants seem to have a common model? 2. Do any power plants change their models with time? 3. Can power plant behavior be predicted, and if so, how far into the future? 4. Are some of the power plants stochastically linked? That is, do we observe a lack of power demand at one power plant as implying a surfeit of demand elsewhere? The observations seem well modeled as hidden Markov. This HMM modeling is compared to other approaches; and tests are continued to other (albeit self-generated) data sets with similar characteristics. Keywords: Time-series analysis, hidden Markov models, statistical similarity, clustering weighted
Lag space estimation in time series modelling
DEFF Research Database (Denmark)
Goutte, Cyril
1997-01-01
The purpose of this article is to investigate some techniques for finding the relevant lag-space, i.e. input information, for time series modelling. This is an important aspect of time series modelling, as it conditions the design of the model through the regressor vector a.k.a. the input layer...
Time-series prediction and applications a machine intelligence approach
Konar, Amit
2017-01-01
This book presents machine learning and type-2 fuzzy sets for the prediction of time-series with a particular focus on business forecasting applications. It also proposes new uncertainty management techniques in an economic time-series using type-2 fuzzy sets for prediction of the time-series at a given time point from its preceding value in fluctuating business environments. It employs machine learning to determine repetitively occurring similar structural patterns in the time-series and uses stochastic automaton to predict the most probable structure at a given partition of the time-series. Such predictions help in determining probabilistic moves in a stock index time-series. Primarily written for graduate students and researchers in computer science, the book is equally useful for researchers/professionals in business intelligence and stock index prediction. A background of undergraduate level mathematics is presumed, although not mandatory, for most of the sections. Exercises with tips are provided at...
A Time Series Forecasting Method
Directory of Open Access Journals (Sweden)
Wang Zhao-Yu
2017-01-01
This paper proposes a novel time series forecasting method based on a weighted self-constructing clustering technique. The weighted self-constructing clustering processes all the data patterns incrementally. If a data pattern is not similar enough to an existing cluster, it forms a new cluster of its own. However, if a data pattern is similar enough to an existing cluster, it is removed from the cluster it currently belongs to and added to the most similar cluster. During the clustering process, weights are learned for each cluster. Given a series of time-stamped data up to time t, we divide it into a set of training patterns. By using the weighted self-constructing clustering, the training patterns are grouped into a set of clusters. To estimate the value at time t + 1, we find the k nearest neighbors of the input pattern and use these k neighbors to decide the estimation. Experimental results are shown to demonstrate the effectiveness of the proposed approach.
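The final step described above, estimating the value at time t + 1 from the k nearest training patterns, can be sketched in a simplified form. This sketch deliberately omits the weighted self-constructing clustering and the learned weights, searches all training patterns directly, and averages the neighbours' next values; the window length, the squared Euclidean distance, and the plain average are assumptions, not the paper's method.

```python
def make_patterns(series, window):
    """Sliding windows of length `window`, each paired with the next value."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]


def knn_forecast(series, window, k):
    """Estimate the next value as the mean target of the k training patterns
    closest to the most recent window (unweighted simplification)."""
    patterns = make_patterns(series, window)
    if not patterns:
        raise ValueError("series too short for the chosen window")
    query = series[-window:]

    def squared_distance(pattern):
        return sum((a - b) ** 2 for a, b in zip(pattern, query))

    nearest = sorted(patterns, key=lambda p: squared_distance(p[0]))[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Hypothetical periodic series; the last window [0, 1] was always followed by 2.
forecast = knn_forecast([0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1], window=2, k=2)
```

Clustering the patterns first, as the paper proposes, would restrict the neighbour search to a few representative groups instead of scanning every pattern.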
Zero-crossing statistics for non-Markovian time series.
Nyberg, Markus; Lizana, Ludvig; Ambjörnsson, Tobias
2018-03-01
In applications spanning from image analysis and speech recognition to energy dissipation in turbulence and time-to failure of fatigued materials, researchers and engineers want to calculate how often a stochastic observable crosses a specific level, such as zero. At first glance this problem looks simple, but it is in fact theoretically very challenging, and therefore few exact results exist. One exception is the celebrated Rice formula that gives the mean number of zero crossings in a fixed time interval of a zero-mean Gaussian stationary process. In this study we use the so-called independent interval approximation to go beyond Rice's result and derive analytic expressions for all higher-order zero-crossing cumulants and moments. Our results agree well with simulations for the non-Markovian autoregressive model.
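A discrete-time analogue of the mean result can be checked numerically. For a zero-mean stationary Gaussian sequence with lag-1 autocorrelation rho, the probability that two consecutive samples differ in sign is arccos(rho)/pi. The sketch below compares this with the empirical crossing rate of a simulated AR(2) process, which is non-Markovian in the observed variable. It covers only the mean number of crossings, not the higher-order cumulants derived in the paper, and the AR coefficients, seed, and function names are illustrative assumptions.

```python
import math
import random


def simulate_ar2(a1, a2, n, seed=0, burn_in=500):
    """x[t] = a1*x[t-1] + a2*x[t-2] + e[t], with e[t] unit Gaussian noise."""
    rng = random.Random(seed)
    prev2, prev1 = 0.0, 0.0
    out = []
    for t in range(n + burn_in):
        cur = a1 * prev1 + a2 * prev2 + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(cur)
        prev2, prev1 = prev1, cur
    return out


def crossing_rate(x):
    """Fraction of consecutive sample pairs with opposite signs."""
    crossings = sum(1 for u, v in zip(x, x[1:]) if u * v < 0)
    return crossings / (len(x) - 1)


def expected_crossing_rate(rho1):
    """Sign-change probability for a stationary Gaussian sequence."""
    return math.acos(rho1) / math.pi


a1, a2 = 0.6, -0.3
rho1 = a1 / (1.0 - a2)          # lag-1 autocorrelation of a stationary AR(2)
empirical = crossing_rate(simulate_ar2(a1, a2, 100000, seed=2))
theoretical = expected_crossing_rate(rho1)
```

The two rates should agree to within sampling error; the variance and higher cumulants of the crossing count are where the non-Markovian structure matters and where the paper's approximation comes in.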
Stochastic nature of series of waiting times
Anvari, Mehrnaz; Aghamohammadi, Cina; Dashti-Naserabadi, H.; Salehi, E.; Behjat, E.; Qorbani, M.; Khazaei Nezhad, M.; Zirak, M.; Hadjihosseini, Ali; Peinke, Joachim; Tabar, M. Reza Rahimi
2013-06-01
Although fluctuations in the waiting time series have been studied for a long time, some important issues such as its long-range memory and its stochastic features in the presence of nonstationarity have so far remained unstudied. Here we find that the “waiting times” series for a given increment level have long-range correlations with Hurst exponents belonging to the interval 1/2
Efficient Approximate OLAP Querying Over Time Series
DEFF Research Database (Denmark)
Perera, Kasun Baruhupolage Don Kasun Sanjeewa; Hahmann, Martin; Lehner, Wolfgang
2016-01-01
The ongoing trend for data gathering not only produces larger volumes of data, but also increases the variety of recorded data types. Out of these, especially time series, e.g. various sensor readings, have attracted attention in the domains of business intelligence and decision making. As OLAP...... queries play a major role in these domains, it is desirable to also execute them on time series data. While this is not a problem on the conceptual level, it can become a bottleneck with regards to query run-time. In general, processing OLAP queries gets more computationally intensive as the volume...... of data grows. This is a particular problem when querying time series data, which generally contains multiple measures recorded at fine time granularities. Usually, this issue is addressed either by scaling up hardware or by employing workload based query optimization techniques. However, these solutions...
Multi-Scale Entropy Analysis as a Method for Time-Series Analysis of Climate Data
Directory of Open Access Journals (Sweden)
Heiko Balzter
2015-03-01
statistical properties of climate time-series data that can go undetected using traditional methods.
Modeling and Forecasting of Water Demand in Isfahan Using Underlying Trend Concept and Time Series
Directory of Open Access Journals (Sweden)
H. Sadeghi
2016-02-01
costs of water subscribers between 1388 and 1390. In the structural time series model, the model was generated by including the unobserved component of the process and developing a state-space model, and was estimated using the maximum likelihood method and the Kalman filter algorithm. Results and Discussion: Given the value of the test statistic ADF, with the exception of changing water use variables with a time difference of the steady rest. Superpopulation different modes of behavior were assessed based on the demand for water. Due to the likelihood ratio statistic is most suitable for the parameters, was diagnosed the steady-state level of randomness and the slope. The price and income elasticities of water demand, -0.81 and 0.85 respectively, show that water demand is inelastic with respect to price and income and that water is an essential good. Identify the nature of the request of one of the most important results in estimated water demand in the urban part of the state space time series structure and patterning methods, as an Alternative for variable is Technology preferences use. The time series model estimated for the city's water demand was ARMA(3,1). Comparing performance metrics for the structural time series model and the ARMA model shows that the structural time series model outperformed the ARMA model on all performance criteria in this study for forecasting urban water demand in Isfahan. Conclusion: Of a time series model structure to model ARMA in this research is to estimate the model and predict the number the less time is required, and also can be used for modeling of other variables (such as income and price), which helps to improve the models. Also, in the ARMA time series the best model for the data was selected according to the Schwarz Bayesian and Akaike criteria.
Results indicate that the estimation of water demand using structural time series method is more efficient than when ARMA time series model is applied
A Dynamic Fuzzy Cluster Algorithm for Time Series
Directory of Open Access Journals (Sweden)
Min Ji
2013-01-01
clustering time series by introducing the definition of key point and improving FCM algorithm. The proposed algorithm works by determining those time series whose class labels are vague and further partitions them into different clusters over time. The main advantage of this approach compared with other existing algorithms is that the property of some time series belonging to different clusters over time can be partially revealed. Results from simulation-based experiments on geographical data demonstrate the excellent performance and the desired results have been obtained. The proposed algorithm can be applied to solve other clustering problems in data mining.
A novel weight determination method for time series data aggregation
Xu, Paiheng; Zhang, Rong; Deng, Yong
2017-09-01
Aggregation in time series is of great importance in time series smoothing, prediction and other time series analysis processes, which makes it crucial to determine the weights in time series correctly and reasonably. In this paper, a novel method to obtain the weights in time series is proposed, in which we adopt the induced ordered weighted aggregation (IOWA) operator and the visibility graph averaging (VGA) operator and linearly combine the weights separately generated by the two operators. The IOWA operator is introduced to the weight determination of time series, through which the time decay factor is taken into consideration. The VGA operator is able to generate weights with respect to the degree distribution in the visibility graph constructed from the corresponding time series, which reflects the relative importance of vertices in time series. The proposed method is applied to two practical datasets to illustrate its merits. The aggregation of the Construction Cost Index (CCI) demonstrates the ability of the proposed method to smooth time series, while the aggregation of the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) illustrates how the proposed method maintains the variation tendency of the original data.
Time-Elastic Generative Model for Acceleration Time Series in Human Activity Recognition.
Munoz-Organero, Mario; Ruiz-Blazquez, Ramona
2017-02-08
Body-worn sensors in general and accelerometers in particular have been widely used in order to detect human movements and activities. The execution of each type of movement by each particular individual generates sequences of time series of sensed data from which specific movement related patterns can be assessed. Several machine learning algorithms have been used over windowed segments of sensed data in order to detect such patterns in activity recognition based on intermediate features (either hand-crafted or automatically learned from data). The underlying assumption is that the computed features will capture statistical differences that can properly classify different movements and activities after a training phase based on sensed data. In order to achieve high accuracy and recall rates (and guarantee the generalization of the system to new users), the training data have to contain enough information to characterize all possible ways of executing the activity or movement to be detected. This could imply large amounts of data and a complex and time-consuming training phase, which has been shown to be even more relevant when automatically learning the optimal features to be used. In this paper, we present a novel generative model that is able to generate sequences of time series for characterizing a particular movement based on the time elasticity properties of the sensed data. The model is used to train a stack of auto-encoders in order to learn the particular features able to detect human movements. The results of movement detection using a newly generated database with information on five users performing six different movements are presented. The generalization of results using an existing database is also presented in the paper. The results show that the proposed mechanism is able to obtain acceptable recognition rates (F = 0.77) even in the case of using different people executing a different sequence of movements and using different hardware.
Big Data impacts on stochastic Forecast Models: Evidence from FX time series
Directory of Open Access Journals (Sweden)
Sebastian Dietz
2013-12-01
With the rise of the Big Data paradigm, new tasks for prediction models appeared. In addition to the volume problem of such data sets, nonlinearity becomes important, as the more detailed data sets also contain more comprehensive information, e.g. about non-regular seasonal or cyclical movements as well as jumps in time series. This essay compares two nonlinear methods for predicting a high-frequency time series, the USD/Euro exchange rate. The first method investigated is Autoregressive Neural Network Processes (ARNN), a neural-network-based nonlinear extension of classical autoregressive process models from time series analysis (see Dietz 2011). Its advantage is its simple but scalable time series process model architecture, which is able to include all kinds of nonlinearities based on the universal approximation theorem of Hornik, Stinchcombe and White 1989 and the extensions of Hornik 1993. However, restrictions related to the numeric estimation procedures limit the flexibility of the model. The alternative is a Support Vector Machine Model (SVM, Vapnik 1995). The two methods compared have different approaches to error minimization (empirical error minimization for the ARNN vs. structural error minimization for the SVM). Our new finding is that time series data classified as “Big Data” need new methods for prediction. Estimation and prediction were performed using the statistical programming language R. Besides prediction results, we also discuss the impact of Big Data on data preparation and model validation steps.
Foundations of Sequence-to-Sequence Modeling for Time Series
Kuznetsov, Vitaly; Mariet, Zelda
2018-01-01
The availability of large amounts of time series data, paired with the performance of deep-learning algorithms on a broad class of problems, has recently led to significant interest in the use of sequence-to-sequence models for time series forecasting. We provide the first theoretical analysis of this time series forecasting framework. We include a comparison of sequence-to-sequence modeling to classical time series models, and as such our theory can serve as a quantitative guide for practiti...
Single-Index Additive Vector Autoregressive Time Series Models
LI, YEHUA
2009-09-01
We study a new class of nonlinear autoregressive models for vector time series, where the current vector depends on single-indexes defined on the past lags and the effects of different lags have an additive form. A sufficient condition is provided for stationarity of such models. We also study estimation of the proposed model using P-splines, hypothesis testing, asymptotics, selection of the order of the autoregression and of the smoothing parameters and nonlinear forecasting. We perform simulation experiments to evaluate our model in various settings. We illustrate our methodology on a climate data set and show that our model provides more accurate yearly forecasts of the El Niño phenomenon, the unusual warming of water in the Pacific Ocean. © 2009 Board of the Foundation of the Scandinavian Journal of Statistics.
Lu, Meng; Pebesma, Edzer; Sanchez, Alber; Verbesselt, Jan
2016-07-01
Growing availability of long-term satellite imagery enables change modeling with advanced spatio-temporal statistical methods. Multidimensional arrays naturally match the structure of spatio-temporal satellite data and can provide a clean modeling process for complex spatio-temporal analysis over large datasets. Our case study illustrates the detection of breakpoints in MODIS imagery time series for land cover change in the Brazilian Amazon using the BFAST (Breaks For Additive Season and Trend) change detection framework. BFAST includes an Empirical Fluctuation Process (EFP) to flag a change and a process for locating the change point in time. We extend the EFP to account for the spatial autocorrelation between spatial neighbors and assess the effects of spatial correlation when applying BFAST to satellite image time series. In addition, we evaluate how sensitive the EFP is to the assumption that its time series residuals are temporally uncorrelated, by modeling the residuals as an autoregressive process. We use arrays as a unified data structure for the modeling process, R to execute the analysis, and an array database management system to scale computation. Our results point to BFAST as an approach that is robust against mild temporal and spatial correlation, to the use of arrays to ease the modeling process of spatio-temporal change, and towards communicable and scalable analysis.
Climate Prediction Center (CPC) Global Precipitation Time Series
National Oceanic and Atmospheric Administration, Department of Commerce — The global precipitation time series provides time series charts showing observations of daily precipitation as well as accumulated precipitation compared to normal...
Climate Prediction Center (CPC) Global Temperature Time Series
National Oceanic and Atmospheric Administration, Department of Commerce — The global temperature time series provides time series charts using station based observations of daily temperature. These charts provide information about the...
Convergence of statistical moments of particle density time series in scrape-off layer plasmas
Energy Technology Data Exchange (ETDEWEB)
Kube, R., E-mail: ralph.kube@uit.no; Garcia, O. E. [Department of Physics and Technology, UiT - The Arctic University of Norway, N-9037 Tromsø (Norway)
2015-01-15
Particle density fluctuations in the scrape-off layer of magnetically confined plasmas, as measured by gas-puff imaging or Langmuir probes, are modeled as the realization of a stochastic process in which a superposition of pulses with a fixed shape, an exponential distribution of waiting times, and amplitudes represents the radial motion of blob-like structures. With an analytic formulation of the process at hand, we derive expressions for the mean squared error on estimators of sample mean and sample variance as a function of sample length, sampling frequency, and the parameters of the stochastic process. Using the fact that the probability distribution function of a particularly relevant stochastic process is given by the gamma distribution, we derive estimators for sample skewness and kurtosis and expressions for the mean squared error on these estimators. Numerically generated synthetic time series are used to verify the proposed estimators, the sample length dependency of their mean squared errors, and their performance. We find that estimators for sample skewness and kurtosis based on the gamma distribution are more precise and more accurate than common estimators based on the method of moments.
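The idea of deriving skewness and kurtosis from a gamma assumption can be sketched as follows. This is a minimal illustration, not the estimators derived in the paper: it assumes the samples are gamma distributed, estimates the shape parameter from the sample mean and variance, and then uses the standard gamma relations (skewness = 2/sqrt(shape), kurtosis = 3 + 6/shape). The function name `gamma_moment_estimates` is hypothetical.

```python
import math

def gamma_moment_estimates(samples):
    """Skewness and kurtosis implied by a gamma fit to the first two moments.

    Assumption (not from the abstract): the shape parameter is estimated
    as mean**2 / variance, and the closed-form gamma relations are applied.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)  # unbiased sample variance
    shape = mean ** 2 / var
    skewness = 2.0 / math.sqrt(shape)
    kurtosis = 3.0 + 6.0 / shape  # non-excess kurtosis
    return skewness, kurtosis
```

Because only the mean and variance enter, the estimate avoids the third and fourth sample moments, which tend to converge slowly for short records.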
Recurrent Neural Network Applications for Astronomical Time Series
Protopapas, Pavlos
2017-06-01
The benefits of good predictive models in astronomy lie in early event prediction systems and effective resource allocation. Current time series methods applicable to regular time series have not evolved to generalize for irregular time series. In this talk, I will describe two Recurrent Neural Network methods, Long Short-Term Memory (LSTM) and Echo State Networks (ESNs), for predicting irregular time series. Feature engineering along with non-linear modeling proved to be an effective predictor. For noisy time series, the prediction is improved by training the network on error realizations using the error estimates from astronomical light curves. In addition to this, we propose a new neural network architecture to remove correlation from the residuals in order to improve prediction and compensate for the noisy data. Finally, I show how to correctly set hyperparameters for a stable and performant solution. In this work, we circumvent this obstacle by optimizing ESN hyperparameters using Bayesian optimization with Gaussian Process priors. This automates the tuning procedure, enabling users to employ the power of RNNs without needing an in-depth understanding of the tuning procedure.
Modeling Financial Time Series Based on a Market Microstructure Model with Leverage Effect
Yanhui Xi; Hui Peng; Yemei Qin
2016-01-01
The basic market microstructure model specifies that the price/return innovation and the volatility innovation are independent Gaussian white noise processes. However, the financial leverage effect has been found to be statistically significant in many financial time series. In this paper, a novel market microstructure model with leverage effects is proposed. The model specification assumed a negative correlation in the errors between the price/return innovation and the volatility innovation....
Transition Icons for Time-Series Visualization and Exploratory Analysis.
Nickerson, Paul V; Baharloo, Raheleh; Wanigatunga, Amal A; Manini, Todd M; Tighe, Patrick J; Rashidi, Parisa
2018-03-01
The modern healthcare landscape has seen the rapid emergence of techniques and devices that temporally monitor and record physiological signals. The prevalence of time-series data within the healthcare field necessitates the development of methods that can analyze the data in order to draw meaningful conclusions. Time-series behavior is notoriously difficult to intuitively understand due to its intrinsic high-dimensionality, which is compounded in the case of analyzing groups of time series collected from different patients. Our framework, which we call transition icons, renders common patterns in a visual format useful for understanding the shared behavior within groups of time series. Transition icons are adept at detecting and displaying subtle differences and similarities, e.g., between measurements taken from patients receiving different treatment strategies or stratified by demographics. We introduce various methods that collectively allow for exploratory analysis of groups of time series, while being free of distribution assumptions and including simple heuristics for parameter determination. Our technique extracts discrete transition patterns from symbolic aggregate approXimation representations, and compiles transition frequencies into a bag of patterns constructed for each group. These transition frequencies are normalized and aligned in icon form to intuitively display the underlying patterns. We demonstrate the transition icon technique for two time-series datasets: postoperative pain scores and hip-worn accelerometer activity counts. We believe transition icons can be an important tool for researchers approaching time-series data, as they give rich and intuitive information about collective time-series behaviors.
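The pipeline described above (symbolic discretization, then counting symbol-to-symbol transitions and normalizing them) might be sketched as follows. This is an illustrative sketch under assumptions, not the authors' implementation: it uses the textbook SAX recipe (z-normalization, piecewise aggregate approximation, equiprobable Gaussian breakpoints), and the function names `sax_symbols` and `transition_frequencies` are hypothetical.

```python
from collections import Counter
from statistics import NormalDist, mean, pstdev

def sax_symbols(series, n_segments, alphabet_size):
    """Convert a series to SAX symbols encoded as integers 0..alphabet_size-1."""
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma for x in series]        # z-normalize (assumes sigma > 0)
    seg_len = len(z) // n_segments                # any trailing remainder is dropped
    paa = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]
    # Breakpoints that cut the standard normal into equiprobable bins
    cuts = [NormalDist().inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]
    return [sum(v > c for c in cuts) for v in paa]

def transition_frequencies(symbols):
    """Relative frequency of each (current, next) symbol pair."""
    counts = Counter(zip(symbols, symbols[1:]))
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}
```

For a group of series, one plausible reading of "bag of patterns constructed for each group" is to pool the pair counts over all series in the group before normalizing; the resulting alphabet_size by alphabet_size table is what an icon would display.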
Multifractal analysis of visibility graph-based Ito-related connectivity time series.
Czechowski, Zbigniew; Lovallo, Michele; Telesca, Luciano
2016-02-01
In this study, we investigate multifractal properties of connectivity time series resulting from the visibility graph applied to normally distributed time series generated by the Ito equations with multiplicative power-law noise. We show that multifractality of the connectivity time series (i.e., the series of numbers of links outgoing any node) increases with the exponent of the power-law noise. The multifractality of the connectivity time series could be due to the width of connectivity degree distribution that can be related to the exit time of the associated Ito time series. Furthermore, the connectivity time series are characterized by persistence, although the original Ito time series are random; this is due to the procedure of visibility graph that, connecting the values of the time series, generates persistence but destroys most of the nonlinear correlations. Moreover, the visibility graph is sensitive for detecting wide "depressions" in input time series.
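The connectivity series mentioned above can be illustrated with a short sketch. It assumes the natural visibility criterion (two samples are linked when every sample between them lies strictly below the straight line joining them) and evenly spaced samples; the function name `visibility_degrees` is hypothetical, and the brute-force loop is chosen for clarity rather than speed.

```python
def visibility_degrees(series):
    """Degree of each node in the natural visibility graph of a series."""
    n = len(series)
    degree = [0] * n
    for a in range(n - 1):
        for b in range(a + 1, n):
            ya, yb = series[a], series[b]
            # Every intermediate sample must lie below the line from (a, ya) to (b, yb)
            visible = all(
                series[c] < yb + (ya - yb) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                degree[a] += 1
                degree[b] += 1
    return degree
```

The returned list is itself a time series of link counts, which is the object whose multifractal properties the study examines.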
Mathematical foundations of time series analysis: a concise introduction
Beran, Jan
2017-01-01
This book provides a concise introduction to the mathematical foundations of time series analysis, with an emphasis on mathematical clarity. The text is reduced to the essential logical core, mostly using the symbolic language of mathematics, thus enabling readers to very quickly grasp the essential reasoning behind time series analysis. It appeals to anybody wanting to understand time series in a precise, mathematical manner. It is suitable for graduate courses in time series analysis but is equally useful as a reference work for students and researchers alike.
Time series analysis in the social sciences: the fundamentals
Shin, Youseop
2017-01-01
Time Series Analysis in the Social Sciences is a practical and highly readable introduction written exclusively for students and researchers whose mathematical background is limited to basic algebra. The book focuses on fundamental elements of time series analysis that social scientists need to understand so they can employ time series analysis for their research and practice. Through step-by-step explanations and using monthly violent crime rates as case studies, this book explains univariate time series from the preliminary visual analysis through the modeling of seasonality, trends, and re
Algorithm for Compressing Time-Series Data
Hawkins, S. Edward, III; Darlington, Edward Hugo
2012-01-01
An algorithm based on Chebyshev polynomials effects lossy compression of time-series data or other one-dimensional data streams (e.g., spectral data) that are arranged in blocks for sequential transmission. The algorithm was developed for use in transmitting data from spacecraft scientific instruments to Earth stations. In spite of its lossy nature, the algorithm preserves the information needed for scientific analysis. The algorithm is computationally simple, yet compresses data streams by factors much greater than two. The algorithm is not restricted to spacecraft or scientific uses: it is applicable to time-series data in general. The algorithm can also be applied to general multidimensional data that have been converted to time-series data, a typical example being image data acquired by raster scanning. However, unlike most prior image-data-compression algorithms, this algorithm neither depends on nor exploits the two-dimensional spatial correlations that are generally present in images. In order to understand the essence of this compression algorithm, it is necessary to understand that the net effect of this algorithm and the associated decompression algorithm is to approximate the original stream of data as a sequence of finite series of Chebyshev polynomials. For the purpose of this algorithm, a block of data or interval of time for which a Chebyshev polynomial series is fitted to the original data is denoted a fitting interval. Chebyshev approximation has two properties that make it particularly effective for compressing serial data streams with minimal loss of scientific information: The errors associated with a Chebyshev approximation are nearly uniformly distributed over the fitting interval (this is known in the art as the "equal error property"); and the maximum deviations of the fitted Chebyshev polynomial from the original data have the smallest possible values (this is known in the art as the "min-max property").
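The compress/decompress round trip described above can be sketched as follows. The abstract does not say how the Chebyshev coefficients are fitted, so this sketch makes an assumption: it uses an ordinary least-squares fit over one fitting interval, with sample indices mapped linearly onto [-1, 1]. All function names are hypothetical, and a least-squares fit on evenly spaced samples only approximates the min-max behavior the abstract describes.

```python
def cheb_basis(x, degree):
    """Values T_0(x)..T_degree(x) from the recurrence T_k = 2x*T_{k-1} - T_{k-2}."""
    t = [1.0, x]
    for _ in range(2, degree + 1):
        t.append(2.0 * x * t[-1] - t[-2])
    return t[:degree + 1]

def solve(a, b):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def sample_points(n):
    """Map sample indices 0..n-1 onto [-1, 1] (requires n >= 2)."""
    return [2.0 * i / (n - 1) - 1.0 for i in range(n)]

def compress_block(block, degree):
    """Least-squares Chebyshev coefficients for one fitting interval."""
    rows = [cheb_basis(x, degree) for x in sample_points(len(block))]
    k = degree + 1
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, block)) for i in range(k)]
    return solve(ata, aty)

def decompress_block(coeffs, n):
    """Rebuild n samples from the stored coefficients."""
    degree = len(coeffs) - 1
    return [sum(c * t for c, t in zip(coeffs, cheb_basis(x, degree)))
            for x in sample_points(n)]
```

Storing `degree + 1` coefficients in place of a block of `n` samples gives a nominal compression factor of `n / (degree + 1)`; the reconstruction is exact only when the block happens to be a polynomial of at most that degree.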
Modeling of Volatility with Non-linear Time Series Model
Kim Song Yon; Kim Mun Chol
2013-01-01
In this paper, non-linear time series models are used to describe volatility in financial time series data. To describe volatility, two non-linear time series models are combined to form a TAR (Threshold Auto-Regressive) model with an AARCH (Asymmetric Auto-Regressive Conditional Heteroskedasticity) error term, and its parameter estimation is studied.
Machiwal, Deepesh; Kumar, Sanjay; Dayal, Devi
2016-05-01
This study aimed to characterize rainfall dynamics in a hot arid region of Gujarat, India, by employing time-series modeling techniques and a sustainability approach. Five characteristics, i.e., normality, stationarity, homogeneity, presence/absence of trend, and persistence, of the 34-year (1980-2013) annual rainfall time series of ten stations were identified by applying multiple parametric and non-parametric statistical tests. Furthermore, the study proposes a sustainability concept for evaluating rainfall time series and demonstrates the concept, for the first time, by identifying the most sustainable rainfall series following a reliability (Ry), resilience (Re), and vulnerability (Vy) approach. Box-whisker plots, normal probability plots, and histograms indicated that the annual rainfall of Mandvi and Dayapar stations is relatively more positively skewed and non-normal compared with that of other stations, which is due to the presence of severe outliers and extremes. Results of the Shapiro-Wilk test and Lilliefors test revealed that the annual rainfall series of all stations deviated significantly from the normal distribution. Two parametric t tests and the non-parametric Mann-Whitney test indicated significant non-stationarity in annual rainfall of Rapar station, where the rainfall was also found to be non-homogeneous based on the results of four parametric homogeneity tests. Four trend tests indicated significantly increasing rainfall trends at Rapar and Gandhidham stations. The autocorrelation analysis suggested statistically significant persistence in the rainfall series of Bhachau (3-year time lag), Mundra (1- and 9-year time lag), Nakhatrana (9-year time lag), and Rapar (3- and 4-year time lag). Results of the sustainability approach indicated that annual rainfall of Mundra and Naliya stations (Ry = 0.50 and 0.44; Re = 0.47 and 0.47; Vy = 0.49 and 0.46, respectively) are the most sustainable and dependable
Layered Ensemble Architecture for Time Series Forecasting.
Rahman, Md Mustafizur; Islam, Md Monirul; Murase, Kazuyuki; Yao, Xin
2016-01-01
Time series forecasting (TSF) has been widely used in many application areas such as science, engineering, and finance. The phenomena generating time series are usually unknown, and the information available for forecasting is limited to the past values of the series. It is, therefore, necessary to use an appropriate number of past values, termed the lag, for forecasting. This paper proposes a layered ensemble architecture (LEA) for TSF problems. Our LEA consists of two layers, each of which uses an ensemble of multilayer perceptron (MLP) networks. While the first ensemble layer tries to find an appropriate lag, the second ensemble layer employs the obtained lag for forecasting. Unlike most previous work on TSF, the proposed architecture considers both accuracy and diversity of the individual networks in constructing an ensemble. LEA trains different networks in the ensemble by using different training sets with the aim of maintaining diversity among the networks. However, it uses the appropriate lag and combines the best trained networks to construct the ensemble. This indicates LEA's emphasis on the accuracy of the networks. The proposed architecture has been tested extensively on time series data from the NN3 and NN5 neural network competitions. It has also been tested on several standard benchmark time series data. In terms of forecasting accuracy, our experimental results have revealed clearly that LEA is better than other ensemble and nonensemble methods.
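To make the role of the lag concrete, the following sketch shows how a chosen lag turns a univariate series into input/target pairs that a network such as an MLP could be trained on. It illustrates only this data-preparation step, not the LEA itself; the function name `make_lagged_pairs` is hypothetical.

```python
def make_lagged_pairs(series, lag):
    """Build (past `lag` values -> next value) training pairs from a series."""
    n_pairs = len(series) - lag
    inputs = [series[i:i + lag] for i in range(n_pairs)]
    targets = [series[i + lag] for i in range(n_pairs)]
    return inputs, targets
```

A larger lag gives each input more history but leaves fewer training pairs, which is one reason the choice of lag matters for forecasting accuracy.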
Using the mean approach in pooling cross-section and time series data for regression modelling
International Nuclear Information System (INIS)
Nuamah, N.N.N.N.
1989-12-01
The mean approach is one of the methods for pooling cross section and time series data for mathematical-statistical modelling. Though a simple approach, its results are sometimes paradoxical in nature. However, researchers still continue using it for its simplicity. Here, the paper investigates the nature and source of such unwanted phenomena. (author). 7 refs
Fractal time series analysis of postural stability in elderly and control subjects
Directory of Open Access Journals (Sweden)
Doussot Michel
2007-05-01
Background: The study of balance using stabilogram analysis is of particular interest in the study of falls. Although simple statistical parameters derived from the stabilogram have been shown to predict risk of falls, such measures offer little insight into the underlying control mechanisms responsible for degradation in balance. In contrast, fractal and non-linear time-series analysis of stabilograms, such as estimations of the Hurst exponent (H), may provide information related to the underlying motor control strategies governing postural stability. In order to be adapted for a home-based follow-up of balance, such methods need to be robust, regardless of the experimental protocol, while producing time series that are as short as possible. The present study compares two methods of calculating H, Detrended Fluctuation Analysis (DFA) and Stabilogram Diffusion Analysis (SDA), for elderly and control subjects, and evaluates the effect of recording duration. Methods: Centre of pressure signals were obtained from 90 young adult subjects and 10 elderly subjects. Data were sampled at 100 Hz for 30 s, including stepping onto and off the force plate. Estimations of H were made using sliding windows of 10, 5, and 2.5 s durations, with windows slid forward in 1-s increments. Multivariate analysis of variance was used to test for the effect of time, age and estimation method on the Hurst exponent, while the intra-class correlation coefficient (ICC) was used as a measure of reliability. Results: Both SDA and DFA methods were able to identify differences in postural stability between control and elderly subjects for time series as short as 5 s, with ICC values as high as 0.75 for DFA. Conclusion: Both methods would be well-suited to non-invasive longitudinal assessment of balance. In addition, reliable estimations of H were obtained from time series as short as 5 s.
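The windowing scheme in the Methods can be worked through numerically. The sketch below assumes that windows are taken over the full 30 s record (the abstract does not say whether the stepping-on and stepping-off segments were excluded) and that only complete windows are kept; the function name `sliding_windows` is hypothetical.

```python
def sliding_windows(n_samples, fs_hz, window_s, step_s):
    """(start, end) sample indices of every complete sliding window."""
    win = int(round(window_s * fs_hz))
    step = int(round(step_s * fs_hz))
    return [(start, start + win) for start in range(0, n_samples - win + 1, step)]

# 30 s at 100 Hz gives 3000 samples; windows advance by 1 s (100 samples).
for window_s in (10, 5, 2.5):
    print(window_s, len(sliding_windows(3000, 100, window_s, 1)))
```

Under these assumptions each record yields 21, 26, and 28 windows for the 10, 5, and 2.5 s durations respectively, so shorter windows give more estimates of H per recording, each based on fewer samples.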
Forecasting incidence of dengue in Rajasthan, using time series analyses.
Bhatnagar, Sunil; Lal, Vivek; Gupta, Shiv D; Gupta, Om P
2012-01-01
To develop a prediction model for dengue fever/dengue haemorrhagic fever (DF/DHF) using time series data over the past decade in Rajasthan and to forecast monthly DF/DHF incidence for 2011. A seasonal autoregressive integrated moving average (SARIMA) model was used for statistical modeling. During January 2001 to December 2010, the reported DF/DHF cases showed a cyclical pattern with seasonal variation. The SARIMA(0,0,1)(0,1,1)_12 model had the lowest normalized Bayesian information criterion (BIC) of 9.426 and a mean absolute percentage error (MAPE) of 263.361 and appeared to be the best model. The proportion of variance explained by the model was 54.3%. Adequacy of the model was established through the Ljung-Box test (Q statistic 4.910 and P-value 0.996), which showed no significant correlation between residuals at different lag times. The forecast for the year 2011 showed a seasonal peak in the month of October with an estimated 546 cases. Application of the SARIMA model may be useful for forecasting cases and impending outbreaks of DF/DHF and other infectious diseases that exhibit a seasonal pattern.
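Two of the diagnostics quoted above, the Ljung-Box Q statistic and the MAPE, have simple closed forms that can be sketched directly. This is a minimal illustration using the textbook definitions, Q = n(n+2) Σ r_k² / (n-k) over lags k = 1..h and MAPE = 100 × mean(|actual - forecast| / |actual|); it is not the software used in the study, and the function names are hypothetical.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t - lag] - m) for t in range(lag, n))
    return num / denom

def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

def mape(actual, forecast):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)
```

A small Q relative to the chi-squared reference distribution (equivalently, a large P-value) indicates that the residuals show no significant autocorrelation, which is how the adequacy claim above should be read.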
A window-based time series feature extraction method.
Katircioglu-Öztürk, Deniz; Güvenir, H Altay; Ravens, Ursula; Baykal, Nazife
2017-10-01
This study proposes a robust similarity score-based time series feature extraction method that is termed as Window-based Time series Feature ExtraCtion (WTC). Specifically, WTC generates domain-interpretable results and involves significantly low computational complexity thereby rendering itself useful for densely sampled and populated time series datasets. In this study, WTC is applied to a proprietary action potential (AP) time series dataset on human cardiomyocytes and three precordial leads from a publicly available electrocardiogram (ECG) dataset. This is followed by comparing WTC in terms of predictive accuracy and computational complexity with shapelet transform and fast shapelet transform (which constitutes an accelerated variant of the shapelet transform). The results indicate that WTC achieves a slightly higher classification performance with significantly lower execution time when compared to its shapelet-based alternatives. With respect to its interpretable features, WTC has a potential to enable medical experts to explore definitive common trends in novel datasets. Copyright © 2017 Elsevier Ltd. All rights reserved.
Statistical time lags in ac discharges
International Nuclear Information System (INIS)
Sobota, A; Kanters, J H M; Van Veldhuizen, E M; Haverlag, M; Manders, F
2011-01-01
The paper presents statistical time lags measured for breakdown events in near-atmospheric pressure argon and xenon. Ac voltage at 100, 400 and 800 kHz was used to drive the breakdown processes, and the voltage amplitude slope was varied between 10 and 1280 V ms⁻¹. The values obtained for the statistical time lags are roughly between 1 and 150 ms. It is shown that the statistical time lags in ac-driven discharges follow the same general trends as the discharges driven by voltage of monotonic slope. In addition, the validity of the Cobine-Easton expression is tested at an alternating voltage form.
Directory of Open Access Journals (Sweden)
Madeira Sara C
2009-06-01
Background: The ability to monitor the change in expression patterns over time, and to observe the emergence of coherent temporal responses using gene expression time series obtained from microarray experiments, is critical to advance our understanding of complex biological processes. In this context, biclustering algorithms have been recognized as an important tool for the discovery of local expression patterns, which are crucial to unravel potential regulatory mechanisms. Although most formulations of the biclustering problem are NP-hard, when working with time series expression data the interesting biclusters can be restricted to those with contiguous columns. This restriction leads to a tractable problem and enables the design of efficient biclustering algorithms able to identify all maximal contiguous column coherent biclusters. Methods: In this work, we propose e-CCC-Biclustering, a biclustering algorithm that finds and reports all maximal contiguous column coherent biclusters with approximate expression patterns in time polynomial in the size of the time series gene expression matrix. This polynomial time complexity is achieved by manipulating a discretized version of the original matrix using efficient string processing techniques. We also propose extensions to deal with missing values, to discover anticorrelated and scaled expression patterns, and to compute the errors allowed in the expression patterns in different ways. We propose a scoring criterion combining the statistical significance of expression patterns with a similarity measure between overlapping biclusters. Results: We present results on real data showing the effectiveness of e-CCC-Biclustering and its relevance in the discovery of regulatory modules describing the transcriptomic expression patterns occurring in Saccharomyces cerevisiae in response to heat stress. In particular, the results show the advantage of considering approximate patterns when compared to state of
Directory of Open Access Journals (Sweden)
Ibgtc Bowala
2017-06-01
With the rapid growth of financial markets, analysts are paying more attention to predictions. Stock data are time series data of very large volume. A feasible solution for handling the increasing amount of data is to use a cluster for parallel processing, and the Hadoop parallel computing platform is a typical representative. There are various statistical models for forecasting time series data, but accurate clusters are a prerequisite. Clustering analysis is one of the main methods for mining time series data and feeds many other analysis processes. However, general clustering algorithms cannot cluster time series data well, because such data have a special structure, high dimensionality, and highly correlated values, together with a high noise level. A novel model for time series clustering is presented that uses BIRCH on top of a piecewise SVD dimension reduction. Highly correlated features are handled by the SVD-based reduction so that the correlated behavior is preserved as well as possible, and BIRCH is then used for clustering. The model can handle massive time series data. Finally, the new model is successfully applied to real stock time series data from Yahoo Finance with satisfactory results.
Razavi, Saman; Vogel, Richard
2018-02-01
Prewhitening, the process of eliminating or reducing short-term stochastic persistence to enable detection of deterministic change, has been extensively applied to time series analysis of a range of geophysical variables. Despite the controversy around its utility, methodologies for prewhitening time series continue to be a critical feature of a variety of analyses including: trend detection of hydroclimatic variables and reconstruction of climate and/or hydrology through proxy records such as tree rings. With a focus on the latter, this paper presents a generalized approach to exploring the impact of a wide range of stochastic structures of short- and long-term persistence on the variability of hydroclimatic time series. Through this approach, we examine the impact of prewhitening on the inferred variability of time series across time scales. We document how a focus on prewhitened, residual time series can be misleading, as it can drastically distort (or remove) the structure of variability across time scales. Through examples with actual data, we show how such loss of information in prewhitened time series of tree rings (so-called "residual chronologies") can lead to the underestimation of extreme conditions in climate and hydrology, particularly droughts, reconstructed for centuries preceding the historical period.
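As a concrete reference point for the discussion above, the most common form of prewhitening removes an estimated lag-1 autoregressive component from the series. The sketch below shows that simple AR(1) variant only; it is an assumption for illustration and not the generalized approach the paper develops. The function names are hypothetical.

```python
def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation coefficient r1."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, n))
    return num / denom

def prewhiten_ar1(x):
    """Residual series x[t] - r1 * x[t-1]; one sample shorter than the input."""
    r1 = lag1_autocorrelation(x)
    return [x[t] - r1 * x[t - 1] for t in range(1, len(x))]
```

Because the subtraction damps slow variations along with short-term persistence, the residual series can understate low-frequency variability, which is the kind of distortion the abstract warns about.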
Meshgi, Ali; Schmitter, Petra; Babovic, Vladan; Chui, Ting Fong May
2014-11-01
Developing reliable methods to estimate stream baseflow has been a subject of interest due to its importance in catchment response and sustainable watershed management. However, to date, in the absence of complex numerical models, baseflow is most commonly estimated using statistically derived empirical approaches that do not directly incorporate physically-meaningful information. On the other hand, Artificial Intelligence (AI) tools such as Genetic Programming (GP) offer unique capabilities to reduce the complexities of hydrological systems without losing relevant physical information. This study presents a simple-to-use empirical equation to estimate baseflow time series using GP so that minimal data is required and physical information is preserved. A groundwater numerical model was first adopted to simulate baseflow for a small semi-urban catchment (0.043 km²) located in Singapore. GP was then used to derive an empirical equation relating baseflow time series to time series of groundwater table fluctuations, which are relatively easily measured and are physically related to baseflow generation. The equation was then generalized for approximating baseflow in other catchments and validated for a larger vegetation-dominated basin located in the US (24 km²). Overall, this study used GP to propose a simple-to-use equation to predict baseflow time series based on only three parameters: minimum daily baseflow of the entire period, area of the catchment and groundwater table fluctuations. It serves as an alternative approach for baseflow estimation in un-gauged systems when only groundwater table and soil information is available, and is thus complementary to other methods that require discharge measurements.
DTW-APPROACH FOR UNCORRELATED MULTIVARIATE TIME SERIES IMPUTATION
Phan, Thi-Thu-Hong; Poisson Caillault, Emilie; Bigand, André; Lefebvre, Alain
2017-01-01
Missing data are inevitable in almost all domains of applied sciences. Data analysis with missing values can lead to a loss of efficiency and unreliable results, especially for large missing sub-sequence(s). Some well-known methods for multivariate time series imputation require high correlations between series or their features. In this paper, we propose an approach based on the shape-behaviour relation in low/un-correlated multivariate time series under an assumption of...
Yozgatligil, Ceylan; Aslan, Sipan; Iyigun, Cem; Batmaz, Inci
2013-04-01
This study aims to compare several imputation methods to complete the missing values of spatio-temporal meteorological time series. To this end, six imputation methods are assessed with respect to various criteria including accuracy, robustness, precision, and efficiency for artificially created missing data in monthly total precipitation and mean temperature series obtained from the Turkish State Meteorological Service. Of these methods, simple arithmetic average, normal ratio (NR), and NR weighted with correlations comprise the simple ones, whereas multilayer perceptron type neural network and multiple imputation strategy adopted by Monte Carlo Markov Chain based on expectation-maximization (EM-MCMC) are computationally intensive ones. In addition, we propose a modification on the EM-MCMC method. Besides using a conventional accuracy measure based on squared errors, we also suggest the correlation dimension (CD) technique of nonlinear dynamic time series analysis which takes spatio-temporal dependencies into account for evaluating imputation performances. Depending on the detailed graphical and quantitative analysis, it can be said that although computational methods, particularly EM-MCMC method, are computationally inefficient, they seem favorable for imputation of meteorological time series with respect to different missingness periods considering both measures and both series studied. To conclude, using the EM-MCMC algorithm for imputing missing values before conducting any statistical analyses of meteorological data will definitely decrease the amount of uncertainty and give more robust results. Moreover, the CD measure can be suggested for the performance evaluation of missing data imputation particularly with computational methods since it gives more precise results in meteorological time series.
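The two simplest methods named above can be sketched as follows. The normal ratio is written here in its common textbook form, where each neighbouring station's value is scaled by the ratio of long-term normals; whether the study uses exactly this form is an assumption, and the station values are made up for illustration.

```python
def arithmetic_average(neighbor_values):
    """Estimate the missing value as the plain mean of neighbouring stations."""
    return sum(neighbor_values) / len(neighbor_values)


def normal_ratio(target_normal, neighbor_normals, neighbor_values):
    """Estimate the missing value as the mean of neighbour values,
    each rescaled by target_normal / neighbor_normal."""
    scaled = [
        target_normal / n_i * p_i
        for n_i, p_i in zip(neighbor_normals, neighbor_values)
    ]
    return sum(scaled) / len(scaled)


# Hypothetical monthly precipitation totals (mm) at two neighbouring stations.
target_normal = 50.0             # long-term normal at the station with the gap
neighbor_normals = [40.0, 100.0]
neighbor_values = [20.0, 50.0]   # observed values for the missing month

aa_estimate = arithmetic_average(neighbor_values)
nr_estimate = normal_ratio(target_normal, neighbor_normals, neighbor_values)
```

In this example both neighbours recorded half of their normal, so the normal ratio returns half of the target station's normal, while the arithmetic average ignores the difference in climatology between stations.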
Variable Selection in Time Series Forecasting Using Random Forests
Directory of Open Access Journals (Sweden)
Hristos Tyralis
2017-10-01
Time series forecasting using machine learning algorithms has gained popularity recently. Random forest is a machine learning algorithm that has been applied to time series forecasting; however, most of its forecasting properties have remained unexplored. Here we focus on assessing the performance of random forests in one-step forecasting using two large datasets of short time series, with the aim of suggesting an optimal set of predictor variables. Furthermore, we compare its performance to benchmarking methods. The first dataset is composed of 16,000 simulated time series from a variety of Autoregressive Fractionally Integrated Moving Average (ARFIMA) models. The second dataset consists of 135 mean annual temperature time series. The highest predictive performance of RF is observed when using a low number of recent lagged predictor variables. This outcome could be useful in relevant future applications, with the prospect of achieving higher predictive accuracy.
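Lagged predictor variables for one-step forecasting can be built as in the sketch below. The helper name `make_lagged` is hypothetical; the resulting rows and targets could then be passed to any regressor, for example a random forest implementation.

```python
def make_lagged(series, n_lags):
    """Build rows [x[t-1], ..., x[t-n_lags]] with target x[t]
    for every t that has n_lags previous values."""
    rows, targets = [], []
    for t in range(n_lags, len(series)):
        rows.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t])
    return rows, targets


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = make_lagged(series, 2)
# X[0] holds the two most recent values before y[0], most recent first.
```

Each additional lag shortens the training set by one row, which matters for the short series considered in the study.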
Trend time-series modeling and forecasting with neural networks.
Qi, Min; Zhang, G Peter
2008-05-01
Despite its great importance, there has been no general consensus on how to model the trends in time-series data. Compared to traditional approaches, neural networks (NNs) have shown some promise in time-series forecasting. This paper investigates how to best model trend time series using NNs. Four different strategies (raw data, raw data with time index, detrending, and differencing) are used to model various trend patterns (linear, nonlinear, deterministic, stochastic, and breaking trend). We find that with NNs differencing often gives meritorious results regardless of the underlying data generating processes (DGPs). This finding is also confirmed by the real gross national product (GNP) series.
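The four input strategies can be illustrated with a short sketch of the preprocessing only; the neural network itself is omitted. The function names are hypothetical, and detrending is shown as removal of a least-squares straight line, which is an assumption about the form of the trend.

```python
def with_time_index(x):
    """Raw data with time index: pair each value with its position."""
    return [(t, v) for t, v in enumerate(x)]


def linear_detrend(x):
    """Detrending: subtract the least-squares straight line fitted to x."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    slope = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    intercept = x_mean - slope * t_mean
    return [v - (intercept + slope * t) for t, v in enumerate(x)]


def difference(x):
    """Differencing: replace the series by its first differences."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]


raw = [2.0 + 0.5 * t for t in range(6)]  # a purely linear trend
indexed = with_time_index(raw)
detrended = linear_detrend(raw)
differenced = difference(raw)
```

For a purely linear trend, detrending leaves residuals of zero and differencing leaves a constant equal to the slope. Differencing needs no fitted parameters.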
Almog, Assaf; Garlaschelli, Diego
2014-09-01
The dynamics of complex systems, from financial markets to the brain, can be monitored in terms of multiple time series of activity of the constituent units, such as stocks or neurons, respectively. While the main focus of time series analysis is on the magnitude of temporal increments, a significant piece of information is encoded into the binary projection (i.e. the sign) of such increments. In this paper we provide further evidence of this by showing strong nonlinear relations between binary and non-binary properties of financial time series. These relations are a novel quantification of the fact that extreme price increments occur more often when most stocks move in the same direction. We then introduce an information-theoretic approach to the analysis of the binary signature of single and multiple time series. Through the definition of maximum-entropy ensembles of binary matrices and their mapping to spin models in statistical physics, we quantify the information encoded into the simplest binary properties of real time series and identify the most informative property given a set of measurements. Our formalism is able to accurately replicate, and mathematically characterize, the observed binary/non-binary relations. We also obtain a phase diagram allowing us to identify, based only on the instantaneous aggregate return of a set of multiple time series, a regime where the so-called ‘market mode’ has an optimal interpretation in terms of collective (endogenous) effects, a regime where it is parsimoniously explained by pure noise, and a regime where it can be regarded as a combination of endogenous and exogenous factors. Our approach allows us to connect spin models, simple stochastic processes, and ensembles of time series inferred from partial information.
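The binary projection described above can be sketched as taking the sign of each temporal increment. The price values below are invented, and mapping a zero increment to 0 is an assumption of this sketch rather than something stated in the abstract.

```python
def increments(series):
    """Differences between consecutive observations."""
    return [b - a for a, b in zip(series, series[1:])]


def sign(v):
    """Return +1, -1, or 0 (zero increments map to 0 in this sketch)."""
    return (v > 0) - (v < 0)


def binary_projection(series):
    """Sign of each temporal increment."""
    return [sign(d) for d in increments(series)]


# Hypothetical prices for three stocks over four time points.
stocks = {
    "A": [10.0, 10.5, 10.2, 10.9],
    "B": [20.0, 20.4, 20.1, 20.0],
    "C": [5.0, 5.1, 5.3, 5.2],
}
signs = {name: binary_projection(p) for name, p in stocks.items()}

# Average sign across stocks at each step: +1 when all stocks rise together.
n_steps = len(next(iter(signs.values())))
mean_sign = [sum(s[t] for s in signs.values()) / len(signs) for t in range(n_steps)]
```

The average sign across stocks is one simple aggregate binary property; it reaches its extremes only when every stock moves in the same direction.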
Segmentation of Nonstationary Time Series with Geometric Clustering
DEFF Research Database (Denmark)
Bocharov, Alexei; Thiesson, Bo
2013-01-01
We introduce a non-parametric method for segmentation in regime-switching time-series models. The approach is based on spectral clustering of target-regressor tuples and derives a switching regression tree, where regime switches are modeled by oblique splits. Such models can be learned efficiently...... from data, where clustering is used to propose one single split candidate at each split level. We use the class of ART time series models to serve as illustration, but because of the non-parametric nature of our segmentation approach, it readily generalizes to a wide range of time-series models that go...
Confidence in Phase Definition for Periodicity in Genes Expression Time Series.
El Anbari, Mohammed; Fadda, Abeer; Ptitsyn, Andrey
2015-01-01
Circadian oscillation in baseline gene expression plays an important role in the regulation of multiple cellular processes. Most of the knowledge of circadian gene expression is based on studies measuring gene expression over time. Our ability to dissect molecular events in time is determined by the sampling frequency of such experiments. However, the real peaks of gene activity can be at any time on or between the time points at which samples are collected. Thus, some genes with a peak activity near the observation point have their phase of oscillation detected with better precision than those which peak between observation time points. Separating genes for which we can confidently identify peak activity from ambiguous genes can improve the analysis of time series gene expression. In this study we propose a new statistical method to quantify the phase confidence of circadian genes. The numerical performance of the proposed method has been tested using three real gene expression data sets.
Time Series Decomposition into Oscillation Components and Phase Estimation.
Matsuda, Takeru; Komaki, Fumiyasu
2017-02-01
Many time series are naturally considered as a superposition of several oscillation components. For example, electroencephalogram (EEG) time series include oscillation components such as alpha, beta, and gamma. We propose a method for decomposing time series into such oscillation components using state-space models. Based on the concept of random frequency modulation, Gaussian linear state-space models for oscillation components are developed. In this model, the frequency of an oscillator fluctuates by noise. Time series decomposition is accomplished by this model in the same way as in the Bayesian seasonal adjustment method. Since the model parameters are estimated from data by the empirical Bayes method, the amplitudes and the frequencies of oscillation components are determined in a data-driven manner. Also, the appropriate number of oscillation components is determined with the Akaike information criterion (AIC). In this way, the proposed method provides a natural decomposition of the given time series into oscillation components. In neuroscience, the phase of neural time series plays an important role in neural information processing. The proposed method can be used to estimate the phase of each oscillation component and has several advantages over a conventional method based on the Hilbert transform. Thus, the proposed method enables an investigation of the phase dynamics of time series. Numerical results show that the proposed method succeeds in extracting intermittent oscillations like ripples and detecting the phase reset phenomena. We apply the proposed method to real data from various fields such as astronomy, ecology, tidology, and neuroscience.
Time series regression-based pairs trading in the Korean equities market
Kim, Saejoon; Heo, Jun
2017-07-01
Pairs trading is an instance of statistical arbitrage that relies on heavy quantitative data analysis to profit by capitalising on low-risk trading opportunities provided by anomalies of related assets. A key element in pairs trading is the rule by which open and close trading triggers are defined. This paper investigates the use of time series regression to define the rule, which has previously been identified with fixed threshold-based approaches. Empirical results indicate that our approach may yield significantly increased excess returns compared to ones obtained by previous approaches on large capitalisation stocks in the Korean equities market.
Multi-Scale Dissemination of Time Series Data
DEFF Research Database (Denmark)
Guo, Qingsong; Zhou, Yongluan; Su, Li
2013-01-01
In this paper, we consider the problem of continuous dissemination of time series data, such as sensor measurements, to a large number of subscribers. These subscribers fall into multiple subscription levels, where each subscription level is specified by the bandwidth constraint of a subscriber......, which is an abstract indicator for both the physical limits and the amount of data that the subscriber would like to handle. To handle this problem, we propose a system framework for multi-scale time series data dissemination that employs a typical tree-based dissemination network and existing time...
A Modularized Efficient Framework for Non-Markov Time Series Estimation
Schamberg, Gabriel; Ba, Demba; Coleman, Todd P.
2018-06-01
We present a compartmentalized approach to finding the maximum a-posteriori (MAP) estimate of a latent time series that obeys a dynamic stochastic model and is observed through noisy measurements. We specifically consider modern signal processing problems with non-Markov signal dynamics (e.g. group sparsity) and/or non-Gaussian measurement models (e.g. point process observation models used in neuroscience). Through the use of auxiliary variables in the MAP estimation problem, we show that a consensus formulation of the alternating direction method of multipliers (ADMM) enables iteratively computing separate estimates based on the likelihood and prior and subsequently "averaging" them in an appropriate sense using a Kalman smoother. As such, this can be applied to a broad class of problem settings and only requires modular adjustments when interchanging various aspects of the statistical model. Under broad log-concavity assumptions, we show that the separate estimation problems are convex optimization problems and that the iterative algorithm converges to the MAP estimate. As such, this framework can capture non-Markov latent time series models and non-Gaussian measurement models. We provide example applications involving (i) group-sparsity priors, within the context of electrophysiologic spectrotemporal estimation, and (ii) non-Gaussian measurement models, within the context of dynamic analyses of learning with neural spiking and behavioral observations.
Modeling multivariate time series on manifolds with skew radial basis functions.
Jamshidi, Arta A; Kirby, Michael J
2011-01-01
We present an approach for constructing nonlinear empirical mappings from high-dimensional domains to multivariate ranges. We employ radial basis functions and skew radial basis functions for constructing a model using data that are potentially scattered or sparse. The algorithm progresses iteratively, adding a new function at each step to refine the model. The placement of the functions is driven by a statistical hypothesis test that accounts for correlation in the multivariate range variables. The test is applied on training and validation data and reveals nonstatistical or geometric structure when it fails. At each step, the added function is fit to data contained in a spatiotemporally defined local region to determine the parameters--in particular, the scale of the local model. The scale of the function is determined by the zero crossings of the autocorrelation function of the residuals. The model parameters and the number of basis functions are determined automatically from the given data, and there is no need to initialize any ad hoc parameters save for the selection of the skew radial basis functions. Compactly supported skew radial basis functions are employed to improve model accuracy, order, and convergence properties. The extension of the algorithm to higher-dimensional ranges produces reduced-order models by exploiting the existence of correlation in the range variable data. Structure is tested not just in a single time series but between all pairs of time series. We illustrate the new methodologies using several illustrative problems, including modeling data on manifolds and the prediction of chaotic time series.
Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series
Vautard, R.; Ghil, M.
1989-01-01
Two dimensions of a dynamical system given by experimental time series are distinguished. Statistical dimension gives a theoretical upper bound for the minimal number of degrees of freedom required to describe the attractor up to the accuracy of the data, taking into account sampling and noise problems. The dynamical dimension is the intrinsic dimension of the attractor and does not depend on the quality of the data. Singular Spectrum Analysis (SSA) provides estimates of the statistical dimension. SSA also describes the main physical phenomena reflected by the data. It gives adaptive spectral filters associated with the dominant oscillations of the system and clarifies the noise characteristics of the data. SSA is applied to four paleoclimatic records. The principal climatic oscillations and the regime changes in their amplitude are detected. About 10 degrees of freedom are statistically significant in the data. Large noise and insufficient sample length do not allow reliable estimates of the dynamical dimension.
RADON CONCENTRATION TIME SERIES MODELING AND APPLICATION DISCUSSION.
Stránský, V; Thinová, L
2017-11-01
In 2010, continuous radon measurement was established at Mladeč Caves in the Czech Republic using a RADIM3A continuous radon monitor. In order to model the radon time series for the years 2010-15, the Box-Jenkins methodology, often used in econometrics, was applied. Because of the behavior of radon concentrations (RCs), a seasonal autoregressive integrated moving average model with exogenous variables (SARIMAX) was chosen to model the measured time series. This model uses the time series seasonality, previously acquired values and delayed atmospheric parameters to forecast RC. The developed model for the RC time series is called regARIMA(5,1,3). Model residuals could be retrospectively compared with seismic evidence of local or global earthquakes that occurred during the RC measurements. This technique enables us to assess whether continuously measured RC could serve as an earthquake precursor. © The Author 2017. Published by Oxford University Press.
Similarity estimators for irregular and age uncertain time series
Rehfeld, K.; Kurths, J.
2013-09-01
Paleoclimate time series are often irregularly sampled and age uncertain, which is an important technical challenge to overcome for successful reconstruction of past climate variability and dynamics. Visual comparison and interpolation-based linear correlation approaches have been used to infer dependencies from such proxy time series. While the first is subjective, not measurable and not suitable for the comparison of many datasets at a time, the latter introduces interpolation bias, and both face difficulties if the underlying dependencies are nonlinear. In this paper we investigate similarity estimators that could be suitable for the quantitative investigation of dependencies in irregular and age uncertain time series. We compare the Gaussian-kernel based cross correlation (gXCF, Rehfeld et al., 2011) and mutual information (gMI, Rehfeld et al., 2013) against their interpolation-based counterparts and the new event synchronization function (ESF). We test the efficiency of the methods in estimating coupling strength and coupling lag numerically, using ensembles of synthetic stalagmites with short, autocorrelated, linear and nonlinearly coupled proxy time series, and in the application to real stalagmite time series. In the linear test case coupling strength increases are identified consistently for all estimators, while in the nonlinear test case the correlation-based approaches fail. The lag at which the time series are coupled is identified correctly as the maximum of the similarity functions in around 60-55% (in the linear case) to 53-42% (for the nonlinear processes) of the cases when the dating of the synthetic stalagmite is perfectly precise. If the age uncertainty increases beyond 5% of the time series length, however, the true coupling lag is not identified more often than the others for which the similarity function was estimated. Age uncertainty contributes up to half of the uncertainty in the similarity estimation process. Time series irregularity
Similarity estimators for irregular and age-uncertain time series
Rehfeld, K.; Kurths, J.
2014-01-01
Paleoclimate time series are often irregularly sampled and age uncertain, which is an important technical challenge to overcome for successful reconstruction of past climate variability and dynamics. Visual comparison and interpolation-based linear correlation approaches have been used to infer dependencies from such proxy time series. While the first is subjective, not measurable and not suitable for the comparison of many data sets at a time, the latter introduces interpolation bias, and both face difficulties if the underlying dependencies are nonlinear. In this paper we investigate similarity estimators that could be suitable for the quantitative investigation of dependencies in irregular and age-uncertain time series. We compare the Gaussian-kernel-based cross-correlation (gXCF, Rehfeld et al., 2011) and mutual information (gMI, Rehfeld et al., 2013) against their interpolation-based counterparts and the new event synchronization function (ESF). We test the efficiency of the methods in estimating coupling strength and coupling lag numerically, using ensembles of synthetic stalagmites with short, autocorrelated, linear and nonlinearly coupled proxy time series, and in the application to real stalagmite time series. In the linear test case, coupling strength increases are identified consistently for all estimators, while in the nonlinear test case the correlation-based approaches fail. The lag at which the time series are coupled is identified correctly as the maximum of the similarity functions in around 60-55% (in the linear case) to 53-42% (for the nonlinear processes) of the cases when the dating of the synthetic stalagmite is perfectly precise. If the age uncertainty increases beyond 5% of the time series length, however, the true coupling lag is not identified more often than the others for which the similarity function was estimated. Age uncertainty contributes up to half of the uncertainty in the similarity estimation process. Time series irregularity
Directory of Open Access Journals (Sweden)
Jiří Fejfar
2012-01-01
In this paper we compare the results of three artificial intelligence algorithms in the classification of time series derived from musical excerpts. The algorithms were chosen to represent different principles of classification: a statistical approach, neural networks, and competitive learning. The first is the classical k-Nearest Neighbours algorithm; the second is the Multilayer Perceptron (MLP), an example of an artificial neural network; and the third is the Learning Vector Quantization (LVQ) algorithm, a supervised counterpart to the unsupervised Self-Organizing Map (SOM). After our own earlier experiments with unlabelled data, we moved on to using data labels, which generally led to better classification accuracy. Because we need a large data set of labelled time series (a priori knowledge of the correct class to which each time series instance belongs), we used musical excerpts as a source of real-world time series, having had good experience with them in earlier studies. We use the standard deviation of the sound signal as a descriptor of a musical excerpt's volume level. We briefly describe the principle of each algorithm as well as its implementation, giving links for further research. The classification results of each algorithm are presented in a confusion matrix that shows the number of misclassifications and allows the overall accuracy of the algorithm to be evaluated. Results are compared and particular misclassifications are discussed for each algorithm. Finally, the best solution is chosen and further research goals are given.
Westenbroek, Stephen M.; Doherty, John; Walker, John F.; Kelson, Victor A.; Hunt, Randall J.; Cera, Timothy B.
2012-01-01
The TSPROC (Time Series PROCessor) computer software uses a simple scripting language to process and analyze time series. It was developed primarily to assist in the calibration of environmental models. The software is designed to perform calculations on time-series data commonly associated with surface-water models, including calculation of flow volumes, transformation by means of basic arithmetic operations, and generation of seasonal and annual statistics and hydrologic indices. TSPROC can also be used to generate some of the key input files required to perform parameter optimization by means of the PEST (Parameter ESTimation) computer software. Through the use of TSPROC, the objective function for use in the model-calibration process can be focused on specific components of a hydrograph.
Robust Forecasting of Non-Stationary Time Series
Croux, C.; Fried, R.; Gijbels, I.; Mahieu, K.
2010-01-01
This paper proposes a robust forecasting method for non-stationary time series. The time series is modelled using non-parametric heteroscedastic regression, and fitted by a localized MM-estimator, combining high robustness and large efficiency. The proposed method is shown to produce reliable forecasts in the presence of outliers, non-linearity, and heteroscedasticity. In the absence of outliers, the forecasts are only slightly less precise than those based on a localized Least Squares estima...
The Use of Sentinel-1 Time-Series Data to Improve Flood Monitoring in Arid Areas
Directory of Open Access Journals (Sweden)
Sandro Martinis
2018-04-01
Due to the similarity of the radar backscatter over open water and over sand surfaces, reliable near real-time flood mapping based on satellite radar sensors is usually not possible in arid areas. Within this study, an approach is presented to enhance the results of an automatic Sentinel-1 flood processing chain by removing overestimations of the water extent related to low-backscattering sand surfaces using a Sand Exclusion Layer (SEL) derived from time-series statistics of Sentinel-1 data sets. The methodology was tested and validated on a flood event in May 2016 at the Webi Shabelle River, Somalia and Ethiopia, which was covered by a time series of 202 Sentinel-1 scenes within the period June 2014 to May 2017. The approach proved capable of significantly improving the classification accuracy of the Sentinel-1 flood service within this study site. The Overall Accuracy increased by ~5% to a value of 98.5% and the User’s Accuracy increased by 25.2% to a value of 96.0%. Experimental results have shown that the classification accuracy is influenced by several parameters, such as the length of the time series used for generating the SEL.
Nielsen, Allan A.; Conradsen, Knut; Skriver, Henning
2016-10-01
Time Series Econometrics for the 21st Century
Hansen, Bruce E.
2017-01-01
The field of econometrics largely started with time series analysis because many early datasets were time-series macroeconomic data. As the field developed, more cross-sectional and longitudinal datasets were collected, which today dominate the majority of academic empirical research. In nonacademic (private sector, central bank, and governmental)…
Effectiveness of firefly algorithm based neural network in time series ...
African Journals Online (AJOL)
Effectiveness of firefly algorithm based neural network in time series forecasting. ... In the experiments, three well known time series were used to evaluate the performance. Results obtained were compared with ... Keywords: Time series, Artificial Neural Network, Firefly Algorithm, Particle Swarm Optimization, Overfitting ...
Statistical inference for financial engineering
Taniguchi, Masanobu; Ogata, Hiroaki; Taniai, Hiroyuki
2014-01-01
This monograph provides the fundamentals of statistical inference for financial engineering and covers some selected methods suitable for analyzing financial time series data. In order to describe the actual financial data, various stochastic processes, e.g. non-Gaussian linear processes, non-linear processes, long-memory processes, locally stationary processes etc. are introduced and their optimal estimation is considered as well. This book also includes several statistical approaches, e.g., discriminant analysis, the empirical likelihood method, control variate method, quantile regression, realized volatility etc., which have been recently developed and are considered to be powerful tools for analyzing the financial data, establishing a new bridge between time series and financial engineering. This book is well suited as a professional reference book on finance, statistics and statistical financial engineering. Readers are expected to have an undergraduate-level knowledge of statistics.
Time Series Analysis of InSAR Data: Methods and Trends
Osmanoglu, Batuhan; Sunar, Filiz; Wdowinski, Shimon; Cano-Cabral, Enrique
2015-01-01
Time series analysis of InSAR data has emerged as an important tool for monitoring and measuring the displacement of the Earth's surface. Changes in the Earth's surface can result from a wide range of phenomena such as earthquakes, volcanoes, landslides, variations in ground water levels, and changes in wetland water levels. Time series analysis is applied to interferometric phase measurements, which wrap around when the observed motion is larger than one-half of the radar wavelength. Thus, the spatio-temporal "unwrapping" of phase observations is necessary to obtain physically meaningful results. Several different algorithms have been developed for time series analysis of InSAR data to solve for this ambiguity. These algorithms may employ different models for time series analysis, but they all generate a first-order deformation rate, which can be compared to each other. However, there is no single algorithm that can provide optimal results in all cases. Since time series analyses of InSAR data are used in a variety of applications with different characteristics, each algorithm possesses inherently unique strengths and weaknesses. In this review article, following a brief overview of InSAR technology, we discuss several algorithms developed for time series analysis of InSAR data using an example set of results for measuring subsidence rates in Mexico City.
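The wrapping ambiguity can be illustrated with a one-dimensional sketch along the time axis only. It assumes that the true phase changes by less than pi between consecutive samples; the spatio-temporal algorithms discussed in the review are considerably more involved, and the names below are hypothetical.

```python
import math


def wrap(phase):
    """Wrap a phase value into the interval [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi


def unwrap(wrapped):
    """Rebuild a continuous phase by accumulating wrapped differences.
    Valid only if the true step between samples is smaller than pi."""
    out = [wrapped[0]]
    for prev, cur in zip(wrapped, wrapped[1:]):
        out.append(out[-1] + wrap(cur - prev))
    return out


# Steadily increasing phase (0.5 rad per acquisition), observed modulo 2*pi.
true_phase = [0.5 * k for k in range(20)]
observed = [wrap(p) for p in true_phase]
recovered = unwrap(observed)
```

If the motion between two acquisitions exceeds the pi limit, the accumulated phase is off by a multiple of 2*pi from that point on, which is why additional spatial or model-based constraints are used in practice.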
Dash, Y.; Mishra, S. K.; Panigrahi, B. K.
2017-12-01
Prediction of northeast/post-monsoon rainfall, which occurs during October, November and December (OND) over the Indian peninsula, is a challenging task due to the dynamic nature of an uncertain, chaotic climate. It is imperative to elucidate this issue by examining the performance of different machine learning (ML) approaches. The prime objective of this research is to compare a) statistical prediction using historical rainfall observations and global atmosphere-ocean predictors like Sea Surface Temperature (SST) and Sea Level Pressure (SLP) and b) empirical prediction based on a time series analysis of past rainfall data without using any other predictors. Initially, ML techniques were applied to SST and SLP data (1948-2014) obtained from the NCEP/NCAR reanalysis monthly means provided by the NOAA ESRL PSD. Later, this study investigated the applicability of ML methods using the OND rainfall time series for 1948-2014 and forecasted up to 2018. The predicted values of the aforementioned methods were verified using observed time series data collected from the Indian Institute of Tropical Meteorology, and the results revealed good performance of the ML algorithms with minimal error scores. Thus, it is found that both statistical and empirical methods are useful for long range climatic projections.
Frequency-based time-series gene expression recomposition using PRIISM
Directory of Open Access Journals (Sweden)
Rosa Bruce A
2012-06-01
Background Circadian rhythm pathways influence the expression patterns of as much as 31% of the Arabidopsis genome through complicated interaction pathways, and have been found to be significantly disrupted by biotic and abiotic stress treatments, complicating treatment-response gene discovery methods due to clock pattern mismatches in the fold change-based statistics. The PRIISM (Pattern Recomposition for the Isolation of Independent Signals in Microarray data) algorithm outlined in this paper is designed to separate pattern changes induced by different forces, including treatment-response pathways and circadian clock rhythm disruptions. Results Using the Fourier transform, high-resolution time-series microarray data is projected to the frequency domain. By identifying the clock frequency range from the core circadian clock genes, we separate the frequency spectrum into different sections containing treatment-frequency (representing up- or down-regulation by an adaptive treatment response), clock-frequency (representing the circadian clock-disruption response) and noise-frequency components. Then, we project the components' spectra back to the expression domain to reconstruct isolated, independent gene expression patterns representing the effects of the different influences. By applying PRIISM to a high-resolution time-series Arabidopsis microarray dataset under a cold treatment, we systematically evaluated our method using maximum fold change and principal component analyses. The results of this study showed that the ranked treatment-frequency fold change results produce fewer false positives than the original methodology, and the 26-hour timepoint in our dataset was the best statistic for distinguishing the most known cold-response genes. In addition, six novel cold-response genes were discovered. PRIISM also provides gene expression data which represents only circadian clock influences, and may be useful for circadian clock studies
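The frequency-domain split described in this abstract can be sketched roughly as follows. This is not the PRIISM implementation: the band boundaries are fixed by hand here (the paper derives the clock range from core circadian clock genes), the synthetic signal is an assumption, and `dft`, `idft` and `band` are hypothetical helper names.

```python
import cmath
import math

def dft(x):
    """Plain discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse transform, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def band(spec, keep):
    """Zero every bin whose two-sided frequency index is not in `keep`."""
    n = len(spec)
    return [c if min(k, n - k) in keep else 0 for k, c in enumerate(spec)]

# Synthetic 48-hour series sampled hourly (illustrative values only).
n = 48
clock = [math.cos(2 * math.pi * t / 24) for t in range(n)]       # 24 h rhythm, bin 2
slow = [0.5 * math.sin(2 * math.pi * t / 48) for t in range(n)]  # slow response, bin 1
signal = [c + s for c, s in zip(clock, slow)]

spec = dft(signal)
clock_part = idft(band(spec, {2}))                           # assumed clock band
slow_part = idft(band(spec, {0, 1}))                         # assumed treatment band
rest_part = idft(band(spec, set(range(3, n // 2 + 1))))      # everything else
```

Because the three bands partition the spectrum, the three reconstructed series add back up to the original signal.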
Modelling conditional heteroscedasticity in nonstationary series
Cizek, P.; Cizek, P.; Härdle, W.K.; Weron, R.
2011-01-01
A vast amount of econometrical and statistical research deals with modeling financial time series and their volatility, which measures the dispersion of a series at a point in time (i.e., conditional variance). Although financial markets have been experiencing many shorter and longer periods of
Self-affinity in the dengue fever time series
Azevedo, S. M.; Saba, H.; Miranda, J. G. V.; Filho, A. S. Nascimento; Moret, M. A.
2016-06-01
Dengue is a complex public health problem that is common in tropical and subtropical regions. This disease has risen substantially in the last three decades, and the physical symptoms depict the self-affine behavior of the occurrences of reported dengue cases in Bahia, Brazil. This study uses detrended fluctuation analysis (DFA) to verify the scale behavior in a time series of dengue cases and to evaluate the long-range correlations that are characterized by the power law α exponent for different cities in Bahia, Brazil. The scaling exponent (α) presents different long-range correlations, i.e. uncorrelated, anti-persistent, persistent and diffusive behaviors. The long-range correlations highlight the complex behavior of the time series of this disease. The findings show that there are two distinct types of scale behavior. In the first behavior, the time series presents a persistent α exponent for a one-month period. For large periods, the time series signal approaches subdiffusive behavior. The hypothesis of the long-range correlations in the time series of the occurrences of reported dengue cases was validated. The observed self-affinity is useful as a forecasting tool for future periods through extrapolation of the α exponent behavior. This complex system has a higher predictability in a relatively short time (approximately one month), and it suggests a new tool in epidemiological control strategies. However, predictions for large periods using DFA are hidden by the subdiffusive behavior.
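For readers unfamiliar with detrended fluctuation analysis, a minimal first-order DFA sketch is given below. It is an illustration under stated assumptions, not the authors' code: the input is synthetic white noise rather than dengue case counts, the window sizes are arbitrary, and all function names are hypothetical.

```python
import math
import random

def linear_fit(xs, ys):
    """Ordinary least squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def dfa_fluctuation(series, s):
    """Root-mean-square fluctuation F(s) for window size s (linear detrending)."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for v in series:                      # cumulative sum of deviations
        total += v - mean
        profile.append(total)
    t = list(range(s))
    squared, count = 0.0, 0
    for start in range(0, len(profile) - s + 1, s):   # non-overlapping windows
        seg = profile[start:start + s]
        slope, icpt = linear_fit(t, seg)
        squared += sum((y - (slope * x + icpt)) ** 2 for x, y in zip(t, seg))
        count += s
    return math.sqrt(squared / count)

def dfa_alpha(series, scales):
    """Scaling exponent: slope of log F(s) against log s."""
    log_s = [math.log(s) for s in scales]
    log_f = [math.log(dfa_fluctuation(series, s)) for s in scales]
    return linear_fit(log_s, log_f)[0]

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2048)]   # uncorrelated test signal
alpha = dfa_alpha(noise, [8, 16, 32, 64, 128])       # expected to lie near 0.5
```

In the usual reading, an exponent near 0.5 indicates an uncorrelated series, values between 0.5 and 1 indicate persistence, and values below 0.5 indicate anti-persistence.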
SWToolbox: A surface-water tool-box for statistical analysis of streamflow time series
Kiang, Julie E.; Flynn, Kate; Zhai, Tong; Hummel, Paul; Granato, Gregory
2018-03-07
This report is a user guide for the low-flow analysis methods provided with version 1.0 of the Surface Water Toolbox (SWToolbox) computer program. The software combines functionality from two software programs—U.S. Geological Survey (USGS) SWSTAT and U.S. Environmental Protection Agency (EPA) DFLOW. Both of these programs have been used primarily for computation of critical low-flow statistics. The main analysis methods are the computation of hydrologic frequency statistics such as the 7-day minimum flow that occurs on average only once every 10 years (7Q10), computation of design flows including biologically based flows, and computation of flow-duration curves and duration hydrographs. Other annual, monthly, and seasonal statistics can also be computed. The interface facilitates retrieval of streamflow discharge data from the USGS National Water Information System and outputs text reports for a record of the analysis. Tools for graphing data and screening tests are available to assist the analyst in conducting the analysis.
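The 7Q10 statistic mentioned above starts from the lowest 7-day average flow in each year. The sketch below computes only that building block for one year of daily values; it is not SWToolbox code, it omits the frequency analysis needed to reach the 10-year recurrence value, and the function name and sample data are hypothetical.

```python
def seven_day_minimum(daily_flows, window=7):
    """Lowest mean flow over any run of `window` consecutive days."""
    if len(daily_flows) < window:
        raise ValueError("need at least one full window of daily flows")
    running = sum(daily_flows[:window])
    best = running
    for i in range(window, len(daily_flows)):
        running += daily_flows[i] - daily_flows[i - window]   # slide the window
        best = min(best, running)
    return best / window

# Illustrative record: a one-week low-flow spell inside higher flows.
flows = [10] * 5 + [3] * 7 + [10] * 5
low_flow = seven_day_minimum(flows)
```

Applying the function to each year of record gives the annual series to which a frequency distribution would then be fitted.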
Time-Elastic Generative Model for Acceleration Time Series in Human Activity Recognition
Directory of Open Access Journals (Sweden)
Mario Munoz-Organero
2017-02-01
Body-worn sensors in general and accelerometers in particular have been widely used in order to detect human movements and activities. The execution of each type of movement by each particular individual generates sequences of time series of sensed data from which specific movement related patterns can be assessed. Several machine learning algorithms have been used over windowed segments of sensed data in order to detect such patterns in activity recognition based on intermediate features (either hand-crafted or automatically learned from data). The underlying assumption is that the computed features will capture statistical differences that can properly classify different movements and activities after a training phase based on sensed data. In order to achieve high accuracy and recall rates (and guarantee the generalization of the system to new users), the training data have to contain enough information to characterize all possible ways of executing the activity or movement to be detected. This could imply large amounts of data and a complex and time-consuming training phase, which has been shown to be even more relevant when automatically learning the optimal features to be used. In this paper, we present a novel generative model that is able to generate sequences of time series for characterizing a particular movement based on the time elasticity properties of the sensed data. The model is used to train a stack of auto-encoders in order to learn the particular features able to detect human movements. The results of movement detection using a newly generated database with information on five users performing six different movements are presented. The generalization of results using an existing database is also presented in the paper. The results show that the proposed mechanism is able to obtain acceptable recognition rates (F = 0.77) even in the case of using different people executing a different sequence of movements and using different
Knox meets Cox: adapting epidemiological space-time statistics to demographic studies.
Schmertmann, Carl P; Assunção, Renato M; Potter, Joseph E
2010-08-01
Many important questions and theories in demography focus on changes over time, and on how those changes differ over geographic and social space. Space-time analysis has always been important in studying fertility transitions, for example. However, demographers have seldom used formal statistical methods to describe and analyze time series of maps. One formal method, used widely in epidemiology, criminology, and public health, is Knox's space-time interaction test. In this article, we discuss the potential of the Knox test in demographic research and note some possible pitfalls. We demonstrate how to use familiar proportional hazards models to adapt the Knox test for demographic applications. These adaptations allow for nonrepeatable events and for the incorporation of structural variables that change in space and time. We apply the modified test to data on the onset of fertility decline in Brazil over 1960-2000 and show how the modified method can produce maps indicating where and when diffusion effects seem strongest, net of covariate effects.
Short Term Prediction of PM10 Concentrations Using Seasonal Time Series Analysis
Directory of Open Access Journals (Sweden)
Hamid Hazrul Abdul
2016-01-01
Air pollution modelling is an important tool that is commonly used to make short-term and long-term predictions. Since air pollution has a large impact, especially on human health, prediction of air pollutant concentrations is needed to help the local authorities give early warning to people who are at risk of acute and chronic health effects from air pollution. Finding the best time series model would allow predictions to be made accurately. This research was carried out to find the best time series model to predict the PM10 concentrations in Nilai, Negeri Sembilan, Malaysia. By considering two seasons, the wet season (north east monsoon) and the dry season (south west monsoon), seasonal autoregressive integrated moving average models were used to find the most suitable model to predict the PM10 concentrations in Nilai, Negeri Sembilan, using three error measures. Based on AIC statistics, results show that ARIMA(1, 1, 1) × (1, 0, 0)_12 is the most suitable model to predict PM10 concentrations in Nilai, Negeri Sembilan.
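For reference, the selected seasonal model, with orders (1, 1, 1) for the non-seasonal part and (1, 0, 0) at period 12 for the seasonal part, can be written out with the backshift operator B. The form below is a standard textbook statement, not taken from the paper: the symbols are generic, the constant term is omitted, and the sign convention for the moving-average coefficient is an assumption (some texts write 1 − θ₁B).

```latex
% y_t: PM10 concentration at time t;  B y_t = y_{t-1};  \varepsilon_t: white noise
(1 - \phi_1 B)\,(1 - \Phi_1 B^{12})\,(1 - B)\, y_t \;=\; (1 + \theta_1 B)\,\varepsilon_t
```

Here φ₁ is the non-seasonal autoregressive coefficient, Φ₁ the seasonal autoregressive coefficient at lag 12, (1 − B) the single non-seasonal difference, and θ₁ the moving-average coefficient.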
On the plurality of times: disunified time and the A-series | Nefdt ...
African Journals Online (AJOL)
Then, I attempt to show that disunified time is a problem for a semantics based on the A-series since A-truthmakers are hard to come by in a universe of temporally disconnected time-series. Finally, I provide a novel argument showing that presentists should be particularly fearful of such a universe. South African Journal of ...
Time-series modeling of long-term weight self-monitoring data.
Helander, Elina; Pavel, Misha; Jimison, Holly; Korhonen, Ilkka
2015-08-01
Long-term self-monitoring of weight is beneficial for weight maintenance, especially after weight loss. Connected weight scales accumulate time series information over the long term and hence enable time series analysis of the data. The analysis can reveal individual patterns, provide more sensitive detection of significant weight trends, and enable more accurate and timely prediction of weight outcomes. However, long-term self-weighing data pose several challenges that complicate the analysis. In particular, irregular sampling, missing data, and periodic (e.g. diurnal and weekly) patterns are common. In this study, we apply a time series modeling approach to daily weight time series from two individuals and describe the information that can be extracted from this kind of data. We study the properties of weight time series data, missing data and its link to an individual's behavior, periodic patterns and weight series segmentation. Being able to understand behavior through weight data and give relevant feedback is desirable, since it can lead to positive interventions in health behaviors.
Time series prediction of apple scab using meteorological ...
African Journals Online (AJOL)
A new prediction model for the early warning of apple scab is proposed in this study. The method is based on artificial intelligence and time series prediction. The infection period of apple scab was evaluated as the time series prediction model instead of summation of wetness duration. Also, the relations of different ...
Time series analysis of brain regional volume by MR image
International Nuclear Information System (INIS)
Tanaka, Mika; Tarusawa, Ayaka; Nihei, Mitsuyo; Fukami, Tadanori; Yuasa, Tetsuya; Wu, Jin; Ishiwata, Kiichi; Ishii, Kenji
2010-01-01
The present study proposed a methodology for time series analysis of the volumes of the frontal, parietal, temporal and occipital lobes and the cerebellum, because such volumetric reports over the course of an individual's aging have scarcely been presented. The subjects analyzed were brain images of 2 healthy males and 18 females with an average age of 69.0 y, for which T1-weighted 3D SPGR (spoiled gradient recalled in the steady state) acquisitions with a GE SIGNA EXCITE HD 1.5T machine were conducted 4 times over a period of 42-50 months. The image size was 256 x 256 x (86-124) voxels with a digitization level of 16 bits. As the template for the regions, the standard gray matter atlas (icbn452_atlas_probability_gray) and its labeled one (icbn.Labels), provided by the UCLA Laboratory of Neuro Imaging, were used for individual standardization. Segmentation, normalization and coregistration were performed with the MR imaging software SPM8 (Statistical Parametric Mapping 8). Volumes of regions were calculated as the ratio of their voxels to the whole-brain voxels, in percent. It was found that the regional volumes decreased with aging in all of the lobes examined and in the cerebellum, by an average of -0.11, -0.07, -0.04, -0.02, and -0.03 percent per year, respectively. The procedure for calculating the regional volumes, which has hitherto been operated manually, can be conducted automatically for the individual brain using the standard atlases above. (T.T.)
Quantifying Selection with Pool-Seq Time Series Data.
Taus, Thomas; Futschik, Andreas; Schlötterer, Christian
2017-11-01
Allele frequency time series data constitute a powerful resource for unraveling mechanisms of adaptation, because the temporal dimension captures important information about evolutionary forces. In particular, Evolve and Resequence (E&R), the whole-genome sequencing of replicated experimentally evolving populations, is becoming increasingly popular. Based on computer simulations several studies proposed experimental parameters to optimize the identification of the selection targets. No such recommendations are available for the underlying parameters selection strength and dominance. Here, we introduce a highly accurate method to estimate selection parameters from replicated time series data, which is fast enough to be applied on a genome scale. Using this new method, we evaluate how experimental parameters can be optimized to obtain the most reliable estimates for selection parameters. We show that the effective population size (Ne) and the number of replicates have the largest impact. Because the number of time points and sequencing coverage had only a minor effect, we suggest that time series analysis is feasible without major increase in sequencing costs. We anticipate that time series analysis will become routine in E&R studies. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
DEFF Research Database (Denmark)
Thorndahl, Søren; Korup Andersen, Aske; Larsen, Anders Badsberg
2017-01-01
Continuous and long rainfall series are a necessity in rural and urban hydrology for analysis and design purposes. Local historical point rainfall series often cover several decades, which makes it possible to estimate rainfall means at different timescales, and to assess return periods of extreme...... includes climate changes projected to a specific future period. This paper presents a framework for resampling of historical point rainfall series in order to generate synthetic rainfall series, which has the same statistical properties as an original series. Using a number of key target predictions...... for the future climate, such as winter and summer precipitation, and representation of extreme events, the resampled historical series are projected to represent rainfall properties in a future climate. Climate-projected rainfall series are simulated by brute force randomization of model parameters, which leads...
Transformation-cost time-series method for analyzing irregularly sampled data.
Ozken, Ibrahim; Eroglu, Deniz; Stemler, Thomas; Marwan, Norbert; Bagci, G Baris; Kurths, Jürgen
2015-06-01
Irregular sampling of data sets is one of the challenges often encountered in time-series analysis, since traditional methods cannot be applied and the frequently used interpolation approach can corrupt the data and bias the subsequent analysis. Here we present the TrAnsformation-Cost Time-Series (TACTS) method, which allows us to analyze irregularly sampled data sets without degenerating the quality of the data set. Instead of using interpolation we consider time-series segments and determine how close they are to each other by determining the cost needed to transform one segment into the following one. Using a limited set of operations, each with an associated cost, to transform the time series segments, we determine a new time series, that is, our transformation-cost time series. This cost time series is regularly sampled and can be analyzed using standard methods. While our main interest is the analysis of paleoclimate data, we develop our method using numerical examples like the logistic map and the Rössler oscillator. The numerical data allows us to test the stability of our method against noise and for different irregular samplings. In addition we provide guidance on how to choose the associated costs based on the time series at hand. The usefulness of the TACTS method is demonstrated using speleothem data from the Secret Cave in Borneo that is a good proxy for paleoclimatic variability in the monsoon activity around the maritime continent.
A multidisciplinary database for geophysical time series management
Montalto, P.; Aliotta, M.; Cassisi, C.; Prestifilippo, M.; Cannata, A.
2013-12-01
The variables collected by a sensor network constitute a heterogeneous data source that needs to be properly organized in order to be used in research and geophysical monitoring. By the term time series we refer to a set of observations of a given phenomenon acquired sequentially in time. When the time intervals are equally spaced one speaks of a period or sampling frequency. Our work describes in detail a possible methodology for the storage and management of time series using a specific data structure. We designed a framework, hereinafter called TSDSystem (Time Series Database System), in order to acquire time series from different data sources and standardize them within a relational database. Standardization makes it possible to perform operations, such as query and visualization, on many measures by synchronizing them on a common time scale. The proposed architecture follows a multiple layer paradigm (Loaders layer, Database layer and Business Logic layer). Each layer is specialized in performing particular operations for the reorganization and archiving of data from different sources such as ASCII, Excel, ODBC (Open DataBase Connectivity), and files accessible from the Internet (web pages, XML). In particular, the loader layer checks the working status of each running software through a heartbeat system, in order to automate the discovery of acquisition issues and other warning conditions. Although our system has to manage huge amounts of data, performance is guaranteed by a smart table partitioning strategy that keeps the percentage of data stored in each database table balanced. TSDSystem also contains modules for the visualization of acquired data, which make it possible to query different time series over a specified time range, or to follow the real-time signal acquisition, according to a data access policy for the users.
Application of Time Series Analysis in Determination of Lag Time in Jahanbin Basin
Directory of Open Access Journals (Sweden)
Seied Yahya Mirzaee
2005-11-01
One of the important issues in the hydrological study of a basin is the determination of lag time. The rainfall-related lag time depends on several factors, such as permeability, vegetation cover, catchment slope, rainfall intensity, storm duration and type of rain. Lag time is an important parameter in many projects, such as dam design and water resource studies. The lag time of a basin can be calculated using various methods. One of these methods is time series analysis of spectral density. The analysis is based on Fourier series: the time series is approximated with sine and cosine functions. In this method, harmonically significant quantities with individual frequencies are presented. Spectral density under multiple time series can be used to obtain the basin lag time for annual runoff and short-term rainfall fluctuations. A long lag time could be due to snowmelt as well as melting ice due to rainfall on freezing days. In this research the lag time of the Jahanbin basin has been determined using the spectral density method. The catchment is subject to both rainfall and snowfall. For short-term rainfall fluctuations with return periods of 2, 3 and 4 months, the lag times were found to be 0.18, 0.5 and 0.083 months, respectively.
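The idea of a lag between rainfall and runoff can be illustrated with a simpler, swapped-in technique: picking the shift that maximizes the time-domain cross-covariance. This is not the spectral density method used in the paper; the data are synthetic and the function name `best_lag` is hypothetical.

```python
def best_lag(x, y, max_lag):
    """Lag k in 0..max_lag that maximizes the cross-covariance of x[t] and y[t + k]."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)

    def score(k):
        # Pair each x[t] with the y value observed k steps later.
        return sum((a - mx) * (b - my) for a, b in zip(x[:len(x) - k], y[k:]))

    return max(range(max_lag + 1), key=score)

# Synthetic example: runoff is the rainfall series delayed by three time steps.
rain = [0, 0, 5, 1, 0, 0, 0, 8, 2, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0]
runoff = [0, 0, 0] + rain[:-3]
lag = best_lag(rain, runoff, max_lag=6)
```

A spectral approach would instead estimate the lag from the phase of the cross-spectrum at each frequency, which allows different lags for different periodicities, as reported in the abstract.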
Modeling Time Series Data for Supervised Learning
Baydogan, Mustafa Gokce
2012-01-01
Temporal data are increasingly prevalent and important in analytics. Time series (TS) data are chronological sequences of observations and an important class of temporal data. Fields such as medicine, finance, learning science and multimedia naturally generate TS data. Each series provides a high-dimensional data vector that challenges the learning…
Blind source separation problem in GPS time series
Gualandi, A.; Serpelloni, E.; Belardinelli, M. E.
2016-04-01
A critical point in the analysis of ground displacement time series, such as those recorded by space geodetic techniques, is the development of data-driven methods that allow the different sources of deformation to be discerned and characterized in the space and time domains. Multivariate statistics includes several approaches that can be considered part of data-driven methods. A widely used technique is principal component analysis (PCA), which allows us to reduce the dimensionality of the data space while retaining most of the explained variance of the dataset. However, PCA does not perform well in finding the solution to the so-called blind source separation (BSS) problem, i.e., in recovering and separating the original sources that generate the observed data. This is mainly due to the fact that PCA minimizes the misfit calculated using an L2 norm (χ²), looking for a new Euclidean space where the projected data are uncorrelated. Independent component analysis (ICA) is a popular technique adopted to approach the BSS problem. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, we test the use of a modified variational Bayesian ICA (vbICA) method to recover the multiple sources of ground deformation even in the presence of missing data. The vbICA method models the probability density function (pdf) of each source signal using a mix of Gaussian distributions, allowing for more flexibility in the description of the pdf of the sources with respect to standard ICA, and giving a more reliable estimate of them. Here we present its application to synthetic global positioning system (GPS) position time series, generated by simulating deformation near an active fault, including inter-seismic, co-seismic, and post-seismic signals, plus seasonal signals and noise, and an additional time-dependent volcanic source. We evaluate the ability of the PCA and ICA decomposition
Empirical method to measure stochasticity and multifractality in nonlinear time series
Lin, Chih-Hao; Chang, Chia-Seng; Li, Sai-Ping
2013-12-01
An empirical algorithm is used here to study the stochastic and multifractal nature of nonlinear time series. A parameter can be defined to quantitatively measure the deviation of the time series from a Wiener process so that the stochasticity of different time series can be compared. The local volatility of the time series under study can be constructed using this algorithm, and the multifractal structure of the time series can be analyzed by using this local volatility. As an example, we employ this method to analyze financial time series from different stock markets. The result shows that while developed markets evolve very much like an Ito process, the emergent markets are far from efficient. Differences about the multifractal structures and leverage effects between developed and emergent markets are discussed. The algorithm used here can be applied in a similar fashion to study time series of other complex systems.
Ramseyer, Fabian; Kupper, Zeno; Caspar, Franz; Znoj, Hansjörg; Tschacher, Wolfgang
2014-10-01
Processes occurring in the course of psychotherapy are characterized by the simple fact that they unfold in time and that the multiple factors engaged in change processes vary highly between individuals (idiographic phenomena). Previous research, however, has neglected the temporal perspective by its traditional focus on static phenomena, which were mainly assessed at the group level (nomothetic phenomena). To support a temporal approach, the authors introduce time-series panel analysis (TSPA), a statistical methodology explicitly focusing on the quantification of temporal, session-to-session aspects of change in psychotherapy. TSPA-models are initially built at the level of individuals and are subsequently aggregated at the group level, thus allowing the exploration of prototypical models. TSPA is based on vector auto-regression (VAR), an extension of univariate auto-regression models to multivariate time-series data. The application of TSPA is demonstrated in a sample of 87 outpatient psychotherapy patients who were monitored by postsession questionnaires. Prototypical mechanisms of change were derived from the aggregation of individual multivariate models of psychotherapy process. In a 2nd step, the associations between mechanisms of change (TSPA) and pre- to postsymptom change were explored. TSPA allowed a prototypical process pattern to be identified, where patient's alliance and self-efficacy were linked by a temporal feedback-loop. Furthermore, therapist's stability over time in both mastery and clarification interventions was positively associated with better outcomes. TSPA is a statistical tool that sheds new light on temporal mechanisms of change. Through this approach, clinicians may gain insight into prototypical patterns of change in psychotherapy. PsycINFO Database Record (c) 2014 APA, all rights reserved.
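For orientation, the vector auto-regression underlying TSPA can be stated in its generic textbook form. The abstract does not give the lag order or the exact set of session variables, so p and the contents of the vector below are assumptions for illustration.

```latex
% \mathbf{y}_t: vector of post-session ratings for one patient at session t
% A_i: coefficient matrices linking session t to session t-i;  \boldsymbol{\varepsilon}_t: noise
\mathbf{y}_t \;=\; \mathbf{c} \;+\; \sum_{i=1}^{p} A_i\, \mathbf{y}_{t-i} \;+\; \boldsymbol{\varepsilon}_t
```

In this notation, a temporal feedback loop between two variables such as alliance and self-efficacy corresponds to non-zero off-diagonal entries of a coefficient matrix in both directions.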
Clinical time series prediction: Toward a hierarchical dynamical system framework.
Liu, Zitao; Hauskrecht, Milos
2015-09-01
Developing machine learning and data mining algorithms for building temporal models of clinical time series is important for understanding of the patient condition, the dynamics of a disease, effect of various patient management interventions and clinical decision making. In this work, we propose and develop a novel hierarchical framework for modeling clinical time series data of varied length and with irregularly sampled observations. Our hierarchical dynamical system framework for modeling clinical time series combines advantages of the two temporal modeling approaches: the linear dynamical system and the Gaussian process. We model the irregularly sampled clinical time series by using multiple Gaussian process sequences in the lower level of our hierarchical framework and capture the transitions between Gaussian processes by utilizing the linear dynamical system. The experiments are conducted on the complete blood count (CBC) panel data of 1000 post-surgical cardiac patients during their hospitalization. Our framework is evaluated and compared to multiple baseline approaches in terms of the mean absolute prediction error and the absolute percentage error. We tested our framework by first learning the time series model from data for the patients in the training set, and then using it to predict future time series values for the patients in the test set. We show that our model outperforms multiple existing models in terms of its predictive accuracy. Our method achieved a 3.13% average prediction accuracy improvement on ten CBC lab time series when it was compared against the best performing baseline. A 5.25% average accuracy improvement was observed when only short-term predictions were considered. A new hierarchical dynamical system framework that lets us model irregularly sampled time series data is a promising new direction for modeling clinical time series and for improving their predictive performance. Copyright © 2014 Elsevier B.V. All rights reserved.
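The two evaluation metrics named in this abstract have simple standard definitions, sketched below. The function names and the sample values are hypothetical and are not taken from the study's data.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, expressed in percent.

    Assumption: no actual value is zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)

# Illustrative lab values and forecasts (made up for the example).
actual = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 40.0]
mae = mean_absolute_error(actual, predicted)               # about 1.33
mape = mean_absolute_percentage_error(actual, predicted)   # about 10 percent
```

The percentage form is scale-free, which makes it convenient for averaging across lab tests measured in different units.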
Clinical time series prediction: towards a hierarchical dynamical system framework
Liu, Zitao; Hauskrecht, Milos
2014-01-01
Objective: Developing machine learning and data mining algorithms for building temporal models of clinical time series is important for understanding of the patient condition, the dynamics of a disease, effect of various patient management interventions and clinical decision making. In this work, we propose and develop a novel hierarchical framework for modeling clinical time series data of varied length and with irregularly sampled observations. Materials and methods: Our hierarchical dynamical system framework for modeling clinical time series combines advantages of the two temporal modeling approaches: the linear dynamical system and the Gaussian process. We model the irregularly sampled clinical time series by using multiple Gaussian process sequences in the lower level of our hierarchical framework and capture the transitions between Gaussian processes by utilizing the linear dynamical system. The experiments are conducted on the complete blood count (CBC) panel data of 1000 post-surgical cardiac patients during their hospitalization. Our framework is evaluated and compared to multiple baseline approaches in terms of the mean absolute prediction error and the absolute percentage error. Results: We tested our framework by first learning the time series model from data for the patients in the training set, and then applying the model in order to predict future time series values for the patients in the test set. We show that our model outperforms multiple existing models in terms of its predictive accuracy. Our method achieved a 3.13% average prediction accuracy improvement on ten CBC lab time series when it was compared against the best performing baseline. A 5.25% average accuracy improvement was observed when only short-term predictions were considered. Conclusion: A new hierarchical dynamical system framework that lets us model irregularly sampled time series data is a promising new direction for modeling clinical time series and for improving their predictive
Turbulencelike Behavior of Seismic Time Series
International Nuclear Information System (INIS)
Manshour, P.; Saberi, S.; Sahimi, Muhammad; Peinke, J.; Pacheco, Amalio F.; Rahimi Tabar, M. Reza
2009-01-01
We report on a stochastic analysis of Earth's vertical velocity time series by using methods originally developed for complex hierarchical systems and, in particular, for turbulent flows. Analysis of the fluctuations of the detrended increments of the series reveals a pronounced transition in their probability density function from Gaussian to non-Gaussian. The transition occurs 5-10 hours prior to a moderate or large earthquake, hence representing a new and reliable precursor for detecting such earthquakes
Characterizing time series: when Granger causality triggers complex networks
Ge, Tian; Cui, Yindong; Lin, Wei; Kurths, Jürgen; Liu, Chong
2012-08-01
In this paper, we propose a new approach to characterize time series with noise perturbations in both the time and frequency domains by combining Granger causality and complex networks. We construct directed and weighted complex networks from time series and use representative network measures to describe their physical and topological properties. Through analyzing the typical dynamical behaviors of some physical models and the MIT-BIH (Massachusetts Institute of Technology-Beth Israel Hospital) human electrocardiogram data sets, we show that the proposed approach is able to capture and characterize various dynamics and has much potential for analyzing real-world time series of rather short length.
Characterizing time series: when Granger causality triggers complex networks
International Nuclear Information System (INIS)
Ge Tian; Cui Yindong; Lin Wei; Liu Chong; Kurths, Jürgen
2012-01-01
In this paper, we propose a new approach to characterize time series with noise perturbations in both the time and frequency domains by combining Granger causality and complex networks. We construct directed and weighted complex networks from time series and use representative network measures to describe their physical and topological properties. Through analyzing the typical dynamical behaviors of some physical models and the MIT-BIH human electrocardiogram data sets, we show that the proposed approach is able to capture and characterize various dynamics and has much potential for analyzing real-world time series of rather short length. (paper)
Multivariate time series analysis with R and financial applications
Tsay, Ruey S
2013-01-01
Since the publication of his first book, Analysis of Financial Time Series, Ruey Tsay has become one of the most influential and prominent experts on the topic of time series. Different from the traditional and oftentimes complex approach to multivariate (MV) time series, this sequel book emphasizes structural specification, which results in simplified parsimonious VARMA modeling and, hence, eases comprehension. Through a fundamental balance between theory and applications, the book supplies readers with an accessible approach to financial econometric models and their applications to real-worl
Measurements of spatial population synchrony: influence of time series transformations.
Chevalier, Mathieu; Laffaille, Pascal; Ferdy, Jean-Baptiste; Grenouillet, Gaël
2015-09-01
Two mechanisms have been proposed to explain spatial population synchrony: dispersal among populations, and the spatial correlation of density-independent factors (the "Moran effect"). To identify which of these two mechanisms is driving spatial population synchrony, time series transformations (TSTs) of abundance data have been used to remove the signature of one mechanism, and highlight the effect of the other. However, several issues with TSTs remain, and to date no consensus has emerged about how population time series should be handled in synchrony studies. Here, by using 3131 time series involving 34 fish species found in French rivers, we computed several metrics commonly used in synchrony studies to determine whether a large-scale climatic factor (temperature) influenced fish population dynamics at the regional scale, and to test the effect of three commonly used TSTs (detrending, prewhitening and a combination of both) on these metrics. We also tested whether the influence of TSTs on time series and population synchrony levels was related to the features of the time series using both empirical and simulated time series. For several species, and regardless of the TST used, we evidenced a Moran effect on freshwater fish populations. However, these results were globally biased downward by TSTs which reduced our ability to detect significant signals. Depending on the species and the features of the time series, we found that TSTs could lead to contradictory results, regardless of the metric considered. Finally, we suggest guidelines on how population time series should be processed in synchrony studies.
Neural network versus classical time series forecasting models
Nor, Maria Elena; Safuan, Hamizah Mohd; Shab, Noorzehan Fazahiyah Md; Asrul, Mohd; Abdullah, Affendi; Mohamad, Nurul Asmaa Izzati; Lee, Muhammad Hisyam
2017-05-01
Artificial neural networks (ANNs) have an advantage in time series forecasting because they have the potential to solve complex forecasting problems. This is because an ANN is a data-driven approach that can be trained to map past values of a time series. In this study, the forecast performance of a neural network was compared with that of a classical time series forecasting method, namely the seasonal autoregressive integrated moving average model, using gold price data. Moreover, the effect of different data preprocessing methods on the forecast performance of the neural network was examined. The forecast accuracy was evaluated using mean absolute deviation, root mean square error and mean absolute percentage error. It was found that the ANN produced the most accurate forecast when the Box-Cox transformation was used for data preprocessing.
Nonlinear time series analysis of the human electrocardiogram
International Nuclear Information System (INIS)
Perc, Matjaz
2005-01-01
We analyse the human electrocardiogram with simple nonlinear time series analysis methods that are appropriate for graduate as well as undergraduate courses. In particular, attention is devoted to the notions of determinism and stationarity in physiological data. We emphasize that methods of nonlinear time series analysis can be successfully applied only if the studied data set originates from a deterministic stationary system. After positively establishing the presence of determinism and stationarity in the studied electrocardiogram, we calculate the maximal Lyapunov exponent, thus providing interesting insights into the dynamics of the human heart. Moreover, to facilitate interest and enable the integration of nonlinear time series analysis methods into the curriculum at an early stage of the educational process, we also provide user-friendly programs for each implemented method
Wang, Jin; Sun, Xiangping; Nahavandi, Saeid; Kouzani, Abbas; Wu, Yuchuan; She, Mary
2014-11-01
Biomedical time series clustering that automatically groups a collection of time series according to their internal similarity is of importance for medical record management and inspection such as bio-signals archiving and retrieval. In this paper, a novel framework that automatically groups a set of unlabelled multichannel biomedical time series according to their internal structural similarity is proposed. Specifically, we treat a multichannel biomedical time series as a document and extract local segments from the time series as words. We extend a topic model, i.e., the Hierarchical probabilistic Latent Semantic Analysis (H-pLSA), which was originally developed for visual motion analysis to cluster a set of unlabelled multichannel time series. The H-pLSA models each channel of the multichannel time series using a local pLSA in the first layer. The topics learned in the local pLSA are then fed to a global pLSA in the second layer to discover the categories of multichannel time series. Experiments on a dataset extracted from multichannel Electrocardiography (ECG) signals demonstrate that the proposed method performs better than previous state-of-the-art approaches and is relatively robust to the variations of parameters including length of local segments and dictionary size. Although the experimental evaluation used the multichannel ECG signals in a biometric scenario, the proposed algorithm is a universal framework for multichannel biomedical time series clustering according to their structural similarity, which has many applications in biomedical time series management. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Statistical tests for power-law cross-correlated processes
Podobnik, Boris; Jiang, Zhi-Qiang; Zhou, Wei-Xing; Stanley, H. Eugene
2011-12-01
For stationary time series, the cross-covariance and the cross-correlation as functions of time lag n serve to quantify the similarity of two time series. The latter measure is also used to assess whether the cross-correlations are statistically significant. For nonstationary time series, the analogous measures are detrended cross-correlations analysis (DCCA) and the recently proposed detrended cross-correlation coefficient, ρDCCA(T,n), where T is the total length of the time series and n the window size. For ρDCCA(T,n), we numerically calculated the Cauchy inequality -1≤ρDCCA(T,n)≤1. Here we derive -1≤ρDCCA(T,n)≤1 for a standard variance-covariance approach and for a detrending approach. For overlapping windows, we find the range of ρDCCA within which the cross-correlations become statistically significant. For overlapping windows we numerically determine—and for nonoverlapping windows we derive—that the standard deviation of ρDCCA(T,n) tends with increasing T to 1/T. Using ρDCCA(T,n) we show that the Chinese financial market's tendency to follow the U.S. market is extremely weak. We also propose an additional statistical test that can be used to quantify the existence of cross-correlations between two power-law correlated time series.
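The detrended cross-correlation coefficient described in this abstract can be illustrated with a minimal sketch. It assumes overlapping boxes of n + 1 points and a linear detrend within each box; the function name `dcca_coefficient` and these choices are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def dcca_coefficient(x, y, n):
    """Sketch of rho_DCCA for window size n (assumed: overlapping boxes, linear detrending)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Integrated (profile) series of the mean-removed signals.
    profile_x = np.cumsum(x - x.mean())
    profile_y = np.cumsum(y - y.mean())
    t = np.arange(n + 1)
    f_xy = f_xx = f_yy = 0.0
    for start in range(len(profile_x) - n):
        box_x = profile_x[start:start + n + 1]
        box_y = profile_y[start:start + n + 1]
        # Residuals around the least-squares line fitted inside the box.
        res_x = box_x - np.polyval(np.polyfit(t, box_x, 1), t)
        res_y = box_y - np.polyval(np.polyfit(t, box_y, 1), t)
        f_xy += np.mean(res_x * res_y)
        f_xx += np.mean(res_x * res_x)
        f_yy += np.mean(res_y * res_y)
    # Ratio of detrended covariance to the product of detrended fluctuations.
    return f_xy / np.sqrt(f_xx * f_yy)
```

Because the numerator is an inner product of the concatenated residuals, the Cauchy-Schwarz inequality keeps the returned value between -1 and 1 in this sketch, which matches the bound stated in the abstract.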
Hidden Markov Models for Time Series An Introduction Using R
Zucchini, Walter
2009-01-01
Illustrates the flexibility of HMMs as general-purpose models for time series data. This work presents an overview of HMMs for analyzing time series data, from continuous-valued, circular, and multivariate series to binary data, bounded and unbounded counts and categorical observations.
Constructing ordinal partition transition networks from multivariate time series.
Zhang, Jiayang; Zhou, Jie; Tang, Ming; Guo, Heng; Small, Michael; Zou, Yong
2017-08-10
A growing number of algorithms have been proposed to map a scalar time series into ordinal partition transition networks. However, most observable phenomena in the empirical sciences are of a multivariate nature. We construct ordinal partition transition networks for multivariate time series. This approach yields weighted directed networks representing the pattern transition properties of time series in velocity space, which hence provides dynamic insights into the underlying system. Furthermore, we propose a measure of entropy to characterize ordinal partition transition dynamics, which is sensitive to capturing the possible local geometric changes of phase space trajectories. We demonstrate the applicability of pattern transition networks to capture phase coherence to non-coherence transitions, and to characterize paths to phase synchronizations. Therefore, we conclude that the ordinal partition transition network approach provides complementary insight to the traditional symbolic analysis of nonlinear multivariate time series.
Simple dead-time corrections for discrete time series of non-Poisson data
International Nuclear Information System (INIS)
Larsen, Michael L; Kostinski, Alexander B
2009-01-01
The problem of dead time (instrumental insensitivity to detectable events due to electronic or mechanical reset time) is considered. Most existing algorithms to correct for event count errors due to dead time implicitly rely on Poisson counting statistics of the underlying phenomena. However, when the events to be measured are clustered in time, the Poisson statistics assumption results in underestimating both the true event count and any statistics associated with count variability; the 'busiest' part of the signal is partially missed. Using the formalism associated with the pair-correlation function, we develop first-order correction expressions for the general case of arbitrary counting statistics. The results are verified through simulation of a realistic clustering scenario
Buonaccorsi, G A; Rose, C J; O'Connor, J P B; Roberts, C; Watson, Y; Jackson, A; Jayson, G C; Parker, G J M
2010-01-01
Clinical trials of anti-angiogenic and vascular-disrupting agents often use biomarkers derived from DCE-MRI, typically reporting whole-tumor summary statistics and so overlooking spatial parameter variations caused by tissue heterogeneity. We present a data-driven segmentation method comprising tracer-kinetic model-driven registration for motion correction, conversion from MR signal intensity to contrast agent concentration for cross-visit normalization, iterative principal components analysis for imputation of missing data and dimensionality reduction, and statistical outlier detection using the minimum covariance determinant to obtain a robust Mahalanobis distance. After applying these techniques we cluster in the principal components space using k-means. We present results from a clinical trial of a VEGF inhibitor, using time-series data selected because of problems due to motion and outlier time series. We obtained spatially-contiguous clusters that map to regions with distinct microvascular characteristics. This methodology has the potential to uncover localized effects in trials using DCE-MRI-based biomarkers.
Multiresolution analysis of Bursa Malaysia KLCI time series
Ismail, Mohd Tahir; Dghais, Amel Abdoullah Ahmed
2017-05-01
In general, a time series is simply a sequence of numbers collected at regular intervals over a period. Financial time series data processing is concerned with the theory and practice of processing asset prices over time, such as currency, commodity, and stock market data. The primary aim of this study is to understand the fundamental characteristics of selected financial time series by using time domain as well as frequency domain analysis. Prediction can then be carried out for the desired system for in-sample forecasting. In this study, multiresolution analysis with the assistance of the discrete wavelet transform (DWT) and the maximal overlap discrete wavelet transform (MODWT) is used to pinpoint special characteristics of Bursa Malaysia KLCI (Kuala Lumpur Composite Index) daily closing prices and return values. In addition, further case study discussions include the modeling of Bursa Malaysia KLCI using linear ARIMA with wavelets to address how the multiresolution approach improves fitting and forecasting results.
Hensman, James; Lawrence, Neil D; Rattray, Magnus
2013-08-20
Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications. We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method's capacity for missing data imputation, data fusion and clustering. The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method's ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications. The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in python, and are available from the authors' website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.
International Nuclear Information System (INIS)
Vajna, Szabolcs; Kertész, János; Tóth, Bálint
2013-01-01
Many human-related activities show power-law decaying interevent time distribution with exponents usually varying between 1 and 2. We study a simple task-queuing model, which produces bursty time series due to the non-trivial dynamics of the task list. The model is characterized by a priority distribution as an input parameter, which describes the choice procedure from the list. We give exact results on the asymptotic behaviour of the model and we show that the interevent time distribution is power-law decaying for any kind of input distributions that remain normalizable in the infinite list limit, with exponents tunable between 1 and 2. The model satisfies a scaling law between the exponents of interevent time distribution (β) and autocorrelation function (α): α + β = 2. This law is general for renewal processes with power-law decaying interevent time distribution. We conclude that slowly decaying autocorrelation function indicates long-range dependence only if the scaling law is violated. (paper)
Timing calibration and spectral cleaning of LOFAR time series data
Corstanje, A.; Buitink, S.; Enriquez, J. E.; Falcke, H.; Horandel, J. R.; Krause, M.; Nelles, A.; Rachen, J. P.; Schellart, P.; Scholten, O.; ter Veen, S.; Thoudam, S.; Trinh, T. N. G.
We describe a method for spectral cleaning and timing calibration of short time series data of the voltage in individual radio interferometer receivers. It makes use of phase differences in fast Fourier transform (FFT) spectra across antenna pairs. For strong, localized terrestrial sources these are
EmailTime: visual analytics and statistics for temporal email
Erfani Joorabchi, Minoo; Yim, Ji-Dong; Shaw, Christopher D.
2011-01-01
Although the discovery and analysis of communication patterns in large and complex email datasets are difficult tasks, they can be a valuable source of information. We present EmailTime, a visual analysis tool of email correspondence patterns over the course of time that interactively portrays personal and interpersonal networks using the correspondence in the email dataset. Our approach is to put time as a primary variable of interest, and plot emails along a time line. EmailTime helps email dataset explorers interpret archived messages by providing zooming, panning, filtering, highlighting, etc. To support analysis, it also measures and visualizes histograms, graph centrality and frequency on the communication graph that can be induced from the email collection. This paper describes EmailTime's capabilities, along with a large case study with the Enron email dataset to explore the behaviors of email users within different organizational positions from January 2000 to December 2001. We defined email behavior as the email activity level of people regarding a series of measured metrics, e.g. sent and received emails, numbers of email addresses, etc. These metrics were calculated through EmailTime. Results showed specific patterns in the use of email within different organizational positions. We suggest that integrating both statistics and visualizations in order to display information about the email datasets may simplify their evaluation.
Time series momentum and contrarian effects in the Chinese stock market
Shi, Huai-Long; Zhou, Wei-Xing
2017-10-01
This paper concentrates on the time series momentum or contrarian effects in the Chinese stock market. We evaluate the performance of the time series momentum strategy applied to major stock indices in mainland China and explore the relation between the performance of time series momentum strategies and some firm-specific characteristics. Our findings indicate that there is a time series momentum effect in the short run and a contrarian effect in the long run in the Chinese stock market. The performances of the time series momentum and contrarian strategies are highly dependent on the look-back and holding periods and firm-specific characteristics.
Time-Series Analysis: A Cautionary Tale
Damadeo, Robert
2015-01-01
Time-series analysis has often been a useful tool in atmospheric science for deriving long-term trends in various atmospherically important parameters (e.g., temperature or the concentration of trace gas species). In particular, time-series analysis has been repeatedly applied to satellite datasets in order to derive the long-term trends in stratospheric ozone, which is a critical atmospheric constituent. However, many of the potential pitfalls relating to the non-uniform sampling of the datasets were often ignored and the results presented by the scientific community have been unknowingly biased. A newly developed and more robust application of this technique is applied to the Stratospheric Aerosol and Gas Experiment (SAGE) II version 7.0 ozone dataset and the previous biases and newly derived trends are presented.
Characterizing interdependencies of multiple time series theory and applications
Hosoya, Yuzo; Takimoto, Taro; Kinoshita, Ryo
2017-01-01
This book introduces academic researchers and professionals to the basic concepts and methods for characterizing interdependencies of multiple time series in the frequency domain. Detecting causal directions between a pair of time series and the extent of their effects, as well as testing the non existence of a feedback relation between them, have constituted major focal points in multiple time series analysis since Granger introduced the celebrated definition of causality in view of prediction improvement. Causality analysis has since been widely applied in many disciplines. Although most analyses are conducted from the perspective of the time domain, a frequency domain method introduced in this book sheds new light on another aspect that disentangles the interdependencies between multiple time series in terms of long-term or short-term effects, quantitatively characterizing them. The frequency domain method includes the Granger noncausality test as a special case. Chapters 2 and 3 of the book introduce an i...
A perturbative approach for enhancing the performance of time series forecasting.
de Mattos Neto, Paulo S G; Ferreira, Tiago A E; Lima, Aranildo R; Vasconcelos, Germano C; Cavalcanti, George D C
2017-04-01
This paper proposes a method to perform time series prediction based on perturbation theory. The approach is based on continuously adjusting an initial forecasting model to asymptotically approximate a desired time series model. First, a predictive model generates an initial forecast for a time series. Second, a residual time series is calculated as the difference between the original time series and the initial forecast. If that residual series is not white noise, then it can be used to improve the accuracy of the initial model, and a new predictive model is fitted to the residual series. The whole process is repeated until convergence or until the residual series becomes white noise. The output of the method is then given by summing up the outputs of all trained predictive models in a perturbative sense. To test the method, an experimental investigation was conducted on six real world time series. A comparison was made with six other methods tested experimentally and ten other results found in the literature. Results show not only that the performance of the initial model is significantly improved but also that the proposed method outperforms the other results previously published. Copyright © 2017 Elsevier Ltd. All rights reserved.
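The residual-refitting loop described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the base predictor is a least-squares autoregressive model, the white-noise check is a crude lag-1 autocorrelation threshold, and the names `fit_ar_in_sample`, `lag1_autocorr` and `perturbative_fit` are hypothetical. The paper's own models and stopping test are not specified here.

```python
import numpy as np

def fit_ar_in_sample(series, order):
    """Least-squares AR(order) fit; returns in-sample one-step predictions.
    The first `order` positions have no prediction and are left at 0."""
    n = len(series)
    lags = [series[order - k - 1:n - k - 1] for k in range(order)]
    design = np.column_stack([np.ones(n - order)] + lags)
    coef, *_ = np.linalg.lstsq(design, series[order:], rcond=None)
    fitted = np.zeros(n)
    fitted[order:] = design @ coef
    return fitted

def lag1_autocorr(x):
    """Lag-1 sample autocorrelation, used as a crude whiteness check (assumption)."""
    x = x - x.mean()
    denom = np.dot(x, x)
    return 0.0 if denom == 0 else float(np.dot(x[:-1], x[1:]) / denom)

def perturbative_fit(series, order=2, max_rounds=5):
    """Sum successive models, each fitted to the residual left by the previous ones."""
    series = np.asarray(series, dtype=float)
    total = np.zeros_like(series)
    residual = series.copy()
    rounds = 0
    for _ in range(max_rounds):
        total += fit_ar_in_sample(residual, order)
        residual = series - total
        rounds += 1
        # Stop once the residual looks roughly uncorrelated at lag 1.
        if abs(lag1_autocorr(residual[order:])) < 2.0 / np.sqrt(len(series)):
            break
    return total, residual, rounds
```

Each round can only lower or keep the in-sample squared error on the fitted positions, because a model with all-zero coefficients is always available to the least-squares fit.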
Drunk driving detection based on classification of multivariate time series.
Li, Zhenlong; Jin, Xue; Zhao, Xiaohua
2015-09-01
This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent multivariate time series. A bottom-up algorithm was then employed to separate multivariate time series. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is feasible and effective. The approach has implications for drunk driving detection. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.
Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi
2015-02-01
We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.
Directory of Open Access Journals (Sweden)
Mihael Brenčič
2009-12-01
Statistical analyses of calcimetric data from boreholes BV-1 (north of Podpeč) and BV-2 (south of Črna vas) on Ljubljansko barje in central Slovenia are given. The original data are represented as unevenly spaced time series that are translated into evenly spaced time series. To calculate the interpolation weighted influence function, a model based on the power correlated influence is defined. Parameter selection is performed based on the maximum entropy principle. In the reconstructed time series, autocorrelation and Fourier power spectrum analyses are performed. In both time series, a transition from white noise to red noise was detected. Such behaviour can be described by a Lorentz process. Red noise is the result of a stochastic process with long-term memory. This effect can be seen predominantly in the autocorrelation function of borehole BV-1. In the calcimetric time series of borehole BV-2, periodicity with a period between 10.0 m and 12.5 m was also detected. We suppose that this period reflects climatic fluctuations during the Quaternary Period.
Geomechanical time series and its singularity spectrum analysis
Czech Academy of Sciences Publication Activity Database
Lyubushin, Alexei A.; Kaláb, Zdeněk; Lednická, Markéta
2012-01-01
Vol. 47, No. 1 (2012), pp. 69-77 ISSN 1217-8977 R&D Projects: GA ČR GA105/09/0089 Institutional research plan: CEZ:AV0Z30860518 Keywords: geomechanical time series * singularity spectrum * time series segmentation * laser distance meter Subject RIV: DC - Seismology, Volcanology, Earth Structure Impact factor: 0.347, year: 2012 http://www.akademiai.com/content/88v4027758382225/fulltext.pdf
Jumps in GNSS coordinates time series, a simple and fast methodology to clean the data sets
Bruni, Sara; Zerbini, Susanna; Raicich, Fabio; Errico, Maddalena; Santi, Efisio
2014-05-01
GNSS coordinate time series often suffer from the presence of undesired offsets of different nature which may impair the reliable estimation of the long-period trend and that should be corrected in the original data sets. Examples of such discontinuities are those originated by earthquakes, monumentation problems, replacement/maintenance of the station equipment, change of the reference system and by a number of unforeseen events. We have developed an automated and fast data inspection procedure for estimating the time of occurrence and the magnitude of the jumps and for correcting the time series accordingly. These processing characteristics are important because many time series are now spanning almost two decades, and dense GNSS networks are becoming a reality. The procedure has been developed and tailored to GNSS data sets starting from the Sequential T-test Analysis of Regime Shifts (STARS) originally conceived by Rodionov (Geophys. Res. Lett., 31, L09204, 2004) in the context of climatic studies. This technique does not make any a priori assumption on the time of occurrence and on the magnitude of the discontinuities. A jump is detected and its magnitude estimated when, over two consecutive time windows of the same length, the mean value exhibits a statistically significant change. Three user-defined parameters are required: the cut-off length, L, representing the minimum time interval between two consecutive discontinuities, the significance level, p, of the exploited two-tailed Student t-test, and the Huber parameter, H, used to compute a weighted mean over the L-day intervals. The method has been tested on GPS coordinates time series of stations located in the southeastern Po Plain, in Italy. The series span more than 15 years and are affected by offsets of different nature. The methodology has proven to be effective, as confirmed by the comparison between the corrected GPS time series and those obtained by other co-located observation techniques such as
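The core test named in this abstract, a significant change in the mean over two consecutive windows of equal length, can be sketched as below. This is a simplified illustration and not the STARS procedure itself: it omits the Huber-weighted mean and the cut-off length logic, and it thresholds a Welch-style t statistic directly instead of using a significance level. The function name `detect_jump` and the parameter `t_threshold` are assumptions made for this example.

```python
import numpy as np

def detect_jump(series, window, t_threshold=4.0):
    """Return (index, magnitude) of the strongest mean shift between two
    adjacent windows of equal length, or None if no shift passes the threshold."""
    series = np.asarray(series, dtype=float)
    best = None
    for i in range(window, len(series) - window + 1):
        left = series[i - window:i]
        right = series[i:i + window]
        # Standard error of the difference of the two window means.
        stderr = np.sqrt((left.var(ddof=1) + right.var(ddof=1)) / window)
        if stderr == 0:
            continue
        shift = right.mean() - left.mean()
        t_stat = shift / stderr
        if abs(t_stat) >= t_threshold and (best is None or abs(t_stat) > best[2]):
            best = (i, shift, abs(t_stat))
    return None if best is None else (best[0], best[1])
```

A detected offset could then be removed by subtracting the estimated magnitude from all samples at or after the returned index, under the assumption that the jump is a pure step.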
Pseudo-random bit generator based on lag time series
García-Martínez, M.; Campos-Cantón, E.
2014-12-01
In this paper, we present a pseudo-random bit generator (PRBG) based on two lag time series of the logistic map using positive and negative values of the bifurcation parameter. In order to hide the map used to build the pseudo-random series, we use a delay in the generation of the time series. When these new series are plotted as x_n against x_{n+1}, they present a cloud of points unrelated to the logistic map. Finally, the pseudo-random sequences have been tested with the NIST suite, giving satisfactory results for use in stream ciphers.
Non-linear forecasting in high-frequency financial time series
Strozzi, F.; Zaldívar, J. M.
2005-08-01
A new methodology based on state space reconstruction techniques has been developed for trading in financial markets. The methodology has been tested using 18 high-frequency foreign exchange time series. The results are in apparent contradiction with the efficient market hypothesis, which states that no profitable information about future movements can be obtained by studying past price series. In our (off-line) analysis, a positive gain may be obtained in all of those series. The trading methodology is quite general and may be adapted to other financial time series. Finally, the steps for its on-line application are discussed.
A Statistic Analysis Of Romanian Seaside Hydro Tourism
Secara Mirela
2011-01-01
Tourism is one of the ways of spending spare time for rest, recreation, treatment and entertainment, and the distinctive feature of the Constanta County economy is the tourist and spa exploitation of the Romanian seaside. In order to analyze hydro tourism on the Romanian seaside we have used statistical indicators for tourism as well as statistical methods such as chronological series, interdependent statistical series, regression and statistical correlation. The major objective of this research is to rai...
Analysis of JET ELMy time series
International Nuclear Information System (INIS)
Zvejnieks, G.; Kuzovkov, V.N.
2005-01-01
Full text: Achievement of the planned operational regime in the next generation tokamaks (such as ITER) still faces principal problems. One of the main challenges is obtaining control of edge localized modes (ELMs), which should lead to both long plasma pulse times and a reasonable divertor lifetime. In order to control ELMs, the hypothesis was proposed by Degeling [1] that ELMs exhibit features of chaotic dynamics and thus standard chaos control methods might be applicable. However, our findings, which are based on the nonlinear autoregressive (NAR) model, contradict this hypothesis for JET ELMy time series. In turn, this means that ELM behavior is of a relaxation or random type. These conclusions coincide with our previous results obtained for ASDEX Upgrade time series [2]. [1] A.W. Degeling, Y.R. Martin, P.E. Bak, J.B. Lister, and X. Llobet, Plasma Phys. Control. Fusion 43, 1671 (2001). [2] G. Zvejnieks, V.N. Kuzovkov, O. Dumbrajs, A.W. Degeling, W. Suttrop, H. Urano, and H. Zohm, Physics of Plasmas 11, 5658 (2004)
Analysis of time series and size of equivalent sample
International Nuclear Information System (INIS)
Bernal, Nestor; Molina, Alicia; Pabon, Daniel; Martinez, Jorge
2004-01-01
In a meteorological context, a first approach to the modeling of time series is to use models of autoregressive type. This allows one to take into account the meteorological persistence or temporal behavior, thereby identifying the memory of the analyzed process. This article seeks to present the concept of the size of an equivalent sample, which helps to identify in the data series subperiods with a similar structure. Moreover, in this article we examine the alternative of adjusting the variance of the series, keeping in mind its temporal structure, as well as an adjustment to the covariance of two time series. This article presents two examples, the first one corresponding to seven simulated series with autoregressive structure of first order, and the second corresponding to seven meteorological series of anomalies of the air temperature at the surface in two Colombian regions.
Scalable Prediction of Energy Consumption using Incremental Time Series Clustering
Energy Technology Data Exchange (ETDEWEB)
Simmhan, Yogesh; Noor, Muhammad Usman
2013-10-09
Time series datasets are a canonical form of high velocity Big Data, and often generated by pervasive sensors, such as found in smart infrastructure. Performing predictive analytics on time series data can be computationally complex, and requires approximation techniques. In this paper, we motivate this problem using a real application from the smart grid domain. We propose an incremental clustering technique, along with a novel affinity score for determining cluster similarity, which help reduce the prediction error for cumulative time series within a cluster. We evaluate this technique, along with optimizations, using real datasets from smart meters, totaling ~700,000 data points, and show the efficacy of our techniques in improving the prediction error of time series data within polynomial time.
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg; Conradsen, Knut; Skriver, Henning
2017-01-01
Based on an omnibus likelihood ratio test statistic for the equality of several variance-covariance matrices following the complex Wishart distribution and a factorization of this test statistic with associated p-values, change analysis in a time series of multilook polarimetric SAR data in the covariance matrix representation is carried out. The omnibus test statistic and its factorization detect if and when change occurs. Using airborne EMISAR and spaceborne RADARSAT-2 data this paper focuses on change detection based on the p-values, on visualization of change at pixel as well as segment level, and on computer software.
Describing temporal variability of the mean Estonian precipitation series in climate time scale
Post, P.; Kärner, O.
2009-04-01
,1,1) model can be interpreted as consisting of a random walk in a noisy environment (Box and Jenkins, 1976). The fitted model appears to be weakly non-stationary, which makes it possible to use a stationary approximation if only the noise component of that sum of white noise and random walk is exploited. We get a convenient routine to generate a stationary precipitation climatology with reasonable accuracy, since the noise component variance is much larger than the dispersion of the random walk generator. This interpretation emphasizes the dominating role of a random component in the precipitation series. The result is understandable given the small territory of Estonia, which is situated in the mid-latitude cyclone track. References Box, G.E.P. and G. Jenkins 1976: Time Series Analysis, Forecasting and Control (revised edn.), Holden Day, San Francisco, CA, 575 pp. Davis, A., Marshak, A., Wiscombe, W. and R. Cahalan 1996: Multifractal characterizations of intermittency in nonstationary geophysical signals and fields. In G. Trevino et al. (eds) Current Topics in Nonstationarity Analysis. World Scientific, Singapore, 97-158. Kärner, O. 2002: On nonstationarity and antipersistency in global temperature series. J. Geophys. Res. D107; doi:10.1029/2001JD002024. Kärner, O. 2005: Some examples on negative feedback in the Earth climate system. Centr. European J. Phys. 3; 190-208. Monin, A.S. and A.M. Yaglom 1975: Statistical Fluid Mechanics, Vol 2. Mechanics of Turbulence, MIT Press, Boston, Mass., 886 pp.
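The "random walk in a noisy environment" interpretation can be checked numerically with a small sketch. For a series formed as random walk plus independent white noise, the first differences have lag-1 autocorrelation -s_e^2 / (s_w^2 + 2 s_e^2), where s_w^2 is the variance of the walk increments and s_e^2 the noise variance; when the noise dominates, this approaches -0.5. The function names, Gaussian innovations and parameter values below are illustrative assumptions, not taken from the abstract.

```python
import random


def theoretical_lag1_autocorr(walk_var, noise_var):
    """Lag-1 autocorrelation of the first differences of (random walk + white noise)."""
    return -noise_var / (walk_var + 2.0 * noise_var)


def simulate_walk_plus_noise(n, walk_sd, noise_sd, seed=0):
    """Simulate a Gaussian random walk observed with additive Gaussian noise."""
    rng = random.Random(seed)
    level = 0.0
    series = []
    for _ in range(n):
        level += rng.gauss(0.0, walk_sd)  # random walk step
        series.append(level + rng.gauss(0.0, noise_sd))  # noisy observation
    return series


def lag1_autocorr(values):
    """Sample lag-1 autocorrelation of a sequence."""
    m = sum(values) / len(values)
    centered = [v - m for v in values]
    num = sum(a * b for a, b in zip(centered, centered[1:]))
    den = sum(c * c for c in centered)
    return num / den
```

Differencing a simulated series and comparing `lag1_autocorr` with `theoretical_lag1_autocorr` gives a quick consistency check of the noise-dominated regime the abstract describes.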
Forecasting with nonlinear time series models
DEFF Research Database (Denmark)
Kock, Anders Bredahl; Teräsvirta, Timo
In this paper, nonlinear models are restricted to mean nonlinear parametric models. Several such models popular in time series econometrics are presented and some of their properties discussed. This includes two models based on universal approximators: the Kolmogorov-Gabor polynomial model and two versions of a simple artificial neural network model. Techniques for generating multi-period forecasts from nonlinear models recursively are considered, and the direct (non-recursive) method for this purpose is mentioned as well. Forecasting with complex dynamic systems, albeit less frequently applied to economic forecasting problems, is briefly highlighted. A number of large published studies comparing macroeconomic forecasts obtained using different time series models are discussed, and the paper also contains a small simulation study comparing recursive and direct forecasts in a partic...
Directory of Open Access Journals (Sweden)
Mohammad Reza
2013-01-01
This paper investigates the relationship between logistics and economic development in Indonesia using time series data on traffic volume and economic growth for the period from 1988 to 2010. Literature reviews were conducted to find the most applicable econometric model. Cargo volume data for sea, air and rail transport are used as the logistics index, while GDP is used as the economic index. The time series data were tested using stationarity and co-integration tests. Granger causality tests were employed, and then a proposed logistic model is presented. This study showed that logistics plays an important role in supporting and sustaining economic growth, with economic growth exerting a significant demand-pull effect on logistics. Although the model is developed in the context of Indonesia, the overall statistical analysis can be generalized to other developing economies. Based on the model, this paper presents the importance of sustaining economic development by continuously improving the logistics infrastructure.
Nonparametric factor analysis of time series
Rodríguez-Poo, Juan M.; Linton, Oliver Bruce
1998-01-01
We introduce a nonparametric smoothing procedure for nonparametric factor analysis of multivariate time series. The asymptotic properties of the proposed procedures are derived. We present an application based on the residuals from the Fair macromodel.
Time Series Outlier Detection Based on Sliding Window Prediction
Directory of Open Access Journals (Sweden)
Yufeng Yu
2014-01-01
In order to detect outliers in hydrological time series data and thereby improve data quality and the quality of decisions related to the design, operation, and management of water resources, this research develops a time series outlier detection method for hydrologic data that can be used to identify data that deviate from historical patterns. The method first builds a forecasting model on the historical data and then uses it to predict future values. Anomalies are assumed to take place if the observed values fall outside a given prediction confidence interval (PCI), which can be calculated from the predicted value and a confidence coefficient. The PCI is used as the threshold mainly because it accounts for the uncertainty in the data series parameters of the forecasting model, which addresses the problem of selecting a suitable threshold. The method performs fast, incremental evaluation of data as it becomes available, scales to large quantities of data, and requires no preclassification of anomalies. Experiments with different real-world hydrologic time series showed that the proposed method is fast, correctly identifies abnormal data, and can be used for hydrologic time series analysis.
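The predict-then-compare idea in this abstract can be sketched as follows. The abstract does not specify the forecasting model or how the PCI is computed, so this sketch makes two loud assumptions: the forecast is the mean of the previous `window` values, and the interval half-width is `z` times their sample standard deviation. The function name `detect_outliers` is hypothetical.

```python
from statistics import mean, stdev


def detect_outliers(series, window, z):
    """Return indices whose value falls outside the prediction interval.

    Assumed forecaster: mean of the preceding `window` observations.
    Assumed interval: prediction +/- z * sample standard deviation of
    the same window. Both are stand-ins for the paper's model.
    """
    flagged = []
    for t in range(window, len(series)):
        history = series[t - window:t]
        predicted = mean(history)
        half_width = z * stdev(history)
        if abs(series[t] - predicted) > half_width:
            flagged.append(t)
    return flagged
```

Because each point is evaluated using only the values before it, the check can run incrementally as new observations arrive. Note that a flagged value stays in later windows here and inflates their spread; a fuller version might replace it with the prediction.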
Metagenomics meets time series analysis: unraveling microbial community dynamics
Faust, K.; Lahti, L.M.; Gonze, D.; Vos, de W.M.; Raes, J.
2015-01-01
The recent increase in the number of microbial time series studies offers new insights into the stability and dynamics of microbial communities, from the world's oceans to human microbiota. Dedicated time series analysis tools allow taking full advantage of these data. Such tools can reveal periodic
Time series forecasting based on deep extreme learning machine
Guo, Xuqi; Pang, Y.; Yan, Gaowei; Qiao, Tiezhu; Yang, Guang-Hong; Yang, Dan
2017-01-01
Multi-layer artificial neural networks (ANNs) have attracted widespread attention as a new method for time series forecasting due to their ability to approximate any nonlinear function. In this paper, a new local time series prediction model is established with the nearest neighbor domain theory, in
False-nearest-neighbors algorithm and noise-corrupted time series
International Nuclear Information System (INIS)
Rhodes, C.; Morari, M.
1997-01-01
The false-nearest-neighbors (FNN) algorithm was originally developed to determine the embedding dimension for autonomous time series. For noise-free computer-generated time series, the algorithm does a good job in predicting the embedding dimension. However, the problem of predicting the embedding dimension when the time-series data are corrupted by noise was not fully examined in the original studies of the FNN algorithm. Here it is shown that with large data sets, even small amounts of noise can lead to incorrect prediction of the embedding dimension. Surprisingly, as the length of the time series analyzed by FNN grows larger, the cause of incorrect prediction becomes more pronounced. An analysis of the effect of noise on the FNN algorithm and a solution for dealing with the effects of noise are given here. Some results on the theoretically correct choice of the FNN threshold are also presented. copyright 1997 The American Physical Society
CauseMap: fast inference of causality from complex time series.
Maher, M Cyrus; Hernandez, Ryan D
2015-01-01
Background. Establishing health-related causal relationships is a central pursuit in biomedical research. Yet, the interdependent non-linearity of biological systems renders causal dynamics laborious and at times impractical to disentangle. This pursuit is further impeded by the dearth of time series that are sufficiently long to observe and understand recurrent patterns of flux. However, as data generation costs plummet and technologies like wearable devices democratize data collection, we anticipate a coming surge in the availability of biomedically-relevant time series data. Given the life-saving potential of these burgeoning resources, it is critical to invest in the development of open source software tools that are capable of drawing meaningful insight from vast amounts of time series data. Results. Here we present CauseMap, the first open source implementation of convergent cross mapping (CCM), a method for establishing causality from long time series data (≳25 observations). Compared to existing time series methods, CCM has the advantage of being model-free and robust to unmeasured confounding that could otherwise induce spurious associations. CCM builds on Takens' Theorem, a well-established result from dynamical systems theory that requires only mild assumptions. This theorem allows us to reconstruct high dimensional system dynamics using a time series of only a single variable. These reconstructions can be thought of as shadows of the true causal system. If reconstructed shadows can predict points from opposing time series, we can infer that the corresponding variables are providing views of the same causal system, and so are causally related. Unlike traditional metrics, this test can establish the directionality of causation, even in the presence of feedback loops. Furthermore, since CCM can extract causal relationships from time series of, e.g., a single individual, it may be a valuable tool for personalized medicine. We implement CCM in Julia, a
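The reconstruction step that CCM builds on can be illustrated with a short delay-embedding sketch. This covers only the construction of the "shadow" from a single variable, not the cross-mapping or the CauseMap API (which is written in Julia); the function name `delay_embed` and the choice of backward lags are assumptions made for illustration.

```python
def delay_embed(series, dim, tau):
    """Build delay vectors (x[t], x[t - tau], ..., x[t - (dim - 1) * tau]).

    `dim` is the embedding dimension and `tau` the lag in samples.
    The first usable time index is (dim - 1) * tau.
    """
    start = (dim - 1) * tau
    return [
        tuple(series[t - k * tau] for k in range(dim))
        for t in range(start, len(series))
    ]
```

For example, `delay_embed([0, 1, 2, 3, 4, 5], 3, 2)` yields the two points `(4, 2, 0)` and `(5, 3, 1)`. In a cross-mapping test, nearest neighbors found among such points for one variable would be used to estimate contemporaneous values of the other.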
CauseMap: fast inference of causality from complex time series
Directory of Open Access Journals (Sweden)
M. Cyrus Maher
2015-03-01
Background. Establishing health-related causal relationships is a central pursuit in biomedical research. Yet, the interdependent non-linearity of biological systems renders causal dynamics laborious and at times impractical to disentangle. This pursuit is further impeded by the dearth of time series that are sufficiently long to observe and understand recurrent patterns of flux. However, as data generation costs plummet and technologies like wearable devices democratize data collection, we anticipate a coming surge in the availability of biomedically-relevant time series data. Given the life-saving potential of these burgeoning resources, it is critical to invest in the development of open source software tools that are capable of drawing meaningful insight from vast amounts of time series data. Results. Here we present CauseMap, the first open source implementation of convergent cross mapping (CCM), a method for establishing causality from long time series data (≳25 observations). Compared to existing time series methods, CCM has the advantage of being model-free and robust to unmeasured confounding that could otherwise induce spurious associations. CCM builds on Takens’ Theorem, a well-established result from dynamical systems theory that requires only mild assumptions. This theorem allows us to reconstruct high dimensional system dynamics using a time series of only a single variable. These reconstructions can be thought of as shadows of the true causal system. If reconstructed shadows can predict points from opposing time series, we can infer that the corresponding variables are providing views of the same causal system, and so are causally related. Unlike traditional metrics, this test can establish the directionality of causation, even in the presence of feedback loops. Furthermore, since CCM can extract causal relationships from time series of, e.g., a single individual, it may be a valuable tool for personalized medicine. We implement
Track Irregularity Time Series Analysis and Trend Forecasting
Directory of Open Access Journals (Sweden)
Jia Chaolong
2012-01-01
The combination of linear and nonlinear methods is widely used in the prediction of time series data. This paper analyzes track irregularity time series data by using gray incidence degree models and methods of data transformation, trying to find the implicit relationship between the time series data. In this paper, GM(1,1) is based on a first-order, single-variable linear differential equation; after an adaptive improvement and error correction, it is used to predict the long-term changing trend of track irregularity at a fixed measuring point; the stochastic linear AR, Kalman filtering model, and artificial neural network model are applied to predict the short-term changing trend of track irregularity at unit section. Both long-term and short-term changes prove that the model is effective and can achieve the expected accuracy.
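For readers unfamiliar with GM(1,1), the textbook form of the model can be sketched as below. This is the standard grey model only, without the adaptive improvement and error correction the abstract mentions, and the function names `fit_gm11` and `gm11_forecast` are hypothetical. It assumes a positive input series and a fitted coefficient a that is not zero.

```python
import math


def fit_gm11(x0):
    """Estimate the GM(1,1) development coefficient a and grey input b."""
    # Accumulated generating operation: running sums of the raw series.
    x1 = []
    total = 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    m = len(z)
    sz = sum(z)
    szz = sum(v * v for v in z)
    sy = sum(y)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    # Ordinary least squares for the grey equation y = -a * z + b.
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    return a, b


def gm11_forecast(x0, steps):
    """Forecast the next `steps` values of the raw series."""
    a, b = fit_gm11(x0)
    n = len(x0)

    def x1_hat(k):
        # Time response of the whitened equation dx1/dt + a * x1 = b.
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Differences of the fitted accumulated series restore raw-scale values.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]
```

On a series that grows by a constant factor, the forecast tracks the true continuation closely but not exactly, because the background values use a trapezoidal approximation.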
PRESEE: an MDL/MML algorithm to time-series stream segmenting.
Xu, Kaikuo; Jiang, Yexi; Tang, Mingjie; Yuan, Changan; Tang, Changjie
2013-01-01
Time-series streams are one of the most common data types in the data mining field. They are prevalent in fields such as the stock market, ecology, and medical care. Segmentation is a key step in accelerating time-series stream mining. Previous segmenting algorithms mainly focused on improving precision and paid little attention to efficiency. Moreover, the performance of these algorithms depends heavily on parameters, which are hard for users to set. In this paper, we propose PRESEE (parameter-free, real-time, and scalable time-series stream segmenting algorithm), which greatly improves the efficiency of time-series stream segmenting. PRESEE is based on both MDL (minimum description length) and MML (minimum message length) methods, which can segment the data automatically. To evaluate the performance of PRESEE, we conduct several experiments on time-series streams of different types and compare it with the state-of-the-art algorithm. The empirical results show that PRESEE is very efficient for real-time stream datasets, improving segmenting speed nearly tenfold. The novelty of this algorithm is further demonstrated by the application of PRESEE in segmenting real-time stream datasets from the ChinaFLUX sensor network data stream.
Local normalization: Uncovering correlations in non-stationary financial time series
Schäfer, Rudi; Guhr, Thomas
2010-09-01
The measurement of correlations between financial time series is of vital importance for risk management. In this paper we address an estimation error that stems from the non-stationarity of the time series. We put forward a method to rid the time series of local trends and variable volatility, while preserving cross-correlations. We test this method in a Monte Carlo simulation, and apply it to empirical data for the S&P 500 stocks.
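One way to picture the removal of local trends and variable volatility is a rolling standardization, sketched below. The exact window definition used by the authors is not given in this abstract, so the choice here (mean and population standard deviation of the `n` preceding values) is an assumption, as is the function name `local_normalize`.

```python
from statistics import mean, pstdev


def local_normalize(returns, n):
    """Standardize each value by the mean and spread of the n preceding values.

    Output index j corresponds to input index j + n. Raises
    ZeroDivisionError if a window is constant.
    """
    normalized = []
    for t in range(n, len(returns)):
        window = returns[t - n:t]
        local_mean = mean(window)  # estimate of the local trend
        local_sd = pstdev(window)  # estimate of the local volatility
        normalized.append((returns[t] - local_mean) / local_sd)
    return normalized
```

A useful property of this construction is that the output is unchanged when the input is shifted by a constant or rescaled by a positive factor, which is what makes it insensitive to slowly varying level and volatility.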
Mackenzie River Delta morphological change based on Landsat time series
Vesakoski, Jenni-Mari; Alho, Petteri; Gustafsson, David; Arheimer, Berit; Isberg, Kristina
2015-04-01
Arctic rivers are sensitive and still largely unexplored river systems that climate change will affect. Research has not focused in detail on the fluvial geomorphology of Arctic rivers, mainly because of the remoteness and extent of the watersheds, problems with data availability and difficult accessibility. Wide collaborative spatial databases in hydrology and extensive remote sensing datasets over the Arctic are now available, and they enable improved investigation of Arctic watersheds. It is therefore also important to develop and improve methods for detecting fluvio-morphological processes from the available data. Furthermore, it is essential to reconstruct and improve the understanding of past fluvial processes in order to better understand prevailing and future ones. In this study we summarize the fluvial geomorphological change in the Mackenzie River Delta during the last ~30 years. The Mackenzie River Delta (~13 000 km²) is situated in the Northwest Territories, Canada, where the Mackenzie River enters the Beaufort Sea, Arctic Ocean, near the city of Inuvik. The Mackenzie River Delta is a lake-rich, productive ecosystem and an ecologically sensitive environment. The research objective is achieved through two sub-objectives: 1) Interpretation of the deltaic river channel planform change by applying Landsat time series. 2) Definition of the variables that have had the greatest impact on the detected changes by applying statistics and long hydrological time series derived from the Arctic-HYPE model (HYdrologic Predictions for Environment) developed by the Swedish Meteorological and Hydrological Institute. According to our satellite interpretation, field observations and statistical analyses, notable spatio-temporal changes have occurred in the morphology of the river channel and delta during the past 30 years. For example, the channels have been developing in braiding and sinuosity. In addition, various linkages between the studied
Fuzzy time-series based on Fibonacci sequence for stock price forecasting
Chen, Tai-Liang; Cheng, Ching-Hsue; Jong Teoh, Hia
2007-07-01
Time-series models have been utilized to make reasonably accurate predictions in the areas of stock price movements, academic enrollments, weather, etc. To improve the forecasting performance of fuzzy time-series models, this paper proposes a new model that incorporates the concept of the Fibonacci sequence, the framework of Song and Chissom's model and the weighted method of Yu's model. This paper employs 5 years of TSMC (Taiwan Semiconductor Manufacturing Company) stock price data and 13 years of TAIEX (Taiwan Stock Exchange Capitalization Weighted Stock Index) stock index data as experimental datasets. By comparing our forecasting performance with Chen's (Forecasting enrollments based on fuzzy time-series. Fuzzy Sets Syst. 81 (1996) 311-319), Yu's (Weighted fuzzy time-series models for TAIEX forecasting. Physica A 349 (2004) 609-624) and Huarng's (The application of neural networks to forecast fuzzy time series. Physica A 336 (2006) 481-491) models, we conclude that the proposed model surpasses these conventional fuzzy time-series models in accuracy.
Connection between recurrence time statistics and anomalous transport
International Nuclear Information System (INIS)
Zaslavsky, G.M.; Tippett, M.K.
1991-01-01
For a model stationary flow with hexagonal symmetry, the recurrence time statistics are studied. The model has been shown to have a sharp transition from normal to anomalous transport. Here it is shown that this transition is accompanied by a corresponding change of the recurrence time statistics from normal to anomalous. The latter displays the existence of a power tail. Recurrence time statistics provide a local measurement of anomalous transport that is of practical interest.
Amorese, D.; Grasso, J.-R.; Garambois, S.; Font, M.
2018-05-01
The rank-sum multiple change-point method is a robust statistical procedure designed to search for the optimal number and location of change points in an arbitrary continuous or discrete sequence of values. As such, this procedure can be used to analyse time-series data. Twelve years of robust data sets for the Séchilienne (French Alps) rockslide show a continuous increase in average displacement rate from 50 to 280 mm per month in the 2004-2014 period, followed by a strong decrease back to 50 mm per month in the 2014-2015 period. Where possible kinematic phases have been tentatively suggested in previous studies, they rely solely on empirical threshold values. In this paper, we analyse how the use of a statistical algorithm for change-point detection helps to better understand time phases in landslide kinematics. First, we test the efficiency of the statistical algorithm on geophysical benchmark data, these data sets (stream flows and Northern Hemisphere temperatures) having already been analysed by independent statistical tools. Second, we apply the method to 12-yr daily time-series of the Séchilienne landslide, for rainfall and displacement data, from 2003 December to 2015 December, in order to quantitatively extract changes in landslide kinematics. We find two strong significant discontinuities in the weekly cumulated rainfall values: an average rainfall rate increase is resolved in 2012 April and a decrease in 2014 August. Four robust changes are highlighted in the displacement time-series (2008 May, 2009 November-December-2010 January, 2012 September and 2014 March), the 2010 one being preceded by a significant but weak rainfall rate increase (in 2009 November). Accordingly, we are able to quantitatively define five kinematic stages for the Séchilienne rock avalanche during this period. The synchronization between the rainfall and displacement rate, only resolved at the end of 2009 and beginning of 2010, corresponds to a remarkable change (fourfold
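The core of a rank-sum change-point search can be sketched for the simplest case of a single candidate change point. This is not the multiple change-point procedure used in the paper: it only locates the split that maximizes an adjusted rank-sum score and performs no significance test or iteration. The function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_change_point(values):
    """Return (k, score) where the first k points form the earlier segment.

    The score |2 * SR_k - k * (n + 1)| compares the rank sum SR_k of the
    first k points with its expectation under no change.
    """
    n = len(values)
    ranks = average_ranks(values)
    best_k, best_score = None, -1.0
    running = 0.0
    for k in range(1, n):
        running += ranks[k - 1]
        score = abs(2.0 * running - k * (n + 1))
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score
```

Because only ranks enter the score, a few extreme values move the result far less than they would in a mean-based test, which is the sense in which such procedures are robust.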
Parameterizing unconditional skewness in models for financial time series
DEFF Research Database (Denmark)
He, Changli; Silvennoinen, Annastiina; Teräsvirta, Timo
In this paper we consider the third-moment structure of a class of time series models. It is often argued that the marginal distribution of financial time series such as returns is skewed. Therefore it is of importance to know what properties a model should possess if it is to accommodate...
Self-organising mixture autoregressive model for non-stationary time series modelling.
Ni, He; Yin, Hujun
2008-12-01
Modelling non-stationary time series has been a difficult task for both parametric and nonparametric methods. One promising solution is to combine the flexibility of nonparametric models with the simplicity of parametric models. In this paper, the self-organising mixture autoregressive (SOMAR) network is adopted as such a mixture model. It breaks time series into underlying segments and at the same time fits local linear regressive models to the clusters of segments. In such a way, a global non-stationary time series is represented by a dynamic set of local linear regressive models. Neural gas is used for a more flexible structure of the mixture model. Furthermore, a new similarity measure has been introduced in the self-organising network to better quantify the similarity of time series segments. The network can be used naturally in modelling and forecasting non-stationary time series. Experiments on artificial, benchmark time series (e.g. Mackey-Glass) and real-world data (e.g. numbers of sunspots and Forex rates) are presented and the results show that the proposed SOMAR network is effective and superior to other similar approaches.
A stochastic HMM-based forecasting model for fuzzy time series.
Li, Sheng-Tun; Cheng, Yi-Chung
2010-10-01
Recently, fuzzy time series have attracted more academic attention than traditional time series due to their capability of dealing with the uncertainty and vagueness inherent in the data collected. The formulation of fuzzy relations is one of the key issues affecting forecasting results. Most of the present works adopt IF-THEN rules for relationship representation, which leads to higher computational overhead and rule redundancy. Sullivan and Woodall proposed a Markov-based formulation and a forecasting model to reduce computational overhead; however, its applicability is limited to handling one-factor problems. In this paper, we propose a novel forecasting model based on the hidden Markov model by enhancing Sullivan and Woodall's work to allow handling of two-factor forecasting problems. Moreover, in order to make the nature of conjecture and randomness of forecasting more realistic, the Monte Carlo method is adopted to estimate the outcome. To test the effectiveness of the resulting stochastic model, we conduct two experiments and compare the results with those from other models. The first experiment consists of forecasting the daily average temperature and cloud density in Taipei, Taiwan, and the second experiment is based on the Taiwan Weighted Stock Index by forecasting the exchange rate of the New Taiwan dollar against the U.S. dollar. In addition to improving forecasting accuracy, the proposed model adheres to the central limit theorem, and thus, the result statistically approximates to the real mean of the target value being forecast.
St.Clair, Travis; Cook, Thomas D.; Hallberg, Kelly
2014-01-01
Although evaluators often use an interrupted time series (ITS) design to test hypotheses about program effects, there are few empirical tests of the design's validity. We take a randomized experiment on an educational topic and compare its effects to those from a comparative ITS (CITS) design that uses the same treatment group as the experiment…
The Prediction of Teacher Turnover Employing Time Series Analysis.
Costa, Crist H.
The purpose of this study was to combine knowledge of teacher demographic data with time-series forecasting methods to predict teacher turnover. Moving averages and exponential smoothing were used to forecast discrete time series. The study used data collected from the 22 largest school districts in Iowa, designated as FACT schools. Predictions…
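The two forecasting methods named in this abstract can be sketched in their simplest textbook forms. The study's actual parameter choices are not given here, so the smoothing constant `alpha`, the window length `k` and the function names are illustrative assumptions.

```python
def moving_average_forecast(series, k):
    """One-step-ahead forecast: mean of the last k observations."""
    recent = series[-k:]
    return sum(recent) / len(recent)


def exponential_smoothing_forecast(series, alpha):
    """One-step-ahead forecast from simple exponential smoothing.

    The level starts at the first observation and is updated as
    level = alpha * x + (1 - alpha) * level for each later value.
    """
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1.0 - alpha) * level
    return level
```

For the series `[10, 12, 14]`, the moving average with `k = 2` gives 13.0 and exponential smoothing with `alpha = 0.5` gives 12.5; a larger `alpha` weights recent observations more heavily.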
Stacked Heterogeneous Neural Networks for Time Series Forecasting
Directory of Open Access Journals (Sweden)
Florin Leon
2010-01-01
A hybrid model for time series forecasting is proposed. It is a stacked neural network, containing one normal multilayer perceptron with bipolar sigmoid activation functions, and the other with an exponential activation function in the output layer. As shown by the case studies, the proposed stacked hybrid neural model performs well on a variety of benchmark time series. The combination of weights of the two stack components that leads to optimal performance is also studied.
Chaotic time series prediction: From one to another
International Nuclear Information System (INIS)
Zhao Pengfei; Xing Lei; Yu Jun
2009-01-01
In this Letter, a new local linear prediction model is proposed to predict a chaotic time series of a component x(t) by using the chaotic time series of another component y(t) in the same system as x(t). Our approach is based on the phase space reconstruction coming from the Takens embedding theorem. To illustrate our results, we present an example using the Lorenz system and compare it with the performance of the original local linear prediction model.
Directory of Open Access Journals (Sweden)
G. P. Pavlos
1999-01-01
A long AE index time series is used as a crucial magnetospheric quantity in order to study the underlying dynamics. For this purpose we utilize methods of nonlinear and chaotic analysis of time series. Two basic components of this analysis are the reconstruction of the experimental time series state space trajectory of the underlying process and the statistical testing of a null hypothesis. The null hypothesis against which the experimental time series are tested is that the observed AE index signal is generated by a linear stochastic signal possibly perturbed by a static nonlinear distortion. As discriminating statistics we use geometrical characteristics of the reconstructed state space (Part I, which is the work of this paper) and dynamical characteristics (Part II, which is the work of a separate paper), and "nonlinear" surrogate data, generated by two different techniques which can mimic the original (AE index) signal. The null hypothesis is tested for geometrical characteristics which are the dimension of the reconstructed trajectory and some new geometrical parameters introduced in this work for the efficient discrimination between the nonlinear stochastic surrogate data and the AE index. Finally, the estimated geometric characteristics of the magnetospheric AE index present new evidence about the nonlinear and low dimensional character of the underlying magnetospheric dynamics for the AE index.
Grammar-based feature generation for time-series prediction
De Silva, Anthony Mihirana
2015-01-01
This book proposes a novel approach for time-series prediction using machine learning techniques with automatic feature generation. Application of machine learning techniques to predict time-series continues to attract considerable attention due to the difficulty of the prediction problems compounded by the non-linear and non-stationary nature of the real world time-series. The performance of machine learning techniques, among other things, depends on suitable engineering of features. This book proposes a systematic way for generating suitable features using context-free grammar. A number of feature selection criteria are investigated and a hybrid feature generation and selection algorithm using grammatical evolution is proposed. The book contains graphical illustrations to explain the feature generation process. The proposed approaches are demonstrated by predicting the closing price of major stock market indices, peak electricity load and net hourly foreign exchange client trade volume. The proposed method ...
Forecasting autoregressive time series under changing persistence
DEFF Research Database (Denmark)
Kruse, Robinson
Changing persistence in time series models means that a structural change from nonstationarity to stationarity or vice versa occurs over time. Such a change has important implications for forecasting, as negligence may lead to inaccurate model predictions. This paper derives generally applicable...
Recurrent Neural Networks for Multivariate Time Series with Missing Values.
Che, Zhengping; Purushotham, Sanjay; Cho, Kyunghyun; Sontag, David; Liu, Yan
2018-04-17
Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a., informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments of time series classification tasks on real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and utilization of missing values in time series analysis.
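The two representations of missingness named in this abstract, masking and time interval, can be computed directly from a series with gaps. The sketch below assumes the interval is the time elapsed since the variable was last observed (zero at the first step); the function name and the use of NaN to mark missing values are choices made here for illustration, and the recurrent model itself is not shown.

```python
import math


def mask_and_interval(timestamps, values):
    """Return (mask, delta) for one variable.

    mask[t]  is 1 if values[t] is observed and 0 if it is missing (NaN).
    delta[t] is the time since the last observation before step t.
    """
    if len(timestamps) != len(values):
        raise ValueError("timestamps and values must have the same length")
    mask = [0 if math.isnan(v) else 1 for v in values]
    delta = [0.0] * len(values)
    for t in range(1, len(values)):
        gap = timestamps[t] - timestamps[t - 1]
        # If the previous step was observed the clock restarts,
        # otherwise the gap accumulates on top of the previous interval.
        delta[t] = gap if mask[t - 1] == 1 else gap + delta[t - 1]
    return mask, delta


nan = float("nan")
mask, delta = mask_and_interval([0, 1, 3, 4, 7], [1.0, nan, nan, 2.0, 3.0])
```

Both arrays have the same length as the input, so they can be fed to a model alongside the (imputed) values at every time step.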
Conditional time series forecasting with convolutional neural networks
A. Borovykh (Anastasia); S.M. Bohte (Sander); C.W. Oosterlee (Cornelis)
2017-01-01
Forecasting financial time series using past observations has been a significant topic of interest. While temporal relationships in the data exist, they are difficult to analyze and predict accurately due to the non-linear trends and noise present in the series. We propose to learn these
Fukaya, Keiichi; Kawamori, Ai; Osada, Yutaka; Kitazawa, Masumi; Ishiguro, Makio
2017-09-20
Women's basal body temperature (BBT) shows a periodic pattern that is associated with the menstrual cycle. Although this fact suggests that daily BBT time series can be useful for estimating the underlying phase state as well as for predicting the length of the current menstrual cycle, little attention has been paid to modeling BBT time series. In this study, we propose a state-space model that involves the menstrual phase as a latent state variable to explain the daily fluctuation of BBT and the menstruation cycle length. Conditional distributions of the phase are obtained by using sequential Bayesian filtering techniques. A predictive distribution of the next menstruation day can be derived based on this conditional distribution and the model, leading to a novel statistical framework that provides a sequentially updated prediction for the upcoming menstruation day. We applied this framework to a real data set of women's BBT and menstruation days and compared the prediction accuracy of the proposed method with that of previous methods, showing that the proposed method generally provides a better prediction. Because BBT can be obtained with relatively small cost and effort, the proposed method can be useful for women's health management. Potential extensions of this framework as the basis of modeling and predicting events that are associated with the menstrual cycles are discussed. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
forecasting with nonlinear time series model: a monte-carlo
African Journals Online (AJOL)
erated recursively up to any step greater than one. For nonlinear time series model, point forecast for step one can be done easily like in the linear case but forecast for a step greater than or equal to ..... London. Franses, P. H. (1998). Time series models for business and Economic forecasting, Cambridge University press.
Time series analysis of temporal networks
Sikdar, Sandipan; Ganguly, Niloy; Mukherjee, Animesh
2016-01-01
A common but important feature of all real-world networks is that they are temporal in nature, i.e., the network structure changes over time. Due to this dynamic nature, it becomes difficult to propose suitable growth models that can explain the various important characteristic properties of these networks. In fact, in many application oriented studies only knowing these properties is sufficient. For instance, if one wishes to launch a targeted attack on a network, this can be done even without the knowledge of the full network structure; rather, an estimate of some of the properties is sufficient to launch the attack. In this paper, we show that even if the network structure at a future time point is not available, one can still manage to estimate its properties. We propose a novel method to map a temporal network to a set of time series instances, analyze them and, using a standard forecast model of time series, try to predict the properties of a temporal network at a later time instance. To this end, we consider eight properties such as number of active nodes, average degree, clustering coefficient etc. and apply our prediction framework on them. We mainly focus on the temporal network of human face-to-face contacts and observe that it represents a stochastic process with memory that can be modeled as Auto-Regressive-Integrated-Moving-Average (ARIMA). We use cross validation techniques to find the percentage accuracy of our predictions. An important observation is that the frequency domain properties of the time series obtained from spectrogram analysis could be used to refine the prediction framework by identifying beforehand the cases where the error in prediction is likely to be high. This leads to an improvement of 7.96% (for error level ≤20%) in prediction accuracy on average across all datasets. As an application we show how such a prediction scheme can be used to launch targeted attacks on temporal networks. Contribution to the Topical Issue
The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure
Euán, Carolina; Ombao, Hernando; Ortega, Joaquín
2018-01-01
We present a new method for time series clustering which we call the Hierarchical Spectral Merger (HSM) method. This procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms
Notes on economic time series analysis system theoretic perspectives
Aoki, Masanao
1983-01-01
In seminars and graduate level courses I have had several opportunities to discuss modeling and analysis of time series with economists and economic graduate students during the past several years. These experiences made me aware of a gap between what economic graduate students are taught about vector-valued time series and what is available in recent system literature. Wishing to fill or narrow the gap that I suspect is more widely spread than my personal experiences indicate, I have written these notes to augment and reorganize materials I have given in these courses and seminars. I have endeavored to present, in as self-contained a way as practicable, a body of results and techniques in system theory that I judge to be relevant and useful to economists interested in using time series in their research. I have essentially acted as an intermediary and interpreter of system theoretic results and perspectives in time series by filtering out non-essential details, and presenting coherent accounts of wha...
Liang, Y.; Gallaher, D. W.; Grant, G.; Lv, Q.
2011-12-01
Change over time is the central driver of climate change detection. The goal is to diagnose the underlying causes and make projections into the future. In an effort to optimize this process we have developed the Data Rod model, an object-oriented approach that provides the ability to query grid cell changes and their relationships to neighboring grid cells through time. The time series data is organized in time-centric structures called "data rods." A single data rod can be pictured as the multi-spectral data history at one grid cell: a vertical column of data through time. This resolves the long-standing problem of managing time-series data and opens new possibilities for temporal data analysis. This structure enables rapid time-centric analysis at any grid cell across multiple sensors and satellite platforms. Collections of data rods can be spatially and temporally filtered, statistically analyzed, and aggregated for use with pattern matching algorithms. Likewise, individual image pixels can be extracted to generate multi-spectral imagery at any spatial and temporal location. The Data Rods project has created a series of prototype databases to store and analyze massive datasets containing multi-modality remote sensing data. Using object-oriented technology, this method overcomes the operational limitations of traditional relational databases. To demonstrate the speed and efficiency of time-centric analysis using the Data Rods model, we have developed a sea ice detection algorithm. This application determines the concentration of sea ice in a small spatial region across a long temporal window. If performed using traditional analytical techniques, this task would typically require extensive data downloads and spatial filtering. Using Data Rods databases, the exact spatio-temporal data set is immediately available. No extraneous data is downloaded, and all selected data querying occurs transparently on the server side. Moreover, fundamental statistical
Dynamical analysis and visualization of tornadoes time series.
Lopes, António M; Tenreiro Machado, J A
2015-01-01
In this paper we analyze the behavior of tornado time series in the U.S. from the perspective of dynamical systems. A tornado is a violently rotating column of air extending from a cumulonimbus cloud down to the ground. Such phenomena reveal features that are well described by power law functions and unveil characteristics found in systems with long range memory effects. Tornado time series are viewed as the output of a complex system and are interpreted as a manifestation of its dynamics. Tornadoes are modeled as sequences of Dirac impulses with amplitude proportional to the event size. First, a collection of time series spanning 64 years is analyzed in the frequency domain by means of the Fourier transform. The amplitude spectra are approximated by power law functions and their parameters are read as an underlying signature of the system dynamics. Second, the concept of circular time is adopted and the collective behavior of tornadoes is analyzed. Clustering techniques are then adopted to identify and visualize the emerging patterns.
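The frequency-domain step described in this abstract (impulse series, Fourier amplitude spectrum, power law approximation) can be sketched as follows. The function names, the discrete sampling of event times and the log-log least-squares fit are assumptions made for illustration; the abstract does not state how the authors fitted the power law.

```python
import numpy as np


def impulse_train(event_indices, event_sizes, n_samples):
    """Series that is zero except at event positions, where it equals the event size."""
    x = np.zeros(n_samples)
    # np.add.at accumulates sizes when several events fall on the same sample.
    np.add.at(x, np.asarray(event_indices, dtype=int),
              np.asarray(event_sizes, dtype=float))
    return x


def fit_power_law_spectrum(signal, dt=1.0):
    """Approximate the amplitude spectrum by |F(f)| ~ a * f**b.

    The fit is a straight line in log-log coordinates; returns (a, b).
    """
    signal = np.asarray(signal, dtype=float)
    amplitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=dt)
    keep = (freqs > 0) & (amplitude > 0)  # drop f = 0 and empty bins before taking logs
    slope, intercept = np.polyfit(np.log10(freqs[keep]),
                                  np.log10(amplitude[keep]), 1)
    return 10.0 ** intercept, slope


# Hypothetical events: sample index and size, for illustration only.
x = impulse_train([1, 1, 3], [2.0, 1.0, 5.0], n_samples=5)
```

The pair `(a, b)` returned for each series is the kind of parameter set that could then be compared across years or regions.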
Seismic assessment of a site using the time series method
International Nuclear Information System (INIS)
Krutzik, N.J.; Rotaru, I.; Bobei, M.; Mingiuc, C.; Serban, V.; Androne, M.
1997-01-01
To increase the safety of an NPP located on a seismic site, the seismic acceleration level to which the NPP should be qualified must be as representative as possible for that site, with a conservative but not exaggerated degree of safety. Treating the seismic events affecting the site as independent events and using statistical methods to define safety levels with very low annual occurrence probability (10^{-4}) may lead to some exaggeration of the seismic safety level. The use of very high values for the seismic acceleration, imposed by the seismic safety levels required by the hazard analysis, may lead to very costly technical solutions that can make plant operation more difficult and increase maintenance costs. Considering seismic events as a time series with dependence among the events may lead to a more representative assessment of the seismic activity at an NPP site and consequently to a prognosis of the seismic level values to which the NPP would be ensured throughout its life-span. That prognosis should consider the actual seismic activity (including small earthquakes in real time) of the focuses that affect the plant site. The paper proposes the application of autoregressive time series to issue a prognosis on the seismic activity of a focus and presents the analysis, by this method, of the Vrancea focus that affects the NPP Cernavoda site. The paper also presents how to analyse the focus activity under the new approach and assesses the maximum seismic acceleration that may affect NPP Cernavoda throughout its life-span (∼ 30 years). Development and application of new mathematical analysis methods, both for long- and short-time intervals, may lead to important contributions to forecasting future seismic events. (authors)
Modelling road accidents: An approach using structural time series
Junus, Noor Wahida Md; Ismail, Mohd Tahir
2014-09-01
In this paper, the trend of road accidents in Malaysia for the years 2001 until 2012 was modelled using a structural time series approach. The structural time series model was identified using a stepwise method, and the residuals for each model were tested. The best-fitted model was chosen based on the smallest Akaike Information Criterion (AIC) and prediction error variance. In order to check the quality of the model, a data validation procedure was performed by predicting the monthly number of road accidents for the year 2012. Results indicate that the best specification of the structural time series model to represent road accidents is the local level with a seasonal model.
Multiscale Poincaré plots for visualizing the structure of heartbeat time series.
Henriques, Teresa S; Mariani, Sara; Burykin, Anton; Rodrigues, Filipa; Silva, Tiago F; Goldberger, Ary L
2016-02-09
Poincaré delay maps are widely used in the analysis of cardiac interbeat interval (RR) dynamics. To facilitate visualization of the structure of these time series, we introduce multiscale Poincaré (MSP) plots. Starting with the original RR time series, the method employs a coarse-graining procedure to create a family of time series, each of which represents the system's dynamics in a different time scale. Next, the Poincaré plots are constructed for the original and the coarse-grained time series. Finally, as an optional adjunct, color can be added to each point to represent its normalized frequency. We illustrate the MSP method on simulated Gaussian white and 1/f noise time series. The MSP plots of 1/f noise time series reveal relative conservation of the phase space area over multiple time scales, while those of white noise show a marked reduction in area. We also show how MSP plots can be used to illustrate the loss of complexity when heartbeat time series from healthy subjects are compared with those from patients with chronic (congestive) heart failure syndrome or with atrial fibrillation. This generalized multiscale approach to Poincaré plots may be useful in visualizing other types of time series.
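The two building blocks of the method described in this abstract, coarse-graining and the Poincaré delay map, can be sketched as below. The coarse-graining shown is the common non-overlapping block average; the abstract does not spell out the exact procedure, so this choice, like the function names and the toy RR values, is an assumption.

```python
import numpy as np


def coarse_grain(rr, scale):
    """Average consecutive, non-overlapping blocks of `scale` samples."""
    rr = np.asarray(rr, dtype=float)
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    n_blocks = len(rr) // scale
    if n_blocks < 1:
        raise ValueError("series shorter than one block")
    # Trailing samples that do not fill a whole block are dropped.
    return rr[: n_blocks * scale].reshape(n_blocks, scale).mean(axis=1)


def poincare_points(series, lag=1):
    """Pairs (x[n], x[n + lag]) that form the points of a Poincaré plot."""
    series = np.asarray(series, dtype=float)
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must satisfy 1 <= lag < len(series)")
    return np.column_stack([series[:-lag], series[lag:]])


# Hypothetical interbeat intervals in seconds, for illustration only.
rr = [0.80, 0.82, 0.79, 0.85, 0.81, 0.78]
family = {scale: poincare_points(coarse_grain(rr, scale)) for scale in (1, 2)}
```

Each entry of `family` holds the scatter points for one time scale; plotting them side by side gives the multiscale view.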
Time series patterns and language support in DBMS
Telnarova, Zdenka
2017-07-01
This contribution focuses on the Time Series pattern type as a semantically rich representation of data. Some examples of implementing this pattern type in traditional Data Base Management Systems are briefly presented. There are many approaches to manipulating and querying patterns. The crucial issue is a systematic approach to pattern management and a specific pattern query language that takes the semantics of patterns into consideration. The query language SQL-TS for manipulating patterns is demonstrated on Time Series data.
Two-fractal overlap time series: Earthquakes and market crashes
Indian Academy of Sciences (India)
velocity over the other and time series of stock prices. An anticipation method for some of the crashes has been proposed here, based on these observations. Keywords. Cantor set; time series; earthquake; market crash. PACS Nos 05.00; 02.50.-r; 64.60; 89.65.Gh; 95.75.Wx. 1. Introduction. Capturing dynamical patterns of ...
Analysis of cyclical behavior in time series of stock market returns
Stratimirović, Djordje; Sarvan, Darko; Miljković, Vladimir; Blesić, Suzana
2018-01-01
In this paper we have analyzed scaling properties and cyclical behavior of three types of stock market index (SMI) time series: data belonging to stock markets of developed economies, emerging economies, and underdeveloped or transitional economies. We have used two techniques of data analysis to obtain and verify our findings: the wavelet transform (WT) spectral analysis to identify cycles in the SMI returns data, and the time-dependent detrended moving average (tdDMA) analysis to investigate local behavior around market cycles and trends. We found cyclical behavior in all SMI data sets that we have analyzed. Moreover, the positions and the boundaries of cyclical intervals that we found seem to be common for all markets in our dataset. We list and illustrate the presence of nine such periods in our SMI data. We report on the possibilities to differentiate between the level of growth of the analyzed markets by way of statistical analysis of the properties of wavelet spectra that characterize particular peak behaviors. Our results show that measures like the relative WT energy content and the relative WT amplitude of the peaks in the small scales region could be used to partially differentiate between market economies. Finally, we propose a way to quantify the level of development of a stock market based on estimation of local complexity of a market's SMI series. From the local scaling exponents calculated for our nine peak regions we have defined what we named the Development Index, which proved, at least in the case of our dataset, to be suitable to rank the SMI series that we have analyzed in three distinct groups.
Time-resolved statistics of photon pairs in two-cavity Josephson photonics
Energy Technology Data Exchange (ETDEWEB)
Dambach, Simon; Kubala, Bjoern; Ankerhold, Joachim [Institute for Complex Quantum Systems and IQST, Ulm University (Germany)
2017-06-15
We analyze the creation and emission of pairs of highly nonclassical microwave photons in a setup where a voltage-biased Josephson junction is connected in series to two electromagnetic oscillators. Tuning the external voltage such that the Josephson frequency equals the sum of the two mode frequencies, each tunneling Cooper pair creates one additional photon in both of the two oscillators. The time-resolved statistics of photon emission events from the two oscillators is investigated by means of single- and cross-oscillator variants of the second-order correlation function g^{(2)}(τ) and the waiting-time distribution w(τ). They provide insight into the strongly correlated quantum dynamics of the two oscillator subsystems and reveal a rich variety of quantum features of light including strong antibunching and the presence of negative values in the Wigner function. (copyright 2016 WILEY-VCH Verlag GmbH and Co. KGaA, Weinheim)
Nonlinear time series analysis with R
Huffaker, Ray; Rosa, Rodolfo
2017-01-01
In the process of data analysis, the investigator often faces highly volatile and random-appearing observed data. A vast body of literature shows that the assumption of underlying stochastic processes does not necessarily represent the nature of the processes under investigation and that, when other tools are used, deterministic features emerge. Nonlinear Time Series Analysis (NLTS) allows researchers to test whether observed volatility conceals systematic nonlinear behavior, and to rigorously characterize governing dynamics. Behavioral patterns detected by nonlinear time series analysis, along with scientific principles and other expert information, guide the specification of mechanistic models that serve to explain real-world behavior rather than merely reproducing it. Often there is a misconception regarding the complexity of the level of mathematics needed to understand and utilize the tools of NLTS (for instance Chaos theory). However, mathematics used in NLTS is much simpler than many other subjec...
InSAR Deformation Time Series Processed On-Demand in the Cloud
Horn, W. B.; Weeden, R.; Dimarchi, H.; Arko, S. A.; Hogenson, K.
2017-12-01
During this past year, ASF has developed a cloud-based on-demand processing system known as HyP3 (http://hyp3.asf.alaska.edu/), the Hybrid Pluggable Processing Pipeline, for Synthetic Aperture Radar (SAR) data. The system makes it easy for a user who doesn't have the time or inclination to install and use complex SAR processing software to leverage SAR data in their research or operations. One such processing algorithm is generation of a deformation time series product, which is a series of images representing ground displacements over time, which can be computed using a time series of interferometric SAR (InSAR) products. The set of software tools necessary to generate this useful product is difficult to install, configure, and use. Moreover, for a long time series with many images, the processing of just the interferograms can take days. Principally built by three undergraduate students at the ASF DAAC, the deformation time series processing relies on the new AWS Batch service, which enables processing of jobs with complex interconnected dependencies in a straightforward and efficient manner. In the case of generating a deformation time series product from a stack of single-look complex SAR images, the system uses Batch to serialize the up-front processing, interferogram generation, optional tropospheric correction, and deformation time series generation. The most time consuming portion is the interferogram generation, because even for a fairly small stack of images many interferograms need to be processed. By using AWS Batch, the interferograms are all generated in parallel; the entire process completes in hours rather than days. Additionally, the individual interferograms are saved in Amazon's cloud storage, so that when new data is acquired in the stack, an updated time series product can be generated with minimal additional processing. This presentation will focus on the development techniques and enabling technologies that were used in developing the time
Hyvärinen, A
1985-01-01
The main purpose of the present study was to describe the statistical behaviour of daily analytical errors in the dimensions of place and time, providing a statistical basis for realistic estimates of the analytical error, and hence allowing the importance of the error and the relative contributions of its different sources to be re-evaluated. The observation material consists of creatinine and glucose results for control sera measured in daily routine quality control in five laboratories for a period of one year. The observation data were processed and computed by means of an automated data processing system. Graphic representations of time series of daily observations, as well as their means and dispersion limits when grouped over various time intervals, were investigated. For partition of the total variation several two-way analyses of variance were done with laboratory and various time classifications as factors. Pooled sets of observations were tested for normality of distribution and for consistency of variances, and the distribution characteristics of error variation in different categories of place and time were compared. Errors were found from the time series to vary typically between days. Due to irregular fluctuations in general and particular seasonal effects in creatinine, stable estimates of means or of dispersions for errors in individual laboratories could not be easily obtained over short periods of time but only from data sets pooled over long intervals (preferably at least one year). Pooled estimates of proportions of intralaboratory variation were relatively low (less than 33%) when the variation was pooled within days. However, when the variation was pooled over longer intervals this proportion increased considerably, even to a maximum of 89-98% (95-98% in each method category) when an outlying laboratory in glucose was omitted, with a concomitant decrease in the interaction component (representing laboratory-dependent variation with time
Vector bilinear autoregressive time series model and its superiority ...
African Journals Online (AJOL)
In this research, a vector bilinear autoregressive time series model was proposed and used to model three revenue series (X1, X2, X3). The “orders” of the three series were identified on the basis of the distribution of autocorrelation and partial autocorrelation functions and were used to construct the vector bilinear models.
R-evolution in Time Series Analysis Software Applied on R-omanian Capital Market
Directory of Open Access Journals (Sweden)
Ciprian ALEXANDRU
2014-06-01
Worldwide and during the last decade, R has developed in a balanced way and nowadays it represents the most powerful tool for computational statistics, data science and visualization. Millions of data scientists use R to face their most challenging problems in topics ranging from economics to engineering and genetics. In this study, R was used to compute data on stock market prices in order to build trading models and to estimate the evolution of the quantitative financial market. These models were already applied on the international capital markets. In Romania, the quantitative modeling of capital market is available only for clients of trading brokers because the time series data are collected for the commercial purpose; in that circumstance, the statistical computing tools meet the inertia to change. This paper aims to expose a small part of the capability of R to use mix-and-match models and cutting-edge methods in statistics and quantitative modeling in order to build an alternative way to analyze capital market in Romania over the commercial threshold.
25 years of time series forecasting
de Gooijer, J.G.; Hyndman, R.J.
2006-01-01
We review the past 25 years of research into time series forecasting. In this silver jubilee issue, we naturally highlight results published in journals managed by the International Institute of Forecasters (Journal of Forecasting 1982-1985 and International Journal of Forecasting 1985-2005). During
Markov Trends in Macroeconomic Time Series
R. Paap (Richard)
1997-01-01
Many macroeconomic time series are characterised by long periods of positive growth, expansion periods, and short periods of negative growth, recessions. A popular model to describe this phenomenon is the Markov trend, which is a stochastic segmented trend where the slope depends on the
Modeling seasonality in bimonthly time series
Ph.H.B.F. Franses (Philip Hans)
1992-01-01
A recurring issue in modeling seasonal time series variables is the choice of the most adequate model for the seasonal movements. One selection method for quarterly data is proposed in Hylleberg et al. (1990). Market response models are often constructed for bimonthly variables, and
Effects of dating errors on nonparametric trend analyses of speleothem time series
Directory of Open Access Journals (Sweden)
M. Mudelsee
2012-10-01
A fundamental problem in paleoclimatology is to take fully into account the various error sources when examining proxy records with quantitative methods of statistical time series analysis. Records from dated climate archives such as speleothems add extra uncertainty from the age determination to the other sources that consist in measurement and proxy errors. This paper examines three stalagmite time series of oxygen isotopic composition (δ^{18}O) from two caves in western Germany, the series AH-1 from the Atta Cave and the series Bu1 and Bu4 from the Bunker Cave. These records carry regional information about past changes in winter precipitation and temperature. U/Th and radiocarbon dating reveals that they cover the later part of the Holocene, the past 8.6 thousand years (ka). We analyse centennial- to millennial-scale climate trends by means of nonparametric Gasser–Müller kernel regression. Error bands around fitted trend curves are determined by combining (1) block bootstrap resampling to preserve noise properties (shape, autocorrelation) of the δ^{18}O residuals and (2) timescale simulations (models StalAge and iscam). The timescale error influences on centennial- to millennial-scale trend estimation are not excessively large. We find a "mid-Holocene climate double-swing", from warm to cold to warm winter conditions (6.5 ka to 6.0 ka to 5.1 ka), with warm–cold amplitudes of around 0.5‰ δ^{18}O; this finding is documented by all three records with high confidence. We also quantify the Medieval Warm Period (MWP), the Little Ice Age (LIA) and the current warmth. Our analyses cannot unequivocally support the conclusion that current regional winter climate is warmer than that during the MWP.
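Block bootstrap resampling, which this abstract uses to preserve the autocorrelation of the residuals, can be sketched as follows. The variant shown is the moving block bootstrap with a fixed block length; the abstract does not say which variant or block length the authors used, so both, along with the function name, are assumptions for illustration.

```python
import numpy as np


def moving_block_bootstrap(residuals, block_length, rng):
    """Resample residuals in contiguous blocks to keep short-range dependence.

    Blocks of `block_length` consecutive values are drawn with replacement
    and concatenated until the original series length is reached.
    """
    residuals = np.asarray(residuals, dtype=float)
    n = len(residuals)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(residuals)")
    n_blocks = -(-n // block_length)  # ceiling division
    # Valid block start positions are 0 .. n - block_length inclusive.
    starts = rng.integers(0, n - block_length + 1, size=n_blocks)
    blocks = [residuals[s: s + block_length] for s in starts]
    return np.concatenate(blocks)[:n]


rng = np.random.default_rng(0)
toy_residuals = np.arange(10.0)  # placeholder values, not data from the paper
resampled = moving_block_bootstrap(toy_residuals, block_length=3, rng=rng)
```

Adding each resampled residual series back to the fitted trend and refitting gives one bootstrap replicate of the trend curve; repeating this many times yields the error band.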
Phenomapping of rangelands in South Africa using time series of RapidEye data
Parplies, André; Dubovyk, Olena; Tewes, Andreas; Mund, Jan-Peter; Schellberg, Jürgen
2016-12-01
Phenomapping is an approach which allows the derivation of spatial patterns of vegetation phenology and rangeland productivity based on time series of vegetation indices. In our study, we propose a new spatial mapping approach which combines phenometrics derived from high resolution (HR) satellite time series with spatial logistic regression modeling to discriminate land management systems in rangelands. From the RapidEye time series for selected rangelands in South Africa, we calculated bi-weekly noise reduced Normalized Difference Vegetation Index (NDVI) images. For the growing season of 2011/2012, we further derived principal phenology metrics such as start, end and length of growing season and related phenological variables such as amplitude, left derivative and small integral of the NDVI curve. We then mapped these phenometrics across two different tenure systems, communal and commercial, at the very detailed spatial resolution of 5 m. The result of a binary logistic regression (BLR) has shown that the amplitude and the left derivative of the NDVI curve were statistically significant. These indicators are useful to discriminate commercial from communal rangeland systems. We conclude that phenomapping combined with spatial modeling is a powerful tool that allows efficient aggregation of phenology and productivity metrics for spatially explicit analysis of the relationships of crop phenology with site conditions and management. This approach has particular potential for disaggregated and patchy environments such as in farming systems in semi-arid South Africa, where phenology varies considerably among and within years. Further, we see a strong perspective for phenomapping to support spatially explicit modelling of vegetation.
FALSE DETERMINATIONS OF CHAOS IN SHORT NOISY TIME SERIES. (R828745)
A method (NEMG) proposed in 1992 for diagnosing chaos in noisy time series with 50 or fewer observations entails fitting the time series with an empirical function which predicts an observation in the series from previous observations, and then estimating the rate of divergenc...
Multiscale multifractal multiproperty analysis of financial time series based on Rényi entropy
Yujun, Yang; Jianping, Li; Yimei, Yang
This paper introduces a multiscale multifractal multiproperty analysis based on Rényi entropy (3MPAR) method to analyze short-range and long-range characteristics of financial time series, and then applies this method to the five time series of five properties in four stock indices. Combining the two analysis techniques of Rényi entropy and multifractal detrended fluctuation analysis (MFDFA), the 3MPAR method focuses on the curves of Rényi entropy and generalized Hurst exponent of five properties of four stock time series, which allows us to study more universal and subtle fluctuation characteristics of financial time series. By analyzing the curves of the Rényi entropy and the profiles of the logarithm distribution of MFDFA of five properties of four stock indices, the 3MPAR method shows some fluctuation characteristics of the financial time series and the stock markets. It also reveals richer information about the financial time series by comparing the profiles of five properties of four stock indices. In this paper, we focus not only on the multifractality of time series but also on the fluctuation characteristics of the financial time series and subtle differences in the time series of different properties. We find that financial time series are far more complex than reported in some research works using one property of time series.
A Literature Survey of Early Time Series Classification and Deep Learning
Santos, Tiago; Kern, Roman
2017-01-01
This paper provides an overview of current literature on time series classification approaches, in particular of early time series classification. A very common and effective time series classification approach is the 1-Nearest Neighbor classifier, with different distance measures such as the Euclidean or dynamic time warping distances. This paper starts by reviewing these baseline methods. More recently, with the gain in popularity in the application of deep neural networks to the field of...
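The baseline named in this abstract, a 1-Nearest Neighbor classifier with Euclidean or dynamic time warping (DTW) distance, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the absolute-difference local cost, and the unconstrained warping path are choices made here, not details taken from the survey.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched with an earlier b
                cost[i][j - 1],      # b[j-1] also matched with an earlier a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def euclidean_distance(a, b):
    """Pointwise Euclidean distance; the sequences must have equal length."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length sequences")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_1nn(query, train, distance=dtw_distance):
    """Return the label of the training series closest to the query.

    train is a list of (series, label) pairs.
    """
    _, label = min(train, key=lambda item: distance(query, item[0]))
    return label
```

DTW tolerates shifts along the time axis, so a time-shifted copy of a training pattern can still be at distance zero, whereas the Euclidean distance compares values at identical time indices only.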
Signal Processing for Time-Series Functions on a Graph
2018-02-01
ARL-TR-8276, FEB 2018, US Army Research Laboratory. Signal Processing for Time-Series Functions on a Graph, by Humberto Muñoz-Barona, Jean Vettel, and ... Approved for public release; distribution is unlimited. Figures: Fig. 1, Time-series function on a fixed graph. ... Instead, we simply recover the average of f over time. ...
Learning of time series through neuron-to-neuron instruction
Energy Technology Data Exchange (ETDEWEB)
Miyazaki, Y [Department of Physics, Kyoto University, Kyoto 606-8502, (Japan); Kinzel, W [Institut fuer Theoretische Physik, Universitaet Wurzburg, 97074 Wurzburg (Germany); Shinomoto, S [Department of Physics, Kyoto University, Kyoto (Japan)
2003-02-07
A model neuron with delayline feedback connections can learn a time series generated by another model neuron. It has been known that some student neurons that have completed such learning under the instruction of a teacher's quasi-periodic sequence mimic the teacher's time series over a long interval, even after instruction has ceased. We found that in addition to such faithful students, there are unfaithful students whose time series eventually diverge exponentially from that of the teacher. In order to understand the circumstances that allow for such a variety of students, the orbit dimension was estimated numerically. The quasi-periodic orbits in question were found to be confined in spaces with dimensions significantly smaller than that of the full phase space.
Learning of time series through neuron-to-neuron instruction
International Nuclear Information System (INIS)
Miyazaki, Y; Kinzel, W; Shinomoto, S
2003-01-01
A model neuron with delayline feedback connections can learn a time series generated by another model neuron. It has been known that some student neurons that have completed such learning under the instruction of a teacher's quasi-periodic sequence mimic the teacher's time series over a long interval, even after instruction has ceased. We found that in addition to such faithful students, there are unfaithful students whose time series eventually diverge exponentially from that of the teacher. In order to understand the circumstances that allow for such a variety of students, the orbit dimension was estimated numerically. The quasi-periodic orbits in question were found to be confined in spaces with dimensions significantly smaller than that of the full phase space.
Directory of Open Access Journals (Sweden)
V. Tramutoli
1996-06-01
An autoregressive model was selected to describe geoelectrical time series. An objective technique was subsequently applied to analyze and discriminate values above (below) an a priori fixed threshold possibly related to seismic events. A complete check of the model and the main guidelines to estimate the occurrence probability of extreme events are reported. A first application of the proposed technique is discussed through the analysis of the experimental data recorded by an automatic station located in Tito, a small town on the Apennine chain in Southern Italy. This region was hit by the November 1980 Irpinia-Basilicata earthquake and it is one of the most active areas of the Mediterranean region. After a preliminary filtering procedure to reduce the influence of external parameters (i.e. the meteo-climatic effects), it was demonstrated that the geoelectrical residual time series are well described by means of a second order autoregressive model. Our findings outline a statistical methodology to evaluate the efficiency of electrical seismic precursors.
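As a rough illustration of the idea in this abstract, the sketch below fits a second order autoregressive model and flags observations whose one-step prediction error exceeds a fixed threshold. Everything specific here is an assumption: the abstract does not say how the model was estimated, so the Yule-Walker equations are used as one standard option, the series is assumed to be zero-mean after filtering, and the function names and the absolute-value threshold rule are hypothetical.

```python
import random


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return cov / var


def fit_ar2_yule_walker(x):
    """Estimate (phi1, phi2) of x_t = phi1*x_{t-1} + phi2*x_{t-2} + e_t."""
    r1 = autocorrelation(x, 1)
    r2 = autocorrelation(x, 2)
    phi1 = r1 * (1 - r2) / (1 - r1 ** 2)
    phi2 = (r2 - r1 ** 2) / (1 - r1 ** 2)
    return phi1, phi2


def flag_exceedances(x, phi1, phi2, threshold):
    """Indices t where the one-step AR(2) residual exceeds threshold in magnitude.

    Assumes x is already a zero-mean (filtered) series.
    """
    flagged = []
    for t in range(2, len(x)):
        residual = x[t] - phi1 * x[t - 1] - phi2 * x[t - 2]
        if abs(residual) > threshold:
            flagged.append(t)
    return flagged


def simulate_ar2(phi1, phi2, n, seed=0):
    """Generate a synthetic AR(2) series with unit-variance Gaussian noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(phi1 * x[-1] + phi2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]
```

A typical use would fit the coefficients on a quiet reference period and then apply `flag_exceedances` to new data, with the threshold chosen from the residual distribution.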
Quirky patterns in time-series of estimates of recruitment could be artefacts
DEFF Research Database (Denmark)
Dickey-Collas, M.; Hinzen, N.T.; Nash, R.D.M.
2015-01-01
The accessibility of databases of global or regional stock assessment outputs is leading to an increase in meta-analysis of the dynamics of fish stocks. In most of these analyses, each of the time-series is generally assumed to be directly comparable. However, the approach to stock assessment employed, and the associated modelling assumptions, can have an important influence on the characteristics of each time-series. We explore this idea by investigating recruitment time-series with three different recruitment parameterizations: a stock–recruitment model, a random-walk time-series model... of recruitment time-series in databases is therefore not consistent across or within species and stocks. Caution is therefore required as perhaps the characteristics of the time-series of stock dynamics may be determined by the model used to generate them, rather than underlying ecological phenomena...
The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure
Euán, Carolina
2018-04-12
We present a new method for time series clustering which we call the Hierarchical Spectral Merger (HSM) method. This procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms. The extent of similarity between a pair of time series is measured using the total variation distance between their estimated spectral densities. At each step of the algorithm, every time two clusters merge, a new spectral density is estimated using the whole information present in both clusters, which is representative of all the series in the new cluster. The method is implemented in an R package HSMClust. We present two applications of the HSM method, one to data coming from wave-height measurements in oceanography and the other to electroencephalogram (EEG) data.
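The similarity measure described in this abstract, the total variation distance between estimated spectral densities, can be sketched as below. This is not the HSMClust implementation: the raw periodogram is used as the spectral estimate purely for brevity (an assumption; smoothed estimators are more usual), both series are assumed to have the same length, and the function names are hypothetical. The merging step of the algorithm is not shown.

```python
import cmath
import math


def periodogram(x):
    """Raw periodogram of x at the Fourier frequencies k/n, k = 1..n//2."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    power = []
    for k in range(1, n // 2 + 1):
        coef = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                   for t, v in enumerate(centered))
        power.append(abs(coef) ** 2 / n)
    return power


def normalized_spectrum(x):
    """Periodogram rescaled to sum to one, so it behaves like a probability mass."""
    power = periodogram(x)
    total = sum(power)
    if total == 0:
        raise ValueError("series has no variance")
    return [p / total for p in power]


def tv_distance(x, y):
    """Total variation distance between the normalized spectra of x and y."""
    f, g = normalized_spectrum(x), normalized_spectrum(y)
    if len(f) != len(g):
        raise ValueError("series must have the same length")
    return 0.5 * sum(abs(a - b) for a, b in zip(f, g))
```

The distance lies between 0 and 1: series that oscillate at the same frequencies are close to 0 regardless of amplitude or phase, while series with non-overlapping spectra are close to 1.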
Mullan, Donal; Chen, Jie; Zhang, Xunchang John
2016-02-01
Statistical downscaling (SD) methods have become a popular, low-cost and accessible means of bridging the gap between the coarse spatial resolution at which climate models output climate scenarios and the finer spatial scale at which impact modellers require these scenarios, with various SD techniques used for a wide range of applications across the world. This paper compares the Generator for Point Climate Change (GPCC) model and the Statistical DownScaling Model (SDSM), two contrasting SD methods, in terms of their ability to generate precipitation series under non-stationary conditions across ten contrasting global climates. The mean, maximum and a selection of distribution statistics as well as the cumulative frequencies of dry and wet spells for four different temporal resolutions were compared between the models and the observed series for a validation period. Results indicate that both methods can generate daily precipitation series that generally closely mirror observed series for a wide range of non-stationary climates. However, GPCC tends to overestimate higher precipitation amounts, whilst SDSM tends to underestimate these. This implies that GPCC is more likely to overestimate the effects of precipitation on a given impact sector, whilst SDSM is likely to underestimate the effects. GPCC performs better than SDSM in reproducing wet and dry day frequency, which is a key advantage for many impact sectors. Overall, the mixed performance of the two methods illustrates the importance of users performing a thorough validation in order to determine the influence of simulated precipitation on their chosen impact sector.
A Multivariate Time Series Method for Monte Carlo Reactor Analysis
International Nuclear Information System (INIS)
Taro Ueki
2008-01-01
A robust multivariate time series method has been established for the Monte Carlo calculation of neutron multiplication problems. The method is termed Coarse Mesh Projection Method (CMPM) and can be implemented using the coarse statistical bins for acquisition of nuclear fission source data. A novel aspect of CMPM is the combination of the general technical principle of projection pursuit in the signal processing discipline and the neutron multiplication eigenvalue problem in the nuclear engineering discipline. CMPM enables reactor physicists to accurately evaluate major eigenvalue separations of nuclear reactors with continuous energy Monte Carlo calculation. CMPM was incorporated in the MCNP Monte Carlo particle transport code of Los Alamos National Laboratory. The great advantage of CMPM over the traditional Fission Matrix method is demonstrated for the three space-dimensional modeling of the initial core of a pressurized water reactor.