time series preprocessing: Topics by WorldWideScience.org

Sample records for time series preprocessing

Effective Feature Preprocessing for Time Series Forecasting

DEFF Research Database (Denmark)

Zhao, Junhua; Dong, Zhaoyang; Xu, Zhao

2006-01-01

Time series forecasting is an important area in data mining research. Feature preprocessing techniques have significant influence on forecasting accuracy, therefore are essential in a forecasting model. Although several feature preprocessing techniques have been applied in time series forecasting...... performance in time series forecasting. It is demonstrated in our experiment that, effective feature preprocessing can significantly enhance forecasting accuracy. This research can be a useful guidance for researchers on effectively selecting feature preprocessing techniques and integrating them with time...... series forecasting models....
Change detection using landsat time series: A review of frequencies, preprocessing, algorithms, and applications

Science.gov (United States)

Zhu, Zhe

2017-08-01

The free and open access to all archived Landsat images in 2008 has completely changed the way of using Landsat data. Many novel change detection algorithms based on Landsat time series have been developed We present a comprehensive review of four important aspects of change detection studies based on Landsat time series, including frequencies, preprocessing, algorithms, and applications. We observed the trend that the more recent the study, the higher the frequency of Landsat time series used. We reviewed a series of image preprocessing steps, including atmospheric correction, cloud and cloud shadow detection, and composite/fusion/metrics techniques. We divided all change detection algorithms into six categories, including thresholding, differencing, segmentation, trajectory classification, statistical boundary, and regression. Within each category, six major characteristics of different algorithms, such as frequency, change index, univariate/multivariate, online/offline, abrupt/gradual change, and sub-pixel/pixel/spatial were analyzed. Moreover, some of the widely-used change detection algorithms were also discussed. Finally, we reviewed different change detection applications by dividing these applications into two categories, change target and change agent detection.
THE EFFECT OF DECOMPOSITION METHOD AS DATA PREPROCESSING ON NEURAL NETWORKS MODEL FOR FORECASTING TREND AND SEASONAL TIME SERIES

Directory of Open Access Journals (Sweden)

Subanar Subanar

2006-01-01

Full Text Available Recently, one of the central topics for the neural networks (NN community is the issue of data preprocessing on the use of NN. In this paper, we will investigate this topic particularly on the effect of Decomposition method as data processing and the use of NN for modeling effectively time series with both trend and seasonal patterns. Limited empirical studies on seasonal time series forecasting with neural networks show that some find neural networks are able to model seasonality directly and prior deseasonalization is not necessary, and others conclude just the opposite. In this research, we study particularly on the effectiveness of data preprocessing, including detrending and deseasonalization by applying Decomposition method on NN modeling and forecasting performance. We use two kinds of data, simulation and real data. Simulation data are examined on multiplicative of trend and seasonality patterns. The results are compared to those obtained from the classical time series model. Our result shows that a combination of detrending and deseasonalization by applying Decomposition method is the effective data preprocessing on the use of NN for forecasting trend and seasonal time series.
Characterizing the continuously acquired cardiovascular time series during hemodialysis, using median hybrid filter preprocessing noise reduction.

Science.gov (United States)

Wilson, Scott; Bowyer, Andrea; Harrap, Stephen B

2015-01-01

The clinical characterization of cardiovascular dynamics during hemodialysis (HD) has important pathophysiological implications in terms of diagnostic, cardiovascular risk assessment, and treatment efficacy perspectives. Currently the diagnosis of significant intradialytic systolic blood pressure (SBP) changes among HD patients is imprecise and opportunistic, reliant upon the presence of hypotensive symptoms in conjunction with coincident but isolated noninvasive brachial cuff blood pressure (NIBP) readings. Considering hemodynamic variables as a time series makes a continuous recording approach more desirable than intermittent measures; however, in the clinical environment, the data signal is susceptible to corruption due to both impulsive and Gaussian-type noise. Signal preprocessing is an attractive solution to this problem. Prospectively collected continuous noninvasive SBP data over the short-break intradialytic period in ten patients was preprocessed using a novel median hybrid filter (MHF) algorithm and compared with 50 time-coincident pairs of intradialytic NIBP measures from routine HD practice. The median hybrid preprocessing technique for continuously acquired cardiovascular data yielded a dynamic regression without significant noise and artifact, suitable for high-level profiling of time-dependent SBP behavior. Signal accuracy is highly comparable with standard NIBP measurement, with the added clinical benefit of dynamic real-time hemodynamic information.
Characterizing the continuously acquired cardiovascular time series during hemodialysis, using median hybrid filter preprocessing noise reduction

Directory of Open Access Journals (Sweden)

Wilson S

2015-01-01

Full Text Available Scott Wilson,1,2 Andrea Bowyer,3 Stephen B Harrap4 1Department of Renal Medicine, The Alfred Hospital, 2Baker IDI, Melbourne, 3Department of Anaesthesia, Royal Melbourne Hospital, 4University of Melbourne, Parkville, VIC, Australia Abstract: The clinical characterization of cardiovascular dynamics during hemodialysis (HD has important pathophysiological implications in terms of diagnostic, cardiovascular risk assessment, and treatment efficacy perspectives. Currently the diagnosis of significant intradialytic systolic blood pressure (SBP changes among HD patients is imprecise and opportunistic, reliant upon the presence of hypotensive symptoms in conjunction with coincident but isolated noninvasive brachial cuff blood pressure (NIBP readings. Considering hemodynamic variables as a time series makes a continuous recording approach more desirable than intermittent measures; however, in the clinical environment, the data signal is susceptible to corruption due to both impulsive and Gaussian-type noise. Signal preprocessing is an attractive solution to this problem. Prospectively collected continuous noninvasive SBP data over the short-break intradialytic period in ten patients was preprocessed using a novel median hybrid filter (MHF algorithm and compared with 50 time-coincident pairs of intradialytic NIBP measures from routine HD practice. The median hybrid preprocessing technique for continuously acquired cardiovascular data yielded a dynamic regression without significant noise and artifact, suitable for high-level profiling of time-dependent SBP behavior. Signal accuracy is highly comparable with standard NIBP measurement, with the added clinical benefit of dynamic real-time hemodynamic information. Keywords: continuous monitoring, blood pressure
Neural network versus classical time series forecasting models

Science.gov (United States)

Nor, Maria Elena; Safuan, Hamizah Mohd; Shab, Noorzehan Fazahiyah Md; Asrul, Mohd; Abdullah, Affendi; Mohamad, Nurul Asmaa Izzati; Lee, Muhammad Hisyam

2017-05-01

Artificial neural network (ANN) has advantage in time series forecasting as it has potential to solve complex forecasting problems. This is because ANN is data driven approach which able to be trained to map past values of a time series. In this study the forecast performance between neural network and classical time series forecasting method namely seasonal autoregressive integrated moving average models was being compared by utilizing gold price data. Moreover, the effect of different data preprocessing on the forecast performance of neural network being examined. The forecast accuracy was evaluated using mean absolute deviation, root mean square error and mean absolute percentage error. It was found that ANN produced the most accurate forecast when Box-Cox transformation was used as data preprocessing.
On-Board, Real-Time Preprocessing System for Optical Remote-Sensing Imagery.

Science.gov (United States)

Qi, Baogui; Shi, Hao; Zhuang, Yin; Chen, He; Chen, Liang

2018-04-25

With the development of remote-sensing technology, optical remote-sensing imagery processing has played an important role in many application fields, such as geological exploration and natural disaster prevention. However, relative radiation correction and geometric correction are key steps in preprocessing because raw image data without preprocessing will cause poor performance during application. Traditionally, remote-sensing data are downlinked to the ground station, preprocessed, and distributed to users. This process generates long delays, which is a major bottleneck in real-time applications for remote-sensing data. Therefore, on-board, real-time image preprocessing is greatly desired. In this paper, a real-time processing architecture for on-board imagery preprocessing is proposed. First, a hierarchical optimization and mapping method is proposed to realize the preprocessing algorithm in a hardware structure, which can effectively reduce the computation burden of on-board processing. Second, a co-processing system using a field-programmable gate array (FPGA) and a digital signal processor (DSP; altogether, FPGA-DSP) based on optimization is designed to realize real-time preprocessing. The experimental results demonstrate the potential application of our system to an on-board processor, for which resources and power consumption are limited.
On-Board, Real-Time Preprocessing System for Optical Remote-Sensing Imagery

Science.gov (United States)

Qi, Baogui; Zhuang, Yin; Chen, He; Chen, Liang

2018-01-01

With the development of remote-sensing technology, optical remote-sensing imagery processing has played an important role in many application fields, such as geological exploration and natural disaster prevention. However, relative radiation correction and geometric correction are key steps in preprocessing because raw image data without preprocessing will cause poor performance during application. Traditionally, remote-sensing data are downlinked to the ground station, preprocessed, and distributed to users. This process generates long delays, which is a major bottleneck in real-time applications for remote-sensing data. Therefore, on-board, real-time image preprocessing is greatly desired. In this paper, a real-time processing architecture for on-board imagery preprocessing is proposed. First, a hierarchical optimization and mapping method is proposed to realize the preprocessing algorithm in a hardware structure, which can effectively reduce the computation burden of on-board processing. Second, a co-processing system using a field-programmable gate array (FPGA) and a digital signal processor (DSP; altogether, FPGA-DSP) based on optimization is designed to realize real-time preprocessing. The experimental results demonstrate the potential application of our system to an on-board processor, for which resources and power consumption are limited. PMID:29693585
Real-time topic-aware influence maximization using preprocessing.

Science.gov (United States)

Chen, Wei; Lin, Tian; Yang, Cheng

2016-01-01

Influence maximization is the task of finding a set of seed nodes in a social network such that the influence spread of these seed nodes based on certain influence diffusion model is maximized. Topic-aware influence diffusion models have been recently proposed to address the issue that influence between a pair of users are often topic-dependent and information, ideas, innovations etc. being propagated in networks are typically mixtures of topics. In this paper, we focus on the topic-aware influence maximization task. In particular, we study preprocessing methods to avoid redoing influence maximization for each mixture from scratch. We explore two preprocessing algorithms with theoretical justifications. Our empirical results on data obtained in a couple of existing studies demonstrate that one of our algorithms stands out as a strong candidate providing microsecond online response time and competitive influence spread, with reasonable preprocessing effort.
Testing for intracycle determinism in pseudoperiodic time series.

Science.gov (United States)

Coelho, Mara C S; Mendes, Eduardo M A M; Aguirre, Luis A

2008-06-01

A determinism test is proposed based on the well-known method of the surrogate data. Assuming predictability to be a signature of determinism, the proposed method checks for intracycle (e.g., short-term) determinism in the pseudoperiodic time series for which standard methods of surrogate analysis do not apply. The approach presented is composed of two steps. First, the data are preprocessed to reduce the effects of seasonal and trend components. Second, standard tests of surrogate analysis can then be used. The determinism test is applied to simulated and experimental pseudoperiodic time series and the results show the applicability of the proposed test.
FTSPlot: fast time series visualization for large datasets.

Directory of Open Access Journals (Sweden)

Michael Riss

Full Text Available The analysis of electrophysiological recordings often involves visual inspection of time series data to locate specific experiment epochs, mask artifacts, and verify the results of signal processing steps, such as filtering or spike detection. Long-term experiments with continuous data acquisition generate large amounts of data. Rapid browsing through these massive datasets poses a challenge to conventional data plotting software because the plotting time increases proportionately to the increase in the volume of data. This paper presents FTSPlot, which is a visualization concept for large-scale time series datasets using techniques from the field of high performance computer graphics, such as hierarchic level of detail and out-of-core data handling. In a preprocessing step, time series data, event, and interval annotations are converted into an optimized data format, which then permits fast, interactive visualization. The preprocessing step has a computational complexity of O(n x log(N; the visualization itself can be done with a complexity of O(1 and is therefore independent of the amount of data. A demonstration prototype has been implemented and benchmarks show that the technology is capable of displaying large amounts of time series data, event, and interval annotations lag-free with < 20 ms ms. The current 64-bit implementation theoretically supports datasets with up to 2(64 bytes, on the x86_64 architecture currently up to 2(48 bytes are supported, and benchmarks have been conducted with 2(40 bytes/1 TiB or 1.3 x 10(11 double precision samples. The presented software is freely available and can be included as a Qt GUI component in future software projects, providing a standard visualization method for long-term electrophysiological experiments.
UniFIeD Univariate Frequency-based Imputation for Time Series Data

OpenAIRE

Friese, Martina; Stork, Jörg; Ramos Guerra, Ricardo; Bartz-Beielstein, Thomas; Thaker, Soham; Flasch, Oliver; Zaefferer, Martin

2013-01-01

This paper introduces UniFIeD, a new data preprocessing method for time series. UniFIeD can cope with large intervals of missing data. A scalable test function generator, which allows the simulation of time series with different gap sizes, is presented additionally. An experimental study demonstrates that (i) UniFIeD shows a significant better performance than simple imputation methods and (ii) UniFIeD is able to handle situations, where advanced imputation methods fail. The results are indep...
Parallel pipeline algorithm of real time star map preprocessing

Science.gov (United States)

Wang, Hai-yong; Qin, Tian-mu; Liu, Jia-qi; Li, Zhi-feng; Li, Jian-hua

2016-03-01

To improve the preprocessing speed of star map and reduce the resource consumption of embedded system of star tracker, a parallel pipeline real-time preprocessing algorithm is presented. The two characteristics, the mean and the noise standard deviation of the background gray of a star map, are firstly obtained dynamically by the means that the intervene of the star image itself to the background is removed in advance. The criterion on whether or not the following noise filtering is needed is established, then the extraction threshold value is assigned according to the level of background noise, so that the centroiding accuracy is guaranteed. In the processing algorithm, as low as two lines of pixel data are buffered, and only 100 shift registers are used to record the connected domain label, by which the problems of resources wasting and connected domain overflow are solved. The simulating results show that the necessary data of the selected bright stars could be immediately accessed in a delay time as short as 10us after the pipeline processing of a 496×496 star map in 50Mb/s is finished, and the needed memory and registers resource total less than 80kb. To verify the accuracy performance of the algorithm proposed, different levels of background noise are added to the processed ideal star map, and the statistic centroiding error is smaller than 1/23 pixel under the condition that the signal to noise ratio is greater than 1. The parallel pipeline algorithm of real time star map preprocessing helps to increase the data output speed and the anti-dynamic performance of star tracker.
Assessing error sources for Landsat time series analysis for tropical test sites in Viet Nam and Ethiopia

Science.gov (United States)

Schultz, Michael; Verbesselt, Jan; Herold, Martin; Avitabile, Valerio

2013-10-01

Researchers who use remotely sensed data can spend half of their total effort analysing prior data. If this data preprocessing does not match the application, this time spent on data analysis can increase considerably and can lead to inaccuracies. Despite the existence of a number of methods for pre-processing Landsat time series, each method has shortcomings, particularly for mapping forest changes under varying illumination, data availability and atmospheric conditions. Based on the requirements of mapping forest changes as defined by the United Nations (UN) Reducing Emissions from Forest Degradation and Deforestation (REDD) program, the accurate reporting of the spatio-temporal properties of these changes is necessary. We compared the impact of three fundamentally different radiometric preprocessing techniques Moderate Resolution Atmospheric TRANsmission (MODTRAN), Second Simulation of a Satellite Signal in the Solar Spectrum (6S) and simple Dark Object Subtraction (DOS) on mapping forest changes using Landsat time series data. A modification of Breaks For Additive Season and Trend (BFAST) monitor was used to jointly map the spatial and temporal agreement of forest changes at test sites in Ethiopia and Viet Nam. The suitability of the pre-processing methods for the occurring forest change drivers was assessed using recently captured Ground Truth and high resolution data (1000 points). A method for creating robust generic forest maps used for the sampling design is presented. An assessment of error sources has been performed identifying haze as a major source for time series analysis commission error.
Effectiveness of Multivariate Time Series Classification Using Shapelets

Directory of Open Access Journals (Sweden)

A. P. Karpenko

2015-01-01

Full Text Available Typically, time series classifiers require signal pre-processing (filtering signals from noise and artifact removal, etc., enhancement of signal features (amplitude, frequency, spectrum, etc., classification of signal features in space using the classical techniques and classification algorithms of multivariate data. We consider a method of classifying time series, which does not require enhancement of the signal features. The method uses the shapelets of time series (time series shapelets i.e. small fragments of this series, which reflect properties of one of its classes most of all.Despite the significant number of publications on the theory and shapelet applications for classification of time series, the task to evaluate the effectiveness of this technique remains relevant. An objective of this publication is to study the effectiveness of a number of modifications of the original shapelet method as applied to the multivariate series classification that is a littlestudied problem. The paper presents the problem statement of multivariate time series classification using the shapelets and describes the shapelet–based basic method of binary classification, as well as various generalizations and proposed modification of the method. It also offers the software that implements a modified method and results of computational experiments confirming the effectiveness of the algorithmic and software solutions.The paper shows that the modified method and the software to use it allow us to reach the classification accuracy of about 85%, at best. The shapelet search time increases in proportion to input data dimension.
The Timeseries Toolbox - A Web Application to Enable Accessible, Reproducible Time Series Analysis

Science.gov (United States)

Veatch, W.; Friedman, D.; Baker, B.; Mueller, C.

2017-12-01

The vast majority of data analyzed by climate researchers are repeated observations of physical process or time series data. This data lends itself of a common set of statistical techniques and models designed to determine trends and variability (e.g., seasonality) of these repeated observations. Often, these same techniques and models can be applied to a wide variety of different time series data. The Timeseries Toolbox is a web application designed to standardize and streamline these common approaches to time series analysis and modeling with particular attention to hydrologic time series used in climate preparedness and resilience planning and design by the U. S. Army Corps of Engineers. The application performs much of the pre-processing of time series data necessary for more complex techniques (e.g. interpolation, aggregation). With this tool, users can upload any dataset that conforms to a standard template and immediately begin applying these techniques to analyze their time series data.
Creation and evaluation of a database of renewable production time series and other data for energy system modelling

International Nuclear Information System (INIS)

Janker, Karl Albert

2015-01-01

This thesis describes a model which generates renewable power generation time series as input data for energy system models. The focus is on photovoltaic systems and wind turbines. The basis is a high resolution global raster data set of weather data for many years. This data is validated, corrected and preprocessed. The composition of the hourly generation data is done via simulation of the respective technology. The generated time series are aggregated for different regions and are validated against historical production time series.
Automated preparation of Kepler time series of planet hosts for asteroseismic analysis

DEFF Research Database (Denmark)

Handberg, R.; Lund, M. N.

2014-01-01

. In this paper we present the KASOC Filter, which is used to automatically prepare data from the Kepler/K2 mission for asteroseismic analyses of solar-like planet host stars. The methods are very effective at removing unwanted signals of both instrumental and planetary origins and produce significantly cleaner......One of the tasks of the Kepler Asteroseismic Science Operations Center (KASOC) is to provide asteroseismic analyses on Kepler Objects of Interest (KOIs). However, asteroseismic analysis of planetary host stars presents some unique complications with respect to data preprocessing, compared to pure...... asteroseismic targets. If not accounted for, the presence of planetary transits in the photometric time series often greatly complicates or even hinders these asteroseismic analyses. This drives the need for specialised methods of preprocessing data to make them suitable for asteroseismic analysis...
JTSA: an open source framework for time series abstractions.

Science.gov (United States)

Sacchi, Lucia; Capozzi, Davide; Bellazzi, Riccardo; Larizza, Cristiana

2015-10-01

The evaluation of the clinical status of a patient is frequently based on the temporal evolution of some parameters, making the detection of temporal patterns a priority in data analysis. Temporal abstraction (TA) is a methodology widely used in medical reasoning for summarizing and abstracting longitudinal data. This paper describes JTSA (Java Time Series Abstractor), a framework including a library of algorithms for time series preprocessing and abstraction and an engine to execute a workflow for temporal data processing. The JTSA framework is grounded on a comprehensive ontology that models temporal data processing both from the data storage and the abstraction computation perspective. The JTSA framework is designed to allow users to build their own analysis workflows by combining different algorithms. Thanks to the modular structure of a workflow, simple to highly complex patterns can be detected. The JTSA framework has been developed in Java 1.7 and is distributed under GPL as a jar file. JTSA provides: a collection of algorithms to perform temporal abstraction and preprocessing of time series, a framework for defining and executing data analysis workflows based on these algorithms, and a GUI for workflow prototyping and testing. The whole JTSA project relies on a formal model of the data types and of the algorithms included in the library. This model is the basis for the design and implementation of the software application. Taking into account this formalized structure, the user can easily extend the JTSA framework by adding new algorithms. Results are shown in the context of the EU project MOSAIC to extract relevant patterns from data coming related to the long term monitoring of diabetic patients. The proof that JTSA is a versatile tool to be adapted to different needs is given by its possible uses, both as a standalone tool for data summarization and as a module to be embedded into other architectures to select specific phenotypes based on TAs in a large
Constant time distance queries in planar unweighted graphs with subquadratic preprocessing time

DEFF Research Database (Denmark)

Wulff-Nilsen, C.

2013-01-01

Let G be an n-vertex planar, undirected, and unweighted graph. It was stated as open problems whether the Wiener index, defined as the sum of all-pairs shortest path distances, and the diameter of G can be computed in o(n(2)) time. We show that both problems can be solved in O(n(2) log log n/log n......) time with O(n) space. The techniques that we apply allow us to build, within the same time bound, an oracle for exact distance queries in G. More generally, for any parameter S is an element of [(log n/log log n)(2), n(2/5)], distance queries can be answered in O (root S log S/log n) time per query...... with O(n(2)/root S) preprocessing time and space requirement. With respect to running time, this is better than previous algorithms when log S = o(log n). All algorithms have linear space requirement. Our results generalize to a larger class of graphs including those with a fixed excluded minor. (C) 2012...

Spectral Estimation of UV-Vis Absorbance Time Series for Water Quality Monitoring

Directory of Open Access Journals (Sweden)

Leonardo Plazas-Nossa

2017-05-01

Full Text Available Context: Signals recorded as multivariate time series by UV-Vis absorbance captors installed in urban sewer systems, can be non-stationary, yielding complications in the analysis of water quality monitoring. This work proposes to perform spectral estimation using the Box-Cox transformation and differentiation in order to obtain stationary multivariate time series in a wide sense. Additionally, Principal Component Analysis (PCA is applied to reduce their dimensionality. Method: Three different UV-Vis absorbance time series for different Colombian locations were studied: (i El-Salitre Wastewater Treatment Plant (WWTP in Bogotá; (ii Gibraltar Pumping Station (GPS in Bogotá; and (iii San-Fernando WWTP in Itagüí. Each UV-Vis absorbance time series had equal sample number (5705. The esti-mation of the spectral power density is obtained using the average of modified periodograms with rectangular window and an overlap of 50%, with the 20 most important harmonics from the Discrete Fourier Transform (DFT and Inverse Fast Fourier Transform (IFFT. Results: Absorbance time series dimensionality reduction using PCA, resulted in 6, 8 and 7 principal components for each study site respectively, altogether explaining more than 97% of their variability. Values of differences below 30% for the UV range were obtained for the three study sites, while for the visible range the maximum differences obtained were: (i 35% for El-Salitre WWTP; (ii 61% for GPS; and (iii 75% for San-Fernando WWTP. Conclusions: The Box-Cox transformation and the differentiation process applied to the UV-Vis absorbance time series for the study sites (El-Salitre, GPS and San-Fernando, allowed to reduce variance and to eliminate ten-dency of the time series. A pre-processing of UV-Vis absorbance time series is recommended to detect and remove outliers and then apply the proposed process for spectral estimation. Language: Spanish.
Preprocessing Algorithm for Deciphering Historical Inscriptions Using String Metric

Directory of Open Access Journals (Sweden)

Lorand Lehel Toth

2016-07-01

Full Text Available The article presents the improvements in the preprocessing part of the deciphering method (shortly preprocessing algorithm for historical inscriptions of unknown origin. Glyphs used in historical inscriptions changed through time; therefore, various versions of the same script may contain different glyphs for each grapheme. The purpose of the preprocessing algorithm is reducing the running time of the deciphering process by filtering out the less probable interpretations of the examined inscription. However, the first version of the preprocessing algorithm leads incorrect outcome or no result in the output in certain cases. Therefore, its improved version was developed to find the most similar words in the dictionary by relaying the search conditions more accurately, but still computationally effectively. Moreover, a sophisticated similarity metric used to determine the possible meaning of the unknown inscription is introduced. The results of the evaluations are also detailed.
Examination of Speed Contribution of Parallelization for Several Fingerprint Pre-Processing Algorithms

Directory of Open Access Journals (Sweden)

GORGUNOGLU, S.

2014-05-01

Full Text Available In analysis of minutiae based fingerprint systems, fingerprints needs to be pre-processed. The pre-processing is carried out to enhance the quality of the fingerprint and to obtain more accurate minutiae points. Reducing the pre-processing time is important for identification and verification in real time systems and especially for databases holding large fingerprints information. Parallel processing and parallel CPU computing can be considered as distribution of processes over multi core processor. This is done by using parallel programming techniques. Reducing the execution time is the main objective in parallel processing. In this study, pre-processing of minutiae based fingerprint system is implemented by parallel processing on multi core computers using OpenMP and on graphics processor using CUDA to improve execution time. The execution times and speedup ratios are compared with the one that of single core processor. The results show that by using parallel processing, execution time is substantially improved. The improvement ratios obtained for different pre-processing algorithms allowed us to make suggestions on the more suitable approaches for parallelization.
Automated Bayesian model development for frequency detection in biological time series

Directory of Open Access Journals (Sweden)

Oldroyd Giles ED

2011-06-01

Full Text Available Abstract Background A first step in building a mathematical model of a biological system is often the analysis of the temporal behaviour of key quantities. Mathematical relationships between the time and frequency domain, such as Fourier Transforms and wavelets, are commonly used to extract information about the underlying signal from a given time series. This one-to-one mapping from time points to frequencies inherently assumes that both domains contain the complete knowledge of the system. However, for truncated, noisy time series with background trends this unique mapping breaks down and the question reduces to an inference problem of identifying the most probable frequencies. Results In this paper we build on the method of Bayesian Spectrum Analysis and demonstrate its advantages over conventional methods by applying it to a number of test cases, including two types of biological time series. Firstly, oscillations of calcium in plant root cells in response to microbial symbionts are non-stationary and noisy, posing challenges to data analysis. Secondly, circadian rhythms in gene expression measured over only two cycles highlights the problem of time series with limited length. The results show that the Bayesian frequency detection approach can provide useful results in specific areas where Fourier analysis can be uninformative or misleading. We demonstrate further benefits of the Bayesian approach for time series analysis, such as direct comparison of different hypotheses, inherent estimation of noise levels and parameter precision, and a flexible framework for modelling the data without pre-processing. Conclusions Modelling in systems biology often builds on the study of time-dependent phenomena. Fourier Transforms are a convenient tool for analysing the frequency domain of time series. However, there are well-known limitations of this method, such as the introduction of spurious frequencies when handling short and noisy time series, and
Automated Bayesian model development for frequency detection in biological time series.

Science.gov (United States)

Granqvist, Emma; Oldroyd, Giles E D; Morris, Richard J

2011-06-24

A first step in building a mathematical model of a biological system is often the analysis of the temporal behaviour of key quantities. Mathematical relationships between the time and frequency domain, such as Fourier Transforms and wavelets, are commonly used to extract information about the underlying signal from a given time series. This one-to-one mapping from time points to frequencies inherently assumes that both domains contain the complete knowledge of the system. However, for truncated, noisy time series with background trends this unique mapping breaks down and the question reduces to an inference problem of identifying the most probable frequencies. In this paper we build on the method of Bayesian Spectrum Analysis and demonstrate its advantages over conventional methods by applying it to a number of test cases, including two types of biological time series. Firstly, oscillations of calcium in plant root cells in response to microbial symbionts are non-stationary and noisy, posing challenges to data analysis. Secondly, circadian rhythms in gene expression measured over only two cycles highlights the problem of time series with limited length. The results show that the Bayesian frequency detection approach can provide useful results in specific areas where Fourier analysis can be uninformative or misleading. We demonstrate further benefits of the Bayesian approach for time series analysis, such as direct comparison of different hypotheses, inherent estimation of noise levels and parameter precision, and a flexible framework for modelling the data without pre-processing. Modelling in systems biology often builds on the study of time-dependent phenomena. Fourier Transforms are a convenient tool for analysing the frequency domain of time series. However, there are well-known limitations of this method, such as the introduction of spurious frequencies when handling short and noisy time series, and the requirement for uniformly sampled data. Biological time
BiGGEsTS: integrated environment for biclustering analysis of time series gene expression data

Directory of Open Access Journals (Sweden)

Madeira Sara C

2009-07-01

Full Text Available Abstract Background The ability to monitor changes in expression patterns over time, and to observe the emergence of coherent temporal responses using expression time series, is critical to advance our understanding of complex biological processes. Biclustering has been recognized as an effective method for discovering local temporal expression patterns and unraveling potential regulatory mechanisms. The general biclustering problem is NP-hard. In the case of time series this problem is tractable, and efficient algorithms can be used. However, there is still a need for specialized applications able to take advantage of the temporal properties inherent to expression time series, both from a computational and a biological perspective. Findings BiGGEsTS makes available state-of-the-art biclustering algorithms for analyzing expression time series. Gene Ontology (GO annotations are used to assess the biological relevance of the biclusters. Methods for preprocessing expression time series and post-processing results are also included. The analysis is additionally supported by a visualization module capable of displaying informative representations of the data, including heatmaps, dendrograms, expression charts and graphs of enriched GO terms. Conclusion BiGGEsTS is a free open source graphical software tool for revealing local coexpression of genes in specific intervals of time, while integrating meaningful information on gene annotations. It is freely available at: http://kdbio.inesc-id.pt/software/biggests. We present a case study on the discovery of transcriptional regulatory modules in the response of Saccharomyces cerevisiae to heat stress.
Poisson pre-processing of nonstationary photonic signals: Signals with equality between mean and variance.

Science.gov (United States)

Poplová, Michaela; Sovka, Pavel; Cifra, Michal

2017-01-01

Photonic signals are broadly exploited in communication and sensing and they typically exhibit Poisson-like statistics. In a common scenario where the intensity of the photonic signals is low and one needs to remove a nonstationary trend of the signals for any further analysis, one faces an obstacle: due to the dependence between the mean and variance typical for a Poisson-like process, information about the trend remains in the variance even after the trend has been subtracted, possibly yielding artifactual results in further analyses. Commonly available detrending or normalizing methods cannot cope with this issue. To alleviate this issue we developed a suitable pre-processing method for the signals that originate from a Poisson-like process. In this paper, a Poisson pre-processing method for nonstationary time series with Poisson distribution is developed and tested on computer-generated model data and experimental data of chemiluminescence from human neutrophils and mung seeds. The presented method transforms a nonstationary Poisson signal into a stationary signal with a Poisson distribution while preserving the type of photocount distribution and phase-space structure of the signal. The importance of the suggested pre-processing method is shown in Fano factor and Hurst exponent analysis of both computer-generated model signals and experimental photonic signals. It is demonstrated that our pre-processing method is superior to standard detrending-based methods whenever further signal analysis is sensitive to variance of the signal.
Detection of Outliers and Imputing of Missing Values for Water Quality UV-VIS Absorbance Time Series

OpenAIRE

Plazas-Nossa, Leonardo; Ávila Angulo, Miguel Antonio; Torres, Andrés

2017-01-01

Context:The UV-Vis absorbance collection using online optical captors for water quality detection may yield outliers and/or missing values. Therefore, pre-processing to correct these anomalies is required to improve the analysis of monitoring data. The aim of this study is to propose a method to detect outliers as well as to fill-in the gaps in time series. Method:Outliers are detected using Winsorising procedure and the application of the Discrete Fourier Transform (DFT) and the Inverse of F...
The recursive combination filter approach of pre-processing for the estimation of standard deviation of RR series.

Science.gov (United States)

Mishra, Alok; Swati, D

2015-09-01

Variation in the interval between the R-R peaks of the electrocardiogram represents the modulation of the cardiac oscillations by the autonomic nervous system. This variation is contaminated by anomalous signals called ectopic beats, artefacts or noise which mask the true behaviour of heart rate variability. In this paper, we have proposed a combination filter of recursive impulse rejection filter and recursive 20% filter, with recursive application and preference of replacement over removal of abnormal beats to improve the pre-processing of the inter-beat intervals. We have tested this novel recursive combinational method with median method replacement to estimate the standard deviation of normal to normal (SDNN) beat intervals of congestive heart failure (CHF) and normal sinus rhythm subjects. This work discusses the improvement in pre-processing over single use of impulse rejection filter and removal of abnormal beats for heart rate variability for the estimation of SDNN and Poncaré plot descriptors (SD1, SD2, and SD1/SD2) in detail. We have found the 22 ms value of SDNN and 36 ms value of SD2 descriptor of Poincaré plot as clinical indicators in discriminating the normal cases from CHF cases. The pre-processing is also useful in calculation of Lyapunov exponent which is a nonlinear index as Lyapunov exponents calculated after proposed pre-processing modified in a way that it start following the notion of less complex behaviour of diseased states.
Reliable RANSAC Using a Novel Preprocessing Model

Directory of Open Access Journals (Sweden)

Xiaoyan Wang

2013-01-01

Full Text Available Geometric assumption and verification with RANSAC has become a crucial step for corresponding to local features due to its wide applications in biomedical feature analysis and vision computing. However, conventional RANSAC is very time-consuming due to redundant sampling times, especially dealing with the case of numerous matching pairs. This paper presents a novel preprocessing model to explore a reduced set with reliable correspondences from initial matching dataset. Both geometric model generation and verification are carried out on this reduced set, which leads to considerable speedups. Afterwards, this paper proposes a reliable RANSAC framework using preprocessing model, which was implemented and verified using Harris and SIFT features, respectively. Compared with traditional RANSAC, experimental results show that our method is more efficient.
Creation and evaluation of a database of renewable production time series and other data for energy system modelling; Aufbau und Bewertung einer fuer die Energiemodellierung verwendbaren Datenbasis an Zeitreihen erneuerbarer Erzeugung und sonstiger Daten

Energy Technology Data Exchange (ETDEWEB)

Janker, Karl Albert

2015-01-28

This thesis describes a model which generates renewable power generation time series as input data for energy system models. The focus is on photovoltaic systems and wind turbines. The basis is a high resolution global raster data set of weather data for many years. This data is validated, corrected and preprocessed. The composition of the hourly generation data is done via simulation of the respective technology. The generated time series are aggregated for different regions and are validated against historical production time series.
Effect of packaging on physicochemical characteristics of irradiated pre-processed chicken

International Nuclear Information System (INIS)

Jiang Xiujie; Zhang Dongjie; Zhang Dequan; Li Shurong; Gao Meixu; Wang Zhidong

2011-01-01

To explore the effect of modified atmosphere packaging and antioxidants on the physicochemical characteristics of irradiated pre-processed chicken, the pre-processed chicken was added antioxidants first, and then packaged in common, vacuum and gas respectively, and finally irradiated at 5 kGy dosage. All samples was stored at 4 ℃. The pH, TBA, TVB-N and color deviation were evaluated after 0, 3, 7, 10, 14, 18 and 21 d of storage. The results showed that pH value of pre-processed chicken with antioxidants and vacuum packaged increased with the storage time but not significantly among different treatments. The TBA value was also increased but not significantly (P > 0.05), which indicated that vacuum package inhibited the lipid oxidation. TVB-N value increased with storage time, TVB-N value of vacuum package samples reached 14.29 mg/100 g at 21 d storage, which did not exceeded the reference indexes of fresh meat. a * value of the pre-processed chicken of vacuum package and non-oxygen package samples increased significantly during storage (P > 0.05), and chicken color kept bright red after 21 d storage with vacuum package It is concluded that vacuum packaging of irradiated pre-processed chicken is effective on ensuring its physical and chemical properties during storage. (authors)
Micro-Analyzer: automatic preprocessing of Affymetrix microarray data.

Science.gov (United States)

Guzzi, Pietro Hiram; Cannataro, Mario

2013-08-01

A current trend in genomics is the investigation of the cell mechanism using different technologies, in order to explain the relationship among genes, molecular processes and diseases. For instance, the combined use of gene-expression arrays and genomic arrays has been demonstrated as an effective instrument in clinical practice. Consequently, in a single experiment different kind of microarrays may be used, resulting in the production of different types of binary data (images and textual raw data). The analysis of microarray data requires an initial preprocessing phase, that makes raw data suitable for use on existing analysis platforms, such as the TIGR M4 (TM4) Suite. An additional challenge to be faced by emerging data analysis platforms is the ability to treat in a combined way those different microarray formats coupled with clinical data. In fact, resulting integrated data may include both numerical and symbolic data (e.g. gene expression and SNPs regarding molecular data), as well as temporal data (e.g. the response to a drug, time to progression and survival rate), regarding clinical data. Raw data preprocessing is a crucial step in analysis but is often performed in a manual and error prone way using different software tools. Thus novel, platform independent, and possibly open source tools enabling the semi-automatic preprocessing and annotation of different microarray data are needed. The paper presents Micro-Analyzer (Microarray Analyzer), a cross-platform tool for the automatic normalization, summarization and annotation of Affymetrix gene expression and SNP binary data. It represents the evolution of the μ-CS tool, extending the preprocessing to SNP arrays that were not allowed in μ-CS. The Micro-Analyzer is provided as a Java standalone tool and enables users to read, preprocess and analyse binary microarray data (gene expression and SNPs) by invoking TM4 platform. It avoids: (i) the manual invocation of external tools (e.g. the Affymetrix Power
Detection of Outliers and Imputing of Missing Values for Water Quality UV-VIS Absorbance Time Series

Directory of Open Access Journals (Sweden)

Leonardo Plazas-Nossa

2017-01-01

Full Text Available Context: The UV-Vis absorbance collection using online optical captors for water quality detection may yield outliers and/or missing values. Therefore, data pre-processing is a necessary pre-requisite to monitoring data processing. Thus, the aim of this study is to propose a method that detects and removes outliers as well as fills gaps in time series. Method: Outliers are detected using Winsorising procedure and the application of the Discrete Fourier Transform (DFT and the Inverse of Fast Fourier Transform (IFFT to complete the time series. Together, these tools were used to analyse a case study comprising three sites in Colombia ((i Bogotá D.C. Salitre-WWTP (Waste Water Treatment Plant, influent; (ii Bogotá D.C. Gibraltar Pumping Station (GPS; and, (iii Itagüí, San Fernando-WWTP, influent (Medellín metropolitan area analysed via UV-Vis (Ultraviolet and Visible spectra. Results: Outlier detection with the proposed method obtained promising results when window parameter values are small and self-similar, despite that the three time series exhibited different sizes and behaviours. The DFT allowed to process different length gaps having missing values. To assess the validity of the proposed method, continuous subsets (a section of the absorbance time series without outlier or missing values were removed from the original time series obtaining an average 12% error rate in the three testing time series. Conclusions: The application of the DFT and the IFFT, using the 10% most important harmonics of useful values, can be useful for its later use in different applications, specifically for time series of water quality and quantity in urban sewer systems. One potential application would be the analysis of dry weather interesting to rain events, a feat achieved by detecting values that correspond to unusual behaviour in a time series. Additionally, the result hints at the potential of the method in correcting other hydrologic time series.
Facilitating Watermark Insertion by Preprocessing Media

Directory of Open Access Journals (Sweden)

Matt L. Miller

2004-10-01

Full Text Available There are several watermarking applications that require the deployment of a very large number of watermark embedders. These applications often have severe budgetary constraints that limit the computation resources that are available. Under these circumstances, only simple embedding algorithms can be deployed, which have limited performance. In order to improve performance, we propose preprocessing the original media. It is envisaged that this preprocessing occurs during content creation and has no budgetary or computational constraints. Preprocessing combined with simple embedding creates a watermarked Work, the performance of which exceeds that of simple embedding alone. However, this performance improvement is obtained without any increase in the computational complexity of the embedder. Rather, the additional computational burden is shifted to the preprocessing stage. A simple example of this procedure is described and experimental results confirm our assertions.
Revising time series of the Elbe river discharge for flood frequency determination at gauge Dresden

Directory of Open Access Journals (Sweden)

S. Bartl

2009-11-01

Full Text Available The German research programme RIsk MAnagment of eXtreme flood events has accomplished the improvement of regional hazard assessment for the large rivers in Germany. Here we focused on the Elbe river at its gauge Dresden, which belongs to the oldest gauges in Europe with officially available daily discharge time series beginning on 1 January 1890. The project on the one hand aimed to extend and to revise the existing time series, and on the other hand to examine the variability of the Elbe river discharge conditions on a greater time scale. Therefore one major task were the historical searches and the examination of the retrieved documents and the contained information. After analysing this information the development of the river course and the discharge conditions were discussed. Using the provided knowledge, in an other subproject, a historical hydraulic model was established. Its results then again were used here. A further purpose was the determining of flood frequency based on all pre-processed data. The obtained knowledge about historical changes was also used to get an idea about possible future variations under climate change conditions. Especially variations in the runoff characteristic of the Elbe river over the course of the year were analysed. It succeeded to obtain a much longer discharge time series which contain fewer errors and uncertainties. Hence an optimized regional hazard assessment was realised.
Revising time series of the Elbe river discharge for flood frequency determination at gauge Dresden

Science.gov (United States)

Bartl, S.; Schümberg, S.; Deutsch, M.

2009-11-01

The German research programme RIsk MAnagment of eXtreme flood events has accomplished the improvement of regional hazard assessment for the large rivers in Germany. Here we focused on the Elbe river at its gauge Dresden, which belongs to the oldest gauges in Europe with officially available daily discharge time series beginning on 1 January 1890. The project on the one hand aimed to extend and to revise the existing time series, and on the other hand to examine the variability of the Elbe river discharge conditions on a greater time scale. Therefore one major task were the historical searches and the examination of the retrieved documents and the contained information. After analysing this information the development of the river course and the discharge conditions were discussed. Using the provided knowledge, in an other subproject, a historical hydraulic model was established. Its results then again were used here. A further purpose was the determining of flood frequency based on all pre-processed data. The obtained knowledge about historical changes was also used to get an idea about possible future variations under climate change conditions. Especially variations in the runoff characteristic of the Elbe river over the course of the year were analysed. It succeeded to obtain a much longer discharge time series which contain fewer errors and uncertainties. Hence an optimized regional hazard assessment was realised.
ENDF/B Pre-Processing Codes: Implementing and testing on a Personal Computer

International Nuclear Information System (INIS)

McLaughlin, P.K.

1987-05-01

This document describes the contents of the diskettes containing the ENDF/B Pre-Processing codes by D.E. Cullen, and example data for use in implementing and testing these codes on a Personal Computer of the type IBM-PC/AT. Upon request the codes are available from the IAEA Nuclear Data Section, free of charge, on a series of 7 diskettes. (author)
Retinal Image Preprocessing: Background and Noise Segmentation

Directory of Open Access Journals (Sweden)

Usman Akram

2012-09-01

Full Text Available Retinal images are used for the automated screening and diagnosis of diabetic retinopathy. The retinal image quality must be improved for the detection of features and abnormalities and for this purpose preprocessing of retinal images is vital. In this paper, we present a novel automated approach for preprocessing of colored retinal images. The proposed technique improves the quality of input retinal image by separating the background and noisy area from the overall image. It contains coarse segmentation and fine segmentation. Standard retinal images databases Diaretdb0, Diaretdb1, DRIVE and STARE are used to test the validation of our preprocessing technique. The experimental results show the validity of proposed preprocessing technique.
An Effective Measured Data Preprocessing Method in Electrical Impedance Tomography

Directory of Open Access Journals (Sweden)

Chenglong Yu

2014-01-01

Full Text Available As an advanced process detection technology, electrical impedance tomography (EIT has widely been paid attention to and studied in the industrial fields. But the EIT techniques are greatly limited to the low spatial resolutions. This problem may result from the incorrect preprocessing of measuring data and lack of general criterion to evaluate different preprocessing processes. In this paper, an EIT data preprocessing method is proposed by all rooting measured data and evaluated by two constructed indexes based on all rooted EIT measured data. By finding the optimums of the two indexes, the proposed method can be applied to improve the EIT imaging spatial resolutions. In terms of a theoretical model, the optimal rooting times of the two indexes range in [0.23, 0.33] and in [0.22, 0.35], respectively. Moreover, these factors that affect the correctness of the proposed method are generally analyzed. The measuring data preprocessing is necessary and helpful for any imaging process. Thus, the proposed method can be generally and widely used in any imaging process. Experimental results validate the two proposed indexes.

A Real-Time Embedded System for Stereo Vision Preprocessing Using an FPGA

DEFF Research Database (Denmark)

Kjær-Nielsen, Anders; Jensen, Lars Baunegaard With; Sørensen, Anders Stengaard

2008-01-01

In this paper a low level vision processing node for use in existing IEEE 1394 camera setups is presented. The processing node is a small embedded system, that utilizes an FPGA to perform stereo vision preprocessing at rates limited by the bandwidth of IEEE 1394a (400Mbit). The system is used...
GPS Position Time Series @ JPL

Science.gov (United States)

Owen, Susan; Moore, Angelyn; Kedar, Sharon; Liu, Zhen; Webb, Frank; Heflin, Mike; Desai, Shailen

2013-01-01

Different flavors of GPS time series analysis at JPL - Use same GPS Precise Point Positioning Analysis raw time series - Variations in time series analysis/post-processing driven by different users. center dot JPL Global Time Series/Velocities - researchers studying reference frame, combining with VLBI/SLR/DORIS center dot JPL/SOPAC Combined Time Series/Velocities - crustal deformation for tectonic, volcanic, ground water studies center dot ARIA Time Series/Coseismic Data Products - Hazard monitoring and response focused center dot ARIA data system designed to integrate GPS and InSAR - GPS tropospheric delay used for correcting InSAR - Caltech's GIANT time series analysis uses GPS to correct orbital errors in InSAR - Zhen Liu's talking tomorrow on InSAR Time Series analysis
Time series analysis time series analysis methods and applications

CERN Document Server

Rao, Tata Subba; Rao, C R

2012-01-01

The field of statistics not only affects all areas of scientific activity, but also many other matters such as public policy. It is branching rapidly into so many different subjects that a series of handbooks is the only way of comprehensively presenting the various aspects of statistical methodology, applications, and recent developments. The Handbook of Statistics is a series of self-contained reference books. Each volume is devoted to a particular topic in statistics, with Volume 30 dealing with time series. The series is addressed to the entire community of statisticians and scientists in various disciplines who use statistical methodology in their work. At the same time, special emphasis is placed on applications-oriented techniques, with the applied statistician in mind as the primary audience. Comprehensively presents the various aspects of statistical methodology Discusses a wide variety of diverse applications and recent developments Contributors are internationally renowened experts in their respect...
Highly comparative time-series analysis: the empirical structure of time series and their methods.

Science.gov (United States)

Fulcher, Ben D; Little, Max A; Jones, Nick S

2013-06-06

The process of collecting and organizing sets of observations represents a common theme throughout the history of science. However, despite the ubiquity of scientists measuring, recording and analysing the dynamics of different processes, an extensive organization of scientific time-series data and analysis methods has never been performed. Addressing this, annotated collections of over 35 000 real-world and model-generated time series, and over 9000 time-series analysis algorithms are analysed in this work. We introduce reduced representations of both time series, in terms of their properties measured by diverse scientific methods, and of time-series analysis methods, in terms of their behaviour on empirical time series, and use them to organize these interdisciplinary resources. This new approach to comparing across diverse scientific data and methods allows us to organize time-series datasets automatically according to their properties, retrieve alternatives to particular analysis methods developed in other scientific disciplines and automate the selection of useful methods for time-series classification and regression tasks. The broad scientific utility of these tools is demonstrated on datasets of electroencephalograms, self-affine time series, heartbeat intervals, speech signals and others, in each case contributing novel analysis techniques to the existing literature. Highly comparative techniques that compare across an interdisciplinary literature can thus be used to guide more focused research in time-series analysis for applications across the scientific disciplines.
Min st-cut oracle for planar graphs with near-linear preprocessing time

DEFF Research Database (Denmark)

Borradaile, Glencora; Sankowski, Piotr; Wulff-Nilsen, Christian

2010-01-01

For an undirected n-vertex planar graph G with non-negative edge-weights, we consider the following type of query: given two vertices s and t in G, what is the weight of a min st-cut in G? We show how to answer such queries in constant time with O(n log5 n) preprocessing time and O(n log n) space....... We use a Gomory-Hu tree to represent all the pairwise min st-cuts implicitly. Previously, no subquadratic time algorithm was known for this problem. Our oracle can be extended to report the min st-cuts in time proportional to their size. Since all-pairs min st-cut and the minimum cycle basis are dual...... problems in planar graphs, we also obtain an implicit representation of a minimum cycle basis in O(n log5 n) time and O(n log n) space and an explicit representation with additional O(C) time and space where C is the size of the basis. To obtain our results, we require that shortest paths be unique...
Image preprocessing study on KPCA-based face recognition

Science.gov (United States)

Li, Xuan; Li, Dehua

2015-12-01

Face recognition as an important biometric identification method, with its friendly, natural, convenient advantages, has obtained more and more attention. This paper intends to research a face recognition system including face detection, feature extraction and face recognition, mainly through researching on related theory and the key technology of various preprocessing methods in face detection process, using KPCA method, focuses on the different recognition results in different preprocessing methods. In this paper, we choose YCbCr color space for skin segmentation and choose integral projection for face location. We use erosion and dilation of the opening and closing operation and illumination compensation method to preprocess face images, and then use the face recognition method based on kernel principal component analysis method for analysis and research, and the experiments were carried out using the typical face database. The algorithms experiment on MATLAB platform. Experimental results show that integration of the kernel method based on PCA algorithm under certain conditions make the extracted features represent the original image information better for using nonlinear feature extraction method, which can obtain higher recognition rate. In the image preprocessing stage, we found that images under various operations may appear different results, so as to obtain different recognition rate in recognition stage. At the same time, in the process of the kernel principal component analysis, the value of the power of the polynomial function can affect the recognition result.
Introduction to Time Series Modeling

CERN Document Server

Kitagawa, Genshiro

2010-01-01

In time series modeling, the behavior of a certain phenomenon is expressed in relation to the past values of itself and other covariates. Since many important phenomena in statistical analysis are actually time series and the identification of conditional distribution of the phenomenon is an essential part of the statistical modeling, it is very important and useful to learn fundamental methods of time series modeling. Illustrating how to build models for time series using basic methods, "Introduction to Time Series Modeling" covers numerous time series models and the various tools f
The Effect of Preprocessing on Arabic Document Categorization

Directory of Open Access Journals (Sweden)

Abdullah Ayedh

2016-04-01

Full Text Available Preprocessing is one of the main components in a conventional document categorization (DC framework. This paper aims to highlight the effect of preprocessing tasks on the efficiency of the Arabic DC system. In this study, three classification techniques are used, namely, naive Bayes (NB, k-nearest neighbor (KNN, and support vector machine (SVM. Experimental analysis on Arabic datasets reveals that preprocessing techniques have a significant impact on the classification accuracy, especially with complicated morphological structure of the Arabic language. Choosing appropriate combinations of preprocessing tasks provides significant improvement on the accuracy of document categorization depending on the feature size and classification techniques. Findings of this study show that the SVM technique has outperformed the KNN and NB techniques. The SVM technique achieved 96.74% micro-F1 value by using the combination of normalization and stemming as preprocessing tasks.
Piecewise Polynomial Aggregation as Preprocessing for Data Numerical Modeling

Science.gov (United States)

Dobronets, B. S.; Popova, O. A.

2018-05-01

Data aggregation issues for numerical modeling are reviewed in the present study. The authors discuss data aggregation procedures as preprocessing for subsequent numerical modeling. To calculate the data aggregation, the authors propose using numerical probabilistic analysis (NPA). An important feature of this study is how the authors represent the aggregated data. The study shows that the offered approach to data aggregation can be interpreted as the frequency distribution of a variable. To study its properties, the density function is used. For this purpose, the authors propose using the piecewise polynomial models. A suitable example of such approach is the spline. The authors show that their approach to data aggregation allows reducing the level of data uncertainty and significantly increasing the efficiency of numerical calculations. To demonstrate the degree of the correspondence of the proposed methods to reality, the authors developed a theoretical framework and considered numerical examples devoted to time series aggregation.
Data preprocessing in data mining

CERN Document Server

García, Salvador; Herrera, Francisco

2015-01-01

Data Preprocessing for Data Mining addresses one of the most important issues within the well-known Knowledge Discovery from Data process. Data directly taken from the source will likely have inconsistencies, errors or most importantly, it is not ready to be considered for a data mining process. Furthermore, the increasing amount of data in recent science, industry and business applications, calls to the requirement of more complex tools to analyze it. Thanks to data preprocessing, it is possible to convert the impossible into possible, adapting the data to fulfill the input demands of each data mining algorithm. Data preprocessing includes the data reduction techniques, which aim at reducing the complexity of the data, detecting or removing irrelevant and noisy elements from the data. This book is intended to review the tasks that fill the gap between the data acquisition from the source and the data mining process. A comprehensive look from a practical point of view, including basic concepts and surveying t...
A COMPARATIVE STUDY OF FORECASTING MODELS FOR TREND AND SEASONAL TIME SERIES DOES COMPLEX MODEL ALWAYS YIELD BETTER FORECAST THAN SIMPLE MODELS

Directory of Open Access Journals (Sweden)

Suhartono Suhartono

2005-01-01

Full Text Available Many business and economic time series are non-stationary time series that contain trend and seasonal variations. Seasonality is a periodic and recurrent pattern caused by factors such as weather, holidays, or repeating promotions. A stochastic trend is often accompanied with the seasonal variations and can have a significant impact on various forecasting methods. In this paper, we will investigate and compare some forecasting methods for modeling time series with both trend and seasonal patterns. These methods are Winter's, Decomposition, Time Series Regression, ARIMA and Neural Networks models. In this empirical research, we study on the effectiveness of the forecasting performance, particularly to answer whether a complex method always give a better forecast than a simpler method. We use a real data, that is airline passenger data. The result shows that the more complex model does not always yield a better result than a simpler one. Additionally, we also find the possibility to do further research especially the use of hybrid model by combining some forecasting method to get better forecast, for example combination between decomposition (as data preprocessing and neural network model.
Boosting reversible pushdown machines by preprocessing

DEFF Research Database (Denmark)

Axelsen, Holger Bock; Kutrib, Martin; Malcher, Andreas

2016-01-01

languages, whereas for reversible pushdown automata the accepted family of languages lies strictly in between the reversible deterministic context-free languages and the real-time deterministic context-free languages. Moreover, it is shown that the computational power of both types of machines...... is not changed by allowing the preprocessing sequential transducer to work irreversibly. Finally, we examine the closure properties of the family of languages accepted by such machines....
International Work-Conference on Time Series

CERN Document Server

Pomares, Héctor; Valenzuela, Olga

2017-01-01

This volume of selected and peer-reviewed contributions on the latest developments in time series analysis and forecasting updates the reader on topics such as analysis of irregularly sampled time series, multi-scale analysis of univariate and multivariate time series, linear and non-linear time series models, advanced time series forecasting methods, applications in time series analysis and forecasting, advanced methods and online learning in time series and high-dimensional and complex/big data time series. The contributions were originally presented at the International Work-Conference on Time Series, ITISE 2016, held in Granada, Spain, June 27-29, 2016. The series of ITISE conferences provides a forum for scientists, engineers, educators and students to discuss the latest ideas and implementations in the foundations, theory, models and applications in the field of time series analysis and forecasting. It focuses on interdisciplinary and multidisciplinary rese arch encompassing the disciplines of comput...
Forecasting business cycle with chaotic time series based on neural network with weighted fuzzy membership functions

International Nuclear Information System (INIS)

Chai, Soo H.; Lim, Joon S.

2016-01-01

This study presents a forecasting model of cyclical fluctuations of the economy based on the time delay coordinate embedding method. The model uses a neuro-fuzzy network called neural network with weighted fuzzy membership functions (NEWFM). The preprocessed time series of the leading composite index using the time delay coordinate embedding method are used as input data to the NEWFM to forecast the business cycle. A comparative study is conducted using other methods based on wavelet transform and Principal Component Analysis for the performance comparison. The forecasting results are tested using a linear regression analysis to compare the approximation of the input data against the target class, gross domestic product (GDP). The chaos based model captures nonlinear dynamics and interactions within the system, which other two models ignore. The test results demonstrated that chaos based method significantly improved the prediction capability, thereby demonstrating superior performance to the other methods.
From Networks to Time Series

Science.gov (United States)

Shimada, Yutaka; Ikeguchi, Tohru; Shigehara, Takaomi

2012-10-01

In this Letter, we propose a framework to transform a complex network to a time series. The transformation from complex networks to time series is realized by the classical multidimensional scaling. Applying the transformation method to a model proposed by Watts and Strogatz [Nature (London) 393, 440 (1998)], we show that ring lattices are transformed to periodic time series, small-world networks to noisy periodic time series, and random networks to random time series. We also show that these relationships are analytically held by using the circulant-matrix theory and the perturbation theory of linear operators. The results are generalized to several high-dimensional lattices.
Predicting prices of agricultural commodities in Thailand using combined approach emphasizing on data pre-processing technique

Directory of Open Access Journals (Sweden)

Thoranin Sujjaviriyasup

2018-02-01

Full Text Available In this research, a combined approach emphasizing on data pre-processing technique is developed to forecast prices of agricultural commodities in Thailand. The future prices play significant role in decision making to cultivate crops in next year. The proposed model takes ability of MODWT to decompose original time series data into more stable and explicit subseries, and SVR model to formulate complex function of forecasting. The experimental results indicated that the proposed model outperforms traditional forecasting models based on MAE and MAPE criteria. Furthermore, the proposed model reveals that it is able to be a useful forecasting tool for prices of agricultural commodities in Thailand
Duality between Time Series and Networks

Science.gov (United States)

Campanharo, Andriana S. L. O.; Sirer, M. Irmak; Malmgren, R. Dean; Ramos, Fernando M.; Amaral, Luís A. Nunes.

2011-01-01

Studying the interaction between a system's components and the temporal evolution of the system are two common ways to uncover and characterize its internal workings. Recently, several maps from a time series to a network have been proposed with the intent of using network metrics to characterize time series. Although these maps demonstrate that different time series result in networks with distinct topological properties, it remains unclear how these topological properties relate to the original time series. Here, we propose a map from a time series to a network with an approximate inverse operation, making it possible to use network statistics to characterize time series and time series statistics to characterize networks. As a proof of concept, we generate an ensemble of time series ranging from periodic to random and confirm that application of the proposed map retains much of the information encoded in the original time series (or networks) after application of the map (or its inverse). Our results suggest that network analysis can be used to distinguish different dynamic regimes in time series and, perhaps more importantly, time series analysis can provide a powerful set of tools that augment the traditional network analysis toolkit to quantify networks in new and useful ways. PMID:21858093
Long time series

DEFF Research Database (Denmark)

Hisdal, H.; Holmqvist, E.; Hyvärinen, V.

Awareness that emission of greenhouse gases will raise the global temperature and change the climate has led to studies trying to identify such changes in long-term climate and hydrologic time series. This report, written by the......Awareness that emission of greenhouse gases will raise the global temperature and change the climate has led to studies trying to identify such changes in long-term climate and hydrologic time series. This report, written by the...
Practical Secure Computation with Pre-Processing

DEFF Research Database (Denmark)

Zakarias, Rasmus Winther

Secure Multiparty Computation has been divided between protocols best suited for binary circuits and protocols best suited for arithmetic circuits. With their MiniMac protocol in [DZ13], Damgård and Zakarias take an important step towards bridging these worlds with an arithmetic protocol tuned...... space for pre-processing material than computing the non-linear parts online (depends on the quality of circuit of course). Surprisingly, even for our optimized AES-circuit this is not the case. We further improve the design of the pre-processing material and end up with only 10 megabyes of pre...... a protocol for small field arithmetic to do fast large integer multipli- cations. This is achieved by devising pre-processing material that allows the Toom-Cook multiplication algorithm to run between the parties with linear communication complexity. With this result computation on the CPU by the parties...
Ensemble preprocessing of near-infrared (NIR) spectra for multivariate calibration

International Nuclear Information System (INIS)

Xu Lu; Zhou Yanping; Tang Lijuan; Wu Hailong; Jiang Jianhui; Shen Guoli; Yu Ruqin

2008-01-01

Preprocessing of raw near-infrared (NIR) spectral data is indispensable in multivariate calibration when the measured spectra are subject to significant noises, baselines and other undesirable factors. However, due to the lack of sufficient prior information and an incomplete knowledge of the raw data, NIR spectra preprocessing in multivariate calibration is still trial and error. How to select a proper method depends largely on both the nature of the data and the expertise and experience of the practitioners. This might limit the applications of multivariate calibration in many fields, where researchers are not very familiar with the characteristics of many preprocessing methods unique in chemometrics and have difficulties to select the most suitable methods. Another problem is many preprocessing methods, when used alone, might degrade the data in certain aspects or lose some useful information while improving certain qualities of the data. In order to tackle these problems, this paper proposes a new concept of data preprocessing, ensemble preprocessing method, where partial least squares (PLSs) models built on differently preprocessed data are combined by Monte Carlo cross validation (MCCV) stacked regression. Little or no prior information of the data and expertise are required. Moreover, fusion of complementary information obtained by different preprocessing methods often leads to a more stable and accurate calibration model. The investigation of two real data sets has demonstrated the advantages of the proposed method

A Course in Time Series Analysis

CERN Document Server

Peña, Daniel; Tsay, Ruey S

2011-01-01

New statistical methods and future directions of research in time series A Course in Time Series Analysis demonstrates how to build time series models for univariate and multivariate time series data. It brings together material previously available only in the professional literature and presents a unified view of the most advanced procedures available for time series model building. The authors begin with basic concepts in univariate time series, providing an up-to-date presentation of ARIMA models, including the Kalman filter, outlier analysis, automatic methods for building ARIMA models, a
Comparison of planar images and SPECT with bayesean preprocessing for the demonstration of facial anatomy and craniomandibular disorders

International Nuclear Information System (INIS)

Kircos, L.T.; Ortendahl, D.A.; Hattner, R.S.; Faulkner, D.; Taylor, R.L.

1984-01-01

Craniomandiublar disorders involving the facial anatomy may be difficult to demonstrate in planar images. Although bone scanning is generally more sensitive than radiography, facial bone anatomy is complex and focal areas of increased or decreased radiotracer may become obscured by overlapping structures in planar images. Thus SPECT appears ideally suited to examination of the facial skeleton. A series of patients with craniomandibular disorders of unknown origin were imaged using 20 mCi Tc-99m MDP. Planar and SPECT (Siemens 7500 ZLC Orbiter) images were obtained four hours after injection. The SPECT images were reconstructed with a filtered back-projection algorithm. In order to improve image contrast and resolution in SPECT images, the rotation views were pre-processed with a Bayesean deblurring algorithm which has previously been show to offer improved contrast and resolution in planar images. SPECT images using the pre-processed rotation views were obtained and compared to the SPECT images without pre-processing and the planar images. TMJ arthropathy involving either the glenoid fossa or the mandibular condyle, orthopedic changes involving the mandible or maxilla, localized dental pathosis, as well as changes in structures peripheral to the facial skeleton were identified. Bayesean pre-processed SPECT depicted the facial skeleton more clearly as well as providing a more obvious demonstration of the bony changes associated with craniomandibular disorders than either planar images or SPECT without pre-processing
Kolmogorov Space in Time Series Data

OpenAIRE

Kanjamapornkul, K.; Pinčák, R.

2016-01-01

We provide the proof that the space of time series data is a Kolmogorov space with $T_{0}$-separation axiom using the loop space of time series data. In our approach we define a cyclic coordinate of intrinsic time scale of time series data after empirical mode decomposition. A spinor field of time series data comes from the rotation of data around price and time axis by defining a new extradimension to time series data. We show that there exist hidden eight dimensions in Kolmogorov space for ...
On the Use of Running Trends as Summary Statistics for Univariate Time Series and Time Series Association

OpenAIRE

Trottini, Mario; Vigo, Isabel; Belda, Santiago

2015-01-01

Given a time series, running trends analysis (RTA) involves evaluating least squares trends over overlapping time windows of L consecutive time points, with overlap by all but one observation. This produces a new series called the “running trends series,” which is used as summary statistics of the original series for further analysis. In recent years, RTA has been widely used in climate applied research as summary statistics for time series and time series association. There is no doubt that ...
Multiple Indicator Stationary Time Series Models.

Science.gov (United States)

Sivo, Stephen A.

2001-01-01

Discusses the propriety and practical advantages of specifying multivariate time series models in the context of structural equation modeling for time series and longitudinal panel data. For time series data, the multiple indicator model specification improves on classical time series analysis. For panel data, the multiple indicator model…
Time Series Momentum

DEFF Research Database (Denmark)

Moskowitz, Tobias J.; Ooi, Yao Hua; Heje Pedersen, Lasse

2012-01-01

We document significant “time series momentum” in equity index, currency, commodity, and bond futures for each of the 58 liquid instruments we consider. We find persistence in returns for one to 12 months that partially reverses over longer horizons, consistent with sentiment theories of initial...... under-reaction and delayed over-reaction. A diversified portfolio of time series momentum strategies across all asset classes delivers substantial abnormal returns with little exposure to standard asset pricing factors and performs best during extreme markets. Examining the trading activities...
International Work-Conference on Time Series

CERN Document Server

Pomares, Héctor

2016-01-01

This volume presents selected peer-reviewed contributions from The International Work-Conference on Time Series, ITISE 2015, held in Granada, Spain, July 1-3, 2015. It discusses topics in time series analysis and forecasting, advanced methods and online learning in time series, high-dimensional and complex/big data time series as well as forecasting in real problems. The International Work-Conferences on Time Series (ITISE) provide a forum for scientists, engineers, educators and students to discuss the latest ideas and implementations in the foundations, theory, models and applications in the field of time series analysis and forecasting. It focuses on interdisciplinary and multidisciplinary research encompassing the disciplines of computer science, mathematics, statistics and econometrics.
Robust preprocessing for stimulus-based functional MRI of the moving fetus.

Science.gov (United States)

You, Wonsang; Evangelou, Iordanis E; Zun, Zungho; Andescavage, Nickie; Limperopoulos, Catherine

2016-04-01

Fetal motion manifests as signal degradation and image artifact in the acquired time series of blood oxygen level dependent (BOLD) functional magnetic resonance imaging (fMRI) studies. We present a robust preprocessing pipeline to specifically address fetal and placental motion-induced artifacts in stimulus-based fMRI with slowly cycled block design in the living fetus. In the proposed pipeline, motion correction is optimized to the experimental paradigm, and it is performed separately in each phase as well as in each region of interest (ROI), recognizing that each phase and organ experiences different types of motion. To obtain the averaged BOLD signals for each ROI, both misaligned volumes and noisy voxels are automatically detected and excluded, and the missing data are then imputed by statistical estimation based on local polynomial smoothing. Our experimental results demonstrate that the proposed pipeline was effective in mitigating the motion-induced artifacts in stimulus-based fMRI data of the fetal brain and placenta.
Evaluating the impact of image preprocessing on iris segmentation

Directory of Open Access Journals (Sweden)

José F. Valencia-Murillo

2014-08-01

Full Text Available Segmentation is one of the most important stages in iris recognition systems. In this paper, image preprocessing algorithms are applied in order to evaluate their impact on successful iris segmentation. The preprocessing algorithms are based on histogram adjustment, Gaussian filters and suppression of specular reflections in human eye images. The segmentation method introduced by Masek is applied on 199 images acquired under unconstrained conditions, belonging to the CASIA-irisV3 database, before and after applying the preprocessing algorithms. Then, the impact of image preprocessing algorithms on the percentage of successful iris segmentation is evaluated by means of a visual inspection of images in order to determine if circumferences of iris and pupil were detected correctly. An increase from 59% to 73% in percentage of successful iris segmentation is obtained with an algorithm that combine elimination of specular reflections, followed by the implementation of a Gaussian filter having a 5x5 kernel. The results highlight the importance of a preprocessing stage as a previous step in order to improve the performance during the edge detection and iris segmentation processes.
Stochastic models for time series

CERN Document Server

Doukhan, Paul

2018-01-01

This book presents essential tools for modelling non-linear time series. The first part of the book describes the main standard tools of probability and statistics that directly apply to the time series context to obtain a wide range of modelling possibilities. Functional estimation and bootstrap are discussed, and stationarity is reviewed. The second part describes a number of tools from Gaussian chaos and proposes a tour of linear time series models. It goes on to address nonlinearity from polynomial or chaotic models for which explicit expansions are available, then turns to Markov and non-Markov linear models and discusses Bernoulli shifts time series models. Finally, the volume focuses on the limit theory, starting with the ergodic theorem, which is seen as the first step for statistics of time series. It defines the distributional range to obtain generic tools for limit theory under long or short-range dependences (LRD/SRD) and explains examples of LRD behaviours. More general techniques (central limit ...
Graphical Data Analysis on the Circle: Wrap-Around Time Series Plots for (Interrupted) Time Series Designs.

Science.gov (United States)

Rodgers, Joseph Lee; Beasley, William Howard; Schuelke, Matthew

2014-01-01

Many data structures, particularly time series data, are naturally seasonal, cyclical, or otherwise circular. Past graphical methods for time series have focused on linear plots. In this article, we move graphical analysis onto the circle. We focus on 2 particular methods, one old and one new. Rose diagrams are circular histograms and can be produced in several different forms using the RRose software system. In addition, we propose, develop, illustrate, and provide software support for a new circular graphical method, called Wrap-Around Time Series Plots (WATS Plots), which is a graphical method useful to support time series analyses in general but in particular in relation to interrupted time series designs. We illustrate the use of WATS Plots with an interrupted time series design evaluating the effect of the Oklahoma City bombing on birthrates in Oklahoma County during the 10 years surrounding the bombing of the Murrah Building in Oklahoma City. We compare WATS Plots with linear time series representations and overlay them with smoothing and error bands. Each method is shown to have advantages in relation to the other; in our example, the WATS Plots more clearly show the existence and effect size of the fertility differential.
Preprocessing Moist Lignocellulosic Biomass for Biorefinery Feedstocks

Energy Technology Data Exchange (ETDEWEB)

Neal Yancey; Christopher T. Wright; Craig Conner; J. Richard Hess

2009-06-01

Biomass preprocessing is one of the primary operations in the feedstock assembly system of a lignocellulosic biorefinery. Preprocessing is generally accomplished using industrial grinders to format biomass materials into a suitable biorefinery feedstock for conversion to ethanol and other bioproducts. Many factors affect machine efficiency and the physical characteristics of preprocessed biomass. For example, moisture content of the biomass as received from the point of production has a significant impact on overall system efficiency and can significantly affect the characteristics (particle size distribution, flowability, storability, etc.) of the size-reduced biomass. Many different grinder configurations are available on the market, each with advantages under specific conditions. Ultimately, the capacity and/or efficiency of the grinding process can be enhanced by selecting the grinder configuration that optimizes grinder performance based on moisture content and screen size. This paper discusses the relationships of biomass moisture with respect to preprocessing system performance and product physical characteristics and compares data obtained on corn stover, switchgrass, and wheat straw as model feedstocks during Vermeer HG 200 grinder testing. During the tests, grinder screen configuration and biomass moisture content were varied and tested to provide a better understanding of their relative impact on machine performance and the resulting feedstock physical characteristics and uniformity relative to each crop tested.
Time Series with Long Memory

OpenAIRE

西埜, 晴久

2004-01-01

The paper investigates an application of long-memory processes to economic time series. We show properties of long-memory processes, which are motivated to model a long-memory phenomenon in economic time series. An FARIMA model is described as an example of long-memory model in statistical terms. The paper explains basic limit theorems and estimation methods for long-memory processes in order to apply long-memory models to economic time series.
Comparison of pre-processing methods for multiplex bead-based immunoassays.

Science.gov (United States)

Rausch, Tanja K; Schillert, Arne; Ziegler, Andreas; Lüking, Angelika; Zucht, Hans-Dieter; Schulz-Knappe, Peter

2016-08-11

High throughput protein expression studies can be performed using bead-based protein immunoassays, such as the Luminex® xMAP® technology. Technical variability is inherent to these experiments and may lead to systematic bias and reduced power. To reduce technical variability, data pre-processing is performed. However, no recommendations exist for the pre-processing of Luminex® xMAP® data. We compared 37 different data pre-processing combinations of transformation and normalization methods in 42 samples on 384 analytes obtained from a multiplex immunoassay based on the Luminex® xMAP® technology. We evaluated the performance of each pre-processing approach with 6 different performance criteria. Three performance criteria were plots. All plots were evaluated by 15 independent and blinded readers. Four different combinations of transformation and normalization methods performed well as pre-processing procedure for this bead-based protein immunoassay. The following combinations of transformation and normalization were suitable for pre-processing Luminex® xMAP® data in this study: weighted Box-Cox followed by quantile or robust spline normalization (rsn), asinh transformation followed by loess normalization and Box-Cox followed by rsn.
Visibility Graph Based Time Series Analysis.

Science.gov (United States)

Stephen, Mutua; Gu, Changgui; Yang, Huijie

2015-01-01

Network based time series analysis has made considerable achievements in the recent years. By mapping mono/multivariate time series into networks, one can investigate both it's microscopic and macroscopic behaviors. However, most proposed approaches lead to the construction of static networks consequently providing limited information on evolutionary behaviors. In the present paper we propose a method called visibility graph based time series analysis, in which series segments are mapped to visibility graphs as being descriptions of the corresponding states and the successively occurring states are linked. This procedure converts a time series to a temporal network and at the same time a network of networks. Findings from empirical records for stock markets in USA (S&P500 and Nasdaq) and artificial series generated by means of fractional Gaussian motions show that the method can provide us rich information benefiting short-term and long-term predictions. Theoretically, we propose a method to investigate time series from the viewpoint of network of networks.
Visibility Graph Based Time Series Analysis.

Directory of Open Access Journals (Sweden)

Mutua Stephen

Full Text Available Network based time series analysis has made considerable achievements in the recent years. By mapping mono/multivariate time series into networks, one can investigate both it's microscopic and macroscopic behaviors. However, most proposed approaches lead to the construction of static networks consequently providing limited information on evolutionary behaviors. In the present paper we propose a method called visibility graph based time series analysis, in which series segments are mapped to visibility graphs as being descriptions of the corresponding states and the successively occurring states are linked. This procedure converts a time series to a temporal network and at the same time a network of networks. Findings from empirical records for stock markets in USA (S&P500 and Nasdaq and artificial series generated by means of fractional Gaussian motions show that the method can provide us rich information benefiting short-term and long-term predictions. Theoretically, we propose a method to investigate time series from the viewpoint of network of networks.
Effects of preprocessing method on TVOC emission of car mat

Science.gov (United States)

Wang, Min; Jia, Li

2013-02-01

The effects of the mat preprocessing method on total volatile organic compounds (TVOC) emission of car mat are studied in this paper. An appropriate TVOC emission period for car mat is suggested. The emission factors for total volatile organic compounds from three kinds of new car mats are discussed. The car mats are preprocessed by washing, baking and ventilation. When car mats are preprocessed by washing, the TVOC emission for all samples tested are lower than that preprocessed in other methods. The TVOC emission is in stable situation for a minimum of 4 days. The TVOC emitted from some samples may exceed 2500μg/kg. But the TVOC emitted from washed Polyamide (PA) and wool mat is less than 2500μg/kg. The emission factors of total volatile organic compounds (TVOC) are experimentally investigated in the case of different preprocessing methods. The air temperature in environment chamber and the water temperature for washing are important factors influencing on emission of car mats.
An Innovative Hybrid Model Based on Data Pre-Processing and Modified Optimization Algorithm and Its Application in Wind Speed Forecasting

Directory of Open Access Journals (Sweden)

Ping Jiang

2017-07-01

Full Text Available Wind speed forecasting has an unsuperseded function in the high-efficiency operation of wind farms, and is significant in wind-related engineering studies. Back-propagation (BP algorithms have been comprehensively employed to forecast time series that are nonlinear, irregular, and unstable. However, the single model usually overlooks the importance of data pre-processing and parameter optimization of the model, which results in weak forecasting performance. In this paper, a more precise and robust model that combines data pre-processing, BP neural network, and a modified artificial intelligence optimization algorithm was proposed, which succeeded in avoiding the limitations of the individual algorithm. The novel model not only improves the forecasting accuracy but also retains the advantages of the firefly algorithm (FA and overcomes the disadvantage of the FA while optimizing in the later stage. To verify the forecasting performance of the presented hybrid model, 10-min wind speed data from Penglai city, Shandong province, China, were analyzed in this study. The simulations revealed that the proposed hybrid model significantly outperforms other single metaheuristics.
Network structure of multivariate time series.

Science.gov (United States)

Lacasa, Lucas; Nicosia, Vincenzo; Latora, Vito

2015-10-21

Our understanding of a variety of phenomena in physics, biology and economics crucially depends on the analysis of multivariate time series. While a wide range tools and techniques for time series analysis already exist, the increasing availability of massive data structures calls for new approaches for multidimensional signal processing. We present here a non-parametric method to analyse multivariate time series, based on the mapping of a multidimensional time series into a multilayer network, which allows to extract information on a high dimensional dynamical system through the analysis of the structure of the associated multiplex network. The method is simple to implement, general, scalable, does not require ad hoc phase space partitioning, and is thus suitable for the analysis of large, heterogeneous and non-stationary time series. We show that simple structural descriptors of the associated multiplex networks allow to extract and quantify nontrivial properties of coupled chaotic maps, including the transition between different dynamical phases and the onset of various types of synchronization. As a concrete example we then study financial time series, showing that a multiplex network analysis can efficiently discriminate crises from periods of financial stability, where standard methods based on time-series symbolization often fail.
What marketing scholars should know about time series analysis : time series applications in marketing

NARCIS (Netherlands)

Horváth, Csilla; Kornelis, Marcel; Leeflang, Peter S.H.

2002-01-01

In this review, we give a comprehensive summary of time series techniques in marketing, and discuss a variety of time series analysis (TSA) techniques and models. We classify them in the sets (i) univariate TSA, (ii) multivariate TSA, and (iii) multiple TSA. We provide relevant marketing

Data mining in time series databases

CERN Document Server

Kandel, Abraham; Bunke, Horst

2004-01-01

Adding the time dimension to real-world databases produces Time SeriesDatabases (TSDB) and introduces new aspects and difficulties to datamining and knowledge discovery. This book covers the state-of-the-artmethodology for mining time series databases. The novel data miningmethods presented in the book include techniques for efficientsegmentation, indexing, and classification of noisy and dynamic timeseries. A graph-based method for anomaly detection in time series isdescribed and the book also studies the implications of a novel andpotentially useful representation of time series as strings. Theproblem of detecting changes in data mining models that are inducedfrom temporal databases is additionally discussed.
Models for dependent time series

CERN Document Server

Tunnicliffe Wilson, Granville; Haywood, John

2015-01-01

Models for Dependent Time Series addresses the issues that arise and the methodology that can be applied when the dependence between time series is described and modeled. Whether you work in the economic, physical, or life sciences, the book shows you how to draw meaningful, applicable, and statistically valid conclusions from multivariate (or vector) time series data.The first four chapters discuss the two main pillars of the subject that have been developed over the last 60 years: vector autoregressive modeling and multivariate spectral analysis. These chapters provide the foundational mater
The 1996 ENDF pre-processing codes

International Nuclear Information System (INIS)

Cullen, D.E.

1996-01-01

The codes are named 'the Pre-processing' codes, because they are designed to pre-process ENDF/B data, for later, further processing for use in applications. This is a modular set of computer codes, each of which reads and writes evaluated nuclear data in the ENDF/B format. Each code performs one or more independent operations on the data, as described below. These codes are designed to be computer independent, and are presently operational on every type of computer from large mainframe computer to small personal computers, such as IBM-PC and Power MAC. The codes are available from the IAEA Nuclear Data Section, free of charge upon request. (author)
Visual time series analysis

DEFF Research Database (Denmark)

Fischer, Paul; Hilbert, Astrid

2012-01-01

We introduce a platform which supplies an easy-to-handle, interactive, extendable, and fast analysis tool for time series analysis. In contrast to other software suits like Maple, Matlab, or R, which use a command-line-like interface and where the user has to memorize/look-up the appropriate...... commands, our application is select-and-click-driven. It allows to derive many different sequences of deviations for a given time series and to visualize them in different ways in order to judge their expressive power and to reuse the procedure found. For many transformations or model-ts, the user may...... choose between manual and automated parameter selection. The user can dene new transformations and add them to the system. The application contains efficient implementations of advanced and recent techniques for time series analysis including techniques related to extreme value analysis and filtering...
A Review of Subsequence Time Series Clustering

Directory of Open Access Journals (Sweden)

Seyedjamal Zolhavarieh

2014-01-01

Full Text Available Clustering of subsequence time series remains an open issue in time series clustering. Subsequence time series clustering is used in different fields, such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. One of the useful fields in the domain of subsequence time series clustering is pattern recognition. To improve this field, a sequence of time series data is used. This paper reviews some definitions and backgrounds related to subsequence time series clustering. The categorization of the literature reviews is divided into three groups: preproof, interproof, and postproof period. Moreover, various state-of-the-art approaches in performing subsequence time series clustering are discussed under each of the following categories. The strengths and weaknesses of the employed methods are evaluated as potential issues for future studies.
A review of subsequence time series clustering.

Science.gov (United States)

Zolhavarieh, Seyedjamal; Aghabozorgi, Saeed; Teh, Ying Wah

2014-01-01

Clustering of subsequence time series remains an open issue in time series clustering. Subsequence time series clustering is used in different fields, such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. One of the useful fields in the domain of subsequence time series clustering is pattern recognition. To improve this field, a sequence of time series data is used. This paper reviews some definitions and backgrounds related to subsequence time series clustering. The categorization of the literature reviews is divided into three groups: preproof, interproof, and postproof period. Moreover, various state-of-the-art approaches in performing subsequence time series clustering are discussed under each of the following categories. The strengths and weaknesses of the employed methods are evaluated as potential issues for future studies.
A Review of Subsequence Time Series Clustering

Science.gov (United States)

Teh, Ying Wah

2014-01-01

Clustering of subsequence time series remains an open issue in time series clustering. Subsequence time series clustering is used in different fields, such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. One of the useful fields in the domain of subsequence time series clustering is pattern recognition. To improve this field, a sequence of time series data is used. This paper reviews some definitions and backgrounds related to subsequence time series clustering. The categorization of the literature reviews is divided into three groups: preproof, interproof, and postproof period. Moreover, various state-of-the-art approaches in performing subsequence time series clustering are discussed under each of the following categories. The strengths and weaknesses of the employed methods are evaluated as potential issues for future studies. PMID:25140332
Analysis of Heavy-Tailed Time Series

DEFF Research Database (Denmark)

Xie, Xiaolei

This thesis is about analysis of heavy-tailed time series. We discuss tail properties of real-world equity return series and investigate the possibility that a single tail index is shared by all return series of actively traded equities in a market. Conditions for this hypothesis to be true...... are identified. We study the eigenvalues and eigenvectors of sample covariance and sample auto-covariance matrices of multivariate heavy-tailed time series, and particularly for time series with very high dimensions. Asymptotic approximations of the eigenvalues and eigenvectors of such matrices are found...... and expressed in terms of the parameters of the dependence structure, among others. Furthermore, we study an importance sampling method for estimating rare-event probabilities of multivariate heavy-tailed time series generated by matrix recursion. We show that the proposed algorithm is efficient in the sense...
Voice preprocessing system incorporating a real-time spectrum analyzer with programmable switched-capacitor filters

Science.gov (United States)

Knapp, G.

1984-01-01

As part of a speaker verification program for BISS (Base Installation Security System), a test system is being designed with a flexible preprocessing system for the evaluation of voice spectrum/verification algorithm related problems. The main part of this report covers the design, construction, and testing of a voice analyzer with 16 integrating real-time frequency channels ranging from 300 Hz to 3 KHz. The bandpass filter response of each channel is programmable by NMOS switched capacitor quad filter arrays. Presently, the accuracy of these units is limited to a moderate precision by the finite steps of programming. However, repeatability of characteristics between filter units and sections seems to be excellent for the implemented fourth-order Butterworth bandpass responses. We obtained a 0.1 dB linearity error of signal detection and measured a signal-to-noise ratio of approximately 70 dB. The proprocessing system discussed includes preemphasis filter design, gain normalizer design, and data acquisition system design as well as test results.
Adaptive time-variant models for fuzzy-time-series forecasting.

Science.gov (United States)

Wong, Wai-Keung; Bai, Enjian; Chu, Alice Wai-Ching

2010-12-01

A fuzzy time series has been applied to the prediction of enrollment, temperature, stock indices, and other domains. Related studies mainly focus on three factors, namely, the partition of discourse, the content of forecasting rules, and the methods of defuzzification, all of which greatly influence the prediction accuracy of forecasting models. These studies use fixed analysis window sizes for forecasting. In this paper, an adaptive time-variant fuzzy-time-series forecasting model (ATVF) is proposed to improve forecasting accuracy. The proposed model automatically adapts the analysis window size of fuzzy time series based on the prediction accuracy in the training phase and uses heuristic rules to generate forecasting values in the testing phase. The performance of the ATVF model is tested using both simulated and actual time series including the enrollments at the University of Alabama, Tuscaloosa, and the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX). The experiment results show that the proposed ATVF model achieves a significant improvement in forecasting accuracy as compared to other fuzzy-time-series forecasting models.
Interactive Web-based Visualization of Atomic Position-time Series Data

Science.gov (United States)

Thapa, S.; Karki, B. B.

2017-12-01

Extracting and interpreting the information contained in large sets of time-varying three dimensional positional data for the constituent atoms of simulated material is a challenging task. We have recently implemented a web-based visualization system to analyze the position-time series data extracted from the local or remote hosts. It involves a pre-processing step for data reduction, which involves skipping uninteresting parts of the data uniformly (at full atomic configuration level) or non-uniformly (at atomic species level or individual atom level). Atomic configuration snapshot is rendered using the ball-stick representation and can be animated by rendering successive configurations. The entire atomic dynamics can be captured as the trajectories by rendering the atomic positions at all time steps together as points. The trajectories can be manipulated at both species and atomic levels so that we can focus on one or more trajectories of interest, and can be also superimposed with the instantaneous atomic structure. The implementation was done using WebGL and Three.js for graphical rendering, HTML5 and Javascript for GUI, and Elasticsearch and JSON for data storage and retrieval within the Grails Framework. We have applied our visualization system to the simulation datatsets for proton-bearing forsterite (Mg2SiO4) - an abundant mineral of Earths upper mantle. Visualization reveals that protons (hydrogen ions) incorporated as interstitials are much more mobile than protons substituting the host Mg and Si cation sites. The proton diffusion appears to be anisotropic with high mobility along the x-direction, showing limited discrete jumps in other two directions.
Time Series Analysis and Forecasting by Example

CERN Document Server

Bisgaard, Soren

2011-01-01

An intuition-based approach enables you to master time series analysis with ease Time Series Analysis and Forecasting by Example provides the fundamental techniques in time series analysis using various examples. By introducing necessary theory through examples that showcase the discussed topics, the authors successfully help readers develop an intuitive understanding of seemingly complicated time series models and their implications. The book presents methodologies for time series analysis in a simplified, example-based approach. Using graphics, the authors discuss each presented example in
Time series with tailored nonlinearities

Science.gov (United States)

Räth, C.; Laut, I.

2015-10-01

It is demonstrated how to generate time series with tailored nonlinearities by inducing well-defined constraints on the Fourier phases. Correlations between the phase information of adjacent phases and (static and dynamic) measures of nonlinearities are established and their origin is explained. By applying a set of simple constraints on the phases of an originally linear and uncorrelated Gaussian time series, the observed scaling behavior of the intensity distribution of empirical time series can be reproduced. The power law character of the intensity distributions being typical for, e.g., turbulence and financial data can thus be explained in terms of phase correlations.
A survey of visual preprocessing and shape representation techniques

Science.gov (United States)

Olshausen, Bruno A.

1988-01-01

Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).
Influence of Time-Series Normalization, Number of Nodes, Connectivity and Graph Measure Selection on Seizure-Onset Zone Localization from Intracranial EEG.

Science.gov (United States)

van Mierlo, Pieter; Lie, Octavian; Staljanssens, Willeke; Coito, Ana; Vulliémoz, Serge

2018-04-26

We investigated the influence of processing steps in the estimation of multivariate directed functional connectivity during seizures recorded with intracranial EEG (iEEG) on seizure-onset zone (SOZ) localization. We studied the effect of (i) the number of nodes, (ii) time-series normalization, (iii) the choice of multivariate time-varying connectivity measure: Adaptive Directed Transfer Function (ADTF) or Adaptive Partial Directed Coherence (APDC) and (iv) graph theory measure: outdegree or shortest path length. First, simulations were performed to quantify the influence of the various processing steps on the accuracy to localize the SOZ. Afterwards, the SOZ was estimated from a 113-electrodes iEEG seizure recording and compared with the resection that rendered the patient seizure-free. The simulations revealed that ADTF is preferred over APDC to localize the SOZ from ictal iEEG recordings. Normalizing the time series before analysis resulted in an increase of 25-35% of correctly localized SOZ, while adding more nodes to the connectivity analysis led to a moderate decrease of 10%, when comparing 128 with 32 input nodes. The real-seizure connectivity estimates localized the SOZ inside the resection area using the ADTF coupled to outdegree or shortest path length. Our study showed that normalizing the time-series is an important pre-processing step, while adding nodes to the analysis did only marginally affect the SOZ localization. The study shows that directed multivariate Granger-based connectivity analysis is feasible with many input nodes (> 100) and that normalization of the time-series before connectivity analysis is preferred.
Clustering of financial time series

Science.gov (United States)

D'Urso, Pierpaolo; Cappelli, Carmela; Di Lallo, Dario; Massari, Riccardo

2013-05-01

This paper addresses the topic of classifying financial time series in a fuzzy framework proposing two fuzzy clustering models both based on GARCH models. In general clustering of financial time series, due to their peculiar features, needs the definition of suitable distance measures. At this aim, the first fuzzy clustering model exploits the autoregressive representation of GARCH models and employs, in the framework of a partitioning around medoids algorithm, the classical autoregressive metric. The second fuzzy clustering model, also based on partitioning around medoids algorithm, uses the Caiado distance, a Mahalanobis-like distance, based on estimated GARCH parameters and covariances that takes into account the information about the volatility structure of time series. In order to illustrate the merits of the proposed fuzzy approaches an application to the problem of classifying 29 time series of Euro exchange rates against international currencies is presented and discussed, also comparing the fuzzy models with their crisp version.
Data Mining Smart Energy Time Series

Directory of Open Access Journals (Sweden)

Janina POPEANGA

2015-07-01

Full Text Available With the advent of smart metering technology the amount of energy data will increase significantly and utilities industry will have to face another big challenge - to find relationships within time-series data and even more - to analyze such huge numbers of time series to find useful patterns and trends with fast or even real-time response. This study makes a small review of the literature in the field, trying to demonstrate how essential is the application of data mining techniques in the time series to make the best use of this large quantity of data, despite all the difficulties. Also, the most important Time Series Data Mining techniques are presented, highlighting their applicability in the energy domain.
Predicting chaotic time series

International Nuclear Information System (INIS)

Farmer, J.D.; Sidorowich, J.J.

1987-01-01

We present a forecasting technique for chaotic data. After embedding a time series in a state space using delay coordinates, we ''learn'' the induced nonlinear mapping using local approximation. This allows us to make short-term predictions of the future behavior of a time series, using information based only on past values. We present an error estimate for this technique, and demonstrate its effectiveness by applying it to several examples, including data from the Mackey-Glass delay differential equation, Rayleigh-Benard convection, and Taylor-Couette flow
Measuring multiscaling in financial time-series

International Nuclear Information System (INIS)

Buonocore, R.J.; Aste, T.; Di Matteo, T.

2016-01-01

We discuss the origin of multiscaling in financial time-series and investigate how to best quantify it. Our methodology consists in separating the different sources of measured multifractality by analyzing the multi/uni-scaling behavior of synthetic time-series with known properties. We use the results from the synthetic time-series to interpret the measure of multifractality of real log-returns time-series. The main finding is that the aggregation horizon of the returns can introduce a strong bias effect on the measure of multifractality. This effect can become especially important when returns distributions have power law tails with exponents in the range (2, 5). We discuss the right aggregation horizon to mitigate this bias.
Time averaging, ageing and delay analysis of financial time series

Science.gov (United States)

Cherstvy, Andrey G.; Vinod, Deepak; Aghion, Erez; Chechkin, Aleksei V.; Metzler, Ralf

2017-06-01

We introduce three strategies for the analysis of financial time series based on time averaged observables. These comprise the time averaged mean squared displacement (MSD) as well as the ageing and delay time methods for varying fractions of the financial time series. We explore these concepts via statistical analysis of historic time series for several Dow Jones Industrial indices for the period from the 1960s to 2015. Remarkably, we discover a simple universal law for the delay time averaged MSD. The observed features of the financial time series dynamics agree well with our analytical results for the time averaged measurables for geometric Brownian motion, underlying the famed Black-Scholes-Merton model. The concepts we promote here are shown to be useful for financial data analysis and enable one to unveil new universal features of stock market dynamics.

Applied time series analysis

CERN Document Server

Woodward, Wayne A; Elliott, Alan C

2011-01-01

""There is scarcely a standard technique that the reader will find left out … this book is highly recommended for those requiring a ready introduction to applicable methods in time series and serves as a useful resource for pedagogical purposes.""-International Statistical Review (2014), 82""Current time series theory for practice is well summarized in this book.""-Emmanuel Parzen, Texas A&M University""What an extraordinary range of topics covered, all very insightfully. I like [the authors'] innovations very much, such as the AR factor table.""-David Findley, U.S. Census Bureau (retired)""…
Scientific data products and the data pre-processing subsystem of the Chang'e-3 mission

International Nuclear Information System (INIS)

Tan Xu; Liu Jian-Jun; Li Chun-Lai; Feng Jian-Qing; Ren Xin; Wang Fen-Fei; Yan Wei; Zuo Wei; Wang Xiao-Qian; Zhang Zhou-Bin

2014-01-01

The Chang'e-3 (CE-3) mission is China's first exploration mission on the surface of the Moon that uses a lander and a rover. Eight instruments that form the scientific payloads have the following objectives: (1) investigate the morphological features and geological structures at the landing site; (2) integrated in-situ analysis of minerals and chemical compositions; (3) integrated exploration of the structure of the lunar interior; (4) exploration of the lunar-terrestrial space environment, lunar surface environment and acquire Moon-based ultraviolet astronomical observations. The Ground Research and Application System (GRAS) is in charge of data acquisition and pre-processing, management of the payload in orbit, and managing the data products and their applications. The Data Pre-processing Subsystem (DPS) is a part of GRAS. The task of DPS is the pre-processing of raw data from the eight instruments that are part of CE-3, including channel processing, unpacking, package sorting, calibration and correction, identification of geographical location, calculation of probe azimuth angle, probe zenith angle, solar azimuth angle, and solar zenith angle and so on, and conducting quality checks. These processes produce Level 0, Level 1 and Level 2 data. The computing platform of this subsystem is comprised of a high-performance computing cluster, including a real-time subsystem used for processing Level 0 data and a post-time subsystem for generating Level 1 and Level 2 data. This paper describes the CE-3 data pre-processing method, the data pre-processing subsystem, data classification, data validity and data products that are used for scientific studies
Entropic Analysis of Electromyography Time Series

Science.gov (United States)

Kaufman, Miron; Sung, Paul

2005-03-01

We are in the process of assessing the effectiveness of fractal and entropic measures for the diagnostic of low back pain from surface electromyography (EMG) time series. Surface electromyography (EMG) is used to assess patients with low back pain. In a typical EMG measurement, the voltage is measured every millisecond. We observed back muscle fatiguing during one minute, which results in a time series with 60,000 entries. We characterize the complexity of time series by computing the Shannon entropy time dependence. The analysis of the time series from different relevant muscles from healthy and low back pain (LBP) individuals provides evidence that the level of variability of back muscle activities is much larger for healthy individuals than for individuals with LBP. In general the time dependence of the entropy shows a crossover from a diffusive regime to a regime characterized by long time correlations (self organization) at about 0.01s.
Compact Circuit Preprocesses Accelerometer Output

Science.gov (United States)

Bozeman, Richard J., Jr.

1993-01-01

Compact electronic circuit transfers dc power to, and preprocesses ac output of, accelerometer and associated preamplifier. Incorporated into accelerometer case during initial fabrication or retrofit onto commercial accelerometer. Made of commercial integrated circuits and other conventional components; made smaller by use of micrologic and surface-mount technology.
Quantifying memory in complex physiological time-series.

Science.gov (United States)

Shirazi, Amir H; Raoufy, Mohammad R; Ebadi, Haleh; De Rui, Michele; Schiff, Sami; Mazloom, Roham; Hajizadeh, Sohrab; Gharibzadeh, Shahriar; Dehpour, Ahmad R; Amodio, Piero; Jafari, G Reza; Montagnese, Sara; Mani, Ali R

2013-01-01

In a time-series, memory is a statistical feature that lasts for a period of time and distinguishes the time-series from a random, or memory-less, process. In the present study, the concept of "memory length" was used to define the time period, or scale over which rare events within a physiological time-series do not appear randomly. The method is based on inverse statistical analysis and provides empiric evidence that rare fluctuations in cardio-respiratory time-series are 'forgotten' quickly in healthy subjects while the memory for such events is significantly prolonged in pathological conditions such as asthma (respiratory time-series) and liver cirrhosis (heart-beat time-series). The memory length was significantly higher in patients with uncontrolled asthma compared to healthy volunteers. Likewise, it was significantly higher in patients with decompensated cirrhosis compared to those with compensated cirrhosis and healthy volunteers. We also observed that the cardio-respiratory system has simple low order dynamics and short memory around its average, and high order dynamics around rare fluctuations.
Preprocessing of emotional visual information in the human piriform cortex.

Science.gov (United States)

Schulze, Patrick; Bestgen, Anne-Kathrin; Lech, Robert K; Kuchinke, Lars; Suchan, Boris

2017-08-23

This study examines the processing of visual information by the olfactory system in humans. Recent data point to the processing of visual stimuli by the piriform cortex, a region mainly known as part of the primary olfactory cortex. Moreover, the piriform cortex generates predictive templates of olfactory stimuli to facilitate olfactory processing. This study fills the gap relating to the question whether this region is also capable of preprocessing emotional visual information. To gain insight into the preprocessing and transfer of emotional visual information into olfactory processing, we recorded hemodynamic responses during affective priming using functional magnetic resonance imaging (fMRI). Odors of different valence (pleasant, neutral and unpleasant) were primed by images of emotional facial expressions (happy, neutral and disgust). Our findings are the first to demonstrate that the piriform cortex preprocesses emotional visual information prior to any olfactory stimulation and that the emotional connotation of this preprocessing is subsequently transferred and integrated into an extended olfactory network for olfactory processing.
A wavelet method for modeling and despiking motion artifacts from resting-state fMRI time series

Science.gov (United States)

Patel, Ameera X.; Kundu, Prantik; Rubinov, Mikail; Jones, P. Simon; Vértes, Petra E.; Ersche, Karen D.; Suckling, John; Bullmore, Edward T.

2014-01-01

The impact of in-scanner head movement on functional magnetic resonance imaging (fMRI) signals has long been established as undesirable. These effects have been traditionally corrected by methods such as linear regression of head movement parameters. However, a number of recent independent studies have demonstrated that these techniques are insufficient to remove motion confounds, and that even small movements can spuriously bias estimates of functional connectivity. Here we propose a new data-driven, spatially-adaptive, wavelet-based method for identifying, modeling, and removing non-stationary events in fMRI time series, caused by head movement, without the need for data scrubbing. This method involves the addition of just one extra step, the Wavelet Despike, in standard pre-processing pipelines. With this method, we demonstrate robust removal of a range of different motion artifacts and motion-related biases including distance-dependent connectivity artifacts, at a group and single-subject level, using a range of previously published and new diagnostic measures. The Wavelet Despike is able to accommodate the substantial spatial and temporal heterogeneity of motion artifacts and can consequently remove a range of high and low frequency artifacts from fMRI time series, that may be linearly or non-linearly related to physical movements. Our methods are demonstrated by the analysis of three cohorts of resting-state fMRI data, including two high-motion datasets: a previously published dataset on children (N = 22) and a new dataset on adults with stimulant drug dependence (N = 40). We conclude that there is a real risk of motion-related bias in connectivity analysis of fMRI data, but that this risk is generally manageable, by effective time series denoising strategies designed to attenuate synchronized signal transients induced by abrupt head movements. The Wavelet Despiking software described in this article is freely available for download at www
Statistical criteria for characterizing irradiance time series.

Energy Technology Data Exchange (ETDEWEB)

Stein, Joshua S.; Ellis, Abraham; Hansen, Clifford W.

2010-10-01

We propose and examine several statistical criteria for characterizing time series of solar irradiance. Time series of irradiance are used in analyses that seek to quantify the performance of photovoltaic (PV) power systems over time. Time series of irradiance are either measured or are simulated using models. Simulations of irradiance are often calibrated to or generated from statistics for observed irradiance and simulations are validated by comparing the simulation output to the observed irradiance. Criteria used in this comparison should derive from the context of the analyses in which the simulated irradiance is to be used. We examine three statistics that characterize time series and their use as criteria for comparing time series. We demonstrate these statistics using observed irradiance data recorded in August 2007 in Las Vegas, Nevada, and in June 2009 in Albuquerque, New Mexico.
Homogenising time series: beliefs, dogmas and facts

Science.gov (United States)

Domonkos, P.

2011-06-01

In the recent decades various homogenisation methods have been developed, but the real effects of their application on time series are still not known sufficiently. The ongoing COST action HOME (COST ES0601) is devoted to reveal the real impacts of homogenisation methods more detailed and with higher confidence than earlier. As a part of the COST activity, a benchmark dataset was built whose characteristics approach well the characteristics of real networks of observed time series. This dataset offers much better opportunity than ever before to test the wide variety of homogenisation methods, and analyse the real effects of selected theoretical recommendations. Empirical results show that real observed time series usually include several inhomogeneities of different sizes. Small inhomogeneities often have similar statistical characteristics than natural changes caused by climatic variability, thus the pure application of the classic theory that change-points of observed time series can be found and corrected one-by-one is impossible. However, after homogenisation the linear trends, seasonal changes and long-term fluctuations of time series are usually much closer to the reality than in raw time series. Some problems around detecting multiple structures of inhomogeneities, as well as that of time series comparisons within homogenisation procedures are discussed briefly in the study.
Ocean time-series near Bermuda: Hydrostation S and the US JGOFS Bermuda Atlantic time-series study

Science.gov (United States)

Michaels, Anthony F.; Knap, Anthony H.

1992-01-01

Bermuda is the site of two ocean time-series programs. At Hydrostation S, the ongoing biweekly profiles of temperature, salinity and oxygen now span 37 years. This is one of the longest open-ocean time-series data sets and provides a view of decadal scale variability in ocean processes. In 1988, the U.S. JGOFS Bermuda Atlantic Time-series Study began a wide range of measurements at a frequency of 14-18 cruises each year to understand temporal variability in ocean biogeochemistry. On each cruise, the data range from chemical analyses of discrete water samples to data from electronic packages of hydrographic and optics sensors. In addition, a range of biological and geochemical rate measurements are conducted that integrate over time-periods of minutes to days. This sampling strategy yields a reasonable resolution of the major seasonal patterns and of decadal scale variability. The Sargasso Sea also has a variety of episodic production events on scales of days to weeks and these are only poorly resolved. In addition, there is a substantial amount of mesoscale variability in this region and some of the perceived temporal patterns are caused by the intersection of the biweekly sampling with the natural spatial variability. In the Bermuda time-series programs, we have added a series of additional cruises to begin to assess these other sources of variation and their impacts on the interpretation of the main time-series record. However, the adequate resolution of higher frequency temporal patterns will probably require the introduction of new sampling strategies and some emerging technologies such as biogeochemical moorings and autonomous underwater vehicles.
Multivariate Time Series Decomposition into Oscillation Components.

Science.gov (United States)

Matsuda, Takeru; Komaki, Fumiyasu

2017-08-01

Many time series are considered to be a superposition of several oscillation components. We have proposed a method for decomposing univariate time series into oscillation components and estimating their phases (Matsuda & Komaki, 2017 ). In this study, we extend that method to multivariate time series. We assume that several oscillators underlie the given multivariate time series and that each variable corresponds to a superposition of the projections of the oscillators. Thus, the oscillators superpose on each variable with amplitude and phase modulation. Based on this idea, we develop gaussian linear state-space models and use them to decompose the given multivariate time series. The model parameters are estimated from data using the empirical Bayes method, and the number of oscillators is determined using the Akaike information criterion. Therefore, the proposed method extracts underlying oscillators in a data-driven manner and enables investigation of phase dynamics in a given multivariate time series. Numerical results show the effectiveness of the proposed method. From monthly mean north-south sunspot number data, the proposed method reveals an interesting phase relationship.
Forecasting Enrollments with Fuzzy Time Series.

Science.gov (United States)

Song, Qiang; Chissom, Brad S.

The concept of fuzzy time series is introduced and used to forecast the enrollment of a university. Fuzzy time series, an aspect of fuzzy set theory, forecasts enrollment using a first-order time-invariant model. To evaluate the model, the conventional linear regression technique is applied and the predicted values obtained are compared to the…
Relative effects of statistical preprocessing and postprocessing on a regional hydrological ensemble prediction system

Science.gov (United States)

Sharma, Sanjib; Siddique, Ridwan; Reed, Seann; Ahnert, Peter; Mendoza, Pablo; Mejia, Alfonso

2018-03-01

The relative roles of statistical weather preprocessing and streamflow postprocessing in hydrological ensemble forecasting at short- to medium-range forecast lead times (day 1-7) are investigated. For this purpose, a regional hydrologic ensemble prediction system (RHEPS) is developed and implemented. The RHEPS is comprised of the following components: (i) hydrometeorological observations (multisensor precipitation estimates, gridded surface temperature, and gauged streamflow); (ii) weather ensemble forecasts (precipitation and near-surface temperature) from the National Centers for Environmental Prediction 11-member Global Ensemble Forecast System Reforecast version 2 (GEFSRv2); (iii) NOAA's Hydrology Laboratory-Research Distributed Hydrologic Model (HL-RDHM); (iv) heteroscedastic censored logistic regression (HCLR) as the statistical preprocessor; (v) two statistical postprocessors, an autoregressive model with a single exogenous variable (ARX(1,1)) and quantile regression (QR); and (vi) a comprehensive verification strategy. To implement the RHEPS, 1 to 7 days weather forecasts from the GEFSRv2 are used to force HL-RDHM and generate raw ensemble streamflow forecasts. Forecasting experiments are conducted in four nested basins in the US Middle Atlantic region, ranging in size from 381 to 12 362 km2. Results show that the HCLR preprocessed ensemble precipitation forecasts have greater skill than the raw forecasts. These improvements are more noticeable in the warm season at the longer lead times (> 3 days). Both postprocessors, ARX(1,1) and QR, show gains in skill relative to the raw ensemble streamflow forecasts, particularly in the cool season, but QR outperforms ARX(1,1). The scenarios that implement preprocessing and postprocessing separately tend to perform similarly, although the postprocessing-alone scenario is often more effective. The scenario involving both preprocessing and postprocessing consistently outperforms the other scenarios. In some cases
Forecasting Cryptocurrencies Financial Time Series

DEFF Research Database (Denmark)

Catania, Leopoldo; Grassi, Stefano; Ravazzolo, Francesco

2018-01-01

This paper studies the predictability of cryptocurrencies time series. We compare several alternative univariate and multivariate models in point and density forecasting of four of the most capitalized series: Bitcoin, Litecoin, Ripple and Ethereum. We apply a set of crypto–predictors and rely...
Forecasting Cryptocurrencies Financial Time Series

OpenAIRE

Catania, Leopoldo; Grassi, Stefano; Ravazzolo, Francesco

2018-01-01

This paper studies the predictability of cryptocurrencies time series. We compare several alternative univariate and multivariate models in point and density forecasting of four of the most capitalized series: Bitcoin, Litecoin, Ripple and Ethereum. We apply a set of crypto–predictors and rely on Dynamic Model Averaging to combine a large set of univariate Dynamic Linear Models and several multivariate Vector Autoregressive models with different forms of time variation. We find statistical si...
Time series modeling, computation, and inference

CERN Document Server

Prado, Raquel

2010-01-01

The authors systematically develop a state-of-the-art analysis and modeling of time series. … this book is well organized and well written. The authors present various statistical models for engineers to solve problems in time series analysis. Readers no doubt will learn state-of-the-art techniques from this book.-Hsun-Hsien Chang, Computing Reviews, March 2012My favorite chapters were on dynamic linear models and vector AR and vector ARMA models.-William Seaver, Technometrics, August 2011… a very modern entry to the field of time-series modelling, with a rich reference list of the current lit
Time Series Analysis Forecasting and Control

CERN Document Server

Box, George E P; Reinsel, Gregory C

2011-01-01

A modernized new edition of one of the most trusted books on time series analysis. Since publication of the first edition in 1970, Time Series Analysis has served as one of the most influential and prominent works on the subject. This new edition maintains its balanced presentation of the tools for modeling and analyzing time series and also introduces the latest developments that have occurred n the field over the past decade through applications from areas such as business, finance, and engineering. The Fourth Edition provides a clearly written exploration of the key methods for building, cl
Research on pre-processing of QR Code

Science.gov (United States)

Sun, Haixing; Xia, Haojie; Dong, Ning

2013-10-01

QR code encodes many kinds of information because of its advantages: large storage capacity, high reliability, full arrange of utter-high-speed reading, small printing size and high-efficient representation of Chinese characters, etc. In order to obtain the clearer binarization image from complex background, and improve the recognition rate of QR code, this paper researches on pre-processing methods of QR code (Quick Response Code), and shows algorithms and results of image pre-processing for QR code recognition. Improve the conventional method by changing the Souvola's adaptive text recognition method. Additionally, introduce the QR code Extraction which adapts to different image size, flexible image correction approach, and improve the efficiency and accuracy of QR code image processing.
Costationarity of Locally Stationary Time Series Using costat

OpenAIRE

Cardinali, Alessandro; Nason, Guy P.

2013-01-01

This article describes the R package costat. This package enables a user to (i) perform a test for time series stationarity; (ii) compute and plot time-localized autocovariances, and (iii) to determine and explore any costationary relationship between two locally stationary time series. Two locally stationary time series are said to be costationary if there exists two time-varying combination functions such that the linear combination of the two series with the functions produces another time...
Discrete pre-processing step effects in registration-based pipelines, a preliminary volumetric study on T1-weighted images.

Science.gov (United States)

Muncy, Nathan M; Hedges-Muncy, Ariana M; Kirwan, C Brock

2017-01-01

Pre-processing MRI scans prior to performing volumetric analyses is common practice in MRI studies. As pre-processing steps adjust the voxel intensities, the space in which the scan exists, and the amount of data in the scan, it is possible that the steps have an effect on the volumetric output. To date, studies have compared between and not within pipelines, and so the impact of each step is unknown. This study aims to quantify the effects of pre-processing steps on volumetric measures in T1-weighted scans within a single pipeline. It was our hypothesis that pre-processing steps would significantly impact ROI volume estimations. One hundred fifteen participants from the OASIS dataset were used, where each participant contributed three scans. All scans were then pre-processed using a step-wise pipeline. Bilateral hippocampus, putamen, and middle temporal gyrus volume estimations were assessed following each successive step, and all data were processed by the same pipeline 5 times. Repeated-measures analyses tested for a main effects of pipeline step, scan-rescan (for MRI scanner consistency) and repeated pipeline runs (for algorithmic consistency). A main effect of pipeline step was detected, and interestingly an interaction between pipeline step and ROI exists. No effect for either scan-rescan or repeated pipeline run was detected. We then supply a correction for noise in the data resulting from pre-processing.

Optimization of miRNA-seq data preprocessing.

Science.gov (United States)

Tam, Shirley; Tsao, Ming-Sound; McPherson, John D

2015-11-01

The past two decades of microRNA (miRNA) research has solidified the role of these small non-coding RNAs as key regulators of many biological processes and promising biomarkers for disease. The concurrent development in high-throughput profiling technology has further advanced our understanding of the impact of their dysregulation on a global scale. Currently, next-generation sequencing is the platform of choice for the discovery and quantification of miRNAs. Despite this, there is no clear consensus on how the data should be preprocessed before conducting downstream analyses. Often overlooked, data preprocessing is an essential step in data analysis: the presence of unreliable features and noise can affect the conclusions drawn from downstream analyses. Using a spike-in dilution study, we evaluated the effects of several general-purpose aligners (BWA, Bowtie, Bowtie 2 and Novoalign), and normalization methods (counts-per-million, total count scaling, upper quartile scaling, Trimmed Mean of M, DESeq, linear regression, cyclic loess and quantile) with respect to the final miRNA count data distribution, variance, bias and accuracy of differential expression analysis. We make practical recommendations on the optimal preprocessing methods for the extraction and interpretation of miRNA count data from small RNA-sequencing experiments. © The Author 2015. Published by Oxford University Press.
Detecting nonlinear structure in time series

International Nuclear Information System (INIS)

Theiler, J.

1991-01-01

We describe an approach for evaluating the statistical significance of evidence for nonlinearity in a time series. The formal application of our method requires the careful statement of a null hypothesis which characterizes a candidate linear process, the generation of an ensemble of ''surrogate'' data sets which are similar to the original time series but consistent with the null hypothesis, and the computation of a discriminating statistic for the original and for each of the surrogate data sets. The idea is to test the original time series against the null hypothesis by checking whether the discriminating statistic computed for the original time series differs significantly from the statistics computed for each of the surrogate sets. While some data sets very cleanly exhibit low-dimensional chaos, there are many cases where the evidence is sketchy and difficult to evaluate. We hope to provide a framework within which such claims of nonlinearity can be evaluated. 5 refs., 4 figs
Introduction to time series and forecasting

CERN Document Server

Brockwell, Peter J

2016-01-01

This book is aimed at the reader who wishes to gain a working knowledge of time series and forecasting methods as applied to economics, engineering and the natural and social sciences. It assumes knowledge only of basic calculus, matrix algebra and elementary statistics. This third edition contains detailed instructions for the use of the professional version of the Windows-based computer package ITSM2000, now available as a free download from the Springer Extras website. The logic and tools of time series model-building are developed in detail. Numerous exercises are included and the software can be used to analyze and forecast data sets of the user's own choosing. The book can also be used in conjunction with other time series packages such as those included in R. The programs in ITSM2000 however are menu-driven and can be used with minimal investment of time in the computational details. The core of the book covers stationary processes, ARMA and ARIMA processes, multivariate time series and state-space mod...
Pre-Processing and Modeling Tools for Bigdata

Directory of Open Access Journals (Sweden)

Hashem Hadi

2016-09-01

Full Text Available Modeling tools and operators help the user / developer to identify the processing field on the top of the sequence and to send into the computing module only the data related to the requested result. The remaining data is not relevant and it will slow down the processing. The biggest challenge nowadays is to get high quality processing results with a reduced computing time and costs. To do so, we must review the processing sequence, by adding several modeling tools. The existing processing models do not take in consideration this aspect and focus on getting high calculation performances which will increase the computing time and costs. In this paper we provide a study of the main modeling tools for BigData and a new model based on pre-processing.
TIME SERIES ANALYSIS USING A UNIQUE MODEL OF TRANSFORMATION

Directory of Open Access Journals (Sweden)

Goran Klepac

2007-12-01

Full Text Available REFII1 model is an authorial mathematical model for time series data mining. The main purpose of that model is to automate time series analysis, through a unique transformation model of time series. An advantage of this approach of time series analysis is the linkage of different methods for time series analysis, linking traditional data mining tools in time series, and constructing new algorithms for analyzing time series. It is worth mentioning that REFII model is not a closed system, which means that we have a finite set of methods. At first, this is a model for transformation of values of time series, which prepares data used by different sets of methods based on the same model of transformation in a domain of problem space. REFII model gives a new approach in time series analysis based on a unique model of transformation, which is a base for all kind of time series analysis. The advantage of REFII model is its possible application in many different areas such as finance, medicine, voice recognition, face recognition and text mining.
Frontiers in Time Series and Financial Econometrics

OpenAIRE

Ling, S.; McAleer, M.J.; Tong, H.

2015-01-01

__Abstract__ Two of the fastest growing frontiers in econometrics and quantitative finance are time series and financial econometrics. Significant theoretical contributions to financial econometrics have been made by experts in statistics, econometrics, mathematics, and time series analysis. The purpose of this special issue of the journal on “Frontiers in Time Series and Financial Econometrics” is to highlight several areas of research by leading academics in which novel methods have contrib...
Normalization: A Preprocessing Stage

OpenAIRE

Patro, S. Gopal Krishna; Sahu, Kishore Kumar

2015-01-01

As we know that the normalization is a pre-processing stage of any type problem statement. Especially normalization takes important role in the field of soft computing, cloud computing etc. for manipulation of data like scale down or scale up the range of data before it becomes used for further stage. There are so many normalization techniques are there namely Min-Max normalization, Z-score normalization and Decimal scaling normalization. So by referring these normalization techniques we are ...
Scale-dependent intrinsic entropies of complex time series.

Science.gov (United States)

Yeh, Jia-Rong; Peng, Chung-Kang; Huang, Norden E

2016-04-13

Multi-scale entropy (MSE) was developed as a measure of complexity for complex time series, and it has been applied widely in recent years. The MSE algorithm is based on the assumption that biological systems possess the ability to adapt and function in an ever-changing environment, and these systems need to operate across multiple temporal and spatial scales, such that their complexity is also multi-scale and hierarchical. Here, we present a systematic approach to apply the empirical mode decomposition algorithm, which can detrend time series on various time scales, prior to analysing a signal's complexity by measuring the irregularity of its dynamics on multiple time scales. Simulated time series of fractal Gaussian noise and human heartbeat time series were used to study the performance of this new approach. We show that our method can successfully quantify the fractal properties of the simulated time series and can accurately distinguish modulations in human heartbeat time series in health and disease. © 2016 The Author(s).
Elements of nonlinear time series analysis and forecasting

CERN Document Server

De Gooijer, Jan G

2017-01-01

This book provides an overview of the current state-of-the-art of nonlinear time series analysis, richly illustrated with examples, pseudocode algorithms and real-world applications. Avoiding a “theorem-proof” format, it shows concrete applications on a variety of empirical time series. The book can be used in graduate courses in nonlinear time series and at the same time also includes interesting material for more advanced readers. Though it is largely self-contained, readers require an understanding of basic linear time series concepts, Markov chains and Monte Carlo simulation methods. The book covers time-domain and frequency-domain methods for the analysis of both univariate and multivariate (vector) time series. It makes a clear distinction between parametric models on the one hand, and semi- and nonparametric models/methods on the other. This offers the reader the option of concentrating exclusively on one of these nonlinear time series analysis methods. To make the book as user friendly as possible...
An Energy-Based Similarity Measure for Time Series

Directory of Open Access Journals (Sweden)

Pierre Brunagel

2007-11-01

Full Text Available A new similarity measure, called SimilB, for time series analysis, based on the cross-ÃŽÂ¨B-energy operator (2004, is introduced. ÃŽÂ¨B is a nonlinear measure which quantifies the interaction between two time series. Compared to Euclidean distance (ED or the Pearson correlation coefficient (CC, SimilB includes the temporal information and relative changes of the time series using the first and second derivatives of the time series. SimilB is well suited for both nonstationary and stationary time series and particularly those presenting discontinuities. Some new properties of ÃŽÂ¨B are presented. Particularly, we show that ÃŽÂ¨B as similarity measure is robust to both scale and time shift. SimilB is illustrated with synthetic time series and an artificial dataset and compared to the CC and the ED measures.
An Automated, Adaptive Framework for Optimizing Preprocessing Pipelines in Task-Based Functional MRI.

Directory of Open Access Journals (Sweden)

Nathan W Churchill

Full Text Available BOLD fMRI is sensitive to blood-oxygenation changes correlated with brain function; however, it is limited by relatively weak signal and significant noise confounds. Many preprocessing algorithms have been developed to control noise and improve signal detection in fMRI. Although the chosen set of preprocessing and analysis steps (the "pipeline" significantly affects signal detection, pipelines are rarely quantitatively validated in the neuroimaging literature, due to complex preprocessing interactions. This paper outlines and validates an adaptive resampling framework for evaluating and optimizing preprocessing choices by optimizing data-driven metrics of task prediction and spatial reproducibility. Compared to standard "fixed" preprocessing pipelines, this optimization approach significantly improves independent validation measures of within-subject test-retest, and between-subject activation overlap, and behavioural prediction accuracy. We demonstrate that preprocessing choices function as implicit model regularizers, and that improvements due to pipeline optimization generalize across a range of simple to complex experimental tasks and analysis models. Results are shown for brief scanning sessions (<3 minutes each, demonstrating that with pipeline optimization, it is possible to obtain reliable results and brain-behaviour correlations in relatively small datasets.
Detecting chaos in irregularly sampled time series.

Science.gov (United States)

Kulp, C W

2013-09-01

Recently, Wiebe and Virgin [Chaos 22, 013136 (2012)] developed an algorithm which detects chaos by analyzing a time series' power spectrum which is computed using the Discrete Fourier Transform (DFT). Their algorithm, like other time series characterization algorithms, requires that the time series be regularly sampled. Real-world data, however, are often irregularly sampled, thus, making the detection of chaotic behavior difficult or impossible with those methods. In this paper, a characterization algorithm is presented, which effectively detects chaos in irregularly sampled time series. The work presented here is a modification of Wiebe and Virgin's algorithm and uses the Lomb-Scargle Periodogram (LSP) to compute a series' power spectrum instead of the DFT. The DFT is not appropriate for irregularly sampled time series. However, the LSP is capable of computing the frequency content of irregularly sampled data. Furthermore, a new method of analyzing the power spectrum is developed, which can be useful for differentiating between chaotic and non-chaotic behavior. The new characterization algorithm is successfully applied to irregularly sampled data generated by a model as well as data consisting of observations of variable stars.
Building Chaotic Model From Incomplete Time Series

Science.gov (United States)

Siek, Michael; Solomatine, Dimitri

2010-05-01

This paper presents a number of novel techniques for building a predictive chaotic model from incomplete time series. A predictive chaotic model is built by reconstructing the time-delayed phase space from observed time series and the prediction is made by a global model or adaptive local models based on the dynamical neighbors found in the reconstructed phase space. In general, the building of any data-driven models depends on the completeness and quality of the data itself. However, the completeness of the data availability can not always be guaranteed since the measurement or data transmission is intermittently not working properly due to some reasons. We propose two main solutions dealing with incomplete time series: using imputing and non-imputing methods. For imputing methods, we utilized the interpolation methods (weighted sum of linear interpolations, Bayesian principle component analysis and cubic spline interpolation) and predictive models (neural network, kernel machine, chaotic model) for estimating the missing values. After imputing the missing values, the phase space reconstruction and chaotic model prediction are executed as a standard procedure. For non-imputing methods, we reconstructed the time-delayed phase space from observed time series with missing values. This reconstruction results in non-continuous trajectories. However, the local model prediction can still be made from the other dynamical neighbors reconstructed from non-missing values. We implemented and tested these methods to construct a chaotic model for predicting storm surges at Hoek van Holland as the entrance of Rotterdam Port. The hourly surge time series is available for duration of 1990-1996. For measuring the performance of the proposed methods, a synthetic time series with missing values generated by a particular random variable to the original (complete) time series is utilized. There exist two main performance measures used in this work: (1) error measures between the actual
Multivariate Time Series Search

Data.gov (United States)

National Aeronautics and Space Administration — Multivariate Time-Series (MTS) are ubiquitous, and are generated in areas as disparate as sensor recordings in aerospace systems, music and video streams, medical...
Analysing Stable Time Series

National Research Council Canada - National Science Library

Adler, Robert

1997-01-01

We describe how to take a stable, ARMA, time series through the various stages of model identification, parameter estimation, and diagnostic checking, and accompany the discussion with a goodly number...
Optimal HRF and smoothing parameters for fMRI time series within an autoregressive modeling framework.

Science.gov (United States)

Galka, Andreas; Siniatchkin, Michael; Stephani, Ulrich; Groening, Kristina; Wolff, Stephan; Bosch-Bayard, Jorge; Ozaki, Tohru

2010-12-01

The analysis of time series obtained by functional magnetic resonance imaging (fMRI) may be approached by fitting predictive parametric models, such as nearest-neighbor autoregressive models with exogeneous input (NNARX). As a part of the modeling procedure, it is possible to apply instantaneous linear transformations to the data. Spatial smoothing, a common preprocessing step, may be interpreted as such a transformation. The autoregressive parameters may be constrained, such that they provide a response behavior that corresponds to the canonical haemodynamic response function (HRF). We present an algorithm for estimating the parameters of the linear transformations and of the HRF within a rigorous maximum-likelihood framework. Using this approach, an optimal amount of both the spatial smoothing and the HRF can be estimated simultaneously for a given fMRI data set. An example from a motor-task experiment is discussed. It is found that, for this data set, weak, but non-zero, spatial smoothing is optimal. Furthermore, it is demonstrated that activated regions can be estimated within the maximum-likelihood framework.
Neural Network Models for Time Series Forecasts

OpenAIRE

Tim Hill; Marcus O'Connor; William Remus

1996-01-01

Neural networks have been advocated as an alternative to traditional statistical forecasting methods. In the present experiment, time series forecasts produced by neural networks are compared with forecasts from six statistical time series methods generated in a major forecasting competition (Makridakis et al. [Makridakis, S., A. Anderson, R. Carbone, R. Fildes, M. Hibon, R. Lewandowski, J. Newton, E. Parzen, R. Winkler. 1982. The accuracy of extrapolation (time series) methods: Results of a ...
IceTrendr: a linear time-series approach to monitoring glacier environments using Landsat

Science.gov (United States)

Nelson, P.; Kennedy, R. E.; Nolin, A. W.; Hughes, J. M.; Braaten, J.

2017-12-01

Arctic glaciers in Alaska and Canada have experienced some of the greatest ice mass loss of any region in recent decades. A challenge to understanding these changing ecosystems, however, is developing globally-consistent, multi-decadal monitoring of glacier ice. We present a toolset and approach that captures, labels, and maps glacier change for use in climate science, hydrology, and Earth science education using Landsat Time Series (LTS). The core step is "temporal segmentation," wherein a yearly LTS is cleaned using pre-processing steps, converted to a snow/ice index, and then simplified into the salient shape of the change trajectory ("temporal signature") using linear segmentation. Such signatures can be characterized as simple `stable' or `transition of glacier ice to rock' to more complex multi-year changes like `transition of glacier ice to debris-covered glacier ice to open water to bare rock to vegetation'. This pilot study demonstrates the potential for interactively mapping, visualizing, and labeling glacier changes. What is truly innovative is that IceTrendr not only maps the changes but also uses expert knowledge to label the changes and such labels can be applied to other glaciers exhibiting statistically similar temporal signatures. Our key findings are that the IceTrendr concept and software can provide important functionality for glaciologists and educators interested in studying glacier changes during the Landsat TM timeframe (1984-present). Issues of concern with using dense Landsat time-series approaches for glacier monitoring include many missing images during the period 1984-1995 and that automated cloud mask are challenged and require the user to manually identify cloud-free images. IceTrendr is much more than just a simple "then and now" approach to glacier mapping. This process is a means of integrating the power of computing, remote sensing, and expert knowledge to "tell the story" of glacier changes.
Time Series Observations in the North Indian Ocean

Digital Repository Service at National Institute of Oceanography (India)

Shenoy, D.M.; Naik, H.; Kurian, S.; Naqvi, S.W.A.; Khare, N.

Ocean and the ongoing time series study (Candolim Time Series; CaTS) off Goa. In addition, this article also focuses on the new time series initiative in the Arabian Sea and the Bay of Bengal under Sustained Indian Ocean Biogeochemistry and Ecosystem...
Effect of microaerobic fermentation in preprocessing fibrous lignocellulosic materials.

Science.gov (United States)

Alattar, Manar Arica; Green, Terrence R; Henry, Jordan; Gulca, Vitalie; Tizazu, Mikias; Bergstrom, Robby; Popa, Radu

2012-06-01

Amending soil with organic matter is common in agricultural and logging practices. Such amendments have benefits to soil fertility and crop yields. These benefits may be increased if material is preprocessed before introduction into soil. We analyzed the efficiency of microaerobic fermentation (MF), also referred to as Bokashi, in preprocessing fibrous lignocellulosic (FLC) organic materials using varying produce amendments and leachate treatments. Adding produce amendments increased leachate production and fermentation rates and decreased the biological oxygen demand of the leachate. Continuously draining leachate without returning it to the fermentors led to acidification and decreased concentrations of polysaccharides (PS) in leachates. PS fragmentation and the production of soluble metabolites and gases stabilized in fermentors in about 2-4 weeks. About 2 % of the carbon content was lost as CO(2). PS degradation rates, upon introduction of processed materials into soil, were similar to unfermented FLC. Our results indicate that MF is insufficient for adequate preprocessing of FLC material.

Geometric noise reduction for multivariate time series.

Science.gov (United States)

Mera, M Eugenia; Morán, Manuel

2006-03-01

We propose an algorithm for the reduction of observational noise in chaotic multivariate time series. The algorithm is based on a maximum likelihood criterion, and its goal is to reduce the mean distance of the points of the cleaned time series to the attractor. We give evidence of the convergence of the empirical measure associated with the cleaned time series to the underlying invariant measure, implying the possibility to predict the long run behavior of the true dynamics.
BRITS: Bidirectional Recurrent Imputation for Time Series

OpenAIRE

Cao, Wei; Wang, Dong; Li, Jian; Zhou, Hao; Li, Lei; Li, Yitan

2018-01-01

Time series are widely used as signals in many classification/regression tasks. It is ubiquitous that time series contains many missing values. Given multiple correlated time series data, how to fill in missing values and to predict their class labels? Existing imputation methods often impose strong assumptions of the underlying data generating process, such as linear dynamics in the state space. In this paper, we propose BRITS, a novel method based on recurrent neural networks for missing va...
Efficient Algorithms for Segmentation of Item-Set Time Series

Science.gov (United States)

Chundi, Parvathi; Rosenkrantz, Daniel J.

We propose a special type of time series, which we call an item-set time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an item-set time series, each observed data value is a set of discrete items. We formalize the concept of an item-set time series and present efficient algorithms for segmenting a given item-set time series. Segmentation of a time series partitions the time series into a sequence of segments where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic programming based scheme to construct an optimal segmentation of the given item-set time series. We use the item-set time series segmentation techniques to analyze the temporal content of three different data sets—Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of item-set time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.
Studies on time series applications in environmental sciences

CERN Document Server

Bărbulescu, Alina

2016-01-01

Time series analysis and modelling represent a large study field, implying the approach from the perspective of the time and frequency, with applications in different domains. Modelling hydro-meteorological time series is difficult due to the characteristics of these series, as long range dependence, spatial dependence, the correlation with other series. Continuous spatial data plays an important role in planning, risk assessment and decision making in environmental management. In this context, in this book we present various statistical tests and modelling techniques used for time series analysis, as well as applications to hydro-meteorological series from Dobrogea, a region situated in the south-eastern part of Romania, less studied till now. Part of the results are accompanied by their R code. .
Performance of Pre-processing Schemes with Imperfect Channel State Information

DEFF Research Database (Denmark)

Christensen, Søren Skovgaard; Kyritsi, Persa; De Carvalho, Elisabeth

2006-01-01

Pre-processing techniques have several benefits when the CSI is perfect. In this work we investigate three linear pre-processing filters, assuming imperfect CSI caused by noise degradation and channel temporal variation. Results indicate, that the LMMSE filter achieves the lowest BER and the high......Pre-processing techniques have several benefits when the CSI is perfect. In this work we investigate three linear pre-processing filters, assuming imperfect CSI caused by noise degradation and channel temporal variation. Results indicate, that the LMMSE filter achieves the lowest BER...... and the highest SINR when the CSI is perfect, whereas the simple matched filter may be a good choice when the CSI is imperfect. Additionally the results give insight into the inherent trade-off between robustness against CSI imperfections and spatial focusing ability....
The 1989 ENDF pre-processing codes

International Nuclear Information System (INIS)

Cullen, D.E.; McLaughlin, P.K.

1989-12-01

This document summarizes the 1989 version of the ENDF pre-processing codes which are required for processing evaluated nuclear data coded in the format ENDF-4, ENDF-5, or ENDF-6. The codes are available from the IAEA Nuclear Data Section, free of charge upon request. (author)
Global Population Density Grid Time Series Estimates

Data.gov (United States)

National Aeronautics and Space Administration — Global Population Density Grid Time Series Estimates provide a back-cast time series of population density grids based on the year 2000 population grid from SEDAC's...
Prediction and Geometry of Chaotic Time Series

National Research Council Canada - National Science Library

Leonardi, Mary

1997-01-01

This thesis examines the topic of chaotic time series. An overview of chaos, dynamical systems, and traditional approaches to time series analysis is provided, followed by an examination of state space reconstruction...
Impact of functional MRI data preprocessing pipeline on default-mode network detectability in patients with disorders of consciousness

Directory of Open Access Journals (Sweden)

Adrian eAndronache

2013-08-01

Full Text Available An emerging application of resting-state functional MRI is the study of patients with disorders of consciousness (DoC, where integrity of default-mode network (DMN activity is associated to the clinical level of preservation of consciousness. Due to the inherent inability to follow verbal instructions, arousal induced by scanning noise and postural pain, these patients tend to exhibit substantial levels of movement. This results in spurious, non-neural fluctuations of the blood-oxygen level-dependent (BOLD signal, which impair the evaluation of residual functional connectivity. Here, the effect of data preprocessing choices on the detectability of the DMN was systematically evaluated in a representative cohort of 30 clinically and etiologically heterogeneous DoC patients and 33 healthy controls. Starting from a standard preprocessing pipeline, additional steps were gradually inserted, namely band-pass filtering, removal of co-variance with the movement vectors, removal of co-variance with the global brain parenchyma signal, rejection of realignment outlier volumes and ventricle masking. Both independent-component analysis (ICA and seed-based analysis (SBA were performed, and DMN detectability was assessed quantitatively as well as visually. The results of the present study strongly show that the detection of DMN activity in the sub-optimal fMRI series acquired on DoC patients is contingent on the use of adequate filtering steps. ICA and SBA are differently affected but give convergent findings for high-grade preprocessing. We propose that future studies in this area should adopt the described preprocessing procedures as a minimum standard to reduce the probability of wrongly inferring that DMN activity is absent.
Sensor-Generated Time Series Events: A Definition Language

Science.gov (United States)

Anguera, Aurea; Lara, Juan A.; Lizcano, David; Martínez, Maria Aurora; Pazos, Juan

2012-01-01

There are now a great many domains where information is recorded by sensors over a limited time period or on a permanent basis. This data flow leads to sequences of data known as time series. In many domains, like seismography or medicine, time series analysis focuses on particular regions of interest, known as events, whereas the remainder of the time series contains hardly any useful information. In these domains, there is a need for mechanisms to identify and locate such events. In this paper, we propose an events definition language that is general enough to be used to easily and naturally define events in time series recorded by sensors in any domain. The proposed language has been applied to the definition of time series events generated within the branch of medicine dealing with balance-related functions in human beings. A device, called posturograph, is used to study balance-related functions. The platform has four sensors that record the pressure intensity being exerted on the platform, generating four interrelated time series. As opposed to the existing ad hoc proposals, the results confirm that the proposed language is valid, that is generally applicable and accurate, for identifying the events contained in the time series.
Correlation and multifractality in climatological time series

International Nuclear Information System (INIS)

Pedron, I T

2010-01-01

Climate can be described by statistical analysis of mean values of atmospheric variables over a period. It is possible to detect correlations in climatological time series and to classify its behavior. In this work the Hurst exponent, which can characterize correlation and persistence in time series, is obtained by using the Detrended Fluctuation Analysis (DFA) method. Data series of temperature, precipitation, humidity, solar radiation, wind speed, maximum squall, atmospheric pressure and randomic series are studied. Furthermore, the multifractality of such series is analyzed applying the Multifractal Detrended Fluctuation Analysis (MF-DFA) method. The results indicate presence of correlation (persistent character) in all climatological series and multifractality as well. A larger set of data, and longer, could provide better results indicating the universality of the exponents.
Time Series Forecasting with Missing Values

Directory of Open Access Journals (Sweden)

Shin-Fu Wu

2015-11-01

Full Text Available Time series prediction has become more popular in various kinds of applications such as weather prediction, control engineering, financial analysis, industrial monitoring, etc. To deal with real-world problems, we are often faced with missing values in the data due to sensor malfunctions or human errors. Traditionally, the missing values are simply omitted or replaced by means of imputation methods. However, omitting those missing values may cause temporal discontinuity. Imputation methods, on the other hand, may alter the original time series. In this study, we propose a novel forecasting method based on least squares support vector machine (LSSVM. We employ the input patterns with the temporal information which is defined as local time index (LTI. Time series data as well as local time indexes are fed to LSSVM for doing forecasting without imputation. We compare the forecasting performance of our method with other imputation methods. Experimental results show that the proposed method is promising and is worth further investigations.
Reconstruction of ensembles of coupled time-delay systems from time series.

Science.gov (United States)

Sysoev, I V; Prokhorov, M D; Ponomarenko, V I; Bezruchko, B P

2014-06-01

We propose a method to recover from time series the parameters of coupled time-delay systems and the architecture of couplings between them. The method is based on a reconstruction of model delay-differential equations and estimation of statistical significance of couplings. It can be applied to networks composed of nonidentical nodes with an arbitrary number of unidirectional and bidirectional couplings. We test our method on chaotic and periodic time series produced by model equations of ensembles of diffusively coupled time-delay systems in the presence of noise, and apply it to experimental time series obtained from electronic oscillators with delayed feedback coupled by resistors.
The analysis of time series: an introduction

National Research Council Canada - National Science Library

Chatfield, Christopher

1989-01-01

.... A variety of practical examples are given to support the theory. The book covers a wide range of time-series topics, including probability models for time series, Box-Jenkins forecasting, spectral analysis, linear systems and system identification...
New indicator for optimal preprocessing and wavelength selection of near-infrared spectra

NARCIS (Netherlands)

Skibsted, E. T. S.; Boelens, H. F. M.; Westerhuis, J. A.; Witte, D. T.; Smilde, A. K.

2004-01-01

Preprocessing of near-infrared spectra to remove unwanted, i.e., non-related spectral variation and selection of informative wavelengths is considered to be a crucial step prior to the construction of a quantitative calibration model. The standard methodology when comparing various preprocessing
Time series modeling in traffic safety research.

Science.gov (United States)

Lavrenz, Steven M; Vlahogianni, Eleni I; Gkritza, Konstantina; Ke, Yue

2018-08-01

The use of statistical models for analyzing traffic safety (crash) data has been well-established. However, time series techniques have traditionally been underrepresented in the corresponding literature, due to challenges in data collection, along with a limited knowledge of proper methodology. In recent years, new types of high-resolution traffic safety data, especially in measuring driver behavior, have made time series modeling techniques an increasingly salient topic of study. Yet there remains a dearth of information to guide analysts in their use. This paper provides an overview of the state of the art in using time series models in traffic safety research, and discusses some of the fundamental techniques and considerations in classic time series modeling. It also presents ongoing and future opportunities for expanding the use of time series models, and explores newer modeling techniques, including computational intelligence models, which hold promise in effectively handling ever-larger data sets. The information contained herein is meant to guide safety researchers in understanding this broad area of transportation data analysis, and provide a framework for understanding safety trends that can influence policy-making. Copyright © 2017 Elsevier Ltd. All rights reserved.
Time series prediction: statistical and neural techniques

Science.gov (United States)

Zahirniak, Daniel R.; DeSimio, Martin P.

1996-03-01

In this paper we compare the performance of nonlinear neural network techniques to those of linear filtering techniques in the prediction of time series. Specifically, we compare the results of using the nonlinear systems, known as multilayer perceptron and radial basis function neural networks, with the results obtained using the conventional linear Wiener filter, Kalman filter and Widrow-Hoff adaptive filter in predicting future values of stationary and non- stationary time series. Our results indicate the performance of each type of system is heavily dependent upon the form of the time series being predicted and the size of the system used. In particular, the linear filters perform adequately for linear or near linear processes while the nonlinear systems perform better for nonlinear processes. Since the linear systems take much less time to be developed, they should be tried prior to using the nonlinear systems when the linearity properties of the time series process are unknown.
The inverse Numerical Computer Program FLUX-BOT for estimating Vertical Water Fluxes from Temperature Time-Series.

Science.gov (United States)

Trauth, N.; Schmidt, C.; Munz, M.

2016-12-01

Heat as a natural tracer to quantify water fluxes between groundwater and surface water has evolved to a standard hydrological method. Typically, time series of temperatures in the surface water and in the sediment are observed and are subsequently evaluated by a vertical 1D representation of heat transport by advection and dispersion. Several analytical solutions as well as their implementation into user-friendly software exist in order to estimate water fluxes from the observed temperatures. Analytical solutions can be easily implemented but assumptions on the boundary conditions have to be made a priori, e.g. sinusoidal upper temperature boundary. Numerical models offer more flexibility and can handle temperature data which is characterized by irregular variations such as storm-event induced temperature changes and thus cannot readily be incorporated in analytical solutions. This also reduced the effort of data preprocessing such as the extraction of the diurnal temperature variation. We developed a software to estimate water FLUXes Based On Temperatures- FLUX-BOT. FLUX-BOT is a numerical code written in MATLAB which is intended to calculate vertical water fluxes in saturated sediments, based on the inversion of measured temperature time series observed at multiple depths. It applies a cell-centered Crank-Nicolson implicit finite difference scheme to solve the one-dimensional heat advection-conduction equation. Besides its core inverse numerical routines, FLUX-BOT includes functions visualizing the results and functions for performing uncertainty analysis. We provide applications of FLUX-BOT to generic as well as to measured temperature data to demonstrate its performance.
Mapping Impervious Surface Expansion using Medium-resolution Satellite Image Time Series: A Case Study in the Yangtze River Delta, China

Science.gov (United States)

Gao, Feng; DeColstoun, Eric Brown; Ma, Ronghua; Weng, Qihao; Masek, Jeffrey G.; Chen, Jin; Pan, Yaozhong; Song, Conghe

2012-01-01

Cities have been expanding rapidly worldwide, especially over the past few decades. Mapping the dynamic expansion of impervious surface in both space and time is essential for an improved understanding of the urbanization process, land-cover and land-use change, and their impacts on the environment. Landsat and other medium-resolution satellites provide the necessary spatial details and temporal frequency for mapping impervious surface expansion over the past four decades. Since the US Geological Survey opened the historical record of the Landsat image archive for free access in 2008, the decades-old bottleneck of data limitation has gone. Remote-sensing scientists are now rich with data, and the challenge is how to make best use of this precious resource. In this article, we develop an efficient algorithm to map the continuous expansion of impervious surface using a time series of four decades of medium-resolution satellite images. The algorithm is based on a supervised classification of the time-series image stack using a decision tree. Each imerpervious class represents urbanization starting in a different image. The algorithm also allows us to remove inconsistent training samples because impervious expansion is not reversible during the study period. The objective is to extract a time series of complete and consistent impervious surface maps from a corresponding times series of images collected from multiple sensors, and with a minimal amount of image preprocessing effort. The approach was tested in the lower Yangtze River Delta region, one of the fastest urban growth areas in China. Results from nearly four decades of medium-resolution satellite data from the Landsat Multispectral Scanner (MSS), Thematic Mapper (TM), Enhanced Thematic Mapper plus (ETM+) and China-Brazil Earth Resources Satellite (CBERS) show a consistent urbanization process that is consistent with economic development plans and policies. The time-series impervious spatial extent maps derived
Time-series-analysis techniques applied to nuclear-material accounting

International Nuclear Information System (INIS)

Pike, D.H.; Morrison, G.W.; Downing, D.J.

1982-05-01

This document is designed to introduce the reader to the applications of Time Series Analysis techniques to Nuclear Material Accountability data. Time series analysis techniques are designed to extract information from a collection of random variables ordered by time by seeking to identify any trends, patterns, or other structure in the series. Since nuclear material accountability data is a time series, one can extract more information using time series analysis techniques than by using other statistical techniques. Specifically, the objective of this document is to examine the applicability of time series analysis techniques to enhance loss detection of special nuclear materials. An introductory section examines the current industry approach which utilizes inventory differences. The error structure of inventory differences is presented. Time series analysis techniques discussed include the Shewhart Control Chart, the Cumulative Summation of Inventory Differences Statistics (CUSUM) and the Kalman Filter and Linear Smoother

Evaluation of a Stereo Music Preprocessing Scheme for Cochlear Implant Users.

Science.gov (United States)

Buyens, Wim; van Dijk, Bas; Moonen, Marc; Wouters, Jan

2018-01-01

Although for most cochlear implant (CI) users good speech understanding is reached (at least in quiet environments), the perception and the appraisal of music are generally unsatisfactory. The improvement in music appraisal was evaluated in CI participants by using a stereo music preprocessing scheme implemented on a take-home device, in a comfortable listening environment. The preprocessing allowed adjusting the balance among vocals/bass/drums and other instruments, and was evaluated for different genres of music. The correlation between the preferred settings and the participants' speech and pitch detection performance was investigated. During the initial visit preceding the take-home test, the participants' speech-in-noise perception and pitch detection performance were measured, and a questionnaire about their music involvement was completed. The take-home device was provided, including the stereo music preprocessing scheme and seven playlists with six songs each. The participants were asked to adjust the balance by means of a turning wheel to make the music sound most enjoyable, and to repeat this three times for all songs. Twelve postlingually deafened CI users participated in the study. The data were collected by means of a take-home device, which preserved all the preferred settings for the different songs. Statistical analysis was done with a Friedman test (with post hoc Wilcoxon signed-rank test) to check the effect of "Genre." The correlations were investigated with Pearson's and Spearman's correlation coefficients. All participants preferred a balance significantly different from the original balance. Differences across participants were observed which could not be explained by perceptual abilities. An effect of "Genre" was found, showing significantly smaller preferred deviation from the original balance for Golden Oldies compared to the other genres. The stereo music preprocessing scheme showed an improvement in music appraisal with complex music and
Clinical and epidemiological rounds. Time series

Directory of Open Access Journals (Sweden)

León-Álvarez, Alba Luz

2016-07-01

Full Text Available Analysis of time series is a technique that implicates the study of individuals or groups observed in successive moments in time. This type of analysis allows the study of potential causal relationships between different variables that change over time and relate to each other. It is the most important technique to make inferences about the future, predicting, on the basis or what has happened in the past and it is applied in different disciplines of knowledge. Here we discuss different components of time series, the analysis technique and specific examples in health research.
Comparison of multivariate preprocessing techniques as applied to electronic tongue based pattern classification for black tea

International Nuclear Information System (INIS)

Palit, Mousumi; Tudu, Bipan; Bhattacharyya, Nabarun; Dutta, Ankur; Dutta, Pallab Kumar; Jana, Arun; Bandyopadhyay, Rajib; Chatterjee, Anutosh

2010-01-01

In an electronic tongue, preprocessing on raw data precedes pattern analysis and choice of the appropriate preprocessing technique is crucial for the performance of the pattern classifier. While attempting to classify different grades of black tea using a voltammetric electronic tongue, different preprocessing techniques have been explored and a comparison of their performances is presented in this paper. The preprocessing techniques are compared first by a quantitative measurement of separability followed by principle component analysis; and then two different supervised pattern recognition models based on neural networks are used to evaluate the performance of the preprocessing techniques.
Integer-valued time series

NARCIS (Netherlands)

van den Akker, R.

2007-01-01

This thesis adresses statistical problems in econometrics. The first part contributes statistical methodology for nonnegative integer-valued time series. The second part of this thesis discusses semiparametric estimation in copula models and develops semiparametric lower bounds for a large class of
Robust Forecasting of Non-Stationary Time Series

NARCIS (Netherlands)

Croux, C.; Fried, R.; Gijbels, I.; Mahieu, K.

2010-01-01

This paper proposes a robust forecasting method for non-stationary time series. The time series is modelled using non-parametric heteroscedastic regression, and fitted by a localized MM-estimator, combining high robustness and large efficiency. The proposed method is shown to produce reliable
A wavelet method for modeling and despiking motion artifacts from resting-state fMRI time series.

Science.gov (United States)

Patel, Ameera X; Kundu, Prantik; Rubinov, Mikail; Jones, P Simon; Vértes, Petra E; Ersche, Karen D; Suckling, John; Bullmore, Edward T

2014-07-15

The impact of in-scanner head movement on functional magnetic resonance imaging (fMRI) signals has long been established as undesirable. These effects have been traditionally corrected by methods such as linear regression of head movement parameters. However, a number of recent independent studies have demonstrated that these techniques are insufficient to remove motion confounds, and that even small movements can spuriously bias estimates of functional connectivity. Here we propose a new data-driven, spatially-adaptive, wavelet-based method for identifying, modeling, and removing non-stationary events in fMRI time series, caused by head movement, without the need for data scrubbing. This method involves the addition of just one extra step, the Wavelet Despike, in standard pre-processing pipelines. With this method, we demonstrate robust removal of a range of different motion artifacts and motion-related biases including distance-dependent connectivity artifacts, at a group and single-subject level, using a range of previously published and new diagnostic measures. The Wavelet Despike is able to accommodate the substantial spatial and temporal heterogeneity of motion artifacts and can consequently remove a range of high and low frequency artifacts from fMRI time series, that may be linearly or non-linearly related to physical movements. Our methods are demonstrated by the analysis of three cohorts of resting-state fMRI data, including two high-motion datasets: a previously published dataset on children (N=22) and a new dataset on adults with stimulant drug dependence (N=40). We conclude that there is a real risk of motion-related bias in connectivity analysis of fMRI data, but that this risk is generally manageable, by effective time series denoising strategies designed to attenuate synchronized signal transients induced by abrupt head movements. The Wavelet Despiking software described in this article is freely available for download at www
Characterizing time series via complexity-entropy curves

Science.gov (United States)

Ribeiro, Haroldo V.; Jauregui, Max; Zunino, Luciano; Lenzi, Ervin K.

2017-06-01

The search for patterns in time series is a very common task when dealing with complex systems. This is usually accomplished by employing a complexity measure such as entropies and fractal dimensions. However, such measures usually only capture a single aspect of the system dynamics. Here, we propose a family of complexity measures for time series based on a generalization of the complexity-entropy causality plane. By replacing the Shannon entropy by a monoparametric entropy (Tsallis q entropy) and after considering the proper generalization of the statistical complexity (q complexity), we build up a parametric curve (the q -complexity-entropy curve) that is used for characterizing and classifying time series. Based on simple exact results and numerical simulations of stochastic processes, we show that these curves can distinguish among different long-range, short-range, and oscillating correlated behaviors. Also, we verify that simulated chaotic and stochastic time series can be distinguished based on whether these curves are open or closed. We further test this technique in experimental scenarios related to chaotic laser intensity, stock price, sunspot, and geomagnetic dynamics, confirming its usefulness. Finally, we prove that these curves enhance the automatic classification of time series with long-range correlations and interbeat intervals of healthy subjects and patients with heart disease.
Complex network approach to fractional time series

Energy Technology Data Exchange (ETDEWEB)

Manshour, Pouya [Physics Department, Persian Gulf University, Bushehr 75169 (Iran, Islamic Republic of)

2015-10-15

In order to extract correlation information inherited in stochastic time series, the visibility graph algorithm has been recently proposed, by which a time series can be mapped onto a complex network. We demonstrate that the visibility algorithm is not an appropriate one to study the correlation aspects of a time series. We then employ the horizontal visibility algorithm, as a much simpler one, to map fractional processes onto complex networks. The degree distributions are shown to have parabolic exponential forms with Hurst dependent fitting parameter. Further, we take into account other topological properties such as maximum eigenvalue of the adjacency matrix and the degree assortativity, and show that such topological quantities can also be used to predict the Hurst exponent, with an exception for anti-persistent fractional Gaussian noises. To solve this problem, we take into account the Spearman correlation coefficient between nodes' degrees and their corresponding data values in the original time series.
Introduction to time series analysis and forecasting

CERN Document Server

Montgomery, Douglas C; Kulahci, Murat

2008-01-01

An accessible introduction to the most current thinking in and practicality of forecasting techniques in the context of time-oriented data. Analyzing time-oriented data and forecasting are among the most important problems that analysts face across many fields, ranging from finance and economics to production operations and the natural sciences. As a result, there is a widespread need for large groups of people in a variety of fields to understand the basic concepts of time series analysis and forecasting. Introduction to Time Series Analysis and Forecasting presents the time series analysis branch of applied statistics as the underlying methodology for developing practical forecasts, and it also bridges the gap between theory and practice by equipping readers with the tools needed to analyze time-oriented data and construct useful, short- to medium-term, statistically based forecasts.
Value of Distributed Preprocessing of Biomass Feedstocks to a Bioenergy Industry

Energy Technology Data Exchange (ETDEWEB)

Christopher T Wright

2006-07-01

Biomass preprocessing is one of the primary operations in the feedstock assembly system and the front-end of a biorefinery. Its purpose is to chop, grind, or otherwise format the biomass into a suitable feedstock for conversion to ethanol and other bioproducts. Many variables such as equipment cost and efficiency, and feedstock moisture content, particle size, bulk density, compressibility, and flowability affect the location and implementation of this unit operation. Previous conceptual designs show this operation to be located at the front-end of the biorefinery. However, data are presented that show distributed preprocessing at the field-side or in a fixed preprocessing facility can provide significant cost benefits by producing a higher value feedstock with improved handling, transporting, and merchandising potential. In addition, data supporting the preferential deconstruction of feedstock materials due to their bio-composite structure identifies the potential for significant improvements in equipment efficiencies and compositional quality upgrades. Theses data are collected from full-scale low and high capacity hammermill grinders with various screen sizes. Multiple feedstock varieties with a range of moisture values were used in the preprocessing tests. The comparative values of the different grinding configurations, feedstock varieties, and moisture levels are assessed through post-grinding analysis of the different particle fractions separated with a medium-scale forage particle separator and a Rototap separator. The results show that distributed preprocessing produces a material that has bulk flowable properties and fractionation benefits that can improve the ease of transporting, handling and conveying the material to the biorefinery and improve the biochemical and thermochemical conversion processes.
Incremental fuzzy C medoids clustering of time series data using dynamic time warping distance.

Science.gov (United States)

Liu, Yongli; Chen, Jingli; Wu, Shuai; Liu, Zhizhong; Chao, Hao

2018-01-01

Clustering time series data is of great significance since it could extract meaningful statistics and other characteristics. Especially in biomedical engineering, outstanding clustering algorithms for time series may help improve the health level of people. Considering data scale and time shifts of time series, in this paper, we introduce two incremental fuzzy clustering algorithms based on a Dynamic Time Warping (DTW) distance. For recruiting Single-Pass and Online patterns, our algorithms could handle large-scale time series data by splitting it into a set of chunks which are processed sequentially. Besides, our algorithms select DTW to measure distance of pair-wise time series and encourage higher clustering accuracy because DTW could determine an optimal match between any two time series by stretching or compressing segments of temporal data. Our new algorithms are compared to some existing prominent incremental fuzzy clustering algorithms on 12 benchmark time series datasets. The experimental results show that the proposed approaches could yield high quality clusters and were better than all the competitors in terms of clustering accuracy.
Incremental fuzzy C medoids clustering of time series data using dynamic time warping distance

Science.gov (United States)

Chen, Jingli; Wu, Shuai; Liu, Zhizhong; Chao, Hao

2018-01-01

Clustering time series data is of great significance since it could extract meaningful statistics and other characteristics. Especially in biomedical engineering, outstanding clustering algorithms for time series may help improve the health level of people. Considering data scale and time shifts of time series, in this paper, we introduce two incremental fuzzy clustering algorithms based on a Dynamic Time Warping (DTW) distance. For recruiting Single-Pass and Online patterns, our algorithms could handle large-scale time series data by splitting it into a set of chunks which are processed sequentially. Besides, our algorithms select DTW to measure distance of pair-wise time series and encourage higher clustering accuracy because DTW could determine an optimal match between any two time series by stretching or compressing segments of temporal data. Our new algorithms are compared to some existing prominent incremental fuzzy clustering algorithms on 12 benchmark time series datasets. The experimental results show that the proposed approaches could yield high quality clusters and were better than all the competitors in terms of clustering accuracy. PMID:29795600
The foundations of modern time series analysis

CERN Document Server

Mills, Terence C

2011-01-01

This book develops the analysis of Time Series from its formal beginnings in the 1890s through to the publication of Box and Jenkins' watershed publication in 1970, showing how these methods laid the foundations for the modern techniques of Time Series analysis that are in use today.
EARLINET Single Calculus Chain - technical - Part 1: Pre-processing of raw lidar data

Science.gov (United States)

D'Amico, Giuseppe; Amodeo, Aldo; Mattis, Ina; Freudenthaler, Volker; Pappalardo, Gelsomina

2016-02-01

In this paper we describe an automatic tool for the pre-processing of aerosol lidar data called ELPP (EARLINET Lidar Pre-Processor). It is one of two calculus modules of the EARLINET Single Calculus Chain (SCC), the automatic tool for the analysis of EARLINET data. ELPP is an open source module that executes instrumental corrections and data handling of the raw lidar signals, making the lidar data ready to be processed by the optical retrieval algorithms. According to the specific lidar configuration, ELPP automatically performs dead-time correction, atmospheric and electronic background subtraction, gluing of lidar signals, and trigger-delay correction. Moreover, the signal-to-noise ratio of the pre-processed signals can be improved by means of configurable time integration of the raw signals and/or spatial smoothing. ELPP delivers the statistical uncertainties of the final products by means of error propagation or Monte Carlo simulations. During the development of ELPP, particular attention has been payed to make the tool flexible enough to handle all lidar configurations currently used within the EARLINET community. Moreover, it has been designed in a modular way to allow an easy extension to lidar configurations not yet implemented. The primary goal of ELPP is to enable the application of quality-assured procedures in the lidar data analysis starting from the raw lidar data. This provides the added value of full traceability of each delivered lidar product. Several tests have been performed to check the proper functioning of ELPP. The whole SCC has been tested with the same synthetic data sets, which were used for the EARLINET algorithm inter-comparison exercise. ELPP has been successfully employed for the automatic near-real-time pre-processing of the raw lidar data measured during several EARLINET inter-comparison campaigns as well as during intense field campaigns.
Pre-processing for Triangulation of Probabilistic Networks

NARCIS (Netherlands)

Bodlaender, H.L.; Koster, A.M.C.A.; Eijkhof, F. van den; Gaag, L.C. van der

2001-01-01

The currently most efficient algorithm for inference with a probabilistic network builds upon a triangulation of a networks graph. In this paper, we show that pre-processing can help in finding good triangulations for probabilistic networks, that is, triangulations with a minimal maximum
Time series clustering in large data sets

Directory of Open Access Journals (Sweden)

Jiří Fejfar

2011-01-01

Full Text Available The clustering of time series is a widely researched area. There are many methods for dealing with this task. We are actually using the Self-organizing map (SOM with the unsupervised learning algorithm for clustering of time series. After the first experiment (Fejfar, Weinlichová, Šťastný, 2009 it seems that the whole concept of the clustering algorithm is correct but that we have to perform time series clustering on much larger dataset to obtain more accurate results and to find the correlation between configured parameters and results more precisely. The second requirement arose in a need for a well-defined evaluation of results. It seems useful to use sound recordings as instances of time series again. There are many recordings to use in digital libraries, many interesting features and patterns can be found in this area. We are searching for recordings with the similar development of information density in this experiment. It can be used for musical form investigation, cover songs detection and many others applications.The objective of the presented paper is to compare clustering results made with different parameters of feature vectors and the SOM itself. We are describing time series in a simplistic way evaluating standard deviations for separated parts of recordings. The resulting feature vectors are clustered with the SOM in batch training mode with different topologies varying from few neurons to large maps.There are other algorithms discussed, usable for finding similarities between time series and finally conclusions for further research are presented. We also present an overview of the related actual literature and projects.
Comparative performance evaluation of transform coding in image pre-processing

Science.gov (United States)

Menon, Vignesh V.; NB, Harikrishnan; Narayanan, Gayathri; CK, Niveditha

2017-07-01

We are in the midst of a communication transmute which drives the development as largely as dissemination of pioneering communication systems with ever-increasing fidelity and resolution. Distinguishable researches have been appreciative in image processing techniques crazed by a growing thirst for faster and easier encoding, storage and transmission of visual information. In this paper, the researchers intend to throw light on many techniques which could be worn at the transmitter-end in order to ease the transmission and reconstruction of the images. The researchers investigate the performance of different image transform coding schemes used in pre-processing, their comparison, and effectiveness, the necessary and sufficient conditions, properties and complexity in implementation. Whimsical by prior advancements in image processing techniques, the researchers compare various contemporary image pre-processing frameworks- Compressed Sensing, Singular Value Decomposition, Integer Wavelet Transform on performance. The paper exposes the potential of Integer Wavelet transform to be an efficient pre-processing scheme.
Transmission of linear regression patterns between time series: from relationship in time series to complex networks.

Science.gov (United States)

Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui

2014-07-01

The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
Lag space estimation in time series modelling

DEFF Research Database (Denmark)

Goutte, Cyril

1997-01-01

The purpose of this article is to investigate some techniques for finding the relevant lag-space, i.e. input information, for time series modelling. This is an important aspect of time series modelling, as it conditions the design of the model through the regressor vector a.k.a. the input layer...
Time-series prediction and applications a machine intelligence approach

CERN Document Server

Konar, Amit

2017-01-01

This book presents machine learning and type-2 fuzzy sets for the prediction of time-series with a particular focus on business forecasting applications. It also proposes new uncertainty management techniques in an economic time-series using type-2 fuzzy sets for prediction of the time-series at a given time point from its preceding value in fluctuating business environments. It employs machine learning to determine repetitively occurring similar structural patterns in the time-series and uses stochastic automaton to predict the most probabilistic structure at a given partition of the time-series. Such predictions help in determining probabilistic moves in a stock index time-series Primarily written for graduate students and researchers in computer science, the book is equally useful for researchers/professionals in business intelligence and stock index prediction. A background of undergraduate level mathematics is presumed, although not mandatory, for most of the sections. Exercises with tips are provided at...

A Time Series Forecasting Method

Directory of Open Access Journals (Sweden)

Wang Zhao-Yu

2017-01-01

Full Text Available This paper proposes a novel time series forecasting method based on a weighted self-constructing clustering technique. The weighted self-constructing clustering processes all the data patterns incrementally. If a data pattern is not similar enough to an existing cluster, it forms a new cluster of its own. However, if a data pattern is similar enough to an existing cluster, it is removed from the cluster it currently belongs to and added to the most similar cluster. During the clustering process, weights are learned for each cluster. Given a series of time-stamped data up to time t, we divide it into a set of training patterns. By using the weighted self-constructing clustering, the training patterns are grouped into a set of clusters. To estimate the value at time t + 1, we find the k nearest neighbors of the input pattern and use these k neighbors to decide the estimation. Experimental results are shown to demonstrate the effectiveness of the proposed approach.
Stochastic nature of series of waiting times

Science.gov (United States)

Anvari, Mehrnaz; Aghamohammadi, Cina; Dashti-Naserabadi, H.; Salehi, E.; Behjat, E.; Qorbani, M.; Khazaei Nezhad, M.; Zirak, M.; Hadjihosseini, Ali; Peinke, Joachim; Tabar, M. Reza Rahimi

2013-06-01

Although fluctuations in the waiting time series have been studied for a long time, some important issues such as its long-range memory and its stochastic features in the presence of nonstationarity have so far remained unstudied. Here we find that the “waiting times” series for a given increment level have long-range correlations with Hurst exponents belonging to the interval 1/2time distribution. We find that the logarithmic difference of waiting times series has a short-range correlation, and then we study its stochastic nature using the Markovian method and determine the corresponding Kramers-Moyal coefficients. As an example, we analyze the velocity fluctuations in high Reynolds number turbulence and determine the level dependence of Markov time scales, as well as the drift and diffusion coefficients. We show that the waiting time distributions exhibit power law tails, and we were able to model the distribution with a continuous time random walk.
Spectral analysis of time series of events: effect of respiration on heart rate in neonates

International Nuclear Information System (INIS)

Van Drongelen, Wim; Williams, Amber L; Lasky, Robert E

2009-01-01

Certain types of biomedical processes such as the heart rate generator can be considered as signals that are sampled by the occurring events, i.e. QRS complexes. This sampling property generates problems for the evaluation of spectral parameters of such signals. First, the irregular occurrence of heart beats creates an unevenly sampled data set which must either be pre-processed (e.g. by using trace binning or interpolation) prior to spectral analysis, or analyzed with specialized methods (e.g. Lomb's algorithm). Second, the average occurrence of events determines the Nyquist limit for the sampled time series. Here we evaluate different types of spectral analysis of recordings of neonatal heart rate. Coupling between respiration and heart rate and the detection of heart rate itself are emphasized. We examine both standard and data adaptive frequency bands of heart rate signals generated by models of coupled oscillators and recorded data sets from neonates. We find that an important spectral artifact occurs due to a mirror effect around the Nyquist limit of half the average heart rate. Further we conclude that the presence of respiratory coupling can only be detected under low noise conditions and if a data-adaptive respiratory band is used
Efficient Approximate OLAP Querying Over Time Series

DEFF Research Database (Denmark)

Perera, Kasun Baruhupolage Don Kasun Sanjeewa; Hahmann, Martin; Lehner, Wolfgang

2016-01-01

The ongoing trend for data gathering not only produces larger volumes of data, but also increases the variety of recorded data types. Out of these, especially time series, e.g. various sensor readings, have attracted attention in the domains of business intelligence and decision making. As OLAP...... queries play a major role in these domains, it is desirable to also execute them on time series data. While this is not a problem on the conceptual level, it can become a bottleneck with regards to query run-time. In general, processing OLAP queries gets more computationally intensive as the volume...... of data grows. This is a particular problem when querying time series data, which generally contains multiple measures recorded at fine time granularities. Usually, this issue is addressed either by scaling up hardware or by employing workload based query optimization techniques. However, these solutions...
A Dynamic Fuzzy Cluster Algorithm for Time Series

Directory of Open Access Journals (Sweden)

Min Ji

2013-01-01

clustering time series by introducing the definition of key point and improving FCM algorithm. The proposed algorithm works by determining those time series whose class labels are vague and further partitions them into different clusters over time. The main advantage of this approach compared with other existing algorithms is that the property of some time series belonging to different clusters over time can be partially revealed. Results from simulation-based experiments on geographical data demonstrate the excellent performance and the desired results have been obtained. The proposed algorithm can be applied to solve other clustering problems in data mining.
A novel weight determination method for time series data aggregation

Science.gov (United States)

Xu, Paiheng; Zhang, Rong; Deng, Yong

2017-09-01

Aggregation in time series is of great importance in time series smoothing, predicting and other time series analysis process, which makes it crucial to address the weights in times series correctly and reasonably. In this paper, a novel method to obtain the weights in time series is proposed, in which we adopt induced ordered weighted aggregation (IOWA) operator and visibility graph averaging (VGA) operator and linearly combine the weights separately generated by the two operator. The IOWA operator is introduced to the weight determination of time series, through which the time decay factor is taken into consideration. The VGA operator is able to generate weights with respect to the degree distribution in the visibility graph constructed from the corresponding time series, which reflects the relative importance of vertices in time series. The proposed method is applied to two practical datasets to illustrate its merits. The aggregation of Construction Cost Index (CCI) demonstrates the ability of proposed method to smooth time series, while the aggregation of The Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) illustrate how proposed method maintain the variation tendency of original data.
A New Indicator for Optimal Preprocessing and Wavelengths Selection of Near-Infrared Spectra

NARCIS (Netherlands)

Skibsted, E.; Boelens, H.F.M.; Westerhuis, J.A.; Witte, D.T.; Smilde, A.K.

2004-01-01

Preprocessing of near-infrared spectra to remove unwanted, i.e., non-related spectral variation and selection of informative wavelengths is considered to be a crucial step prior to the construction of a quantitative calibration model. The standard methodology when comparing various preprocessing
Foundations of Sequence-to-Sequence Modeling for Time Series

OpenAIRE

Kuznetsov, Vitaly; Mariet, Zelda

2018-01-01

The availability of large amounts of time series data, paired with the performance of deep-learning algorithms on a broad class of problems, has recently led to significant interest in the use of sequence-to-sequence models for time series forecasting. We provide the first theoretical analysis of this time series forecasting framework. We include a comparison of sequence-to-sequence modeling to classical time series models, and as such our theory can serve as a quantitative guide for practiti...
Development and integration of block operations for data invariant automation of digital preprocessing and analysis of biological and biomedical Raman spectra.

Science.gov (United States)

Schulze, H Georg; Turner, Robin F B

2015-06-01

High-throughput information extraction from large numbers of Raman spectra is becoming an increasingly taxing problem due to the proliferation of new applications enabled using advances in instrumentation. Fortunately, in many of these applications, the entire process can be automated, yielding reproducibly good results with significant time and cost savings. Information extraction consists of two stages, preprocessing and analysis. We focus here on the preprocessing stage, which typically involves several steps, such as calibration, background subtraction, baseline flattening, artifact removal, smoothing, and so on, before the resulting spectra can be further analyzed. Because the results of some of these steps can affect the performance of subsequent ones, attention must be given to the sequencing of steps, the compatibility of these sequences, and the propensity of each step to generate spectral distortions. We outline here important considerations to effect full automation of Raman spectral preprocessing: what is considered full automation; putative general principles to effect full automation; the proper sequencing of processing and analysis steps; conflicts and circularities arising from sequencing; and the need for, and approaches to, preprocessing quality control. These considerations are discussed and illustrated with biological and biomedical examples reflecting both successful and faulty preprocessing.
Climate Prediction Center (CPC) Global Precipitation Time Series

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — The global precipitation time series provides time series charts showing observations of daily precipitation as well as accumulated precipitation compared to normal...
Climate Prediction Center (CPC) Global Temperature Time Series

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — The global temperature time series provides time series charts using station based observations of daily temperature. These charts provide information about the...
Recurrent Neural Network Applications for Astronomical Time Series

Science.gov (United States)

Protopapas, Pavlos

2017-06-01

The benefits of good predictive models in astronomy lie in early event prediction systems and effective resource allocation. Current time series methods applicable to regular time series have not evolved to generalize for irregular time series. In this talk, I will describe two Recurrent Neural Network methods, Long Short-Term Memory (LSTM) and Echo State Networks (ESNs) for predicting irregular time series. Feature engineering along with a non-linear modeling proved to be an effective predictor. For noisy time series, the prediction is improved by training the network on error realizations using the error estimates from astronomical light curves. In addition to this, we propose a new neural network architecture to remove correlation from the residuals in order to improve prediction and compensate for the noisy data. Finally, I show how to set hyperparameters for a stable and performant solution correctly. In this work, we circumvent this obstacle by optimizing ESN hyperparameters using Bayesian optimization with Gaussian Process priors. This automates the tuning procedure, enabling users to employ the power of RNN without needing an in-depth understanding of the tuning procedure.
Classification-based comparison of pre-processing methods for interpretation of mass spectrometry generated clinical datasets

Directory of Open Access Journals (Sweden)

Hoefsloot Huub CJ

2009-05-01

Full Text Available Abstract Background Mass spectrometry is increasingly being used to discover proteins or protein profiles associated with disease. Experimental design of mass-spectrometry studies has come under close scrutiny and the importance of strict protocols for sample collection is now understood. However, the question of how best to process the large quantities of data generated is still unanswered. Main challenges for the analysis are the choice of proper pre-processing and classification methods. While these two issues have been investigated in isolation, we propose to use the classification of patient samples as a clinically relevant benchmark for the evaluation of pre-processing methods. Results Two in-house generated clinical SELDI-TOF MS datasets are used in this study as an example of high throughput mass-spectrometry data. We perform a systematic comparison of two commonly used pre-processing methods as implemented in Ciphergen ProteinChip Software and in the Cromwell package. With respect to reproducibility, Ciphergen and Cromwell pre-processing are largely comparable. We find that the overlap between peaks detected by either Ciphergen ProteinChip Software or Cromwell is large. This is especially the case for the more stringent peak detection settings. Moreover, similarity of the estimated intensities between matched peaks is high. We evaluate the pre-processing methods using five different classification methods. Classification is done in a double cross-validation protocol using repeated random sampling to obtain an unbiased estimate of classification accuracy. No pre-processing method significantly outperforms the other for all peak detection settings evaluated. Conclusion We use classification of patient samples as a clinically relevant benchmark for the evaluation of pre-processing methods. Both pre-processing methods lead to similar classification results on an ovarian cancer and a Gaucher disease dataset. However, the settings for pre-processing
Standardized Access and Processing of Multi-Source Earth Observation Time-Series Data within a Regional Data Middleware

Science.gov (United States)

Eberle, J.; Schmullius, C.

2017-12-01

Increasing archives of global satellite data present a new challenge to handle multi-source satellite data in a user-friendly way. Any user is confronted with different data formats and data access services. In addition the handling of time-series data is complex as an automated processing and execution of data processing steps is needed to supply the user with the desired product for a specific area of interest. In order to simplify the access to data archives of various satellite missions and to facilitate the subsequent processing, a regional data and processing middleware has been developed. The aim of this system is to provide standardized and web-based interfaces to multi-source time-series data for individual regions on Earth. For further use and analysis uniform data formats and data access services are provided. Interfaces to data archives of the sensor MODIS (NASA) as well as the satellites Landsat (USGS) and Sentinel (ESA) have been integrated in the middleware. Various scientific algorithms, such as the calculation of trends and breakpoints of time-series data, can be carried out on the preprocessed data on the basis of uniform data management. Jupyter Notebooks are linked to the data and further processing can be conducted directly on the server using Python and the statistical language R. In addition to accessing EO data, the middleware is also used as an intermediary between the user and external databases (e.g., Flickr, YouTube). Standardized web services as specified by OGC are provided for all tools of the middleware. Currently, the use of cloud services is being researched to bring algorithms to the data. As a thematic example, an operational monitoring of vegetation phenology is being implemented on the basis of various optical satellite data and validation data from the German Weather Service. Other examples demonstrate the monitoring of wetlands focusing on automated discovery and access of Landsat and Sentinel data for local areas.
Impact of data transformation and preprocessing in supervised ...

African Journals Online (AJOL)

Impact of data transformation and preprocessing in supervised learning ... Nowadays, the ideas of integrating machine learning techniques in power system has ... The proposed algorithm used Python-based split train and k-fold model ...
Transition Icons for Time-Series Visualization and Exploratory Analysis.

Science.gov (United States)

Nickerson, Paul V; Baharloo, Raheleh; Wanigatunga, Amal A; Manini, Todd M; Tighe, Patrick J; Rashidi, Parisa

2018-03-01

The modern healthcare landscape has seen the rapid emergence of techniques and devices that temporally monitor and record physiological signals. The prevalence of time-series data within the healthcare field necessitates the development of methods that can analyze the data in order to draw meaningful conclusions. Time-series behavior is notoriously difficult to intuitively understand due to its intrinsic high-dimensionality, which is compounded in the case of analyzing groups of time series collected from different patients. Our framework, which we call transition icons, renders common patterns in a visual format useful for understanding the shared behavior within groups of time series. Transition icons are adept at detecting and displaying subtle differences and similarities, e.g., between measurements taken from patients receiving different treatment strategies or stratified by demographics. We introduce various methods that collectively allow for exploratory analysis of groups of time series, while being free of distribution assumptions and including simple heuristics for parameter determination. Our technique extracts discrete transition patterns from symbolic aggregate approXimation representations, and compiles transition frequencies into a bag of patterns constructed for each group. These transition frequencies are normalized and aligned in icon form to intuitively display the underlying patterns. We demonstrate the transition icon technique for two time-series datasets-postoperative pain scores, and hip-worn accelerometer activity counts. We believe transition icons can be an important tool for researchers approaching time-series data, as they give rich and intuitive information about collective time-series behaviors.
Multifractal analysis of visibility graph-based Ito-related connectivity time series.

Science.gov (United States)

Czechowski, Zbigniew; Lovallo, Michele; Telesca, Luciano

2016-02-01

In this study, we investigate multifractal properties of connectivity time series resulting from the visibility graph applied to normally distributed time series generated by the Ito equations with multiplicative power-law noise. We show that multifractality of the connectivity time series (i.e., the series of numbers of links outgoing any node) increases with the exponent of the power-law noise. The multifractality of the connectivity time series could be due to the width of connectivity degree distribution that can be related to the exit time of the associated Ito time series. Furthermore, the connectivity time series are characterized by persistence, although the original Ito time series are random; this is due to the procedure of visibility graph that, connecting the values of the time series, generates persistence but destroys most of the nonlinear correlations. Moreover, the visibility graph is sensitive for detecting wide "depressions" in input time series.
Mathematical foundations of time series analysis a concise introduction

CERN Document Server

Beran, Jan

2017-01-01

This book provides a concise introduction to the mathematical foundations of time series analysis, with an emphasis on mathematical clarity. The text is reduced to the essential logical core, mostly using the symbolic language of mathematics, thus enabling readers to very quickly grasp the essential reasoning behind time series analysis. It appeals to anybody wanting to understand time series in a precise, mathematical manner. It is suitable for graduate courses in time series analysis but is equally useful as a reference work for students and researchers alike.
Time series analysis in the social sciences the fundamentals

CERN Document Server

Shin, Youseop

2017-01-01

Times Series Analysis in the Social Sciences is a practical and highly readable introduction written exclusively for students and researchers whose mathematical background is limited to basic algebra. The book focuses on fundamental elements of time series analysis that social scientists need to understand so they can employ time series analysis for their research and practice. Through step-by-step explanations and using monthly violent crime rates as case studies, this book explains univariate time series from the preliminary visual analysis through the modeling of seasonality, trends, and re
Data imputation analysis for Cosmic Rays time series

Science.gov (United States)

Fernandes, R. C.; Lucio, P. S.; Fernandez, J. H.

2017-05-01

The occurrence of missing data concerning Galactic Cosmic Rays time series (GCR) is inevitable since loss of data is due to mechanical and human failure or technical problems and different periods of operation of GCR stations. The aim of this study was to perform multiple dataset imputation in order to depict the observational dataset. The study has used the monthly time series of GCR Climax (CLMX) and Roma (ROME) from 1960 to 2004 to simulate scenarios of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% and 90% of missing data compared to observed ROME series, with 50 replicates. Then, the CLMX station as a proxy for allocation of these scenarios was used. Three different methods for monthly dataset imputation were selected: AMÉLIA II - runs the bootstrap Expectation Maximization algorithm, MICE - runs an algorithm via Multivariate Imputation by Chained Equations and MTSDI - an Expectation Maximization algorithm-based method for imputation of missing values in multivariate normal time series. The synthetic time series compared with the observed ROME series has also been evaluated using several skill measures as such as RMSE, NRMSE, Agreement Index, R, R2, F-test and t-test. The results showed that for CLMX and ROME, the R2 and R statistics were equal to 0.98 and 0.96, respectively. It was observed that increases in the number of gaps generate loss of quality of the time series. Data imputation was more efficient with MTSDI method, with negligible errors and best skill coefficients. The results suggest a limit of about 60% of missing data for imputation, for monthly averages, no more than this. It is noteworthy that CLMX, ROME and KIEL stations present no missing data in the target period. This methodology allowed reconstructing 43 time series.

Preprocessing of A-scan GPR data based on energy features

Science.gov (United States)

Dogan, Mesut; Turhan-Sayan, Gonul

2016-05-01

There is an increasing demand for noninvasive real-time detection and classification of buried objects in various civil and military applications. The problem of detection and annihilation of landmines is particularly important due to strong safety concerns. The requirement for a fast real-time decision process is as important as the requirements for high detection rates and low false alarm rates. In this paper, we introduce and demonstrate a computationally simple, timeefficient, energy-based preprocessing approach that can be used in ground penetrating radar (GPR) applications to eliminate reflections from the air-ground boundary and to locate the buried objects, simultaneously, at one easy step. The instantaneous power signals, the total energy values and the cumulative energy curves are extracted from the A-scan GPR data. The cumulative energy curves, in particular, are shown to be useful to detect the presence and location of buried objects in a fast and simple way while preserving the spectral content of the original A-scan data for further steps of physics-based target classification. The proposed method is demonstrated using the GPR data collected at the facilities of IPA Defense, Ankara at outdoor test lanes. Cylindrically shaped plastic containers were buried in fine-medium sand to simulate buried landmines. These plastic containers were half-filled by ammonium nitrate including metal pins. Results of this pilot study are demonstrated to be highly promising to motivate further research for the use of energy-based preprocessing features in landmine detection problem.
Algorithm for Compressing Time-Series Data

Science.gov (United States)

Hawkins, S. Edward, III; Darlington, Edward Hugo

2012-01-01

An algorithm based on Chebyshev polynomials effects lossy compression of time-series data or other one-dimensional data streams (e.g., spectral data) that are arranged in blocks for sequential transmission. The algorithm was developed for use in transmitting data from spacecraft scientific instruments to Earth stations. In spite of its lossy nature, the algorithm preserves the information needed for scientific analysis. The algorithm is computationally simple, yet compresses data streams by factors much greater than two. The algorithm is not restricted to spacecraft or scientific uses: it is applicable to time-series data in general. The algorithm can also be applied to general multidimensional data that have been converted to time-series data, a typical example being image data acquired by raster scanning. However, unlike most prior image-data-compression algorithms, this algorithm neither depends on nor exploits the two-dimensional spatial correlations that are generally present in images. In order to understand the essence of this compression algorithm, it is necessary to understand that the net effect of this algorithm and the associated decompression algorithm is to approximate the original stream of data as a sequence of finite series of Chebyshev polynomials. For the purpose of this algorithm, a block of data or interval of time for which a Chebyshev polynomial series is fitted to the original data is denoted a fitting interval. Chebyshev approximation has two properties that make it particularly effective for compressing serial data streams with minimal loss of scientific information: The errors associated with a Chebyshev approximation are nearly uniformly distributed over the fitting interval (this is known in the art as the "equal error property"); and the maximum deviations of the fitted Chebyshev polynomial from the original data have the smallest possible values (this is known in the art as the "min-max property").
Modeling of Volatility with Non-linear Time Series Model

OpenAIRE

Kim Song Yon; Kim Mun Chol

2013-01-01

In this paper, non-linear time series models are used to describe volatility in financial time series data. To describe volatility, two of the non-linear time series are combined into form TAR (Threshold Auto-Regressive Model) with AARCH (Asymmetric Auto-Regressive Conditional Heteroskedasticity) error term and its parameter estimation is studied.
Layered Ensemble Architecture for Time Series Forecasting.

Science.gov (United States)

Rahman, Md Mustafizur; Islam, Md Monirul; Murase, Kazuyuki; Yao, Xin

2016-01-01

Time series forecasting (TSF) has been widely used in many application areas such as science, engineering, and finance. The phenomena generating time series are usually unknown and information available for forecasting is only limited to the past values of the series. It is, therefore, necessary to use an appropriate number of past values, termed lag, for forecasting. This paper proposes a layered ensemble architecture (LEA) for TSF problems. Our LEA consists of two layers, each of which uses an ensemble of multilayer perceptron (MLP) networks. While the first ensemble layer tries to find an appropriate lag, the second ensemble layer employs the obtained lag for forecasting. Unlike most previous work on TSF, the proposed architecture considers both accuracy and diversity of the individual networks in constructing an ensemble. LEA trains different networks in the ensemble by using different training sets with an aim of maintaining diversity among the networks. However, it uses the appropriate lag and combines the best trained networks to construct the ensemble. This indicates LEAs emphasis on accuracy of the networks. The proposed architecture has been tested extensively on time series data of neural network (NN)3 and NN5 competitions. It has also been tested on several standard benchmark time series data. In terms of forecasting accuracy, our experimental results have revealed clearly that LEA is better than other ensemble and nonensemble methods.
A simpler method of preprocessing MALDI-TOF MS data for differential biomarker analysis: stem cell and melanoma cancer studies

Directory of Open Access Journals (Sweden)

Tong Dong L

2011-09-01

Full Text Available Abstract Introduction Raw spectral data from matrix-assisted laser desorption/ionisation time-of-flight (MALDI-TOF with MS profiling techniques usually contains complex information not readily providing biological insight into disease. The association of identified features within raw data to a known peptide is extremely difficult. Data preprocessing to remove uncertainty characteristics in the data is normally required before performing any further analysis. This study proposes an alternative yet simple solution to preprocess raw MALDI-TOF-MS data for identification of candidate marker ions. Two in-house MALDI-TOF-MS data sets from two different sample sources (melanoma serum and cord blood plasma are used in our study. Method Raw MS spectral profiles were preprocessed using the proposed approach to identify peak regions in the spectra. The preprocessed data was then analysed using bespoke machine learning algorithms for data reduction and ion selection. Using the selected ions, an ANN-based predictive model was constructed to examine the predictive power of these ions for classification. Results Our model identified 10 candidate marker ions for both data sets. These ion panels achieved over 90% classification accuracy on blind validation data. Receiver operating characteristics analysis was performed and the area under the curve for melanoma and cord blood classifiers was 0.991 and 0.986, respectively. Conclusion The results suggest that our data preprocessing technique removes unwanted characteristics of the raw data, while preserving the predictive components of the data. Ion identification analysis can be carried out using MALDI-TOF-MS data with the proposed data preprocessing technique coupled with bespoke algorithms for data reduction and ion selection.
The Influence of Preprocessing Steps on Graph Theory Measures Derived from Resting State fMRI.

Science.gov (United States)

Gargouri, Fatma; Kallel, Fathi; Delphine, Sebastien; Ben Hamida, Ahmed; Lehéricy, Stéphane; Valabregue, Romain

2018-01-01

Resting state functional MRI (rs-fMRI) is an imaging technique that allows the spontaneous activity of the brain to be measured. Measures of functional connectivity highly depend on the quality of the BOLD signal data processing. In this study, our aim was to study the influence of preprocessing steps and their order of application on small-world topology and their efficiency in resting state fMRI data analysis using graph theory. We applied the most standard preprocessing steps: slice-timing, realign, smoothing, filtering, and the tCompCor method. In particular, we were interested in how preprocessing can retain the small-world economic properties and how to maximize the local and global efficiency of a network while minimizing the cost. Tests that we conducted in 54 healthy subjects showed that the choice and ordering of preprocessing steps impacted the graph measures. We found that the csr (where we applied realignment, smoothing, and tCompCor as a final step) and the scr (where we applied realignment, tCompCor and smoothing as a final step) strategies had the highest mean values of global efficiency (eg) . Furthermore, we found that the fscr strategy (where we applied realignment, tCompCor, smoothing, and filtering as a final step), had the highest mean local efficiency (el) values. These results confirm that the graph theory measures of functional connectivity depend on the ordering of the processing steps, with the best results being obtained using smoothing and tCompCor as the final steps for global efficiency with additional filtering for local efficiency.
The Influence of Preprocessing Steps on Graph Theory Measures Derived from Resting State fMRI

Directory of Open Access Journals (Sweden)

Fatma Gargouri

2018-02-01

Full Text Available Resting state functional MRI (rs-fMRI is an imaging technique that allows the spontaneous activity of the brain to be measured. Measures of functional connectivity highly depend on the quality of the BOLD signal data processing. In this study, our aim was to study the influence of preprocessing steps and their order of application on small-world topology and their efficiency in resting state fMRI data analysis using graph theory. We applied the most standard preprocessing steps: slice-timing, realign, smoothing, filtering, and the tCompCor method. In particular, we were interested in how preprocessing can retain the small-world economic properties and how to maximize the local and global efficiency of a network while minimizing the cost. Tests that we conducted in 54 healthy subjects showed that the choice and ordering of preprocessing steps impacted the graph measures. We found that the csr (where we applied realignment, smoothing, and tCompCor as a final step and the scr (where we applied realignment, tCompCor and smoothing as a final step strategies had the highest mean values of global efficiency (eg. Furthermore, we found that the fscr strategy (where we applied realignment, tCompCor, smoothing, and filtering as a final step, had the highest mean local efficiency (el values. These results confirm that the graph theory measures of functional connectivity depend on the ordering of the processing steps, with the best results being obtained using smoothing and tCompCor as the final steps for global efficiency with additional filtering for local efficiency.
The Influence of Preprocessing Steps on Graph Theory Measures Derived from Resting State fMRI

Science.gov (United States)

Gargouri, Fatma; Kallel, Fathi; Delphine, Sebastien; Ben Hamida, Ahmed; Lehéricy, Stéphane; Valabregue, Romain

2018-01-01

Resting state functional MRI (rs-fMRI) is an imaging technique that allows the spontaneous activity of the brain to be measured. Measures of functional connectivity highly depend on the quality of the BOLD signal data processing. In this study, our aim was to study the influence of preprocessing steps and their order of application on small-world topology and their efficiency in resting state fMRI data analysis using graph theory. We applied the most standard preprocessing steps: slice-timing, realign, smoothing, filtering, and the tCompCor method. In particular, we were interested in how preprocessing can retain the small-world economic properties and how to maximize the local and global efficiency of a network while minimizing the cost. Tests that we conducted in 54 healthy subjects showed that the choice and ordering of preprocessing steps impacted the graph measures. We found that the csr (where we applied realignment, smoothing, and tCompCor as a final step) and the scr (where we applied realignment, tCompCor and smoothing as a final step) strategies had the highest mean values of global efficiency (eg). Furthermore, we found that the fscr strategy (where we applied realignment, tCompCor, smoothing, and filtering as a final step), had the highest mean local efficiency (el) values. These results confirm that the graph theory measures of functional connectivity depend on the ordering of the processing steps, with the best results being obtained using smoothing and tCompCor as the final steps for global efficiency with additional filtering for local efficiency. PMID:29497372
Data pre-processing for web log mining: Case study of commercial bank website usage analysis

Directory of Open Access Journals (Sweden)

Jozef Kapusta

2013-01-01

Full Text Available We use data cleaning, integration, reduction and data conversion methods in the pre-processing level of data analysis. Data processing techniques improve the overall quality of the patterns mined. The paper describes using of standard pre-processing methods for preparing data of the commercial bank website in the form of the log file obtained from the web server. Data cleaning, as the simplest step of data pre-processing, is non–trivial as the analysed content is highly specific. We had to deal with the problem of frequent changes of the content and even frequent changes of the structure. Regular changes in the structure make use of the sitemap impossible. We presented approaches how to deal with this problem. We were able to create the sitemap dynamically just based on the content of the log file. In this case study, we also examined just the one part of the website over the standard analysis of an entire website, as we did not have access to all log files for the security reason. As the result, the traditional practices had to be adapted for this special case. Analysing just the small fraction of the website resulted in the short session time of regular visitors. We were not able to use recommended methods to determine the optimal value of session time. Therefore, we proposed new methods based on outliers identification for raising the accuracy of the session length in this paper.
A window-based time series feature extraction method.

Science.gov (United States)

Katircioglu-Öztürk, Deniz; Güvenir, H Altay; Ravens, Ursula; Baykal, Nazife

2017-10-01

This study proposes a robust similarity score-based time series feature extraction method that is termed as Window-based Time series Feature ExtraCtion (WTC). Specifically, WTC generates domain-interpretable results and involves significantly low computational complexity thereby rendering itself useful for densely sampled and populated time series datasets. In this study, WTC is applied to a proprietary action potential (AP) time series dataset on human cardiomyocytes and three precordial leads from a publicly available electrocardiogram (ECG) dataset. This is followed by comparing WTC in terms of predictive accuracy and computational complexity with shapelet transform and fast shapelet transform (which constitutes an accelerated variant of the shapelet transform). The results indicate that WTC achieves a slightly higher classification performance with significantly lower execution time when compared to its shapelet-based alternatives. With respect to its interpretable features, WTC has a potential to enable medical experts to explore definitive common trends in novel datasets. Copyright © 2017 Elsevier Ltd. All rights reserved.
Prewhitening of hydroclimatic time series? Implications for inferred change and variability across time scales

Science.gov (United States)

Razavi, Saman; Vogel, Richard

2018-02-01

Prewhitening, the process of eliminating or reducing short-term stochastic persistence to enable detection of deterministic change, has been extensively applied to time series analysis of a range of geophysical variables. Despite the controversy around its utility, methodologies for prewhitening time series continue to be a critical feature of a variety of analyses including: trend detection of hydroclimatic variables and reconstruction of climate and/or hydrology through proxy records such as tree rings. With a focus on the latter, this paper presents a generalized approach to exploring the impact of a wide range of stochastic structures of short- and long-term persistence on the variability of hydroclimatic time series. Through this approach, we examine the impact of prewhitening on the inferred variability of time series across time scales. We document how a focus on prewhitened, residual time series can be misleading, as it can drastically distort (or remove) the structure of variability across time scales. Through examples with actual data, we show how such loss of information in prewhitened time series of tree rings (so-called "residual chronologies") can lead to the underestimation of extreme conditions in climate and hydrology, particularly droughts, reconstructed for centuries preceding the historical period.
Thinning: A Preprocessing Technique for an OCR System for the Brahmi Script

Directory of Open Access Journals (Sweden)

H. K. Anasuya Devi

2006-12-01

Full Text Available In this paper we study the methodology employed for preprocessing the archaeological images. We present the various algorithms used in the low level processing stage of image analysis for Optical Character Recognition System for Brahmi Script. The image preprocessing technique covered in this paper include Thinning method. We also try to analyze the results obtained by the pixel-level processing algorithms.
The scheme and research of TV series multidimensional comprehensive evaluation on cross-platform

Science.gov (United States)

Chai, Jianping; Bai, Xuesong; Zhou, Hongjun; Yin, Fulian

2016-10-01

As for shortcomings of the comprehensive evaluation system on traditional TV programs such as single data source, ignorance of new media as well as the high time cost and difficulty of making surveys, a new evaluation of TV series is proposed in this paper, which has a perspective in cross-platform multidimensional evaluation after broadcasting. This scheme considers the data directly collected from cable television and the Internet as research objects. It's based on TOPSIS principle, after preprocessing and calculation of the data, they become primary indicators that reflect different profiles of the viewing of TV series. Then after the process of reasonable empowerment and summation by the six methods(PCA, AHP, etc.), the primary indicators form the composite indices on different channels or websites. The scheme avoids the inefficiency and difficulty of survey and marking; At the same time, it not only reflects different dimensions of viewing, but also combines TV media and new media, completing the multidimensional comprehensive evaluation of TV series on cross-platform.
DTW-APPROACH FOR UNCORRELATED MULTIVARIATE TIME SERIES IMPUTATION

OpenAIRE

Phan , Thi-Thu-Hong; Poisson Caillault , Emilie; Bigand , André; Lefebvre , Alain

2017-01-01

International audience; Missing data are inevitable in almost domains of applied sciences. Data analysis with missing values can lead to a loss of efficiency and unreliable results, especially for large missing sub-sequence(s). Some well-known methods for multivariate time series imputation require high correlations between series or their features. In this paper , we propose an approach based on the shape-behaviour relation in low/un-correlated multivariate time series under an assumption of...
Variable Selection in Time Series Forecasting Using Random Forests

Directory of Open Access Journals (Sweden)

Hristos Tyralis

2017-10-01

Full Text Available Time series forecasting using machine learning algorithms has gained popularity recently. Random forest is a machine learning algorithm implemented in time series forecasting; however, most of its forecasting properties have remained unexplored. Here we focus on assessing the performance of random forests in one-step forecasting using two large datasets of short time series with the aim to suggest an optimal set of predictor variables. Furthermore, we compare its performance to benchmarking methods. The first dataset is composed by 16,000 simulated time series from a variety of Autoregressive Fractionally Integrated Moving Average (ARFIMA models. The second dataset consists of 135 mean annual temperature time series. The highest predictive performance of RF is observed when using a low number of recent lagged predictor variables. This outcome could be useful in relevant future applications, with the prospect to achieve higher predictive accuracy.
Trend time-series modeling and forecasting with neural networks.

Science.gov (United States)

Qi, Min; Zhang, G Peter

2008-05-01

Despite its great importance, there has been no general consensus on how to model the trends in time-series data. Compared to traditional approaches, neural networks (NNs) have shown some promise in time-series forecasting. This paper investigates how to best model trend time series using NNs. Four different strategies (raw data, raw data with time index, detrending, and differencing) are used to model various trend patterns (linear, nonlinear, deterministic, stochastic, and breaking trend). We find that with NNs differencing often gives meritorious results regardless of the underlying data generating processes (DGPs). This finding is also confirmed by the real gross national product (GNP) series.
Segmentation of Nonstationary Time Series with Geometric Clustering

DEFF Research Database (Denmark)

Bocharov, Alexei; Thiesson, Bo

2013-01-01

We introduce a non-parametric method for segmentation in regimeswitching time-series models. The approach is based on spectral clustering of target-regressor tuples and derives a switching regression tree, where regime switches are modeled by oblique splits. Such models can be learned efficiently...... from data, where clustering is used to propose one single split candidate at each split level. We use the class of ART time series models to serve as illustration, but because of the non-parametric nature of our segmentation approach, it readily generalizes to a wide range of time-series models that go...
Non-parametric characterization of long-term rainfall time series

Science.gov (United States)

Tiwari, Harinarayan; Pandey, Brij Kishor

2018-03-01

The statistical study of rainfall time series is one of the approaches for efficient hydrological system design. Identifying, and characterizing long-term rainfall time series could aid in improving hydrological systems forecasting. In the present study, eventual statistics was applied for the long-term (1851-2006) rainfall time series under seven meteorological regions of India. Linear trend analysis was carried out using Mann-Kendall test for the observed rainfall series. The observed trend using the above-mentioned approach has been ascertained using the innovative trend analysis method. Innovative trend analysis has been found to be a strong tool to detect the general trend of rainfall time series. Sequential Mann-Kendall test has also been carried out to examine nonlinear trends of the series. The partial sum of cumulative deviation test is also found to be suitable to detect the nonlinear trend. Innovative trend analysis, sequential Mann-Kendall test and partial cumulative deviation test have potential to detect the general as well as nonlinear trend for the rainfall time series. Annual rainfall analysis suggests that the maximum changes in mean rainfall is 11.53% for West Peninsular India, whereas the maximum fall in mean rainfall is 7.8% for the North Mountainous Indian region. The innovative trend analysis method is also capable of finding the number of change point available in the time series. Additionally, we have performed von Neumann ratio test and cumulative deviation test to estimate the departure from homogeneity. Singular spectrum analysis has been applied in this study to evaluate the order of departure from homogeneity in the rainfall time series. Monsoon season (JS) of North Mountainous India and West Peninsular India zones has higher departure from homogeneity and singular spectrum analysis shows the results to be in coherence with the same.
Time Series Decomposition into Oscillation Components and Phase Estimation.

Science.gov (United States)

Matsuda, Takeru; Komaki, Fumiyasu

2017-02-01

Many time series are naturally considered as a superposition of several oscillation components. For example, electroencephalogram (EEG) time series include oscillation components such as alpha, beta, and gamma. We propose a method for decomposing time series into such oscillation components using state-space models. Based on the concept of random frequency modulation, gaussian linear state-space models for oscillation components are developed. In this model, the frequency of an oscillator fluctuates by noise. Time series decomposition is accomplished by this model like the Bayesian seasonal adjustment method. Since the model parameters are estimated from data by the empirical Bayes' method, the amplitudes and the frequencies of oscillation components are determined in a data-driven manner. Also, the appropriate number of oscillation components is determined with the Akaike information criterion (AIC). In this way, the proposed method provides a natural decomposition of the given time series into oscillation components. In neuroscience, the phase of neural time series plays an important role in neural information processing. The proposed method can be used to estimate the phase of each oscillation component and has several advantages over a conventional method based on the Hilbert transform. Thus, the proposed method enables an investigation of the phase dynamics of time series. Numerical results show that the proposed method succeeds in extracting intermittent oscillations like ripples and detecting the phase reset phenomena. We apply the proposed method to real data from various fields such as astronomy, ecology, tidology, and neuroscience.
Introduction to time series analysis and forecasting

CERN Document Server

Montgomery, Douglas C; Kulahci, Murat

2015-01-01

Praise for the First Edition ""…[t]he book is great for readers who need to apply the methods and models presented but have little background in mathematics and statistics."" -MAA Reviews Thoroughly updated throughout, Introduction to Time Series Analysis and Forecasting, Second Edition presents the underlying theories of time series analysis that are needed to analyze time-oriented data and construct real-world short- to medium-term statistical forecasts. Authored by highly-experienced academics and professionals in engineering statistics, the Second Edition features discussions on both

Multi-Scale Dissemination of Time Series Data

DEFF Research Database (Denmark)

Guo, Qingsong; Zhou, Yongluan; Su, Li

2013-01-01

In this paper, we consider the problem of continuous dissemination of time series data, such as sensor measurements, to a large number of subscribers. These subscribers fall into multiple subscription levels, where each subscription level is specified by the bandwidth constraint of a subscriber......, which is an abstract indicator for both the physical limits and the amount of data that the subscriber would like to handle. To handle this problem, we propose a system framework for multi-scale time series data dissemination that employs a typical tree-based dissemination network and existing time...
PRACTICAL RECOMMENDATIONS OF DATA PREPROCESSING AND GEOSPATIAL MEASURES FOR OPTIMIZING THE NEUROLOGICAL AND OTHER PEDIATRIC EMERGENCIES MANAGEMENT

Directory of Open Access Journals (Sweden)

Ionela MANIU

2017-08-01

Full Text Available Time management, optimal and timed determination of emergency severity as well as optimizing the use of available human and material resources are crucial areas of emergency services. A starting point for achieving these optimizations can be considered the analysis and preprocess of real data from the emergency services. The benefits of performing this method consist in exposing more useful structures to data modelling algorithms which consequently will reduce overfitting and improves accuracy. This paper aims to offer practical recommendations for data preprocessing measures including feature selection and discretization of numeric attributes regarding age, duration of the case, season, period, week period (workday, weekend and geospatial location of neurological and other pediatric emergencies. An analytical, retrospective study was conducted on a sample consisting of 933 pediatric cases, from UPU-SMURD Sibiu, 01.01.2014 – 27.02.2017 period.
RADON CONCENTRATION TIME SERIES MODELING AND APPLICATION DISCUSSION.

Science.gov (United States)

Stránský, V; Thinová, L

2017-11-01

In the year 2010 a continual radon measurement was established at Mladeč Caves in the Czech Republic using a continual radon monitor RADIM3A. In order to model radon time series in the years 2010-15, the Box-Jenkins Methodology, often used in econometrics, was applied. Because of the behavior of radon concentrations (RCs), a seasonal integrated, autoregressive moving averages model with exogenous variables (SARIMAX) has been chosen to model the measured time series. This model uses the time series seasonality, previously acquired values and delayed atmospheric parameters, to forecast RC. The developed model for RC time series is called regARIMA(5,1,3). Model residuals could be retrospectively compared with seismic evidence of local or global earthquakes, which occurred during the RCs measurement. This technique enables us to asses if continuously measured RC could serve an earthquake precursor. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Similarity estimators for irregular and age uncertain time series

Science.gov (United States)

Rehfeld, K.; Kurths, J.

2013-09-01

Paleoclimate time series are often irregularly sampled and age uncertain, which is an important technical challenge to overcome for successful reconstruction of past climate variability and dynamics. Visual comparison and interpolation-based linear correlation approaches have been used to infer dependencies from such proxy time series. While the first is subjective, not measurable and not suitable for the comparison of many datasets at a time, the latter introduces interpolation bias, and both face difficulties if the underlying dependencies are nonlinear. In this paper we investigate similarity estimators that could be suitable for the quantitative investigation of dependencies in irregular and age uncertain time series. We compare the Gaussian-kernel based cross correlation (gXCF, Rehfeld et al., 2011) and mutual information (gMI, Rehfeld et al., 2013) against their interpolation-based counterparts and the new event synchronization function (ESF). We test the efficiency of the methods in estimating coupling strength and coupling lag numerically, using ensembles of synthetic stalagmites with short, autocorrelated, linear and nonlinearly coupled proxy time series, and in the application to real stalagmite time series. In the linear test case coupling strength increases are identified consistently for all estimators, while in the nonlinear test case the correlation-based approaches fail. The lag at which the time series are coupled is identified correctly as the maximum of the similarity functions in around 60-55% (in the linear case) to 53-42% (for the nonlinear processes) of the cases when the dating of the synthetic stalagmite is perfectly precise. If the age uncertainty increases beyond 5% of the time series length, however, the true coupling lag is not identified more often than the others for which the similarity function was estimated. Age uncertainty contributes up to half of the uncertainty in the similarity estimation process. Time series irregularity
Similarity estimators for irregular and age-uncertain time series

Science.gov (United States)

Rehfeld, K.; Kurths, J.

2014-01-01

Paleoclimate time series are often irregularly sampled and age uncertain, which is an important technical challenge to overcome for successful reconstruction of past climate variability and dynamics. Visual comparison and interpolation-based linear correlation approaches have been used to infer dependencies from such proxy time series. While the first is subjective, not measurable and not suitable for the comparison of many data sets at a time, the latter introduces interpolation bias, and both face difficulties if the underlying dependencies are nonlinear. In this paper we investigate similarity estimators that could be suitable for the quantitative investigation of dependencies in irregular and age-uncertain time series. We compare the Gaussian-kernel-based cross-correlation (gXCF, Rehfeld et al., 2011) and mutual information (gMI, Rehfeld et al., 2013) against their interpolation-based counterparts and the new event synchronization function (ESF). We test the efficiency of the methods in estimating coupling strength and coupling lag numerically, using ensembles of synthetic stalagmites with short, autocorrelated, linear and nonlinearly coupled proxy time series, and in the application to real stalagmite time series. In the linear test case, coupling strength increases are identified consistently for all estimators, while in the nonlinear test case the correlation-based approaches fail. The lag at which the time series are coupled is identified correctly as the maximum of the similarity functions in around 60-55% (in the linear case) to 53-42% (for the nonlinear processes) of the cases when the dating of the synthetic stalagmite is perfectly precise. If the age uncertainty increases beyond 5% of the time series length, however, the true coupling lag is not identified more often than the others for which the similarity function was estimated. Age uncertainty contributes up to half of the uncertainty in the similarity estimation process. Time series irregularity
Robust Forecasting of Non-Stationary Time Series

OpenAIRE

Croux, C.; Fried, R.; Gijbels, I.; Mahieu, K.

2010-01-01

This paper proposes a robust forecasting method for non-stationary time series. The time series is modelled using non-parametric heteroscedastic regression, and fitted by a localized MM-estimator, combining high robustness and large efficiency. The proposed method is shown to produce reliable forecasts in the presence of outliers, non-linearity, and heteroscedasticity. In the absence of outliers, the forecasts are only slightly less precise than those based on a localized Least Squares estima...
Time Series Econometrics for the 21st Century

Science.gov (United States)

Hansen, Bruce E.

2017-01-01

The field of econometrics largely started with time series analysis because many early datasets were time-series macroeconomic data. As the field developed, more cross-sectional and longitudinal datasets were collected, which today dominate the majority of academic empirical research. In nonacademic (private sector, central bank, and governmental)…
Effectiveness of firefly algorithm based neural network in time series ...

African Journals Online (AJOL)

Effectiveness of firefly algorithm based neural network in time series forecasting. ... In the experiments, three well known time series were used to evaluate the performance. Results obtained were compared with ... Keywords: Time series, Artificial Neural Network, Firefly Algorithm, Particle Swarm Optimization, Overfitting ...
Time Series Analysis of Insar Data: Methods and Trends

Science.gov (United States)

Osmanoglu, Batuhan; Sunar, Filiz; Wdowinski, Shimon; Cano-Cabral, Enrique

2015-01-01

Time series analysis of InSAR data has emerged as an important tool for monitoring and measuring the displacement of the Earth's surface. Changes in the Earth's surface can result from a wide range of phenomena such as earthquakes, volcanoes, landslides, variations in ground water levels, and changes in wetland water levels. Time series analysis is applied to interferometric phase measurements, which wrap around when the observed motion is larger than one-half of the radar wavelength. Thus, the spatio-temporal ''unwrapping" of phase observations is necessary to obtain physically meaningful results. Several different algorithms have been developed for time series analysis of InSAR data to solve for this ambiguity. These algorithms may employ different models for time series analysis, but they all generate a first-order deformation rate, which can be compared to each other. However, there is no single algorithm that can provide optimal results in all cases. Since time series analyses of InSAR data are used in a variety of applications with different characteristics, each algorithm possesses inherently unique strengths and weaknesses. In this review article, following a brief overview of InSAR technology, we discuss several algorithms developed for time series analysis of InSAR data using an example set of results for measuring subsidence rates in Mexico City.
TargetSearch--a Bioconductor package for the efficient preprocessing of GC-MS metabolite profiling data.

Science.gov (United States)

Cuadros-Inostroza, Alvaro; Caldana, Camila; Redestig, Henning; Kusano, Miyako; Lisec, Jan; Peña-Cortés, Hugo; Willmitzer, Lothar; Hannah, Matthew A

2009-12-16

Metabolite profiling, the simultaneous quantification of multiple metabolites in an experiment, is becoming increasingly popular, particularly with the rise of systems-level biology. The workhorse in this field is gas-chromatography hyphenated with mass spectrometry (GC-MS). The high-throughput of this technology coupled with a demand for large experiments has led to data pre-processing, i.e. the quantification of metabolites across samples, becoming a major bottleneck. Existing software has several limitations, including restricted maximum sample size, systematic errors and low flexibility. However, the biggest limitation is that the resulting data usually require extensive hand-curation, which is subjective and can typically take several days to weeks. We introduce the TargetSearch package, an open source tool which is a flexible and accurate method for pre-processing even very large numbers of GC-MS samples within hours. We developed a novel strategy to iteratively correct and update retention time indices for searching and identifying metabolites. The package is written in the R programming language with computationally intensive functions written in C for speed and performance. The package includes a graphical user interface to allow easy use by those unfamiliar with R. TargetSearch allows fast and accurate data pre-processing for GC-MS experiments and overcomes the sample number limitations and manual curation requirements of existing software. We validate our method by carrying out an analysis against both a set of known chemical standard mixtures and of a biological experiment. In addition we demonstrate its capabilities and speed by comparing it with other GC-MS pre-processing tools. We believe this package will greatly ease current bottlenecks and facilitate the analysis of metabolic profiling data.
Interpretation of a compositional time series

Science.gov (United States)

Tolosana-Delgado, R.; van den Boogaart, K. G.

2012-04-01

Common methods for multivariate time series analysis use linear operations, from the definition of a time-lagged covariance/correlation to the prediction of new outcomes. However, when the time series response is a composition (a vector of positive components showing the relative importance of a set of parts in a total, like percentages and proportions), then linear operations are afflicted of several problems. For instance, it has been long recognised that (auto/cross-)correlations between raw percentages are spurious, more dependent on which other components are being considered than on any natural link between the components of interest. Also, a long-term forecast of a composition in models with a linear trend will ultimately predict negative components. In general terms, compositional data should not be treated in a raw scale, but after a log-ratio transformation (Aitchison, 1986: The statistical analysis of compositional data. Chapman and Hill). This is so because the information conveyed by a compositional data is relative, as stated in their definition. The principle of working in coordinates allows to apply any sort of multivariate analysis to a log-ratio transformed composition, as long as this transformation is invertible. This principle is of full application to time series analysis. We will discuss how results (both auto/cross-correlation functions and predictions) can be back-transformed, viewed and interpreted in a meaningful way. One view is to use the exhaustive set of all possible pairwise log-ratios, which allows to express the results into D(D - 1)/2 separate, interpretable sets of one-dimensional models showing the behaviour of each possible pairwise log-ratios. Another view is the interpretation of estimated coefficients or correlations back-transformed in terms of compositions. These two views are compatible and complementary. These issues are illustrated with time series of seasonal precipitation patterns at different rain gauges of the USA
Capturing Structure Implicitly from Time-Series having Limited Data

OpenAIRE

Emaasit, Daniel; Johnson, Matthew

2018-01-01

Scientific fields such as insider-threat detection and highway-safety planning often lack sufficient amounts of time-series data to estimate statistical models for the purpose of scientific discovery. Moreover, the available limited data are quite noisy. This presents a major challenge when estimating time-series models that are robust to overfitting and have well-calibrated uncertainty estimates. Most of the current literature in these fields involve visualizing the time-series for noticeabl...
Pre-processing by data augmentation for improved ellipse fitting.

Science.gov (United States)

Kumar, Pankaj; Belchamber, Erika R; Miklavcic, Stanley J

2018-01-01

Ellipse fitting is a highly researched and mature topic. Surprisingly, however, no existing method has thus far considered the data point eccentricity in its ellipse fitting procedure. Here, we introduce the concept of eccentricity of a data point, in analogy with the idea of ellipse eccentricity. We then show empirically that, irrespective of ellipse fitting method used, the root mean square error (RMSE) of a fit increases with the eccentricity of the data point set. The main contribution of the paper is based on the hypothesis that if the data point set were pre-processed to strategically add additional data points in regions of high eccentricity, then the quality of a fit could be improved. Conditional validity of this hypothesis is demonstrated mathematically using a model scenario. Based on this confirmation we propose an algorithm that pre-processes the data so that data points with high eccentricity are replicated. The improvement of ellipse fitting is then demonstrated empirically in real-world application of 3D reconstruction of a plant root system for phenotypic analysis. The degree of improvement for different underlying ellipse fitting methods as a function of data noise level is also analysed. We show that almost every method tested, irrespective of whether it minimizes algebraic error or geometric error, shows improvement in the fit following data augmentation using the proposed pre-processing algorithm.
A Stereo Music Preprocessing Scheme for Cochlear Implant Users.

Science.gov (United States)

Buyens, Wim; van Dijk, Bas; Wouters, Jan; Moonen, Marc

2015-10-01

Listening to music is still one of the more challenging aspects of using a cochlear implant (CI) for most users. Simple musical structures, a clear rhythm/beat, and lyrics that are easy to follow are among the top factors contributing to music appreciation for CI users. Modifying the audio mix of complex music potentially improves music enjoyment in CI users. A stereo music preprocessing scheme is described in which vocals, drums, and bass are emphasized based on the representation of the harmonic and the percussive components in the input spectrogram, combined with the spatial allocation of instruments in typical stereo recordings. The scheme is assessed with postlingually deafened CI subjects (N = 7) using pop/rock music excerpts with different complexity levels. The scheme is capable of modifying relative instrument level settings, with the aim of improving music appreciation in CI users, and allows individual preference adjustments. The assessment with CI subjects confirms the preference for more emphasis on vocals, drums, and bass as offered by the preprocessing scheme, especially for songs with higher complexity. The stereo music preprocessing scheme has the potential to improve music enjoyment in CI users by modifying the audio mix in widespread (stereo) music recordings. Since music enjoyment in CI users is generally poor, this scheme can assist the music listening experience of CI users as a training or rehabilitation tool.
Self-affinity in the dengue fever time series

Science.gov (United States)

Azevedo, S. M.; Saba, H.; Miranda, J. G. V.; Filho, A. S. Nascimento; Moret, M. A.

2016-06-01

Dengue is a complex public health problem that is common in tropical and subtropical regions. This disease has risen substantially in the last three decades, and the physical symptoms depict the self-affine behavior of the occurrences of reported dengue cases in Bahia, Brazil. This study uses detrended fluctuation analysis (DFA) to verify the scale behavior in a time series of dengue cases and to evaluate the long-range correlations that are characterized by the power law α exponent for different cities in Bahia, Brazil. The scaling exponent (α) presents different long-range correlations, i.e. uncorrelated, anti-persistent, persistent and diffusive behaviors. The long-range correlations highlight the complex behavior of the time series of this disease. The findings show that there are two distinct types of scale behavior. In the first behavior, the time series presents a persistent α exponent for a one-month period. For large periods, the time series signal approaches subdiffusive behavior. The hypothesis of the long-range correlations in the time series of the occurrences of reported dengue cases was validated. The observed self-affinity is useful as a forecasting tool for future periods through extrapolation of the α exponent behavior. This complex system has a higher predictability in a relatively short time (approximately one month), and it suggests a new tool in epidemiological control strategies. However, predictions for large periods using DFA are hidden by the subdiffusive behavior.
On the plurality of times: disunified time and the A-series | Nefdt ...

African Journals Online (AJOL)

Then, I attempt to show that disunified time is a problem for a semantics based on the A-series since A-truthmakers are hard to come by in a universe of temporally disconnected time-series. Finally, I provide a novel argument showing that presentists should be particularly fearful of such a universe. South African Journal of ...
Time-series modeling of long-term weight self-monitoring data.

Science.gov (United States)

Helander, Elina; Pavel, Misha; Jimison, Holly; Korhonen, Ilkka

2015-08-01

Long-term self-monitoring of weight is beneficial for weight maintenance, especially after weight loss. Connected weight scales accumulate time series information over long term and hence enable time series analysis of the data. The analysis can reveal individual patterns, provide more sensitive detection of significant weight trends, and enable more accurate and timely prediction of weight outcomes. However, long term self-weighing data has several challenges which complicate the analysis. Especially, irregular sampling, missing data, and existence of periodic (e.g. diurnal and weekly) patterns are common. In this study, we apply time series modeling approach on daily weight time series from two individuals and describe information that can be extracted from this kind of data. We study the properties of weight time series data, missing data and its link to individuals behavior, periodic patterns and weight series segmentation. Being able to understand behavior through weight data and give relevant feedback is desired to lead to positive intervention on health behaviors.
Time series prediction of apple scab using meteorological ...

African Journals Online (AJOL)

A new prediction model for the early warning of apple scab is proposed in this study. The method is based on artificial intelligence and time series prediction. The infection period of apple scab was evaluated as the time series prediction model instead of summation of wetness duration. Also, the relations of different ...
Characterization of time series via Rényi complexity-entropy curves

Science.gov (United States)

Jauregui, M.; Zunino, L.; Lenzi, E. K.; Mendes, R. S.; Ribeiro, H. V.

2018-05-01

One of the most useful tools for distinguishing between chaotic and stochastic time series is the so-called complexity-entropy causality plane. This diagram involves two complexity measures: the Shannon entropy and the statistical complexity. Recently, this idea has been generalized by considering the Tsallis monoparametric generalization of the Shannon entropy, yielding complexity-entropy curves. These curves have proven to enhance the discrimination among different time series related to stochastic and chaotic processes of numerical and experimental nature. Here we further explore these complexity-entropy curves in the context of the Rényi entropy, which is another monoparametric generalization of the Shannon entropy. By combining the Rényi entropy with the proper generalization of the statistical complexity, we associate a parametric curve (the Rényi complexity-entropy curve) with a given time series. We explore this approach in a series of numerical and experimental applications, demonstrating the usefulness of this new technique for time series analysis. We show that the Rényi complexity-entropy curves enable the differentiation among time series of chaotic, stochastic, and periodic nature. In particular, time series of stochastic nature are associated with curves displaying positive curvature in a neighborhood of their initial points, whereas curves related to chaotic phenomena have a negative curvature; finally, periodic time series are represented by vertical straight lines.
Quantifying Selection with Pool-Seq Time Series Data.

Science.gov (United States)

Taus, Thomas; Futschik, Andreas; Schlötterer, Christian

2017-11-01

Allele frequency time series data constitute a powerful resource for unraveling mechanisms of adaptation, because the temporal dimension captures important information about evolutionary forces. In particular, Evolve and Resequence (E&R), the whole-genome sequencing of replicated experimentally evolving populations, is becoming increasingly popular. Based on computer simulations several studies proposed experimental parameters to optimize the identification of the selection targets. No such recommendations are available for the underlying parameters selection strength and dominance. Here, we introduce a highly accurate method to estimate selection parameters from replicated time series data, which is fast enough to be applied on a genome scale. Using this new method, we evaluate how experimental parameters can be optimized to obtain the most reliable estimates for selection parameters. We show that the effective population size (Ne) and the number of replicates have the largest impact. Because the number of time points and sequencing coverage had only a minor effect, we suggest that time series analysis is feasible without major increase in sequencing costs. We anticipate that time series analysis will become routine in E&R studies. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Transformation-cost time-series method for analyzing irregularly sampled data.

Science.gov (United States)

Ozken, Ibrahim; Eroglu, Deniz; Stemler, Thomas; Marwan, Norbert; Bagci, G Baris; Kurths, Jürgen

2015-06-01

Irregular sampling of data sets is one of the challenges often encountered in time-series analysis, since traditional methods cannot be applied and the frequently used interpolation approach can corrupt the data and bias the subsequence analysis. Here we present the TrAnsformation-Cost Time-Series (TACTS) method, which allows us to analyze irregularly sampled data sets without degenerating the quality of the data set. Instead of using interpolation we consider time-series segments and determine how close they are to each other by determining the cost needed to transform one segment into the following one. Using a limited set of operations-with associated costs-to transform the time series segments, we determine a new time series, that is our transformation-cost time series. This cost time series is regularly sampled and can be analyzed using standard methods. While our main interest is the analysis of paleoclimate data, we develop our method using numerical examples like the logistic map and the Rössler oscillator. The numerical data allows us to test the stability of our method against noise and for different irregular samplings. In addition we provide guidance on how to choose the associated costs based on the time series at hand. The usefulness of the TACTS method is demonstrated using speleothem data from the Secret Cave in Borneo that is a good proxy for paleoclimatic variability in the monsoon activity around the maritime continent.
Transformation-cost time-series method for analyzing irregularly sampled data

Science.gov (United States)

Ozken, Ibrahim; Eroglu, Deniz; Stemler, Thomas; Marwan, Norbert; Bagci, G. Baris; Kurths, Jürgen

2015-06-01

Irregular sampling of data sets is one of the challenges often encountered in time-series analysis, since traditional methods cannot be applied and the frequently used interpolation approach can corrupt the data and bias the subsequence analysis. Here we present the TrAnsformation-Cost Time-Series (TACTS) method, which allows us to analyze irregularly sampled data sets without degenerating the quality of the data set. Instead of using interpolation we consider time-series segments and determine how close they are to each other by determining the cost needed to transform one segment into the following one. Using a limited set of operations—with associated costs—to transform the time series segments, we determine a new time series, that is our transformation-cost time series. This cost time series is regularly sampled and can be analyzed using standard methods. While our main interest is the analysis of paleoclimate data, we develop our method using numerical examples like the logistic map and the Rössler oscillator. The numerical data allows us to test the stability of our method against noise and for different irregular samplings. In addition we provide guidance on how to choose the associated costs based on the time series at hand. The usefulness of the TACTS method is demonstrated using speleothem data from the Secret Cave in Borneo that is a good proxy for paleoclimatic variability in the monsoon activity around the maritime continent.
A multidisciplinary database for geophysical time series management

Science.gov (United States)

Montalto, P.; Aliotta, M.; Cassisi, C.; Prestifilippo, M.; Cannata, A.

2013-12-01

The variables collected by a sensor network constitute a heterogeneous data source that needs to be properly organized in order to be used in research and geophysical monitoring. With the time series term we refer to a set of observations of a given phenomenon acquired sequentially in time. When the time intervals are equally spaced one speaks of period or sampling frequency. Our work describes in detail a possible methodology for storage and management of time series using a specific data structure. We designed a framework, hereinafter called TSDSystem (Time Series Database System), in order to acquire time series from different data sources and standardize them within a relational database. The operation of standardization provides the ability to perform operations, such as query and visualization, of many measures synchronizing them using a common time scale. The proposed architecture follows a multiple layer paradigm (Loaders layer, Database layer and Business Logic layer). Each layer is specialized in performing particular operations for the reorganization and archiving of data from different sources such as ASCII, Excel, ODBC (Open DataBase Connectivity), file accessible from the Internet (web pages, XML). In particular, the loader layer performs a security check of the working status of each running software through an heartbeat system, in order to automate the discovery of acquisition issues and other warning conditions. Although our system has to manage huge amounts of data, performance is guaranteed by using a smart partitioning table strategy, that keeps balanced the percentage of data stored in each database table. TSDSystem also contains modules for the visualization of acquired data, that provide the possibility to query different time series on a specified time range, or follow the realtime signal acquisition, according to a data access policy from the users.
Web Log Pre-processing and Analysis for Generation of Learning Profiles in Adaptive E-learning

Directory of Open Access Journals (Sweden)

Radhika M. Pai

2016-03-01

Full Text Available Adaptive E-learning Systems (AESs enhance the efficiency of online courses in education by providing personalized contents and user interfaces that changes according to learner’s requirements and usage patterns. This paper presents the approach to generate learning profile of each learner which helps to identify the learning styles and provide Adaptive User Interface which includes adaptive learning components and learning material. The proposed method analyzes the captured web usage data to identify the learning profile of the learners. The learning profiles are identified by an algorithmic approach that is based on the frequency of accessing the materials and the time spent on the various learning components on the portal. The captured log data is pre-processed and converted into standard XML format to generate learners sequence data corresponding to the different sessions and time spent. The learning style model adopted in this approach is Felder-Silverman Learning Style Model (FSLSM. This paper also presents the analysis of learner’s activities, preprocessed XML files and generated sequences.
Web Log Pre-processing and Analysis for Generation of Learning Profiles in Adaptive E-learning

Directory of Open Access Journals (Sweden)

Radhika M. Pai

2016-04-01

Full Text Available Adaptive E-learning Systems (AESs enhance the efficiency of online courses in education by providing personalized contents and user interfaces that changes according to learner’s requirements and usage patterns. This paper presents the approach to generate learning profile of each learner which helps to identify the learning styles and provide Adaptive User Interface which includes adaptive learning components and learning material. The proposed method analyzes the captured web usage data to identify the learning profile of the learners. The learning profiles are identified by an algorithmic approach that is based on the frequency of accessing the materials and the time spent on the various learning components on the portal. The captured log data is pre-processed and converted into standard XML format to generate learners sequence data corresponding to the different sessions and time spent. The learning style model adopted in this approach is Felder-Silverman Learning Style Model (FSLSM. This paper also presents the analysis of learner’s activities, preprocessed XML files and generated sequences.
Modeling financial time series with S-plus

CERN Document Server

Zivot, Eric

2003-01-01

The field of financial econometrics has exploded over the last decade This book represents an integration of theory, methods, and examples using the S-PLUS statistical modeling language and the S+FinMetrics module to facilitate the practice of financial econometrics This is the first book to show the power of S-PLUS for the analysis of time series data It is written for researchers and practitioners in the finance industry, academic researchers in economics and finance, and advanced MBA and graduate students in economics and finance Readers are assumed to have a basic knowledge of S-PLUS and a solid grounding in basic statistics and time series concepts Eric Zivot is an associate professor and Gary Waterman Distinguished Scholar in the Economics Department at the University of Washington, and is co-director of the nascent Professional Master's Program in Computational Finance He regularly teaches courses on econometric theory, financial econometrics and time series econometrics, and is the recipient of the He...
Application of Time Series Analysis in Determination of Lag Time in Jahanbin Basin

Directory of Open Access Journals (Sweden)

Seied Yahya Mirzaee

2005-11-01

One of the important issues that have significant role in study of hydrology of basin is determination of lag time. Lag time has significant role in hydrological studies. Quantity of rainfall related lag time depends on several factors, such as permeability, vegetation cover, catchments slope, rainfall intensity, storm duration and type of rain. Determination of lag time is important parameter in many projects such as dam design and also water resource studies. Lag time of basin could be calculated using various methods. One of these methods is time series analysis of spectral density. The analysis is based on fouries series. The time series is approximated with Sinuous and Cosines functions. In this method harmonically significant quantities with individual frequencies are presented. Spectral density under multiple time series could be used to obtain basin lag time for annual runoff and short-term rainfall fluctuation. A long lag time could be due to snowmelt as well as melting ice due to rainfalls in freezing days. In this research the lag time of Jahanbin basin has been determined using spectral density method. The catchments is subjected to both rainfall and snowfall. For short term rainfall fluctuation with a return period 2, 3, 4 months, the lag times were found 0.18, 0.5 and 0.083 month, respectively.
Modeling Time Series Data for Supervised Learning

Science.gov (United States)

Baydogan, Mustafa Gokce

2012-01-01

Temporal data are increasingly prevalent and important in analytics. Time series (TS) data are chronological sequences of observations and an important class of temporal data. Fields such as medicine, finance, learning science and multimedia naturally generate TS data. Each series provide a high-dimensional data vector that challenges the learning…
Empirical method to measure stochasticity and multifractality in nonlinear time series

Science.gov (United States)

Lin, Chih-Hao; Chang, Chia-Seng; Li, Sai-Ping

2013-12-01

An empirical algorithm is used here to study the stochastic and multifractal nature of nonlinear time series. A parameter can be defined to quantitatively measure the deviation of the time series from a Wiener process so that the stochasticity of different time series can be compared. The local volatility of the time series under study can be constructed using this algorithm, and the multifractal structure of the time series can be analyzed by using this local volatility. As an example, we employ this method to analyze financial time series from different stock markets. The result shows that while developed markets evolve very much like an Ito process, the emergent markets are far from efficient. Differences about the multifractal structures and leverage effects between developed and emergent markets are discussed. The algorithm used here can be applied in a similar fashion to study time series of other complex systems.
Clinical time series prediction: Toward a hierarchical dynamical system framework.

Science.gov (United States)

Liu, Zitao; Hauskrecht, Milos

2015-09-01

Developing machine learning and data mining algorithms for building temporal models of clinical time series is important for understanding of the patient condition, the dynamics of a disease, effect of various patient management interventions and clinical decision making. In this work, we propose and develop a novel hierarchical framework for modeling clinical time series data of varied length and with irregularly sampled observations. Our hierarchical dynamical system framework for modeling clinical time series combines advantages of the two temporal modeling approaches: the linear dynamical system and the Gaussian process. We model the irregularly sampled clinical time series by using multiple Gaussian process sequences in the lower level of our hierarchical framework and capture the transitions between Gaussian processes by utilizing the linear dynamical system. The experiments are conducted on the complete blood count (CBC) panel data of 1000 post-surgical cardiac patients during their hospitalization. Our framework is evaluated and compared to multiple baseline approaches in terms of the mean absolute prediction error and the absolute percentage error. We tested our framework by first learning the time series model from data for the patients in the training set, and then using it to predict future time series values for the patients in the test set. We show that our model outperforms multiple existing models in terms of its predictive accuracy. Our method achieved a 3.13% average prediction accuracy improvement on ten CBC lab time series when it was compared against the best performing baseline. A 5.25% average accuracy improvement was observed when only short-term predictions were considered. A new hierarchical dynamical system framework that lets us model irregularly sampled time series data is a promising new direction for modeling clinical time series and for improving their predictive performance. Copyright © 2014 Elsevier B.V. All rights reserved.
Clinical time series prediction: towards a hierarchical dynamical system framework

Science.gov (United States)

Liu, Zitao; Hauskrecht, Milos

2014-01-01

Objective Developing machine learning and data mining algorithms for building temporal models of clinical time series is important for understanding of the patient condition, the dynamics of a disease, effect of various patient management interventions and clinical decision making. In this work, we propose and develop a novel hierarchical framework for modeling clinical time series data of varied length and with irregularly sampled observations. Materials and methods Our hierarchical dynamical system framework for modeling clinical time series combines advantages of the two temporal modeling approaches: the linear dynamical system and the Gaussian process. We model the irregularly sampled clinical time series by using multiple Gaussian process sequences in the lower level of our hierarchical framework and capture the transitions between Gaussian processes by utilizing the linear dynamical system. The experiments are conducted on the complete blood count (CBC) panel data of 1000 post-surgical cardiac patients during their hospitalization. Our framework is evaluated and compared to multiple baseline approaches in terms of the mean absolute prediction error and the absolute percentage error. Results We tested our framework by first learning the time series model from data for the patient in the training set, and then applying the model in order to predict future time series values on the patients in the test set. We show that our model outperforms multiple existing models in terms of its predictive accuracy. Our method achieved a 3.13% average prediction accuracy improvement on ten CBC lab time series when it was compared against the best performing baseline. A 5.25% average accuracy improvement was observed when only short-term predictions were considered. Conclusion A new hierarchical dynamical system framework that lets us model irregularly sampled time series data is a promising new direction for modeling clinical time series and for improving their predictive
Turbulencelike Behavior of Seismic Time Series

International Nuclear Information System (INIS)

Manshour, P.; Saberi, S.; Sahimi, Muhammad; Peinke, J.; Pacheco, Amalio F.; Rahimi Tabar, M. Reza

2009-01-01

We report on a stochastic analysis of Earth's vertical velocity time series by using methods originally developed for complex hierarchical systems and, in particular, for turbulent flows. Analysis of the fluctuations of the detrended increments of the series reveals a pronounced transition in their probability density function from Gaussian to non-Gaussian. The transition occurs 5-10 hours prior to a moderate or large earthquake, hence representing a new and reliable precursor for detecting such earthquakes
Characterizing time series: when Granger causality triggers complex networks

Science.gov (United States)

Ge, Tian; Cui, Yindong; Lin, Wei; Kurths, Jürgen; Liu, Chong

2012-08-01

In this paper, we propose a new approach to characterize time series with noise perturbations in both the time and frequency domains by combining Granger causality and complex networks. We construct directed and weighted complex networks from time series and use representative network measures to describe their physical and topological properties. Through analyzing the typical dynamical behaviors of some physical models and the MIT-BIHMassachusetts Institute of Technology-Beth Israel Hospital. human electrocardiogram data sets, we show that the proposed approach is able to capture and characterize various dynamics and has much potential for analyzing real-world time series of rather short length.
Characterizing time series: when Granger causality triggers complex networks

International Nuclear Information System (INIS)

Ge Tian; Cui Yindong; Lin Wei; Liu Chong; Kurths, Jürgen

2012-01-01

In this paper, we propose a new approach to characterize time series with noise perturbations in both the time and frequency domains by combining Granger causality and complex networks. We construct directed and weighted complex networks from time series and use representative network measures to describe their physical and topological properties. Through analyzing the typical dynamical behaviors of some physical models and the MIT-BIH human electrocardiogram data sets, we show that the proposed approach is able to capture and characterize various dynamics and has much potential for analyzing real-world time series of rather short length. (paper)
Multivariate time series analysis with R and financial applications

CERN Document Server

Tsay, Ruey S

2013-01-01

Since the publication of his first book, Analysis of Financial Time Series, Ruey Tsay has become one of the most influential and prominent experts on the topic of time series. Different from the traditional and oftentimes complex approach to multivariate (MV) time series, this sequel book emphasizes structural specification, which results in simplified parsimonious VARMA modeling and, hence, eases comprehension. Through a fundamental balance between theory and applications, the book supplies readers with an accessible approach to financial econometric models and their applications to real-worl
Measurements of spatial population synchrony: influence of time series transformations.

Science.gov (United States)

Chevalier, Mathieu; Laffaille, Pascal; Ferdy, Jean-Baptiste; Grenouillet, Gaël

2015-09-01

Two mechanisms have been proposed to explain spatial population synchrony: dispersal among populations, and the spatial correlation of density-independent factors (the "Moran effect"). To identify which of these two mechanisms is driving spatial population synchrony, time series transformations (TSTs) of abundance data have been used to remove the signature of one mechanism, and highlight the effect of the other. However, several issues with TSTs remain, and to date no consensus has emerged about how population time series should be handled in synchrony studies. Here, by using 3131 time series involving 34 fish species found in French rivers, we computed several metrics commonly used in synchrony studies to determine whether a large-scale climatic factor (temperature) influenced fish population dynamics at the regional scale, and to test the effect of three commonly used TSTs (detrending, prewhitening and a combination of both) on these metrics. We also tested whether the influence of TSTs on time series and population synchrony levels was related to the features of the time series using both empirical and simulated time series. For several species, and regardless of the TST used, we evidenced a Moran effect on freshwater fish populations. However, these results were globally biased downward by TSTs which reduced our ability to detect significant signals. Depending on the species and the features of the time series, we found that TSTs could lead to contradictory results, regardless of the metric considered. Finally, we suggest guidelines on how population time series should be processed in synchrony studies.
TargetSearch - a Bioconductor package for the efficient preprocessing of GC-MS metabolite profiling data

Science.gov (United States)

2009-01-01

Background Metabolite profiling, the simultaneous quantification of multiple metabolites in an experiment, is becoming increasingly popular, particularly with the rise of systems-level biology. The workhorse in this field is gas-chromatography hyphenated with mass spectrometry (GC-MS). The high-throughput of this technology coupled with a demand for large experiments has led to data pre-processing, i.e. the quantification of metabolites across samples, becoming a major bottleneck. Existing software has several limitations, including restricted maximum sample size, systematic errors and low flexibility. However, the biggest limitation is that the resulting data usually require extensive hand-curation, which is subjective and can typically take several days to weeks. Results We introduce the TargetSearch package, an open source tool which is a flexible and accurate method for pre-processing even very large numbers of GC-MS samples within hours. We developed a novel strategy to iteratively correct and update retention time indices for searching and identifying metabolites. The package is written in the R programming language with computationally intensive functions written in C for speed and performance. The package includes a graphical user interface to allow easy use by those unfamiliar with R. Conclusions TargetSearch allows fast and accurate data pre-processing for GC-MS experiments and overcomes the sample number limitations and manual curation requirements of existing software. We validate our method by carrying out an analysis against both a set of known chemical standard mixtures and of a biological experiment. In addition we demonstrate its capabilities and speed by comparing it with other GC-MS pre-processing tools. We believe this package will greatly ease current bottlenecks and facilitate the analysis of metabolic profiling data. PMID:20015393
TargetSearch - a Bioconductor package for the efficient preprocessing of GC-MS metabolite profiling data

Directory of Open Access Journals (Sweden)

Lisec Jan

2009-12-01

Full Text Available Abstract Background Metabolite profiling, the simultaneous quantification of multiple metabolites in an experiment, is becoming increasingly popular, particularly with the rise of systems-level biology. The workhorse in this field is gas-chromatography hyphenated with mass spectrometry (GC-MS. The high-throughput of this technology coupled with a demand for large experiments has led to data pre-processing, i.e. the quantification of metabolites across samples, becoming a major bottleneck. Existing software has several limitations, including restricted maximum sample size, systematic errors and low flexibility. However, the biggest limitation is that the resulting data usually require extensive hand-curation, which is subjective and can typically take several days to weeks. Results We introduce the TargetSearch package, an open source tool which is a flexible and accurate method for pre-processing even very large numbers of GC-MS samples within hours. We developed a novel strategy to iteratively correct and update retention time indices for searching and identifying metabolites. The package is written in the R programming language with computationally intensive functions written in C for speed and performance. The package includes a graphical user interface to allow easy use by those unfamiliar with R. Conclusions TargetSearch allows fast and accurate data pre-processing for GC-MS experiments and overcomes the sample number limitations and manual curation requirements of existing software. We validate our method by carrying out an analysis against both a set of known chemical standard mixtures and of a biological experiment. In addition we demonstrate its capabilities and speed by comparing it with other GC-MS pre-processing tools. We believe this package will greatly ease current bottlenecks and facilitate the analysis of metabolic profiling data.
Stochastic time series analysis of hydrology data for water resources

Science.gov (United States)

Sathish, S.; Khadar Babu, S. K.

2017-11-01

The prediction to current publication of stochastic time series analysis in hydrology and seasonal stage. The different statistical tests for predicting the hydrology time series on Thomas-Fiering model. The hydrology time series of flood flow have accept a great deal of consideration worldwide. The concentration of stochastic process areas of time series analysis method are expanding with develop concerns about seasonal periods and global warming. The recent trend by the researchers for testing seasonal periods in the hydrologic flowseries using stochastic process on Thomas-Fiering model. The present article proposed to predict the seasonal periods in hydrology using Thomas-Fiering model.
Nonlinear time series analysis of the human electrocardiogram

International Nuclear Information System (INIS)

Perc, Matjaz

2005-01-01

We analyse the human electrocardiogram with simple nonlinear time series analysis methods that are appropriate for graduate as well as undergraduate courses. In particular, attention is devoted to the notions of determinism and stationarity in physiological data. We emphasize that methods of nonlinear time series analysis can be successfully applied only if the studied data set originates from a deterministic stationary system. After positively establishing the presence of determinism and stationarity in the studied electrocardiogram, we calculate the maximal Lyapunov exponent, thus providing interesting insights into the dynamics of the human heart. Moreover, to facilitate interest and enable the integration of nonlinear time series analysis methods into the curriculum at an early stage of the educational process, we also provide user-friendly programs for each implemented method

Multichannel biomedical time series clustering via hierarchical probabilistic latent semantic analysis.

Science.gov (United States)

Wang, Jin; Sun, Xiangping; Nahavandi, Saeid; Kouzani, Abbas; Wu, Yuchuan; She, Mary

2014-11-01

Biomedical time series clustering that automatically groups a collection of time series according to their internal similarity is of importance for medical record management and inspection such as bio-signals archiving and retrieval. In this paper, a novel framework that automatically groups a set of unlabelled multichannel biomedical time series according to their internal structural similarity is proposed. Specifically, we treat a multichannel biomedical time series as a document and extract local segments from the time series as words. We extend a topic model, i.e., the Hierarchical probabilistic Latent Semantic Analysis (H-pLSA), which was originally developed for visual motion analysis to cluster a set of unlabelled multichannel time series. The H-pLSA models each channel of the multichannel time series using a local pLSA in the first layer. The topics learned in the local pLSA are then fed to a global pLSA in the second layer to discover the categories of multichannel time series. Experiments on a dataset extracted from multichannel Electrocardiography (ECG) signals demonstrate that the proposed method performs better than previous state-of-the-art approaches and is relatively robust to the variations of parameters including length of local segments and dictionary size. Although the experimental evaluation used the multichannel ECG signals in a biometric scenario, the proposed algorithm is a universal framework for multichannel biomedical time series clustering according to their structural similarity, which has many applications in biomedical time series management. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
A Technical Review on Biomass Processing: Densification, Preprocessing, Modeling and Optimization

Energy Technology Data Exchange (ETDEWEB)

Jaya Shankar Tumuluru; Christopher T. Wright

2010-06-01

It is now a well-acclaimed fact that burning fossil fuels and deforestation are major contributors to climate change. Biomass from plants can serve as an alternative renewable and carbon-neutral raw material for the production of bioenergy. Low densities of 40–60 kg/m3 for lignocellulosic and 200–400 kg/m3 for woody biomass limits their application for energy purposes. Prior to use in energy applications these materials need to be densified. The densified biomass can have bulk densities over 10 times the raw material helping to significantly reduce technical limitations associated with storage, loading and transportation. Pelleting, briquetting, or extrusion processing are commonly used methods for densification. The aim of the present research is to develop a comprehensive review of biomass processing that includes densification, preprocessing, modeling and optimization. The specific objective include carrying out a technical review on (a) mechanisms of particle bonding during densification; (b) methods of densification including extrusion, briquetting, pelleting, and agglomeration; (c) effects of process and feedstock variables and biomass biochemical composition on the densification (d) effects of preprocessing such as grinding, preheating, steam explosion, and torrefaction on biomass quality and binding characteristics; (e) models for understanding the compression characteristics; and (f) procedures for response surface modeling and optimization.
Hidden Markov Models for Time Series An Introduction Using R

CERN Document Server

Zucchini, Walter

2009-01-01

Illustrates the flexibility of HMMs as general-purpose models for time series data. This work presents an overview of HMMs for analyzing time series data, from continuous-valued, circular, and multivariate series to binary data, bounded and unbounded counts and categorical observations.
Constructing ordinal partition transition networks from multivariate time series.

Science.gov (United States)

Zhang, Jiayang; Zhou, Jie; Tang, Ming; Guo, Heng; Small, Michael; Zou, Yong

2017-08-10

A growing number of algorithms have been proposed to map a scalar time series into ordinal partition transition networks. However, most observable phenomena in the empirical sciences are of a multivariate nature. We construct ordinal partition transition networks for multivariate time series. This approach yields weighted directed networks representing the pattern transition properties of time series in velocity space, which hence provides dynamic insights of the underling system. Furthermore, we propose a measure of entropy to characterize ordinal partition transition dynamics, which is sensitive to capturing the possible local geometric changes of phase space trajectories. We demonstrate the applicability of pattern transition networks to capture phase coherence to non-coherence transitions, and to characterize paths to phase synchronizations. Therefore, we conclude that the ordinal partition transition network approach provides complementary insight to the traditional symbolic analysis of nonlinear multivariate time series.
Permutation entropy of finite-length white-noise time series.

Science.gov (United States)

Little, Douglas J; Kane, Deb M

2016-08-01

Permutation entropy (PE) is commonly used to discriminate complex structure from white noise in a time series. While the PE of white noise is well understood in the long time-series limit, analysis in the general case is currently lacking. Here the expectation value and variance of white-noise PE are derived as functions of the number of ordinal pattern trials, N, and the embedding dimension, D. It is demonstrated that the probability distribution of the white-noise PE converges to a χ^{2} distribution with D!-1 degrees of freedom as N becomes large. It is further demonstrated that the PE variance for an arbitrary time series can be estimated as the variance of a related metric, the Kullback-Leibler entropy (KLE), allowing the qualitative N≫D! condition to be recast as a quantitative estimate of the N required to achieve a desired PE calculation precision. Application of this theory to statistical inference is demonstrated in the case of an experimentally obtained noise series, where the probability of obtaining the observed PE value was calculated assuming a white-noise time series. Standard statistical inference can be used to draw conclusions whether the white-noise null hypothesis can be accepted or rejected. This methodology can be applied to other null hypotheses, such as discriminating whether two time series are generated from different complex system states.
Multiresolution analysis of Bursa Malaysia KLCI time series

Science.gov (United States)

Ismail, Mohd Tahir; Dghais, Amel Abdoullah Ahmed

2017-05-01

In general, a time series is simply a sequence of numbers collected at regular intervals over a period. Financial time series data processing is concerned with the theory and practice of processing asset price over time, such as currency, commodity data, and stock market data. The primary aim of this study is to understand the fundamental characteristics of selected financial time series by using the time as well as the frequency domain analysis. After that prediction can be executed for the desired system for in sample forecasting. In this study, multiresolution analysis which the assist of discrete wavelet transforms (DWT) and maximal overlap discrete wavelet transform (MODWT) will be used to pinpoint special characteristics of Bursa Malaysia KLCI (Kuala Lumpur Composite Index) daily closing prices and return values. In addition, further case study discussions include the modeling of Bursa Malaysia KLCI using linear ARIMA with wavelets to address how multiresolution approach improves fitting and forecasting results.
Modelling bursty time series

International Nuclear Information System (INIS)

Vajna, Szabolcs; Kertész, János; Tóth, Bálint

2013-01-01

Many human-related activities show power-law decaying interevent time distribution with exponents usually varying between 1 and 2. We study a simple task-queuing model, which produces bursty time series due to the non-trivial dynamics of the task list. The model is characterized by a priority distribution as an input parameter, which describes the choice procedure from the list. We give exact results on the asymptotic behaviour of the model and we show that the interevent time distribution is power-law decaying for any kind of input distributions that remain normalizable in the infinite list limit, with exponents tunable between 1 and 2. The model satisfies a scaling law between the exponents of interevent time distribution (β) and autocorrelation function (α): α + β = 2. This law is general for renewal processes with power-law decaying interevent time distribution. We conclude that slowly decaying autocorrelation function indicates long-range dependence only if the scaling law is violated. (paper)
Timing calibration and spectral cleaning of LOFAR time series data

NARCIS (Netherlands)

Corstanje, A.; Buitink, S.; Enriquez, J. E.; Falcke, H.; Horandel, J. R.; Krause, M.; Nelles, A.; Rachen, J. P.; Schellart, P.; Scholten, O.; ter Veen, S.; Thoudam, S.; Trinh, T. N. G.

We describe a method for spectral cleaning and timing calibration of short time series data of the voltage in individual radio interferometer receivers. It makes use of phase differences in fast Fourier transform (FFT) spectra across antenna pairs. For strong, localized terrestrial sources these are
Time series momentum and contrarian effects in the Chinese stock market

Science.gov (United States)

Shi, Huai-Long; Zhou, Wei-Xing

2017-10-01

This paper concentrates on the time series momentum or contrarian effects in the Chinese stock market. We evaluate the performance of the time series momentum strategy applied to major stock indices in mainland China and explore the relation between the performance of time series momentum strategies and some firm-specific characteristics. Our findings indicate that there is a time series momentum effect in the short run and a contrarian effect in the long run in the Chinese stock market. The performances of the time series momentum and contrarian strategies are highly dependent on the look-back and holding periods and firm-specific characteristics.
Time-Series Analysis: A Cautionary Tale

Science.gov (United States)

Damadeo, Robert

2015-01-01

Time-series analysis has often been a useful tool in atmospheric science for deriving long-term trends in various atmospherically important parameters (e.g., temperature or the concentration of trace gas species). In particular, time-series analysis has been repeatedly applied to satellite datasets in order to derive the long-term trends in stratospheric ozone, which is a critical atmospheric constituent. However, many of the potential pitfalls relating to the non-uniform sampling of the datasets were often ignored and the results presented by the scientific community have been unknowingly biased. A newly developed and more robust application of this technique is applied to the Stratospheric Aerosol and Gas Experiment (SAGE) II version 7.0 ozone dataset and the previous biases and newly derived trends are presented.
Characterizing interdependencies of multiple time series theory and applications

CERN Document Server

Hosoya, Yuzo; Takimoto, Taro; Kinoshita, Ryo

2017-01-01

This book introduces academic researchers and professionals to the basic concepts and methods for characterizing interdependencies of multiple time series in the frequency domain. Detecting causal directions between a pair of time series and the extent of their effects, as well as testing the non existence of a feedback relation between them, have constituted major focal points in multiple time series analysis since Granger introduced the celebrated definition of causality in view of prediction improvement. Causality analysis has since been widely applied in many disciplines. Although most analyses are conducted from the perspective of the time domain, a frequency domain method introduced in this book sheds new light on another aspect that disentangles the interdependencies between multiple time series in terms of long-term or short-term effects, quantitatively characterizing them. The frequency domain method includes the Granger noncausality test as a special case. Chapters 2 and 3 of the book introduce an i...
A perturbative approach for enhancing the performance of time series forecasting.

Science.gov (United States)

de Mattos Neto, Paulo S G; Ferreira, Tiago A E; Lima, Aranildo R; Vasconcelos, Germano C; Cavalcanti, George D C

2017-04-01

This paper proposes a method to perform time series prediction based on perturbation theory. The approach is based on continuously adjusting an initial forecasting model to asymptotically approximate a desired time series model. First, a predictive model generates an initial forecasting for a time series. Second, a residual time series is calculated as the difference between the original time series and the initial forecasting. If that residual series is not white noise, then it can be used to improve the accuracy of the initial model and a new predictive model is adjusted using residual series. The whole process is repeated until convergence or the residual series becomes white noise. The output of the method is then given by summing up the outputs of all trained predictive models in a perturbative sense. To test the method, an experimental investigation was conducted on six real world time series. A comparison was made with six other methods experimented and ten other results found in the literature. Results show that not only the performance of the initial model is significantly improved but also the proposed method outperforms the other results previously published. Copyright © 2017 Elsevier Ltd. All rights reserved.
Drunk driving detection based on classification of multivariate time series.

Science.gov (United States)

Li, Zhenlong; Jin, Xue; Zhao, Xiaohua

2015-09-01

This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent multivariate time series. A bottom-up algorithm was then employed to separate multivariate time series. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is feasible and effective. The approach has implications for drunk driving detection. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
Evaluation of scaling invariance embedded in short time series.

Directory of Open Access Journals (Sweden)

Xue Pan

Full Text Available Scaling invariance of time series has been making great contributions in diverse research fields. But how to evaluate scaling exponent from a real-world series is still an open problem. Finite length of time series may induce unacceptable fluctuation and bias to statistical quantities and consequent invalidation of currently used standard methods. In this paper a new concept called correlation-dependent balanced estimation of diffusion entropy is developed to evaluate scale-invariance in very short time series with length ~10(2. Calculations with specified Hurst exponent values of 0.2,0.3,...,0.9 show that by using the standard central moving average de-trending procedure this method can evaluate the scaling exponents for short time series with ignorable bias (≤0.03 and sharp confidential interval (standard deviation ≤0.05. Considering the stride series from ten volunteers along an approximate oval path of a specified length, we observe that though the averages and deviations of scaling exponents are close, their evolutionary behaviors display rich patterns. It has potential use in analyzing physiological signals, detecting early warning signals, and so on. As an emphasis, the our core contribution is that by means of the proposed method one can estimate precisely shannon entropy from limited records.
Evaluation of scaling invariance embedded in short time series.

Science.gov (United States)

Pan, Xue; Hou, Lei; Stephen, Mutua; Yang, Huijie; Zhu, Chenping

2014-01-01

Scaling invariance of time series has been making great contributions in diverse research fields. But how to evaluate scaling exponent from a real-world series is still an open problem. Finite length of time series may induce unacceptable fluctuation and bias to statistical quantities and consequent invalidation of currently used standard methods. In this paper a new concept called correlation-dependent balanced estimation of diffusion entropy is developed to evaluate scale-invariance in very short time series with length ~10(2). Calculations with specified Hurst exponent values of 0.2,0.3,...,0.9 show that by using the standard central moving average de-trending procedure this method can evaluate the scaling exponents for short time series with ignorable bias (≤0.03) and sharp confidential interval (standard deviation ≤0.05). Considering the stride series from ten volunteers along an approximate oval path of a specified length, we observe that though the averages and deviations of scaling exponents are close, their evolutionary behaviors display rich patterns. It has potential use in analyzing physiological signals, detecting early warning signals, and so on. As an emphasis, the our core contribution is that by means of the proposed method one can estimate precisely shannon entropy from limited records.
Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.

Science.gov (United States)

Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi

2015-02-01

We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.
Comparison of data transformation procedures to enhance topographical accuracy in time-series analysis of the human EEG.

Science.gov (United States)

Hauk, O; Keil, A; Elbert, T; Müller, M M

2002-01-30

We describe a methodology to apply current source density (CSD) and minimum norm (MN) estimation as pre-processing tools for time-series analysis of single trial EEG data. The performance of these methods is compared for the case of wavelet time-frequency analysis of simulated gamma-band activity. A reasonable comparison of CSD and MN on the single trial level requires regularization such that the corresponding transformed data sets have similar signal-to-noise ratios (SNRs). For region-of-interest approaches, it should be possible to optimize the SNR for single estimates rather than for the whole distributed solution. An effective implementation of the MN method is described. Simulated data sets were created by modulating the strengths of a radial and a tangential test dipole with wavelets in the frequency range of the gamma band, superimposed with simulated spatially uncorrelated noise. The MN and CSD transformed data sets as well as the average reference (AR) representation were subjected to wavelet frequency-domain analysis, and power spectra were mapped for relevant frequency bands. For both CSD and MN, the influence of noise can be sufficiently suppressed by regularization to yield meaningful information, but only MN represents both radial and tangential dipole sources appropriately as single peaks. Therefore, when relating wavelet power spectrum topographies to their neuronal generators, MN should be preferred.
Geomechanical time series and its singularity spectrum analysis

Czech Academy of Sciences Publication Activity Database

Lyubushin, Alexei A.; Kaláb, Zdeněk; Lednická, Markéta

2012-01-01

Roč. 47, č. 1 (2012), s. 69-77 ISSN 1217-8977 R&D Projects: GA ČR GA105/09/0089 Institutional research plan: CEZ:AV0Z30860518 Keywords : geomechanical time series * singularity spectrum * time series segmentation * laser distance meter Subject RIV: DC - Siesmology, Volcanology, Earth Structure Impact factor: 0.347, year: 2012 http://www.akademiai.com/content/88v4027758382225/fulltext.pdf
Optimal preprocessing of serum and urine metabolomic data fusion for staging prostate cancer through design of experiment

International Nuclear Information System (INIS)

Zheng, Hong; Cai, Aimin; Zhou, Qi; Xu, Pengtao; Zhao, Liangcai; Li, Chen; Dong, Baijun; Gao, Hongchang

2017-01-01

Accurate classification of cancer stages will achieve precision treatment for cancer. Metabolomics presents biological phenotypes at the metabolite level and holds a great potential for cancer classification. Since metabolomic data can be obtained from different samples or analytical techniques, data fusion has been applied to improve classification accuracy. Data preprocessing is an essential step during metabolomic data analysis. Therefore, we developed an innovative optimization method to select a proper data preprocessing strategy for metabolomic data fusion using a design of experiment approach for improving the classification of prostate cancer (PCa) stages. In this study, urine and serum samples were collected from participants at five phases of PCa and analyzed using a 1 H NMR-based metabolomic approach. Partial least squares-discriminant analysis (PLS-DA) was used as a classification model and its performance was assessed by goodness of fit (R 2 ) and predictive ability (Q 2 ). Results show that data preprocessing significantly affect classification performance and depends on data properties. Using the fused metabolomic data from urine and serum, PLS-DA model with the optimal data preprocessing (R 2 = 0.729, Q 2 = 0.504, P < 0.0001) can effectively improve model performance and achieve a better classification result for PCa stages as compared with that without data preprocessing (R 2 = 0.139, Q 2 = 0.006, P = 0.450). Therefore, we propose that metabolomic data fusion integrated with an optimal data preprocessing strategy can significantly improve the classification of cancer stages for precision treatment. - Highlights: • NMR metabolomic analysis of body fluids can be used for staging prostate cancer. • Data preprocessing is an essential step for metabolomic analysis. • Data fusion improves information recovery for cancer classification. • Design of experiment achieves optimal preprocessing of metabolomic data fusion.
Pseudo-random bit generator based on lag time series

Science.gov (United States)

García-Martínez, M.; Campos-Cantón, E.

2014-12-01

In this paper, we present a pseudo-random bit generator (PRBG) based on two lag time series of the logistic map using positive and negative values in the bifurcation parameter. In order to hidden the map used to build the pseudo-random series we have used a delay in the generation of time series. These new series when they are mapped xn against xn+1 present a cloud of points unrelated to the logistic map. Finally, the pseudo-random sequences have been tested with the suite of NIST giving satisfactory results for use in stream ciphers.

Non-linear forecasting in high-frequency financial time series

Science.gov (United States)

Strozzi, F.; Zaldívar, J. M.

2005-08-01

A new methodology based on state space reconstruction techniques has been developed for trading in financial markets. The methodology has been tested using 18 high-frequency foreign exchange time series. The results are in apparent contradiction with the efficient market hypothesis which states that no profitable information about future movements can be obtained by studying the past prices series. In our (off-line) analysis positive gain may be obtained in all those series. The trading methodology is quite general and may be adapted to other financial time series. Finally, the steps for its on-line application are discussed.
Data preprocessing methods of FT-NIR spectral data for the classification cooking oil

Science.gov (United States)

Ruah, Mas Ezatul Nadia Mohd; Rasaruddin, Nor Fazila; Fong, Sim Siong; Jaafar, Mohd Zuli

2014-12-01

This recent work describes the data pre-processing method of FT-NIR spectroscopy datasets of cooking oil and its quality parameters with chemometrics method. Pre-processing of near-infrared (NIR) spectral data has become an integral part of chemometrics modelling. Hence, this work is dedicated to investigate the utility and effectiveness of pre-processing algorithms namely row scaling, column scaling and single scaling process with Standard Normal Variate (SNV). The combinations of these scaling methods have impact on exploratory analysis and classification via Principle Component Analysis plot (PCA). The samples were divided into palm oil and non-palm cooking oil. The classification model was build using FT-NIR cooking oil spectra datasets in absorbance mode at the range of 4000cm-1-14000cm-1. Savitzky Golay derivative was applied before developing the classification model. Then, the data was separated into two sets which were training set and test set by using Duplex method. The number of each class was kept equal to 2/3 of the class that has the minimum number of sample. Then, the sample was employed t-statistic as variable selection method in order to select which variable is significant towards the classification models. The evaluation of data pre-processing were looking at value of modified silhouette width (mSW), PCA and also Percentage Correctly Classified (%CC). The results show that different data processing strategies resulting to substantial amount of model performances quality. The effects of several data pre-processing i.e. row scaling, column standardisation and single scaling process with Standard Normal Variate indicated by mSW and %CC. At two PCs model, all five classifier gave high %CC except Quadratic Distance Analysis.
Analysis of JET ELMy time series

International Nuclear Information System (INIS)

Zvejnieks, G.; Kuzovkov, V.N.

2005-01-01

Full text: Achievement of the planned operational regime in the next generation tokamaks (such as ITER) still faces principal problems. One of the main challenges is obtaining the control of edge localized modes (ELMs), which should lead to both long plasma pulse times and reasonable divertor life time. In order to control ELMs the hypothesis was proposed by Degeling [1] that ELMs exhibit features of chaotic dynamics and thus a standard chaos control methods might be applicable. However, our findings which are based on the nonlinear autoregressive (NAR) model contradict this hypothesis for JET ELMy time-series. In turn, it means that ELM behavior is of a relaxation or random type. These conclusions coincide with our previous results obtained for ASDEX Upgrade time series [2]. [1] A.W. Degeling, Y.R. Martin, P.E. Bak, J. B.Lister, and X. Llobet, Plasma Phys. Control. Fusion 43, 1671 (2001). [2] G. Zvejnieks, V.N. Kuzovkov, O. Dumbrajs, A.W. Degeling, W. Suttrop, H. Urano, and H. Zohm, Physics of Plasmas 11, 5658 (2004)
The Statistical Analysis of Time Series

CERN Document Server

Anderson, T W

2011-01-01

The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences George
Analysis of time series and size of equivalent sample

International Nuclear Information System (INIS)

Bernal, Nestor; Molina, Alicia; Pabon, Daniel; Martinez, Jorge

2004-01-01

In a meteorological context, a first approach to the modeling of time series is to use models of autoregressive type. This allows one to take into account the meteorological persistence or temporal behavior, thereby identifying the memory of the analyzed process. This article seeks to pre-sent the concept of the size of an equivalent sample, which helps to identify in the data series sub periods with a similar structure. Moreover, in this article we examine the alternative of adjusting the variance of the series, keeping in mind its temporal structure, as well as an adjustment to the covariance of two time series. This article presents two examples, the first one corresponding to seven simulated series with autoregressive structure of first order, and the second corresponding to seven meteorological series of anomalies of the air temperature at the surface in two Colombian regions
Optimal production scheduling for energy efficiency improvement in biofuel feedstock preprocessing considering work-in-process particle separation

International Nuclear Information System (INIS)

Li, Lin; Sun, Zeyi; Yao, Xufeng; Wang, Donghai

2016-01-01

Biofuel is considered a promising alternative to traditional liquid transportation fuels. The large-scale substitution of biofuel can greatly enhance global energy security and mitigate greenhouse gas emissions. One major concern of the broad adoption of biofuel is the intensive energy consumption in biofuel manufacturing. This paper focuses on the energy efficiency improvement of biofuel feedstock preprocessing, a major process of cellulosic biofuel manufacturing. An improved scheme of the feedstock preprocessing considering work-in-process particle separation is introduced to reduce energy waste and improve energy efficiency. A scheduling model based on the improved scheme is also developed to identify an optimal production schedule that can minimize the energy consumption of the feedstock preprocessing under production target constraint. A numerical case study is used to illustrate the effectiveness of the proposed method. The research outcome is expected to improve the energy efficiency and enhance the environmental sustainability of biomass feedstock preprocessing. - Highlights: • A novel method to schedule production in biofuel feedstock preprocessing process. • Systems modeling approach is used. • Capable of optimize preprocessing to reduce energy waste and improve energy efficiency. • A numerical case is used to illustrate the effectiveness of the method. • Energy consumption per unit production can be significantly reduced.
Scalable Prediction of Energy Consumption using Incremental Time Series Clustering

Energy Technology Data Exchange (ETDEWEB)

Simmhan, Yogesh; Noor, Muhammad Usman

2013-10-09

Time series datasets are a canonical form of high velocity Big Data, and often generated by pervasive sensors, such as found in smart infrastructure. Performing predictive analytics on time series data can be computationally complex, and requires approximation techniques. In this paper, we motivate this problem using a real application from the smart grid domain. We propose an incremental clustering technique, along with a novel affinity score for determining cluster similarity, which help reduce the prediction error for cumulative time series within a cluster. We evaluate this technique, along with optimizations, using real datasets from smart meters, totaling ~700,000 data points, and show the efficacy of our techniques in improving the prediction error of time series data within polynomial time.
Monitoring Forest Dynamics in the Andean Amazon: The Applicability of Breakpoint Detection Methods Using Landsat Time-Series and Genetic Algorithms

Directory of Open Access Journals (Sweden)

Fabián Santos

2017-01-01

Full Text Available The Andean Amazon is an endangered biodiversity hot spot but its forest dynamics are less studied than those of the Amazon lowland and forests from middle or high latitudes. This is because its landscape variability, complex topography and cloudy conditions constitute a challenging environment for any remote-sensing assessment. Breakpoint detection with Landsat time-series data is an established robust approach for monitoring forest dynamics around the globe but has not been properly evaluated for implementation in the Andean Amazon. We analyzed breakpoint detection-generated forest dynamics in order to determine its limitations when applied to three different study areas located along an altitude gradient in the Andean Amazon in Ecuador. Using all available Landsat imagery for the period 1997–2016, we evaluated different pre-processing approaches, noise reduction techniques, and breakpoint detection algorithms. These procedures were integrated into a complex function called the processing chain generator. Calibration was not straightforward since it required us to define values for 24 parameters. To solve this problem, we implemented a novel approach using genetic algorithms. We calibrated the processing chain generator by applying a stratified training sampling and a reference dataset based on high resolution imagery. After the best calibration solution was found and the processing chain generator executed, we assessed accuracy and found that data gaps, inaccurate co-registration, radiometric variability in sensor calibration, unmasked cloud, and shadows can drastically affect the results, compromising the application of breakpoint detection in mountainous areas of the Andean Amazon. Moreover, since breakpoint detection analysis of landscape variability in the Andean Amazon requires a unique calibration of algorithms, the time required to optimize analysis could complicate its proper implementation and undermine its application for large
Forecasting with nonlinear time series models

DEFF Research Database (Denmark)

Kock, Anders Bredahl; Teräsvirta, Timo

In this paper, nonlinear models are restricted to mean nonlinear parametric models. Several such models popular in time series econo- metrics are presented and some of their properties discussed. This in- cludes two models based on universal approximators: the Kolmogorov- Gabor polynomial model...... applied to economic fore- casting problems, is briefly highlighted. A number of large published studies comparing macroeconomic forecasts obtained using different time series models are discussed, and the paper also contains a small simulation study comparing recursive and direct forecasts in a partic...... and two versions of a simple artificial neural network model. Techniques for generating multi-period forecasts from nonlinear models recursively are considered, and the direct (non-recursive) method for this purpose is mentioned as well. Forecasting with com- plex dynamic systems, albeit less frequently...
Nonparametric factor analysis of time series

OpenAIRE

Rodríguez-Poo, Juan M.; Linton, Oliver Bruce

1998-01-01

We introduce a nonparametric smoothing procedure for nonparametric factor analaysis of multivariate time series. The asymptotic properties of the proposed procedures are derived. We present an application based on the residuals from the Fair macromodel.
Acquiring and preprocessing leaf images for automated plant identification: understanding the tradeoff between effort and information gain

Directory of Open Access Journals (Sweden)

Michael Rzanny

2017-11-01

Full Text Available Abstract Background Automated species identification is a long term research subject. Contrary to flowers and fruits, leaves are available throughout most of the year. Offering margin and texture to characterize a species, they are the most studied organ for automated identification. Substantially matured machine learning techniques generate the need for more training data (aka leaf images. Researchers as well as enthusiasts miss guidance on how to acquire suitable training images in an efficient way. Methods In this paper, we systematically study nine image types and three preprocessing strategies. Image types vary in terms of in-situ image recording conditions: perspective, illumination, and background, while the preprocessing strategies compare non-preprocessed, cropped, and segmented images to each other. Per image type-preprocessing combination, we also quantify the manual effort required for their implementation. We extract image features using a convolutional neural network, classify species using the resulting feature vectors and discuss classification accuracy in relation to the required effort per combination. Results The most effective, non-destructive way to record herbaceous leaves is to take an image of the leaf’s top side. We yield the highest classification accuracy using destructive back light images, i.e., holding the plucked leaf against the sky for image acquisition. Cropping the image to the leaf’s boundary substantially improves accuracy, while precise segmentation yields similar accuracy at a substantially higher effort. The permanent use or disuse of a flash light has negligible effects. Imaging the typically stronger textured backside of a leaf does not result in higher accuracy, but notably increases the acquisition cost. Conclusions In conclusion, the way in which leaf images are acquired and preprocessed does have a substantial effect on the accuracy of the classifier trained on them. For the first time, this
Time Series Outlier Detection Based on Sliding Window Prediction

Directory of Open Access Journals (Sweden)

Yufeng Yu

2014-01-01

Full Text Available In order to detect outliers in hydrological time series data for improving data quality and decision-making quality related to design, operation, and management of water resources, this research develops a time series outlier detection method for hydrologic data that can be used to identify data that deviate from historical patterns. The method first built a forecasting model on the history data and then used it to predict future values. Anomalies are assumed to take place if the observed values fall outside a given prediction confidence interval (PCI, which can be calculated by the predicted value and confidence coefficient. The use of PCI as threshold is mainly on the fact that it considers the uncertainty in the data series parameters in the forecasting model to address the suitable threshold selection problem. The method performs fast, incremental evaluation of data as it becomes available, scales to large quantities of data, and requires no preclassification of anomalies. Experiments with different hydrologic real-world time series showed that the proposed methods are fast and correctly identify abnormal data and can be used for hydrologic time series analysis.
Metagenomics meets time series analysis: unraveling microbial community dynamics

NARCIS (Netherlands)

Faust, K.; Lahti, L.M.; Gonze, D.; Vos, de W.M.; Raes, J.

2015-01-01

The recent increase in the number of microbial time series studies offers new insights into the stability and dynamics of microbial communities, from the world's oceans to human microbiota. Dedicated time series analysis tools allow taking full advantage of these data. Such tools can reveal periodic
Time series forecasting based on deep extreme learning machine

NARCIS (Netherlands)

Guo, Xuqi; Pang, Y.; Yan, Gaowei; Qiao, Tiezhu; Yang, Guang-Hong; Yang, Dan

2017-01-01

Multi-layer Artificial Neural Networks (ANN) has caught widespread attention as a new method for time series forecasting due to the ability of approximating any nonlinear function. In this paper, a new local time series prediction model is established with the nearest neighbor domain theory, in
False-nearest-neighbors algorithm and noise-corrupted time series

International Nuclear Information System (INIS)

Rhodes, C.; Morari, M.

1997-01-01

The false-nearest-neighbors (FNN) algorithm was originally developed to determine the embedding dimension for autonomous time series. For noise-free computer-generated time series, the algorithm does a good job in predicting the embedding dimension. However, the problem of predicting the embedding dimension when the time-series data are corrupted by noise was not fully examined in the original studies of the FNN algorithm. Here it is shown that with large data sets, even small amounts of noise can lead to incorrect prediction of the embedding dimension. Surprisingly, as the length of the time series analyzed by FNN grows larger, the cause of incorrect prediction becomes more pronounced. An analysis of the effect of noise on the FNN algorithm and a solution for dealing with the effects of noise are given here. Some results on the theoretically correct choice of the FNN threshold are also presented. copyright 1997 The American Physical Society
CauseMap: fast inference of causality from complex time series.

Science.gov (United States)

Maher, M Cyrus; Hernandez, Ryan D

2015-01-01

Background. Establishing health-related causal relationships is a central pursuit in biomedical research. Yet, the interdependent non-linearity of biological systems renders causal dynamics laborious and at times impractical to disentangle. This pursuit is further impeded by the dearth of time series that are sufficiently long to observe and understand recurrent patterns of flux. However, as data generation costs plummet and technologies like wearable devices democratize data collection, we anticipate a coming surge in the availability of biomedically-relevant time series data. Given the life-saving potential of these burgeoning resources, it is critical to invest in the development of open source software tools that are capable of drawing meaningful insight from vast amounts of time series data. Results. Here we present CauseMap, the first open source implementation of convergent cross mapping (CCM), a method for establishing causality from long time series data (≳25 observations). Compared to existing time series methods, CCM has the advantage of being model-free and robust to unmeasured confounding that could otherwise induce spurious associations. CCM builds on Takens' Theorem, a well-established result from dynamical systems theory that requires only mild assumptions. This theorem allows us to reconstruct high dimensional system dynamics using a time series of only a single variable. These reconstructions can be thought of as shadows of the true causal system. If reconstructed shadows can predict points from opposing time series, we can infer that the corresponding variables are providing views of the same causal system, and so are causally related. Unlike traditional metrics, this test can establish the directionality of causation, even in the presence of feedback loops. Furthermore, since CCM can extract causal relationships from times series of, e.g., a single individual, it may be a valuable tool to personalized medicine. We implement CCM in Julia, a
CauseMap: fast inference of causality from complex time series

Directory of Open Access Journals (Sweden)

M. Cyrus Maher

2015-03-01

Full Text Available Background. Establishing health-related causal relationships is a central pursuit in biomedical research. Yet, the interdependent non-linearity of biological systems renders causal dynamics laborious and at times impractical to disentangle. This pursuit is further impeded by the dearth of time series that are sufficiently long to observe and understand recurrent patterns of flux. However, as data generation costs plummet and technologies like wearable devices democratize data collection, we anticipate a coming surge in the availability of biomedically-relevant time series data. Given the life-saving potential of these burgeoning resources, it is critical to invest in the development of open source software tools that are capable of drawing meaningful insight from vast amounts of time series data.Results. Here we present CauseMap, the first open source implementation of convergent cross mapping (CCM, a method for establishing causality from long time series data (≳25 observations. Compared to existing time series methods, CCM has the advantage of being model-free and robust to unmeasured confounding that could otherwise induce spurious associations. CCM builds on Takens’ Theorem, a well-established result from dynamical systems theory that requires only mild assumptions. This theorem allows us to reconstruct high dimensional system dynamics using a time series of only a single variable. These reconstructions can be thought of as shadows of the true causal system. If reconstructed shadows can predict points from opposing time series, we can infer that the corresponding variables are providing views of the same causal system, and so are causally related. Unlike traditional metrics, this test can establish the directionality of causation, even in the presence of feedback loops. Furthermore, since CCM can extract causal relationships from times series of, e.g., a single individual, it may be a valuable tool to personalized medicine. We implement
Time domain series system definition and gear set reliability modeling

International Nuclear Information System (INIS)

Xie, Liyang; Wu, Ningxiang; Qian, Wenxue

2016-01-01

Time-dependent multi-configuration is a typical feature for mechanical systems such as gear trains and chain drives. As a series system, a gear train is distinct from a traditional series system, such as a chain, in load transmission path, system-component relationship, system functioning manner, as well as time-dependent system configuration. Firstly, the present paper defines time-domain series system to which the traditional series system reliability model is not adequate. Then, system specific reliability modeling technique is proposed for gear sets, including component (tooth) and subsystem (tooth-pair) load history description, material priori/posterior strength expression, time-dependent and system specific load-strength interference analysis, as well as statistically dependent failure events treatment. Consequently, several system reliability models are developed for gear sets with different tooth numbers in the scenario of tooth root material ultimate tensile strength failure. The application of the models is discussed in the last part, and the differences between the system specific reliability model and the traditional series system reliability model are illustrated by virtue of several numerical examples. - Highlights: • A new type of series system, i.e. time-domain multi-configuration series system is defined, that is of great significance to reliability modeling. • Multi-level statistical analysis based reliability modeling method is presented for gear transmission system. • Several system specific reliability models are established for gear set reliability estimation. • The differences between the traditional series system reliability model and the new model are illustrated.
Track Irregularity Time Series Analysis and Trend Forecasting

Directory of Open Access Journals (Sweden)

Jia Chaolong

2012-01-01

Full Text Available The combination of linear and nonlinear methods is widely used in the prediction of time series data. This paper analyzes track irregularity time series data by using gray incidence degree models and methods of data transformation, trying to find the connotative relationship between the time series data. In this paper, GM (1,1 is based on first-order, single variable linear differential equations; after an adaptive improvement and error correction, it is used to predict the long-term changing trend of track irregularity at a fixed measuring point; the stochastic linear AR, Kalman filtering model, and artificial neural network model are applied to predict the short-term changing trend of track irregularity at unit section. Both long-term and short-term changes prove that the model is effective and can achieve the expected accuracy.
PRESEE: an MDL/MML algorithm to time-series stream segmenting.

Science.gov (United States)

Xu, Kaikuo; Jiang, Yexi; Tang, Mingjie; Yuan, Changan; Tang, Changjie

2013-01-01

Time-series stream is one of the most common data types in data mining field. It is prevalent in fields such as stock market, ecology, and medical care. Segmentation is a key step to accelerate the processing speed of time-series stream mining. Previous algorithms for segmenting mainly focused on the issue of ameliorating precision instead of paying much attention to the efficiency. Moreover, the performance of these algorithms depends heavily on parameters, which are hard for the users to set. In this paper, we propose PRESEE (parameter-free, real-time, and scalable time-series stream segmenting algorithm), which greatly improves the efficiency of time-series stream segmenting. PRESEE is based on both MDL (minimum description length) and MML (minimum message length) methods, which could segment the data automatically. To evaluate the performance of PRESEE, we conduct several experiments on time-series streams of different types and compare it with the state-of-art algorithm. The empirical results show that PRESEE is very efficient for real-time stream datasets by improving segmenting speed nearly ten times. The novelty of this algorithm is further demonstrated by the application of PRESEE in segmenting real-time stream datasets from ChinaFLUX sensor networks data stream.

Time-varying surrogate data to assess nonlinearity in nonstationary time series: application to heart rate variability.

Science.gov (United States)

Faes, Luca; Zhao, He; Chon, Ki H; Nollo, Giandomenico

2009-03-01

We propose a method to extend to time-varying (TV) systems the procedure for generating typical surrogate time series, in order to test the presence of nonlinear dynamics in potentially nonstationary signals. The method is based on fitting a TV autoregressive (AR) model to the original series and then regressing the model coefficients with random replacements of the model residuals to generate TV AR surrogate series. The proposed surrogate series were used in combination with a TV sample entropy (SE) discriminating statistic to assess nonlinearity in both simulated and experimental time series, in comparison with traditional time-invariant (TIV) surrogates combined with the TIV SE discriminating statistic. Analysis of simulated time series showed that using TIV surrogates, linear nonstationary time series may be erroneously regarded as nonlinear and weak TV nonlinearities may remain unrevealed, while the use of TV AR surrogates markedly increases the probability of a correct interpretation. Application to short (500 beats) heart rate variability (HRV) time series recorded at rest (R), after head-up tilt (T), and during paced breathing (PB) showed: 1) modifications of the SE statistic that were well interpretable with the known cardiovascular physiology; 2) significant contribution of nonlinear dynamics to HRV in all conditions, with significant increase during PB at 0.2 Hz respiration rate; and 3) a disagreement between TV AR surrogates and TIV surrogates in about a quarter of the series, suggesting that nonstationarity may affect HRV recordings and bias the outcome of the traditional surrogate-based nonlinearity test.
Local normalization: Uncovering correlations in non-stationary financial time series

Science.gov (United States)

Schäfer, Rudi; Guhr, Thomas

2010-09-01

The measurement of correlations between financial time series is of vital importance for risk management. In this paper we address an estimation error that stems from the non-stationarity of the time series. We put forward a method to rid the time series of local trends and variable volatility, while preserving cross-correlations. We test this method in a Monte Carlo simulation, and apply it to empirical data for the S&P 500 stocks.
Fuzzy time-series based on Fibonacci sequence for stock price forecasting

Science.gov (United States)

Chen, Tai-Liang; Cheng, Ching-Hsue; Jong Teoh, Hia

2007-07-01

Time-series models have been utilized to make reasonably accurate predictions in the areas of stock price movements, academic enrollments, weather, etc. For promoting the forecasting performance of fuzzy time-series models, this paper proposes a new model, which incorporates the concept of the Fibonacci sequence, the framework of Song and Chissom's model and the weighted method of Yu's model. This paper employs a 5-year period TSMC (Taiwan Semiconductor Manufacturing Company) stock price data and a 13-year period of TAIEX (Taiwan Stock Exchange Capitalization Weighted Stock Index) stock index data as experimental datasets. By comparing our forecasting performances with Chen's (Forecasting enrollments based on fuzzy time-series. Fuzzy Sets Syst. 81 (1996) 311-319), Yu's (Weighted fuzzy time-series models for TAIEX forecasting. Physica A 349 (2004) 609-624) and Huarng's (The application of neural networks to forecast fuzzy time series. Physica A 336 (2006) 481-491) models, we conclude that the proposed model surpasses in accuracy these conventional fuzzy time-series models.
Reproducible cancer biomarker discovery in SELDI-TOF MS using different pre-processing algorithms.

Directory of Open Access Journals (Sweden)

Jinfeng Zou

Full Text Available BACKGROUND: There has been much interest in differentiating diseased and normal samples using biomarkers derived from mass spectrometry (MS studies. However, biomarker identification for specific diseases has been hindered by irreproducibility. Specifically, a peak profile extracted from a dataset for biomarker identification depends on a data pre-processing algorithm. Until now, no widely accepted agreement has been reached. RESULTS: In this paper, we investigated the consistency of biomarker identification using differentially expressed (DE peaks from peak profiles produced by three widely used average spectrum-dependent pre-processing algorithms based on SELDI-TOF MS data for prostate and breast cancers. Our results revealed two important factors that affect the consistency of DE peak identification using different algorithms. One factor is that some DE peaks selected from one peak profile were not detected as peaks in other profiles, and the second factor is that the statistical power of identifying DE peaks in large peak profiles with many peaks may be low due to the large scale of the tests and small number of samples. Furthermore, we demonstrated that the DE peak detection power in large profiles could be improved by the stratified false discovery rate (FDR control approach and that the reproducibility of DE peak detection could thereby be increased. CONCLUSIONS: Comparing and evaluating pre-processing algorithms in terms of reproducibility can elucidate the relationship among different algorithms and also help in selecting a pre-processing algorithm. The DE peaks selected from small peak profiles with few peaks for a dataset tend to be reproducibly detected in large peak profiles, which suggests that a suitable pre-processing algorithm should be able to produce peaks sufficient for identifying useful and reproducible biomarkers.
Parameterizing unconditional skewness in models for financial time series

DEFF Research Database (Denmark)

He, Changli; Silvennoinen, Annastiina; Teräsvirta, Timo

In this paper we consider the third-moment structure of a class of time series models. It is often argued that the marginal distribution of financial time series such as returns is skewed. Therefore it is of importance to know what properties a model should possess if it is to accommodate...
A New Hybrid Model Based on Data Preprocessing and an Intelligent Optimization Algorithm for Electrical Power System Forecasting

Directory of Open Access Journals (Sweden)

Ping Jiang

2015-01-01

Full Text Available The establishment of electrical power system cannot only benefit the reasonable distribution and management in energy resources, but also satisfy the increasing demand for electricity. The electrical power system construction is often a pivotal part in the national and regional economic development plan. This paper constructs a hybrid model, known as the E-MFA-BP model, that can forecast indices in the electrical power system, including wind speed, electrical load, and electricity price. Firstly, the ensemble empirical mode decomposition can be applied to eliminate the noise of original time series data. After data preprocessing, the back propagation neural network model is applied to carry out the forecasting. Owing to the instability of its structure, the modified firefly algorithm is employed to optimize the weight and threshold values of back propagation to obtain a hybrid model with higher forecasting quality. Three experiments are carried out to verify the effectiveness of the model. Through comparison with other traditional well-known forecasting models, and models optimized by other optimization algorithms, the experimental results demonstrate that the hybrid model has the best forecasting performance.
Self-organising mixture autoregressive model for non-stationary time series modelling.

Science.gov (United States)

Ni, He; Yin, Hujun

2008-12-01

Modelling non-stationary time series has been a difficult task for both parametric and nonparametric methods. One promising solution is to combine the flexibility of nonparametric models with the simplicity of parametric models. In this paper, the self-organising mixture autoregressive (SOMAR) network is adopted as a such mixture model. It breaks time series into underlying segments and at the same time fits local linear regressive models to the clusters of segments. In such a way, a global non-stationary time series is represented by a dynamic set of local linear regressive models. Neural gas is used for a more flexible structure of the mixture model. Furthermore, a new similarity measure has been introduced in the self-organising network to better quantify the similarity of time series segments. The network can be used naturally in modelling and forecasting non-stationary time series. Experiments on artificial, benchmark time series (e.g. Mackey-Glass) and real-world data (e.g. numbers of sunspots and Forex rates) are presented and the results show that the proposed SOMAR network is effective and superior to other similar approaches.
The Prediction of Teacher Turnover Employing Time Series Analysis.

Science.gov (United States)

Costa, Crist H.

The purpose of this study was to combine knowledge of teacher demographic data with time-series forecasting methods to predict teacher turnover. Moving averages and exponential smoothing were used to forecast discrete time series. The study used data collected from the 22 largest school districts in Iowa, designated as FACT schools. Predictions…
Stacked Heterogeneous Neural Networks for Time Series Forecasting

Directory of Open Access Journals (Sweden)

Florin Leon

2010-01-01

Full Text Available A hybrid model for time series forecasting is proposed. It is a stacked neural network, containing one normal multilayer perceptron with bipolar sigmoid activation functions, and the other with an exponential activation function in the output layer. As shown by the case studies, the proposed stacked hybrid neural model performs well on a variety of benchmark time series. The combination of weights of the two stack components that leads to optimal performance is also studied.
Chaotic time series prediction: From one to another

International Nuclear Information System (INIS)

Zhao Pengfei; Xing Lei; Yu Jun

2009-01-01

In this Letter, a new local linear prediction model is proposed to predict a chaotic time series of a component x(t) by using the chaotic time series of another component y(t) in the same system with x(t). Our approach is based on the phase space reconstruction coming from the Takens embedding theorem. To illustrate our results, we present an example of Lorenz system and compare with the performance of the original local linear prediction model.
Grammar-based feature generation for time-series prediction

CERN Document Server

De Silva, Anthony Mihirana

2015-01-01

This book proposes a novel approach for time-series prediction using machine learning techniques with automatic feature generation. Application of machine learning techniques to predict time-series continues to attract considerable attention due to the difficulty of the prediction problems compounded by the non-linear and non-stationary nature of the real world time-series. The performance of machine learning techniques, among other things, depends on suitable engineering of features. This book proposes a systematic way for generating suitable features using context-free grammar. A number of feature selection criteria are investigated and a hybrid feature generation and selection algorithm using grammatical evolution is proposed. The book contains graphical illustrations to explain the feature generation process. The proposed approaches are demonstrated by predicting the closing price of major stock market indices, peak electricity load and net hourly foreign exchange client trade volume. The proposed method ...
Forecasting autoregressive time series under changing persistence

DEFF Research Database (Denmark)

Kruse, Robinson

Changing persistence in time series models means that a structural change from nonstationarity to stationarity or vice versa occurs over time. Such a change has important implications for forecasting, as negligence may lead to inaccurate model predictions. This paper derives generally applicable...
Recurrent Neural Networks for Multivariate Time Series with Missing Values.

Science.gov (United States)

Che, Zhengping; Purushotham, Sanjay; Cho, Kyunghyun; Sontag, David; Liu, Yan

2018-04-17

Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a., informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments of time series classification tasks on real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and utilization of missing values in time series analysis.
Conditional time series forecasting with convolutional neural networks

NARCIS (Netherlands)

A. Borovykh (Anastasia); S.M. Bohte (Sander); C.W. Oosterlee (Cornelis)

2017-01-01

textabstractForecasting financial time series using past observations has been a significant topic of interest. While temporal relationships in the data exist, they are difficult to analyze and predict accurately due to the non-linear trends and noise present in the series. We propose to learn these
Time Series Analysis of Wheat Futures Reward in China

Institute of Scientific and Technical Information of China (English)

无

2005-01-01

Different from the fact that the main researches are focused on single futures contract and lack of the comparison of different periods, this paper described the statistical characteristics of wheat futures reward time series of Zhengzhou Commodity Exchange in recent three years. Besides the basic statistic analysis, the paper used the GARCH and EGARCH model to describe the time series which had the ARCH effect and analyzed the persistence of volatility shocks and the leverage effect. The results showed that compared with that of normal one,wheat futures reward series were abnormality, leptokurtic and thick tail distribution. The study also found that two-part of the reward series had no autocorrelation. Among the six correlative series, three ones presented the ARCH effect. By using of the Auto-regressive Distributed Lag Model, GARCH model and EGARCH model, the paper demonstrates the persistence of volatility shocks and the leverage effect on the wheat futures reward time series. The results reveal that on the one hand, the statistical characteristics of the wheat futures reward are similar to the aboard mature futures market as a whole. But on the other hand, the results reflect some shortages such as the immatureness and the over-control by the government in the Chinese future market.
forecasting with nonlinear time series model: a monte-carlo

African Journals Online (AJOL)

PUBLICATIONS1

erated recursively up to any step greater than one. For nonlinear time series model, point forecast for step one can be done easily like in the linear case but forecast for a step greater than or equal to ..... London. Franses, P. H. (1998). Time series models for business and Economic forecasting, Cam- bridge University press.
Time series analysis of temporal networks

Science.gov (United States)

Sikdar, Sandipan; Ganguly, Niloy; Mukherjee, Animesh

2016-01-01

A common but an important feature of all real-world networks is that they are temporal in nature, i.e., the network structure changes over time. Due to this dynamic nature, it becomes difficult to propose suitable growth models that can explain the various important characteristic properties of these networks. In fact, in many application oriented studies only knowing these properties is sufficient. For instance, if one wishes to launch a targeted attack on a network, this can be done even without the knowledge of the full network structure; rather an estimate of some of the properties is sufficient enough to launch the attack. We, in this paper show that even if the network structure at a future time point is not available one can still manage to estimate its properties. We propose a novel method to map a temporal network to a set of time series instances, analyze them and using a standard forecast model of time series, try to predict the properties of a temporal network at a later time instance. To our aim, we consider eight properties such as number of active nodes, average degree, clustering coefficient etc. and apply our prediction framework on them. We mainly focus on the temporal network of human face-to-face contacts and observe that it represents a stochastic process with memory that can be modeled as Auto-Regressive-Integrated-Moving-Average (ARIMA). We use cross validation techniques to find the percentage accuracy of our predictions. An important observation is that the frequency domain properties of the time series obtained from spectrogram analysis could be used to refine the prediction framework by identifying beforehand the cases where the error in prediction is likely to be high. This leads to an improvement of 7.96% (for error level ≤20%) in prediction accuracy on an average across all datasets. As an application we show how such prediction scheme can be used to launch targeted attacks on temporal networks. Contribution to the Topical Issue
The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure

KAUST Repository

Euá n, Carolina; Ombao, Hernando; Ortega, Joaquí n

2018-01-01

We present a new method for time series clustering which we call the Hierarchical Spectral Merger (HSM) method. This procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms
Notes on economic time series analysis system theoretic perspectives

CERN Document Server

Aoki, Masanao

1983-01-01

In seminars and graduate level courses I have had several opportunities to discuss modeling and analysis of time series with economists and economic graduate students during the past several years. These experiences made me aware of a gap between what economic graduate students are taught about vector-valued time series and what is available in recent system literature. Wishing to fill or narrow the gap that I suspect is more widely spread than my personal experiences indicate, I have written these notes to augment and reor ganize materials I have given in these courses and seminars. I have endeavored to present, in as much a self-contained way as practicable, a body of results and techniques in system theory that I judge to be relevant and useful to economists interested in using time series in their research. I have essentially acted as an intermediary and interpreter of system theoretic results and perspectives in time series by filtering out non-essential details, and presenting coherent accounts of wha...
Dynamical analysis and visualization of tornadoes time series.

Directory of Open Access Journals (Sweden)

António M Lopes

Full Text Available In this paper we analyze the behavior of tornado time-series in the U.S. from the perspective of dynamical systems. A tornado is a violently rotating column of air extending from a cumulonimbus cloud down to the ground. Such phenomena reveal features that are well described by power law functions and unveil characteristics found in systems with long range memory effects. Tornado time series are viewed as the output of a complex system and are interpreted as a manifestation of its dynamics. Tornadoes are modeled as sequences of Dirac impulses with amplitude proportional to the events size. First, a collection of time series involving 64 years is analyzed in the frequency domain by means of the Fourier transform. The amplitude spectra are approximated by power law functions and their parameters are read as an underlying signature of the system dynamics. Second, it is adopted the concept of circular time and the collective behavior of tornadoes analyzed. Clustering techniques are then adopted to identify and visualize the emerging patterns.

Dynamical analysis and visualization of tornadoes time series.

Science.gov (United States)

Lopes, António M; Tenreiro Machado, J A

2015-01-01

In this paper we analyze the behavior of tornado time-series in the U.S. from the perspective of dynamical systems. A tornado is a violently rotating column of air extending from a cumulonimbus cloud down to the ground. Such phenomena reveal features that are well described by power law functions and unveil characteristics found in systems with long range memory effects. Tornado time series are viewed as the output of a complex system and are interpreted as a manifestation of its dynamics. Tornadoes are modeled as sequences of Dirac impulses with amplitude proportional to the events size. First, a collection of time series involving 64 years is analyzed in the frequency domain by means of the Fourier transform. The amplitude spectra are approximated by power law functions and their parameters are read as an underlying signature of the system dynamics. Second, it is adopted the concept of circular time and the collective behavior of tornadoes analyzed. Clustering techniques are then adopted to identify and visualize the emerging patterns.
"Observation Obscurer" - Time Series Viewer, Editor and Processor

Science.gov (United States)

Andronov, I. L.

The program is described, which contains a set of subroutines suitable for East viewing and interactive filtering and processing of regularly and irregularly spaced time series. Being a 32-bit DOS application, it may be used as a default fast viewer/editor of time series in any compute shell ("commander") or in Windows. It allows to view the data in the "time" or "phase" mode, to remove ("obscure") or filter outstanding bad points; to make scale transformations and smoothing using few methods (e.g. mean with phase binning, determination of the statistically opti- mal number of phase bins; "running parabola" (Andronov, 1997, As. Ap. Suppl, 125, 207) fit and to make time series analysis using some methods, e.g. correlation, autocorrelation and histogram analysis: determination of extrema etc. Some features have been developed specially for variable star observers, e.g. the barycentric correction, the creation and fast analysis of "OC" diagrams etc. The manual for "hot keys" is presented. The computer code was compiled with a 32-bit Free Pascal (www.freepascal.org).
Modelling road accidents: An approach using structural time series

Science.gov (United States)

Junus, Noor Wahida Md; Ismail, Mohd Tahir

2014-09-01

In this paper, the trend of road accidents in Malaysia for the years 2001 until 2012 was modelled using a structural time series approach. The structural time series model was identified using a stepwise method, and the residuals for each model were tested. The best-fitted model was chosen based on the smallest Akaike Information Criterion (AIC) and prediction error variance. In order to check the quality of the model, a data validation procedure was performed by predicting the monthly number of road accidents for the year 2012. Results indicate that the best specification of the structural time series model to represent road accidents is the local level with a seasonal model.
Multiscale Poincaré plots for visualizing the structure of heartbeat time series.

Science.gov (United States)

Henriques, Teresa S; Mariani, Sara; Burykin, Anton; Rodrigues, Filipa; Silva, Tiago F; Goldberger, Ary L

2016-02-09

Poincaré delay maps are widely used in the analysis of cardiac interbeat interval (RR) dynamics. To facilitate visualization of the structure of these time series, we introduce multiscale Poincaré (MSP) plots. Starting with the original RR time series, the method employs a coarse-graining procedure to create a family of time series, each of which represents the system's dynamics in a different time scale. Next, the Poincaré plots are constructed for the original and the coarse-grained time series. Finally, as an optional adjunct, color can be added to each point to represent its normalized frequency. We illustrate the MSP method on simulated Gaussian white and 1/f noise time series. The MSP plots of 1/f noise time series reveal relative conservation of the phase space area over multiple time scales, while those of white noise show a marked reduction in area. We also show how MSP plots can be used to illustrate the loss of complexity when heartbeat time series from healthy subjects are compared with those from patients with chronic (congestive) heart failure syndrome or with atrial fibrillation. This generalized multiscale approach to Poincaré plots may be useful in visualizing other types of time series.
Time series patterns and language support in DBMS

Science.gov (United States)

Telnarova, Zdenka

2017-07-01

This contribution is focused on pattern type Time Series as a rich in semantics representation of data. Some example of implementation of this pattern type in traditional Data Base Management Systems is briefly presented. There are many approaches how to manipulate with patterns and query patterns. Crucial issue can be seen in systematic approach to pattern management and specific pattern query language which takes into consideration semantics of patterns. Query language SQL-TS for manipulating with patterns is shown on Time Series data.
Two-fractal overlap time series: Earthquakes and market crashes

Indian Academy of Sciences (India)

velocity over the other and time series of stock prices. An anticipation method for some of the crashes have been proposed here, based on these observations. Keywords. Cantor set; time series; earthquake; market crash. PACS Nos 05.00; 02.50.-r; 64.60; 89.65.Gh; 95.75.Wx. 1. Introduction. Capturing dynamical patterns of ...
Nonlinear time series analysis with R

CERN Document Server

Huffaker, Ray; Rosa, Rodolfo

2017-01-01

In the process of data analysis, the investigator is often facing highly-volatile and random-appearing observed data. A vast body of literature shows that the assumption of underlying stochastic processes was not necessarily representing the nature of the processes under investigation and, when other tools were used, deterministic features emerged. Non Linear Time Series Analysis (NLTS) allows researchers to test whether observed volatility conceals systematic non linear behavior, and to rigorously characterize governing dynamics. Behavioral patterns detected by non linear time series analysis, along with scientific principles and other expert information, guide the specification of mechanistic models that serve to explain real-world behavior rather than merely reproducing it. Often there is a misconception regarding the complexity of the level of mathematics needed to understand and utilize the tools of NLTS (for instance Chaos theory). However, mathematics used in NLTS is much simpler than many other subjec...
InSAR Deformation Time Series Processed On-Demand in the Cloud

Science.gov (United States)

Horn, W. B.; Weeden, R.; Dimarchi, H.; Arko, S. A.; Hogenson, K.

2017-12-01

During this past year, ASF has developed a cloud-based on-demand processing system known as HyP3 (http://hyp3.asf.alaska.edu/), the Hybrid Pluggable Processing Pipeline, for Synthetic Aperture Radar (SAR) data. The system makes it easy for a user who doesn't have the time or inclination to install and use complex SAR processing software to leverage SAR data in their research or operations. One such processing algorithm is generation of a deformation time series product, which is a series of images representing ground displacements over time, which can be computed using a time series of interferometric SAR (InSAR) products. The set of software tools necessary to generate this useful product are difficult to install, configure, and use. Moreover, for a long time series with many images, the processing of just the interferograms can take days. Principally built by three undergraduate students at the ASF DAAC, the deformation time series processing relies the new Amazon Batch service, which enables processing of jobs with complex interconnected dependencies in a straightforward and efficient manner. In the case of generating a deformation time series product from a stack of single-look complex SAR images, the system uses Batch to serialize the up-front processing, interferogram generation, optional tropospheric correction, and deformation time series generation. The most time consuming portion is the interferogram generation, because even for a fairly small stack of images many interferograms need to be processed. By using AWS Batch, the interferograms are all generated in parallel; the entire process completes in hours rather than days. Additionally, the individual interferograms are saved in Amazon's cloud storage, so that when new data is acquired in the stack, an updated time series product can be generated with minimal addiitonal processing. This presentation will focus on the development techniques and enabling technologies that were used in developing the time
Summary of ENDF/B pre-processing codes

International Nuclear Information System (INIS)

Cullen, D.E.

1981-12-01

This document contains the summary documentation for the ENDF/B pre-processing codes: LINEAR, RECENT, SIGMA1, GROUPIE, EVALPLOT, MERGER, DICTION, CONVERT. This summary documentation is merely a copy of the comment cards that appear at the beginning of each programme; these comment cards always reflect the latest status of input options, etc. For the latest published documentation on the methods used in these codes see UCRL-50400, Vol.17 parts A-E, Lawrence Livermore Laboratory (1979)
Vector bilinear autoregressive time series model and its superiority ...

African Journals Online (AJOL)

In this research, a vector bilinear autoregressive time series model was proposed and used to model three revenue series (X1, X2, X3) . The “orders” of the three series were identified on the basis of the distribution of autocorrelation and partial autocorrelation functions and were used to construct the vector bilinear models.
25 years of time series forecasting

NARCIS (Netherlands)

de Gooijer, J.G.; Hyndman, R.J.

2006-01-01

We review the past 25 years of research into time series forecasting. In this silver jubilee issue, we naturally highlight results published in journals managed by the International Institute of Forecasters (Journal of Forecasting 1982-1985 and International Journal of Forecasting 1985-2005). During
Markov Trends in Macroeconomic Time Series

NARCIS (Netherlands)

R. Paap (Richard)

1997-01-01

textabstractMany macroeconomic time series are characterised by long periods of positive growth, expansion periods, and short periods of negative growth, recessions. A popular model to describe this phenomenon is the Markov trend, which is a stochastic segmented trend where the slope depends on the
Modeling seasonality in bimonthly time series

NARCIS (Netherlands)

Ph.H.B.F. Franses (Philip Hans)

1992-01-01

textabstractA recurring issue in modeling seasonal time series variables is the choice of the most adequate model for the seasonal movements. One selection method for quarterly data is proposed in Hylleberg et al. (1990). Market response models are often constructed for bimonthly variables, and
On clustering fMRI time series

DEFF Research Database (Denmark)

Goutte, Cyril; Toft, Peter Aundal; Rostrup, E.

1999-01-01

Analysis of fMRI time series is often performed by extracting one or more parameters for the individual voxels. Methods based, e.g., on various statistical tests are then used to yield parameters corresponding to probability of activation or activation strength. However, these methods do...
Parallel finite elements with domain decomposition and its pre-processing

International Nuclear Information System (INIS)

Yoshida, A.; Yagawa, G.; Hamada, S.

1993-01-01

This paper describes a parallel finite element analysis using a domain decomposition method, and the pre-processing for the parallel calculation. Computer simulations are about to replace experiments in various fields, and the scale of model to be simulated tends to be extremely large. On the other hand, computational environment has drastically changed in these years. Especially, parallel processing on massively parallel computers or computer networks is considered to be promising techniques. In order to achieve high efficiency on such parallel computation environment, large granularity of tasks, a well-balanced workload distribution are key issues. It is also important to reduce the cost of pre-processing in such parallel FEM. From the point of view, the authors developed the domain decomposition FEM with the automatic and dynamic task-allocation mechanism and the automatic mesh generation/domain subdivision system for it. (author)
FALSE DETERMINATIONS OF CHAOS IN SHORT NOISY TIME SERIES. (R828745)

Science.gov (United States)

A method (NEMG) proposed in 1992 for diagnosing chaos in noisy time series with 50 or fewer observations entails fitting the time series with an empirical function which predicts an observation in the series from previous observations, and then estimating the rate of divergenc...
Multiscale multifractal multiproperty analysis of financial time series based on Rényi entropy

Science.gov (United States)

Yujun, Yang; Jianping, Li; Yimei, Yang

This paper introduces a multiscale multifractal multiproperty analysis based on Rényi entropy (3MPAR) method to analyze short-range and long-range characteristics of financial time series, and then applies this method to the five time series of five properties in four stock indices. Combining the two analysis techniques of Rényi entropy and multifractal detrended fluctuation analysis (MFDFA), the 3MPAR method focuses on the curves of Rényi entropy and generalized Hurst exponent of five properties of four stock time series, which allows us to study more universal and subtle fluctuation characteristics of financial time series. By analyzing the curves of the Rényi entropy and the profiles of the logarithm distribution of MFDFA of five properties of four stock indices, the 3MPAR method shows some fluctuation characteristics of the financial time series and the stock markets. Then, it also shows a richer information of the financial time series by comparing the profile of five properties of four stock indices. In this paper, we not only focus on the multifractality of time series but also the fluctuation characteristics of the financial time series and subtle differences in the time series of different properties. We find that financial time series is far more complex than reported in some research works using one property of time series.
A Literature Survey of Early Time Series Classification and Deep Learning

OpenAIRE

Santos, Tiago; Kern, Roman

2017-01-01

This paper provides an overview of current literature on time series classification approaches, in particular of early time series classification. A very common and effective time series classification approach is the 1-Nearest Neighbor classier, with different distance measures such as the Euclidean or dynamic time warping distances. This paper starts by reviewing these baseline methods. More recently, with the gain in popularity in the application of deep neural networks to the eld of...
Signal Processing for Time-Series Functions on a Graph

Science.gov (United States)

2018-02-01

Figures Fig. 1 Time -series function on a fixed graph.............................................2 iv Approved for public release; distribution is...φi〉`2(V)φi (39) 6= f̄ (40) Instead, we simply recover the average of f over time . 13 Approved for public release; distribution is unlimited. This...ARL-TR-8276• FEB 2018 US Army Research Laboratory Signal Processing for Time -Series Functions on a Graph by Humberto Muñoz-Barona, Jean Vettel, and
Non-linear time series extreme events and integer value problems

CERN Document Server

Turkman, Kamil Feridun; Zea Bermudez, Patrícia

2014-01-01

This book offers a useful combination of probabilistic and statistical tools for analyzing nonlinear time series. Key features of the book include a study of the extremal behavior of nonlinear time series and a comprehensive list of nonlinear models that address different aspects of nonlinearity. Several inferential methods, including quasi likelihood methods, sequential Markov Chain Monte Carlo Methods and particle filters, are also included so as to provide an overall view of the available tools for parameter estimation for nonlinear models. A chapter on integer time series models based on several thinning operations, which brings together all recent advances made in this area, is also included. Readers should have attended a prior course on linear time series, and a good grasp of simulation-based inferential methods is recommended. This book offers a valuable resource for second-year graduate students and researchers in statistics and other scientific areas who need a basic understanding of nonlinear time ...

Learning of time series through neuron-to-neuron instruction

Energy Technology Data Exchange (ETDEWEB)

Miyazaki, Y [Department of Physics, Kyoto University, Kyoto 606-8502, (Japan); Kinzel, W [Institut fuer Theoretische Physik, Universitaet Wurzburg, 97074 Wurzburg (Germany); Shinomoto, S [Department of Physics, Kyoto University, Kyoto (Japan)

2003-02-07

A model neuron with delayline feedback connections can learn a time series generated by another model neuron. It has been known that some student neurons that have completed such learning under the instruction of a teacher's quasi-periodic sequence mimic the teacher's time series over a long interval, even after instruction has ceased. We found that in addition to such faithful students, there are unfaithful students whose time series eventually diverge exponentially from that of the teacher. In order to understand the circumstances that allow for such a variety of students, the orbit dimension was estimated numerically. The quasi-periodic orbits in question were found to be confined in spaces with dimensions significantly smaller than that of the full phase space.
Learning of time series through neuron-to-neuron instruction

International Nuclear Information System (INIS)

Miyazaki, Y; Kinzel, W; Shinomoto, S

2003-01-01

A model neuron with delayline feedback connections can learn a time series generated by another model neuron. It has been known that some student neurons that have completed such learning under the instruction of a teacher's quasi-periodic sequence mimic the teacher's time series over a long interval, even after instruction has ceased. We found that in addition to such faithful students, there are unfaithful students whose time series eventually diverge exponentially from that of the teacher. In order to understand the circumstances that allow for such a variety of students, the orbit dimension was estimated numerically. The quasi-periodic orbits in question were found to be confined in spaces with dimensions significantly smaller than that of the full phase space
Quirky patterns in time-series of estimates of recruitment could be artefacts

DEFF Research Database (Denmark)

Dickey-Collas, M.; Hinzen, N.T.; Nash, R.D.M.

2015-01-01

of recruitment time-series in databases is therefore not consistent across or within species and stocks. Caution is therefore required as perhaps the characteristics of the time-series of stock dynamics may be determined by the model used to generate them, rather than underlying ecological phenomena......The accessibility of databases of global or regional stock assessment outputs is leading to an increase in meta-analysis of the dynamics of fish stocks. In most of these analyses, each of the time-series is generally assumed to be directly comparable. However, the approach to stock assessment...... employed, and the associated modelling assumptions, can have an important influence on the characteristics of each time-series. We explore this idea by investigating recruitment time-series with three different recruitment parameterizations: a stock–recruitment model, a random-walk time-series model...
The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure

KAUST Repository

Euán, Carolina

2018-04-12

We present a new method for time series clustering which we call the Hierarchical Spectral Merger (HSM) method. This procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms. The extent of similarity between a pair of time series is measured using the total variation distance between their estimated spectral densities. At each step of the algorithm, every time two clusters merge, a new spectral density is estimated using the whole information present in both clusters, which is representative of all the series in the new cluster. The method is implemented in an R package HSMClust. We present two applications of the HSM method, one to data coming from wave-height measurements in oceanography and the other to electroencefalogram (EEG) data.
Estimation of time-delayed mutual information and bias for irregularly and sparsely sampled time-series

International Nuclear Information System (INIS)

Albers, D.J.; Hripcsak, George

2012-01-01

Highlights: ► Time-delayed mutual information for irregularly sampled time-series. ► Estimation bias for the time-delayed mutual information calculation. ► Fast, simple, PDF estimator independent, time-delayed mutual information bias estimate. ► Quantification of data-set-size limits of the time-delayed mutual calculation. - Abstract: A method to estimate the time-dependent correlation via an empirical bias estimate of the time-delayed mutual information for a time-series is proposed. In particular, the bias of the time-delayed mutual information is shown to often be equivalent to the mutual information between two distributions of points from the same system separated by infinite time. Thus intuitively, estimation of the bias is reduced to estimation of the mutual information between distributions of data points separated by large time intervals. The proposed bias estimation techniques are shown to work for Lorenz equations data and glucose time series data of three patients from the Columbia University Medical Center database.
Inferring interdependencies from short time series

Indian Academy of Sciences (India)

Abstract. Complex networks provide an invaluable framework for the study of interlinked dynamical systems. In many cases, such networks are constructed from observed time series by first estimating the ...... does not quantify causal relations (unlike IOTA, or .... Africa_map_regions.svg, which is under public domain.
Stochastic modeling of hourly rainfall times series in Campania (Italy)

Science.gov (United States)

Giorgio, M.; Greco, R.

2009-04-01

Occurrence of flowslides and floods in small catchments is uneasy to predict, since it is affected by a number of variables, such as mechanical and hydraulic soil properties, slope morphology, vegetation coverage, rainfall spatial and temporal variability. Consequently, landslide risk assessment procedures and early warning systems still rely on simple empirical models based on correlation between recorded rainfall data and observed landslides and/or river discharges. Effectiveness of such systems could be improved by reliable quantitative rainfall prediction, which can allow gaining larger lead-times. Analysis of on-site recorded rainfall height time series represents the most effective approach for a reliable prediction of local temporal evolution of rainfall. Hydrological time series analysis is a widely studied field in hydrology, often carried out by means of autoregressive models, such as AR, ARMA, ARX, ARMAX (e.g. Salas [1992]). Such models gave the best results when applied to the analysis of autocorrelated hydrological time series, like river flow or level time series. Conversely, they are not able to model the behaviour of intermittent time series, like point rainfall height series usually are, especially when recorded with short sampling time intervals. More useful for this issue are the so-called DRIP (Disaggregated Rectangular Intensity Pulse) and NSRP (Neymann-Scott Rectangular Pulse) model [Heneker et al., 2001; Cowpertwait et al., 2002], usually adopted to generate synthetic point rainfall series. In this paper, the DRIP model approach is adopted, in which the sequence of rain storms and dry intervals constituting the structure of rainfall time series is modeled as an alternating renewal process. Final aim of the study is to provide a useful tool to implement an early warning system for hydrogeological risk management. Model calibration has been carried out with hourly rainfall hieght data provided by the rain gauges of Campania Region civil
Using forbidden ordinal patterns to detect determinism in irregularly sampled time series.

Science.gov (United States)

Kulp, C W; Chobot, J M; Niskala, B J; Needhammer, C J

2016-02-01

It is known that when symbolizing a time series into ordinal patterns using the Bandt-Pompe (BP) methodology, there will be ordinal patterns called forbidden patterns that do not occur in a deterministic series. The existence of forbidden patterns can be used to identify deterministic dynamics. In this paper, the ability to use forbidden patterns to detect determinism in irregularly sampled time series is tested on data generated from a continuous model system. The study is done in three parts. First, the effects of sampling time on the number of forbidden patterns are studied on regularly sampled time series. The next two parts focus on two types of irregular-sampling, missing data and timing jitter. It is shown that forbidden patterns can be used to detect determinism in irregularly sampled time series for low degrees of sampling irregularity (as defined in the paper). In addition, comments are made about the appropriateness of using the BP methodology to symbolize irregularly sampled time series.
Forecasting the Reference Evapotranspiration Using Time Series Model

Directory of Open Access Journals (Sweden)

H. Zare Abyaneh

2016-10-01

Full Text Available Introduction: Reference evapotranspiration is one of the most important factors in irrigation timing and field management. Moreover, reference evapotranspiration forecasting can play a vital role in future developments. Therefore in this study, the seasonal autoregressive integrated moving average (ARIMA model was used to forecast the reference evapotranspiration time series in the Esfahan, Semnan, Shiraz, Kerman, and Yazd synoptic stations. Materials and Methods: In the present study in all stations (characteristics of the synoptic stations are given in Table 1, the meteorological data, including mean, maximum and minimum air temperature, relative humidity, dry-and wet-bulb temperature, dew-point temperature, wind speed, precipitation, air vapor pressure and sunshine hours were collected from the Islamic Republic of Iran Meteorological Organization (IRIMO for the 41 years from 1965 to 2005. The FAO Penman-Monteith equation was used to calculate the monthly reference evapotranspiration in the five synoptic stations and the evapotranspiration time series were formed. The unit root test was used to identify whether the time series was stationary, then using the Box-Jenkins method, seasonal ARIMA models were applied to the sample data. Table 1. The geographical location and climate conditions of the synoptic stations Station\tGeographical location\tAltitude (m\tMean air temperature (°C\tMean precipitation (mm\tClimate, according to the De Martonne index classification Longitude (E\tLatitude (N Annual\tMin. and Max. Esfahan\t51° 40'\t32° 37'\t1550.4\t16.36\t9.4-23.3\t122\tArid Semnan\t53° 33'\t35° 35'\t1130.8\t18.0\t12.4-23.8\t140\tArid Shiraz\t52° 36'\t29° 32'\t1484\t18.0\t10.2-25.9\t324\tSemi-arid Kerman\t56° 58'\t30° 15'\t1753.8\t15.6\t6.7-24.6\t142\tArid Yazd\t54° 17'\t31° 54'\t1237.2\t19.2\t11.8-26.0\t61\tArid Results and Discussion: The monthly meteorological data were used as input for the Ref-ET software and monthly reference
Complexity testing techniques for time series data: A comprehensive literature review

International Nuclear Information System (INIS)

Tang, Ling; Lv, Huiling; Yang, Fengmei; Yu, Lean

2015-01-01

Highlights: • A literature review of complexity testing techniques for time series data is provided. • Complexity measurements can generally fall into fractality, methods derived from nonlinear dynamics and entropy. • Different types investigate time series data from different perspectives. • Measures, applications and future studies for each type are presented. - Abstract: Complexity may be one of the most important measurements for analysing time series data; it covers or is at least closely related to different data characteristics within nonlinear system theory. This paper provides a comprehensive literature review examining the complexity testing techniques for time series data. According to different features, the complexity measurements for time series data can be divided into three primary groups, i.e., fractality (mono- or multi-fractality) for self-similarity (or system memorability or long-term persistence), methods derived from nonlinear dynamics (via attractor invariants or diagram descriptions) for attractor properties in phase-space, and entropy (structural or dynamical entropy) for the disorder state of a nonlinear system. These estimations analyse time series dynamics from different perspectives but are closely related to or even dependent on each other at the same time. In particular, a weaker self-similarity, a more complex structure of attractor, and a higher-level disorder state of a system consistently indicate that the observed time series data are at a higher level of complexity. Accordingly, this paper presents a historical tour of the important measures and works for each group, as well as ground-breaking and recent applications and future research directions.
Complex dynamic in ecological time series

Science.gov (United States)

Peter Turchin; Andrew D. Taylor

1992-01-01

Although the possibility of complex dynamical behaviors-limit cycles, quasiperiodic oscillations, and aperiodic chaos-has been recognized theoretically, most ecologists are skeptical of their importance in nature. In this paper we develop a methodology for reconstructing endogenous (or deterministic) dynamics from ecological time series. Our method consists of fitting...
Time Series Modelling using Proc Varmax

DEFF Research Database (Denmark)

Milhøj, Anders

2007-01-01

In this paper it will be demonstrated how various time series problems could be met using Proc Varmax. The procedure is rather new and hence new features like cointegration, testing for Granger causality are included, but it also means that more traditional ARIMA modelling as outlined by Box...
SensL B-Series and C-Series silicon photomultipliers for time-of-flight positron emission tomography

Energy Technology Data Exchange (ETDEWEB)

O' Neill, K., E-mail: koneill@sensl.com; Jackson, C., E-mail: cjackson@sensl.com

2015-07-01

Silicon photomultipliers from SensL are designed for high performance, uniformity and low cost. They demonstrate peak photon detection efficiency of 41% at 420 nm, which is matched to the output spectrum of cerium doped lutetium orthosilicate. Coincidence resolving time of less than 220 ps is demonstrated. New process improvements have lead to the development of C-Series SiPM which reduces the dark noise by over an order of magnitude. In this paper we will show characterization test results which include photon detection efficiency, dark count rate, crosstalk probability, afterpulse probability and coincidence resolving time comparing B-Series to the newest pre-production C-Series. Additionally we will discuss the effect of silicon photomultiplier microcell size on coincidence resolving time allowing the optimal microcell size choice to be made for time of flight positron emission tomography systems.
Kriging Methodology and Its Development in Forecasting Econometric Time Series

Directory of Open Access Journals (Sweden)

Andrej Gajdoš

2017-03-01

Full Text Available One of the approaches for forecasting future values of a time series or unknown spatial data is kriging. The main objective of the paper is to introduce a general scheme of kriging in forecasting econometric time series using a family of linear regression time series models (shortly named as FDSLRM which apply regression not only to a trend but also to a random component of the observed time series. Simultaneously performing a Monte Carlo simulation study with a real electricity consumption dataset in the R computational langure and environment, we investigate the well-known problem of “negative” estimates of variance components when kriging predictions fail. Our following theoretical analysis, including also the modern apparatus of advanced multivariate statistics, gives us the formulation and proof of a general theorem about the explicit form of moments (up to sixth order for a Gaussian time series observation. This result provides a basis for further theoretical and computational research in the kriging methodology development.
Use of Time-Series, ARIMA Designs to Assess Program Efficacy.

Science.gov (United States)

Braden, Jeffery P.; And Others

1990-01-01

Illustrates use of time-series designs for determining efficacy of interventions with fictitious data describing drug-abuse prevention program. Discusses problems and procedures associated with time-series data analysis using Auto Regressive Integrated Moving Averages (ARIMA) models. Example illustrates application of ARIMA analysis for…
Preprocessing with Photoshop Software on Microscopic Images of A549 Cells in Epithelial-Mesenchymal Transition.

Science.gov (United States)

Ren, Zhou-Xin; Yu, Hai-Bin; Shen, Jun-Ling; Li, Ya; Li, Jian-Sheng

2015-06-01

To establish a preprocessing method for cell morphometry in microscopic images of A549 cells in epithelial-mesenchymal transition (EMT). Adobe Photoshop CS2 (Adobe Systems, Inc.) was used for preprocessing the images. First, all images were processed for size uniformity and high distinguishability between the cell and background area. Then, a blank image with the same size and grids was established and cross points of the grids were added into a distinct color. The blank image was merged into a processed image. In the merged images, the cells with 1 or more cross points were chosen, and then the cell areas were enclosed and were replaced in a distinct color. Except for chosen cellular areas, all areas were changed into a unique hue. Three observers quantified roundness of cells in images with the image preprocess (IPP) or without the method (Controls), respectively. Furthermore, 1 observer measured the roundness 3 times with the 2 methods, respectively. The results between IPPs and Controls were compared for repeatability and reproducibility. As compared with the Control method, among 3 observers, use of the IPP method resulted in a higher number and a higher percentage of same-chosen cells in an image. The relative average deviation values of roundness, either for 3 observers or 1 observer, were significantly higher in Controls than in IPPs (p Photoshop, a chosen cell from an image was more objective, regular, and accurate, creating an increase of reproducibility and repeatability on morphometry of A549 cells in epithelial to mesenchymal transition.
An algorithm of Saxena-Easo on fuzzy time series forecasting

Science.gov (United States)

Ramadhani, L. C.; Anggraeni, D.; Kamsyakawuni, A.; Hadi, A. F.

2018-04-01

This paper presents a forecast model of Saxena-Easo fuzzy time series prediction to study the prediction of Indonesia inflation rate in 1970-2016. We use MATLAB software to compute this method. The algorithm of Saxena-Easo fuzzy time series doesn’t need stationarity like conventional forecasting method, capable of dealing with the value of time series which are linguistic and has the advantage of reducing the calculation, time and simplifying the calculation process. Generally it’s focus on percentage change as the universe discourse, interval partition and defuzzification. The result indicate that between the actual data and the forecast data are close enough with Root Mean Square Error (RMSE) = 1.5289.
Evolutionary Algorithms for the Detection of Structural Breaks in Time Series

DEFF Research Database (Denmark)

Doerr, Benjamin; Fischer, Paul; Hilbert, Astrid

2013-01-01

Detecting structural breaks is an essential task for the statistical analysis of time series, for example, for fitting parametric models to it. In short, structural breaks are points in time at which the behavior of the time series changes. Typically, no solid background knowledge of the time...
Fast randomized point location without preprocessing in two- and three-dimensional Delaunay triangulations

Energy Technology Data Exchange (ETDEWEB)

Muecke, E.P.; Saias, I.; Zhu, B.

1996-05-01

This paper studies the point location problem in Delaunay triangulations without preprocessing and additional storage. The proposed procedure finds the query point simply by walking through the triangulation, after selecting a good starting point by random sampling. The analysis generalizes and extends a recent result of d = 2 dimensions by proving this procedure to take expected time close to O(n{sup 1/(d+1)}) for point location in Delaunay triangulations of n random points in d = 3 dimensions. Empirical results in both two and three dimensions show that this procedure is efficient in practice.
On modeling panels of time series

NARCIS (Netherlands)

Ph.H.B.F. Franses (Philip Hans)

2002-01-01

textabstractThis paper reviews research issues in modeling panels of time series. Examples of this type of data are annually observed macroeconomic indicators for all countries in the world, daily returns on the individual stocks listed in the S&P500, and the sales records of all items in a

Unsupervised Symbolization of Signal Time Series for Extraction of the Embedded Information

Directory of Open Access Journals (Sweden)

Yue Li

2017-03-01

Full Text Available This paper formulates an unsupervised algorithm for symbolization of signal time series to capture the embedded dynamic behavior. The key idea is to convert time series of the digital signal into a string of (spatially discrete symbols from which the embedded dynamic information can be extracted in an unsupervised manner (i.e., no requirement for labeling of time series. The main challenges here are: (1 definition of the symbol assignment for the time series; (2 identification of the partitioning segment locations in the signal space of time series; and (3 construction of probabilistic finite-state automata (PFSA from the symbol strings that contain temporal patterns. The reported work addresses these challenges by maximizing the mutual information measures between symbol strings and PFSA states. The proposed symbolization method has been validated by numerical simulation as well as by experimentation in a laboratory environment. Performance of the proposed algorithm has been compared to that of two commonly used algorithms of time series partitioning.
Classification of time-series images using deep convolutional neural networks

Science.gov (United States)

Hatami, Nima; Gavet, Yann; Debayle, Johan

2018-04-01

Convolutional Neural Networks (CNN) has achieved a great success in image recognition task by automatically learning a hierarchical feature representation from raw data. While the majority of Time-Series Classification (TSC) literature is focused on 1D signals, this paper uses Recurrence Plots (RP) to transform time-series into 2D texture images and then take advantage of the deep CNN classifier. Image representation of time-series introduces different feature types that are not available for 1D signals, and therefore TSC can be treated as texture image recognition task. CNN model also allows learning different levels of representations together with a classifier, jointly and automatically. Therefore, using RP and CNN in a unified framework is expected to boost the recognition rate of TSC. Experimental results on the UCR time-series classification archive demonstrate competitive accuracy of the proposed approach, compared not only to the existing deep architectures, but also to the state-of-the art TSC algorithms.
Safe and sensible preprocessing and baseline correction of pupil-size data.

Science.gov (United States)

Mathôt, Sebastiaan; Fabius, Jasper; Van Heusden, Elle; Van der Stigchel, Stefan

2018-02-01

Measurement of pupil size (pupillometry) has recently gained renewed interest from psychologists, but there is little agreement on how pupil-size data is best analyzed. Here we focus on one aspect of pupillometric analyses: baseline correction, i.e., analyzing changes in pupil size relative to a baseline period. Baseline correction is useful in experiments that investigate the effect of some experimental manipulation on pupil size. In such experiments, baseline correction improves statistical power by taking into account random fluctuations in pupil size over time. However, we show that baseline correction can also distort data if unrealistically small pupil sizes are recorded during the baseline period, which can easily occur due to eye blinks, data loss, or other distortions. Divisive baseline correction (corrected pupil size = pupil size/baseline) is affected more strongly by such distortions than subtractive baseline correction (corrected pupil size = pupil size - baseline). We discuss the role of baseline correction as a part of preprocessing of pupillometric data, and make five recommendations: (1) before baseline correction, perform data preprocessing to mark missing and invalid data, but assume that some distortions will remain in the data; (2) use subtractive baseline correction; (3) visually compare your corrected and uncorrected data; (4) be wary of pupil-size effects that emerge faster than the latency of the pupillary response allows (within ±220 ms after the manipulation that induces the effect); and (5) remove trials on which baseline pupil size is unrealistically small (indicative of blinks and other distortions).
Long Range Dependence Prognostics for Bearing Vibration Intensity Chaotic Time Series

Directory of Open Access Journals (Sweden)

Qing Li

2016-01-01

Full Text Available According to the chaotic features and typical fractional order characteristics of the bearing vibration intensity time series, a forecasting approach based on long range dependence (LRD is proposed. In order to reveal the internal chaotic properties, vibration intensity time series are reconstructed based on chaos theory in phase-space, the delay time is computed with C-C method and the optimal embedding dimension and saturated correlation dimension are calculated via the Grassberger–Procaccia (G-P method, respectively, so that the chaotic characteristics of vibration intensity time series can be jointly determined by the largest Lyapunov exponent and phase plane trajectory of vibration intensity time series, meanwhile, the largest Lyapunov exponent is calculated by the Wolf method and phase plane trajectory is illustrated using Duffing-Holmes Oscillator (DHO. The Hurst exponent and long range dependence prediction method are proposed to verify the typical fractional order features and improve the prediction accuracy of bearing vibration intensity time series, respectively. Experience shows that the vibration intensity time series have chaotic properties and the LRD prediction method is better than the other prediction methods (largest Lyapunov, auto regressive moving average (ARMA and BP neural network (BPNN model in prediction accuracy and prediction performance, which provides a new approach for running tendency predictions for rotating machinery and provide some guidance value to the engineering practice.
Critical values for unit root tests in seasonal time series

NARCIS (Netherlands)

Ph.H.B.F. Franses (Philip Hans); B. Hobijn (Bart)

1997-01-01

textabstractIn this paper, we present tables with critical values for a variety of tests for seasonal and non-seasonal unit roots in seasonal time series. We consider (extensions of) the Hylleberg et al. and Osborn et al. test procedures. These extensions concern time series with increasing seasonal
Classification of time series patterns from complex dynamic systems

Energy Technology Data Exchange (ETDEWEB)

Schryver, J.C.; Rao, N.

1998-07-01

An increasing availability of high-performance computing and data storage media at decreasing cost is making possible the proliferation of large-scale numerical databases and data warehouses. Numeric warehousing enterprises on the order of hundreds of gigabytes to terabytes are a reality in many fields such as finance, retail sales, process systems monitoring, biomedical monitoring, surveillance and transportation. Large-scale databases are becoming more accessible to larger user communities through the internet, web-based applications and database connectivity. Consequently, most researchers now have access to a variety of massive datasets. This trend will probably only continue to grow over the next several years. Unfortunately, the availability of integrated tools to explore, analyze and understand the data warehoused in these archives is lagging far behind the ability to gain access to the same data. In particular, locating and identifying patterns of interest in numerical time series data is an increasingly important problem for which there are few available techniques. Temporal pattern recognition poses many interesting problems in classification, segmentation, prediction, diagnosis and anomaly detection. This research focuses on the problem of classification or characterization of numerical time series data. Highway vehicles and their drivers are examples of complex dynamic systems (CDS) which are being used by transportation agencies for field testing to generate large-scale time series datasets. Tools for effective analysis of numerical time series in databases generated by highway vehicle systems are not yet available, or have not been adapted to the target problem domain. However, analysis tools from similar domains may be adapted to the problem of classification of numerical time series data.
Sensitivity analysis of machine-learning models of hydrologic time series

Science.gov (United States)

O'Reilly, A. M.

2017-12-01

Sensitivity analysis traditionally has been applied to assessing model response to perturbations in model parameters, where the parameters are those model input variables adjusted during calibration. Unlike physics-based models where parameters represent real phenomena, the equivalent of parameters for machine-learning models are simply mathematical "knobs" that are automatically adjusted during training/testing/verification procedures. Thus the challenge of extracting knowledge of hydrologic system functionality from machine-learning models lies in their very nature, leading to the label "black box." Sensitivity analysis of the forcing-response behavior of machine-learning models, however, can provide understanding of how the physical phenomena represented by model inputs affect the physical phenomena represented by model outputs.As part of a previous study, hybrid spectral-decomposition artificial neural network (ANN) models were developed to simulate the observed behavior of hydrologic response contained in multidecadal datasets of lake water level, groundwater level, and spring flow. Model inputs used moving window averages (MWA) to represent various frequencies and frequency-band components of time series of rainfall and groundwater use. Using these forcing time series, the MWA-ANN models were trained to predict time series of lake water level, groundwater level, and spring flow at 51 sites in central Florida, USA. A time series of sensitivities for each MWA-ANN model was produced by perturbing forcing time-series and computing the change in response time-series per unit change in perturbation. Variations in forcing-response sensitivities are evident between types (lake, groundwater level, or spring), spatially (among sites of the same type), and temporally. Two generally common characteristics among sites are more uniform sensitivities to rainfall over time and notable increases in sensitivities to groundwater usage during significant drought periods.
Fractal analysis and nonlinear forecasting of indoor 222Rn time series

International Nuclear Information System (INIS)

Pausch, G.; Bossew, P.; Hofmann, W.; Steger, F.

1998-01-01

Fractal analyses of indoor 222 Rn time series were performed using different chaos theory based measurements such as time delay method, Hurst's rescaled range analysis, capacity (fractal) dimension, and Lyapunov exponent. For all time series we calculated only positive Lyapunov exponents which is a hint to chaos, while the Hurst exponents were well below 0.5, indicating antipersistent behaviour (past trends tend to reverse in the future). These time series were also analyzed with a nonlinear prediction method which allowed an estimation of the embedding dimensions with some restrictions, limiting the prediction to about three relative time steps. (orig.)
Koopman Operator Framework for Time Series Modeling and Analysis

Science.gov (United States)

Surana, Amit

2018-01-01

We propose an interdisciplinary framework for time series classification, forecasting, and anomaly detection by combining concepts from Koopman operator theory, machine learning, and linear systems and control theory. At the core of this framework is nonlinear dynamic generative modeling of time series using the Koopman operator which is an infinite-dimensional but linear operator. Rather than working with the underlying nonlinear model, we propose two simpler linear representations or model forms based on Koopman spectral properties. We show that these model forms are invariants of the generative model and can be readily identified directly from data using techniques for computing Koopman spectral properties without requiring the explicit knowledge of the generative model. We also introduce different notions of distances on the space of such model forms which is essential for model comparison/clustering. We employ the space of Koopman model forms equipped with distance in conjunction with classical machine learning techniques to develop a framework for automatic feature generation for time series classification. The forecasting/anomaly detection framework is based on using Koopman model forms along with classical linear systems and control approaches. We demonstrate the proposed framework for human activity classification, and for time series forecasting/anomaly detection in power grid application.
Time series analysis and its applications with R examples

CERN Document Server

Shumway, Robert H

2017-01-01

The fourth edition of this popular graduate textbook, like its predecessors, presents a balanced and comprehensive treatment of both time and frequency domain methods with accompanying theory. Numerous examples using nontrivial data illustrate solutions to problems such as discovering natural and anthropogenic climate change, evaluating pain perception experiments using functional magnetic resonance imaging, and monitoring a nuclear test ban treaty. The book is designed as a textbook for graduate level students in the physical, biological, and social sciences and as a graduate level text in statistics. Some parts may also serve as an undergraduate introductory course. Theory and methodology are separated to allow presentations on different levels. In addition to coverage of classical methods of time series regression, ARIMA models, spectral analysis and state-space models, the text includes modern developments including categorical time series analysis, multivariate spectral methods, long memory series, nonli...
A KST framework for correlation network construction from time series signals

Science.gov (United States)

Qi, Jin-Peng; Gu, Quan; Zhu, Ying; Zhang, Ping

2018-04-01

A KST (Kolmogorov-Smirnov test and T statistic) method is used for construction of a correlation network based on the fluctuation of each time series within the multivariate time signals. In this method, each time series is divided equally into multiple segments, and the maximal data fluctuation in each segment is calculated by a KST change detection procedure. Connections between each time series are derived from the data fluctuation matrix, and are used for construction of the fluctuation correlation network (FCN). The method was tested with synthetic simulations and the result was compared with those from using KS or T only for detection of data fluctuation. The novelty of this study is that the correlation analyses was based on the data fluctuation in each segment of each time series rather than on the original time signals, which would be more meaningful for many real world applications and for analysis of large-scale time signals where prior knowledge is uncertain.
Summary of ENDF/B Pre-Processing Codes June 1983

International Nuclear Information System (INIS)

Cullen, D.E.

1983-06-01

This is the summary documentation for the 1983 version of the ENDF/B Pre-Processing Codes LINEAR, RECENT, SIGMA1, GROUPIE, EVALPLOT, MERGER, DICTION, COMPLOT, CONVERT. This summary documentation is merely a copy of the comment cards that appear at the beginning of each programme; these comment cards always reflect the latest status of input options, etc
Multivariate stochastic analysis for Monthly hydrological time series at Cuyahoga River Basin

Science.gov (United States)

zhang, L.

2011-12-01

Copula has become a very powerful statistic and stochastic methodology in case of the multivariate analysis in Environmental and Water resources Engineering. In recent years, the popular one-parameter Archimedean copulas, e.g. Gumbel-Houggard copula, Cook-Johnson copula, Frank copula, the meta-elliptical copula, e.g. Gaussian Copula, Student-T copula, etc. have been applied in multivariate hydrological analyses, e.g. multivariate rainfall (rainfall intensity, duration and depth), flood (peak discharge, duration and volume), and drought analyses (drought length, mean and minimum SPI values, and drought mean areal extent). Copula has also been applied in the flood frequency analysis at the confluences of river systems by taking into account the dependence among upstream gauge stations rather than by using the hydrological routing technique. In most of the studies above, the annual time series have been considered as stationary signal which the time series have been assumed as independent identically distributed (i.i.d.) random variables. But in reality, hydrological time series, especially the daily and monthly hydrological time series, cannot be considered as i.i.d. random variables due to the periodicity existed in the data structure. Also, the stationary assumption is also under question due to the Climate Change and Land Use and Land Cover (LULC) change in the fast years. To this end, it is necessary to revaluate the classic approach for the study of hydrological time series by relaxing the stationary assumption by the use of nonstationary approach. Also as to the study of the dependence structure for the hydrological time series, the assumption of same type of univariate distribution also needs to be relaxed by adopting the copula theory. In this paper, the univariate monthly hydrological time series will be studied through the nonstationary time series analysis approach. The dependence structure of the multivariate monthly hydrological time series will be
Forecasting daily meteorological time series using ARIMA and regression models

Science.gov (United States)

Murat, Małgorzata; Malinowska, Iwona; Gos, Magdalena; Krzyszczak, Jaromir

2018-04-01

The daily air temperature and precipitation time series recorded between January 1, 1980 and December 31, 2010 in four European sites (Jokioinen, Dikopshof, Lleida and Lublin) from different climatic zones were modeled and forecasted. In our forecasting we used the methods of the Box-Jenkins and Holt- Winters seasonal auto regressive integrated moving-average, the autoregressive integrated moving-average with external regressors in the form of Fourier terms and the time series regression, including trend and seasonality components methodology with R software. It was demonstrated that obtained models are able to capture the dynamics of the time series data and to produce sensible forecasts.
Analysis of complex time series using refined composite multiscale entropy

International Nuclear Information System (INIS)

Wu, Shuen-De; Wu, Chiu-Wen; Lin, Shiou-Gwo; Lee, Kung-Yen; Peng, Chung-Kang

2014-01-01

Multiscale entropy (MSE) is an effective algorithm for measuring the complexity of a time series that has been applied in many fields successfully. However, MSE may yield an inaccurate estimation of entropy or induce undefined entropy because the coarse-graining procedure reduces the length of a time series considerably at large scales. Composite multiscale entropy (CMSE) was recently proposed to improve the accuracy of MSE, but it does not resolve undefined entropy. Here we propose a refined composite multiscale entropy (RCMSE) to improve CMSE. For short time series analyses, we demonstrate that RCMSE increases the accuracy of entropy estimation and reduces the probability of inducing undefined entropy.
Compounding approach for univariate time series with nonstationary variances

Science.gov (United States)

Schäfer, Rudi; Barkhofen, Sonja; Guhr, Thomas; Stöckmann, Hans-Jürgen; Kuhl, Ulrich

2015-12-01

A defining feature of nonstationary systems is the time dependence of their statistical parameters. Measured time series may exhibit Gaussian statistics on short time horizons, due to the central limit theorem. The sample statistics for long time horizons, however, averages over the time-dependent variances. To model the long-term statistical behavior, we compound the local distribution with the distribution of its parameters. Here, we consider two concrete, but diverse, examples of such nonstationary systems: the turbulent air flow of a fan and a time series of foreign exchange rates. Our main focus is to empirically determine the appropriate parameter distribution for the compounding approach. To this end, we extract the relevant time scales by decomposing the time signals into windows and determine the distribution function of the thus obtained local variances.
Tools for Generating Useful Time-series Data from PhenoCam Images

Science.gov (United States)

Milliman, T. E.; Friedl, M. A.; Frolking, S.; Hufkens, K.; Klosterman, S.; Richardson, A. D.; Toomey, M. P.

2012-12-01

The PhenoCam project (http://phenocam.unh.edu/) is tasked with acquiring, processing, and archiving digital repeat photography to be used for scientific studies of vegetation phenological processes. Over the past 5 years the PhenoCam project has collected over 2 million time series images for a total over 700 GB of image data. Several papers have been published describing derived "vegetation indices" (such as green-chromatic-coordinate or gcc) which can be compared to standard measures such as NDVI or EVI. Imagery from our archive is available for download but converting series of images for a particular camera into useful scientific data, while simple in principle, is complicated by a variety of factors. Cameras are often exposed to harsh weather conditions (high wind, rain, ice, snow pile up), which result in images where the field of view (FOV) is partially obscured or completely blocked for periods of time. The FOV can also change for other reasons (mount failures, tower maintenance, etc.) Some of the relatively inexpensive cameras that are being used can also temporarily lose color balance or exposure controls resulting in loss of imagery. All these factors negatively influence the automated analysis of the image time series making this a non-trivial task. Here we discuss the challenges of processing PhenoCam image time-series for vegetation monitoring and the associated data management tasks. We describe our current processing framework and a simple standardized output format for the resulting time-series data. The time-series data in this format will be generated for specific "regions of interest" (ROI's) for each of the cameras in the PhenoCam network. This standardized output (which will be updated daily) can be considered 'the pulse' of a particular camera and will provide a default phenological dynamic for said camera. The time-series data can also be viewed as a higher level product which can be used to generate "vegetation indices", like gcc, for
Multiple Time Series Ising Model for Financial Market Simulations

International Nuclear Information System (INIS)

Takaishi, Tetsuya

2015-01-01

In this paper we propose an Ising model which simulates multiple financial time series. Our model introduces the interaction which couples to spins of other systems. Simulations from our model show that time series exhibit the volatility clustering that is often observed in the real financial markets. Furthermore we also find non-zero cross correlations between the volatilities from our model. Thus our model can simulate stock markets where volatilities of stocks are mutually correlated
Time Series Modelling of Syphilis Incidence in China from 2005 to 2012.

Science.gov (United States)

Zhang, Xingyu; Zhang, Tao; Pei, Jiao; Liu, Yuanyuan; Li, Xiaosong; Medrano-Gracia, Pau

2016-01-01

The infection rate of syphilis in China has increased dramatically in recent decades, becoming a serious public health concern. Early prediction of syphilis is therefore of great importance for heath planning and management. In this paper, we analyzed surveillance time series data for primary, secondary, tertiary, congenital and latent syphilis in mainland China from 2005 to 2012. Seasonality and long-term trend were explored with decomposition methods. Autoregressive integrated moving average (ARIMA) was used to fit a univariate time series model of syphilis incidence. A separate multi-variable time series for each syphilis type was also tested using an autoregressive integrated moving average model with exogenous variables (ARIMAX). The syphilis incidence rates have increased three-fold from 2005 to 2012. All syphilis time series showed strong seasonality and increasing long-term trend. Both ARIMA and ARIMAX models fitted and estimated syphilis incidence well. All univariate time series showed highest goodness-of-fit results with the ARIMA(0,0,1)×(0,1,1) model. Time series analysis was an effective tool for modelling the historical and future incidence of syphilis in China. The ARIMAX model showed superior performance than the ARIMA model for the modelling of syphilis incidence. Time series correlations existed between the models for primary, secondary, tertiary, congenital and latent syphilis.
Normalization methods in time series of platelet function assays

Science.gov (United States)

Van Poucke, Sven; Zhang, Zhongheng; Roest, Mark; Vukicevic, Milan; Beran, Maud; Lauwereins, Bart; Zheng, Ming-Hua; Henskens, Yvonne; Lancé, Marcus; Marcus, Abraham

2016-01-01

Abstract Platelet function can be quantitatively assessed by specific assays such as light-transmission aggregometry, multiple-electrode aggregometry measuring the response to adenosine diphosphate (ADP), arachidonic acid, collagen, and thrombin-receptor activating peptide and viscoelastic tests such as rotational thromboelastometry (ROTEM). The task of extracting meaningful statistical and clinical information from high-dimensional data spaces in temporal multivariate clinical data represented in multivariate time series is complex. Building insightful visualizations for multivariate time series demands adequate usage of normalization techniques. In this article, various methods for data normalization (z-transformation, range transformation, proportion transformation, and interquartile range) are presented and visualized discussing the most suited approach for platelet function data series. Normalization was calculated per assay (test) for all time points and per time point for all tests. Interquartile range, range transformation, and z-transformation demonstrated the correlation as calculated by the Spearman correlation test, when normalized per assay (test) for all time points. When normalizing per time point for all tests, no correlation could be abstracted from the charts as was the case when using all data as 1 dataset for normalization. PMID:27428217

Development and application of a modified dynamic time warping algorithm (DTW-S) to analyses of primate brain expression time series.

Science.gov (United States)

Yuan, Yuan; Chen, Yi-Ping Phoebe; Ni, Shengyu; Xu, Augix Guohua; Tang, Lin; Vingron, Martin; Somel, Mehmet; Khaitovich, Philipp

2011-08-18

Comparing biological time series data across different conditions, or different specimens, is a common but still challenging task. Algorithms aligning two time series represent a valuable tool for such comparisons. While many powerful computation tools for time series alignment have been developed, they do not provide significance estimates for time shift measurements. Here, we present an extended version of the original DTW algorithm that allows us to determine the significance of time shift estimates in time series alignments, the DTW-Significance (DTW-S) algorithm. The DTW-S combines important properties of the original algorithm and other published time series alignment tools: DTW-S calculates the optimal alignment for each time point of each gene, it uses interpolated time points for time shift estimation, and it does not require alignment of the time-series end points. As a new feature, we implement a simulation procedure based on parameters estimated from real time series data, on a series-by-series basis, allowing us to determine the false positive rate (FPR) and the significance of the estimated time shift values. We assess the performance of our method using simulation data and real expression time series from two published primate brain expression datasets. Our results show that this method can provide accurate and robust time shift estimates for each time point on a gene-by-gene basis. Using these estimates, we are able to uncover novel features of the biological processes underlying human brain development and maturation. The DTW-S provides a convenient tool for calculating accurate and robust time shift estimates at each time point for each gene, based on time series data. The estimates can be used to uncover novel biological features of the system being studied. The DTW-S is freely available as an R package TimeShift at http://www.picb.ac.cn/Comparative/data.html.
hctsa: A Computational Framework for Automated Time-Series Phenotyping Using Massive Feature Extraction.

Science.gov (United States)

Fulcher, Ben D; Jones, Nick S

2017-11-22

Phenotype measurements frequently take the form of time series, but we currently lack a systematic method for relating these complex data streams to scientifically meaningful outcomes, such as relating the movement dynamics of organisms to their genotype or measurements of brain dynamics of a patient to their disease diagnosis. Previous work addressed this problem by comparing implementations of thousands of diverse scientific time-series analysis methods in an approach termed highly comparative time-series analysis. Here, we introduce hctsa, a software tool for applying this methodological approach to data. hctsa includes an architecture for computing over 7,700 time-series features and a suite of analysis and visualization algorithms to automatically select useful and interpretable time-series features for a given application. Using exemplar applications to high-throughput phenotyping experiments, we show how hctsa allows researchers to leverage decades of time-series research to quantify and understand informative structure in time-series data. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models

Science.gov (United States)

Price, Larry R.

2012-01-01

The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…
Applied time series analysis and innovative computing

CERN Document Server

Ao, Sio-Iong

2010-01-01

This text is a systematic, state-of-the-art introduction to the use of innovative computing paradigms as an investigative tool for applications in time series analysis. It includes frontier case studies based on recent research.
On Stabilizing the Variance of Dynamic Functional Brain Connectivity Time Series.

Science.gov (United States)

Thompson, William Hedley; Fransson, Peter

2016-12-01

Assessment of dynamic functional brain connectivity based on functional magnetic resonance imaging (fMRI) data is an increasingly popular strategy to investigate temporal dynamics of the brain's large-scale network architecture. Current practice when deriving connectivity estimates over time is to use the Fisher transformation, which aims to stabilize the variance of correlation values that fluctuate around varying true correlation values. It is, however, unclear how well the stabilization of signal variance performed by the Fisher transformation works for each connectivity time series, when the true correlation is assumed to be fluctuating. This is of importance because many subsequent analyses either assume or perform better when the time series have stable variance or adheres to an approximate Gaussian distribution. In this article, using simulations and analysis of resting-state fMRI data, we analyze the effect of applying different variance stabilization strategies on connectivity time series. We focus our investigation on the Fisher transformation, the Box-Cox (BC) transformation and an approach that combines both transformations. Our results show that, if the intention of stabilizing the variance is to use metrics on the time series, where stable variance or a Gaussian distribution is desired (e.g., clustering), the Fisher transformation is not optimal and may even skew connectivity time series away from being Gaussian. Furthermore, we show that the suboptimal performance of the Fisher transformation can be substantially improved by including an additional BC transformation after the dynamic functional connectivity time series has been Fisher transformed.
Characteristics of the transmission of autoregressive sub-patterns in financial time series

Science.gov (United States)

Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong

2014-09-01

There are many types of autoregressive patterns in financial time series, and they form a transmission process. Here, we define autoregressive patterns quantitatively through an econometrical regression model. We present a computational algorithm that sets the autoregressive patterns as nodes and transmissions between patterns as edges, and then converts the transmission process of autoregressive patterns in a time series into a network. We utilised daily Shanghai (securities) composite index time series to study the transmission characteristics of autoregressive patterns. We found statistically significant evidence that the financial market is not random and that there are similar characteristics between parts and whole time series. A few types of autoregressive sub-patterns and transmission patterns drive the oscillations of the financial market. A clustering effect on fluctuations appears in the transmission process, and certain non-major autoregressive sub-patterns have high media capabilities in the financial time series. Different stock indexes exhibit similar characteristics in the transmission of fluctuation information. This work not only proposes a distinctive perspective for analysing financial time series but also provides important information for investors.
A Review of Some Aspects of Robust Inference for Time Series.

Science.gov (United States)

1984-09-01

REVIEW OF SOME ASPECTSOF ROBUST INFERNCE FOR TIME SERIES by Ad . Dougla Main TE "iAL REPOW No. 63 Septermber 1984 Department of Statistics University of ...clear. One cannot hope to have a good method for dealing with outliers in time series by using only an instantaneous nonlinear transformation of the data...AI.49 716 A REVIEWd OF SOME ASPECTS OF ROBUST INFERENCE FOR TIME 1/1 SERIES(U) WASHINGTON UNIV SEATTLE DEPT OF STATISTICS R D MARTIN SEP 84 TR-53
Refined composite multiscale weighted-permutation entropy of financial time series

Science.gov (United States)

Zhang, Yongping; Shang, Pengjian

2018-04-01

For quantifying the complexity of nonlinear systems, multiscale weighted-permutation entropy (MWPE) has recently been proposed. MWPE has incorporated amplitude information and been applied to account for the multiple inherent dynamics of time series. However, MWPE may be unreliable, because its estimated values show large fluctuation for slight variation of the data locations, and a significant distinction only for the different length of time series. Therefore, we propose the refined composite multiscale weighted-permutation entropy (RCMWPE). By comparing the RCMWPE results with other methods' results on both synthetic data and financial time series, RCMWPE method shows not only the advantages inherited from MWPE but also lower sensitivity to the data locations, more stable and much less dependent on the length of time series. Moreover, we present and discuss the results of RCMWPE method on the daily price return series from Asian and European stock markets. There are significant differences between Asian markets and European markets, and the entropy values of Hang Seng Index (HSI) are close to but higher than those of European markets. The reliability of the proposed RCMWPE method has been supported by simulations on generated and real data. It could be applied to a variety of fields to quantify the complexity of the systems over multiple scales more accurately.
Parametric, nonparametric and parametric modelling of a chaotic circuit time series

Science.gov (United States)

Timmer, J.; Rust, H.; Horbelt, W.; Voss, H. U.

2000-09-01

The determination of a differential equation underlying a measured time series is a frequently arising task in nonlinear time series analysis. In the validation of a proposed model one often faces the dilemma that it is hard to decide whether possible discrepancies between the time series and model output are caused by an inappropriate model or by bad estimates of parameters in a correct type of model, or both. We propose a combination of parametric modelling based on Bock's multiple shooting algorithm and nonparametric modelling based on optimal transformations as a strategy to test proposed models and if rejected suggest and test new ones. We exemplify this strategy on an experimental time series from a chaotic circuit where we obtain an extremely accurate reconstruction of the observed attractor.
Synthetic river flow time series generator for dispatch and spot price forecast

International Nuclear Information System (INIS)

Flores, R.A.

2007-01-01

Decision-making in electricity markets is complicated by uncertainties in demand growth, power supplies and fuel prices. In Peru, where the electrical power system is highly dependent on water resources at dams and river flows, hydrological uncertainties play a primary role in planning, price and dispatch forecast. This paper proposed a signal processing method for generating new synthetic river flow time series as a support for planning and spot market price forecasting. River flow time series are natural phenomena representing a continuous-time domain process. As an alternative synthetic representation of the original river flow time series, this proposed signal processing method preserves correlations, basic statistics and seasonality. It takes into account deterministic, periodic and non periodic components such as those due to the El Nino Southern Oscillation phenomenon. The new synthetic time series has many correlations with the original river flow time series, rendering it suitable for possible replacement of the classical method of sorting historical river flow time series. As a dispatch and planning approach to spot pricing, the proposed method offers higher accuracy modeling by decomposing the signal into deterministic, periodic, non periodic and stochastic sub signals. 4 refs., 4 tabs., 13 figs
Cross-sample entropy of foreign exchange time series

Science.gov (United States)

Liu, Li-Zhi; Qian, Xi-Yuan; Lu, Heng-Yao

2010-11-01

The correlation of foreign exchange rates in currency markets is investigated based on the empirical data of DKK/USD, NOK/USD, CAD/USD, JPY/USD, KRW/USD, SGD/USD, THB/USD and TWD/USD for a period from 1995 to 2002. Cross-SampEn (cross-sample entropy) method is used to compare the returns of every two exchange rate time series to assess their degree of asynchrony. The calculation method of confidence interval of SampEn is extended and applied to cross-SampEn. The cross-SampEn and its confidence interval for every two of the exchange rate time series in periods 1995-1998 (before the Asian currency crisis) and 1999-2002 (after the Asian currency crisis) are calculated. The results show that the cross-SampEn of every two of these exchange rates becomes higher after the Asian currency crisis, indicating a higher asynchrony between the exchange rates. Especially for Singapore, Thailand and Taiwan, the cross-SampEn values after the Asian currency crisis are significantly higher than those before the Asian currency crisis. Comparison with the correlation coefficient shows that cross-SampEn is superior to describe the correlation between time series.
Clustering Multivariate Time Series Using Hidden Markov Models

Directory of Open Access Journals (Sweden)

Shima Ghassempour

2014-03-01

Full Text Available In this paper we describe an algorithm for clustering multivariate time series with variables taking both categorical and continuous values. Time series of this type are frequent in health care, where they represent the health trajectories of individuals. The problem is challenging because categorical variables make it difficult to define a meaningful distance between trajectories. We propose an approach based on Hidden Markov Models (HMMs, where we first map each trajectory into an HMM, then define a suitable distance between HMMs and finally proceed to cluster the HMMs with a method based on a distance matrix. We test our approach on a simulated, but realistic, data set of 1,255 trajectories of individuals of age 45 and over, on a synthetic validation set with known clustering structure, and on a smaller set of 268 trajectories extracted from the longitudinal Health and Retirement Survey. The proposed method can be implemented quite simply using standard packages in R and Matlab and may be a good candidate for solving the difficult problem of clustering multivariate time series with categorical variables using tools that do not require advanced statistic knowledge, and therefore are accessible to a wide range of researchers.
TimesVector: a vectorized clustering approach to the analysis of time series transcriptome data from multiple phenotypes.

Science.gov (United States)

Jung, Inuk; Jo, Kyuri; Kang, Hyejin; Ahn, Hongryul; Yu, Youngjae; Kim, Sun

2017-12-01

Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks to search gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting the three-dimensional data, i.e. gene-time-condition. Computational complexity for analyzing such data is very high, compared to the already difficult NP-hard two dimensional biclustering algorithms. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression pattern in two sample conditions. We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools and only TimesVector detected clusters with differential expression patterns across conditions successfully. The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/. sunkim.bioinfo@snu.ac.kr. Supplementary data are available at
Stochastic generation of hourly wind speed time series

International Nuclear Information System (INIS)

Shamshad, A.; Wan Mohd Ali Wan Hussin; Bawadi, M.A.; Mohd Sanusi, S.A.

2006-01-01

In the present study hourly wind speed data of Kuala Terengganu in Peninsular Malaysia are simulated by using transition matrix approach of Markovian process. The wind speed time series is divided into various states based on certain criteria. The next wind speed states are selected based on the previous states. The cumulative probability transition matrix has been formed in which each row ends with 1. Using the uniform random numbers between 0 and 1, a series of future states is generated. These states have been converted to the corresponding wind speed values using another uniform random number generator. The accuracy of the model has been determined by comparing the statistical characteristics such as average, standard deviation, root mean square error, probability density function and autocorrelation function of the generated data to those of the original data. The generated wind speed time series data is capable to preserve the wind speed characteristics of the observed data
Active and Passive Remote Sensing Data Time Series for Flood Detection and Surface Water Mapping

Science.gov (United States)

Bioresita, Filsa; Puissant, Anne; Stumpf, André; Malet, Jean-Philippe

2017-04-01

As a consequence of environmental changes surface waters are undergoing changes in time and space. A better knowledge of the spatial and temporal distribution of surface waters resources becomes essential to support sustainable policies and development activities. Especially because surface waters, are not only a vital sweet water resource, but can also pose hazards to human settlements and infrastructures through flooding. Floods are a highly frequent disaster in the world and can caused huge material losses. Detecting and mapping their spatial distribution is fundamental to ascertain damages and for relief efforts. Spaceborne Synthetic Aperture Radar (SAR) is an effective way to monitor surface waters bodies over large areas since it provides excellent temporal coverage and, all-weather day-and-night imaging capabilities. However, emergent vegetation, trees, wind or flow turbulence can increase radar back-scatter returns and pose problems for the delineation of inundated areas. In such areas, passive remote sensing data can be used to identify vegetated areas and support the interpretation of SAR data. The availability of new Earth Observation products, for example Sentinel-1 (active) and Sentinel-2 (passive) imageries, with both high spatial and temporal resolution, have the potential to facilitate flood detection and monitoring of surface waters changes which are very dynamic in space and time. In this context, the research consists of two parts. In the first part, the objective is to propose generic and reproducible methodologies for the analysis of Sentinel-1 time series data for floods detection and surface waters mapping. The processing chain comprises a series of pre-processing steps and the statistical modeling of the pixel value distribution to produce probabilistic maps for the presence of surface waters. Images pre-processing for all Sentinel-1 images comprise the reduction SAR effect like orbit errors, speckle noise, and geometric effects. A modified
Causal strength induction from time series data.

Science.gov (United States)

Soo, Kevin W; Rottman, Benjamin M

2018-04-01

One challenge when inferring the strength of cause-effect relations from time series data is that the cause and/or effect can exhibit temporal trends. If temporal trends are not accounted for, a learner could infer that a causal relation exists when it does not, or even infer that there is a positive causal relation when the relation is negative, or vice versa. We propose that learners use a simple heuristic to control for temporal trends-that they focus not on the states of the cause and effect at a given instant, but on how the cause and effect change from one observation to the next, which we call transitions. Six experiments were conducted to understand how people infer causal strength from time series data. We found that participants indeed use transitions in addition to states, which helps them to reach more accurate causal judgments (Experiments 1A and 1B). Participants use transitions more when the stimuli are presented in a naturalistic visual format than a numerical format (Experiment 2), and the effect of transitions is not driven by primacy or recency effects (Experiment 3). Finally, we found that participants primarily use the direction in which variables change rather than the magnitude of the change for estimating causal strength (Experiments 4 and 5). Collectively, these studies provide evidence that people often use a simple yet effective heuristic for inferring causal strength from time series data. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Interpretable Categorization of Heterogeneous Time Series Data

Science.gov (United States)

Lee, Ritchie; Kochenderfer, Mykel J.; Mengshoel, Ole J.; Silbermann, Joshua

2017-01-01

We analyze data from simulated aircraft encounters to validate and inform the development of a prototype aircraft collision avoidance system. The high-dimensional and heterogeneous time series dataset is analyzed to discover properties of near mid-air collisions (NMACs) and categorize the NMAC encounters. Domain experts use these properties to better organize and understand NMAC occurrences. Existing solutions either are not capable of handling high-dimensional and heterogeneous time series datasets or do not provide explanations that are interpretable by a domain expert. The latter is critical to the acceptance and deployment of safety-critical systems. To address this gap, we propose grammar-based decision trees along with a learning algorithm. Our approach extends decision trees with a grammar framework for classifying heterogeneous time series data. A context-free grammar is used to derive decision expressions that are interpretable, application-specific, and support heterogeneous data types. In addition to classification, we show how grammar-based decision trees can also be used for categorization, which is a combination of clustering and generating interpretable explanations for each cluster. We apply grammar-based decision trees to a simulated aircraft encounter dataset and evaluate the performance of four variants of our learning algorithm. The best algorithm is used to analyze and categorize near mid-air collisions in the aircraft encounter dataset. We describe each discovered category in detail and discuss its relevance to aircraft collision avoidance.
Minimum entropy density method for the time series analysis

Science.gov (United States)

Lee, Jeong Won; Park, Joongwoo Brian; Jo, Hang-Hyun; Yang, Jae-Suk; Moon, Hie-Tae

2009-01-01

The entropy density is an intuitive and powerful concept to study the complicated nonlinear processes derived from physical systems. We develop the minimum entropy density method (MEDM) to detect the structure scale of a given time series, which is defined as the scale in which the uncertainty is minimized, hence the pattern is revealed most. The MEDM is applied to the financial time series of Standard and Poor’s 500 index from February 1983 to April 2006. Then the temporal behavior of structure scale is obtained and analyzed in relation to the information delivery time and efficient market hypothesis.
Time series analysis of the developed financial markets' integration using visibility graphs

Science.gov (United States)

Zhuang, Enyu; Small, Michael; Feng, Gang

2014-09-01

A time series representing the developed financial markets' segmentation from 1973 to 2012 is studied. The time series reveals an obvious market integration trend. To further uncover the features of this time series, we divide it into seven windows and generate seven visibility graphs. The measuring capabilities of the visibility graphs provide means to quantitatively analyze the original time series. It is found that the important historical incidents that influenced market integration coincide with variations in the measured graphical node degree. Through the measure of neighborhood span, the frequencies of the historical incidents are disclosed. Moreover, it is also found that large "cycles" and significant noise in the time series are linked to large and small communities in the generated visibility graphs. For large cycles, how historical incidents significantly affected market integration is distinguished by density and compactness of the corresponding communities.
A cluster merging method for time series microarray with production values.

Science.gov (United States)

Chira, Camelia; Sedano, Javier; Camara, Monica; Prieto, Carlos; Villar, Jose R; Corchado, Emilio

2014-09-01

A challenging task in time-course microarray data analysis is to cluster genes meaningfully combining the information provided by multiple replicates covering the same key time points. This paper proposes a novel cluster merging method to accomplish this goal obtaining groups with highly correlated genes. The main idea behind the proposed method is to generate a clustering starting from groups created based on individual temporal series (representing different biological replicates measured in the same time points) and merging them by taking into account the frequency by which two genes are assembled together in each clustering. The gene groups at the level of individual time series are generated using several shape-based clustering methods. This study is focused on a real-world time series microarray task with the aim to find co-expressed genes related to the production and growth of a certain bacteria. The shape-based clustering methods used at the level of individual time series rely on identifying similar gene expression patterns over time which, in some models, are further matched to the pattern of production/growth. The proposed cluster merging method is able to produce meaningful gene groups which can be naturally ranked by the level of agreement on the clustering among individual time series. The list of clusters and genes is further sorted based on the information correlation coefficient and new problem-specific relevant measures. Computational experiments and results of the cluster merging method are analyzed from a biological perspective and further compared with the clustering generated based on the mean value of time series and the same shape-based algorithm.

Pre-processing data using wavelet transform and PCA based on ...

Indian Academy of Sciences (India)

Abazar Solgi

2017-07-14

Jul 14, 2017 ... Pre-processing data using wavelet transform and PCA based on support vector regression and gene expression programming for river flow simulation. Abazar Solgi1,*, Amir Pourhaghi1, Ramin Bahmani2 and Heidar Zarei3. 1. Department of Water Resources Engineering, Shahid Chamran University of ...
Constructing networks from a dynamical system perspective for multivariate nonlinear time series.

Science.gov (United States)

Nakamura, Tomomichi; Tanizawa, Toshihiro; Small, Michael

2016-03-01

We describe a method for constructing networks for multivariate nonlinear time series. We approach the interaction between the various scalar time series from a deterministic dynamical system perspective and provide a generic and algorithmic test for whether the interaction between two measured time series is statistically significant. The method can be applied even when the data exhibit no obvious qualitative similarity: a situation in which the naive method utilizing the cross correlation function directly cannot correctly identify connectivity. To establish the connectivity between nodes we apply the previously proposed small-shuffle surrogate (SSS) method, which can investigate whether there are correlation structures in short-term variabilities (irregular fluctuations) between two data sets from the viewpoint of deterministic dynamical systems. The procedure to construct networks based on this idea is composed of three steps: (i) each time series is considered as a basic node of a network, (ii) the SSS method is applied to verify the connectivity between each pair of time series taken from the whole multivariate time series, and (iii) the pair of nodes is connected with an undirected edge when the null hypothesis cannot be rejected. The network constructed by the proposed method indicates the intrinsic (essential) connectivity of the elements included in the system or the underlying (assumed) system. The method is demonstrated for numerical data sets generated by known systems and applied to several experimental time series.
Time Series Modelling of Syphilis Incidence in China from 2005 to 2012

Science.gov (United States)

Zhang, Xingyu; Zhang, Tao; Pei, Jiao; Liu, Yuanyuan; Li, Xiaosong; Medrano-Gracia, Pau

2016-01-01

Background The infection rate of syphilis in China has increased dramatically in recent decades, becoming a serious public health concern. Early prediction of syphilis is therefore of great importance for heath planning and management. Methods In this paper, we analyzed surveillance time series data for primary, secondary, tertiary, congenital and latent syphilis in mainland China from 2005 to 2012. Seasonality and long-term trend were explored with decomposition methods. Autoregressive integrated moving average (ARIMA) was used to fit a univariate time series model of syphilis incidence. A separate multi-variable time series for each syphilis type was also tested using an autoregressive integrated moving average model with exogenous variables (ARIMAX). Results The syphilis incidence rates have increased three-fold from 2005 to 2012. All syphilis time series showed strong seasonality and increasing long-term trend. Both ARIMA and ARIMAX models fitted and estimated syphilis incidence well. All univariate time series showed highest goodness-of-fit results with the ARIMA(0,0,1)×(0,1,1) model. Conclusion Time series analysis was an effective tool for modelling the historical and future incidence of syphilis in China. The ARIMAX model showed superior performance than the ARIMA model for the modelling of syphilis incidence. Time series correlations existed between the models for primary, secondary, tertiary, congenital and latent syphilis. PMID:26901682
Reconstruction of tritium time series in precipitation

International Nuclear Information System (INIS)

Celle-Jeanton, H.; Gourcy, L.; Aggarwal, P.K.

2002-01-01

Tritium is commonly used in groundwaters studies to calculate the recharge rate and to identify the presence of a modern recharge. The knowledge of 3 H precipitation time series is then very important for the study of groundwater recharge. Rozanski and Araguas provided good information on precipitation tritium content in 180 stations of the GNIP network to the end of 1987, but it shows some lacks of measurements either within one chronicle or within one region (the Southern hemisphere for instance). Therefore, it seems to be essential to find a method to recalculate data for a region where no measurement is available.To solve this problem, we propose another method which is based on triangulation. It needs the knowledge of 3 H time series of 3 stations surrounding geographically the 4-th station for which tritium input curve has to be reconstructed
Time Series, Stochastic Processes and Completeness of Quantum Theory

International Nuclear Information System (INIS)

Kupczynski, Marian

2011-01-01

Most of physical experiments are usually described as repeated measurements of some random variables. Experimental data registered by on-line computers form time series of outcomes. The frequencies of different outcomes are compared with the probabilities provided by the algorithms of quantum theory (QT). In spite of statistical predictions of QT a claim was made that it provided the most complete description of the data and of the underlying physical phenomena. This claim could be easily rejected if some fine structures, averaged out in the standard descriptive statistical analysis, were found in time series of experimental data. To search for these structures one has to use more subtle statistical tools which were developed to study time series produced by various stochastic processes. In this talk we review some of these tools. As an example we show how the standard descriptive statistical analysis of the data is unable to reveal a fine structure in a simulated sample of AR (2) stochastic process. We emphasize once again that the violation of Bell inequalities gives no information on the completeness or the non locality of QT. The appropriate way to test the completeness of quantum theory is to search for fine structures in time series of the experimental data by means of the purity tests or by studying the autocorrelation and partial autocorrelation functions.
Efficient use of correlation entropy for analysing time series data

Indian Academy of Sciences (India)

Abstract. The correlation dimension D2 and correlation entropy K2 are both important quantifiers in nonlinear time series analysis. However, use of D2 has been more common compared to K2 as a discriminating measure. One reason for this is that D2 is a static measure and can be easily evaluated from a time series.
Financial time series analysis based on information categorization method

Science.gov (United States)

Tian, Qiang; Shang, Pengjian; Feng, Guochen

2014-12-01

The paper mainly applies the information categorization method to analyze the financial time series. The method is used to examine the similarity of different sequences by calculating the distances between them. We apply this method to quantify the similarity of different stock markets. And we report the results of similarity in US and Chinese stock markets in periods 1991-1998 (before the Asian currency crisis), 1999-2006 (after the Asian currency crisis and before the global financial crisis), and 2007-2013 (during and after global financial crisis) by using this method. The results show the difference of similarity between different stock markets in different time periods and the similarity of the two stock markets become larger after these two crises. Also we acquire the results of similarity of 10 stock indices in three areas; it means the method can distinguish different areas' markets from the phylogenetic trees. The results show that we can get satisfactory information from financial markets by this method. The information categorization method can not only be used in physiologic time series, but also in financial time series.
Classification of biosensor time series using dynamic time warping: applications in screening cancer cells with characteristic biomarkers.

Science.gov (United States)

Rai, Shesh N; Trainor, Patrick J; Khosravi, Farhad; Kloecker, Goetz; Panchapakesan, Balaji

2016-01-01

The development of biosensors that produce time series data will facilitate improvements in biomedical diagnostics and in personalized medicine. The time series produced by these devices often contains characteristic features arising from biochemical interactions between the sample and the sensor. To use such characteristic features for determining sample class, similarity-based classifiers can be utilized. However, the construction of such classifiers is complicated by the variability in the time domains of such series that renders the traditional distance metrics such as Euclidean distance ineffective in distinguishing between biological variance and time domain variance. The dynamic time warping (DTW) algorithm is a sequence alignment algorithm that can be used to align two or more series to facilitate quantifying similarity. In this article, we evaluated the performance of DTW distance-based similarity classifiers for classifying time series that mimics electrical signals produced by nanotube biosensors. Simulation studies demonstrated the positive performance of such classifiers in discriminating between time series containing characteristic features that are obscured by noise in the intensity and time domains. We then applied a DTW distance-based k -nearest neighbors classifier to distinguish the presence/absence of mesenchymal biomarker in cancer cells in buffy coats in a blinded test. Using a train-test approach, we find that the classifier had high sensitivity (90.9%) and specificity (81.8%) in differentiating between EpCAM-positive MCF7 cells spiked in buffy coats and those in plain buffy coats.
chipPCR: an R package to pre-process raw data of amplification curves.

Science.gov (United States)

Rödiger, Stefan; Burdukiewicz, Michał; Schierack, Peter

2015-09-01

Both the quantitative real-time polymerase chain reaction (qPCR) and quantitative isothermal amplification (qIA) are standard methods for nucleic acid quantification. Numerous real-time read-out technologies have been developed. Despite the continuous interest in amplification-based techniques, there are only few tools for pre-processing of amplification data. However, a transparent tool for precise control of raw data is indispensable in several scenarios, for example, during the development of new instruments. chipPCR is an R: package for the pre-processing and quality analysis of raw data of amplification curves. The package takes advantage of R: 's S4 object model and offers an extensible environment. chipPCR contains tools for raw data exploration: normalization, baselining, imputation of missing values, a powerful wrapper for amplification curve smoothing and a function to detect the start and end of an amplification curve. The capabilities of the software are enhanced by the implementation of algorithms unavailable in R: , such as a 5-point stencil for derivative interpolation. Simulation tools, statistical tests, plots for data quality management, amplification efficiency/quantification cycle calculation, and datasets from qPCR and qIA experiments are part of the package. Core functionalities are integrated in GUIs (web-based and standalone shiny applications), thus streamlining analysis and report generation. http://cran.r-project.org/web/packages/chipPCR. Source code: https://github.com/michbur/chipPCR. stefan.roediger@b-tu.de Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
A novel water quality data analysis framework based on time-series data mining.

Science.gov (United States)

Deng, Weihui; Wang, Guoyin

2017-07-01

The rapid development of time-series data mining provides an emerging method for water resource management research. In this paper, based on the time-series data mining methodology, we propose a novel and general analysis framework for water quality time-series data. It consists of two parts: implementation components and common tasks of time-series data mining in water quality data. In the first part, we propose to granulate the time series into several two-dimensional normal clouds and calculate the similarities in the granulated level. On the basis of the similarity matrix, the similarity search, anomaly detection, and pattern discovery tasks in the water quality time-series instance dataset can be easily implemented in the second part. We present a case study of this analysis framework on weekly Dissolve Oxygen time-series data collected from five monitoring stations on the upper reaches of Yangtze River, China. It discovered the relationship of water quality in the mainstream and tributary as well as the main changing patterns of DO. The experimental results show that the proposed analysis framework is a feasible and efficient method to mine the hidden and valuable knowledge from water quality historical time-series data. Copyright © 2017 Elsevier Ltd. All rights reserved.
Development and application of a modified dynamic time warping algorithm (DTW-S to analyses of primate brain expression time series

Directory of Open Access Journals (Sweden)

Vingron Martin

2011-08-01

Full Text Available Abstract Background Comparing biological time series data across different conditions, or different specimens, is a common but still challenging task. Algorithms aligning two time series represent a valuable tool for such comparisons. While many powerful computation tools for time series alignment have been developed, they do not provide significance estimates for time shift measurements. Results Here, we present an extended version of the original DTW algorithm that allows us to determine the significance of time shift estimates in time series alignments, the DTW-Significance (DTW-S algorithm. The DTW-S combines important properties of the original algorithm and other published time series alignment tools: DTW-S calculates the optimal alignment for each time point of each gene, it uses interpolated time points for time shift estimation, and it does not require alignment of the time-series end points. As a new feature, we implement a simulation procedure based on parameters estimated from real time series data, on a series-by-series basis, allowing us to determine the false positive rate (FPR and the significance of the estimated time shift values. We assess the performance of our method using simulation data and real expression time series from two published primate brain expression datasets. Our results show that this method can provide accurate and robust time shift estimates for each time point on a gene-by-gene basis. Using these estimates, we are able to uncover novel features of the biological processes underlying human brain development and maturation. Conclusions The DTW-S provides a convenient tool for calculating accurate and robust time shift estimates at each time point for each gene, based on time series data. The estimates can be used to uncover novel biological features of the system being studied. The DTW-S is freely available as an R package TimeShift at http://www.picb.ac.cn/Comparative/data.html.
PhilDB: the time series database with built-in change logging

Directory of Open Access Journals (Sweden)

Andrew MacDonald

2016-03-01

Full Text Available PhilDB is an open-source time series database that supports storage of time series datasets that are dynamic; that is, it records updates to existing values in a log as they occur. PhilDB eases loading of data for the user by utilising an intelligent data write method. It preserves existing values during updates and abstracts the update complexity required to achieve logging of data value changes. It implements fast reads to make it practical to select data for analysis. Recent open-source systems have been developed to indefinitely store long-period high-resolution time series data without change logging. Unfortunately, such systems generally require a large initial installation investment before use because they are designed to operate over a cluster of servers to achieve high-performance writing of static data in real time. In essence, they have a ‘big data’ approach to storage and access. Other open-source projects for handling time series data that avoid the ‘big data’ approach are also relatively new and are complex or incomplete. None of these systems gracefully handle revision of existing data while tracking values that change. Unlike ‘big data’ solutions, PhilDB has been designed for single machine deployment on commodity hardware, reducing the barrier to deployment. PhilDB takes a unique approach to meta-data tracking; optional attribute attachment. This facilitates scaling the complexities of storing a wide variety of data. That is, it allows time series data to be loaded as time series instances with minimal initial meta-data, yet additional attributes can be created and attached to differentiate the time series instances when a wider variety of data is needed. PhilDB was written in Python, leveraging existing libraries. While some existing systems come close to meeting the needs PhilDB addresses, none cover all the needs at once. PhilDB was written to fill this gap in existing solutions. This paper explores existing time
Model-based Clustering of Categorical Time Series with Multinomial Logit Classification

Science.gov (United States)

Frühwirth-Schnatter, Sylvia; Pamminger, Christoph; Winter-Ebmer, Rudolf; Weber, Andrea

2010-09-01

A common problem in many areas of applied statistics is to identify groups of similar time series in a panel of time series. However, distance-based clustering methods cannot easily be extended to time series data, where an appropriate distance-measure is rather difficult to define, particularly for discrete-valued time series. Markov chain clustering, proposed by Pamminger and Frühwirth-Schnatter [6], is an approach for clustering discrete-valued time series obtained by observing a categorical variable with several states. This model-based clustering method is based on finite mixtures of first-order time-homogeneous Markov chain models. In order to further explain group membership we present an extension to the approach of Pamminger and Frühwirth-Schnatter [6] by formulating a probabilistic model for the latent group indicators within the Bayesian classification rule by using a multinomial logit model. The parameters are estimated for a fixed number of clusters within a Bayesian framework using an Markov chain Monte Carlo (MCMC) sampling scheme representing a (full) Gibbs-type sampler which involves only draws from standard distributions. Finally, an application to a panel of Austrian wage mobility data is presented which leads to an interesting segmentation of the Austrian labour market.
Time Series Discord Detection in Medical Data using a Parallel Relational Database

Energy Technology Data Exchange (ETDEWEB)

Woodbridge, Diane; Rintoul, Mark Daniel; Wilson, Andrew T.; Goldstein, Richard

2015-10-01

Recent advances in sensor technology have made continuous real-time health monitoring available in both hospital and non-hospital settings. Since data collected from high frequency medical sensors includes a huge amount of data, storing and processing continuous medical data is an emerging big data area. Especially detecting anomaly in real time is important for patients’ emergency detection and prevention. A time series discord indicates a subsequence that has the maximum difference to the rest of the time series subsequences, meaning that it has abnormal or unusual data trends. In this study, we implemented two versions of time series discord detection algorithms on a high performance parallel database management system (DBMS) and applied them to 240 Hz waveform data collected from 9,723 patients. The initial brute force version of the discord detection algorithm takes each possible subsequence and calculates a distance to the nearest non-self match to find the biggest discords in time series. For the heuristic version of the algorithm, a combination of an array and a trie structure was applied to order time series data for enhancing time efficiency. The study results showed efficient data loading, decoding and discord searches in a large amount of data, benefiting from the time series discord detection algorithm and the architectural characteristics of the parallel DBMS including data compression, data pipe-lining, and task scheduling.
Estimation of system parameters in discrete dynamical systems from time series

International Nuclear Information System (INIS)

Palaniyandi, P.; Lakshmanan, M.

2005-01-01

We propose a simple method to estimate the parameters involved in discrete dynamical systems from time series. The method is based on the concept of controlling chaos by constant feedback. The major advantages of the method are that it needs a minimal number of time series data (either vector or scalar) and is applicable to dynamical systems of any dimension. The method also works extremely well even in the presence of noise in the time series. The method is specifically illustrated by means of logistic and Henon maps
Evaluation of nonlinearity and validity of nonlinear modeling for complex time series.

Science.gov (United States)

Suzuki, Tomoya; Ikeguchi, Tohru; Suzuki, Masuo

2007-10-01

Even if an original time series exhibits nonlinearity, it is not always effective to approximate the time series by a nonlinear model because such nonlinear models have high complexity from the viewpoint of information criteria. Therefore, we propose two measures to evaluate both the nonlinearity of a time series and validity of nonlinear modeling applied to it by nonlinear predictability and information criteria. Through numerical simulations, we confirm that the proposed measures effectively detect the nonlinearity of an observed time series and evaluate the validity of the nonlinear model. The measures are also robust against observational noises. We also analyze some real time series: the difference of the number of chickenpox and measles patients, the number of sunspots, five Japanese vowels, and the chaotic laser. We can confirm that the nonlinear model is effective for the Japanese vowel /a/, the difference of the number of measles patients, and the chaotic laser.
Preprocessing for Optimization of Probabilistic-Logic Models for Sequence Analysis

DEFF Research Database (Denmark)

Christiansen, Henning; Lassen, Ole Torp

2009-01-01

and approximation are needed. The first steps are taken towards a methodology for optimizing such models by approximations using auxiliary models for preprocessing or splitting them into submodels. Evaluation of such approximating models is challenging as authoritative test data may be sparse. On the other hand...
A Framework and Algorithms for Multivariate Time Series Analytics (MTSA): Learning, Monitoring, and Recommendation

Science.gov (United States)

Ngan, Chun-Kit

2013-01-01

Making decisions over multivariate time series is an important topic which has gained significant interest in the past decade. A time series is a sequence of data points which are measured and ordered over uniform time intervals. A multivariate time series is a set of multiple, related time series in a particular domain in which domain experts…
Modeling vector nonlinear time series using POLYMARS

NARCIS (Netherlands)

de Gooijer, J.G.; Ray, B.K.

2003-01-01

A modified multivariate adaptive regression splines method for modeling vector nonlinear time series is investigated. The method results in models that can capture certain types of vector self-exciting threshold autoregressive behavior, as well as provide good predictions for more general vector
Forecasting with periodic autoregressive time series models

NARCIS (Netherlands)

Ph.H.B.F. Franses (Philip Hans); R. Paap (Richard)

1999-01-01

textabstractThis paper is concerned with forecasting univariate seasonal time series data using periodic autoregressive models. We show how one should account for unit roots and deterministic terms when generating out-of-sample forecasts. We illustrate the models for various quarterly UK consumption

vector bilinear autoregressive time series model and its superiority

African Journals Online (AJOL)

KEYWORDS: Linear time series, Autoregressive process, Autocorrelation function, Partial autocorrelation function,. Vector time .... important result on matrix algebra with respect to the spectral ..... application to covariance analysis of super-.
Correlation measure to detect time series distances, whence economy globalization

Science.gov (United States)

Miśkiewicz, Janusz; Ausloos, Marcel

2008-11-01

An instantaneous time series distance is defined through the equal time correlation coefficient. The idea is applied to the Gross Domestic Product (GDP) yearly increments of 21 rich countries between 1950 and 2005 in order to test the process of economic globalisation. Some data discussion is first presented to decide what (EKS, GK, or derived) GDP series should be studied. Distances are then calculated from the correlation coefficient values between pairs of series. The role of time averaging of the distances over finite size windows is discussed. Three network structures are next constructed based on the hierarchy of distances. It is shown that the mean distance between the most developed countries on several networks actually decreases in time, -which we consider as a proof of globalization. An empirical law is found for the evolution after 1990, similar to that found in flux creep. The optimal observation time window size is found ≃15 years.
Image preprocessing for improving computational efficiency in implementation of restoration and superresolution algorithms.

Science.gov (United States)

Sundareshan, Malur K; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen

2002-12-10

Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the
Time Series Analysis Using Geometric Template Matching.

Science.gov (United States)

Frank, Jordan; Mannor, Shie; Pineau, Joelle; Precup, Doina

2013-03-01

We present a novel framework for analyzing univariate time series data. At the heart of the approach is a versatile algorithm for measuring the similarity of two segments of time series called geometric template matching (GeTeM). First, we use GeTeM to compute a similarity measure for clustering and nearest-neighbor classification. Next, we present a semi-supervised learning algorithm that uses the similarity measure with hierarchical clustering in order to improve classification performance when unlabeled training data are available. Finally, we present a boosting framework called TDEBOOST, which uses an ensemble of GeTeM classifiers. TDEBOOST augments the traditional boosting approach with an additional step in which the features used as inputs to the classifier are adapted at each step to improve the training error. We empirically evaluate the proposed approaches on several datasets, such as accelerometer data collected from wearable sensors and ECG data.
On-line analysis of reactor noise using time-series analysis

International Nuclear Information System (INIS)

McGevna, V.G.

1981-10-01

A method to allow use of time series analysis for on-line noise analysis has been developed. On-line analysis of noise in nuclear power reactors has been limited primarily to spectral analysis and related frequency domain techniques. Time series analysis has many distinct advantages over spectral analysis in the automated processing of reactor noise. However, fitting an autoregressive-moving average (ARMA) model to time series data involves non-linear least squares estimation. Unless a high speed, general purpose computer is available, the calculations become too time consuming for on-line applications. To eliminate this problem, a special purpose algorithm was developed for fitting ARMA models. While it is based on a combination of steepest descent and Taylor series linearization, properties of the ARMA model are used so that the auto- and cross-correlation functions can be used to eliminate the need for estimating derivatives. The number of calculations, per iteration varies lineegardless of the mee 0.2% yield strength displayed anisotropy, with axial and circumferential values being greater than radial. For CF8-CPF8 and CF8M-CPF8M castings to meet current ASME Code S acid fuel cells
Improving GNSS time series for volcano monitoring: application to Canary Islands (Spain)

Science.gov (United States)

García-Cañada, Laura; Sevilla, Miguel J.; Pereda de Pablo, Jorge; Domínguez Cerdeña, Itahiza

2017-04-01

The number of permanent GNSS stations has increased significantly in recent years for different geodetic applications such as volcano monitoring, which require a high precision. Recently we have started to have coordinates time series long enough so that we can apply different analysis and filters that allow us to improve the GNSS coordinates results. Following this idea we have processed data from GNSS permanent stations used by the Spanish Instituto Geográfico Nacional (IGN) for volcano monitoring in Canary Islands to obtained time series by double difference processing method with Bernese v5.0 for the period 2007-2014. We have identified the characteristics of these time series and obtained models to estimate velocities with greater accuracy and more realistic uncertainties. In order to improve the results we have used two kinds of filters to improve the time series. The first, a spatial filter, has been computed using the series of residuals of all stations in the Canary Islands without an anomalous behaviour after removing a linear trend. This allows us to apply this filter to all sets of coordinates of the permanent stations reducing their dispersion. The second filter takes account of the temporal correlation in the coordinate time series for each station individually. A research about the evolution of the velocity depending on the series length has been carried out and it has demonstrated the need for using time series of at least four years. Therefore, in those stations with more than four years of data, we calculated the velocity and the characteristic parameters in order to have time series of residuals. This methodology has been applied to the GNSS data network in El Hierro (Canary Islands) during the 2011-2012 eruption and the subsequent magmatic intrusions (2012-2014). The results show that in the new series it is easier to detect anomalous behaviours in the coordinates, so they are most useful to detect crustal deformations in volcano monitoring.
Complexity analysis of the turbulent environmental fluid flow time series

Science.gov (United States)

Mihailović, D. T.; Nikolić-Đorić, E.; Drešković, N.; Mimić, G.

2014-02-01

We have used the Kolmogorov complexities, sample and permutation entropies to quantify the randomness degree in river flow time series of two mountain rivers in Bosnia and Herzegovina, representing the turbulent environmental fluid, for the period 1926-1990. In particular, we have examined the monthly river flow time series from two rivers (the Miljacka and the Bosnia) in the mountain part of their flow and then calculated the Kolmogorov complexity (KL) based on the Lempel-Ziv Algorithm (LZA) (lower-KLL and upper-KLU), sample entropy (SE) and permutation entropy (PE) values for each time series. The results indicate that the KLL, KLU, SE and PE values in two rivers are close to each other regardless of the amplitude differences in their monthly flow rates. We have illustrated the changes in mountain river flow complexity by experiments using (i) the data set for the Bosnia River and (ii) anticipated human activities and projected climate changes. We have explored the sensitivity of considered measures in dependence on the length of time series. In addition, we have divided the period 1926-1990 into three subintervals: (a) 1926-1945, (b) 1946-1965, (c) 1966-1990, and calculated the KLL, KLU, SE, PE values for the various time series in these subintervals. It is found that during the period 1946-1965, there is a decrease in their complexities, and corresponding changes in the SE and PE, in comparison to the period 1926-1990. This complexity loss may be primarily attributed to (i) human interventions, after the Second World War, on these two rivers because of their use for water consumption and (ii) climate change in recent times.
Mapping Crop Cycles in China Using MODIS-EVI Time Series

Directory of Open Access Journals (Sweden)

Le Li

2014-03-01

Full Text Available As the Earth’s population continues to grow and demand for food increases, the need for improved and timely information related to the properties and dynamics of global agricultural systems is becoming increasingly important. Global land cover maps derived from satellite data provide indispensable information regarding the geographic distribution and areal extent of global croplands. However, land use information, such as cropping intensity (defined here as the number of cropping cycles per year, is not routinely available over large areas because mapping this information from remote sensing is challenging. In this study, we present a simple but efficient algorithm for automated mapping of cropping intensity based on data from NASA’s (NASA: The National Aeronautics and Space Administration MODerate Resolution Imaging Spectroradiometer (MODIS. The proposed algorithm first applies an adaptive Savitzky-Golay filter to smooth Enhanced Vegetation Index (EVI time series derived from MODIS surface reflectance data. It then uses an iterative moving-window methodology to identify cropping cycles from the smoothed EVI time series. Comparison of results from our algorithm with national survey data at both the provincial and prefectural level in China show that the algorithm provides estimates of gross sown area that agree well with inventory data. Accuracy assessment comparing visually interpreted time series with algorithm results for a random sample of agricultural areas in China indicates an overall accuracy of 91.0% for three classes defined based on the number of cycles observed in EVI time series. The algorithm therefore appears to provide a straightforward and efficient method for mapping cropping intensity from MODIS time series data.
Toward automatic time-series forecasting using neural networks.

Science.gov (United States)

Yan, Weizhong

2012-07-01

Over the past few decades, application of artificial neural networks (ANN) to time-series forecasting (TSF) has been growing rapidly due to several unique features of ANN models. However, to date, a consistent ANN performance over different studies has not been achieved. Many factors contribute to the inconsistency in the performance of neural network models. One such factor is that ANN modeling involves determining a large number of design parameters, and the current design practice is essentially heuristic and ad hoc, this does not exploit the full potential of neural networks. Systematic ANN modeling processes and strategies for TSF are, therefore, greatly needed. Motivated by this need, this paper attempts to develop an automatic ANN modeling scheme. It is based on the generalized regression neural network (GRNN), a special type of neural network. By taking advantage of several GRNN properties (i.e., a single design parameter and fast learning) and by incorporating several design strategies (e.g., fusing multiple GRNNs), we have been able to make the proposed modeling scheme to be effective for modeling large-scale business time series. The initial model was entered into the NN3 time-series competition. It was awarded the best prediction on the reduced dataset among approximately 60 different models submitted by scholars worldwide.
Rapid prototyping of SoC-based real-time vision system: application to image preprocessing and face detection

Science.gov (United States)

Jridi, Maher; Alfalou, Ayman

2017-05-01

By this paper, the major goal is to investigate the Multi-CPU/FPGA SoC (System on Chip) design flow and to transfer a know-how and skills to rapidly design embedded real-time vision system. Our aim is to show how the use of these devices can be benefit for system level integration since they make possible simultaneous hardware and software development. We take the facial detection and pretreatments as case study since they have a great potential to be used in several applications such as video surveillance, building access control and criminal identification. The designed system use the Xilinx Zedboard platform. The last is the central element of the developed vision system. The video acquisition is performed using either standard webcam connected to the Zedboard via USB interface or several camera IP devices. The visualization of video content and intermediate results are possible with HDMI interface connected to HD display. The treatments embedded in the system are as follow: (i) pre-processing such as edge detection implemented in the ARM and in the reconfigurable logic, (ii) software implementation of motion detection and face detection using either ViolaJones or LBP (Local Binary Pattern), and (iii) application layer to select processing application and to display results in a web page. One uniquely interesting feature of the proposed system is that two functions have been developed to transmit data from and to the VDMA port. With the proposed optimization, the hardware implementation of the Sobel filter takes 27 ms and 76 ms for 640x480, and 720p resolutions, respectively. Hence, with the FPGA implementation, an acceleration of 5 times is obtained which allow the processing of 37 fps and 13 fps for 640x480, and 720p resolutions, respectively.
Multi-granular trend detection for time-series analysis

NARCIS (Netherlands)

van Goethem, A.I.; Staals, F.; Löffler, M.; Dykes, J.; Speckmann, B.

2017-01-01

Time series (such as stock prices) and ensembles (such as model runs for weather forecasts) are two important types of one-dimensional time-varying data. Such data is readily available in large quantities but visual analysis of the raw data quickly becomes infeasible, even for moderately sized data
Optimal transformations for categorical autoregressive time series

NARCIS (Netherlands)

Buuren, S. van

1996-01-01

This paper describes a method for finding optimal transformations for analyzing time series by autoregressive models. 'Optimal' implies that the agreement between the autoregressive model and the transformed data is maximal. Such transformations help 1) to increase the model fit, and 2) to analyze
Satellite Image Time Series Decomposition Based on EEMD

Directory of Open Access Journals (Sweden)

Yun-long Kong

2015-11-01

Full Text Available Satellite Image Time Series (SITS have recently been of great interest due to the emerging remote sensing capabilities for Earth observation. Trend and seasonal components are two crucial elements of SITS. In this paper, a novel framework of SITS decomposition based on Ensemble Empirical Mode Decomposition (EEMD is proposed. EEMD is achieved by sifting an ensemble of adaptive orthogonal components called Intrinsic Mode Functions (IMFs. EEMD is noise-assisted and overcomes the drawback of mode mixing in conventional Empirical Mode Decomposition (EMD. Inspired by these advantages, the aim of this work is to employ EEMD to decompose SITS into IMFs and to choose relevant IMFs for the separation of seasonal and trend components. In a series of simulations, IMFs extracted by EEMD achieved a clear representation with physical meaning. The experimental results of 16-day compositions of Moderate Resolution Imaging Spectroradiometer (MODIS, Normalized Difference Vegetation Index (NDVI, and Global Environment Monitoring Index (GEMI time series with disturbance illustrated the effectiveness and stability of the proposed approach to monitoring tasks, such as applications for the detection of abrupt changes.
An accuracy assessment of realtime GNSS time series toward semi- real time seafloor geodetic observation

Science.gov (United States)

Osada, Y.; Ohta, Y.; Demachi, T.; Kido, M.; Fujimoto, H.; Azuma, R.; Hino, R.

2013-12-01

Large interplate earthquake repeatedly occurred in Japan Trench. Recently, the detail crustal deformation revealed by the nation-wide inland GPS network called as GEONET by GSI. However, the maximum displacement region for interplate earthquake is mainly located offshore region. GPS/Acoustic seafloor geodetic observation (hereafter GPS/A) is quite important and useful for understanding of shallower part of the interplate coupling between subducting and overriding plates. We typically conduct GPS/A in specific ocean area based on repeated campaign style using research vessel or buoy. Therefore, we cannot monitor the temporal variation of seafloor crustal deformation in real time. The one of technical issue on real time observation is kinematic GPS analysis because kinematic GPS analysis based on reference and rover data. If the precise kinematic GPS analysis will be possible in the offshore region, it should be promising method for real time GPS/A with USV (Unmanned Surface Vehicle) and a moored buoy. We assessed stability, precision and accuracy of StarFireTM global satellites based augmentation system. We primarily tested for StarFire in the static condition. In order to assess coordinate precision and accuracy, we compared 1Hz StarFire time series and post-processed precise point positioning (PPP) 1Hz time series by GIPSY-OASIS II processing software Ver. 6.1.2 with three difference product types (ultra-rapid, rapid, and final orbits). We also used difference interval clock information (30 and 300 seconds) for the post-processed PPP processing. The standard deviation of real time StarFire time series is less than 30 mm (horizontal components) and 60 mm (vertical component) based on 1 month continuous processing. We also assessed noise spectrum of the estimated time series by StarFire and post-processed GIPSY PPP results. We found that the noise spectrum of StarFire time series is similar pattern with GIPSY-OASIS II processing result based on JPL rapid orbit
The Exponential Model for the Spectrum of a Time Series: Extensions and Applications

DEFF Research Database (Denmark)

Proietti, Tommaso; Luati, Alessandra

The exponential model for the spectrum of a time series and its fractional extensions are based on the Fourier series expansion of the logarithm of the spectral density. The coefficients of the expansion form the cepstrum of the time series. After deriving the cepstrum of important classes of time...
Enteroclysis and small bowel series: Comparison of radiation dose and examination time

International Nuclear Information System (INIS)

Thoeni, R.F.; Gould, R.G.

1991-01-01

Respective radiation doses and total examination and fluoroscopy times were compared for 50 patients; 25 underwent enteroclysis and 25 underwent small bowel series with (n = 17) and without (n = 8) an examination of the upper gastrointestinal (GI) tract. For enteroclysis, the mean skin entry radiation dose (12.3 rad [123 mGy]) and mean fluoroscopy time (18.4 minutes) were almost 1 1/2 times greater than those for the small bowel series with examination of the upper GI tract (8.4 rad [84 mGy]; 11.4 minutes) and almost three times greater than those for the small bowel series without upper GI examination (4.6 rad [46 mGy]; 6.3 minutes). However, the mean total examination completion time for enteroclysis (31.2 minutes) was almost half that of the small bowel series without upper GI examination (57.5 minutes) and almost four times shorter than that of the small bowel series with upper GI examination (114 minutes). The higher radiation dose of enteroclysis should be considered along with the short examination time, the age and clinical condition of the patient, and the reported higher accuracy when deciding on the appropriate radiographic examination of the small bowel
Thresholding: A Pixel-Level Image Processing Methodology Preprocessing Technique for an OCR System for the Brahmi Script

Directory of Open Access Journals (Sweden)

H. K. Anasuya Devi

2006-12-01

Full Text Available In this paper we study the methodology employed for preprocessing the archaeological images. We present the various algorithms used in the low-level processing stage of image analysis for Optical Character Recognition System for Brahmi Script. The image preprocessing technique covered in this paper is thresholding. We also try to analyze the results obtained by the pixel-level processing algorithms.
Rotation in the dynamic factor modeling of multivariate stationary time series.

NARCIS (Netherlands)

Molenaar, P.C.M.; Nesselroade, J.R.

2001-01-01

A special rotation procedure is proposed for the exploratory dynamic factor model for stationary multivariate time series. The rotation procedure applies separately to each univariate component series of a q-variate latent factor series and transforms such a component, initially represented as white
3-D image pre-processing algorithms for improved automated tracing of neuronal arbors.

Science.gov (United States)

Narayanaswamy, Arunachalam; Wang, Yu; Roysam, Badrinath

2011-09-01

The accuracy and reliability of automated neurite tracing systems is ultimately limited by image quality as reflected in the signal-to-noise ratio, contrast, and image variability. This paper describes a novel combination of image processing methods that operate on images of neurites captured by confocal and widefield microscopy, and produce synthetic images that are better suited to automated tracing. The algorithms are based on the curvelet transform (for denoising curvilinear structures and local orientation estimation), perceptual grouping by scalar voting (for elimination of non-tubular structures and improvement of neurite continuity while preserving branch points), adaptive focus detection, and depth estimation (for handling widefield images without deconvolution). The proposed methods are fast, and capable of handling large images. Their ability to handle images of unlimited size derives from automated tiling of large images along the lateral dimension, and processing of 3-D images one optical slice at a time. Their speed derives in part from the fact that the core computations are formulated in terms of the Fast Fourier Transform (FFT), and in part from parallel computation on multi-core computers. The methods are simple to apply to new images since they require very few adjustable parameters, all of which are intuitive. Examples of pre-processing DIADEM Challenge images are used to illustrate improved automated tracing resulting from our pre-processing methods.
A simple and fast representation space for classifying complex time series

International Nuclear Information System (INIS)

Zunino, Luciano; Olivares, Felipe; Bariviera, Aurelio F.; Rosso, Osvaldo A.

2017-01-01

In the context of time series analysis considerable effort has been directed towards the implementation of efficient discriminating statistical quantifiers. Very recently, a simple and fast representation space has been introduced, namely the number of turning points versus the Abbe value. It is able to separate time series from stationary and non-stationary processes with long-range dependences. In this work we show that this bidimensional approach is useful for distinguishing complex time series: different sets of financial and physiological data are efficiently discriminated. Additionally, a multiscale generalization that takes into account the multiple time scales often involved in complex systems has been also proposed. This multiscale analysis is essential to reach a higher discriminative power between physiological time series in health and disease. - Highlights: • A bidimensional scheme has been tested for classification purposes. • A multiscale generalization is introduced. • Several practical applications confirm its usefulness. • Different sets of financial and physiological data are efficiently distinguished. • This multiscale bidimensional approach has high potential as discriminative tool.

A simple and fast representation space for classifying complex time series

Energy Technology Data Exchange (ETDEWEB)

Zunino, Luciano, E-mail: lucianoz@ciop.unlp.edu.ar [Centro de Investigaciones Ópticas (CONICET La Plata – CIC), C.C. 3, 1897 Gonnet (Argentina); Departamento de Ciencias Básicas, Facultad de Ingeniería, Universidad Nacional de La Plata (UNLP), 1900 La Plata (Argentina); Olivares, Felipe, E-mail: olivaresfe@gmail.com [Instituto de Física, Pontificia Universidad Católica de Valparaíso (PUCV), 23-40025 Valparaíso (Chile); Bariviera, Aurelio F., E-mail: aurelio.fernandez@urv.cat [Department of Business, Universitat Rovira i Virgili, Av. Universitat 1, 43204 Reus (Spain); Rosso, Osvaldo A., E-mail: oarosso@gmail.com [Instituto de Física, Universidade Federal de Alagoas (UFAL), BR 104 Norte km 97, 57072-970, Maceió, Alagoas (Brazil); Instituto Tecnológico de Buenos Aires (ITBA) and CONICET, C1106ACD, Av. Eduardo Madero 399, Ciudad Autónoma de Buenos Aires (Argentina); Complex Systems Group, Facultad de Ingeniería y Ciencias Aplicadas, Universidad de los Andes, Av. Mons. Álvaro del Portillo 12.455, Las Condes, Santiago (Chile)

2017-03-18

In the context of time series analysis considerable effort has been directed towards the implementation of efficient discriminating statistical quantifiers. Very recently, a simple and fast representation space has been introduced, namely the number of turning points versus the Abbe value. It is able to separate time series from stationary and non-stationary processes with long-range dependences. In this work we show that this bidimensional approach is useful for distinguishing complex time series: different sets of financial and physiological data are efficiently discriminated. Additionally, a multiscale generalization that takes into account the multiple time scales often involved in complex systems has been also proposed. This multiscale analysis is essential to reach a higher discriminative power between physiological time series in health and disease. - Highlights: • A bidimensional scheme has been tested for classification purposes. • A multiscale generalization is introduced. • Several practical applications confirm its usefulness. • Different sets of financial and physiological data are efficiently distinguished. • This multiscale bidimensional approach has high potential as discriminative tool.
CSS Preprocessing: Tools and Automation Techniques

Directory of Open Access Journals (Sweden)

Ricardo Queirós

2018-01-01

Full Text Available Cascading Style Sheets (CSS is a W3C specification for a style sheet language used for describing the presentation of a document written in a markup language, more precisely, for styling Web documents. However, in the last few years, the landscape for CSS development has changed dramatically with the appearance of several languages and tools aiming to help developers build clean, modular and performance-aware CSS. These new approaches give developers mechanisms to preprocess CSS rules through the use of programming constructs, defined as CSS preprocessors, with the ultimate goal to bring those missing constructs to the CSS realm and to foster stylesheets structured programming. At the same time, a new set of tools appeared, defined as postprocessors, for extension and automation purposes covering a broad set of features ranging from identifying unused and duplicate code to applying vendor prefixes. With all these tools and techniques in hands, developers need to provide a consistent workflow to foster CSS modular coding. This paper aims to present an introductory survey on the CSS processors. The survey gathers information on a specific set of processors, categorizes them and compares their features regarding a set of predefined criteria such as: maturity, coverage and performance. Finally, we propose a basic set of best practices in order to setup a simple and pragmatic styling code workflow.
Visibility graphlet approach to chaotic time series

Energy Technology Data Exchange (ETDEWEB)

Mutua, Stephen [Business School, University of Shanghai for Science and Technology, Shanghai 200093 (China); Computer Science Department, Masinde Muliro University of Science and Technology, P.O. Box 190-50100, Kakamega (Kenya); Gu, Changgui, E-mail: gu-changgui@163.com, E-mail: hjyang@ustc.edu.cn; Yang, Huijie, E-mail: gu-changgui@163.com, E-mail: hjyang@ustc.edu.cn [Business School, University of Shanghai for Science and Technology, Shanghai 200093 (China)

2016-05-15

Many novel methods have been proposed for mapping time series into complex networks. Although some dynamical behaviors can be effectively captured by existing approaches, the preservation and tracking of the temporal behaviors of a chaotic system remains an open problem. In this work, we extended the visibility graphlet approach to investigate both discrete and continuous chaotic time series. We applied visibility graphlets to capture the reconstructed local states, so that each is treated as a node and tracked downstream to create a temporal chain link. Our empirical findings show that the approach accurately captures the dynamical properties of chaotic systems. Networks constructed from periodic dynamic phases all converge to regular networks and to unique network structures for each model in the chaotic zones. Furthermore, our results show that the characterization of chaotic and non-chaotic zones in the Lorenz system corresponds to the maximal Lyapunov exponent, thus providing a simple and straightforward way to analyze chaotic systems.
Fast and Flexible Multivariate Time Series Subsequence Search

Data.gov (United States)

National Aeronautics and Space Administration — Multivariate Time-Series (MTS) are ubiquitous, and are generated in areas as disparate as sensor recordings in aerospace systems, music and video streams, medical...
Automated Feature Design for Time Series Classification by Genetic Programming

OpenAIRE

Harvey, Dustin Yewell

2014-01-01

Time series classification (TSC) methods discover and exploit patterns in time series and other one-dimensional signals. Although many accurate, robust classifiers exist for multivariate feature sets, general approaches are needed to extend machine learning techniques to make use of signal inputs. Numerous applications of TSC can be found in structural engineering, especially in the areas of structural health monitoring and non-destructive evaluation. Additionally, the fields of process contr...
The use of synthetic input sequences in time series modeling

International Nuclear Information System (INIS)

Oliveira, Dair Jose de; Letellier, Christophe; Gomes, Murilo E.D.; Aguirre, Luis A.

2008-01-01

In many situations time series models obtained from noise-like data settle to trivial solutions under iteration. This Letter proposes a way of producing a synthetic (dummy) input, that is included to prevent the model from settling down to a trivial solution, while maintaining features of the original signal. Simulated benchmark models and a real time series of RR intervals from an ECG are used to illustrate the procedure
Near-Real-Time Monitoring of Insect Defoliation Using Landsat Time Series

Directory of Open Access Journals (Sweden)

Valerie J. Pasquarella

2017-07-01

Full Text Available Introduced insects and pathogens impact millions of acres of forested land in the United States each year, and large-scale monitoring efforts are essential for tracking the spread of outbreaks and quantifying the extent of damage. However, monitoring the impacts of defoliating insects presents a significant challenge due to the ephemeral nature of defoliation events. Using the 2016 gypsy moth (Lymantria dispar outbreak in Southern New England as a case study, we present a new approach for near-real-time defoliation monitoring using synthetic images produced from Landsat time series. By comparing predicted and observed images, we assessed changes in vegetation condition multiple times over the course of an outbreak. Initial measures can be made as imagery becomes available, and season-integrated products provide a wall-to-wall assessment of potential defoliation at 30 m resolution. Qualitative and quantitative comparisons suggest our Landsat Time Series (LTS products improve identification of defoliation events relative to existing products and provide a repeatable metric of change in condition. Our synthetic-image approach is an important step toward using the full temporal potential of the Landsat archive for operational monitoring of forest health over large extents, and provides an important new tool for understanding spatial and temporal dynamics of insect defoliators.
Analysis of three amphibian populations with quarter-century long time-series.

OpenAIRE

Meyer, A H; Schimidt, B R; Grossenbacher, K

1998-01-01

Amphibians are in decline in many parts of the world. Long tme-series of amphibian populations are necessary to distinguish declines from the often strong fluctuations observed in natural populations. Time-series may also help to understand the causes of these declines. We analysed 23-28-year long time-series of the frog Rana temporaria. Only one of the three studied populations showed a negative trend which was probably caused by the introduction of fish. Two populations appeared to be densi...
Linguistic Preprocessing and Tagging for Problem Report Trend Analysis

Science.gov (United States)

Beil, Robert J.; Malin, Jane T.

2012-01-01

Mr. Robert Beil, Systems Engineer at Kennedy Space Center (KSC), requested the NASA Engineering and Safety Center (NESC) develop a prototype tool suite that combines complementary software technology used at Johnson Space Center (JSC) and KSC for problem report preprocessing and semantic tag extraction, to improve input to data mining and trend analysis. This document contains the outcome of the assessment and the Findings, Observations and NESC Recommendations.
Validation of DWI pre-processing procedures for reliable differentiation between human brain gliomas.

Science.gov (United States)

Vellmer, Sebastian; Tonoyan, Aram S; Suter, Dieter; Pronin, Igor N; Maximov, Ivan I

2018-02-01

Diffusion magnetic resonance imaging (dMRI) is a powerful tool in clinical applications, in particular, in oncology screening. dMRI demonstrated its benefit and efficiency in the localisation and detection of different types of human brain tumours. Clinical dMRI data suffer from multiple artefacts such as motion and eddy-current distortions, contamination by noise, outliers etc. In order to increase the image quality of the derived diffusion scalar metrics and the accuracy of the subsequent data analysis, various pre-processing approaches are actively developed and used. In the present work we assess the effect of different pre-processing procedures such as a noise correction, different smoothing algorithms and spatial interpolation of raw diffusion data, with respect to the accuracy of brain glioma differentiation. As a set of sensitive biomarkers of the glioma malignancy grades we chose the derived scalar metrics from diffusion and kurtosis tensor imaging as well as the neurite orientation dispersion and density imaging (NODDI) biophysical model. Our results show that the application of noise correction, anisotropic diffusion filtering, and cubic-order spline interpolation resulted in the highest sensitivity and specificity for glioma malignancy grading. Thus, these pre-processing steps are recommended for the statistical analysis in brain tumour studies. Copyright © 2017. Published by Elsevier GmbH.
Supervised pre-processing approaches in multiple class variables classification for fish recruitment forecasting

KAUST Repository

Fernandes, José Antonio

2013-02-01

A multi-species approach to fisheries management requires taking into account the interactions between species in order to improve recruitment forecasting of the fish species. Recent advances in Bayesian networks direct the learning of models with several interrelated variables to be forecasted simultaneously. These models are known as multi-dimensional Bayesian network classifiers (MDBNs). Pre-processing steps are critical for the posterior learning of the model in these kinds of domains. Therefore, in the present study, a set of \\'state-of-the-art\\' uni-dimensional pre-processing methods, within the categories of missing data imputation, feature discretization and feature subset selection, are adapted to be used with MDBNs. A framework that includes the proposed multi-dimensional supervised pre-processing methods, coupled with a MDBN classifier, is tested with synthetic datasets and the real domain of fish recruitment forecasting. The correctly forecasting of three fish species (anchovy, sardine and hake) simultaneously is doubled (from 17.3% to 29.5%) using the multi-dimensional approach in comparison to mono-species models. The probability assessments also show high improvement reducing the average error (estimated by means of Brier score) from 0.35 to 0.27. Finally, these differences are superior to the forecasting of species by pairs. © 2012 Elsevier Ltd.
The 1992 ENDF Pre-processing codes

International Nuclear Information System (INIS)

Cullen, D.E.

1992-01-01

This document summarizes the 1992 version of the ENDF pre-processing codes which are required for processing evaluated nuclear data coded in the format ENDF-4, ENDF-5, or ENDF-6. Included are the codes CONVERT, MERGER, LINEAR, RECENT, SIGMA1, LEGEND, FIXUP, GROUPIE, DICTION, MIXER, VIRGIN, COMPLOT, EVALPLOT, RELABEL. Some of the functions of these codes are: to calculate cross-sections from resonance parameters; to calculate angular distributions, group average, mixtures of cross-sections, etc; to produce graphical plottings and data comparisons. The codes are designed to operate on virtually any type of computer including PC's. They are available from the IAEA Nuclear Data Section, free of charge upon request, on magnetic tape or a set of HD diskettes. (author)
Robust Control Charts for Time Series Data

NARCIS (Netherlands)

Croux, C.; Gelper, S.; Mahieu, K.

2010-01-01

This article presents a control chart for time series data, based on the one-step- ahead forecast errors of the Holt-Winters forecasting method. We use robust techniques to prevent that outliers affect the estimation of the control limits of the chart. Moreover, robustness is important to maintain
Lecture notes for Advanced Time Series Analysis

DEFF Research Database (Denmark)

Madsen, Henrik; Holst, Jan

1997-01-01

A first version of this notes was used at the lectures in Grenoble, and they are now extended and improved (together with Jan Holst), and used in Ph.D. courses on Advanced Time Series Analysis at IMM and at the Department of Mathematical Statistics, University of Lund, 1994, 1997, ...
Predicting long-term catchment nutrient export: the use of nonlinear time series models

Science.gov (United States)

Valent, Peter; Howden, Nicholas J. K.; Szolgay, Jan; Komornikova, Magda

2010-05-01

After the Second World War the nitrate concentrations in European water bodies changed significantly as the result of increased nitrogen fertilizer use and changes in land use. However, in the last decades, as a consequence of the implementation of nitrate-reducing measures in Europe, the nitrate concentrations in water bodies slowly decrease. This causes that the mean and variance of the observed time series also changes with time (nonstationarity and heteroscedascity). In order to detect changes and properly describe the behaviour of such time series by time series analysis, linear models (such as autoregressive (AR), moving average (MA) and autoregressive moving average models (ARMA)), are no more suitable. Time series with sudden changes in statistical characteristics can cause various problems in the calibration of traditional water quality models and thus give biased predictions. Proper statistical analysis of these non-stationary and heteroscedastic time series with the aim of detecting and subsequently explaining the variations in their statistical characteristics requires the use of nonlinear time series models. This information can be then used to improve the model building and calibration of conceptual water quality model or to select right calibration periods in order to produce reliable predictions. The objective of this contribution is to analyze two long time series of nitrate concentrations of the rivers Ouse and Stour with advanced nonlinear statistical modelling techniques and compare their performance with traditional linear models of the ARMA class in order to identify changes in the time series characteristics. The time series were analysed with nonlinear models with multiple regimes represented by self-exciting threshold autoregressive (SETAR) and Markov-switching models (MSW). The analysis showed that, based on the value of residual sum of squares (RSS) in both datasets, SETAR and MSW models described the time-series better than models of the
Extracting biologically significant patterns from short time series gene expression data

Directory of Open Access Journals (Sweden)

McGinnis Thomas

2009-08-01

Full Text Available Abstract Background Time series gene expression data analysis is used widely to study the dynamics of various cell processes. Most of the time series data available today consist of few time points only, thus making the application of standard clustering techniques difficult. Results We developed two new algorithms that are capable of extracting biological patterns from short time point series gene expression data. The two algorithms, ASTRO and MiMeSR, are inspired by the rank order preserving framework and the minimum mean squared residue approach, respectively. However, ASTRO and MiMeSR differ from previous approaches in that they take advantage of the relatively few number of time points in order to reduce the problem from NP-hard to linear. Tested on well-defined short time expression data, we found that our approaches are robust to noise, as well as to random patterns, and that they can correctly detect the temporal expression profile of relevant functional categories. Evaluation of our methods was performed using Gene Ontology (GO annotations and chromatin immunoprecipitation (ChIP-chip data. Conclusion Our approaches generally outperform both standard clustering algorithms and algorithms designed specifically for clustering of short time series gene expression data. Both algorithms are available at http://www.benoslab.pitt.edu/astro/.
The application of complex network time series analysis in turbulent heated jets

International Nuclear Information System (INIS)

Charakopoulos, A. K.; Karakasidis, T. E.; Liakopoulos, A.; Papanicolaou, P. N.

2014-01-01

In the present study, we applied the methodology of the complex network-based time series analysis to experimental temperature time series from a vertical turbulent heated jet. More specifically, we approach the hydrodynamic problem of discriminating time series corresponding to various regions relative to the jet axis, i.e., time series corresponding to regions that are close to the jet axis from time series originating at regions with a different dynamical regime based on the constructed network properties. Applying the transformation phase space method (k nearest neighbors) and also the visibility algorithm, we transformed time series into networks and evaluated the topological properties of the networks such as degree distribution, average path length, diameter, modularity, and clustering coefficient. The results show that the complex network approach allows distinguishing, identifying, and exploring in detail various dynamical regions of the jet flow, and associate it to the corresponding physical behavior. In addition, in order to reject the hypothesis that the studied networks originate from a stochastic process, we generated random network and we compared their statistical properties with that originating from the experimental data. As far as the efficiency of the two methods for network construction is concerned, we conclude that both methodologies lead to network properties that present almost the same qualitative behavior and allow us to reveal the underlying system dynamics
Trend analysis using non-stationary time series clustering based on the finite element method

Science.gov (United States)

Gorji Sefidmazgi, M.; Sayemuzzaman, M.; Homaifar, A.; Jha, M. K.; Liess, S.

2014-05-01

In order to analyze low-frequency variability of climate, it is useful to model the climatic time series with multiple linear trends and locate the times of significant changes. In this paper, we have used non-stationary time series clustering to find change points in the trends. Clustering in a multi-dimensional non-stationary time series is challenging, since the problem is mathematically ill-posed. Clustering based on the finite element method (FEM) is one of the methods that can analyze multidimensional time series. One important attribute of this method is that it is not dependent on any statistical assumption and does not need local stationarity in the time series. In this paper, it is shown how the FEM-clustering method can be used to locate change points in the trend of temperature time series from in situ observations. This method is applied to the temperature time series of North Carolina (NC) and the results represent region-specific climate variability despite higher frequency harmonics in climatic time series. Next, we investigated the relationship between the climatic indices with the clusters/trends detected based on this clustering method. It appears that the natural variability of climate change in NC during 1950-2009 can be explained mostly by AMO and solar activity.
Frontiers in Time Series and Financial Econometrics : An overview

NARCIS (Netherlands)

S. Ling (Shiqing); M.J. McAleer (Michael); H. Tong (Howell)

2015-01-01

markdownabstract__Abstract__ Two of the fastest growing frontiers in econometrics and quantitative finance are time series and financial econometrics. Significant theoretical contributions to financial econometrics have been made by experts in statistics, econometrics, mathematics, and time
Frontiers in Time Series and Financial Econometrics: An Overview

NARCIS (Netherlands)

S. Ling (Shiqing); M.J. McAleer (Michael); H. Tong (Howell)

2015-01-01

markdownabstract__Abstract__ Two of the fastest growing frontiers in econometrics and quantitative finance are time series and financial econometrics. Significant theoretical contributions to financial econometrics have been made by experts in statistics, econometrics, mathematics, and time

A four-stage hybrid model for hydrological time series forecasting.

Science.gov (United States)

Di, Chongli; Yang, Xiaohua; Wang, Xiaochao

2014-01-01

Hydrological time series forecasting remains a difficult task due to its complicated nonlinear, non-stationary and multi-scale characteristics. To solve this difficulty and improve the prediction accuracy, a novel four-stage hybrid model is proposed for hydrological time series forecasting based on the principle of 'denoising, decomposition and ensemble'. The proposed model has four stages, i.e., denoising, decomposition, components prediction and ensemble. In the denoising stage, the empirical mode decomposition (EMD) method is utilized to reduce the noises in the hydrological time series. Then, an improved method of EMD, the ensemble empirical mode decomposition (EEMD), is applied to decompose the denoised series into a number of intrinsic mode function (IMF) components and one residual component. Next, the radial basis function neural network (RBFNN) is adopted to predict the trend of all of the components obtained in the decomposition stage. In the final ensemble prediction stage, the forecasting results of all of the IMF and residual components obtained in the third stage are combined to generate the final prediction results, using a linear neural network (LNN) model. For illustration and verification, six hydrological cases with different characteristics are used to test the effectiveness of the proposed model. The proposed hybrid model performs better than conventional single models, the hybrid models without denoising or decomposition and the hybrid models based on other methods, such as the wavelet analysis (WA)-based hybrid models. In addition, the denoising and decomposition strategies decrease the complexity of the series and reduce the difficulties of the forecasting. With its effective denoising and accurate decomposition ability, high prediction precision and wide applicability, the new model is very promising for complex time series forecasting. This new forecast model is an extension of nonlinear prediction models.
A Four-Stage Hybrid Model for Hydrological Time Series Forecasting

Science.gov (United States)

Di, Chongli; Yang, Xiaohua; Wang, Xiaochao

2014-01-01

Hydrological time series forecasting remains a difficult task due to its complicated nonlinear, non-stationary and multi-scale characteristics. To solve this difficulty and improve the prediction accuracy, a novel four-stage hybrid model is proposed for hydrological time series forecasting based on the principle of ‘denoising, decomposition and ensemble’. The proposed model has four stages, i.e., denoising, decomposition, components prediction and ensemble. In the denoising stage, the empirical mode decomposition (EMD) method is utilized to reduce the noises in the hydrological time series. Then, an improved method of EMD, the ensemble empirical mode decomposition (EEMD), is applied to decompose the denoised series into a number of intrinsic mode function (IMF) components and one residual component. Next, the radial basis function neural network (RBFNN) is adopted to predict the trend of all of the components obtained in the decomposition stage. In the final ensemble prediction stage, the forecasting results of all of the IMF and residual components obtained in the third stage are combined to generate the final prediction results, using a linear neural network (LNN) model. For illustration and verification, six hydrological cases with different characteristics are used to test the effectiveness of the proposed model. The proposed hybrid model performs better than conventional single models, the hybrid models without denoising or decomposition and the hybrid models based on other methods, such as the wavelet analysis (WA)-based hybrid models. In addition, the denoising and decomposition strategies decrease the complexity of the series and reduce the difficulties of the forecasting. With its effective denoising and accurate decomposition ability, high prediction precision and wide applicability, the new model is very promising for complex time series forecasting. This new forecast model is an extension of nonlinear prediction models. PMID:25111782
A Gaussian Process Based Online Change Detection Algorithm for Monitoring Periodic Time Series

Energy Technology Data Exchange (ETDEWEB)

Chandola, Varun [ORNL; Vatsavai, Raju [ORNL

2011-01-01

Online time series change detection is a critical component of many monitoring systems, such as space and air-borne remote sensing instruments, cardiac monitors, and network traffic profilers, which continuously analyze observations recorded by sensors. Data collected by such sensors typically has a periodic (seasonal) component. Most existing time series change detection methods are not directly applicable to handle such data, either because they are not designed to handle periodic time series or because they cannot operate in an online mode. We propose an online change detection algorithm which can handle periodic time series. The algorithm uses a Gaussian process based non-parametric time series prediction model and monitors the difference between the predictions and actual observations within a statistically principled control chart framework to identify changes. A key challenge in using Gaussian process in an online mode is the need to solve a large system of equations involving the associated covariance matrix which grows with every time step. The proposed algorithm exploits the special structure of the covariance matrix and can analyze a time series of length T in O(T^2) time while maintaining a O(T) memory footprint, compared to O(T^4) time and O(T^2) memory requirement of standard matrix manipulation methods. We experimentally demonstrate the superiority of the proposed algorithm over several existing time series change detection algorithms on a set of synthetic and real time series. Finally, we illustrate the effectiveness of the proposed algorithm for identifying land use land cover changes using Normalized Difference Vegetation Index (NDVI) data collected for an agricultural region in Iowa state, USA. Our algorithm is able to detect different types of changes in a NDVI validation data set (with ~80% accuracy) which occur due to crop type changes as well as disruptive changes (e.g., natural disasters).
Input data preprocessing method for exchange rate forecasting via neural network

Directory of Open Access Journals (Sweden)

Antić Dragan S.

2014-01-01

Full Text Available The aim of this paper is to present a method for neural network input parameters selection and preprocessing. The purpose of this network is to forecast foreign exchange rates using artificial intelligence. Two data sets are formed for two different economic systems. Each system is represented by six categories with 70 economic parameters which are used in the analysis. Reduction of these parameters within each category was performed by using the principal component analysis method. Component interdependencies are established and relations between them are formed. Newly formed relations were used to create input vectors of a neural network. The multilayer feed forward neural network is formed and trained using batch training. Finally, simulation results are presented and it is concluded that input data preparation method is an effective way for preprocessing neural network data. [Projekat Ministarstva nauke Republike Srbije, br.TR 35005, br. III 43007 i br. III 44006
CudaPre3D: An Alternative Preprocessing Algorithm for Accelerating 3D Convex Hull Computation on the GPU

Directory of Open Access Journals (Sweden)

MEI, G.

2015-05-01

Full Text Available In the calculating of convex hulls for point sets, a preprocessing procedure that is to filter the input points by discarding non-extreme points is commonly used to improve the computational efficiency. We previously proposed a quite straightforward preprocessing approach for accelerating 2D convex hull computation on the GPU. In this paper, we extend that algorithm to being used in 3D cases. The basic ideas behind these two preprocessing algorithms are similar: first, several groups of extreme points are found according to the original set of input points and several rotated versions of the input set; then, a convex polyhedron is created using the found extreme points; and finally those interior points locating inside the formed convex polyhedron are discarded. Experimental results show that: when employing the proposed preprocessing algorithm, it achieves the speedups of about 4x on average and 5x to 6x in the best cases over the cases where the proposed approach is not used. In addition, more than 95 percent of the input points can be discarded in most experimental tests.
Time-variant power spectral analysis of heart-rate time series by ...

Indian Academy of Sciences (India)

Frequency domain representation of a short-term heart-rate time series (HRTS) signal is a popular method for evaluating the cardiovascular control system. The spectral parameters, viz. percentage power in low frequency band (%PLF), percentage power in high frequency band (%PHF), power ratio of low frequency to high ...
Rotation in the Dynamic Factor Modeling of Multivariate Stationary Time Series.

Science.gov (United States)

Molenaar, Peter C. M.; Nesselroade, John R.

2001-01-01

Proposes a special rotation procedure for the exploratory dynamic factor model for stationary multivariate time series. The rotation procedure applies separately to each univariate component series of a q-variate latent factor series and transforms such a component, initially represented as white noise, into a univariate moving-average.…
Estimating the level of dynamical noise in time series by using fractal dimensions

Energy Technology Data Exchange (ETDEWEB)

Sase, Takumi, E-mail: sase@sat.t.u-tokyo.ac.jp [Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 153-8505 (Japan); Ramírez, Jonatán Peña [CONACYT Research Fellow, Center for Scientific Research and Higher Education at Ensenada (CICESE), Carretera Ensenada-Tijuana No. 3918, Zona Playitas, C.P. 22860, Ensenada, Baja California (Mexico); Kitajo, Keiichi [BSI-Toyota Collaboration Center, RIKEN Brain Science Institute, Wako, Saitama 351-0198 (Japan); Aihara, Kazuyuki; Hirata, Yoshito [Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 153-8505 (Japan); Institute of Industrial Science, The University of Tokyo, Tokyo 153-8505 (Japan)

2016-03-11

We present a method for estimating the dynamical noise level of a ‘short’ time series even if the dynamical system is unknown. The proposed method estimates the level of dynamical noise by calculating the fractal dimensions of the time series. Additionally, the method is applied to EEG data to demonstrate its possible effectiveness as an indicator of temporal changes in the level of dynamical noise. - Highlights: • A dynamical noise level estimator for time series is proposed. • The estimator does not need any information about the dynamics generating the time series. • The estimator is based on a novel definition of time series dimension (TSD). • It is demonstrated that there exists a monotonic relationship between the • TSD and the level of dynamical noise. • We apply the proposed method to human electroencephalographic data.
Estimating the level of dynamical noise in time series by using fractal dimensions

International Nuclear Information System (INIS)

Sase, Takumi; Ramírez, Jonatán Peña; Kitajo, Keiichi; Aihara, Kazuyuki; Hirata, Yoshito

2016-01-01

We present a method for estimating the dynamical noise level of a ‘short’ time series even if the dynamical system is unknown. The proposed method estimates the level of dynamical noise by calculating the fractal dimensions of the time series. Additionally, the method is applied to EEG data to demonstrate its possible effectiveness as an indicator of temporal changes in the level of dynamical noise. - Highlights: • A dynamical noise level estimator for time series is proposed. • The estimator does not need any information about the dynamics generating the time series. • The estimator is based on a novel definition of time series dimension (TSD). • It is demonstrated that there exists a monotonic relationship between the • TSD and the level of dynamical noise. • We apply the proposed method to human electroencephalographic data.
Updating Landsat time series of surface-reflectance composites and forest change products with new observations

Science.gov (United States)

Hermosilla, Txomin; Wulder, Michael A.; White, Joanne C.; Coops, Nicholas C.; Hobart, Geordie W.

2017-12-01

The use of time series satellite data allows for the temporally dense, systematic, transparent, and synoptic capture of land dynamics over time. Subsequent to the opening of the Landsat archive, several time series approaches for characterizing landscape change have been developed, often representing a particular analytical time window. The information richness and widespread utility of these time series data have created a need to maintain the currency of time series information via the addition of new data, as it becomes available. When an existing time series is temporally extended, it is critical that previously generated change information remains consistent, thereby not altering reported change statistics or science outcomes based on that change information. In this research, we investigate the impacts and implications of adding additional years to an existing 29-year annual Landsat time series for forest change. To do so, we undertook a spatially explicit comparison of the 29 overlapping years of a time series representing 1984-2012, with a time series representing 1984-2016. Surface reflectance values, and presence, year, and type of change were compared. We found that the addition of years to extend the time series had minimal effect on the annual surface reflectance composites, with slight band-specific differences (r ≥ 0.1) in the final years of the original time series being updated. The area of stand replacing disturbances and determination of change year are virtually unchanged for the overlapping period between the two time-series products. Over the overlapping temporal period (1984-2012), the total area of change differs by 0.53%, equating to an annual difference in change area of 0.019%. Overall, the spatial and temporal agreement of the changes detected by both time series was 96%. Further, our findings suggest that the entire pre-existing historic time series does not need to be re-processed during the update process. Critically, given the time
Status of pre-processing of waste electrical and electronic equipment in Germany and its influence on the recovery of gold.

Science.gov (United States)

Chancerel, Perrine; Bolland, Til; Rotter, Vera Susanne

2011-03-01

Waste electrical and electronic equipment (WEEE) contains gold in low but from an environmental and economic point of view relevant concentration. After collection, WEEE is pre-processed in order to generate appropriate material fractions that are sent to the subsequent end-processing stages (recovery, reuse or disposal). The goal of this research is to quantify the overall recovery rates of pre-processing technologies used in Germany for the reference year 2007. To achieve this goal, facilities operating in Germany were listed and classified according to the technology they apply. Information on their processing capacity was gathered by evaluating statistical databases. Based on a literature review of experimental results for gold recovery rates of different pre-processing technologies, the German overall recovery rate of gold at the pre-processing level was quantified depending on the characteristics of the treated WEEE. The results reveal that - depending on the equipment groups - pre-processing recovery rates of gold of 29 to 61% are achieved in Germany. Some practical recommendations to reduce the losses during pre-processing could be formulated. Defining mass-based recovery targets in the legislation does not set incentives to recover trace elements. Instead, the priorities for recycling could be defined based on other parameters like the environmental impacts of the materials. The implementation of measures to reduce the gold losses would also improve the recovery of several other non-ferrous metals like tin, nickel, and palladium.
A New Modified Histogram Matching Normalization for Time Series Microarray Analysis.

Science.gov (United States)

Astola, Laura; Molenaar, Jaap

2014-07-01

Microarray data is often utilized in inferring regulatory networks. Quantile normalization (QN) is a popular method to reduce array-to-array variation. We show that in the context of time series measurements QN may not be the best choice for this task, especially not if the inference is based on continuous time ODE model. We propose an alternative normalization method that is better suited for network inference from time series data.
Quantifying evolutionary dynamics from variant-frequency time series

Science.gov (United States)

Khatri, Bhavin S.

2016-09-01

From Kimura’s neutral theory of protein evolution to Hubbell’s neutral theory of biodiversity, quantifying the relative importance of neutrality versus selection has long been a basic question in evolutionary biology and ecology. With deep sequencing technologies, this question is taking on a new form: given a time-series of the frequency of different variants in a population, what is the likelihood that the observation has arisen due to selection or neutrality? To tackle the 2-variant case, we exploit Fisher’s angular transformation, which despite being discovered by Ronald Fisher a century ago, has remained an intellectual curiosity. We show together with a heuristic approach it provides a simple solution for the transition probability density at short times, including drift, selection and mutation. Our results show under that under strong selection and sufficiently frequent sampling these evolutionary parameters can be accurately determined from simulation data and so they provide a theoretical basis for techniques to detect selection from variant or polymorphism frequency time-series.
RankExplorer: Visualization of Ranking Changes in Large Time Series Data.

Science.gov (United States)

Shi, Conglei; Cui, Weiwei; Liu, Shixia; Xu, Panpan; Chen, Wei; Qu, Huamin

2012-12-01

For many applications involving time series data, people are often interested in the changes of item values over time as well as their ranking changes. For example, people search many words via search engines like Google and Bing every day. Analysts are interested in both the absolute searching number for each word as well as their relative rankings. Both sets of statistics may change over time. For very large time series data with thousands of items, how to visually present ranking changes is an interesting challenge. In this paper, we propose RankExplorer, a novel visualization method based on ThemeRiver to reveal the ranking changes. Our method consists of four major components: 1) a segmentation method which partitions a large set of time series curves into a manageable number of ranking categories; 2) an extended ThemeRiver view with embedded color bars and changing glyphs to show the evolution of aggregation values related to each ranking category over time as well as the content changes in each ranking category; 3) a trend curve to show the degree of ranking changes over time; 4) rich user interactions to support interactive exploration of ranking changes. We have applied our method to some real time series data and the case studies demonstrate that our method can reveal the underlying patterns related to ranking changes which might otherwise be obscured in traditional visualizations.
ESTIMATING RELIABILITY OF DISTURBANCES IN SATELLITE TIME SERIES DATA BASED ON STATISTICAL ANALYSIS

Directory of Open Access Journals (Sweden)

Z.-G. Zhou

2016-06-01

Full Text Available Normally, the status of land cover is inherently dynamic and changing continuously on temporal scale. However, disturbances or abnormal changes of land cover — caused by such as forest fire, flood, deforestation, and plant diseases — occur worldwide at unknown times and locations. Timely detection and characterization of these disturbances is of importance for land cover monitoring. Recently, many time-series-analysis methods have been developed for near real-time or online disturbance detection, using satellite image time series. However, the detection results were only labelled with “Change/ No change” by most of the present methods, while few methods focus on estimating reliability (or confidence level of the detected disturbances in image time series. To this end, this paper propose a statistical analysis method for estimating reliability of disturbances in new available remote sensing image time series, through analysis of full temporal information laid in time series data. The method consists of three main steps. (1 Segmenting and modelling of historical time series data based on Breaks for Additive Seasonal and Trend (BFAST. (2 Forecasting and detecting disturbances in new time series data. (3 Estimating reliability of each detected disturbance using statistical analysis based on Confidence Interval (CI and Confidence Levels (CL. The method was validated by estimating reliability of disturbance regions caused by a recent severe flooding occurred around the border of Russia and China. Results demonstrated that the method can estimate reliability of disturbances detected in satellite image with estimation error less than 5% and overall accuracy up to 90%.
Finite difference time domain calculation of three-dimensional phononic band structures using a postprocessing method based on the filter diagonalization

International Nuclear Information System (INIS)

Su Xiaoxing; Ma Tianxue; Wang Yuesheng

2011-01-01

If the band structure of a three-dimensional (3D) phononic crystal (PNC) is calculated by using the finite difference time domain (FDTD) method combined with the fast Fourier transform (FFT)-based postprocessing method, good results can only be ensured by a sufficiently large number of FDTD iterations. On a common computer platform, the total computation time will be very long. To overcome this difficulty, an excellent harmonic inversion algorithm called the filter diagonalization method (FDM) can be used in the postprocessing to reduce the number of FDTD iterations. However, the low efficiency of the FDM, which occurs when a relatively long time series is given, does not necessarily ensure an effective reduction of the total computation time. In this paper, a postprocessing method based on the FDM is proposed. The main procedure of the method is designed considering the aim to make the time spent on the method itself far less than the corresponding time spent on the FDTD iterations. To this end, the FDTD time series is preprocessed to be shortened significantly before the FDM frequency extraction. The preprocessing procedure is performed with the filter and decimation operations, which are widely used in narrow-band signal processing. Numerical results for a typical 3D solid PNC system show that the proposed postprocessing method can be used to effectively reduce the total computation time of the FDTD calculation of 3D phononic band structures.
Finite difference time domain calculation of three-dimensional phononic band structures using a postprocessing method based on the filter diagonalization

Energy Technology Data Exchange (ETDEWEB)

Su Xiaoxing [School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044 (China); Ma Tianxue; Wang Yuesheng, E-mail: xxsu@bjtu.edu.cn [Institute of Engineering Mechanics, Beijing Jiaotong University, Beijing 100044 (China)

2011-10-15

If the band structure of a three-dimensional (3D) phononic crystal (PNC) is calculated by using the finite difference time domain (FDTD) method combined with the fast Fourier transform (FFT)-based postprocessing method, good results can only be ensured by a sufficiently large number of FDTD iterations. On a common computer platform, the total computation time will be very long. To overcome this difficulty, an excellent harmonic inversion algorithm called the filter diagonalization method (FDM) can be used in the postprocessing to reduce the number of FDTD iterations. However, the low efficiency of the FDM, which occurs when a relatively long time series is given, does not necessarily ensure an effective reduction of the total computation time. In this paper, a postprocessing method based on the FDM is proposed. The main procedure of the method is designed considering the aim to make the time spent on the method itself far less than the corresponding time spent on the FDTD iterations. To this end, the FDTD time series is preprocessed to be shortened significantly before the FDM frequency extraction. The preprocessing procedure is performed with the filter and decimation operations, which are widely used in narrow-band signal processing. Numerical results for a typical 3D solid PNC system show that the proposed postprocessing method can be used to effectively reduce the total computation time of the FDTD calculation of 3D phononic band structures.
Wavelet entropy of BOLD time series: An application to Rolandic epilepsy.

Science.gov (United States)

Gupta, Lalit; Jansen, Jacobus F A; Hofman, Paul A M; Besseling, René M H; de Louw, Anton J A; Aldenkamp, Albert P; Backes, Walter H

2017-12-01

To assess the wavelet entropy for the characterization of intrinsic aberrant temporal irregularities in the time series of resting-state blood-oxygen-level-dependent (BOLD) signal fluctuations. Further, to evaluate the temporal irregularities (disorder/order) on a voxel-by-voxel basis in the brains of children with Rolandic epilepsy. The BOLD time series was decomposed using the discrete wavelet transform and the wavelet entropy was calculated. Using a model time series consisting of multiple harmonics and nonstationary components, the wavelet entropy was compared with Shannon and spectral (Fourier-based) entropy. As an application, the wavelet entropy in 22 children with Rolandic epilepsy was compared to 22 age-matched healthy controls. The images were obtained by performing resting-state functional magnetic resonance imaging (fMRI) using a 3T system, an 8-element receive-only head coil, and an echo planar imaging pulse sequence ( T2*-weighted). The wavelet entropy was also compared to spectral entropy, regional homogeneity, and Shannon entropy. Wavelet entropy was found to identify the nonstationary components of the model time series. In Rolandic epilepsy patients, a significantly elevated wavelet entropy was observed relative to controls for the whole cerebrum (P = 0.03). Spectral entropy (P = 0.41), regional homogeneity (P = 0.52), and Shannon entropy (P = 0.32) did not reveal significant differences. The wavelet entropy measure appeared more sensitive to detect abnormalities in cerebral fluctuations represented by nonstationary effects in the BOLD time series than more conventional measures. This effect was observed in the model time series as well as in Rolandic epilepsy. These observations suggest that the brains of children with Rolandic epilepsy exhibit stronger nonstationary temporal signal fluctuations than controls. 2 Technical Efficacy: Stage 3 J. Magn. Reson. Imaging 2017;46:1728-1737. © 2017 International Society for Magnetic
Time Series Forecasting with Missing Values

OpenAIRE

Shin-Fu Wu; Chia-Yung Chang; Shie-Jue Lee

2015-01-01

Time series prediction has become more popular in various kinds of applications such as weather prediction, control engineering, financial analysis, industrial monitoring, etc. To deal with real-world problems, we are often faced with missing values in the data due to sensor malfunctions or human errors. Traditionally, the missing values are simply omitted or replaced by means of imputation methods. However, omitting those missing values may cause temporal discontinuity. Imputation methods, o...
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.

Science.gov (United States)

del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano

2015-06-17

Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation-based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking-to reduce the dimensions of images-and binarization-to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.

Detection of chaotic determinism in time series from randomly forced maps

Science.gov (United States)

Chon, K. H.; Kanters, J. K.; Cohen, R. J.; Holstein-Rathlou, N. H.

1997-01-01

Time series from biological system often display fluctuations in the measured variables. Much effort has been directed at determining whether this variability reflects deterministic chaos, or whether it is merely "noise". Despite this effort, it has been difficult to establish the presence of chaos in time series from biological sytems. The output from a biological system is probably the result of both its internal dynamics, and the input to the system from the surroundings. This implies that the system should be viewed as a mixed system with both stochastic and deterministic components. We present a method that appears to be useful in deciding whether determinism is present in a time series, and if this determinism has chaotic attributes, i.e., a positive characteristic exponent that leads to sensitivity to initial conditions. The method relies on fitting a nonlinear autoregressive model to the time series followed by an estimation of the characteristic exponents of the model over the observed probability distribution of states for the system. The method is tested by computer simulations, and applied to heart rate variability data.
Time-series modeling: applications to long-term finfish monitoring data

International Nuclear Information System (INIS)

Bireley, L.E.

1985-01-01

The growing concern and awareness that developed during the 1970's over the effects that industry had on the environment caused the electric utility industry in particular to develop monitoring programs. These programs generate long-term series of data that are not very amenable to classical normal-theory statistical analysis. The monitoring data collected from three finfish programs (impingement, trawl and seine) at the Millstone Nuclear Power Station were typical of such series and thus were used to develop methodology that used the full extent of the information in the series. The basis of the methodology was classic Box-Jenkins time-series modeling; however, the models also included deterministic components that involved flow, season and time as predictor variables. Time entered into the models as harmonic regression terms. Of the 32 models fitted to finfish catch data, 19 were found to account for more than 70% of the historical variation. The models were than used to forecast finfish catches a year in advance and comparisons were made to actual data. Usually the confidence intervals associated with the forecasts encompassed most of the observed data. The technique can provide the basis for intervention analysis in future impact assessments
Time Series Imputation via L1 Norm-Based Singular Spectrum Analysis

Science.gov (United States)

Kalantari, Mahdi; Yarmohammadi, Masoud; Hassani, Hossein; Silva, Emmanuel Sirimal

Missing values in time series data is a well-known and important problem which many researchers have studied extensively in various fields. In this paper, a new nonparametric approach for missing value imputation in time series is proposed. The main novelty of this research is applying the L1 norm-based version of Singular Spectrum Analysis (SSA), namely L1-SSA which is robust against outliers. The performance of the new imputation method has been compared with many other established methods. The comparison is done by applying them to various real and simulated time series. The obtained results confirm that the SSA-based methods, especially L1-SSA can provide better imputation in comparison to other methods.
Multiple Time Series Forecasting Using Quasi-Randomized Functional Link Neural Networks

Directory of Open Access Journals (Sweden)

Thierry Moudiki

2018-03-01

Full Text Available We are interested in obtaining forecasts for multiple time series, by taking into account the potential nonlinear relationships between their observations. For this purpose, we use a specific type of regression model on an augmented dataset of lagged time series. Our model is inspired by dynamic regression models (Pankratz 2012, with the response variable’s lags included as predictors, and is known as Random Vector Functional Link (RVFL neural networks. The RVFL neural networks have been successfully applied in the past, to solving regression and classification problems. The novelty of our approach is to apply an RVFL model to multivariate time series, under two separate regularization constraints on the regression parameters.
A novel time series link prediction method: Learning automata approach

Science.gov (United States)

Moradabadi, Behnaz; Meybodi, Mohammad Reza

2017-09-01

Link prediction is a main social network challenge that uses the network structure to predict future links. The common link prediction approaches to predict hidden links use a static graph representation where a snapshot of the network is analyzed to find hidden or future links. For example, similarity metric based link predictions are a common traditional approach that calculates the similarity metric for each non-connected link and sort the links based on their similarity metrics and label the links with higher similarity scores as the future links. Because people activities in social networks are dynamic and uncertainty, and the structure of the networks changes over time, using deterministic graphs for modeling and analysis of the social network may not be appropriate. In the time-series link prediction problem, the time series link occurrences are used to predict the future links In this paper, we propose a new time series link prediction based on learning automata. In the proposed algorithm for each link that must be predicted there is one learning automaton and each learning automaton tries to predict the existence or non-existence of the corresponding link. To predict the link occurrence in time T, there is a chain consists of stages 1 through T - 1 and the learning automaton passes from these stages to learn the existence or non-existence of the corresponding link. Our preliminary link prediction experiments with co-authorship and email networks have provided satisfactory results when time series link occurrences are considered.
Topological data analysis of financial time series: Landscapes of crashes

Science.gov (United States)

Gidea, Marian; Katz, Yuri

2018-02-01

We explore the evolution of daily returns of four major US stock market indices during the technology crash of 2000, and the financial crisis of 2007-2009. Our methodology is based on topological data analysis (TDA). We use persistence homology to detect and quantify topological patterns that appear in multidimensional time series. Using a sliding window, we extract time-dependent point cloud data sets, to which we associate a topological space. We detect transient loops that appear in this space, and we measure their persistence. This is encoded in real-valued functions referred to as a 'persistence landscapes'. We quantify the temporal changes in persistence landscapes via their Lp-norms. We test this procedure on multidimensional time series generated by various non-linear and non-equilibrium models. We find that, in the vicinity of financial meltdowns, the Lp-norms exhibit strong growth prior to the primary peak, which ascends during a crash. Remarkably, the average spectral density at low frequencies of the time series of Lp-norms of the persistence landscapes demonstrates a strong rising trend for 250 trading days prior to either dotcom crash on 03/10/2000, or to the Lehman bankruptcy on 09/15/2008. Our study suggests that TDA provides a new type of econometric analysis, which complements the standard statistical measures. The method can be used to detect early warning signals of imminent market crashes. We believe that this approach can be used beyond the analysis of financial time series presented here.
Effect of pre-processing on the physico-chemical properties of ...

African Journals Online (AJOL)

The findings indicated that the pre-processing treatments produced significant differences (p < 0.05) in protein (1.50 ± 0.18g/100g) and carbohydrate (1.09 ± 0.94g/100g) composition of the baking soda blanched milk sample. The viscosity of the baking soda blanched milk (18.91 ± 3.38cps) was significantly higher than that ...
Orthogonal feature selection method. [For preprocessing of man spectral data

Energy Technology Data Exchange (ETDEWEB)

Kowalski, B R [Univ. of Washington, Seattle; Bender, C F

1976-01-01

A new method of preprocessing spectral data for extraction of molecular structural information is desired. This SELECT method generates orthogonal features that are important for classification purposes and that also retain their identity to the original measurements. A brief introduction to chemical pattern recognition is presented. A brief description of the method and an application to mass spectral data analysis follow. (BLM)
Properties of Asymmetric Detrended Fluctuation Analysis in the time series of RR intervals

Science.gov (United States)

Piskorski, J.; Kosmider, M.; Mieszkowski, D.; Krauze, T.; Wykretowicz, A.; Guzik, P.

2018-02-01

Heart rate asymmetry is a phenomenon by which the accelerations and decelerations of heart rate behave differently, and this difference is consistent and unidirectional, i.e. in most of the analyzed recordings the inequalities have the same directions. So far, it has been established for variance and runs based types of descriptors of RR intervals time series. In this paper we apply the newly developed method of Asymmetric Detrended Fluctuation Analysis, which so far has mainly been used with economic time series, to the set of 420 stationary 30 min time series of RR intervals from young, healthy individuals aged between 20 and 40. This asymmetric approach introduces separate scaling exponents for rising and falling trends. We systematically study the presence of asymmetry in both global and local versions of this method. In this study global means "applying to the whole time series" and local means "applying to windows jumping along the recording". It is found that the correlation structure of the fluctuations left over after detrending in physiological time series shows strong asymmetric features in both magnitude, with α+ physiological data after shuffling or with a group of symmetric synthetic time series.
Improving the performance of streamflow forecasting model using data-preprocessing technique in Dungun River Basin

Science.gov (United States)

Khai Tiu, Ervin Shan; Huang, Yuk Feng; Ling, Lloyd

2018-03-01

An accurate streamflow forecasting model is important for the development of flood mitigation plan as to ensure sustainable development for a river basin. This study adopted Variational Mode Decomposition (VMD) data-preprocessing technique to process and denoise the rainfall data before putting into the Support Vector Machine (SVM) streamflow forecasting model in order to improve the performance of the selected model. Rainfall data and river water level data for the period of 1996-2016 were used for this purpose. Homogeneity tests (Standard Normal Homogeneity Test, the Buishand Range Test, the Pettitt Test and the Von Neumann Ratio Test) and normality tests (Shapiro-Wilk Test, Anderson-Darling Test, Lilliefors Test and Jarque-Bera Test) had been carried out on the rainfall series. Homogenous and non-normally distributed data were found in all the stations, respectively. From the recorded rainfall data, it was observed that Dungun River Basin possessed higher monthly rainfall from November to February, which was during the Northeast Monsoon. Thus, the monthly and seasonal rainfall series of this monsoon would be the main focus for this research as floods usually happen during the Northeast Monsoon period. The predicted water levels from SVM model were assessed with the observed water level using non-parametric statistical tests (Biased Method, Kendall's Tau B Test and Spearman's Rho Test).
A New Modified Histogram Matching Normalization for Time Series Microarray Analysis

Directory of Open Access Journals (Sweden)

Laura Astola

2014-07-01

Full Text Available Microarray data is often utilized in inferring regulatory networks. Quantile normalization (QN is a popular method to reduce array-to-array variation. We show that in the context of time series measurements QN may not be the best choice for this task, especially not if the inference is based on continuous time ODE model. We propose an alternative normalization method that is better suited for network inference from time series data.
AFSC/ABL: Ugashik sockeye salmon scale time series

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — A time series of scale samples (1956 b?? 2002) collected from adult sockeye salmon returning to Ugashik River were retrieved from the Alaska Department of Fish and...
AFSC/ABL: Naknek sockeye salmon scale time series

Data.gov (United States)

National Oceanic and Atmospheric Administration, Department of Commerce — A time series of scale samples (1956 2002) collected from adult sockeye salmon returning to Naknek River were retrieved from the Alaska Department of Fish and Game....
Time series regression model for infectious disease and weather.

Science.gov (United States)

Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro

2015-10-01

Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Assessing Coupling Dynamics from an Ensemble of Time Series

Directory of Open Access Journals (Sweden)

Germán Gómez-Herrero

2015-04-01

Full Text Available Finding interdependency relations between time series provides valuable knowledge about the processes that generated the signals. Information theory sets a natural framework for important classes of statistical dependencies. However, a reliable estimation from information-theoretic functionals is hampered when the dependency to be assessed is brief or evolves in time. Here, we show that these limitations can be partly alleviated when we have access to an ensemble of independent repetitions of the time series. In particular, we gear a data-efficient estimator of probability densities to make use of the full structure of trial-based measures. By doing so, we can obtain time-resolved estimates for a family of entropy combinations (including mutual information, transfer entropy and their conditional counterparts, which are more accurate than the simple average of individual estimates over trials. We show with simulated and real data generated by coupled electronic circuits that the proposed approach allows one to recover the time-resolved dynamics of the coupling between different subsystems.
Wavelet transform approach for fitting financial time series data

Science.gov (United States)

Ahmed, Amel Abdoullah; Ismail, Mohd Tahir

2015-10-01

This study investigates a newly developed technique; a combined wavelet filtering and VEC model, to study the dynamic relationship among financial time series. Wavelet filter has been used to annihilate noise data in daily data set of NASDAQ stock market of US, and three stock markets of Middle East and North Africa (MENA) region, namely, Egypt, Jordan, and Istanbul. The data covered is from 6/29/2001 to 5/5/2009. After that, the returns of generated series by wavelet filter and original series are analyzed by cointegration test and VEC model. The results show that the cointegration test affirms the existence of cointegration between the studied series, and there is a long-term relationship between the US, stock markets and MENA stock markets. A comparison between the proposed model and traditional model demonstrates that, the proposed model (DWT with VEC model) outperforms traditional model (VEC model) to fit the financial stock markets series well, and shows real information about these relationships among the stock markets.
Evolution of the Sunspot Number and Solar Wind B Time Series

Science.gov (United States)

Cliver, Edward W.; Herbst, Konstantin

2018-03-01

The past two decades have witnessed significant changes in our knowledge of long-term solar and solar wind activity. The sunspot number time series (1700-present) developed by Rudolf Wolf during the second half of the 19th century was revised and extended by the group sunspot number series (1610-1995) of Hoyt and Schatten during the 1990s. The group sunspot number is significantly lower than the Wolf series before ˜1885. An effort from 2011-2015 to understand and remove differences between these two series via a series of workshops had the unintended consequence of prompting several alternative constructions of the sunspot number. Thus it has been necessary to expand and extend the sunspot number reconciliation process. On the solar wind side, after a decade of controversy, an ISSI International Team used geomagnetic and sunspot data to obtain a high-confidence time series of the solar wind magnetic field strength (B) from 1750-present that can be compared with two independent long-term (> ˜600 year) series of annual B-values based on cosmogenic nuclides. In this paper, we trace the twists and turns leading to our current understanding of long-term solar and solar wind activity.
A Phenology-Based Classification of Time-Series MODIS Data for Rice Crop Monitoring in Mekong Delta, Vietnam

Directory of Open Access Journals (Sweden)

Nguyen-Thanh Son

2013-12-01

Full Text Available Rice crop monitoring is an important activity for crop management. This study aimed to develop a phenology-based classification approach for the assessment of rice cropping systems in Mekong Delta, Vietnam, using Moderate Resolution Imaging Spectroradiometer (MODIS data. The data were processed from December 2000, to December 2012, using empirical mode decomposition (EMD in three main steps: (1 data pre-processing to construct the smooth MODIS enhanced vegetation index (EVI time-series data; (2 rice crop classification; and (3 accuracy assessment. The comparisons between the classification maps and the ground reference data indicated overall accuracies and Kappa coefficients, respectively, of 81.4% and 0.75 for 2002, 80.6% and 0.74 for 2006 and 85.5% and 0.81 for 2012. The results by comparisons between MODIS-derived rice area and rice area statistics were slightly overestimated, with a relative error in area (REA from 0.9–15.9%. There was, however, a close correlation between the two datasets (R2 ≥ 0.89. From 2001 to 2012, the areas of triple-cropped rice increased approximately 31.6%, while those of the single-cropped rain-fed rice, double-cropped irrigated rice and double-cropped rain-fed rice decreased roughly −5.0%, −19.2% and −7.4%, respectively. This study demonstrates the validity of such an approach for rice-crop monitoring with MODIS data and could be transferable to other regions.
Time irreversibility and intrinsics revealing of series with complex network approach

Science.gov (United States)

Xiong, Hui; Shang, Pengjian; Xia, Jianan; Wang, Jing

2018-06-01

In this work, we analyze time series on the basis of the visibility graph algorithm that maps the original series into a graph. By taking into account the all-round information carried by the signals, the time irreversibility and fractal behavior of series are evaluated from a complex network perspective, and considered signals are further classified from different aspects. The reliability of the proposed analysis is supported by numerical simulations on synthesized uncorrelated random noise, short-term correlated chaotic systems and long-term correlated fractal processes, and by the empirical analysis on daily closing prices of eleven worldwide stock indices. Obtained results suggest that finite size has a significant effect on the evaluation, and that there might be no direct relation between the time irreversibility and long-range correlation of series. Similarity and dissimilarity between stock indices are also indicated from respective regional and global perspectives, showing the existence of multiple features of underlying systems.
Data pre-processing: a case study in predicting student's retention in ...

African Journals Online (AJOL)

dataset with features that are ready for data mining task. The study also proposed a process model and suggestions, which can be applied to support more comprehensible tools for educational domain who is the end user. Subsequently, the data pre-processing become more efficient for predicting student's retention in ...

Some links on this page may take you to non-federal websites. Their policies may differ from this site.