Identifying Patterns in Time Series Data - Trend Analysis
There are no proven "automatic" techniques for identifying trend components in time series data; however, as long as the trend is monotonic (consistently increasing or decreasing), that part of the analysis is typically not very difficult. If the time series data contain considerable error, then the first step in the process of trend identification is smoothing.
- Smoothing
- Smoothing always involves some form of local averaging of data such that the nonsystematic components of individual observations cancel each other out. The most common technique is moving average smoothing, which replaces each element of the series by either the simple or weighted average of n surrounding elements, where n is the width of the smoothing "window" (see Box & Jenkins, 1976; Velleman & Hoaglin, 1981). Medians can be used instead of means. The main advantage of median smoothing, as compared to moving average smoothing, is that its results are less biased by outliers (within the smoothing window). Thus, if there are outliers in the data (e.g., due to measurement errors), median smoothing typically produces smoother or at least more "reliable" curves than a moving average based on the same window width. The main disadvantages of median smoothing are that, in the absence of clear outliers, it may produce more "jagged" curves than a moving average, and that it does not allow for weighting.
All of these techniques are included among the interactive transformations in the Time Series module. In the less common cases (for time series data) where the measurement error is very large, the Distance weighted LS smoothing or Negative expon Weighted LS smoothing techniques in the Fit group box on the Quick tab of the 3D Surface Plots dialog can be used. All of these methods filter out the noise and convert the data into a smooth curve that is relatively unbiased by outliers.
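The moving average and median smoothing described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the Time Series module's implementation: the function names, the centered window, the truncation of the window at the series edges, and the sample data are all assumptions made for the example.

```python
from statistics import mean, median


def moving_smooth(series, window, stat=mean):
    """Replace each element by `stat` of the `window` surrounding elements.

    Assumption: the window is centered on each element and is simply
    truncated where it would run past either end of the series.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(stat(series[lo:hi]))
    return smoothed


def weighted_moving_average(series, weights):
    """Centered weighted moving average with an odd-length weight vector.

    Assumption: at the edges, only the weights that fall inside the
    series are used, and they are renormalized to sum to one.
    """
    if len(weights) % 2 == 0:
        raise ValueError("weights must have odd length")
    half = len(weights) // 2
    smoothed = []
    for i in range(len(series)):
        num = 0.0
        den = 0.0
        for k, w in enumerate(weights):
            j = i + k - half
            if 0 <= j < len(series):
                num += w * series[j]
                den += w
        smoothed.append(num / den)
    return smoothed


# Hypothetical series: a linear trend with one gross outlier at index 4.
data = [1.0, 2.0, 3.0, 4.0, 50.0, 6.0, 7.0, 8.0, 9.0]

mean_smoothed = moving_smooth(data, window=3, stat=mean)
median_smoothed = moving_smooth(data, window=3, stat=median)
weighted_smoothed = weighted_moving_average(data, weights=[1.0, 2.0, 1.0])
```

With this sample series, the single outlier pulls up three consecutive values of the moving average, while the median-smoothed series stays close to the underlying trend. That is the outlier robustness discussed above, at the cost of not supporting weights.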
- Fitting a function
- Many monotonic time series can be adequately approximated by a linear function; if there is a clear monotonic nonlinear component, the data first need to be transformed to remove the nonlinearity. Usually a logarithmic, exponential, or (less often) polynomial function can be used. There are several ways to do this in STATISTICA. You can experiment with transformations of unlimited complexity using spreadsheet formulas, and later submit the transformed series to linear regression (either via the Multiple Regression or the Time Series module) and generate forecasts (in Multiple Regression). Nonlinear functions of practically unlimited complexity, including piecewise estimations with break points (where different functions can be simultaneously fitted to different ranges of the series), can be fitted with the Nonlinear Estimation module. Finally, STATISTICA includes general purpose curve fitting procedures that can be used to fit polynomial functions (of user-specified order), logarithmic functions (with user-specified bases), exponential functions, and other functions.
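The transform-then-fit approach described above can be illustrated with a short Python sketch. It is not a reproduction of any STATISTICA procedure: the helper `fit_line`, the noise-free exponential sample series, and the one-step forecast are assumptions chosen to keep the example small. The series is log-transformed to remove the nonlinearity, a straight line is fitted by ordinary least squares, and the fit is transformed back to the original scale.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monotonic, nonlinear series: y_t = 2 * 1.5**t (no noise).
t = list(range(8))
y = [2.0 * 1.5 ** k for k in t]

# Step 1: the log transform turns exponential growth into a linear trend.
log_y = [math.log(v) for v in y]

# Step 2: fit a straight line to the transformed series.
slope, intercept = fit_line(t, log_y)

# Step 3: back-transform the coefficients and forecast the next point.
growth = math.exp(slope)       # estimated per-period growth factor
level = math.exp(intercept)    # estimated value at t = 0
forecast = level * growth ** 8
```

On real, noisy data the back-transformed forecast is only approximate, and the choice of transformation (logarithmic, exponential, or polynomial) should be checked against a plot of the transformed series.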
See also, Exploratory Data Analysis and Data Mining Techniques.