Spectrum Analysis Basic Notation and Principles - The General Structural Model
As mentioned before, the purpose of spectrum analysis is to decompose the original series into underlying sine and cosine functions of different frequencies, in order to determine which of them appear particularly strong or important. One way to do so is to cast the issue as a linear Multiple Regression problem, where the dependent variable is the observed time series, and the independent variables are the sine and cosine functions of all possible (discrete) frequencies. Such a linear multiple regression model may be written as:
x_t = a_0 + Σ_{k=1}^{q} [a_k*cos(λ_k*t) + b_k*sin(λ_k*t)]
Following the common notation from classical harmonic analysis, λ_k (lambda) in this equation is the frequency expressed in radians per unit time, that is, λ_k = 2*π*ν_k, where π is the constant pi = 3.1415... and ν_k = k/N is the frequency in cycles per unit time (N being the number of data points in the series). What is important here is to recognize that the computational problem of fitting sine and cosine functions of different wavelengths to the data can be treated as a multiple linear regression problem. Note that the cosine parameters a_k and sine parameters b_k are regression coefficients that tell us the degree to which the respective functions are correlated with the data. Overall there are q different frequencies, each contributing a sine and a cosine function; intuitively (as also discussed in Multiple Regression), it should be clear that we cannot have more sine and cosine functions than there are data points in the series. Without going into detail, if there are N data points in the series (N even), then q = N/2, and there will be N/2+1 cosine functions (counting the constant a_0 as the cosine of frequency zero) and N/2-1 sine functions (the sine at the highest frequency is zero at every observation and drops out). In other words, there will be as many different sinusoidal waves as there are data points, and we will be able to completely reproduce the series from the underlying functions. (Note that if the number of cases in the series is odd, then the last data point will usually be ignored; in order for a sinusoidal function to be identified, you need at least two points: the high peak and the low peak.)
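The regression view described above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the function name fit_sinusoids and the random test series are hypothetical, the frequencies are taken as k/N with q = N/2, and an odd-length series simply loses its last point. With N/2+1 cosine columns (including the constant) and N/2-1 sine columns, the design matrix is square and the fitted values reproduce the series exactly.

```python
import numpy as np

def fit_sinusoids(x):
    """Least-squares fit of x_t = a_0 + sum_k [a_k cos(l_k t) + b_k sin(l_k t)].

    Hypothetical helper for illustration; assumes frequencies k/N, k = 1..N/2.
    """
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                      # odd length: ignore the last data point
        x = x[:-1]
    n = len(x)
    t = np.arange(n)
    q = n // 2
    columns = [np.ones(n)]              # constant term a_0 (cosine of frequency 0)
    labels = [("cos", 0)]
    for k in range(1, q + 1):
        lam = 2 * np.pi * k / n         # frequency in radians per unit time
        columns.append(np.cos(lam * t))
        labels.append(("cos", k))
        if k < q:                       # sin(pi * t) is zero at every integer t
            columns.append(np.sin(lam * t))
            labels.append(("sin", k))
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return design, coef, labels, x

# Usage: a short random series is reproduced exactly by its N sinusoids.
rng = np.random.default_rng(1)
series = rng.normal(size=16)
design, coef, labels, used = fit_sinusoids(series)
max_error = np.max(np.abs(design @ coef - used))
```

Because the columns are mutually orthogonal at these frequencies, a general-purpose least-squares solver is more machinery than strictly needed; it is used here only to keep the link to multiple regression explicit.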
To summarize, spectrum analysis will identify the correlation of sine and cosine functions of different frequencies with the observed data. If a large correlation (sine or cosine coefficient) is identified at some frequency, one can conclude that there is a strong periodicity of that frequency (or, equivalently, of the corresponding period) in the data.
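As a small illustration of this summary, the sketch below builds an artificial series with a known 12-point cycle plus noise and then looks for the frequency with the largest sine and cosine coefficients. The series, the noise level, and the helper name sinusoid_coefficients are all assumptions made for the example; the coefficients are computed by direct projection onto each sinusoid, which matches the regression estimates when N is even and the frequencies are k/N.

```python
import numpy as np

def sinusoid_coefficients(x):
    """Cosine (a_k) and sine (b_k) coefficients for frequencies k/N, 0 < k < N/2.

    Hypothetical helper; assumes an even number of equally spaced observations.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(n)
    ks = np.arange(1, n // 2)
    lam = 2 * np.pi * ks / n
    a = 2.0 / n * (np.cos(np.outer(lam, t)) @ x)
    b = 2.0 / n * (np.sin(np.outer(lam, t)) @ x)
    return ks, a, b

# Artificial data: one cycle every 12 observations, plus random noise.
rng = np.random.default_rng(0)
n = 48
t = np.arange(n)
series = 3.0 * np.cos(2 * np.pi * t / 12) + rng.normal(scale=0.5, size=n)

ks, a, b = sinusoid_coefficients(series)
amplitude = np.hypot(a, b)              # combined strength of sine and cosine
k_best = int(ks[np.argmax(amplitude)])  # index of the strongest frequency
period = n / k_best                     # corresponding period in observations
print(f"strongest frequency: {k_best}/{n} cycles per observation, period {period}")
```

Combining a_k and b_k into a single amplitude is a convenience of this sketch: a cycle that is shifted in time spreads its strength across both coefficients, so looking at either one alone could understate it.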
Complex numbers (real and imaginary numbers)
In many textbooks on spectrum analysis, the structural model shown above is presented in terms of complex numbers, that is, the parameter estimation process is described in terms of the Fourier transform of a series into real and imaginary parts. Complex numbers are the superset that includes all real and imaginary numbers. Imaginary numbers, by definition, are real numbers multiplied by the constant i, where i is defined as the square root of -1. The square root of -1 does not exist among the real numbers, hence the term imaginary number; however, meaningful arithmetic operations on imaginary numbers can still be performed (e.g., (i*2)^2 = -4). It is useful to think of real and imaginary numbers as forming a two-dimensional plane, where the horizontal or x-axis represents all real numbers, and the vertical or y-axis represents all imaginary numbers. Complex numbers can then be represented as points in this plane. For example, the complex number 3+i*2 can be represented by a point with coordinates {3,2}. You can also describe a complex number by an angle and a length: connect the point representing the complex number with the origin (complex number 0+i*0), and measure the length of that vector and its angle to the horizontal axis. The real part is then the length times the cosine of the angle, and the imaginary part is the length times the sine of the angle. Thus, you can see intuitively how the spectrum decomposition formula shown above, consisting of sine and cosine functions, can be rewritten in terms of operations on complex numbers. In fact, the mathematical discussion and the required computations are often more elegant and easier to perform in this form, which is why many textbooks prefer to present spectrum analysis in terms of complex numbers.
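The connection between complex numbers and the sine and cosine coefficients can be checked numerically. The sketch below uses Python's built-in complex type together with NumPy's np.fft.rfft; the example values (the number 3+i*2, the angle 0.7, and the two-component test series) are arbitrary choices for illustration. It assumes the usual transform convention with a negative exponent, under which the cosine coefficient is 2/N times the real part and the sine coefficient is -2/N times the imaginary part for frequencies strictly between 0 and N/2.

```python
import cmath
import numpy as np

# Arithmetic on complex numbers: the point {3, 2} in the plane.
z = 3 + 2j
squared = (2j) ** 2                 # (i*2)^2
angle = cmath.phase(z)              # angle to the real axis, in radians
length = abs(z)                     # distance from the origin

# A point on the unit circle has real part cos and imaginary part sin.
lam = 0.7
on_circle = cmath.exp(1j * lam)
from_trig = complex(np.cos(lam), np.sin(lam))

# Sine and cosine coefficients recovered from the complex Fourier transform.
n = 16
t = np.arange(n)
x = 1.5 * np.cos(2 * np.pi * 3 * t / n) + 0.5 * np.sin(2 * np.pi * 5 * t / n)
X = np.fft.rfft(x)                  # complex values for k = 0 .. n/2
a = 2.0 / n * X.real                # cosine coefficients a_k (valid for 0 < k < n/2)
b = -2.0 / n * X.imag               # sine coefficients b_k (valid for 0 < k < n/2)
```

In this example a[3] and b[5] come out at approximately 1.5 and 0.5, the amplitudes used to build the series, while the other coefficients are numerically zero.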