
STATISTICAL FOURIER ANALYSIS: CLARIFICATIONS AND INTERPRETATIONS

by D.S.G. Pollock (University of Leicester)
Email: stephen [email protected]

This paper expounds some of the results of Fourier theory that are essential to the statistical analysis of time series. It employs the algebra of circulant matrices to expose the structure of the discrete Fourier transform and to elucidate the filtering operations that may be applied to finite data sequences. An ideal filter with a gain of unity throughout the pass band and a gain of zero throughout the stop band is commonly regarded as incapable of being realised in finite samples. It is shown here that, to the contrary, such a filter can be realised both in the time domain and in the frequency domain. The algebra of circulant matrices is also helpful in revealing the nature of statistical processes that are band limited in the frequency domain. In order to apply the conventional techniques of autoregressive moving-average modelling, the data generated by such processes must be subjected to anti-aliasing filtering and subsampling. These techniques are also described. It is argued that band-limited processes are more prevalent in statistical and econometric time series than is commonly recognised.


D.S.G. POLLOCK: Statistical Fourier Analysis

1. Introduction

Statistical Fourier analysis is an important part of modern time-series analysis, yet it frequently poses an impediment that prevents a full understanding of temporal stochastic processes and of the manipulations to which their data are amenable. This paper provides a survey of the theory that is not overburdened by inessential complications, and it addresses some enduring misapprehensions.

Amongst these is a misunderstanding of the effects in the frequency domain of linear filtering operations. It is commonly maintained, for example, that an ideal frequency-selective filter that preserves some of the elements of a time series whilst nullifying all others is incapable of being realised in finite samples. This paper shows that such finite-sample filters are readily available and that they possess time-domain and frequency-domain representations that are both tractable. Their representations are directly related to the classical Wiener–Kolmogorov theory of filtering, which presupposes that the data can be treated as if they form a doubly-infinite or a semi-infinite sequence.

A related issue, which the paper aims to clarify, concerns the nature of continuous stochastic processes that are band-limited in frequency. The paper provides a model of such processes that challenges the common supposition that the natural primum mobile of any continuous-time stochastic process is a Wiener process, which consists of a stream of infinitesimal impulses. The sampling of a Wiener process is inevitably accompanied by an aliasing effect, whereby elements of high frequencies are confounded with elements of frequencies that fall within the range that is observable in sampling. Here, it is shown, to the contrary, that there exist simple models of continuous band-limited processes for which no aliasing need occur in sampling.

The spectral theory of time series is a case of a non-canonical Fourier theory.
It has some peculiarities that originally caused considerable analytic difficulties, which were first overcome in a satisfactory manner by Norbert Wiener (1930) via his theory of a generalised harmonic analysis. The statistical filtering theory that is appropriate to such time series originated with Wiener (1941) and Kolmogorov (1941a).

Many of the difficulties of the spectral representation of a time series can be avoided by concentrating exclusively on its correlation structure. Then, one can exploit a simple canonical Fourier relationship that exists between the autocovariance function of a discrete stationary process and its spectral density function. This relationship exists by virtue of an inverted form of the classical Fourier series relationship between a continuous, or piecewise continuous, periodic function and its transform, which is a sequence of Fourier coefficients. (In the inverted form of the relationship, which is described as the discrete-time Fourier transform, it is the sequence that is the primary function and the continuous periodic function that is its transform.) However, it is important to have a mathematical model of the process itself, and this is where some of the complications arise.

Figure 1 depicts what may be described as the canonical Fourier transforms. For conciseness, they have been expressed using complex exponentials,

whereas they might be more intelligible when expressed in terms of ordinary trigonometrical functions. When dealing with the latter, only positive frequencies are considered, whereas complex exponentials are defined for both positive and negative frequencies. In the diagrams, the real-valued time-domain functions are symmetric, with the consequence that their frequency-domain transforms are also real-valued and symmetric. Under other conditions, the transforms would be complex-valued, giving rise to a pair of functions.

2. Canonical Fourier Analysis

The classical Fourier series, illustrated in Figure 1.(ii), expresses a continuous, or piecewise continuous, periodic function x(t) = x(t + T) in terms of a sum of sine and cosine functions of harmonically increasing frequencies {ωj = 2πj/T; j = 1, 2, 3, . . .}:

    x(t) = α0 + Σ_{j=1}^∞ {αj cos(ωj t) + βj sin(ωj t)}
         = α0 + Σ_{j=1}^∞ ρj cos(ωj t − θj).                                        (1)

Here, ω1 = 2π/T is the fundamental frequency or angular velocity, which corresponds to a trigonometric function that completes a single cycle in the interval [0, T], or in any other interval of length T such as [−T/2, T/2]. The second expression depends upon the definitions

    ρj^2 = αj^2 + βj^2   and   θj = tan^{−1}(βj/αj).                                (2)

The equality follows from the trigonometrical identity

    cos(A − B) = cos(A)cos(B) + sin(A)sin(B).                                       (3)
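The conversion in (2), resting on the identity (3), is easy to confirm numerically. The following sketch, assuming NumPy and arbitrary illustrative values of the coefficients, uses arctan2 in place of tan^{−1} so that the quadrant of θj is resolved correctly even when αj is negative:

```python
import numpy as np

# Check the amplitude-phase conversion of (2): for a single harmonic,
# alpha*cos(wt) + beta*sin(wt) equals rho*cos(wt - theta), where
# rho^2 = alpha^2 + beta^2 and theta = arctan(beta/alpha).
alpha, beta, omega = 1.5, -0.8, 2 * np.pi / 16.0   # illustrative values

rho = np.hypot(alpha, beta)
theta = np.arctan2(beta, alpha)    # arctan2 resolves the quadrant safely

t = np.linspace(0.0, 16.0, 1001)
lhs = alpha * np.cos(omega * t) + beta * np.sin(omega * t)
rhs = rho * np.cos(omega * t - theta)

assert np.allclose(lhs, rhs)
```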

In representing the periodic nature of x(t), it is helpful to consider mapping the interval [0, T], over which the function is defined, onto the circumference of a circle, such that the end points of the interval coincide. Then, successive laps or circuits of the circle will generate successive cycles of the function.

According to Euler's equations, there are

    cos(ωj t) = (1/2)(e^{iωj t} + e^{−iωj t})   and   sin(ωj t) = (−i/2)(e^{iωj t} − e^{−iωj t}).    (4)

Therefore, equation (1) can be expressed as

    x(t) = α0 + Σ_{j=1}^∞ {(αj + iβj)/2} e^{−iωj t} + Σ_{j=1}^∞ {(αj − iβj)/2} e^{iωj t},            (5)

which can be written concisely as

    x(t) = Σ_{j=−∞}^∞ ξj e^{iωj t},                                                                  (6)

(i) The Fourier integral:

    x(t) = (1/2π) ∫_{−∞}^{∞} ξ(ω)e^{iωt} dω   ←→   ξ(ω) = ∫_{−∞}^{∞} x(t)e^{−iωt} dt

(ii) The classical Fourier series:

    x(t) = Σ_{j=−∞}^∞ ξj e^{iωj t}   ←→   ξj = (1/T) ∫_0^T x(t)e^{−iωj t} dt

(iii) The discrete-time Fourier transform:

    xt = (1/2π) ∫_{−π}^{π} ξ(ω)e^{iωt} dω   ←→   ξ(ω) = Σ_{t=−∞}^∞ xt e^{−iωt}

(iv) The discrete Fourier transform:

    xt = Σ_{j=0}^{T−1} ξj e^{iωj t}   ←→   ξj = (1/T) Σ_{t=0}^{T−1} xt e^{−iωj t}

Figure 1. The classes of the Fourier transforms.

where

    ξ0 = α0,   ξj = (αj − iβj)/2   and   ξ−j = ξj* = (αj + iβj)/2.                   (7)

The inverse of the classical transform of (6) is demonstrated by writing

    ξj = (1/T) ∫_0^T x(t)e^{−iωj t} dt = (1/T) ∫_0^T { Σ_{k=−∞}^∞ ξk e^{iωk t} } e^{−iωj t} dt
       = Σ_{k=−∞}^∞ ξk { (1/T) ∫_0^T e^{i(ωk − ωj)t} dt } = ξj,                      (8)

where the final equality follows from an orthogonality condition in the form of

    ∫_0^T e^{i(ωk − ωj)t} dt = 0, if j ≠ k;   = T, if j = k.                         (9)
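The orthogonality condition (9) has an exact discrete analogue at the Fourier frequencies ωj = 2πj/T, with the integral replaced by a sum over one period. A minimal sketch, assuming NumPy and an illustrative sample size T = 16:

```python
import numpy as np

# Discrete analogue of the orthogonality condition (9): over one period,
# the sum of e^{i(w_k - w_j)t} vanishes unless j = k, when it equals T.
T = 16
t = np.arange(T)

for j in range(T):
    for k in range(T):
        s = np.exp(1j * 2 * np.pi * (k - j) * t / T).sum()
        expected = T if j == k else 0.0
        assert abs(s - expected) < 1e-9
```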

The relationship between the continuous periodic function and its Fourier transform can be summarised by writing

    x(t) = Σ_{j=−∞}^∞ ξj e^{iωj t}   ←→   ξj = (1/T) ∫_0^T x(t)e^{−iωj t} dt.        (10)
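The behaviour of the partial sums of the series in (10) at a jump discontinuity can be examined numerically before the convergence questions are taken up. A sketch, assuming NumPy and a square wave with unit half-amplitude (so that the jump is of size 2), which exhibits the overshoot of roughly 9% of the jump that is discussed below as Gibbs' phenomenon, together with a check of the energy identity:

```python
import numpy as np

# Partial sums of the Fourier series of a square wave jumping between
# -1 and +1:  f(t) = (4/pi) * sum over odd k of sin(k t)/k.  Near the
# jump at t = 0 the partial sums overshoot by about 9% of the jump,
# however many terms are taken (Gibbs' phenomenon).
t = np.linspace(0.001, 0.5, 4000)     # a window just to the right of the jump

def partial_sum(n_terms):
    k = np.arange(1, 2 * n_terms, 2)  # odd harmonics 1, 3, 5, ...
    return (4 / np.pi) * np.sin(np.outer(t, k)).dot(1.0 / k)

jump = 2.0
for n in (30, 100, 300):
    overshoot = partial_sum(n).max() - 1.0
    assert abs(overshoot / jump - 0.0895) < 0.01   # ~9% of the jump

# Energy check in the spirit of Parseval: the mean square of the wave is 1,
# matched by (1/2) * sum of the squared amplitudes rho_k = 4/(pi k).
rho = 4.0 / (np.pi * np.arange(1, 20000, 2))
assert abs(0.5 * (rho ** 2).sum() - 1.0) < 1e-3
```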

The question of the conditions that are sufficient for the existence of such a Fourier relationship is an essential one; and there are a variety of Fourier theories. The theory of Fourier series is concerned with establishing the conditions under which the partial sums converge to the function, in some specified sense, as the number of terms increases. At the simplest level, it is sufficient for convergence that x(t) should be continuous and bounded in the interval [0, T]. However, the classical theory of Fourier series is concerned with the existence of the relationship in the case where x(t) is bounded but is also permitted to have a finite number of maxima and minima and a finite number of jump discontinuities. It can be shown that, in that case, as successive terms are added, the Fourier series of (1) converges to

    (1/2){x(t + 0) + x(t − 0)},                                                      (11)

where x(t + 0) is the value as t is approached from the right and x(t − 0) is the value as it is approached from the left. If x(t) is continuous at the point in question, then the Fourier series converges to x(t). Here, the relevant criterion is that of overall convergence in mean-square rather than of pointwise convergence.

The question of the mode of convergence was the issue at stake in the paradox known as Gibbs' phenomenon, which concerns the absence of pointwise convergence in the presence of jump discontinuities. This phenomenon is illustrated in Figure 2, where it is apparent that not all of the oscillations in the partial sums that approximate a periodic square wave (i.e. a time-domain rectangle) are decreasing at a uniform rate as the

Figure 2. The Fourier approximation of a square wave. The diagram shows approximations to the positive half of a rectangle defined on the interval [−T/2, T/2] and symmetric with respect to the vertical axis through zero. The partial sums are shown for n = 3, n = 6 and n = 30 terms.

number of terms n increases. Instead, the oscillations that are adjacent to the point of discontinuity are tending to a limiting amplitude, which is about 9% of the jump. However, as n increases, the width of these end-oscillations becomes vanishingly small; and thus the mean-square convergence of the Fourier series is assured. Gibbs' phenomenon is analysed in detail by Carslaw (1930); and Carslaw (1925) has also recounted the history of its discovery.

Parseval's theorem asserts that, under the stated conditions, which guarantee mean-square convergence, there is

    (1/T) ∫_0^T |x(t)|^2 dt = α0^2 + (1/2) Σ_{j=1}^∞ (αj^2 + βj^2) = Σ_{j=−∞}^∞ |ξj|^2,    (12)

where |ξj|^2 = ξj ξj* = ρj^2/4. This indicates that the energy of the function x(t), which becomes its average power if we divide by T, is equal to the sum of the energies of the sinusoidal components. Here, it is essential that the square integral converges, which is to say that the function possesses a finite energy. The fulfilment of such an energy or power condition characterises what we have chosen to describe as the canonical Fourier transforms.

There are some powerful symmetries between the two domains of the Fourier transforms. The discrete-time Fourier transform, which is fundamental to time-series analysis, is obtained by interchanging the two domains of the classical Fourier series transform. It effects the transformation of a square-summable sequence x(t) = {xt; t = 0, ±1, ±2, . . .} within the time domain

into a continuous periodic function in the frequency domain via an integral expression that is mean-square convergent. It is illustrated in Figure 1.(iii). We may express the relationship in question by writing

    xt = (1/2π) ∫_{−π}^{π} ξ(ω)e^{iωt} dω   ←→   ξ(ω) = Σ_{t=−∞}^∞ xt e^{−iωt}.      (13)

The periodicity is now in the frequency domain, such that ξ(ω) = ξ(ω + 2π) for all ω. The periodic function completes a single cycle in any interval of length 2π; and, sometimes, it may be appropriate to define the function over the interval [0, 2π] instead of the interval [−π, π]. In that case, the relationship of (13) will be unaffected; and its correspondence with (10), which defines the classical Fourier series, is clarified.

There remain two other Fourier transforms, which embody a complete symmetry between the two domains. The first of these is the Fourier integral transform of Figure 1.(i), which is denoted by

    x(t) = (1/2π) ∫_{−∞}^{∞} ξ(ω)e^{iωt} dω   ←→   ξ(ω) = ∫_{−∞}^{∞} x(t)e^{−iωt} dt.    (14)

Here, apart from the scaling, which can be revised to achieve symmetry, there is symmetry between the transform and its inverse. (The factor 1/2π can be eliminated from the first integral by changing the variable from angular velocity ω, measured in radians per period, to frequency f = ω/2π, measured in cycles per period.)

One of the surprises on first encountering the Fourier integral is the aperiodic nature of the transform, which is in spite of the cyclical nature of the constituent sinusoidal elements. However, this is readily explained. Consider the sum of two elements within a Fourier synthesis that are at the frequencies ωm and ωn respectively. Then, their sum will constitute a periodic function of period τ = 2π/ω∗ if and only if ωm = mω∗ and ωn = nω∗ are integer multiples of a common frequency ω∗. For a sum of many sinusoidal elements to constitute a periodic function, it is necessary and sufficient that all of the corresponding frequencies should be integer multiples of a fundamental frequency. Whereas this condition is fulfilled by the classical Fourier series of (10), it cannot be fulfilled by a Fourier integral comprising frequencies that are arbitrary irrational numbers.

A sequence that has been sampled at the integer time points from a continuous aperiodic function that is square-integrable will have a transform that is a periodic function. This outcome is manifest in the discrete-time Fourier transform, where it can be seen that the period in question has the length of 2π radians. Let ξ(ω) be the transform of the continuous aperiodic function, and let ξS(ω) be the transform of the sampled sequence {xt; t = 0, ±1, ±2, . . .}. Then

    xt = (1/2π) ∫_{−∞}^{∞} ξ(ω)e^{iωt} dω = (1/2π) ∫_{−π}^{π} ξS(ω)e^{iωt} dω.       (15)

Figure 3. The sinc function wave-packet ψ0(t) = sin(πt)/πt, comprising frequencies in the interval [0, π].

The equality of the two integrals implies that

    ξS(ω) = Σ_{j=−∞}^∞ ξ(ω + 2πj).                                                   (16)
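The wrapping in (16) can be illustrated numerically. The following sketch is an assumption-laden illustration, using a Gaussian x(t) = e^{−t²/2}, which is not an example taken from the paper but whose Fourier integral transform ξ(ω) = √(2π)e^{−ω²/2} is known in closed form; it compares the discrete-time transform of the sampled sequence with the folded sum of the continuous transform:

```python
import numpy as np

# Equation (16): sampling x(t) at the integers makes its transform periodic,
# equal to the continuous transform wrapped (aliased) at intervals of 2*pi.
# Illustration with a Gaussian, x(t) = exp(-t^2/2), whose Fourier integral
# transform is xi(w) = sqrt(2*pi) * exp(-w^2/2).
def xi(w):
    return np.sqrt(2 * np.pi) * np.exp(-w ** 2 / 2)

t = np.arange(-30, 31)            # sampled time points; the tails are negligible
for w in (0.0, 0.5, 1.0, 2.5):
    # Discrete-time Fourier transform of the sampled sequence
    xi_s = np.exp(-t ** 2 / 2.0).dot(np.exp(-1j * w * t))
    # The folded (aliased) sum of the continuous transform, as in (16)
    folded = sum(xi(w + 2 * np.pi * j) for j in range(-5, 6))
    assert abs(xi_s - folded) < 1e-9
```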

Thus, the periodic function ξS(ω) is obtained by wrapping ξ(ω) around a circle of circumference 2π and adding the coincident ordinates. The relationship between the Fourier series and the discrete-time Fourier transform enables us to infer that a similar effect will arise from the regular sampling of a continuous function of frequency. Thus, the corresponding time-domain function will be wrapped around a circle of circumference T = 2π/ω1, where ω1 is both the fundamental frequency and the sampling interval in the frequency domain.

The Fourier integral is employed extensively in mathematical physics. In quantum mechanics, for example, it is used in one domain to describe a localised wave train and, in the other domain, to describe its frequency composition, which is a resolution of its energy amongst a set of frequencies. There is an inverse relationship between the dispersion of the wave train in space and the dispersion of its frequency components. The product of the two dispersions is bounded below according to the famous Heisenberg uncertainty principle.

A simple and important example of the Fourier integral is afforded by the sinc function wave packet and its transform, which is a frequency-domain rectangle. Figure 3 depicts a continuous sinc function of which the Fourier transform is a rectangle on the frequency interval [−π, π]:

    ψ0(t) = (1/2π) ∫_{−π}^{π} e^{iωt} dω = [ e^{iωt}/(i2πt) ]_{−π}^{π} = sin(πt)/πt.    (17)

Here, the final equality is by virtue of (4), which expresses a sine function as a combination of complex exponential functions.

The function ψ0(t) can also be construed as a wave packet centred on time t = 0. The figure also represents a sampled version of the sinc function. This would be obtained, in the manner of a classical Fourier series, from the rectangle on the interval [−π, π], if this were regarded as a single cycle of a periodic function.

The function ψ0(t) with t ∈ I = {0, ±1, ±2, . . .} is nothing but the unit impulse sequence. Therefore, the set of all sequences {ψ0(t − k); t, k ∈ I}, obtained by integer displacements k of ψ0(t), constitutes the ordinary orthogonal Cartesian basis in the time domain for the set of all real-valued time series. When t ∈ R is a real-valued index of continuous time, the set of displaced sinc functions {ψ0(t − k); t ∈ R, k ∈ I} constitutes a basis for the set of continuous functions of which the frequency content is bounded by the Nyquist value of π radians per unit time interval.

In common with their discretely sampled counterparts, the sequence of continuous sinc functions at integer displacements constitutes an orthogonal basis. To demonstrate the orthogonality, consider the fact that the corresponding frequency-domain rectangle is an idempotent function: when multiplied by itself, it does not change. The time-domain operation corresponding to this frequency-domain multiplication is an autoconvolution. The time-domain functions are real and symmetric, with ψ0(k − t) = ψ0(t − k), so their autoconvolution is the same as their autocorrelation:

    γ(k) = ∫ ψ0(t)ψ0(k − t)dt = ∫ ψ0(t)ψ0(t − k)dt.                                  (18)

Therefore, the sinc function is its own autocovariance function: γ(k) = ψ0(k). The zeros of the sinc function that are found at integer displacements from the centre correspond to the orthogonality of sinc functions separated from each other by these distances.
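Both properties, the impulse character of the sampled sinc function and the self-autocovariance γ(k) = ψ0(k) of (18), can be probed numerically. A sketch, assuming NumPy, whose np.sinc(t) implements sin(πt)/(πt) exactly; the integral in (18) is approximated by a Riemann sum over a truncated support, so only a loose tolerance is warranted:

```python
import numpy as np

# psi_0(t) = sin(pi t)/(pi t), which is np.sinc(t) in NumPy's convention.
# Sampled at the integers, it is the unit impulse sequence.
k = np.arange(-8, 9)
samples = np.sinc(k)
assert np.allclose(samples, (k == 0).astype(float))

# A crude check of (18), gamma(k) = psi_0(k), by Riemann summation over a
# truncated support; the slow 1/t decay of the sinc limits the accuracy.
dt = 0.01
t = np.arange(-400.0, 400.0, dt)
for lag in (0, 1, 2, 3):
    gamma = (np.sinc(t) * np.sinc(t - lag)).sum() * dt
    assert abs(gamma - np.sinc(lag)) < 1e-2
```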
A sinc function wave packet that is limited to the frequency band [α, β] ⊆ [0, π], if we are talking only of positive frequencies, has the functional form of

    ψ(t) = (1/πt){sin(βt) − sin(αt)}
         = (2/πt) cos{(α + β)t/2} sin{(β − α)t/2}
         = (2/πt) cos(γt) sin(δt),                                                   (19)

where γ = (α + β)/2 is the centre of the band and δ = (β − α)/2 is half its width. The equality follows from the identity sin(A + B) − sin(A − B) = 2 cos A sin B.

Finally, we must consider the Fourier transform that is the most important one from the point of view of this paper. This is the discrete Fourier transform of Figure 1.(iv), which maps from a sequence of T data points {xt; t = 0, 1, . . . , T − 1} in the time domain to a set of T ordinates {ξj; j = 0, 1, . . . , T − 1} in the frequency domain at the points {ωj = 2πj/T; j = 0, 1, . . . , T − 1}, which are equally spaced within the interval [0, 2π], and vice versa. The relationship can be expressed in terms of sines and cosines; but it is more conveniently expressed

Table 1. The classes of Fourier transforms*

                     Periodic                  Aperiodic
    Continuous       Discrete aperiodic:       Continuous aperiodic:
                     Fourier series            Fourier integral
    Discrete         Discrete periodic:        Continuous periodic:
                     Discrete FT               Discrete-time FT

* The class of the Fourier transform depends upon the nature of the function that is transformed. This function may be discrete or continuous, and it may be periodic or aperiodic. The names and the natures of the corresponding transforms are shown in the cells of the table.

in terms of complex exponentials, even in the case where the sequence {xt} is a set of real data values:

    xt = Σ_{j=0}^{T−1} ξj e^{iωj t}   ←→   ξj = (1/T) Σ_{t=0}^{T−1} xt e^{−iωj t}.   (20)
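Under the conventions of (20), ξj differs from the unnormalised transform computed by common FFT routines only by the factor 1/T. A minimal sketch, assuming NumPy's np.fft conventions (forward transform unscaled, with the same e^{−iωj t} sign as here):

```python
import numpy as np

# The discrete Fourier transform (20) and its inverse, checked against
# numpy's FFT.  numpy computes sum_t x_t e^{-i w_j t} with no 1/T factor,
# so xi_j = fft(x)/T under the conventions of (20).
rng = np.random.default_rng(0)
T = 12
x = rng.standard_normal(T)

t = np.arange(T)
W = np.exp(-1j * 2 * np.pi * np.outer(np.arange(T), t) / T)   # W[j,t] = e^{-i w_j t}

xi = W.dot(x) / T                      # direct evaluation of (20)
assert np.allclose(xi, np.fft.fft(x) / T)

x_back = W.conj().T.dot(xi)            # synthesis: x_t = sum_j xi_j e^{i w_j t}
assert np.allclose(x_back.real, x)
assert np.allclose(x_back.imag, 0.0)
```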

Since all sequences in data analysis are finite, the discrete Fourier transform is used, in practice, to represent all manner of Fourier transforms. Although the sequences in both domains are finite and of an equal number of elements, it is often convenient to regard both of them as representing single cycles of periodic functions defined over the entire set of positive and negative integers. These infinite sequences are described as the periodic extensions of their finite counterparts.

The various Fourier transforms may be summarised in a table that records the essential characteristics of the function that is to be transformed, namely whether it is discrete or continuous and whether it is periodic or aperiodic. This is Table 1.

There are numerous accounts of Fourier analysis that can be accessed in pursuit of a more thorough treatment. Much of the text of Carslaw (1930) is devoted to the issues in real analysis that were raised in the course of the development of Fourier analysis. The book was first published in 1906, which is close to the dates of some of the essential discoveries that clarified such matters as Gibbs' phenomenon. It continues to be a serviceable and authoritative text. Titchmarsh (1937), from the same era, provides a classical account of the Fourier integral. The more recent text of Kreider, Kuller, Ostberg and Perkins (1966) deals with some of the analytic complications of Fourier analysis in an accessible manner, as does the brief text of Churchill and Brown (1978). The recent text of Stade (2005) is an extensive resource. That of Nahin (2006) is both engaging and insightful. Many of the texts that are intended primarily for electrical engineers are also useful to statisticians and econometricians. Baher (1990) gives much of the relevant analytic detail, whereas Brigham (1988) provides a practical text that avoids analytic complications.

Following the discoveries of Wiener (1930) and Kolmogorov (1941b), the mathematical exposition of the spectral analysis of time series developed rapidly. Detailed treatments of the analytic issues are to be found in some of the classical texts of time-series analysis, amongst which are those of Wold (1938), Doob (1953), Grenander and Rosenblatt (1957), Hannan (1960), Yaglom (1962) and Rozanov (1967). Yaglom and Rozanov were heirs to a Russian tradition that began with Kolmogorov (1941b). It was not until the 1960s that spectral analysis began to find widespread application; and the book of Jenkins and Watts (1968) was symptomatic of the increasing practicality of the subject. Econometricians also began to take note of spectral analysis in the 1960s; and two influential books that brought it to their attention were those of Granger and Hatanaka (1964) and Fishman (1969). Nerlove, Grether and Carvalho (1979) continued the pursuit of the spectral analysis of economic data. Latterly, Brillinger (1975), Priestley (1981), Rosenblatt (1985) and Brockwell and Davis (1987) have treated Fourier analysis with a view to the statistical analysis of time series, as has Pollock (1999).

3. Representations of the Discrete Fourier Transform

It is often helpful to express the discrete Fourier transform in terms of the notation of the z-transform. The z-transforms of the sequences {xt; t = 0, . . . , T − 1} and {ξj; j = 0, . . . , T − 1} are the polynomials

    ξ(z) = Σ_{j=0}^{T−1} ξj z^j   and   x(z) = Σ_{t=0}^{T−1} xt z^t,                 (21)

wherein z is an indeterminate algebraic symbol that is commonly taken to be a complex number, in accordance with the fact that the equations ξ(z) = 0 and x(z) = 0 have solutions within the complex plane.

The polynomial equation z^T = 1 is important in establishing the connection between the z-transform of a sequence of T elements and the corresponding discrete Fourier transform. The solutions of the equation are the complex numbers

    WT^j = exp(−i2πj/T) = cos(2πj/T) − i sin(2πj/T);   j = 0, 1, . . . , T − 1.      (22)

These constitute a set of points, described as the T roots of unity, that are equally spaced around the circumference of the unit circle at angles of ωj = 2πj/T radians. Multiplying any of these angles by T will generate a multiple of 2π radians, which coincides with an angle of zero radians, which is the angle, or argument, attributable to unity within the complex plane. Using this conventional notation for the roots of unity, we can write the discrete Fourier transform of (20) and its inverse as

    ξj = (1/T) Σ_{t=0}^{T−1} xt WT^{jt}   and   xt = Σ_{j=0}^{T−1} ξj WT^{−jt}.      (23)

The advantage of the z-transforms is that they enable us to deploy the ordinary algebra of polynomials and power series to manipulate the objects of Fourier analysis. In econometrics, it is common to replace z by the so-called lag operator L, which operates on doubly-infinite sequences and which has the effect that Lx(t) = x(t − 1). It is also useful to replace z by a variety of matrix operators.

The matrix lag operator LT = [e1, e2, . . . , eT−1, 0] is derived from the identity matrix IT = [e0, e1, . . . , eT−1] of order T by deleting its leading column and appending a column of zeros to the end of the array. Setting z = LT within the polynomial x(z) = x0 + x1 z + · · · + xT−1 z^{T−1} gives rise to a banded lower-triangular Toeplitz matrix of order T. The matrices IT = LT^0, LT, LT^2, . . . , LT^{T−1} form a basis for the vector space comprising all such matrices. The ordinary polynomial algebra can be applied to these matrices, with the provision that the argument LT is nilpotent of degree T, which is to say that LT^q = 0 for all q ≥ T.

Circulant Matrices

A matrix argument that has properties that more closely resemble those of the complex exponentials is the circulant matrix KT = [e1, . . . , eT−1, e0], which is formed by carrying the leading column of the identity matrix IT to the back of the array. This is an orthonormal matrix, of which the transpose is the inverse, such that KT′KT = KT KT′ = IT. The powers of the matrix form a T-periodic sequence, such that KT^{T+q} = KT^q for all q. The periodicity of these powers is analogous to the periodicity of the powers of the argument z = exp{−i2π/T}, which is to be found in the Fourier transform of a sequence of T elements.

The matrices KT^0 = IT, KT, . . . , KT^{T−1} form a basis for the set of all circulant matrices of order T, where a circulant matrix X = [xij] of order T is defined as one in which the value of the generic element xij is determined by the index {(i − j) mod T}. This implies that each column of X is equal to the previous column rotated downwards by one element. The generic circulant matrix may be illustrated as follows:

        ⎡ x0      xT−1    xT−2    . . .   x1  ⎤
        ⎢ x1      x0      xT−1    . . .   x2  ⎥
    X = ⎢ x2      x1      x0      . . .   x3  ⎥ .                                    (24)
        ⎢ ...     ...     ...             ... ⎥
        ⎣ xT−1    xT−2    xT−3    . . .   x0  ⎦

There exists a one-to-one correspondence between the set of all polynomials of degree less than T and the set of all circulant matrices of order T. Thus, if x(z) is a polynomial of degree less than T, then there exists a corresponding circulant matrix

    X = x(KT) = x0 IT + x1 KT + · · · + xT−1 KT^{T−1}.                               (25)
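The correspondence of (24) and (25) can be illustrated numerically. A sketch, assuming NumPy, which forms KT by rotating the rows of the identity matrix, builds x(KT) from its powers, and confirms the circulant pattern xij = x[(i − j) mod T] together with the commutativity of circulants that is asserted in (27):

```python
import numpy as np

# Build the circulant operator K_T = [e1, ..., e_{T-1}, e0] and verify that
# the polynomial x(K_T) = x0*I + x1*K_T + ... + x_{T-1}*K_T^{T-1} is the
# circulant matrix with generic element x[(i - j) mod T], as in (24)-(25).
T = 6
K = np.roll(np.eye(T), 1, axis=0)      # K[i, j] = 1 when i = (j + 1) mod T

def circulant_from_poly(x):
    X = np.zeros((T, T))
    P = np.eye(T)                      # successive powers K^0, K^1, ...
    for coeff in x:
        X += coeff * P
        P = K.dot(P)
    return X

x = np.arange(1.0, T + 1.0)
X = circulant_from_poly(x)

i, j = np.indices((T, T))
assert np.allclose(X, x[(i - j) % T])  # the pattern of (24)

# Circulant matrices commute in multiplication, cf. (27)
Y = circulant_from_poly(np.random.default_rng(1).standard_normal(T))
assert np.allclose(X.dot(Y), Y.dot(X))
```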

A convergent sequence of indefinite length can also be mapped into a circulant matrix. If {ci} is an absolutely summable sequence obeying the condition that Σ|ci| < ∞, then the z-transform of the sequence, which is defined by c(z) = Σ cj z^j, is an analytic function on the unit circle. In that case, replacing z by KT gives rise to a circulant matrix C° = c(KT) with finite-valued elements. In consequence of the periodicity of the powers of KT, it follows that

    C° = (Σ_{j=0}^∞ c_{jT}) IT + (Σ_{j=0}^∞ c_{jT+1}) KT + · · · + (Σ_{j=0}^∞ c_{jT+T−1}) KT^{T−1}
       = c°0 IT + c°1 KT + · · · + c°T−1 KT^{T−1}.                                   (26)

Given that {ci} is a convergent sequence, it follows that the sequence of the matrix coefficients {c°0, c°1, . . . , c°T−1} converges to {c0, c1, . . . , cT−1} as T increases. Notice that the matrix c°(KT) = c°0 IT + c°1 KT + · · · + c°T−1 KT^{T−1}, which is derived from a polynomial c°(z) of degree T − 1, is a synonym for the matrix c(KT), which is derived from the z-transform of an infinite convergent sequence.

The polynomial representation is enough to establish that circulant matrices commute in multiplication and that their product is also a polynomial in KT. That is to say,

    if X = x(KT) and Y = y(KT) are circulant matrices, then XY = YX is also a circulant matrix.    (27)

The matrix operator KT has a spectral factorisation that is particularly useful in analysing the properties of the discrete Fourier transform. To demonstrate this factorisation, we must first define the so-called Fourier matrix. This is a symmetric matrix U = T −1/2 [W jt ; t, j = 0, . . . , T − 1], of which the generic element in the jth row and tth column is W jt , where W = exp{−i2π/T } is the first root of unity, from which we are now omitting the subscripted T for ease of notation. On taking account of the T -periodicity of W q , the matrix can be written explicitly as ⎡ ⎤ 1 1 1 ... 1 ⎢1 W W2 . . . W T −1 ⎥ ⎥ 1 ⎢ 1 W2 W4 . . . W T −2 ⎥ . U=√ ⎢ (28) ⎢ .. .. .. ⎥ T ⎣ .. ⎦ . . . . 1 W T −1 W T −2 . . . W The second row and the second column of this matrix contain the T roots of ¯ = T −1/2 [W −jt ; t, j = 0, . . . , T − 1]; unity. The conjugate matrix is defined as U −q T −q , this can be written explicitly as and, by using W = W ⎤ ⎡ 1 1 1 ... 1 ⎢ 1 W T −1 W T −2 . . . W ⎥ ⎥ ⎢ 1 T −2 T −4 ¯ = √ ⎢1 W W ... W2 ⎥. U (29) ⎢ .. .. .. ⎥ T ⎣ .. ⎦ . . . . . . . W T −1 1 W W2 13

D.S.G. POLLOCK: Statistical Fourier Analysis The matrix U is a unitary, which is to say that it fulfils the condition ¯U = UU ¯ = I. U

(30)

The operator K, from which we now omit the subscript, can be factorised as

    K = ŪDU = UD̄Ū,                                                                   (31)

where

    D = diag{1, W, W^2, . . . , W^{T−1}}                                             (32)

is a diagonal matrix whose elements are the T roots of unity, which are found on the circumference of the unit circle in the complex plane. Observe also that D is T-periodic, such that D^{q+T} = D^q, and that K^q = ŪD^qU = UD̄^qŪ for any integer q. Since the powers of K form the basis for the set of circulant matrices, it follows that all circulant matrices are amenable to a spectral factorisation based on (31).

Circulant matrices have represented a mathematical curiosity ever since their first appearance in the literature in a paper by Catalan (1846). The literature on circulant matrices, from their introduction until 1920, was summarised in four papers by Muir (1911)–(1923). A recent treatise on the subject, which contains a useful bibliography, has been provided by Davis (1979); but his book does not deal with problems in time-series analysis. An up-to-date account, orientated towards statistical signal processing, has been provided by Gray (2002).

The Matrix Discrete Fourier Transform

The matrices U and Ū are entailed in the discrete Fourier transform. Thus, the equations of (20) or (23) can be written as

    ξ = T^{−1/2}Ux = T^{−1}Wx   and   x = T^{1/2}Ūξ = W̄ξ,                            (33)

where x = [x0, x1, . . . , xT−1]′ and ξ = [ξ0, ξ1, . . . , ξT−1]′, and where W = [W^{jt}] and W̄ = [W^{−jt}] are the matrices of (28) and (29) respectively, freed from their scaling factors. We may note, in particular, that

    ι = T^{1/2}Ue0 = T^{1/2}Ūe0   and   e0 = T^{−1/2}Uι = T^{−1/2}Ūι,                (34)
where e0 = [1, 0, . . . 0] is the leading column of the identity matrix and ι = [1, 1, . . . , 1] is the summation vector. Thus, a spike or an impulse located at t = 0 in the time domain or at j = 0 in the frequency domain is transformed into a constant sequence in the other domain and vice versa. These are extreme examples of the inverse relationship between the dispersion of a sequence and that of its transform. Given the circulant the matrix X = x(K) defined in (25), the discrete Fourier transform and its inverse may be represented by the following spectral factorisations: ¯ x(D)U ¯. X=U and x(D) = U X U (35) 14
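These factorisations can be checked numerically. The following sketch (using numpy; the variable names and the chosen orientation of the shift matrix $K$, which maps $e_t$ to $e_{t+1 \bmod T}$, are our own assumptions, not notation from the paper) verifies the unitarity condition (30), the factorisation (31) and the transform pair (33):

```python
import numpy as np

T = 8
t = np.arange(T)
W = np.exp(-2j * np.pi / T)          # the first root of unity

# The symmetric Fourier matrix U = T^{-1/2}[W^{jt}] of (28) and its conjugate.
U = W ** np.outer(t, t) / np.sqrt(T)
Ubar = U.conj()

# Unitarity, as in (30): Ubar U = U Ubar = I.
assert np.allclose(Ubar @ U, np.eye(T))

# The circulant shift operator K and its spectral factorisation (31),
# with D = diag{1, W, ..., W^{T-1}} as in (32).
K = np.roll(np.eye(T), 1, axis=0)    # K maps e_t to e_{t+1 (mod T)}
D = np.diag(W ** t)
assert np.allclose(K, Ubar @ D @ U)

# The transform pair of (33): xi = T^{-1/2} U x and x = T^{1/2} Ubar xi.
x = np.random.default_rng(0).normal(size=T)
xi = U @ x / np.sqrt(T)
assert np.allclose(x, np.sqrt(T) * (Ubar @ xi))
```

With this sign convention, the ordinates $\xi_j$ coincide with `np.fft.fft(x) / T`, which ties the matrix formulation to the usual FFT routines.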

Figure 4. A device for finding the circular correlation of two sequences. The upper disc is rotated clockwise through successive angles of 30 degrees. Adjacent numbers on the two discs are multiplied and the products are summed and divided by the sample size T to obtain the circular covariances.

Here, $x(D) = T\,\mathrm{diag}\{\xi_0, \xi_1, \ldots, \xi_{T-1}\}$ is a diagonal matrix containing the spectral ordinates of the data. The equations of (33) are recovered by postmultiplying $X$ by $e_0$ and $x(D)$ by $\iota$ and using the results of (34).

Assuming that the data are mean adjusted, i.e. that $\iota'X = 0$, the cross product of the matrix $X$ gives rise to the matrix $C^\circ$ of the circular autocovariances of the data. Thus
$$T^{-1}X'X = C^\circ = \bar{U}c^\circ(D)U = T^{-1}\bar{U}\bar{x}(D)x(D)U, \tag{36}$$
where we have used $X' = Ux(D)\bar{U} = \bar{U}\bar{x}(D)U$.

The values of the elements of $[c^\circ_0, c^\circ_1, \ldots, c^\circ_{T-1}]' = c^\circ = C^\circ e_0$ are given by the formula
$$c^\circ_\tau = \frac{1}{T}\sum_{t=0}^{T-1} x_t x_{t+\tau}, \quad\text{where}\quad x_t = x(t \bmod T),$$
or, equivalently,
$$c^\circ_\tau = \frac{1}{T}\sum_{t=0}^{T-1-\tau} x_t x_{t+\tau} + \frac{1}{T}\sum_{t=0}^{\tau-1} x_t x_{t+T-\tau} = c_\tau + c_{T-\tau}, \tag{37}$$
where $c_\tau$ is the ordinary empirical autocovariance of lag $\tau$.

The vector $c^\circ$ of circular autocovariances is obtained via a circular correlation of the data vector $x$. One can envisage the data disposed at $T$ equally spaced points around the circumferences of two discs on a common axis. The same numbers on the two discs are aligned, and products are formed and added. This generates the sum of squares. Then, the upper disc is rotated by an angle of $2\pi/T$ radians, and the displaced numbers are multiplied together and added. The process is repeated. The $T$ values formed via a complete rotation are each divided by $T$, and the result is the sequence of circular autocovariances. This device is illustrated in Figure 4, where $T = 12$ and where identical sequences on the two discs end with four zeros.

The core of the factorisation of the matrix $C^\circ$ of (36) is the real-valued diagonal matrix
$$c^\circ(D) = T^{-1}x(D)\bar{x}(D) = T\,\mathrm{diag}\{|\xi_0|^2, |\xi_1|^2, \ldots, |\xi_{T-1}|^2\}, \tag{38}$$
of which the graph of the elements is described as the periodogram. (The frequencies indexed by $j = 0, \ldots, T-1$ extend from 0 to $2\pi(T-1)/T$. However, for real-valued data, there is $\xi_j = \bar{\xi}_{T-j}$, and so it is customary to plot the periodogram ordinates only over a frequency range from 0 to $\pi$, as in Figure 14.)

Since $e_0'C^\circ e_0 = c^\circ_0 = c_0$ and $Ue_0 = \bar{U}e_0 = T^{-1/2}\iota$, we get the following expression for the variance:
$$c_0 = T^{-1}\sum_{t=0}^{T-1}x_t^2 = T^{-2}\iota'x(D)\bar{x}(D)\iota = \sum_{j=0}^{T-1}|\xi_j|^2. \tag{39}$$
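Equations (37) and (39) admit a short numerical check (a sketch using numpy; the random data and names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 12
x = rng.normal(size=T)
x = x - x.mean()                 # mean-adjusted data

# Circular autocovariances: the data on the two discs, displaced by tau points.
c_circ = np.array([np.mean(x * np.roll(x, -tau)) for tau in range(T)])

# Ordinary empirical autocovariances c_tau.
c = np.array([np.sum(x[:T - tau] * x[tau:]) / T for tau in range(T)])

# (37): the circular autocovariance folds the ordinary ones: c°_tau = c_tau + c_{T-tau}.
assert all(np.isclose(c_circ[tau], c[tau] + c[T - tau]) for tau in range(1, T))

# (39): the variance equals the sum of the periodogram ordinates, with
# xi_j = T^{-1} sum_t x_t W^{jt} computed via the FFT.
xi = np.fft.fft(x) / T
assert np.isclose(c_circ[0], np.sum(np.abs(xi) ** 2))
```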
This is the discrete version of Parseval's theorem. In statistical terms, it represents a frequency-specific analysis of variance.

4. Fourier Analysis of Temporal Sequences

It is clear that the discrete-time Fourier transform is inappropriate to a doubly-infinite stationary stochastic sequence $y(t) = \{y_t; t = 0, \pm 1, \pm 2, \ldots\}$ defined over the set of positive and negative integers. Such a sequence, which has infinite energy, is not summable. In the appendix, it is demonstrated that there exists a spectral representation of $y(t)$ in the form of
$$y(t) = \int_{-\pi}^{\pi} e^{i\omega t}\,dZ(\omega), \tag{40}$$
where $Z(\omega)$ is a continuous, non-differentiable, complex-valued stochastic process defined on the interval $[-\pi, \pi]$. The increments $dZ(\omega)$, $dZ(\lambda)$ of the process are uncorrelated for all $\lambda \neq \omega$. However, the statistical expectation
$$E\{dZ(\omega)dZ^*(\omega)\} = dF(\omega) = f(\omega)\,d\omega \tag{41}$$
comprises a function $F(\omega)$, known as the spectral distribution function, which is continuous and differentiable almost everywhere in the interval, provided that $y(t)$ contains no perfectly regular components of finite power. If there are regular components within $y(t)$, then $F(\omega)$ will manifest jump discontinuities at the corresponding frequencies.

The discrete-time transform comes into effect when it is applied not to the data sequence itself but, instead, to its autocovariance function
$$\gamma(\tau) = \{\gamma_\tau; \tau = 0, \pm 1, \pm 2, \ldots\}, \quad\text{where}\quad \gamma_\tau = \gamma_{-\tau} = E(y_t y_{t-\tau}). \tag{42}$$
To ensure that its transform is mean-square convergent, the autocovariance sequence must be square summable, such that $\sum \gamma_\tau^2 < \infty$. This is a necessary and sufficient condition for the stationarity of the process $y(t)$.

The Fourier transform of the autocovariance function is the spectral density function or power spectrum $f(\omega)$, which, in view of the symmetry of $\gamma(\tau)$, can be expressed as
$$f(\omega) = \frac{1}{2\pi}\Bigl\{\gamma_0 + \sum_{\tau=1}^{\infty}\gamma_\tau\bigl(e^{i\omega\tau} + e^{-i\omega\tau}\bigr)\Bigr\} = \frac{1}{2\pi}\Bigl\{\gamma_0 + 2\sum_{\tau=1}^{\infty}\gamma_\tau\cos(\omega\tau)\Bigr\}. \tag{43}$$
Notice, however, that, compared with equation (14), the scaling factor $1/2\pi$ has migrated from the Fourier integral to the series. Now, the inverse transform takes the form of
$$\gamma_\tau = \int_{-\pi}^{\pi} e^{i\omega\tau}f(\omega)\,d\omega. \tag{44}$$
Setting $\tau = 0$ gives $\gamma_0 = \int f(\omega)\,d\omega$, which indicates that the power of the process, which is expressed by its variance, is equal to the integral of the spectral density function over the interval $[-\pi, \pi]$. This is a form of Parseval's theorem.

The sequence $y(t)$ is not square summable and, therefore, its energy cannot be expressed in terms of a sum of squares. Instead, its power is expressed, in the time domain, via the statistical moment $\gamma_0 = E(y_t^2)$. In the frequency domain, this power is attributed to an infinite set of sinusoidal components, of which the measure of power is provided by the continuous spectral density function $f(\omega)$. One can see, by a direct analogy with a continuous probability density function, that the power associated with a component at a particular frequency is vanishingly small, i.e. of measure zero.

Often, it is assumed that, in addition to square summability, the autocovariances obey the stronger condition that $\sum|\gamma_\tau| < \infty$, which is the condition of absolute summability. This condition is satisfied by processes generated by finite-order autoregressive moving-average (ARMA) models, which can be represented by setting $d = 0$ in the equation
$$y(z) = \frac{\theta(z)}{(1-z)^d\phi(z)}\,\varepsilon(z), \tag{45}$$
which describes an autoregressive integrated moving-average (ARIMA) model. Here, $\varepsilon(z)$ and $y(z)$ are, respectively, the $z$-transforms of a white-noise forcing function and of the output sequence, and $\theta(z)$ and $\phi(z)$ are polynomials of finite degree. Provided that $d = 0$ and that the roots of the equation $\phi(z) = \phi_0 + \phi_1 z + \cdots + \phi_p z^p = 0$ lie outside the unit circle, the spectral density function of the process will be bounded and everywhere continuous.

The absolute summability of the ARMA autocovariances is a consequence of the summability of the infinite sequence of moving-average coefficients generated by the series expansion of the rational transfer function $\theta(z)/\phi(z)$ that maps from the white-noise forcing function $\varepsilon(t)$ to the output $y(t)$. These coefficients constitute the impulse response of the transfer function; and the condition of their absolute summability is the bounded input–bounded output (BIBO) stability condition of linear systems analysis (see Pollock 1999, for example).

The autocovariance generating function of an ARMA process, which is the $z$-transform of the two-sided sequence $\{\gamma_\tau; \tau = 0, \pm 1, \pm 2, \ldots\}$, is given by
$$\gamma(z) = \sigma^2\,\frac{\theta(z)\theta(z^{-1})}{\phi(z)\phi(z^{-1})}, \tag{46}$$
where $\sigma^2 = V\{\varepsilon(t)\}$ is the variance of the white-noise forcing function. The spectral density function $f(\omega) = \gamma(\exp\{i\omega\})$ is produced when the locus of $z$ is the unit circle in the complex plane.

The conditions of absolute and square summability are violated by the ARIMA models that can be used to describe trended processes. Unless there are finite starting values at a finite distance from the current value, the variance of such a process will be unbounded. (If equation (45) is to remain viable when $d = 1$, then it is necessary to define $y(z)$ and $\varepsilon(z)$ to be the $z$-transforms of finite sequences that incorporate the appropriate initial conditions; see Pollock (2008), for example.) The spectral density function of an ARIMA process, which is defined in respect of the doubly-infinite sequence $y(t)$, will not have a finite-valued integral. Therefore, it is liable to be described as a pseudo spectral density function. See, for example, Maravall and Pierce (1987).

The unit roots within the factor $(1-z)^d$ of equation (45) contribute infinite power to the ARIMA process from within the neighbourhood of the zero frequency, where the pseudo spectrum is unbounded. In a seasonal ARIMA model, which is used to model persistent seasonal fluctuations, the denominator of the transfer function incorporates the factor
$$S(z) = 1 + z + z^2 + \cdots + z^{s-1} = \prod_{j=1}^{s-1}\bigl(1 - e^{i2\pi j/s}z\bigr), \tag{47}$$
where $s$ is the number of seasons or months of the year. Here, the generic quadratic factor is
$$1 - 2\cos(2\pi j/s)z + z^2 = \bigl(1 - e^{i2\pi j/s}z\bigr)\bigl(1 - e^{i2\pi(s-j)/s}z\bigr), \tag{48}$$
where $j < s/2$. This factor, which will contribute infinite power to the pseudo spectrum, corresponds to an unbounded spike at the frequency of $\omega_j = 2\pi j/s$, which is the $j$th harmonic of the fundamental seasonal frequency. In the

Figure 5. The values of the function cos{(11/8)πt} coincide with those of its alias cos{(5/8)πt} at the integer points {t = 0, ±1, ±2, . . .}.

numerator of the transfer function, there should be a polynomial $R(z) = 1 + \rho z + \rho^2 z^2 + \cdots + \rho^{s-1}z^{s-1}$, with a value of $\rho < 1$ close to unity, which will serve to counteract the effects of $S(z)$ at the non-seasonal frequencies.

Latterly, econometricians and others have been focussing their attention on so-called long-memory processes, which are characterised by autocovariance sequences that are square summable but not absolutely summable. Such processes are commonly modelled by replacing the difference operator of an ARIMA model by a corresponding fractional difference operator $(1-z)^d$, where $d \in [0, 0.5)$. Such models have spectra that are unbounded at the zero frequency. When $|d| \geq 0.5$, the condition of the square summability of the autocovariances is violated, and the process is no longer stationary. Models incorporating the fractional difference operator were proposed by Granger and Joyeux (1980) and Hosking (1981). They have been described in detail by Beran (1994) and by Palma (2007). Granger and Ding (1996) have considered the provenance of long-memory processes; and they have provided some alternative explanations.

Persistent seasonal fluctuations can also be described by long-memory models. Hosking (1984) noted that, when $|d| < 0.5$ and $|\phi| < 1$, the process defined by $y(z) = \varepsilon(z)/(1 - 2\phi z + z^2)^d$ would generate a sequence displaying long-term persistence and quasi-periodic behaviour. The transfer function of this model, which incorporates a fractional power of the seasonal polynomial of (48), is a so-called Gegenbauer polynomial. The lead of Hosking has been followed by other authors, including Gray, Zhang and Woodward (1989), Hidalgo (1997), Arteche and Robinson (2000) and McCoy and Stephens (2004), who have defined models with spectra that are unbounded at multiple frequencies.
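The slow, hyperbolic decay that distinguishes long memory can be seen directly in the coefficients of the fractional difference operator $(1-z)^d$. The following sketch (using numpy; the recursion follows from the binomial expansion, and the asymptotic rate $j^{-(1+d)}$ is a standard result that we invoke without proof) illustrates the point:

```python
import numpy as np

# Binomial expansion of (1 - z)^d: pi_0 = 1, pi_j = pi_{j-1} * (j - 1 - d) / j.
def frac_diff_weights(d, n):
    pi = np.empty(n)
    pi[0] = 1.0
    for j in range(1, n):
        pi[j] = pi[j - 1] * (j - 1 - d) / j
    return pi

d = 0.4
pi = frac_diff_weights(d, 500)

# The weights decay hyperbolically, like j^{-(1+d)}, rather than at the
# exponential rate of a stationary ARMA operator: the scaled sequence
# |pi_j| * j^{1+d} settles down to a constant.
scaled = np.abs(pi[100:]) * np.arange(100, 500) ** (1.0 + d)
assert np.allclose(scaled, scaled[-1], rtol=0.05)
```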
The Problem of Aliasing

Notice that the (positive) frequencies in the spectral analysis of a discrete-time process are limited to the interval $[0, \pi]$, bounded by the so-called Nyquist frequency $\pi$. The process of sampling imposes a limit on the observability of

Figure 6. The function f (λ/2 + π), represented by the semi-continuous line, superimposed upon the function f (λ/2). The sum of these functions is a 2π -periodic function fH (λ), which represents the spectral density of a subsampled process. The segments A and B of f (λ/2 + π) that fall within the interval [−π, π] may be construed as the effects of wrapping f (λ/2), defined over the interval [−2π, 2π], around a circle of circumference 2π .

high-frequency components. To be observable in sampled data, a sinusoidal motion must take no less than two sample periods to complete its cycle, which limits its frequency to no more than $\pi$ radians per period. A motion of higher angular velocity will be mistaken for one of a velocity that lies within the observable interval; and this is described as the problem of aliasing. The problem is illustrated by Figure 5, which shows that, by sampling it at the integer points $\{t = 0, \pm 1, \pm 2, \ldots\}$, a sinusoid with a frequency of $(11/8)\pi$ radians per period, which is in excess of the limiting Nyquist value of $\pi$, will be mistaken for one with a frequency of $(5/8)\pi$.

The extent to which aliasing is a problem depends upon the structure of the particular time series under analysis. It will be suggested later that, for many econometric time series, the problem does not arise, for the reason that their (positive) frequencies are band-limited to a subinterval of the range $[0, \pi]$. However, this fact, which has its own problems, is not commonly recognised in econometric time-series analysis.

To understand the statistical aspects of aliasing, we may consider a stationary stochastic process $y(t) = \{y_t; t = 0, \pm 1, \pm 2, \ldots\}$ with an autocovariance function $\gamma(\tau) = \{\gamma_\tau; \tau = 0, \pm 1, \pm 2, \ldots\}$. The relationship between the spectral density function $f(\omega)$ and the autocovariance function is as follows:
$$\gamma(\tau) = \int_{-\pi}^{\pi} f(\omega)e^{i\omega\tau}\,d\omega \quad\longleftrightarrow\quad f(\omega) = \frac{1}{2\pi}\sum_{\tau=-\infty}^{\infty}\gamma_\tau e^{-i\omega\tau}. \tag{49}$$
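The transform pair of (49) can be illustrated with a first-order autoregressive process, for which $\gamma_\tau = \phi^{|\tau|}/(1-\phi^2)$ when the innovation variance is unity. The following is a numerical sketch (using numpy; the truncation length and the integration grid are our own choices, and the rectangle rule is highly accurate here because the integrand is smooth and periodic):

```python
import numpy as np

phi = 0.6
gamma = lambda tau: phi ** abs(tau) / (1.0 - phi ** 2)   # AR(1) autocovariances

# The spectral density from the cosine form of the transform, with the sum
# truncated once the terms are negligible.
def f(w, n_terms=200):
    return (gamma(0) + 2.0 * sum(gamma(t) * np.cos(w * t)
                                 for t in range(1, n_terms))) / (2.0 * np.pi)

# The inverse transform of (49), approximated on a regular grid over [-pi, pi].
w = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
dw = 2.0 * np.pi / w.size
for tau in (0, 1, 5):
    integral = np.sum(np.cos(w * tau) * f(w)) * dw
    assert np.isclose(integral, gamma(tau), atol=1e-8)
```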

When alternate values are selected from the data sequence, the autocovariance function is likewise subsampled, and there is
$$\begin{aligned}
\gamma(2\tau) &= \int_{-\pi}^{\pi} f(\omega)e^{i\omega(2\tau)}\,d\omega \\
&= \frac{1}{2}\int_{-2\pi}^{2\pi} f(\lambda/2)e^{i\lambda\tau}\,d\lambda \\
&= \frac{1}{2}\Bigl\{\int_{-2\pi}^{-\pi} f(\lambda/2)e^{i\lambda\tau}\,d\lambda + \int_{-\pi}^{\pi} f(\lambda/2)e^{i\lambda\tau}\,d\lambda + \int_{\pi}^{2\pi} f(\lambda/2)e^{i\lambda\tau}\,d\lambda\Bigr\}.
\end{aligned} \tag{50}$$
Here, we have defined $\lambda = 2\omega$, and we have used the change-of-variable technique to obtain the second equality. Within the integrand of (50), there is $\exp\{i\lambda\tau\} = \exp\{i(\lambda \pm 2\pi)\tau\}$. Also, $f(\{\lambda/2\} - \pi) = f(\{\lambda/2\} + \pi)$, by virtue of the $2\pi$-periodicity of the function. Therefore,
$$\int_{-2\pi}^{-\pi} f(\lambda/2)e^{i\lambda\tau}\,d\lambda = \int_{0}^{\pi} f(\{\lambda/2\} - \pi)e^{i(\lambda - 2\pi)\tau}\,d\lambda = \int_{0}^{\pi} f(\{\lambda/2\} + \pi)e^{i\lambda\tau}\,d\lambda, \tag{51}$$
where the first equality is an identity and the second exploits the results above. Likewise,
$$\int_{\pi}^{2\pi} f(\lambda/2)e^{i\lambda\tau}\,d\lambda = \int_{-\pi}^{0} f(\{\lambda/2\} + \pi)e^{i(\lambda + 2\pi)\tau}\,d\lambda = \int_{-\pi}^{0} f(\{\lambda/2\} + \pi)e^{i\lambda\tau}\,d\lambda. \tag{52}$$
It follows that, within the expression of (50), the first integral may be translated to the interval $[0, \pi]$, whereas the third integral may be translated to the interval $[-\pi, 0]$. After their translations, the first and the third integrands can be combined to form the segment of the function $f(\pi + \lambda/2)$ that falls in the interval $[-\pi, \pi]$. The consequence is that
$$\gamma(2\tau) = \frac{1}{2}\int_{-\pi}^{\pi}\{f(\lambda/2) + f(\pi + \lambda/2)\}e^{i\lambda\tau}\,d\lambda. \tag{53}$$
It follows that
$$\gamma(2\tau) \;\longleftrightarrow\; f_H(\lambda) = \frac{1}{2}\{f(\lambda/2) + f(\pi + \lambda/2)\}, \tag{54}$$
where $f_H(\lambda) = f_H(\lambda + 2\pi)$ is the spectral density function of the subsampled process.

Figure 6 shows the relationship between $f(\lambda/2)$ and $f(\pi + \lambda/2)$. The superimposition of the shifted frequency function $f(\pi + \lambda/2)$ upon the function $f(\lambda/2)$ corresponds to a process of aliasing. To visualise the process, one can imagine a single cycle of the original function $f(\omega)$ over the interval

$[-\pi, \pi]$. The dilated function $f(\lambda/2)$ manifests a single cycle over the interval $[-2\pi, 2\pi]$. By wrapping this segment twice around the circle defined by $\exp\{-i\lambda\}$ with $\lambda \in [-\pi, \pi]$, we obtain the aliased function $\{f(\lambda/2) + f(\pi + \lambda/2)\}$. Observe that, if $f(\omega) = 0$ for all $|\omega| > \pi/2$, then the support of the dilated function would be the interval $[-\pi, \pi]$, and the first and third integrals would be missing from the final expression of (50). In that case, there would be no aliasing, and the function $f_H(\lambda)$ would represent the spectrum of the original process accurately.

In the case of a finite sample, the frequency-domain representation and the time-domain representation are connected via the discrete Fourier transform. We may assume that the size $T$ of the original sample is an even number. Then, the elements of the sequence $\{x_0, x_2, x_4, \ldots, x_{T-2}\}$, obtained by selecting alternate points, are described by
$$\begin{aligned}
x_{2t} &= \sum_{j=0}^{T-1}\xi_j\exp\{i\omega_1 2tj\}, &&\text{where}\quad \omega_1 = \frac{2\pi}{T}, \\
&= \sum_{j=0}^{(T/2)-1}\{\xi_j + \xi_{j+(T/2)}\}\exp\{i\omega_2 tj\}, &&\text{where}\quad \omega_2 = 2\omega_1 = \frac{4\pi}{T}.
\end{aligned} \tag{55}$$
Here, the second equality follows from the fact that $\exp\{i\omega_2 tj\} = \exp\{i\omega_2 t[j \bmod (T/2)]\}$. To envisage the effect of equation (55), we can imagine wrapping the sequence $\xi_j;\, j = 0, 1, \ldots, T-1$ twice around a circle of circumference $T/2$, so that its elements fall on $T/2$ points at angles of $\omega_2 j = 4\pi j/T;\, j = 1, 2, \ldots, T/2$ from the horizontal.

Undoubtedly, the easiest way to recognise the effect of aliasing in the case of a finite sample is to examine its effect upon the matrix representation of the discrete Fourier transform. For conciseness, we adopt the expressions under (33), and we may take the case where $T = 8$. Subsampling by a factor of 2 is a matter of removing from the matrix $\bar{W}$ the alternate rows corresponding to the odd-valued indices. This gives
$$\begin{bmatrix} x_0 \\ x_2 \\ x_4 \\ x_6 \end{bmatrix} =
\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & W_8^6 & W_8^4 & W_8^2 & 1 & W_8^6 & W_8^4 & W_8^2 \\
1 & W_8^4 & 1 & W_8^4 & 1 & W_8^4 & 1 & W_8^4 \\
1 & W_8^2 & W_8^4 & W_8^6 & 1 & W_8^2 & W_8^4 & W_8^6
\end{bmatrix}
\begin{bmatrix} \xi_0 \\ \xi_1 \\ \xi_2 \\ \xi_3 \\ \xi_4 \\ \xi_5 \\ \xi_6 \\ \xi_7 \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & W_4^3 & W_4^2 & W_4 \\
1 & W_4^2 & 1 & W_4^2 \\
1 & W_4 & W_4^2 & W_4^3
\end{bmatrix}
\begin{bmatrix} \xi_0 + \xi_4 \\ \xi_1 + \xi_5 \\ \xi_2 + \xi_6 \\ \xi_3 + \xi_7 \end{bmatrix}. \tag{56}$$
We see that, in the final vector on the RHS, the elements $\xi_0, \xi_1, \xi_2, \xi_3$ are conjoined with the elements $\xi_4, \xi_5, \xi_6, \xi_7$, which are associated with frequencies that exceed the Nyquist rate.
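The aliasing of the DFT ordinates under subsampling can be confirmed directly (a sketch using numpy; note that `np.fft.fft` follows the same sign convention as $\xi_j$ above, apart from the factor $T^{-1}$, which we supply):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 8
x = rng.normal(size=T)

# The DFT ordinates xi_j = T^{-1} sum_t x_t W^{jt}, as in (33).
xi = np.fft.fft(x) / T

# Subsample by selecting alternate points, and transform the shorter sequence.
x_sub = x[::2]
xi_sub = np.fft.fft(x_sub) / (T // 2)

# As in (55) and (56): the ordinates of the subsampled sequence are the
# aliased sums xi_j + xi_{j + T/2}.
assert np.allclose(xi_sub, xi[:T // 2] + xi[T // 2:])
```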

5. Linear Filters in Time and Frequency

In order to avoid the effects of aliasing in the process of subsampling the data, it is appropriate to apply an anti-aliasing filter to remove from the data those frequency components that are in excess of the Nyquist frequency of the subsampled data. In the case of downsampling by a factor of two, we should need to remove, via a preliminary operation, those components of frequencies in excess of $\pi/2$.

A linear filter is defined by a set of coefficients $\{\psi_k; k = -p, \ldots, q\}$, which are applied to the data sequence $y(t)$ via a process of linear convolution to give a processed sequence
$$x(t) = \sum_{k=-p}^{q}\psi_k y(t-k). \tag{57}$$
On defining the $z$-transforms $\psi(z) = \sum_k \psi_k z^k$, $y(z) = \sum_t y_t z^t$ and $x(z) = \sum_t x_t z^t$, we may represent the filtering process by the following polynomial or series equation:
$$x(z) = \psi(z)y(z). \tag{58}$$
Here, $\psi(z)$ is described as the transfer function of the filter.

To ensure that the sequence $x(t)$ is bounded if $y(t)$ is bounded, it is necessary and sufficient for the coefficient sequence to be absolutely summable, such that $\sum_k|\psi_k| < \infty$. This is known by engineers as the bounded input–bounded output (BIBO) condition. The condition guarantees the existence of the discrete-time Fourier transform of the sequence. However, a meaning must be sought for this transform.

Consider, therefore, the mapping of a (doubly-infinite) cosine sequence $y(t) = \cos(\omega t)$ through a filter defined by the coefficients $\{\psi_k\}$. This produces the output
$$\begin{aligned}
x(t) &= \sum_k \psi_k\cos(\omega[t-k]) \\
&= \sum_k \psi_k\cos(\omega k)\cos(\omega t) + \sum_k \psi_k\sin(\omega k)\sin(\omega t) \\
&= \alpha\cos(\omega t) + \beta\sin(\omega t) = \rho\cos(\omega t - \theta),
\end{aligned} \tag{59}$$
where $\alpha = \sum_k \psi_k\cos(\omega k)$, $\beta = \sum_k \psi_k\sin(\omega k)$, $\rho = \sqrt{\alpha^2 + \beta^2}$ and $\theta = \tan^{-1}(\beta/\alpha)$. These results follow in view of the trigonometrical identity of (3).

The equation indicates that the filter serves to alter the amplitude of the cosine signal by a factor of $\rho$ and to displace it in time by a delay of $\tau = \theta/\omega$ periods, corresponding to an angular displacement of $\theta$ radians. A (positive) phase delay is inevitable if the filter is operating in real time, with $k \geq 0$, which is to say that only the present and past values of $y(t)$ are comprised by the filter. However, if the filter is symmetric, with $\psi_{-k} = \psi_k$, which necessitates working off-line, then $\beta = 0$ and, therefore, $\theta = 0$. Observe that the values of $\rho$ and $\theta$ are specific to the frequency $\omega$ of the cosine signal.

The discrete-time Fourier transform of the filter coefficients

allows the gain and the phase effects to be assessed over the entire range of frequencies in the interval $[0, \pi]$. The transform may be obtained by setting $z = \exp\{-i\omega\} = \cos(\omega) - i\sin(\omega)$ within $\psi(z)$, which constrains this argument to lie on the unit circle in the complex plane. The resulting function
$$\psi(\exp\{-i\omega\}) = \sum_k \psi_k\cos(\omega k) - i\sum_k \psi_k\sin(\omega k) = \alpha(\omega) - i\beta(\omega) \tag{60}$$
is the frequency response function, which is, in general, a periodic complex-valued function of $\omega$ with a period of $2\pi$. In the case of a symmetric filter, it becomes a real-valued and even function, which is symmetric about $\omega = 0$. For a more concise notation, we may write $\psi(\omega)$ in place of $\psi(\exp\{-i\omega\})$. In that case, the relationship between the coefficients and their transform can be conveyed by writing $\{\psi_k\} \longleftrightarrow \psi(\omega)$.

The effect of a linear filter on a stationary stochastic process, which is not summable, can be represented in terms of the autocovariance generating functions $f_y(z)$ and $f_x(z)$ of the input sequence and of the filtered sequence, respectively. Thus
$$f_x(z) = |\psi(z)|^2 f_y(z), \tag{61}$$
where $|\psi(z)|^2 = \psi(z)\psi(z^{-1})$. Setting $z = \exp\{-i\omega\}$ gives the relationship between the spectral density functions, which entails the squared gain $|\psi(\omega)|^2 = |\psi(e^{-i\omega})|^2$ of the filter.

The Ideal Filter

Imagine that it is required to remove from a data sequence the components of frequencies in excess of $\omega_d$ within the interval $[0, \pi]$. This requirement may arise from an anti-aliasing operation, preliminary to subsampling, or it may arise from a desire to isolate a component that lies in the frequency range $[0, \omega_d]$. We shall discuss the motivation in more detail later. The ideal filter for the purpose would be one that preserves all components with (positive) frequencies in the subinterval $[0, \omega_d)$ and nullifies all those with frequencies in the subinterval $(\omega_d, \pi]$. The specification of the frequency response function of such a filter over the interval $[-\pi, \pi]$ is
$$\psi(\omega) = \begin{cases} 1, & \text{if } |\omega| \in [0, \omega_d), \\ 1/2, & \text{if } \omega = \pm\omega_d, \\ 0, & \text{otherwise.} \end{cases} \tag{62}$$
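Before specialising to the ideal filter, the gain and phase effects of (59)–(60) can be checked numerically for a simple moving average (a sketch using numpy; the three-point filter is a hypothetical example of ours, not one drawn from the paper):

```python
import numpy as np

# Frequency response psi(e^{-iw}) = alpha(w) - i*beta(w), as in (60).
def freq_response(psi, k, w):
    alpha = sum(p * np.cos(w * kk) for p, kk in zip(psi, k))
    beta = sum(p * np.sin(w * kk) for p, kk in zip(psi, k))
    return alpha - 1j * beta

w = np.linspace(0.0, np.pi, 5)

# A symmetric (off-line) three-point average: beta = 0, so the phase theta = 0.
resp_sym = freq_response([0.25, 0.5, 0.25], [-1, 0, 1], w)
assert np.allclose(resp_sym.imag, 0.0)

# The same coefficients applied in real time (k >= 0): the gain rho is
# identical, but a phase lag theta = arctan(beta/alpha) now delays the signal.
resp_causal = freq_response([0.25, 0.5, 0.25], [0, 1, 2], w)
assert np.allclose(np.abs(resp_causal), np.abs(resp_sym))
```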

The coefficients of the filter are given by the sinc function:
$$\psi_k = \frac{1}{2\pi}\int_{-\omega_d}^{\omega_d} e^{i\omega k}\,d\omega = \begin{cases} \dfrac{\omega_d}{\pi}, & \text{if } k = 0; \\[1ex] \dfrac{\sin(\omega_d k)}{\pi k}, & \text{if } k \neq 0. \end{cases} \tag{63}$$

Figure 7. The central coefficients of the Fourier transform of the frequency response of an ideal low pass filter with a cut-off point at ω = π/2. The sequence of coefficients extends indefinitely in both directions.

Figure 8. The result of applying a 17-point rectangular window to the coefficients of an ideal lowpass filter with a cut-off point at ω = π/2.

Figure 9. The result of applying a 17-point Blackman window to the coefficients of an ideal lowpass filter with a cut-off point at ω = π/2.


These coefficients form a doubly-infinite sequence; and, in order to apply such a filter to a data sample of size $T$, it is customary to truncate the sequence, retaining only a limited number of its central elements (Figure 7). The truncation gives rise to a partial-sum approximation of the ideal frequency response that has undesirable characteristics. In particular, there is a ripple effect whereby the gain of the filter fluctuates within the pass band, where it should be constant with a unit value, and within the stop band, where it should be zero-valued. Within the stop band, there is a corresponding problem of leakage whereby the truncated filter transmits elements that ought to be blocked (Figure 8). This is nothing but Gibbs' effect, which has been discussed already in section 3 and illustrated in Figure 2.

The classical approach to these problems, which has been pursued by electrical engineers, has been to modulate the truncated filter sequence with a so-called window sequence, which applies a gradual taper to the higher-order filter coefficients. (A full account of this has been given by Pollock, 1999.) The effect is to suppress the leakage that would otherwise occur in regions of the stop band that are remote from the regions where the transitions occur between stop band and pass band. The detriment of this approach is that it exacerbates the extent of the leakage within the transition regions (Figure 9).

The implication here seems to be that, if we wish to make sharp cuts in the frequency domain to separate closely adjacent spectral structures, then we require a filter with many coefficients and a very long data sequence to match it. However, Figures 8 and 9 are based upon the discrete-time Fourier transform, which entails an infinite sequence in the time domain and a continuous function in the frequency domain; and their appearances are liable to be misleading.
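The ripple and the effect of tapering can be reproduced in a few lines (a sketch using numpy, with 17 coefficients and a cut-off at $\pi/2$, as in Figures 8 and 9; the Blackman window is written out explicitly, and the test frequency in the stop band is our own choice):

```python
import numpy as np

# The central 17 sinc coefficients of the ideal lowpass filter (63), wd = pi/2.
m = 8
k = np.arange(-m, m + 1)
psi = np.array([0.5 if j == 0 else np.sin(np.pi / 2 * j) / (np.pi * j) for j in k])

gain = lambda coeffs, w: np.sum(coeffs * np.cos(w * k))

# Gibbs' effect: the truncated filter overshoots unity within the pass band
# and leaks (goes negative) within the stop band.
w_grid = np.linspace(0.0, np.pi, 501)
response = np.array([gain(psi, w) for w in w_grid])
assert response.max() > 1.05 and response.min() < -0.05

# Tapering with a Blackman window suppresses the leakage deep in the stop band,
# at the cost of a wider transition around the cut-off.
n = np.arange(k.size)
window = 0.42 - 0.5 * np.cos(2 * np.pi * n / (k.size - 1)) \
              + 0.08 * np.cos(4 * np.pi * n / (k.size - 1))
assert abs(gain(psi * window, 3.0)) < abs(gain(psi, 3.0))
```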
In practice, one must work with a finite number of data points that transform into an equal number of frequency ordinates. If one is prepared to work in the frequency domain, then one can impose an ideal frequency selection upon these ordinates by setting to zero those that lie within the stop band of the ideal filter and conserving those that lie within the pass band. One can ignore whatever values a continuous frequency response function might take in the interstices between these frequency ordinates, because they have no implications for the finite data sequence.

The same effects can be achieved by operations performed within the time domain. These entail a circular filter, which is applied to the data by a process of circular convolution that is akin to the process of circular correlation described and illustrated in section 3. In the case of circular convolution, the data sequence and the sequence of filter coefficients are disposed around the circle in opposite directions: clockwise and anticlockwise.

The set of $T$ coefficients of a circular filter is derived by applying the (inverse) discrete Fourier transform to a sample of the ordinates of the ideal frequency response, taken at the $T$ Fourier frequencies $\omega_j = 2\pi j/T;\, j = 0, 1, \ldots, T-1$. To understand the nature of the filter, consider evaluating the frequency response of the ideal filter at one of the points $\omega_j$. The response, expressed in


terms of the doubly-infinite coefficient sequence of the ideal filter, is
$$\psi(\omega_j) = \sum_{k=-\infty}^{\infty}\psi_k W^{jk}, \tag{64}$$
where $W^j = \exp\{-i\omega_j\} = \cos(\omega_j) - i\sin(\omega_j)$. However, since $W^q$ is a $T$-periodic function, there is $W^q = W^{q \bmod T}$. Therefore,
$$\begin{aligned}
\psi(\omega_j) &= \Bigl(\sum_{q=-\infty}^{\infty}\psi_{qT}\Bigr) + \Bigl(\sum_{q=-\infty}^{\infty}\psi_{qT+1}\Bigr)W^j + \cdots + \Bigl(\sum_{q=-\infty}^{\infty}\psi_{qT+T-1}\Bigr)W^{j(T-1)} \\
&= \psi^\circ_0 + \psi^\circ_1 W^j + \cdots + \psi^\circ_{T-1}W^{j(T-1)}, \quad\text{for } j = 0, 1, 2, \ldots, T-1.
\end{aligned} \tag{65}$$

Thus, the coefficients $\psi^\circ_0, \psi^\circ_1, \ldots, \psi^\circ_{T-1}$ of the circular filter would be obtained by wrapping the infinite sequence of sinc function coefficients of the ideal filter around a circle of circumference $T$ and adding the overlying coefficients. By applying a sampling process to the continuous frequency response function, we have created an aliasing effect in the time domain.

However, summing the sinc function coefficients is not a practical way of obtaining the coefficients of the finite-sample filter. They should be obtained, instead, by transforming the frequency-response sequence or else from one of the available analytic formulae. In the case where $\omega_d = d\omega_1 = 2\pi d/T$, the coefficients of the filter that fulfils the specification of (62) within a finite sample of $T$ points are given by
$$\phi^\circ_d(k) = \begin{cases} \dfrac{2d}{T}, & \text{if } k = 0, \\[1ex] \dfrac{\cos(\omega_1 k/2)\sin(d\omega_1 k)}{T\sin(\omega_1 k/2)}, & \text{for } k = 1, \ldots, [T/2], \end{cases} \tag{66}$$
where $\omega_1 = 2\pi/T$ and where $[T/2]$ is the integral part of $T/2$.

A more general specification, for a bandpass filter, is as follows:
$$\phi(\omega) = \begin{cases} 1, & \text{if } |\omega| \in (\omega_a, \omega_b), \\ 1/2, & \text{if } \omega = \pm\omega_a, \pm\omega_b, \\ 0, & \text{for } \omega \text{ elsewhere in } [-\pi, \pi), \end{cases} \tag{67}$$
where $\omega_a = a\omega_1$ and $\omega_b = b\omega_1$. This can be fulfilled by subtracting one lowpass filter from another to create
$$\phi^\circ_{[a,b]}(k) = \phi^\circ_b(k) - \phi^\circ_a(k) = \frac{\cos(\omega_1 k/2)\{\sin(b\omega_1 k) - \sin(a\omega_1 k)\}}{T\sin(\omega_1 k/2)} = 2\cos(g\omega_1 k)\,\frac{\cos(\omega_1 k/2)\sin(d\omega_1 k)}{T\sin(\omega_1 k/2)}. \tag{68}$$
Here, $2d = b - a$ is the width of the pass band (measured in terms of a number of sampled points) and $g = (a + b)/2$ is the index of its centre. These results have been demonstrated by Pollock (2009).

The relationship between the time-domain and the frequency-domain filtering can be elucidated by considering the matrix representation of the filter, which is obtained by replacing the generic argument $W^j = \exp\{-i\omega_j\}$ in (64) and (65) by the matrix operator $K$. This gives
$$\psi^\circ(K) = \Psi^\circ = \bar{U}\psi^\circ(D)U. \tag{69}$$
Replacing $z$ in the $z$-transform $y(z)$ of the data sequence by $K$ gives the circulant matrix $Y = \bar{U}y(D)U$. Multiplying the data matrix by the filter matrix gives
$$x(K) = X = \Psi^\circ Y = \{\bar{U}\psi^\circ(D)U\}\{\bar{U}y(D)U\} = \bar{U}\{\psi^\circ(D)y(D)\}U = \bar{U}\psi^\circ(D)UY, \tag{70}$$
where the final equality follows in view of the identity $UY = y(D)U$. Here, there is a convolution in the time domain, represented by the product $X = \Psi^\circ Y$ of two circulant matrices, and a modulation in the frequency domain, represented by the product $x(D) = \psi^\circ(D)y(D)$ of two diagonal matrices. Of these matrices, $\psi^\circ(D)$ comprises the zeros and units of the ideal frequency response, whereas $y(D)$ comprises the spectral ordinates of the data.

It should be emphasised that $\psi(\omega)$ and $\psi^\circ(\omega_j)$ are distinct functions of which, in general, the values coincide only at the roots of unity. Strictly speaking, these are the only points for which the latter function is defined. Nevertheless, there may be some interest in discovering the values that $\psi^\circ(\omega_j)$ would deliver if its argument were free to travel around the unit circle. Figure 10 is designed to satisfy such curiosity. Here, it can be seen that, within the stop band, the function $\psi^\circ(\omega)$ is zero-valued only at the Fourier frequencies. Since the data are compounded from sinusoidal functions with Fourier frequencies, this is enough to eliminate from the sample all elements that fall within the stop band and to preserve all that fall within the pass band.

In Figure 10, the transition between the pass band and the stop band occurs at a Fourier frequency. This feature is also appropriate to other filters that we might design, that have a more gradual transition with a mid point that also falls on a Fourier frequency. Figure 11 shows that it is possible, nevertheless, to make an abrupt transition within the space between two adjacent frequencies.
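Both routes can be sketched together: sampling the ideal response at the Fourier frequencies, recovering the wrapped coefficients by the inverse DFT, checking them against the analytic formula (66), and confirming that circular convolution in the time domain matches modulation in the frequency domain, as in (70). (A numpy sketch with $T = 16$ and cut-off index $d = 4$; the variable names are ours.)

```python
import numpy as np

T, d = 16, 4                      # cut-off wd = 2*pi*d/T = pi/2
j = np.arange(T)

# Sample the ideal frequency response at the Fourier frequencies, with the
# value 1/2 at the cut-off indices j = d and j = T - d.
psi_sampled = np.where((j < d) | (j > T - d), 1.0, 0.0)
psi_sampled[d] = psi_sampled[T - d] = 0.5

# The coefficients of the circular filter are the inverse DFT of the samples;
# they agree with the analytic formula (66).
phi = np.fft.ifft(psi_sampled).real
w1 = 2.0 * np.pi / T
for k in range(1, T // 2 + 1):
    analytic = np.cos(w1 * k / 2) * np.sin(d * w1 * k) / (T * np.sin(w1 * k / 2))
    assert np.isclose(phi[k], analytic)
assert np.isclose(phi[0], 2.0 * d / T)

# Filtering: circular convolution with the wrapped coefficients (the circulant
# matrix Psi) equals modulation of the spectral ordinates, as in (70).
y = np.random.default_rng(3).normal(size=T)
Psi = np.array([[phi[(s - t) % T] for t in range(T)] for s in range(T)])
x_time = Psi @ y
x_freq = np.fft.ifft(psi_sampled * np.fft.fft(y)).real
assert np.allclose(x_time, x_freq)
```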
An alternative way of generating the filtered sequence within the time domain is to create a periodic extension of the data sequence via successive replications of the data. Then, the filtered values can be generated via an ordinary linear convolution of the wrapped filter coefficients of $\psi^\circ(z)$ with the data. It can be seen that these products are synonymous with the products of the infinite sequence of sinc function coefficients with an infinite periodic extension of the data.

A modulation in the frequency domain, which requires only $T$ multiplications, is a far less laborious operation than the corresponding circular convolution in the time domain, which requires $T^2$ multiplications and $T(T-1)$

D.S.G. POLLOCK: Statistical Fourier Analysis

Figure 10. The frequency response of the 16-point wrapped filter defined over the interval [−π, π). The values at the Fourier frequencies are marked by circles and dots.
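The behaviour shown in Figure 10 can be reproduced numerically. The sketch below (using an arbitrary 16-point lowpass specification with cut-off index 4, not the paper's own code) evaluates the wrapped filter's response both at the Fourier frequencies, where the zero-one ordinates are recovered exactly, and midway between two stop-band Fourier frequencies, where the response is nonzero:

```python
import numpy as np

T, cut = 16, 4
k = np.arange(T)
psi_ord = np.where(np.minimum(k, T - k) <= cut, 1.0, 0.0)  # ideal 0/1 ordinates
c = np.fft.ifft(psi_ord).real                # wrapped filter coefficients

# Centre the time indices on zero, as in the symmetric plot of Figure 10.
t = ((k + T // 2) % T) - T // 2

def response(omega):
    # Evaluate the response off the roots of unity by letting omega vary freely.
    return np.sum(c * np.exp(-1j * omega * t))

# At the Fourier frequencies omega_j = 2*pi*j/T, the ordinates are recovered.
for j in range(T):
    assert np.isclose(response(2 * np.pi * j / T), psi_ord[j])

# Midway between two stop-band Fourier frequencies, the response is not zero.
print(abs(response(2 * np.pi * 6.5 / T)))
```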

Figure 11. The frequency response of the 17-point wrapped filter defined over the interval [−π, π). The values at the Fourier frequencies are marked by circles.

additions. To be set against the relative efficiency of the frequency-domain operations is the need to carry the data to that domain via a Fourier transform and to carry the products of the modulation back to the time domain via an inverse transformation. However, given the efficiency of the fast Fourier transform, the advantage remains with the frequency-domain operations. Also, the time-domain filter may have to be formed by applying an inverse Fourier transform to a set of ordinates sampled from the frequency response of the filter.

The ideal lowpass filter has been implemented in a program that can be downloaded from the following web address: http://www.le.ac.uk/users/dsgp1/

Various devices are available within the program for dealing with trended data sequences. A trend function can be interpolated through the data to deliver a sequence of residual deviations to which the filter can be applied. Thereafter, the filtered residuals and the trend can be recombined to produce

a trend–cycle trajectory. Alternatively, the trend can be eliminated from the data by a twofold application of the difference operator. The filter can be applied to the differenced data, and the result can be reinflated via the summation operator, which is the inverse of the difference operator. The program also uses the method of frequency-domain filtering to create an ideal bandstop filter that is able to eliminate the seasonal component of a trended econometric data sequence.

Wiener–Kolmogorov Theory

The classical Wiener–Kolmogorov filtering theory concerns a doubly infinite data sequence y(t) = {yt; t = 0, ±1, ±2, . . .}, which is the sum of a signal sequence ξ(t) and a noise sequence η(t):

y(t) = ξ(t) + η(t).   (71)

The signal and noise components are assumed to be generated by mutually independent stationary stochastic processes of zero mean. Therefore, the autocovariance generating function of the data, which may be represented by

γy(z) = γξ(z) + γη(z),   (72)

is the sum of the autocovariance generating functions of the components. On this basis, a filter may be formulated that produces a minimum mean-square-error estimate of the signal. Its transfer function is given by

ψ(z) = γξ(z)/{γξ(z) + γη(z)}.   (73)

The infinite-sample filter is a theoretical construct from which a practical implementation must be derived. A straightforward way of deriving a filter that is applicable to the finite data sequence contained in a vector y = [y0, y1, . . . , yT−1] is to replace the argument z within the various autocovariance generating functions by the matrix lag operator LT = [e1, . . . , eT−1, 0] and to replace z⁻¹ by FT = LT′. Then, the generating function γy(z) = γ0 + Σ_{τ=1}^{∞} γτ(z⁻ᵗ + zᵗ) gives rise to the matrix

Ωy = γ0 IT + Σ_{τ=1}^{T−1} γτ(FTᵗ + LTᵗ),   (74)

where the limit of the summation is τ = T − 1, since LT^q = 0 for all q ≥ T. This is an ordinary autocovariance matrix. The autocovariance matrices of the components Ωξ and Ωη are derived likewise; and then there is Ωy = Ωξ + Ωη. Using the matrices in place of the autocovariance generating functions in (72) gives rise to the following estimate x of the signal component ξ:

x = Ωξ(Ωξ + Ωη)⁻¹ y.   (75)
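Equation (75) can be sketched numerically with a hypothetical AR(1) signal observed in white noise (all parameter values, and the use of replications, are illustrative choices, not taken from the paper):

```python
import numpy as np

T, R = 64, 50                    # sample length and number of replications
rho, sig2_eta = 0.9, 0.5         # AR(1) coefficient and noise variance (assumed)

tau = np.arange(T)
gamma_xi = rho**tau / (1 - rho**2)        # AR(1) autocovariances, unit innovation variance
idx = np.abs(tau[:, None] - tau[None, :])
Omega_xi = gamma_xi[idx]                  # ordinary (Toeplitz) autocovariance matrix, as in (74)
Omega_eta = sig2_eta * np.eye(T)          # white-noise autocovariance matrix

rng = np.random.default_rng(1)
xi = np.linalg.cholesky(Omega_xi) @ rng.standard_normal((T, R))
y = xi + np.sqrt(sig2_eta) * rng.standard_normal((T, R))

# Equation (75): x = Omega_xi (Omega_xi + Omega_eta)^{-1} y, for each replication.
x = Omega_xi @ np.linalg.solve(Omega_xi + Omega_eta, y)

mse_filtered = np.mean((x - xi) ** 2)
mse_raw = np.mean((y - xi) ** 2)          # the sample noise variance
print(mse_filtered, mse_raw)              # the estimate should reduce the error
```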

It is necessary to preserve the order of the factors of this product since, unlike the corresponding z-transforms, Ωξ and Ωy⁻¹ = (Ωξ + Ωη)⁻¹ do not commute in multiplication. Such filters have been discussed in greater detail by Pollock (2000, 2001), where the matter of trended data sequences is also dealt with.

For an alternative finite-sample representation of the filter, we may replace z in (72) and (73) by K = Ū D U and z⁻¹ by K′ = K^{T−1} = Ū D̄ U. Then, the generating functions are replaced by the following matrices of circular autocovariances:

Ω◦ξ = γξ(K) = Ū γξ(D)U,   Ω◦η = γη(K) = Ū γη(D)U,

and

Ω◦y = Ω◦ξ + Ω◦η = γy(K) = Ū γy(D)U.   (76)

Putting these matrices in place of the autocovariance matrices of (75) gives rise to the following estimate of the signal:

x = Ū γξ(D){γξ(D) + γη(D)}⁻¹ U y = Ū Jξ U y.   (77)

The estimate may be obtained in the following manner. First, a Fourier transform is applied to the data vector y to give Uy. Then, the elements of the transformed vector are multiplied by those of the diagonal weighting matrix Jξ = γξ(D){γξ(D) + γη(D)}⁻¹. Finally, the products are carried back to the time domain by the inverse Fourier transform.

The same method is indicated by equation (70), which represents the application of the ideal filter. When it is post-multiplied by e0, the latter delivers the expression x = Ū ψ◦(D)U y, which can also be given the foregoing interpretation.

6. The Processes Underlying the Data

In this section, we shall discuss the relationship between the sampled data and the processes that generate it, which are presumed to operate in continuous time. If the continuous process contains no sinusoidal elements with frequencies in excess of the Nyquist value of π radians, then a perfect reconstruction of the underlying function can be obtained from its sampled values by using the sinc function sin(πt)/(πt) of Figure 3 as an interpolator.

To effect the interpolation, each of the data points is associated with a sinc function scaled by the sampled value. These functions, which are set at unit displacements along the real line, are added to form a continuous envelope. The effect of a sinc function at any integer point t is to disperse the mass of the sampled value over the adjacent points of the real line. The function has a value of unity at t and a value of zero at all other integer points. Therefore, in the process of interpolation, it adds nothing to the values at the adjacent integer points. However, it does attribute values to the non-integer points that lie in the interstices, thereby producing a continuous function from discrete data. Moreover, the data can be recovered precisely by sampling the continuous function at the integers.
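The interpolation described above can be sketched numerically (with an arbitrary random data sequence, and with a truncated range of sample points standing in for the infinite real line):

```python
import numpy as np

rng = np.random.default_rng(2)
t_samp = np.arange(-40, 41)                  # sample points on a truncated line
y_samp = rng.standard_normal(t_samp.size)    # arbitrary sampled values

def interpolate(x):
    # np.sinc(u) = sin(pi*u)/(pi*u), so each datum contributes a sinc
    # function centred on its own sample point and scaled by its value.
    return np.sum(y_samp * np.sinc(x - t_samp))

# At the integer points, every other sinc vanishes and the data are recovered.
assert np.isclose(interpolate(0.0), y_samp[40])
assert np.isclose(interpolate(5.0), y_samp[45])

# Between the integers, the interpolant fills in a continuous envelope.
print(interpolate(0.5))
```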

Only if the sampled data are free of aliasing will this process of interpolation succeed in recovering the original continuous function. The question of whether aliasing has occurred is an empirical one that relates to the specific nature of the process underlying the data. However, the models of continuous-time processes that are commonly adopted by econometricians and other statisticians suggest that severe aliasing is bound to occur.

The discrete-time white-noise process ε(t) that is the primum mobile of linear stochastic processes of the ARMA and ARIMA varieties is commonly thought to originate from sampling a Wiener process W(t), defined by the following conditions:

(a) W(0) = 0,
(b) E{W(t)} = 0, for all t,
(c) W(t) is normal,
(d) dW(s), dW(t), for all t ≠ s, are independent stationary increments,
(e) ε(t) = (1/h){W(t + h) − W(t)}, for some h > 0.
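As an illustrative simulation (not part of the paper), conditions (a)–(e) can be mimicked by cumulating fine-grained Gaussian increments and then sampling at unit intervals; the resulting increment sequence behaves as discrete white noise:

```python
import numpy as np

rng = np.random.default_rng(3)
fine = 100                                    # fine subdivisions per unit interval
n = 500                                       # number of unit intervals
dW = rng.standard_normal(n * fine) * np.sqrt(1.0 / fine)
W = np.r_[0.0, np.cumsum(dW)]                 # W(0) = 0, as in condition (a)

# Condition (e) with h = 1: the unit-interval increments of W.
eps = W[fine::fine] - W[:-fine:fine]

r1 = np.corrcoef(eps[:-1], eps[1:])[0, 1]     # lag-1 sample autocorrelation
print(eps.var(), r1)                          # variance near 1, correlation near 0
```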

The increments dW(s), dW(t) are impulses that have a uniform power spectrum distributed over the entire real line, with infinitesimal power in any finite interval. Sampling W(t) at regular intervals entails a process of aliasing whereby the spectral power of the cumulated increments gives rise to a uniform spectrum of finite power over the interval [−π, π].

An alternative approach is to imagine that the discrete-time white noise is derived by sampling a continuous-time process that is inherently limited in frequency to the interval [−π, π]. The sinc function of Figure 3, which is limited in frequency to this interval, can also be construed as a wave packet centred on time t = 0. A continuous stochastic function that is band-limited by the Nyquist frequency of π will be generated by the superposition of such functions arriving at regular or irregular intervals of time and having random amplitudes. No aliasing would occur in the sampling of such a process. (The wave packet is a concept from quantum mechanics; see, for example, Dirac (1958) and Schiff (1981). It has become familiar to statisticians via wavelet analysis; see, for example, Daubechies (1992) and Percival and Walden (2000).)

Band-Limited Processes

It is implausible to assume that a Wiener process could be the primum mobile of a stochastic process that is band limited within a subinterval of the (positive) frequency range [0, π]. Such processes appear to be quite common amongst the components of econometric time series.

Figure 12 displays a sequence of the logarithms of the quarterly series of U.K. Gross Domestic Product (GDP) over the period from 1970 to 2005. Interpolated through this sequence is a quadratic trend, which represents the growth path of the economy.


Figure 12. The quarterly sequence of the logarithms of the GDP in the U.K. for the years 1970 to 2005, inclusive, together with a quadratic trend interpolated by least squares regression.


Figure 13. The residual sequence from fitting a quadratic trend to the income data of Figure 12. The interpolated line represents the business cycle.


Figure 14. The periodogram of the residuals obtained by fitting a quadratic trend through the logarithmic sequence of U.K. GDP. The parametric spectrum of a fitted AR(2) model is superimposed upon the periodogram.


Figure 15. The sinc function wave packet ψ₃(t) = sin(πt/8)/(πt), comprising frequencies in the interval [0, π/8].

The deviations from this growth path are a combination of the low-frequency business cycle with the high-frequency fluctuations that are due to the seasonal nature of economic activity. These deviations are represented in Figure 13, which also shows an interpolated continuous function that is designed to represent the business cycle.

The periodogram of the deviations is shown in Figure 14. This gives a clear indication of the separability of the business cycle and the seasonal fluctuations. The spectral structure extending from zero frequency up to π/8 belongs to the business cycle. The prominent spikes located at the frequency π/2 and at the limiting Nyquist frequency of π are the property of the seasonal fluctuations. Elsewhere in the periodogram, there are wide dead spaces, which are punctuated by the spectral traces of minor elements of noise.

The slowly varying continuous function interpolated through the deviations of Figure 13 has been created by combining a set of sine and cosine functions of increasing frequencies in the manner of equation (1), but with the summation extending no further than the limiting frequency of the business cycle, which is π/8.

A process that is band limited to the frequency interval [0, π/8] would be derived from the random arrival of wave packets of random amplitudes, likewise band limited in frequency. The sinc function is but one example of such a packet; and it has the disadvantage of being widely dispersed in time. Its advantage is that it provides an orthogonal basis for the set of band-limited functions. The basis set is

{ψ₃(t − 8k); k = 0, ±1, ±2, . . .}   with   ψ₃(t) = sin(πt/8)/(πt),   (78)

wherein the functions are displaced, from one to the next, by 8 sample points. (Here, the subscript on ψ₃ refers to the fact that 8 = 2³ and that the interval [0, π/8] arises from the 3rd dyadic subdivision of [0, π].) The function at t = 0 is represented in Figure 15. Whereas a continuous band-limited stochastic function would be constituted by adding the envelopes of the succession of

continuous sinc functions, its sampled version would be obtained by adding the sampled ordinates that are also represented in Figure 15.

A further point concerns the autocovariances of a band-limited process. The autocovariance function of an ordinary linear stochastic process of the ARMA variety is positive definite, which corresponds to the spectral density function being everywhere nonzero. Likewise, the matrix of autocovariances is positive definite, which implies that all of its eigenvalues are positive. Therefore, it is not possible to represent a band-limited process by such means. However, it is straightforward to represent the autocovariances of a band-limited process via a circulant matrix of the form

γ(K) = Γ = Ū γ(D)U.   (79)

Here, the diagonal matrix γ(D) represents the spectral density of the process, sampled at T frequency points. The band-limited nature of the process will be reflected in the zero-valued elements of the diagonal of γ(D) and in the singularity of Γ.

Fitting ARMA Models to the Business Cycle

One might wish to fit a parametric model to the detrended data of Figure 13. For this purpose, the multiplicative seasonal ARIMA model of Box and Jenkins (1976) seems, at first, to be a natural choice. Such a model embodies a seasonal difference operator of the form (1 − z⁴) = (1 − z)(1 + z + z² + z³), of which the ordinary difference operator (1 − z) is a factor. However, the data of Figure 13 have already been detrended. Therefore, the presence of the redundant difference operator renders the model inappropriate.

A more appropriate recourse is to remove the seasonal component from the data before fitting an autoregressive model aimed at representing the business-cycle component. The seasonal component can be removed from the data of Figure 13 by nullifying the spectral ordinates at the seasonal frequencies and at some adjacent frequencies. This operation, which can be performed by the program mentioned in Section 5, is represented in Figure 14 by the highlighted bands that correspond to the stop bands of the deseasonalising filter. Also represented in Figure 14 is the parametric spectrum of a second-order autoregressive AR(2) model that has been fitted to the seasonally adjusted data.

One would expect to be able to characterise the low-frequency dynamics of the business-cycle component via a pair of conjugate complex autoregressive roots to be found within the estimated model. The arguments of such roots should characterise the angular velocity of the cycle; and their moduli should represent the degree of its damping. However, it is found that the roots in question are real-valued.
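The singularity asserted for the circulant matrix Γ of (79) can be confirmed with a small numerical sketch (the sample size and the band are arbitrary illustrative choices):

```python
import numpy as np

T = 32
idx = np.arange(T)
# gamma(D): spectral ordinates that are positive inside the band [0, pi/8]
# and zero outside it, mimicking a band-limited process.
gamma_D = np.where(np.minimum(idx, T - idx) <= T // 8, 1.0, 0.0)

# The unitary Fourier matrix U and the circulant Gamma = U-bar gamma(D) U of (79).
U = np.exp(-2j * np.pi * np.outer(idx, idx) / T) / np.sqrt(T)
Gamma = U.conj() @ np.diag(gamma_D) @ U

eigvals = np.linalg.eigvalsh(Gamma)        # Gamma is Hermitian, so real eigenvalues
print(np.linalg.matrix_rank(Gamma))        # deficient rank reveals the singularity
```

The nonzero eigenvalues of Γ are exactly the nonzero spectral ordinates, so the rank of the matrix equals the number of frequencies inside the band.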
It has been reported by Pagan (1997), amongst others, that this is the almost invariable outcome of such exercises. We might wish, nevertheless, to use an ARMA model to characterise the business-cycle component that is supported on the interval [0, π/8] and which is represented by the smooth continuous function that has been interpolated through the data of Figure 13. In general, when we wish to fit an ARMA model to a component of the data that is supported on a frequency interval [0, ωd],

where ωd < π, it is necessary to map the component onto the full frequency interval [0, π]. For this purpose, it may be necessary to synthesise a continuous function from the Fourier ordinates within the subinterval [0, ωd]. The function can then be resampled at the lesser rate of one observation in every π/ωd periods to produce data that are appropriate to the purpose.

In the case of the business-cycle component of Figure 13, which has a spectral signature that falls within the interval [0, π/8], it is sufficient to take one in eight of the data points of the anti-aliased sequence that has been obtained by subjecting the original data to an ideal lowpass filter with a cut-off at π/8. The periodogram of the resulting subsampled sequence, which has the frequency range of [0, π], will have a profile identical to that of the spectral structure that occupies the interval [0, π/8] in Figure 14.

The AR(2) model that fits these data has a pair of conjugate complex roots with an argument of θ = 1.35 radians (77°), which corresponds to a cyclical period of 37 quarters, and a modulus of ρ = 0.695. However, the evidence of the interpolated function of Figure 13 and of the periodogram of Figure 14 suggests that the estimated period is an average of the durations of a longer major cycle and a shorter minor cycle. With only 18 points available in the subsampled data, it seems that there is insufficient information to justify the presentation of the AR(4) model that would reflect these putative cycles in two pairs of conjugate complex roots.

The Recognition of the Frequency Limitation

Some reasons should be given for why the prevalence of band-limited processes has been overlooked by econometricians. One evident reason is that, in the past, econometricians have rarely examined the periodograms of their data sequences.
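The anti-aliasing and subsampling procedure described above can be sketched with a synthetic series (the frequencies and sample size below are illustrative choices, not the GDP data). A lowpass component at frequency ω survives the filtering and reappears at frequency 8ω after one point in eight is taken, while the "seasonal" element at π/2 is removed rather than aliased:

```python
import numpy as np

T = 128
t = np.arange(T)
w = 2 * np.pi * 4 / T                        # a slow cycle inside [0, pi/8]
y = np.cos(w * t) + np.cos(np.pi * t / 2)    # "business-cycle" plus "seasonal" element

# Ideal lowpass with cut-off pi/8: keep the ordinates j with w_j <= pi/8.
Y = np.fft.fft(y)
j = np.arange(T)
keep = np.minimum(j, T - j) <= T // 16       # w_j = 2*pi*j/T <= pi/8
x = np.fft.ifft(np.where(keep, Y, 0)).real   # anti-aliased sequence

sub = x[::8]                                 # one point in eight
# The subsampled series is a pure cosine at frequency 8w, inside [0, pi].
assert np.allclose(sub, np.cos(8 * w * np.arange(T // 8)))
```

Without the anti-aliasing step, the seasonal element at π/2 would be aliased onto zero frequency in the subsampled data, since cos((π/2) × 8k) = 1 for every k.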
However, the band-limited nature of a process is liable to be obscured whenever there are significant disjunctions in the periodic extension of the data at the points where the end of one replication of the sample joins the beginning of the next. Such disjunctions will give rise to a slew of spectral power that will serve to conceal the underlying structure of a band-limited process.

The presence of a barely perceptible trend in the data will be enough to cause such a disjunction. The periodic extension of a finite, linearly trending sequence will have the appearance of the serrated edge of a carpenter's saw. The spectrum of a sawtooth function has the one-over-f profile of a rectangular hyperbola, which will mask the finer spectral details. (See Hamming, 1989, for example.)

Granger (1966) has described such spectra as typical of econometric time series. His characterisation has greatly influenced the perceptions of econometricians. It has suggested to many of them that the spectral analysis of economic data is liable to be difficult, if not fruitless.

Several authors, including Granger and Hyung (2004), have observed that structural breaks within the processes generating the data can give rise to slowly decaying autocorrelations and to other properties of long-memory processes, including one-over-f periodograms that decay hyperbolically as the frequency rises. Such breaks are not dissimilar to the end-of-sample disjunctions that are

considered here. However, the latter seem to provide a more fruitful explanation for the occurrence of misleading long-memory properties.

A third reason for the failure to perceive the band-limited nature of some of these processes may be that econometricians have been unduly influenced by the prevalence of the Wiener process as a model of the forcing functions of continuous-time stochastic processes. A Wiener process, which has a frequency band of infinite width, is the antithesis of a band-limited process.

Within the computer program that is an adjunct of this paper, there are several provisions for ensuring that the data can be wrapped seamlessly around the circumference of a circle in a manner that avoids disjunctions in the periodic extension. If the data are to be detrended by the interpolation of a polynomial function, then extra weight can be given to the points at the beginning and the end of the sample to ensure that the polynomial passes through their midst. This may have the desired effect upon the residual sequence, which will be used thereafter in the analysis. If the data have already been detrended, then it may be desirable to add tapered extrapolations at both ends, which converge on a common asymptote. The extrapolations are liable to be discarded after the data have been filtered. Alternatively, if the detrended data display prominent seasonal fluctuations of a pattern that is changing over the course of the sample, then a segment may be interpolated into the circular data in which the pattern of the final year of the sample gradually morphs into the pattern of the first year.

References

Arteche, J., and P.M. Robinson, (2000), Semiparametric Inference in Seasonal and Cyclical Long Memory Processes, Journal of Time Series Analysis, 21, 1–25.

Baher, H., (1990), Analog and Digital Signal Processing, John Wiley and Sons, Chichester.
Beran, J., (1994), Statistics for Long-memory Processes, Chapman and Hall, London.

Box, G.E.P., and G.M. Jenkins, (1976), Time Series Analysis: Forecasting and Control, Revised Edition, Holden-Day, San Francisco.

Brigham, E.O., (1988), The Fast Fourier Transform and its Applications, Prentice-Hall, Englewood Cliffs, New Jersey.

Brillinger, D.R., (1975), Time Series: Data Analysis and Theory, Holt, Rinehart and Winston, New York.

Brockwell, P.J., and R.A. Davis, (1987), Time Series: Theory and Methods, Springer-Verlag, New York.

Carslaw, H.S., (1925), A Historical Note on Gibbs' Phenomenon in Fourier Series and Integrals, Bulletin of the American Mathematical Society, 31, 420–424.

Carslaw, H.S., (1930), Introduction to the Theory of Fourier's Series and Integrals, Third Edition, Macmillan, London.

Catalan, E., (1846), Recherches sur les Déterminants, Bulletin de l'Académie Royale de Belgique, 13, 534–555.

Churchill, R.V., and J.W. Brown, (1978), Fourier Series and Boundary Value Problems, Third Edition, McGraw-Hill, New York.

Daubechies, I., (1992), Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics, Philadelphia.

Davis, P.J., (1979), Circulant Matrices, John Wiley and Sons, New York.

Dirac, P.A.M., (1958), The Principles of Quantum Mechanics, Fourth Edition, Oxford University Press, Oxford.

Fishman, G.S., (1969), Spectral Methods in Econometrics, Harvard University Press, Cambridge, Massachusetts.

Fuller, W.A., (1976), Introduction to Statistical Time Series, John Wiley and Sons, New York.

Granger, C.W.J., (1966), The Typical Spectral Shape of an Economic Variable, Econometrica, 34, 150–161.

Granger, C.W.J., and Z. Ding, (1996), Varieties of Long Memory Models, Journal of Econometrics, 73, 61–77.

Granger, C.W.J., and M. Hatanaka, (1964), Spectral Analysis of Economic Time Series, Princeton University Press, Princeton, New Jersey.

Granger, C.W.J., and N. Hyung, (2004), Occasional Structural Breaks and Long Memory with an Application to the S&P 500 Absolute Stock Returns, Journal of Empirical Finance, 3, 399–421.

Granger, C.W.J., and R. Joyeux, (1980), An Introduction to Long-memory Time Series and Fractional Differencing, Journal of Time Series Analysis, 1, 15–30.

Gray, R.M., (2002), Toeplitz and Circulant Matrices: A Review, Information Systems Laboratory, Department of Electrical Engineering, Stanford University, Stanford, California.

Gray, H.L., N.F. Zhang, and W.A. Woodward, (1989), On Generalized Fractional Processes, Journal of Time Series Analysis, 10, 233–257.

Grenander, U., and M. Rosenblatt, (1957), Statistical Analysis of Stationary Time Series, John Wiley and Sons, New York.

Hamilton, J.D., (1994), Time Series Analysis, Princeton University Press, Princeton, New Jersey.

Hamming, R.W., (1989), Digital Filters, Third Edition, Prentice-Hall, Englewood Cliffs, New Jersey.

Hannan, E.J., (1960), Time-series Analysis, Methuen, London.

Hidalgo, J., (1997), Spectral Analysis for Bivariate Time Series with Long Memory, Econometric Theory, 12, 773–792.

Hosking, J.R.M., (1981), Fractional Differencing, Biometrika, 68, 165–176.

Jenkins, G.M., and D.G. Watts, (1968), Spectral Analysis and its Applications, Holden-Day, Oakland, California.

Kolmogorov, A.N., (1941a), Interpolation und Extrapolation von stationären zufälligen Folgen, Bulletin de l'académie des sciences de U.S.S.R., Ser. Math., 5, 3–14.

Kolmogorov, A.N., (1941b), Stationary Sequences in a Hilbert Space (in Russian), Bulletin of the Moscow State University, 2, 1–40.

Kreider, D.L., R.G. Kuller, D.R. Osterberg, and F.W. Perkins, (1966), An Introduction to Linear Analysis, Addison-Wesley Publishing Co., Reading, Massachusetts.

Maravall, A., and D.A. Pierce, (1987), A Prototypical Seasonal Adjustment Model, Journal of Time Series Analysis, 8, 177–193.

McCoy, E.J., and D.A. Stephens, (2004), Bayesian Time Series Analysis of Periodic Behaviour and Spectral Structure, International Journal of Forecasting, 20, 713–730.

Muir, T., (1906), The Theory of Circulants in the Historical Order of Development up to 1860, Proceedings of the Royal Society of Edinburgh, 26, 390–398.

Muir, T., (1911), The Theory of Circulants from 1861 to 1880, Proceedings of the Royal Society of Edinburgh, 32, 136–149.

Muir, T., (1915), The Theory of Circulants from 1880 to 1900, Proceedings of the Royal Society of Edinburgh, 36, 151–179.

Muir, T., (1923), The Theory of Circulants from 1900 to 1920, Proceedings of the Royal Society of Edinburgh, 44, 218–241.

Nerlove, M., D.M. Grether, and J.L. Carvalho, (1979), Analysis of Economic Time Series: A Synthesis, Academic Press, New York.

Pagan, A., (1997), Towards an Understanding of Some Business Cycle Characteristics, The Australian Economic Review, 30, 1–15.
Palma, W., (2007), Long-Memory Time Series: Theory and Methods, John Wiley and Sons, Hoboken, New Jersey.

Percival, D.B., and A.T. Walden, (2000), Wavelet Methods for Time Series Analysis, Cambridge University Press, Cambridge.

Pollock, D.S.G., (1999), A Handbook of Time-Series Analysis, Signal Processing and Dynamics, Academic Press, London.

Pollock, D.S.G., (2000), Trend Estimation and De-Trending via Rational Square-Wave Filters, Journal of Econometrics, 99, 317–334.

Pollock, D.S.G., (2001), Filters for Short Non-stationary Sequences, Journal of Forecasting, 20, 341–355.

Pollock, D.S.G., (2008), Investigating Economic Trends and Cycles, Chapter 3.1 in T.C. Mills and K. Patterson (eds.), Palgrave Handbook of Econometrics: Vol. 2, Applied Econometrics, Palgrave Macmillan, Basingstoke.

Pollock, D.S.G., (2009), The Realisation of Finite-Sample Frequency-Selective Filters, Journal of Statistical Planning and Inference, 139, 1541–1558.

Priestley, M.B., (1981), Spectral Analysis and Time Series, Academic Press, London.

Rosenblatt, M., (1985), Stationary Sequences and Random Fields, Birkhäuser, Basel.

Rozanov, Y.A., (1967), Stationary Random Processes, Holden-Day, San Francisco.

Schiff, L.I., (1981), Quantum Mechanics, McGraw-Hill Book Co., New York.

Stade, E., (2005), Fourier Analysis, John Wiley and Sons, Hoboken, New Jersey.

Titchmarsh, E.C., (1937), Introduction to the Theory of Fourier Integrals, Clarendon Press, Oxford.

Wiener, N., (1930), Generalised Harmonic Analysis, Acta Mathematica, 55, 117–258.

Wiener, N., (1941), Extrapolation, Interpolation and Smoothing of Stationary Time Series, Report on the Services Research Project DIC-6037; published in book form in 1949 by MIT Technology Press and John Wiley and Sons, New York.

Yaglom, A.M., (1962), An Introduction to the Theory of Stationary Random Functions, Prentice-Hall, Englewood Cliffs, New Jersey.

Appendix: The Spectral Representation of a Stationary Process

Consider a sample {y0, y1, . . . , yT−1} of T observations on a stationary stochastic process. These are amenable to a discrete Fourier transform, whereby they can be expressed as

yt = Σ_{j=0}^{n} {αj cos(ωj t) + βj sin(ωj t)},   (A.1)

where ωj = 2πj/T and n is the integral part of T/2. By allowing n to tend to infinity, it is possible to express a sequence of indefinite length in terms of a sum of sine and cosine functions. However, in the limit as n → ∞, the coefficients αj, βj tend to vanish; and, therefore, an alternative representation in terms of differentials is called for.

By writing αj = dA(ωj), βj = dB(ωj), where A(ω), B(ω) are step functions with discontinuities at the points {ωj; j = 0, . . . , n}, the expression (A.1) can be rendered as

yt = Σ_j {cos(ωj t)dA(ωj) + sin(ωj t)dB(ωj)}.   (A.2)

In the limit, as n → ∞, the summation is replaced by an integral to give the expression

y(t) = ∫₀^π {cos(ωt)dA(ω) + sin(ωt)dB(ω)}.   (A.3)

Here, cos(ωt) and sin(ωt), and therefore y(t), may be regarded as infinite sequences defined over the entire set of positive and negative integers.

Since A(ω) and B(ω) are discontinuous functions for which no derivatives exist, one must avoid using α(ω)dω and β(ω)dω in place of dA(ω) and dB(ω). Moreover, the integral in (A.3) is a Fourier–Stieltjes integral.

In order to derive a statistical theory for the process that generates y(t), one must make some assumptions concerning the functions A(ω) and B(ω). So far, the sequence y(t) has been interpreted as a realisation of a stochastic process. If y(t) is regarded as the stochastic process itself, then the functions A(ω), B(ω) must, likewise, be regarded as stochastic processes defined over the interval [0, π]. A single realisation of these processes now corresponds to a single realisation of the process y(t).

The first assumption to be made is that the functions dA(ω) and dB(ω) represent a pair of stochastic processes of zero mean, which are indexed on the continuous parameter ω. Thus

E{dA(ω)} = E{dB(ω)} = 0.   (A.4)

The second and third assumptions are that the two processes are mutually uncorrelated and that non-overlapping increments within each process are uncorrelated. Thus

E{dA(ω)dB(λ)} = 0 for all ω, λ,
E{dA(ω)dA(λ)} = 0 if ω ≠ λ,   (A.5)
E{dB(ω)dB(λ)} = 0 if ω ≠ λ.

The final assumption is that the variance of the increments is given by

V{dA(ω)} = V{dB(ω)} = 2dF(ω) = 2f(ω)dω.   (A.6)

It can be seen that, unlike A(ω) and B(ω), F(ω) is a continuous function that is differentiable except, at most, on a set of measure zero. The function F(ω) and its derivative f(ω) are the spectral distribution function and the spectral density function, respectively.

In order to express (A.3) in terms of complex exponentials, we may define a pair of conjugate complex stochastic processes:

dZ(ω) = (1/2){dA(ω) − i dB(ω)},
dZ*(ω) = (1/2){dA(ω) + i dB(ω)}.   (A.7)

Also, the domain of the functions A(ω), B(ω) may be extended from [0, π] to [−π, π] by regarding A(ω) as an even function, such that A(−ω) = A(ω), and by regarding B(ω) as an odd function, such that B(−ω) = −B(ω). Then there is

dZ*(ω) = dZ(−ω).   (A.8)

From the conditions of (A.5), it follows that

E{dZ(ω)dZ*(λ)} = 0 if ω ≠ λ,
E{dZ(ω)dZ*(ω)} = f(ω)dω.   (A.9)

These results may be used to re-express (A.3) as

y(t) = ∫₀^π { [(e^{iωt} + e^{−iωt})/2] dA(ω) − i [(e^{iωt} − e^{−iωt})/2] dB(ω) }
     = ∫₀^π { e^{iωt} [dA(ω) − i dB(ω)]/2 + e^{−iωt} [dA(ω) + i dB(ω)]/2 }
     = ∫₀^π { e^{iωt} dZ(ω) + e^{−iωt} dZ*(ω) }.   (A.10)

When the integral is extended over the range [−π, π], this becomes

y(t) = ∫_{−π}^{π} e^{iωt} dZ(ω),   (A.11)

which is commonly described as the spectral representation of the process y(t).
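A discrete analogue of this construction can be simulated by drawing the increments dA(ωj) and dB(ωj) as independent zero-mean normal variates with variances 2f(ωj)dω, in accordance with (A.4)–(A.6). The spectral density used below is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(7)
T = 512
n = T // 2
omega = 2 * np.pi * np.arange(n + 1) / T      # frequencies on [0, pi]
f = 1.0 / (1.25 - np.cos(omega))              # an illustrative spectral density
domega = 2 * np.pi / T

# V{dA} = V{dB} = 2 f(omega) d omega; increments mutually and serially uncorrelated.
dA = rng.normal(0.0, np.sqrt(2 * f * domega))
dB = rng.normal(0.0, np.sqrt(2 * f * domega))

t = np.arange(T)
y = (np.cos(np.outer(t, omega)) @ dA) + (np.sin(np.outer(t, omega)) @ dB)

# The sample variance of y should approximate 2 * integral of f over [0, pi].
print(y.var(), 2 * np.sum(f * domega))
```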
