Time series analysis - ANU [PDF]

Jul 13, 2011 - ANU 2011: Time series analysis. MATLAB Exercises. 2. Introduction to time series analysis. Exercise 1.1 C

3 downloads 5 Views 105KB Size

Recommend Stories


Modulbeschreibung „Time Series Analysis“
In every community, there is work to be done. In every nation, there are wounds to heal. In every heart,

Time Series Analysis
Seek knowledge from cradle to the grave. Prophet Muhammad (Peace be upon him)

time series analysis
Ask yourself: What kind of person do you enjoy spending time with? Next

Time Series Analysis
Ask yourself: Do I feel and express enough gratitude and appreciation for what I have? Next

Time Series Analysis Ebook
Ask yourself: How can you love yourself more today? Next

Financial Time Series Analysis
You're not going to master the rest of your life in one day. Just relax. Master the day. Than just keep

PDF Time Series Analysis and Its Applications
If you feel beautiful, then you are. Even if you don't, you still are. Terri Guillemets

bibliography - ANU Repository [PDF]
Lampung dalam angka (Lampung in figure) 1981, Bandar Lampung. ___ . 1996. Lampung dalam angka (Lampung in figures) 1994-95, Bandar. Lampung. ___ . 2002. Lampung dalam angka (Lampung in figures) 2001, Bandar. Lam pun g. Breman, J. 1982. The village on

Time-Frequency analysis of biophysical time series
Ask yourself: What's one thing I would like to do more of and why? How can I make that happen? Next

Contents - ANU Press [PDF]
others, like those of Rafael Lara Martinez and Silvia Lucinda .... formed part of the volume Cuentos de. Barro, Narrativa ...... García Márquez. Some viewers may have noticed the signs on the streets of Havana in David. Bradbury's recent film, Fond

Idea Transcript


Time series analysis (July 12 - 13, 2011)

Course Exercise Booklet MATLAB function reference

ANU 2011: Time series analysis. MATLAB Exercises

1

Introduction to time series analysis Exercise 1.1 Controlling frequency, amplitude and phase........................................... 3 Exercise 1.2 Signal representation in the Frequency Domain ..................................... 4 Analysis in the Time Domain Exercise 2.1 Event Spacing of the Cave Creek runoff data......................................... 5 Exercise 2.2 Autocorrelation of the Cave Creek runoff data........................................ 6 Exercise 2.3 Autocorrelation of the SPECMAP Stack ................................................. 7 Signal Filtering Exercise 3.1 Removing long-term trends from the Mauna data set............................. 8 Exercise 3.2 Calculating the first difference curve of the Mauna data......................... 9 Exercise 3.3 Using a moving average........................................................................ 10 Exercise 3.4 Filtering in the frequency domain .......................................................... 11 The Frequency Domain and Spectral Analysis Exercise 4.1 Calculation of a periodogram ................................................................ 12 Exercise 4.2 Time in the Frequency domain.............................................................. 13 Exercise 4.3a Trends in the Frequency domain......................................................... 14 Exercise 4.3b Trends in the Frequency domain(2) .................................................... 15 Exercise 4.4 Processing unequally spaced data........................................................ 16 Exercise 4.5 The CLEAN algorithm............................................................................ 17 Exercise 4.6 White noise in the Frequency Domain .................................................. 18 Exercise 4.7 Welch-Overlapped-Segment-Averaging ............................................... 19 Exercise 4.8 Red noise in the Time and Frequency Domains ................................... 20 The Time-Frequency Plane Exercise 5.1 A Stationary Signal ................................................................................ 21 Exercise 5.2 A nonstationary signal ......................................................................... 222 Exercise 5.3 Evolutionary spectral analysis of a nonstationary signal..................... 233 Exercise 5.4: Evolutionary spectral analysis of the ODP677 18O record ............... 244 PCA / EOF Analysis Exercise 7.1 PCA of a collection of time series…………………………………………25 Exercise 7.2 PCA and the “Hockey Stick"……………………………………………….26 Correlation of two time series Exercise 8.1 Correlation and autocorrelation……………………………………………27 Exercise 8.2 A semi-emperical ice volume model………………………………………28

ANU 2011: Time series analysis. MATLAB Exercises

2

Exercise 1.1 Controlling frequency, amplitude and phase. The first task of this exercise is to calculate and plot a sinusoid with a given frequency. To do this we need to make time points, define the frequency of the sinusoid and then calculate the cycle itself. >> time=[0:1:800]’; >> f=0.01 >> signal=sin(2.*pi.*time.*f); >> plot(time,signal)

Time points between 0 and 800 with a spacing of 1 Define the frequency of the sinusoid as 0.01 Calculate the sinusoid Plot the sinusoid against time

Now we can repeat the calculation but also define the amplitude of the sinusoid. >> time=[0:1:800]’; >> f=0.01 >> A=2.5 >> signal=sin(2.*pi.*time.*f).*A; >> plot(time,signal)

Time points between 0 and 800 with a spacing of 1 Define the frequency of the sinusoid as 0.01 Define the amplitude of the sinusoid as 2.5 Calculate the sinusoid Plot the sinusoid against time

Finally, calculate a sinusoid with a defined phase. >> time=[0:1:800]’; Time points between 0 and 800 with a spacing of 1 >> f=0.01 Define the frequency of the sinusoid as 0.01 >> phi=pi./2 Define the phase of the sinusoid as /2 >> signal=sin(2.*pi.*time.*f+phi); Calculate the sinusoid >> plot(time,signal) Plot the sinusoid against time It is possible to calculate three separate sinusoids with the same frequencies as the main Milankovitch cycles. These sinusoids can then be added together and plotted. >> time=[0:1:800]’;

Time points between 0 and 800 with a spacing of 1

>> fe=1./100; >> eccen=sin(2.*pi.*time.*fe);

Define the eccentricity frequency Calculate a sinusoid with frequency fe

>> fo=1./41; >> obliq=sin(2.*pi.*time.*fo);

Define the obliquity frequency Calculate a sinusoid with frequency fo

>> fp=1./21; >> prec=sin(2.*pi.*time.*fp);

Define the precession frequency Calculate a sinusoid with frequency fp

>> final=eccen+obliq+prec; >> plot(time,final);

Sum the 3 sinusoids together to make a final signal Plot the signal against time

Does your signal which contains the Milankovitch periods look like known patterns of orbital scale climate change?

ANU 2011: Time series analysis. MATLAB Exercises

3

Exercise 1.2 Signal representation in the Frequency Domain

The aim of this exercise is for you to investigate how a signal made from a mixture of sine waves is represented in the Frequency Domain (in other words what happens when you perform the Fourier transform. Using MATLAB calculate 3 different sine waves (i.e. with different amplitudes, frequencies and phases) and add them together to give a composite signal. 1. Produce a plot which shows your composite signal 2. Use fft_plot to produce a plot that shows the frequency spectrum of your composite signal. So, the next part is the important bit; 3. Provide an interpretation of the frequency spectrum that you obtained from your composite signal. Think about the positions of the peaks in the diagram and their relative heights.

ANU 2011: Time series analysis. MATLAB Exercises

4

Exercise 2.1 Event Spacing of the Cave Creek runoff data The time series we will study shows the monthly amount of runoff water (measured in inches) from Cave Creek in Kentucky. Use the event spacing method to estimate the period of each runoff cycle. In MATLAB we first have to load the data into the memory. You can load the Cave Creek data using: >> load cave_creek >> whos

This will tell you which variables are in the memory

At the bottom of the screen you will see some text which looks like: Name

Size

Bytes

Class

month runoff

216x1 216x1

1728 1728

double array double array

Grand total is 432 elements using 3456 bytes This shows us that the variables month and runoff have been loaded into the memory. We can now make a plot of the data: >> plot(month,runoff) >> xlabel('month') >> ylabel('runoff (inches)')

Add a label to the x-axis Add a label to the y-axis

We will use the function ginput.m to mark the points on the chart which we think are events: >> [x,y]=ginput

Obtain data from the figure

You can now use the mouse to record the positions that you think correspond to events in the runoff data. When you have clicked on all the points of interest, press Enter to stop the function and return to the MATLAB prompt. The variables x and y now correspond to the coordinates of the positions you clicked on. You can mark these points on your chart in the following way: >> hold on >> plot(x,y,’o’)

This allows you to add more data to the existing plot. Add the click positions to the plot as circles.

We just need to find the difference between the values in x, in MATLAB this is simple: >> x_diff=diff(x) >> x_mean=mean(x_diff); mean is the command for calculating a mean >> x_std=std(x_diff); std is the command for calculating a standard deviation The event spacing of the Cave Creek runoff data = ± months What is your interpretation of the mean event spacing that you calculated?

ANU 2011: Time series analysis. MATLAB Exercises

5

Exercise 2.2 Autocorrelation of the Cave Creek runoff data The time series we will study shows the monthly amount of runoff water (measured in inches) from Cave Creek in Kentucky. Use the event spacing method to estimate the period of each runoff cycle. In MATLAB we first have to remove any data which might still be in the memory using the clear command and then load the Cave Creek data; >> clear all, close all >> load cave_creek

Remove all data from the memory Load the data set

Our first task is to detrend the data. To do this we can use the function detrend_signal.m >> runoff_d=detrend_signal(month,runoff,1); This will produce the detrended data runoff_d, which has had a straight-line trend removed. A figure is produced by the function that shows the original data, the fitted line and the detrended data. The data is now prepared for the autocorrelation. To do the analysis we use the function autocorr.m >> [rt,lags]=autocorr(runoff_d); >> figure >> plot(lags,rt) >> xlabel('Lag number'); >> ylabel('r');

This function has two outputs rt and lags Make a new figure window Plot the final autocorrelogram Add a label to the x-axis Add a label to the y-axis

Again, here is the important part, what is your interpretation of the autocorrelogram and how does it compare to the Event Spacing method?

ANU 2011: Time series analysis. MATLAB Exercises

6

Exercise 2.3 Autocorrelation of the SPECMAP Stack

Load the SPECMAP file into MATLAB, you will find it contains two variables; age and data (the units of age are ‘ka’ and the data units are ‘normalised oxygen isotope’). 

Use MATLAB to produce a plot of the SPECMAP record. Add appropriate labels to both the x and the y axes.



Detrend the SPECMAP signal.



Perform an autocorrelation including significance levels on the detrended series.



Make an interpretation of the autocorrelation in terms of known climate variation.

ANU 2011: Time series analysis. MATLAB Exercises

7

Exercise 3.1 Removing long-term trends from the Mauna data set We can use the function detrend_signal to remove long-term change from the Mauna data and keep only the higher frequency variation. >> load mauna

load the data into MATLAB

There are two variables time (units of days) and mauna (CO2 data, units of ppm) First, try removing a simple linear trend (a straight line) of the form: y  a1 x  a 0 >> xc=detrend_signal(time,mauna,1)

In this case the 1 tells MATLAB to remove a 1st order polynomial from the data, i.e. a straight line. Next, try removing a quadratic function of the form: y  a 2 x 2  a1 x  a 0 >> xc=detrend_signal(time,mauna,2)

In this case the 2 tells MATLAB to remove a 2nd order polynomial from the data, i.e. a quadratic function. Finally, try removing a cubic function of the form: y  a3 x 3  a 2 x 2  a1 x  a 0 >> xc=detrend_signal(time,mauna,3)

In this case the 3 tells MATLAB to remove a 3rd order polynomial from the data, i.e. a cubic function.

ANU 2011: Time series analysis. MATLAB Exercises

8

Exercise 3.2 Calculating the first difference curve of the Mauna data Calculate the first difference curve for the Mauna CO2 data. Then plot both the original data and the first-difference curve. >> load mauna

load the data into MATLAB

There are two variables time (units of days) and mauna (CO2 data, units of ppm) >> xd=diff(mauna); Calculate the first-difference of the CO2 data >> xt=(diff(time)./2)+time(1:229); Calculate the mid-points of the time data Now we can start the plotting, we want to plot two sets of axes in one window, so we can use the subplot command. >> subplot(2,1,1) This tells MATLAB to set the figure window into a 2 x 1 matrix of plotting areas, and chooses the 1st area to be active. >> plot(time,mauna) >> xlabel('Time (yrs)') >> ylabel('CO2 content (ppm)')

Plot the original data Puts a title onto the x-axis Puts a title onto the y-axis

Now we want to make the second plot so we give the subplot command again, but now choose the 2nd area to be active. >> subplot(2,1,2) >> plot(xt,xd) >> xlabel('Time (yrs)') >> ylabel('First Difference (ppm)')

ANU 2011: Time series analysis. MATLAB Exercises

Plot the first-difference data Puts a title onto the x-axis Puts a title onto the y-axis

9

Exercise 3.3 Using a moving average The MATLAB file porosity contains normalised porosity data from core GeoB4311-02 taken from the equatorial Atlantic. Use the maverage function to produce a smooth record of the data. What is a good smoothing window to use? >> load porosity

load the porosity data

First look at the data by making a simple plot >> plot(age,porosity,'g') >> xlabel('Age (kyr)') >> ylabel('Normalised porosity') >> hold on

This plots the data as a green line Add a x-axis title Add a y-axis title Allow lines to be added to the plot

You can read the instructions on the use of maverage. One required input is a matrix that describes the moving average window to be used. For example; >> f=[1 1 1 1 1] >> f=[1 3 5 3 1]

5pt moving average with equal weights 5pt weighted moving average

>> [Xout,Yout]=maverage(age,porosity,f); >> plot(Xout,Yout,'r')

Calculate the smoothed data Plot the smoothed data as a red line

If you want to try different smoothing windows in a new plot give the command; >> figure this will produce a new plot window in which you can plot the new data.

ANU 2011: Time series analysis. MATLAB Exercises

10

Exercise 3.4 Filtering in the frequency domain Use the function filter_signal to filter the normalised porosity from core GeoB4311. Use the subplot command to make plots which compare the filtered porosity record to the SPECMAP stack. The normalised oxygen isotope data for the SPECMAP stack can be found in the MATLAB file SPECMAP.

ANU 2011: Time series analysis. MATLAB Exercises

11

Exercise 4.1 Calculation of a periodogram In this exercise we will construct two signals with different frequencies and amplitudes and determine their variance. The signals are then combined together and the periodogram (unsmoothed variance spectrum) is calculated using the function pdg. We can then check that the variance returned in the spectrum is the same as in the individual input components. >> time=[0:1:1000]'; >> f1=1./100; >> f2=1./40 >> data1=sin(2.*pi.*time.*f1); >> data2=sin(2.*pi.*time.*f2);

Form a time array between 0 and 1000 Frequency of the first signal Frequency of the second signal Calculate the first sinusoid Calculate the second sinusoid

Now we have the sinusoids but they have equal amplitudes and equal variances. If we change the amplitude then that will also change the variance. >> data1=data1.*4.5; >> data2=data2.*7.0; >> v1=var(data1) >> v2=var(data2)

Set the amplitude of the first signal to 4.5 Set the amplitude of the second signal to 7.0 Calculate the variance of the first signal Calculate the variance of the second signal

>> input_data=data1+data2;

Combine the signals to give the final time series

>> subplot(2,1,1) >> plot(time,data1,time,data2) >> xlabel('Time') >> ylabel('Signal') >> subplot(2,1,2) >> plot(time,input_data) >> xlabel('Time') >> ylabel('Signal')

Create a plot in the upper half of the figure Plot the individual sinusoids Add a x-axis title Add a y-axis title Create a plot in the lower half of the figure Plot the final input time series Add a x-axis title Add a y-axis title

The final part of this exercise is to calculate the periodogram and plot the results. >> [f,Pyy]=pdg(time,input_data); >> figure >> plot(f,Pyy) >> xlim([0 0.05]) >> xlabel('Frequency') >> ylabel('Variance')

Calculate the periodogram Make a new figure window Plot frequency against the variance Set the maximum plotted frequency to 0.05 Add a x-axis title Add a y-axis title

You should now be able to compare the peak heights in the periodogram (i.e. the variances of the different spectral components) to the variances you calculated for the individual input sinusoids, are they the same?

ANU 2011: Time series analysis. MATLAB Exercises

12

Exercise 4.2 Time in the Frequency domain We can compare the frequency spectra of the original SPECMAP stack and a modified version. These signals are basically the opposite of each other, but how are their frequency spectra different? >> clear all >> close all >> load SPECMAP2

Remove all existing data from the memory Close all the existing figure windows Load the data into the memory

There should now be 3 variables in MATLAB’s memory. The first is age, then data1 which is the normal SPECMAP record and data2 which is the flipped and reversed stack. First you can plot the data to check they are okay. >> subplot(2,1,1) >> plot(age,data1); >> subplot(2,1,2) >> plot(age,data2);

Create a plot in the upper half of the figure Plot the original SPECMAP data Create a plot in the lower half of the figure Plot the reversed SPECMAP data

Now a periodogram can be calculated for each signal and they can be compared in the frequency domain. >> [f1,Pyy1]=pdg(time,data1) >> figure >> subplot(2,1,1) >> plot(f1,Pyy1) >> xlabel('Frequency') >> ylabel('Variance (original)')

Periodogram of the original SPECMAP data Create a new plot window Create a plot in the upper half of the figure Plot the periodogram Add a x-axis title Add a y-axis title

>> [f2,Pyy2]=pdg(time,data2) >> subplot(2,1,2) >> plot(f2,Pyy2) >> xlabel('Frequency') >> ylabel('Variance (flipped)')

Periodogram of the reversed SPECMAP data Create a plot in the lower half of the figure Plot the periodogram Add a x-axis title Add a y-axis title

You should now be able to compare the periodograms of the two signals. What differences can you see?

ANU 2011: Time series analysis. MATLAB Exercises

13

Exercise 4.3a Trends in the Frequency domain Using the raw Mauna Loa CO2 data we can investigate the effect long-term trends have in the frequency domain >> clear all >> close all >> load mauna

Remove all existing data from the memory Close all the existing figure windows Load the Mauna Loa data into the memory

There are two variables in the data set, time (measured in days) and mauna which is the CO2 data (measured in ppm). >> subplot(2,1,1) >> plot(time,mauna); >> xlabel('Time (days)') >> ylabel('CO2 cotent (ppm)')

Create a plot in the upper half of the figure Plot the original CO2 data

>> subplot(2,1,2) >> [f,Pyy]=pdg(time,mauna) >> plot(f,Pyy); >> xlabel('Frequency (1/days)') >> ylabel('Variance')

Create a plot in the lower half of the figure Periodogram of the CO2 data Plot the periodogram

What structure do you see in the periodogram?

ANU 2011: Time series analysis. MATLAB Exercises

14

Exercise 4.3b Trends in the Frequency domain(2) You can now try detrending the Mauna Loa CO2 data using the function detrend_signal. What effect does the detrending have on the periodogram. >> xc=detrend_signal(time,mauna)

Detrend the data with a 1st order polynomial

We can no repeat the previous part of the exercise, but using the detrended rather than the raw data. >> figure >> subplot(2,1,1) >> plot(time,xc); >> xlabel('Time (days)') >> ylabel('Detrended signal')

Make a new figure window Create a plot in the upper half of the figure Plot the detrended CO2 data

>> subplot(2,1,2) Create a plot in the lower half of the figure >> [f,Pyy]=pdg(time,xc) Periodogram of the detrended data >> plot(f,Pyy); Plot the periodogram >> xlabel('Frequency (1/days)') >> ylabel('Variance') How has the structure of the periodogram changed now the data has been detrended?

ANU 2011: Time series analysis. MATLAB Exercises

15

Exercise 4.4 Processing unequally spaced data If we have data that are unequally spaced in the time domain then we must interpolate them onto an equally spaced time axis before calculating a traditional periodogram. In the file interpolation there are two versions of the same signal, one equally space and one unequally spaced in time. >> load interpolation The variables time_reg and data_reg make up the equally spaced data set, whilst time_irreg and data_irreg are the unequally spaced data set. First we can plot the signal. >> subplot(3,1,1) >> plot(time_reg,data_reg); >> xlabel('Time') >> ylabel('Signal')

Activate the top third of the plot window Plot the regularly spaced signal

You will see this is a complicated signal and to understand it we should study the periodogram of the equally space data. >> [f,Pyy]=pdg(time_reg,data_reg); >> subplot(3,1,2) >> plot(f,Pyy) >> xlabel('Frequency') >> ylabel('Variance')

Periodogram of the regularly spaced signal Activate the middle third of the plot window Plot the Periodogram

Examination of the periodogram shows that the signal is made up of 8 sinusoids with equal amplitudes and frequencies at spacings of 0.05. To calculate a periodogram for the irregularly spaced signal we must interpolate the data onto a regularly spaced time array. The simplest way to do this is to interpolate it onto the original time_reg points. >> data_interp=interp1(time_irreg,data_irreg,time_reg,'linear') The above command performs a one-dimensional linear interpolation of the irregular signal (time_irreg and data_irreg) onto the regular time array (time_reg). We can now use the original time_reg points with the data_interp array to calculate a periodogram for the interpolated signal. >> [f,Pyy]=pdg(time_reg,data_interp); Periodogram of the interpolated signal >> subplot(3,1,3) Activate the bottom third of the plot window >> plot(f,Pyy) Plot the Periodogram >> xlabel('Frequency') >> ylabel('Variance') What effect does the interpolation appear to have on the signal?

ANU 2011: Time series analysis. MATLAB Exercises

16

Exercise 4.5 The CLEAN algorithm The CLEAN algorithm is just one method where frequency spectra can be calculated directly from unequally spaced data without a need for interpolation. To apply CLEAN to the data simply give the command: >> [f,Pyy]=clean(time_irreg,data_irreg) You can see from the input variables that we are putting the irregularly spaced data directly into the algorithm without any interpolation. >> figure >> plot(f,Pyy) >> xlabel('Frequency') >> ylabel('Variance')

Plot the CLEAN spectrum

The high frequency parts of the signal are not attenuated and CLEAN has successfully produced an accurate spectrum for the signal.

¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ There are a number of alternative methods available for spectral analysis of unevenly spaced data. A few references are: Roberts D.H., Lehar J., Dreher J.W., (1987) Time Series Analysis with Clean - I Derivation of a Spectrum. Astronomical J. 93, (4) 968 J.D. Scargle, (1982) Studies in astronomical time series analysis II. Statistical aspects of spectral analysis of unevenly sampled data. Astrophysical J. 263, 835853. And with specific reference to palaeoclimatic data:

Schulz, M. and Stattegger, K. (1997): SPECTRUM: Spectral analysis of unevenly spaced paleoclimatic time series. Comput. Geosci., 23, 929-945. Heslop D. and Dekkers M.J. (2002). Spectral analysis of unevenly spaced climatic time series using CLEAN: signal recovery and derivation of significance levels using a Monte Carlo simulation, Phys. Earth Planet. Inter., 130, 103-116.

ANU 2011: Time series analysis. MATLAB Exercises

17

Exercise 4.6 White noise in the Frequency Domain We will construct 3 white noise signals with different lengths (64, 256 and 1024 data points) and study the form of their periodograms. We can generate normally distributed random numbers (with a mean = 0 and variance = 1.0) in MATLAB using the randn function. >> time1=[0:1:63]; Generate a time array ranging from 0 to 63 >> signal1=randn(1,64); 64 normally distributed random numbers for the signal >> time2=[0:1:255]; Generate a time array ranging from 0 to 255 >> signal2=randn(1,256); 256 normally distributed random numbers for the signal >> time3=[0:1:1023]; Generate a time array ranging from 0 to 1023 >> signal3=randn(1,1024);1024 normally distributed random numbers for the signal It is a good idea to first plot the signals so you have some idea what they look like. >> subplot(3,1,1) >> plot(time1,signal1)

Activate the top one-third of the plot window Plot signal1

>> subplot(3,1,2) >> plot(time2,signal2)

Activate the middle one-third of the plot window Plot signal2

>> subplot(3,1,3) >> plot(time3,signal3)

Activate the bottom one-third of the plot window Plot signal3

We can now calculate the periodogram for each signal and plot them in a new figure window. >> figure >> [f,Pyy]=pdg(time1,signal1) >> subplot(3,1,1) >> plot(f,Pyy) >> xlabel('Frequency') >> ylabel('Variance')

Make a new plot window Calculate the periodogram of signal1 Activate the top one-third of the plot window Plot the periodogram of signal1

>> [f,Pyy]=pdg(time2,signal2) >> subplot(3,1,2) >> plot(f,Pyy) >> xlabel('Frequency') >> ylabel('Variance')

Calculate the periodogram of signal2 Activate the middle one-third of the plot window Plot the periodogram of signal2

>> [f,Pyy]=pdg(time3,signal3) >> subplot(3,1,3) >> plot(f,Pyy) >> xlabel('Frequency') >> ylabel('Variance')

Calculate the periodogram of signal3 Activate the bottom one-third of the plot window Plot the periodogram of signal3

ANU 2011: Time series analysis. MATLAB Exercises

18

Exercise 4.7 Welch-Overlapped-Segment-Averaging Using the pwelch function it is possible to investigate how the power spectra of a white noise signal changes as the WOSA method is applied >> clear all Remove existing variables from the memory >> close all Close all the existing figure windows >> signal=randn(1,1024); 1024 normally distributed random numbers for the signal We can now calculate the spectra for the signal with different segment lengths (1024, 256 and 64) and then plot the results. >> [Pyy,f]=pwelch(signal,1024,[],[],1) Calculate with a 1024 point segment >> subplot(3,1,1) Activate the top one-third of the plot window >> plot(f,Pyy) Plot the spectrum >> xlabel('Frequency') >> ylabel('Power') >> [Pyy,f]=pwelch(signal,256,[],[],1) Calculate with 256 point segments >> subplot(3,1,2) Activate the middle one-third of the plot window >> plot(f,Pyy) Plot the spectrum >> xlabel('Frequency') >> ylabel('Power') >> [Pyy,f]=pwelch(signal,64,[],[],1) Calculate with 64 point segments >> subplot(3,1,3) Activate the bottom one-third of the plot window >> plot(f,Pyy) Plot the spectrum >> xlabel('Frequency') >> ylabel('Power')

Compare the frequency spectra. It is important to consider how close they are to the theoretical spectra for white noise and their relative frequency resolutions.

ANU 2011: Time series analysis. MATLAB Exercises

19

Exercise 4.8 Red noise in the Time and Frequency Domains The function AR1n can be used to generate red noise series with a given value of . We will generate 3 different series and see what they look like in both the time and frequency domains. >> clear all Remove existing variables from the memory >> close all Close all the existing figure windows >> rho=0.0 Define the lag-one autocorrelation coefficient >> [Rn1,Wn1]=AR1n(rho,2048); Create a series of length 2048 The AR1 function has two outputs; Rn1 is the red noise series and Wn1 is the white noise series that was used in the construction of Rn1. >> figure >> subplot(2,1,1) >> plot([0:1:2047],Wn1,’b’) >> hold on >> plot([0:1:2047],Rn1,’r’)

Generate a new plot window Activate the upper half of the plot window Plot the white noise as a blue line Allow more data to be added to the chart Plot the red noise as a red line

We saw in the previous exercise that to obtain a consistent spectrum we should use the WOSA method included in the function pwelch. >> [Pyy,f]=pwelch(Rn1,64,[],[],1) >> subplot(2,1,2) >> hold off >> plot(f,Pyy) >> xlabel('Frequency') >> ylabel('Power')

Calculate with a 64 point segment Activate the lower half of the plot window Switch off the hold command Plot the spectrum

Because we set  = 0 the calculated series is simply white noise and should have a flat spectrum. What happens if you have a non-zero value for the lag-one autocorrelation coefficient? Repeat the exercise with  = 0.8 and  = 0.99.

ANU 2011: Time series analysis. MATLAB Exercises

20

Exercise 5.1 A Stationary Signal First construct a vector containing the points in time, in this case starting at 0, and finishing at 3199 in intervals of 1 unit. >> time=[0:1:3199]'; We will calculate 4 sine waves with different frequencies, so first we define the frequency values: >> f1=1./400; Long period Eccentricity >> f2=1./100; Eccentricity >> f3=1./41; Obliquity >> f4=1./23; Precession

Now calculate a sine wave for each of the 4 frequencies: >> signal1=sin(2.*pi.*f1.*time); >> signal2=sin(2.*pi.*f2.*time); >> signal3=sin(2.*pi.*f3.*time); >> signal4=sin(2.*pi.*f4.*time);

Finally we add all 4 signals together to produce a composite signal: >> signal=signal1+signal2+signal3+signal4; >> plot(time,signal) Now determine the periodogram using pdg and plot the resulting frequency spectrum and set the x-axis limit between 0 and 0.1 >> [f,Pyy]=pdg(time,signal); The frequency data is given in the variable f and the power of the signal at each frequency is stored in the variable Pyy. >> figure >> plot(f,Pyy) >> xlim([0 0.1]) >> xlabel('Frequency') >> ylabel('Variance')

ANU 2011: Time series analysis. MATLAB Exercises

21

Exercise 5.2 A nonstationary signal Now we’ll construct a signal that changes through time. We will work with 4 different time segments and each one will contain a cycle with a different frequency. Finally, we will combine the segments and calculate a periodogram. Segment 1: spans 0 to 799 kyr with a frequency of 1/400 kyr-1 Segment 2: spans 800 to 1599 kyr with a frequency of 1/100 kyr-1 Segment 3: spans 1600 to 2399 kyr with a frequency of 1/41 kyr-1 Segment 4: spans 2400 to 3199 kyr with a frequency of 1/23 kyr-1 As a first step we construct our 4 different time segments: >> clear all, close all >> time1=[0:1:799]'; >> time2=[800:1:1599]'; >> time3=[1600:1:2399]'; >> time4=[2400:1:3199]';

Time segment spanning 0 to 799 Time segment spanning 800 to 1599 Time segment spanning 1600 to 2399 Time segment spanning 2400 to 3199

Then define the 4 different frequencies: >> f1=1./400; >> f2=1./100; >> f3=1./41; >> f4=1./23;

Long period Eccentricity Eccentricity Obliquity Precession

Now calculate the sine waves for each time segment: >> signal1=sin(2.*pi.*f1.*time1); >> signal2=sin(2.*pi.*f2.*time2); >> signal3=sin(2.*pi.*f3.*time3); >> signal4=sin(2.*pi.*f4.*time4);

Long eccentricity spanning time 0 to 799 Eccentricity spanning time 800 to 1599 Obliquity spanning time 1600 to 2399 Precession spanning time 2400 to 3199

Now we must combine the time segments and signal segments together. We do this using the square brackets operator, which allows us to glue existing variables together: >> time=[time1;time2;time3;time4]; This tells MATLAB to make a new variable called time that starts with the vector time1. Underneath time1 it places vector time2, then underneath that places time3 and finally places time4 at the bottom. Now we do the same procedure for the signal, plot the result and perform the spectral analysis: >> signal=[signal1;signal2;signal3;signal4]; >> plot(time,signal) >> [f,Pyy]=pdg(time,signal); >> plot(f,Pyy) >> xlim([0 0.1]) ANU 2011: Time series analysis. MATLAB Exercises

22

Exercise 5.3 Evolutionary spectral analysis of a nonstationary signal Perform evolutionary spectral analysis on the nonstationary signal you produced in Exercise 5.2. Perform the analysis with a spacing of 10 kyr and try some different window lengths (for example 200 kyr, 500 kyr, 800 kyr, 1200 kyr). e.g. >> evolpsd(time,signal,200,10); >> evolpsd(time,signal,500,10); >> evolpsd(time,signal,800,10); >> evolpsd(time,signal,1200,10); What influence does varying the window length have? What size window lengths do you need to obtain a good resolution in time and frequency?

ANU 2011: Time series analysis. MATLAB Exercises

23

Exercise 5.4: Evolutionary spectral analysis of the ODP677 18O record Load the data into MATLAB from the file ODP677 >> load odp677 There are two variables: age and data The ODP677 18O record is nonstationary, make an interpretation of its spectral content and climatic information using evolutive spectral analysis. Do the relative contributions of the different Milankovitch cycles change through time? >> evolpsd(age,data,window,spacing) adjust the window size to get the best balance between resolution time and frequency. Extra Hint: Before you start the analysis plot the data, do you think it is necessary to detrend the signal first?

ANU 2011: Time series analysis. MATLAB Exercises

24

Exercise 7.1 PCA of a collection of time series The data are stored in the file pca_climate.mat. There are 4 variables, age (a common time scale for the data), ms (magnetic susceptibility), gs (grain size) and d18O (oxygen isotopes). We’ll now plot the 3 time series. >> clear all, close all >> load pca_climate

Clear the memory and close all figures Load the data file

>> figure Create a new figure >> subplot(3,1,1) axes in 3 rows & 1 column, activate first set of axes >> plot(age,zscore(ms)) plot the standardized magnetic susceptibility data >> ylabel('Normalized Susceptibility') label the y-axis >> subplot(3,1,2) axes in 3 rows & 1 column, activate second set of axes >> plot(age,zscore(gs)) plot the standardized grain size data >> ylabel('Normalized Grain size') label the y-axis >> subplot(3,1,3) axes in 3 rows & 1 column, activate second set of axes >> plot(age,zscore(d18O)) plot the standardized oxygen isotope data >> ylabel('Normalized d18O') label the y-axis >> xlabel('Age [ka]') label the x-axis We’ll combine 3 time series which are on a common timescale into a single matrix and then standardize the columns. If the time series were not on a common timescale then they would have to be interpolated first to obtain versions that are on a common timescale. PCA is performed using the function princomp. With the results we can find out what proportion of the variance the 1st PC explains and plot the scores as a function of age. >> X=[ms,gs,d18O]; form a matrix composed of the time series >> X=zscore(X); standardize the columns of X >> [loadings,scores,latent]=princomp(X); perform the PCA >> latent=latent./sum(latent) normalize the PC contributions >> latent(1) the variance explained by the 1st PC >> figure Generate a new figure >> plot(age,scores(:,1)) plot the scores of the 1st PC >> ylabel('PC Score') label the y-axis >> xlabel('Age [ka]') label the x-axis Using the example above plot the scores of the 2nd PC as a function of age. What proportion of the total variance does the 2nd PC account for?

ANU 2011: Time series analysis. MATLAB Exercises

25

Exercise 7.2 PCA and the “Hockey Stick” We’ll run a test with a collection of red noise time series to analyze the PCA technique employed by Mann et al. (1999) and see how it compares to the classical PCA approach. To start we’ll create a collection of age points between 1400 and 2000 AD and then generate 112 red noise time series with =0.8. >> clear all, close all >> age=[1400:1:2000]'; >> Rn1=AR1n(0.8,numel(age),112); >> figure >> plot(age,Rn1) >> xlabel('Age [Yr]') >> ylabel('Red Noise input')

Create the age points Generate 112 red time series New figure Plot the time series Label the x axis Label the y axis

Now calculate the 1st principal component of the data using the classical approach where the standard deviation of each record is set to 1 and the mean is set to 0. >> A=bsxfun(@minus,Rn1,mean(Rn1)); >> A=bsxfun(@rdivide,A,std(A)); >> [coeffA,scoreA] = princomp_hs(A); >> figure >> plot(age,zscore(scoreA(:,1)),'b') >> xlabel('Age [Yr]') >> ylabel('PCA Score')

Subtract the mean of each record Divide by the std of each record Perform the PCA on the data in A New figure Plot the scores of the 1st component Label the x axis Label the y axis

Now we’ll calculate the 1st principal component of the data using the Mann approach where the standard deviation of each record is set to 1 during the period 1900-2000 AD and the mean is set to 0 during the same period. >> idx=find(age>=1900 & age> B=bsxfun(@minus,Rn1,mean(Rn1(idx,:))); Normalized the mean >> B=bsxfun(@rdivide,B,std(B(idx,:))); Normalized the standard deviation >> [coeffB,scoreB] = princomp_hs(B); Perform the PCA on the data in B >> hold on Add to the current plot >> plot(age,zscore(scoreB(:,1)),'r') Plot the scores of the 1st component >> legend('Classical PCA','Mann PCA',0) Add a legend Look at the final plot and compare the time series generated by the classical and Mann approaches. Remember, we are working with random data, that will show no preference to increasing or decreasing in the calibration period.

ANU 2011: Time series analysis. MATLAB Exercises

26

Exercise 8.1 Correlation and autocorrelation We’ll generate two random red noise time series consisting of 200 points and test the significance of their correlation with and without employing the effective sample size. >> clear all, close all

Clear the memory, close all existing figures

>> N=200; >> X=AR1n(0.95,N); >> Y=AR1n(0.99,N);

Define the length of the time series 200 point red noise series with =0.95 200 point red noise series with =0.99

>> figure >> subplot(2,1,1) >> plot(1:200,X,'k') >> ylabel('X Series')

Generate a new figure 2 x 1 subplot, activate the first plot Plot the X time series label the y-axis

>> subplot(2,1,2) >> plot(1:200,Y,'k') >> ylabel('Y Series') >> xlabel('Time')

2 x 1 subplot, activate the second plot Plot the X time series label the y-axis label the x-axis

We can now form a bivariate plot and calculate the correlation >> figure, plot(X,Y,'.') >> xlabel('X'), ylabel('Y') >> r = corrcoef(X,Y) >> r = abs(r(2,1));

Form an XY plot of the data Label the axis Calculate the correlation matrix Find the absolute value of r

First we’ll assess if the correlation is significant whilst ignoring the existence of any autocorrelation. >> t0 = r.*sqrt((N-2)./(1-r.^2)) >> t_crit0=tinv(1-0.05./2,N-2)

Calculate the t statistic Calculate the critical value of t

Is t0 > t_crit0, if so we can state that the two time series are significantly correlated at the  = 0.05 level. If not then the correlation is not significant. Now we’ll assess if the correlation is significant whilst taking the existence of autocorrelation into account. >> Neff=N.*(1-0.95*0.99)./(1+0.95*0.99) >> t1 = r.*sqrt((Neff-2)./(1-r.^2)) >> t_crit1=tinv(1-0.05./2,Neff-2)

Effective number of samples Calculate the t statistic Calculate the critical value of t

Is t0 > t_crit0, if so we can state that the two time series are significantly correlated at the  = 0.05 level. If not then the correlation is not significant.

ANU 2011: Time series analysis. MATLAB Exercises

27

Exercise 8.2 A semi-empirical ice volume model We’ll test the significance of the correlation upon which ice volume predictions are made. First we’ll load the data and plot the time series. >> clear all, close all >> load sealevel_data

Clear the memory load the data

>> temp=detrend(temp) >> rate=detrend(rate)

detrend the temperature rise detrend the rate

>> subplot(2,1,1) >> plot(temp,'.k-') >> ylabel('Rate of Change (cm/yr)')

2 x 1 plots, use the 1st plot plot the temperature rise label the y-axis

>> subplot(2,1,2) >> plot(rate,'.k-') >> ylabel('Warming above 1951-1980 mean') >> xlabel('Time')

2 x 1, use the 2nd plot plot the rate label the y-axis label the x-axis

Now test the significance of the relationship whilst not taking the autocorrelation into account. >> r = corrcoef(temp,rate); >> r = abs(r(2,1));

Calculate the correlation matrix The absolute correlation coefficient

>> N=length(rate)

Number of values in the time series

>> t0 = r.*sqrt((N-2)./(1-r.^2)) >> t_crit0=tinv(1-0.05./2,N-2)

calculate the t statistic calculate the critical value of t

Now we’ll repeat the test, but first estimating the effective number of samples based on the autocorrelation coefficients. >> r = corrcoef(temp,rate); >> r = abs(r(2,1));

Calculate the correlation matrix The absolute correlation coefficient

>> N=length(rate) >> rho_temp=ar1(temp); >> rho_rate=ar1(rate);

Number of values in the time series AR1 coefficient of temp AR1 coefficient of rate

>> Neff=N.*(1-rho_temp*rho_rate)./(1+rho_temp*rho_rate) >> t1 = r.*sqrt((Neff-2)./(1-r.^2)) >> t_crit1=tinv(1-0.05./2,Neff-2)

effective N

calculate the t statistic calculate the critical value of t

Draw conclusions concerning the validity of the model, given the values in t0, t_crit0, t1 and t_crit1.

ANU 2011: Time series analysis. MATLAB Exercises

28

Using the maverage.m function [Xout,Yout]=maverage(x,y,f) This function applies a moving average to your data Inputs x: x-axis data, this will normally be depth or time y: y-axis data, this is the signal data f: the moving average window.

Output Xout: the smoothed version of x Yout: the smoothed version of y.

Examples For a traditional 3-point moving average use: [Xout,Yout]=maverage(x,y,[1 1 1]) For a traditional 5-point moving average use: [Xout,Yout]=maverage(x,y,[1 1 1 1 1]) To apply different weights to each position just enter them into f For a 5-point weighted moving average use: [Xout,Yout]=maverage(x,y,[1 3 5 3 1]) (this applies a weight of 1 to the first point, 3 to the second point, 5 to the third point, 3 to the forth point and 1 to the fifth point). You can use any window length and weightings you want.

ANU 2011: Time series analysis. MATLAB Exercises

29

Using the filter_signal.m function [Xout,Yout]=filter_signal(x,y,f,type) This function applies a frequency domain filter to your data. Inputs x: x-axis data, this will normally be depth or time y: y-axis data, this is the signal data f = cut-off frequency, for a low-pass or high-pass filter this is a single cut-off value. For a band-pass filter you must provide two frequencies which are the minimum and maximum frequencies that will be allowed through the filter, i.e. [fmin fmax]. type: what kind of filter should the function perform you can enter: 'low' for a low-pass filter 'high' for a high-pass filter 'band' for a band-pass filter

Output Xout: the output of x. Yout: the filtered version of y.

Examples [Xout,Yout]=filter_signal(x,y,f,'low') for a low-pass filter. [Xout,Yout]=filter_signal(x,y,f,'high') for a high-pass filter. [Xout,Yout]=filter_signal(x,y,[fmin fmax],'band') for a band-pass filter.

ANU 2011: Time series analysis. MATLAB Exercises

30

Smile Life

When life gives you a hundred reasons to cry, show life that you have a thousand reasons to smile

Get in touch

© Copyright 2015 - 2024 PDFFOX.COM - All rights reserved.