
**Time Series Modeling**

**[6-15]**

The Partial Autocorrelation function is another concept that is essential to the understanding of time series analysis.

It measures the correlation between a variable and its own value at a given lag that is not already explained by the shorter lags.

This definition can be confusing at first. Let's look at the following example.

We will first simulate a series of data for variable X.

__At time = 0__

Let the initial value of X be 100.

*X0 = 100*

__At time = 1__

X at time 1 will be a random step away from X at time 0.

*X1 = X0 + N1*

where N1 is a random value generated from the normal distribution with µ=0 and σ=3.

For example, since X0 is 100, X1 is 100 + a randomly generated value based on the normal distribution with zero mean and a standard deviation of 3.

__At time = 2__

X at time 2 will also be based on X at time 1.

*X2 = X1 + N2*

X2 is again a random step away from X1.

__At time = t__

In general, you can define Xt as:

*Xt = Xt-1 + Nt*
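To see where this recursion leads, it can be unrolled back to the starting value. This is nothing more than repeated substitution of the formula above, written here in LaTeX notation:

$$X_t = X_{t-1} + N_t = X_{t-2} + N_{t-1} + N_t = \dots = X_0 + \sum_{i=1}^{t} N_i$$

In other words, every value of X is the starting value of 100 plus the sum of all the random steps taken so far.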

Now, let's run the code below to simulate X for 200 observations.

data ex2;

retain X 100; /* X0 = 100 */

call streaminit(222); /* fix the seed so the results are reproducible */

do time = 1 to 200;

X + round(rand('normal', 0, 3), 0.01); /* Xt = Xt-1 + Nt */

output;

end;

run;

X is simulated for 200 observations:

Let's plot the time series as well as the ACF.

proc timeseries data=ex2 plots=(series acf);

var x;

run;


The time series plot shows the movement of X:

X starts off at 100 and moves roughly between 90 and 150 over the 200 time points.

Let's look at the ACF plot:

Unlike the ACF plot that we saw in the last section, the ACF plot for X slowly decreases to zero.

Let's focus on the autocorrelation at lag 1 for now.

The autocorrelation at lag 1 is the second bar in the plot. It is very close to 1:

This indicates that X at time (t) is highly correlated with X at time (t-1).

This makes perfect sense considering how our data is simulated.

*Xt = Xt-1 + Nt*

X at time (t) is derived from X at the previous time point plus a small random error.

The values of X at two consecutive time points are definitely highly correlated.
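If you would rather read the autocorrelations as numbers than estimate them from the bars, one option is to have PROC TIMESERIES write them to a data set. The sketch below assumes the EX2 data set created earlier; ACF_OUT is a made-up data set name used only for this illustration, and the choice of 5 lags is arbitrary.

```sas
/* Write the first few autocorrelations of X to a data set.      */
/* ACF_OUT is a hypothetical name chosen for this illustration.  */
proc timeseries data=ex2 outcorr=acf_out;
   var x;
   corr lag n acf / nlag=5;   /* lag number, pair count, autocorrelation */
run;

/* List the stored values; lag 0 is always 1 */
proc print data=acf_out noobs;
run;
```

The row for lag 1 should match the height of the second bar in the ACF plot.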

Now, let's look at the autocorrelation at lag 2.

The autocorrelation at lag 2 is also very high.

This indicates that X at time (t) is highly correlated with X at time (t-2).

This is a little strange!

Our Xt is derived from:

*Xt = Xt-1 + Nt*

Xt is derived directly from Xt-1. Xt-2 does not appear in the formula at all.

The high autocorrelation at lag 2 does not quite make sense.

It turns out that the high correlation between Xt and Xt-2 comes from the correlation between Xt-1 and Xt-2.

We have:

*Xt = Xt-1 + Nt*

and

*Xt-1 = Xt-2 + Nt-1*

As a result, Xt is indirectly correlated with Xt-2.
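Substituting the second equation into the first makes the indirect link explicit:

$$X_t = (X_{t-2} + N_{t-1}) + N_t = X_{t-2} + N_{t-1} + N_t$$

Xt-2 only reaches Xt by passing through Xt-1. Once Xt-1 is known, all that is left of Xt is the random step Nt, which is generated independently of Xt-2.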

**Partial Autocorrelation Function (PACF)**

The PACF shows the partial autocorrelation of a variable with itself at lag k, that is, the part of the correlation that is NOT explained by the shorter lags.
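For readers who prefer a formula, one common way to write this down is sketched below. The symbols α(k), for the partial autocorrelation at lag k, and the hats, for best linear predictions, are notation assumed here rather than anything that appears in the SAS output. The definition is usually stated for a stationary series.

$$\alpha(1) = \operatorname{Corr}(X_t,\, X_{t-1})$$

$$\alpha(k) = \operatorname{Corr}\big(X_t - \hat{X}_t,\; X_{t-k} - \hat{X}_{t-k}\big), \quad k \ge 2$$

Here $\hat{X}_t$ and $\hat{X}_{t-k}$ are the best linear predictions of $X_t$ and $X_{t-k}$ from the in-between values $X_{t-1}, \dots, X_{t-k+1}$. At lag 2, for example, the only in-between value is $X_{t-1}$.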

Let's first plot the PACF plot using the TIMESERIES procedure.

proc timeseries data=ex2 plots=(pacf);

var x;

run;


The PACF plot is shown below:

The partial autocorrelation at lag 1 is still very high.

There is a high correlation between Xt and Xt-1, as we have seen from the ACF plot.

However, unlike the ACF plot, the partial autocorrelation drops off immediately at lag 2:

This indicates that the correlation between Xt and Xt-2 that is NOT explained by the correlation between Xt and Xt-1 is very small.

Xt is correlated with Xt-2 but it is mostly because of Xt-1.
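One way to check this numerically is to compute the partial correlation directly. The sketch below assumes the EX2 data set from above; the names EX2_LAGS, X_LAG1 and X_LAG2 are made up for this illustration. It builds the lagged columns with the LAG functions and then asks PROC CORR for the correlation between X and its lag-2 value after controlling for the lag-1 value.

```sas
/* Add lagged copies of X (hypothetical column names).            */
/* The first one or two rows have missing lags and are skipped    */
/* by PROC CORR when it computes the partial correlation.         */
data ex2_lags;
   set ex2;
   x_lag1 = lag1(x);   /* X at time t-1 */
   x_lag2 = lag2(x);   /* X at time t-2 */
run;

/* Correlation of Xt with Xt-2 after removing the effect of Xt-1 */
proc corr data=ex2_lags;
   var x;
   with x_lag2;
   partial x_lag1;
run;
```

The value will not necessarily match the lag-2 bar in the PACF plot exactly, since PROC TIMESERIES derives its estimate from the sample autocorrelations, but it should be similarly small.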

Using the ACF and PACF plots together, we can better understand how the variable is correlated with itself across different time points.


**Exercise**

If you haven't created the SALES3 data set, copy and run the code from the yellow box below:

Plot the PACF for the TOTALSALES column.

Do you see any spikes on the PACF plot?
