Project 2 section 12

Time Series Modeling
[12-15]

In this section, we look at an example of how an MA term can account for autocorrelation that remains in the residuals.

Copy and run the code below:

data ts3;
input time x;
datalines;
1 6.62
2 7.09
3 6.75
4 7.76
5 5.1
6 3.51
7 9.84
8 8.28
9 8.73
10 10.52
11 3.21
12 2.87
13 -3.8
14 1.8
15 5.61
16 4.53
17 4.98
18 6.32
19 4.1
20 5.59
21 5.85
22 3.08
23 2.82
24 6.74
25 12.45
26 10.77
27 6.19
28 2.05
29 2.26
30 4.32
31 5.6
32 5.7
33 3.62
34 2.48
35 4.25
36 2.49
37 -1.09
38 3.04
39 -0.5
40 -0.83
41 1.86
42 4.76
43 -0.48
44 1.85
45 5.08
46 3.11
47 3.93
48 2.48
49 -1.17
50 -2.08
51 4.51
52 9.63
53 7
54 5.69
55 2.36
56 8.19
57 8.77
58 5.18
59 2.28
60 0
61 -0.09
62 3.11
63 3.7
64 2.28
65 2.79
66 3.76
67 2.29
68 2.71
69 1.95
70 6.94
71 0.36
72 -5.66
73 4.37
74 4.95
75 9.33
76 16.16
77 12.26
78 4.21
79 1.58
80 3.97
81 8.52
82 5.33
83 2.41
84 8.62
85 9.48
86 5.94
87 1.98
88 0.07
89 1.53
90 1.32
91 2.97
92 0.82
93 -1.39
94 1.06
95 7.84
96 2.75
97 1.45
98 1.85
99 4.34
100 8.47
101 4.13
102 0.27
103 -0.48
104 3.28
105 0.73
106 -4.81
107 -4.72
108 3.7
109 8.05
110 2.55
111 -1.65
112 -4.43
113 0.99
114 3.99
115 2.68
116 6.37
117 5.42
118 1.98
119 -5.02
120 -3.29
121 5.53
122 8.84
123 4.17
124 1.68
125 1.2
126 1.81
127 9.42
128 5.44
129 2.62
130 4.18
131 7.62
132 9.27
133 4.75
134 0.18
135 -0.1
136 1.28
137 3.22
138 5.31
139 4.82
140 1.67
141 3.89
142 7.17
143 6.12
144 4.98
145 3.54
146 -0.27
147 2.02
148 5.92
149 3.54
150 0.87
151 1.48
152 11.55
153 2.95
154 0.17
155 5.08
156 0.62
157 3.58
158 3.98
159 2.62
160 3.47
161 5.64
162 7.29
163 4.24
164 8.76
165 11.66
166 7.37
167 0.02
168 -0.23
169 2.54
170 3.42
171 2.92
172 -1.29
173 -7.44
174 -6
175 -3.75
176 -3.03
177 -0.1
178 4.04
179 7.12
180 9.48
181 6.52
182 -1.01
183 -1.08
184 1.39
185 1.92
186 -2.53
187 0.05
188 8.67
189 9.32
190 3.26
191 0.16
192 -0.32
193 2.98
194 7.62
195 9.2
196 6.77
197 6.27
198 6.07
;
run;

Next, run PROC ARIMA with an IDENTIFY statement to display the diagnostic plots and the Ljung-Box test for the time series:

proc arima data=ts3 ;
identify var=x;
run;
quit;

The time series plot shows the mean is fairly stable.

Differencing might not be needed.
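A formal check can back up the visual impression. The sketch below asks PROC ARIMA for augmented Dickey-Fuller tests through the STATIONARITY= option of the IDENTIFY statement. The lag orders 0, 1 and 2 are an arbitrary choice made here for illustration, not something prescribed by this tutorial.

```sas
/* Augmented Dickey-Fuller unit root tests for x.           */
/* The augmenting lag orders (0,1,2) are an assumed choice. */
proc arima data=ts3;
  identify var=x stationarity=(adf=(0,1,2));
run;
quit;
```

Small p-values in the Single Mean rows would suggest rejecting a unit root, which agrees with leaving the series undifferenced.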

The ACF and PACF show spikes at various lags:

Also, the p-values from the Ljung-Box test are significant for each of the four lag groups reported.

This indicates the data is autocorrelated. An AR or MA term can be used to model the autocorrelation.
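For reference, the white noise check reported by PROC ARIMA is the Ljung-Box statistic. In the notation below (the symbols n, m and r_k are introduced here and do not appear in the original text), n is the number of observations, m is the largest lag tested, and r_k is the sample autocorrelation at lag k:

```latex
Q_m = n(n+2) \sum_{k=1}^{m} \frac{r_k^2}{n-k}
```

Under the null hypothesis of white noise, Q_m is approximately chi-square distributed with m degrees of freedom (reduced by the number of estimated AR and MA parameters when the test is applied to residuals). A small p-value therefore means the autocorrelations up to lag m are jointly too large to be consistent with white noise.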

Let's try adding an AR term:

proc arima data=ts3 ;
identify var=x;
estimate p=1;
run;
quit;

The parameter estimate is significant at lag 1:

However, the Ljung-Box test (i.e., the white noise test) on the residuals is still significant at every lag group:

There is unexplained correlation in the residuals even after adding one AR term to the model.

Now, let's try adding two AR terms.

proc arima data=ts3 ;
identify var=x;
estimate p=2;
run;
quit;

The Ljung-Box results improve, but the p-value at lag 12 is still significant:

In general, the lag after which the ACF cuts off indicates the number of MA terms to include in the model.

In this example, the ACF plot drops off after lag 1:

This indicates the autocorrelation could possibly be explained by an MA term.
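Reading the cut-off from a plot is subjective, so it can help to cross-check with an automated order suggestion. As a minimal sketch, the MINIC option on the IDENTIFY statement tabulates an information criterion over a grid of AR and MA orders. The 0 to 3 search ranges are an assumption made here for illustration.

```sas
/* Tentative order selection by minimum information criterion. */
/* The search ranges p=(0:3) and q=(0:3) are assumed.          */
proc arima data=ts3;
  identify var=x minic p=(0:3) q=(0:3);
run;
quit;
```

The output reports the (p, q) pair with the smallest criterion value. Treat it as a hint to compare against the ACF and PACF rather than as a final answer.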

Brief Introduction to an MA Model

The MA (moving average) model expresses the time series X_t in terms of the current random error W_t and the previous error W_{t-1}.

With just one MA term, it has the following forecasting equation:

X_t = µ + W_t + ϕ_1 W_{t-1}
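The link between an MA(1) model and an ACF that cuts off after lag 1 can be worked out directly from this equation. The sketch below assumes W_t is white noise with variance σ² (a symbol introduced here), and writes γ_k and ρ_k for the autocovariance and autocorrelation at lag k:

```latex
\begin{aligned}
\gamma_0 &= \operatorname{Var}(W_t + \phi_1 W_{t-1}) = (1 + \phi_1^2)\,\sigma^2 \\
\gamma_1 &= \operatorname{Cov}(W_t + \phi_1 W_{t-1},\; W_{t-1} + \phi_1 W_{t-2}) = \phi_1 \sigma^2 \\
\gamma_k &= 0 \quad \text{for } k \ge 2 \\
\rho_1 &= \frac{\gamma_1}{\gamma_0} = \frac{\phi_1}{1 + \phi_1^2}, \qquad \rho_k = 0 \ \text{for } k \ge 2
\end{aligned}
```

Only the lag-1 autocorrelation is nonzero, which is why a single spike in the ACF points toward one MA term.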

With two MA terms, the MA model has an additional term in the equation:

X_t = µ + W_t + ϕ_1 W_{t-1} + ϕ_2 W_{t-2}
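To get a feel for these equations, one can simulate an MA(1) series and inspect its ACF. This is only an illustrative sketch: the data set name ma1_sim, the seed, the mean of 4 and the coefficient 0.7 are all made-up values, not estimates from TS3.

```sas
/* Simulate an MA(1) series: x_t = mu + w_t + phi1 * w_{t-1}.  */
/* All numeric settings below are assumed, for illustration.   */
data ma1_sim;
  call streaminit(12345);      /* fixed seed so the run is repeatable */
  mu     = 4;                  /* assumed series mean                 */
  phi1   = 0.7;                /* assumed MA coefficient              */
  w_prev = rand('normal');     /* starting value for w_{t-1}          */
  do time = 1 to 200;
    w = rand('normal');        /* current white noise error w_t       */
    x = mu + w + phi1 * w_prev;
    output;
    w_prev = w;                /* carry w_t forward to the next step  */
  end;
  keep time x;
run;

/* Inspect the ACF and PACF of the simulated series. */
proc arima data=ma1_sim;
  identify var=x nlag=12;
run;
quit;
```

With a coefficient of 0.7 the theoretical lag-1 autocorrelation is 0.7 / (1 + 0.7²) ≈ 0.47, so the sample ACF should show one clear spike at lag 1 and values near zero afterwards.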

Now let's fit a model with a single MA term to the TS3 data to see whether it explains the autocorrelation in the residuals.

proc arima data=ts3 ;
identify var=x;
estimate q=1;
run;
quit;
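When reading the parameter estimates from this run, note that PROC ARIMA writes the moving average part with a minus sign. Using θ_1 for the value SAS labels MA1,1 and a_t for the error term (both symbols are introduced here for illustration), the fitted model has the form:

```latex
X_t = \mu + a_t - \theta_1 a_{t-1}
```

Compared with the MA(1) equation used above, in which the previous error carries the coefficient ϕ_1 with a plus sign, this means ϕ_1 = -θ_1, so a negative MA1,1 estimate corresponds to a positive ϕ_1.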

The Ljung-Box results improve substantially.

The p-values are no longer significant at any lag group:

The residual ACF and PACF plots show a small spike at lag 12.

However, since the Ljung-Box test fails to reject the hypothesis that the residuals are white noise, we can conclude that the remaining autocorrelation in the residuals is not significant.

Done! We have identified the model for the data as ARIMA(0,0,1) (i.e., an MA(1) model).
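As a recap, the three candidate models tried in this section can be fitted in one PROC ARIMA step, which makes it easier to compare them side by side. This is a sketch rather than part of the original walkthrough. The NOPRINT option on IDENTIFY is added here only to suppress the repeated identification output.

```sas
/* Fit the three candidates from this section in a single run. */
proc arima data=ts3;
  identify var=x noprint;   /* skip the identification output */
  estimate p=1;             /* AR(1) */
  estimate p=2;             /* AR(2) */
  estimate q=1;             /* MA(1) */
run;
quit;
```

Each ESTIMATE statement prints its own fit statistics, including AIC and SBC, and its own autocorrelation check of residuals. Lower AIC and SBC values together with non-significant Ljung-Box p-values would support the MA(1) choice.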

In the following sections, we will learn how to do forecasting with the models we have built so far.

Note: identifying the right number of AR and/or MA terms is a complex process that relies partly on judgment.

For an in-depth tutorial on this topic, please visit this site.

Exercise

Copy and run the code below to create the EXER data set:

data exer;
input time x;
datalines;
1 1.67
2 -4.49
3 -3.23
4 3.06
5 2.94
6 4.76
7 7.23
8 7.18
9 4.92
10 5.28
11 7.15
12 4.19
13 0.66
14 5.43
15 5.6
16 7.69
17 9.53
18 2.73
19 -0.77
20 3.59
21 11.35
22 6.17
23 4.38
24 2.33
25 2.91
26 4.68
27 2.52
28 -3.58
29 0.18
30 3.76
31 4.41
32 6.73
33 8.57
34 6.82
35 0.04
36 1.48
37 2.94
38 2.66
39 -0.04
40 -0.76
41 -5.97
42 -9.08
43 0.1
44 3.31
45 3.1
46 -2.42
47 -1.62
48 -3.65
49 -2.93
50 0.68
51 3.58
52 0.57
53 4.61
54 4.37
55 3.98
56 7.02
57 2.8
58 2.1
59 6.06
60 3.27
61 -3.7
62 -0.29
63 -0.71
64 2.53
65 1.5
66 4.56
67 4.16
68 -0.02
69 0.86
70 5.18
71 6.61
72 8.9
73 14.03
74 11.18
75 5.28
76 3.17
77 6.03
78 4.12
79 1.23
80 1.6
81 3.9
82 4.81
83 6.3
84 5.63
85 2.1
86 -0.59
87 2.11
88 8.03
89 8.12
90 3.73
91 2.45
92 1.14
93 -1.7
94 -0.2
95 4.45
96 3.21
97 1.1
98 2.12
99 1.3
100 1.18
;
run;

Perform the necessary steps to identify an ARIMA model whose residuals are white noise.
