Search the site...

SASCRUNCH TRAINING
  • Home
  • Member's Area
  • How to Start
  • SAS Interface
  • Creating a Data Set
  • Practical SAS Training Course
  • SAS Certified Specialist Training Program
  • Proc SQL Course
  • Introduction to Time Series Analysis
  • SAS Project Training Course
  • Full Training / Membership
  • Sign up
  • About us
  • Contact us
  • Home
  • Member's Area
  • How to Start
  • SAS Interface
  • Creating a Data Set
  • Practical SAS Training Course
  • SAS Certified Specialist Training Program
  • Proc SQL Course
  • Introduction to Time Series Analysis
  • SAS Project Training Course
  • Full Training / Membership
  • Sign up
  • About us
  • Contact us
Time Series Modeling
​[3-10]


Properly setting up the time variable for the x-axis is crucial to your time series analysis.

Before we go ahead and create the time variable, let's take a look at how the data is structured.

If you haven't created the SALES data set, copy and run the code from the yellow box below:

Except for the very first week, the sales data is organized into weeks of five days:
Picture

If you scroll down, you will notice that some weeks are missing a day or two.

Let's scroll down to the week 3 in month 2:
Picture

There are only four days in this week. There is no sales record for Friday.
Picture

In addition, the following week is missing sales record for Monday as well.
Picture

The missing records are likely due to holidays. 

The store is closed and there is no sales record for these days.



Creating the Time Variable

Now, there are a number of ways to create the time variable.

​The simplest way to create the variable is to treat each observation as an independent time point.

Let's look at an example.
data sales2;
set sales;
time = _n_;
run;

This creates a TIME variable that goes up from one to 60:
Picture

The problem with treating each observation as an individual time point is that the time variable created does not reflect the seasonal pattern of the data.

Let's look at time 29 to 34:
Picture

Time 29 is a Monday:
Picture

In the previous section, we learned that there is a seasonal spike (i.e. higher sales) on Monday. 

When modeling the data, we would expect that there is also a spike at time 34 (which is five periods after day 29):
Picture

However, due to the missing values, time 34 is actually a Wednesday!
Picture

Treating each observation as an individual time point changes the seasonal pattern of the data.

This will affect the model that we are going to build.

A better way to create the time variable is to create a sequential date column that matches the weekday from (1=Sunday to 7=Saturday).

Let's look at an example.
data sales2;
set sales;
weekno = week + 4*(month-1);
time = day + 7*(weekno-1);
drop weekno;
run;

data time_temp;
do time = 4 to 90;
day = mod(time, 7);
if day = 0 then day = 7;
output;
end;
run;

data sales3;
merge time_temp sales2;
by time;
drop month week;
run;

The breakdown of the code can be found here.

The TIME column created begins at time 4.
Picture

It corresponds to the time point at Day 4 of the very first week.
Picture

Data are collected starting on a Wednesday.

Time 4-7 represents the sales record from Wednesday to Saturday in the very first week:
Picture

Note: Saturday is a holiday and the totalsales are set as missing.

​This row is also included in the data set:
Picture

Similarly, time 8-14 represents the second week of the data. 

It goes from (1=Sunday) to (7=Saturday):
Picture

Similarly, additional rows are added for the weekends:
Picture

The data set is now structured by weeks of 7 days (except for the very first week).

Let's scroll down to the weeks that contain holiday(s).

Time 43-49 represents the third week of month two:
Picture

The Friday in this week is a holiday. The store is closed.

​However, an additional row is added to the data set as well:
Picture

The TIME column is structured in a group of 7 days (except for the very first week).

Whether the week contains weekends or holidays, there are 7 rows of data for the week.

The benefits of creating the TIME column in such a way is that the seasonal pattern can be modeled every seven periods.

The seasonal pattern is now incorporated into the TIME column.

Exercise

Are there any significant difference in sales between the different week of the month?

Create a frequency table for the total sales for each week of the month.

In addition, fit an ANOVA model and test the difference in sales between five weeks of the month.
Next

Need some help?
Get Hint
Get Solution
Fill out my online form.
Already a member? Go to member's area.