Search the site...

SASCRUNCH TRAINING
  • Home
  • Member's Area
  • How to Start
  • SAS Interface
  • Creating a Data Set
  • Practical SAS Training Course
  • SAS Certified Specialist Training Program
  • Proc SQL Course
  • Introduction to Time Series Analysis
  • SAS Project Training Course
  • Full Training / Membership
  • Sign up
  • About us
  • Contact us
  • Home
  • Member's Area
  • How to Start
  • SAS Interface
  • Creating a Data Set
  • Practical SAS Training Course
  • SAS Certified Specialist Training Program
  • Proc SQL Course
  • Introduction to Time Series Analysis
  • SAS Project Training Course
  • Full Training / Membership
  • Sign up
  • About us
  • Contact us
Fill out my online form.
Task [4-4]


From the previous exercises, we know that we have to:
  • Consolidate the invoice, day, month, year and time into one variable.
  • Convert the CUSTOMERID variable from character to numeric.

Run the following code on your SAS Studio:
data france_sales_201110_clean;
 set a.france_sales_201110(rename=(customerID=CustomerID_char));
 
 CustomerID=input(CustomerID_char,best32.);
 format customerid best12.;
 
 invoice_date=mdy(invoice_mon,invoice_day,invoice_year);
 
 invoicedate=dhms(invoice_date,0,0,invoice_time);
 format invoicedate datetime.;
 
 drop CustomerID_char invoice_date invoice_mon invoice_day invoice_year invoice_time;

run;

The code above restructures the France data set into the france_sales_201110_clean daa set.

It consolidates the invoice information into one variable.

It also converts the CUSTOMERID variable into a numeric variable using the INPUT function.

Let's run a CONTENTS procedure on the newly created france_sales_201110_clean daa set.
proc contents data=france_sales_201110_clean;
​run;

There are only eight variables in the data set, which is good.

There is only one INVOICEDATE variable containing the invoice information. 

​The CUSTOMERID variable is a character variable.
Picture

Let's compare the online_retail data set with the france_sales_201110_clean again using the COMPARE procedure.
proc compare
 base = a.online_retail
 compare = france_sales_201110_clean
 listvar
 novalues;
run; 

Both of the data sets have eight variables.

Two variables contain differing attributes:
Picture

The differing attributes could be a problem. 

The second table from the COMPARE procedure displays the differing attributes in more details:
Picture

It turns out that the two variables are differing in the INFORMAT attribute.

This will not affect the data values or any analysis results and it can be ignored.

Now that the two data sets contain the same data structure, we can aggregates the ​france_sales_201110_clean data set into the online_retail data set.
data online_retail_new;
 set a.online_retail
     france_sales_201110_clean;
run; 

The online_retail_new data set is created.

It combines the original online_retail data set with the france_sales_201110_clean data set.
Picture
Go to task 5
Do you need help?
Already a member? Go to member's area.