Search the site...

SASCRUNCH TRAINING
  • Home
  • Member's Area
  • How to Start
  • SAS Interface
  • Creating a Data Set
  • Practical SAS Training Course
  • SAS Certified Specialist Training Program
  • Proc SQL Course
  • Introduction to Time Series Analysis
  • SAS Project Training Course
  • Full Training / Membership
  • Sign up
  • About us
  • Contact us
  • Home
  • Member's Area
  • How to Start
  • SAS Interface
  • Creating a Data Set
  • Practical SAS Training Course
  • SAS Certified Specialist Training Program
  • Proc SQL Course
  • Introduction to Time Series Analysis
  • SAS Project Training Course
  • Full Training / Membership
  • Sign up
  • About us
  • Contact us
Sentry Page Protection
Please Wait...
Data Manipulation [8-18]


Merging data sets
Picture
Another way to combine data sets is to merge them side-by-side. 

Let's take a look at an example.
Picture
Data set: PROFILE
Picture
Data set: EXAM

A tutoring school wants to find out which grade performs the best in the previous semester. 

They have the PROFILE and EXAM data set:

The PROFILE data set contains the profile of each student such as gender and grade.
Picture

The EXAM data set contains the recent exam results.
Picture

In order to analyze the relationship between grade and exam result, you must first combine the two data sets.

Example

Proc Sort Data=Profile;
By Student;
Run;

Proc Sort Data=Exam;
By Student;
Run;

Data Combined;
Merge Profile Exam;
By Student;

Run;
Picture


The PROFILE and EXAM data sets are merged into one data set. 

The COMBINED data set now contains all of the four variables: Student, Gender, Grade, and Result.


Two Steps Merging Data Sets

When merging the data sets, you can follow the 2 steps below:

Step 1: Sort each of the input data sets by the merging variable(s).

Step 2: Merge the data sets by the merging variable(s).

These will be explained in the next session.

Exercise

Copy and run the CUST and PURCHASE data set from the yellow box below:

A local retail store wants to find out which geographic area responds the best to the marketing campaign.

They have two data sets: 
  • CUST: the data set that contains the customer ID and their address (zip code)
  • PURCHASE: the data set that contains the customer ID and the recent purchase amount

Merge the two data sets by the customer ID. 

Create any data set or variable if needed.
Next

Need some help? 


HINT:
Did you include the MERGE and the BY statements in the data step?


SOLUTION:
proc sort data=cust;
by ID;
run;

proc sort data=purchase;
by ID;
run;

data Cust2;
merge cust purchase;
by id;
run;


Fill out my online form.

Already a member? Go to member's area.