Sentry Page Protection
Proc SQL II [8-17]
Overlaying Columns
When using the Union, Except or Intersect Set operators, SAS does not overlay the columns by the column names.
Instead, it overlays the column by its position.
This could cause major issues when not handled properly.
Let's look at an example.
Instead, it overlays the column by its position.
This could cause major issues when not handled properly.
Let's look at an example.
The DS1 and DS2 tables each contain a list of three names.
The DS1 table contains the NAME and OCCUP columns:
The DS2 table contains the NAME and GENDER column:
Note: the second column from the two tables are not the same.
One contains the occupation data and the other contains the gender information.
Now, let's concatenate the two tables using the Union Set operator.
proc sql;
select * from ds1
union
select * from ds2;
quit;
select * from ds1
union
select * from ds2;
quit;
Although the second columns from the two tables do not share the same column name, they are overlaid into one column!
SAS does not overlay the column by the column names when using the Union Set operator.
The same is also true for the Except and Intersect operators.
Instead, SAS overlays the column by its position in the table.
The second column from one table will overlay the second column from the other table, regardless of the column name.
The column name from the second table will be ignored.
CORR Keyword
If you'd like the columns to be overlaid by the column name, you can use the CORR keyword after the Union Set operator.
Example
proc sql;
select * from ds1
union corr
select * from ds2;
quit;
select * from ds1
union corr
select * from ds2;
quit;
The CORR keyword tells SAS to overlay the columns by its corresponding column name.
Columns that have the same name across the two tables will be overlaid into one column.
However, the columns that don't appear in both tables will be removed from the output.
In our example, the output table contains only the NAME column. The OCCUP and GENDER columns are removed, since they don't appear in both input tables.
ALL Keyword
As we have learned in the previous sections, SAS removes the duplicate rows when concatenating the tables using the Union, Except or Intersect Set operators.
The ALL keyword can be used to tell SAS to keep the duplicate rows in the output.
Copy and run the code from the yellow box below:
As we have learned in the previous sections, SAS removes the duplicate rows when concatenating the tables using the Union, Except or Intersect Set operators.
The ALL keyword can be used to tell SAS to keep the duplicate rows in the output.
Copy and run the code from the yellow box below:
The CLASS1 and CLASS2 tables each contain a list of students.
CLASS1
CLASS2
You can clearly see that some rows, such as Tom and John, are duplicated.
Let's concatenate the tables using the Intersect Set operator.
proc sql;
select * from class1
intersect
select * from class2;
quit;
select * from class1
intersect
select * from class2;
quit;
Tom is the only student who appears in both classes. A unique row of Tom is included in the output:
Now, if you are hoping to keep all of the duplicate rows of Tom, you can use the ALL keyword after the Intersect Set operator.
Example
proc sql;
select * from class1
intersect all
select * from class2;
quit;
select * from class1
intersect all
select * from class2;
quit;
The two rows of Tom are now included in the output:
Exercise
Copy and run the code from the yellow box below:
Copy and run the code from the yellow box below:
The LIST1 and LIST2 tables each contain a list of three customers and their gender.
Write a Proc SQL step to concatenate the two tables using the Union Set operator. Make sure the same columns are overlaid in the output table.
Need some help?
HINT:
Use the CORR keyword to overlay the columns by the column name.
SOLUTION:
proc sql;
select * from list1
union corr
select * from list2;
quit;
* or *;
proc sql;
select name, gender from list1
union
select name, gender from list2;
quit;
Fill out my online form.