Sentry Page Protection
Proc SQL II [9-17]
Outer Union Set Operator
The Outer Union Set operator is the fourth Set operator in Proc SQL.
It is very different from the other three that we have discussed.
It concatenates the tables by selecting the rows from each input table.
However, duplicate rows are not removed and columns are not overlaid by default.
Let's look at some examples.
It is very different from the other three that we have discussed.
It concatenates the tables by selecting the rows from each input table.
However, duplicate rows are not removed and columns are not overlaid by default.
Let's look at some examples.
We have seen the CUST1 and CUST2 tables from previous sections.
Each of the CUST1 and CUST2 tables contain three customers.
CUST1
CUST2
Now, let's concatenate the two tables using the Outer Union Set operator.
proc sql;
select * from cust1
outer union
select * from cust2;
quit;
select * from cust1
outer union
select * from cust2;
quit;
The two tables are concatenated into one.
However, it looks quite different than the other concatenated tables we have seen in previous sections.
First of all, the duplicate rows are not removed:
Second, none of the columns are overlaid, even when the columns have the same name across the two tables.
Overlapping Columns
The Outer Union Set operator is a different beast than the other set operators.
However, you can still use the CORR keyword to overlap the columns with the same name.
Example
proc sql;
select * from cust1
outer union corr
select * from cust2;
quit;
select * from cust1
outer union corr
select * from cust2;
quit;
The CORR keyword is added after the Outer Union Set operator.
The columns with the same name across the two tables will now be overlaid:
Note 1: using the Outer Union operator with the CORR keyword (such as above) will give you the same result as using the SET statement in the data step.
Example
Data Cust;
set cust1 cust2;
Run;
set cust1 cust2;
Run;
The code above will give you the same result as using the Outer Union operator with the CORR keyword.
Using the Proc SQL technique usually requires more computing power.
However, it could be more flexible than using the SET statement when dealing with the more complex situations that involve multiple tasks to be done within a single step.
Note 2: the ALL keyword cannot be used with the Outer Union Set operator.
As we have learned in the last section, the ALL keyword is used to tell SAS to NOT remove duplicate rows.
By default, the Outer Union Set operator keeps the duplicate rows.
Using the ALL keyword along with the Outer Union Set operator will result in an error.
Exercise
Copy and run the code from the yellow box below:
Copy and run the code from the yellow box below:
The TEMP1 and TEMP2 tables each contain 4000 rows.
Use both the SET statement and the Outer Union operator to concatenate the two tables.
Which method is faster?
Need some help?
HINT:
Use the CORR keyword to overlay the columns by the column name.
SOLUTION:
** Data set method **;
data temp;
set temp1 temp2;
run;
** Proc SQL method **;
proc sql;
select * from temp1
outer union corr
select * from temp2;
quit;
** The SET statement takes less than 0.01 seconds to complete the task. **;
** The Proc SQL method takes about 2.83 seconds to complete the same task (this varies depending on the machine). **;
** The SET statement runs much faster for these particular tables. **;
Fill out my online form.