Sentry Page Protection
Proc SQL [14-14]
Subqueries
SELECT *
FROM sashelp.cars
WHERE msrp >
(SELECT mean(msrp)
FROM sashelp.cars);
FROM sashelp.cars
WHERE msrp >
(SELECT mean(msrp)
FROM sashelp.cars);
A subquery is a query that is nested within another query.
Let's look at an example.
Copy and run the SCHOOL table from the yellow box below if you haven't done so.
The SCHOOL table contains a list of 20 students.
Let's assume we'd like to retrieve students who have an above average height in the class.
We will first compute the average student height using Proc SQL.
Example
Proc sql;
select mean(height)
as mean
from school;
quit;
select mean(height)
as mean
from school;
quit;
The average student height is 59.66 inches:
Now, we will filter the table and display only students whose height is above average (i.e. 59.66).
Proc sql;
select *
from school
where height > 59.66;
quit;
select *
from school
where height > 59.66;
quit;
The Proc SQL step does work. The output table displays only students whose height is above 59.66:
The program works. However, it requires us to manually compute the average student height (i.e. 59.66 inches) before manually entering it into the program.
This program will not work when the data set is updated with a different set of data.
A better approach to writing the program is to use a subquery.
Example
Proc sql;
select *
from school
where height >
(select mean(height)
from school);
quit;
select *
from school
where height >
(select mean(height)
from school);
quit;
The Proc SQL step above contains a subquery in the WHERE clause.
where height >
(select mean(height)
from school);
The subquery is evaluated first. It returns the average student height in the WHERE clause.
The WHERE clause then selects the rows where the height is above the computed average height.
The output table is exactly the same as before:
A subquery can also be used to look up data from a different table.
Let's look at an example.
Copy and run the AWARD table from the yellow box below:
The AWARD table contains a list of three students who have achieved an award in the Mathematics competition.
Now, we'd like to have a table that displays the background information such as the age, gender, school and height for only these three students.
We can create a query on the SCHOOL table with a WHERE clause that selects the students from the AWARD table using a subquery.
Example
Proc sql;
select *
from school
where name in
(select name
from award);
quit;
select *
from school
where name in
(select name
from award);
quit;
The subquery in the WHERE clause selects the names from the AWARD table, which are Jeffery, Krystal and Chau.
These are the only three students selected and displayed in the output:
Exercise
Locate the CARS table from the SASHelp library.
Write a Proc SQL step to display the average car price (MSRP) for each car maker. Filter the output and display only the car makers whose average car price is above average.
Locate the CARS table from the SASHelp library.
Write a Proc SQL step to display the average car price (MSRP) for each car maker. Filter the output and display only the car makers whose average car price is above average.
Need some help?
HINT:
When filtering the table that contains summarized data, the HAVING clause must be used.
SOLUTION:
Proc sql;
select make, mean(msrp) as mean
from sashelp.cars
group by make
having mean >
(select mean(msrp)
from sashelp.cars);
quit;
Fill out my online form.