Search the site...

SASCRUNCH TRAINING
  • Home
  • Member's Area
  • How to Start
  • SAS Interface
  • Creating a Data Set
  • Practical SAS Training Course
  • SAS Certified Specialist Training Program
  • Proc SQL Course
  • Introduction to Time Series Analysis
  • SAS Project Training Course
  • Full Training / Membership
  • Sign up
  • About us
  • Contact us
  • Home
  • Member's Area
  • How to Start
  • SAS Interface
  • Creating a Data Set
  • Practical SAS Training Course
  • SAS Certified Specialist Training Program
  • Proc SQL Course
  • Introduction to Time Series Analysis
  • SAS Project Training Course
  • Full Training / Membership
  • Sign up
  • About us
  • Contact us
Advanced Data Manipulation [1-5]


Proc Transpose
Transposing a data set is a new topic for the SAS certification exam.

It has been discussed in ​data manipulation [12-18]. 

In this module, we will provide additional training on this topic that will help you to prepare for the exam.

​Copy and run the code from the yellow box below:

The DEMO data set contains five columns:
  • Patient
  • Visitno
  • SBP (Systolic Blood Pressure)
  • DBP (Diastolic Blood Pressure)
  • Pulse
Picture

Let's first get familiar with the data set.

The first two columns are PATIENT and VISITNO.

Both are character variables. 

These columns are used to identify the unique rows in the data set.

For example, let's look at the first observation:
Picture

In the first observation, the PATIENT is ID1001, and the VISITNO is "Visit_1".

This uniquely identifies that it is patient ID1001's first visit to the clinic.


The other columns are:
  • SBP 
  • DBP
  • PULSE
Picture

​These are all numeric variables containing the measurements at each visit.


Now, let's do a simple proc transpose on this data set:
proc transpose data=demo out=t_demo;
run;

This example is very simple. 

It is simply a TRANSPOSE procedure with the input data set (i.e. DEMO) and output data set (T_DEMO).
Picture

​​The default behavior of proc transpose is to transpose all the numeric columns into rows.

​Yes, numeric columns only.

For example, the SBP column is a numeric column.

​It was transposed from a column into a row in the output data set T_DEMO:
Picture

​The character columns, such as the PATIENT and VISITNO, are not transposed by default.

All three numeric columns (SBP, DBP and PULSE) are transposed into rows in the T_DEMO data set:
Picture


Transposing Character Columns

If you want to transpose a character column, you must specify it in the VAR statement.

Let's look at an example.
proc transpose data=demo out=t_demo;
var visitno;
run;

The VAR statement tells SAS to transpose only the VISITNO column.

The output data set has one row for VISITNO:
Picture


Now, let's go back to the original data set for a moment.
​
The very first row is the data for ID1001's first visit to the clinic.

The SBP, DBP and Pulse measurements are 92, 87, 62, respectively.
Picture

Let's run the proc transpose step on the SBP, DBP and PULSE columns:
proc transpose data=demo out=t_demo;
var sbp dbp pulse;
run;

The step above transposed the SBP, DBP and PULSE columns.

There is one issue.
​
In the output data set, we cannot identify which patient and visit the transposed values belong to.

​For example, below is the COL1 column in the output data set:
Picture

In this column, the SBP, DBP and PULSE are 92, 87 and 62, respectively.

We know that these values belong to ID1001's first visit to the clinic.

However, there is nothing in the output data set that indicates this!
Picture

The two row-identifying values, PATIENT and VISITNO, must be added back to the output data set.

Fortunately, this can be done using:
  • The BY statement and/or
  • The ID statement.

These are explained below.


ID Statement

The ID statement allows you to use the identifier values as the variable names.

Without the ID statement, the variable names are automatically assigned as COL1, COL2, etc.
Picture

When we use the ID statement, the identifier values are used in place of the "COL" prefix.

​Let's look at an example.
proc transpose data=demo out=t_demo;
var sbp dbp pulse;
id patient visitno;
run;

The PATIENT and VISITNO column are specified in the ID statement.

We now have variable names that are more descriptive:
Picture

The columns created are ID1001Visit_1, ID1002Visit_2, etc.

These are more descriptive. We now know which patient and visit each measurement belongs to!


BY Statement

The BY statement tells SAS to transpose the data set for each by-group.

Let's look at an example.
proc sort data=demo;
by patient visitno;
run;

proc transpose data=demo out=t_demo;
var sbp dbp pulse;
by patient visitno;
run;

In this example, the PATIENT and VISITNO columns are specified in the BY statement.

The transpose is done for each of the PATIENT and VISITNO by-groups.
Picture

The SBP, DBP and PULSE are transposed for each patient and visit:
Picture

By using the BY statement, the row values (i.e. SBP, DBP and PULSE) can now be identified by the patient and the visit no.


​Combining the use of the BY statement and ID statement

It is more common to use both the ID and BY statement together when transposing a data set.


Let's look at an example below.​
proc sort data=demo;
by patient;
run;

proc transpose data=demo out=t_demo;
var sbp dbp pulse;
by patient;
id visitno;

run;

In this example, we have a BY statement for the PATIENT column, and an ID statement for the VISITNO column.
Picture

The transpose is done for each patient by-group.

The VISITNO values is assigned as the variable name (e.g. VISIT_1, VISIT_2, etc.).
Picture

Exercise

The SASHELP.CLASS data set contains five columns:
  • NAME
  • SEX
  • AGE
  • HEIGHT
  • WEIGHT

Transpose SEX, AGE, HEIGHT and WEIGHT columns into rows for each student.

What is the name and type (numeric or character) of the column that contains the transposed value?
Next

Need some help? 


HINT:
The BY statement would be needed so that the transpose is done for each student by-group.


SOLUTION:

proc transpose data=sashelp.class out=t_class;
var sex age height weight;
by name;
run;

The column created is called COL1 and it is a character column.


Fill out my online form.
Already a member? Go to member's area.