Sentry Page Protection
SAS Functions [7-14]
SUBSTR Function
You can extract texts or phrases from a character variable using the SUBSTR function.
Let's take a look at the PHONELIST data set below.
The PHONELIST data set above contains a list of phone numbers.
[Reminder: To see the PHONELIST data set on SAS Studio, run the code in the yellow box above]
Now, let's take a look at an example of the SUBSTR function.
Example
Data PhoneList2;
Set Phonelist;
Ext1 = substr(Tel, 9);
Run;
Note: the 2nd parameter is 9.
The 2nd parameter tells SAS the position where the extraction begins.
In our example, the extraction begins at the 9th position.
Tel: 780-444-8989
- 1st: 7
- 2nd: 8
- 3rd: 0
- 4th: -
- 5th: 4
- 6th: 4
- 7th: 4
- 8th: -
- 9th: 8
- 10th: 9
- 11th: 8
- 12th: 9
The SUBSTR function will extract the portion of the character value from the 9th position till the end of the character value.
E.g. 780-444-8989 --> 8989
Make sense?
Specify # of characters to extract
You can also specify the number of characters to extract.
Example
Data PhoneList3;
Set Phonelist;
Ext2 = substr(Tel, 9, 2);
Run;
You can also specify the number of characters to extract.
Example
Data PhoneList3;
Set Phonelist;
Ext2 = substr(Tel, 9, 2);
Run;
When using the SUBSTR function, you can add a 3rd parameter to tell SAS the # of characters to extract.
In our example, the third parameter is 2.
Ext2 = substr(Tel, 9, 2);
SAS will start the extraction at the 9th position and read 2 characters.
As a result, only the characters from the 9th and 10th position are extracted, as demonstrated in the image above.
[Tips: play around with the different parameters and you will get more familiar with the SUBSTR function.]
As a result, only the characters from the 9th and 10th position are extracted, as demonstrated in the image above.
[Tips: play around with the different parameters and you will get more familiar with the SUBSTR function.]
Exercise
Write a SAS program to extract the area code (first 3 digit) from the variable TEL.
Create any data set or variables if necessary.
Write a SAS program to extract the area code (first 3 digit) from the variable TEL.
Create any data set or variables if necessary.
Need some help?
HINT:
The area code starts at the first position.
SOLUTION:
Data PhoneList4;
Set Phonelist;
Area = substr(Tel, 1, 3);
Run;
Fill out my online form.