Coding Exercise (Answer)
Exercise 1
Answer:
Proc Univariate Data=AQI;
Var AQI;
Run;
The requested statistics are:
- Mean: 110.47
- Mode: 11
- Median: 85
- Variance: 9550
- Range: 462
The median AQI in September is 85 µg/m³.
Exercise 2
Answer:
Proc Univariate Data=AQI normal plots;
Var AQI;
histogram;
Run;
Numerical method:
All of the p-values from the normality tests are less than 0.05, so the null hypothesis of normality is rejected. There is sufficient evidence to suggest that AQI is not normally distributed.
Graphical method:
The histogram shows frequencies that decline as AQI increases, giving a right-skewed shape rather than a symmetric bell curve. The distribution is clearly not normal.
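As a rough cross-check of the shape seen in the histogram, one could also request the skewness alongside the mean and median. This is only a sketch and not part of the original answer; a positive skewness would agree with the mean (110.47) sitting above the median (85) in Exercise 1.

```sas
/* Sketch: summarize the shape of AQI. A positive skewness indicates a long right tail. */
Proc Means Data=AQI n mean median skewness;
Var AQI;
Run;
```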
Exercise 3
Answer:
* NWAY keeps only the per-hour rows and drops the overall summary row;
Proc Means Data=AQI noprint nway;
Var AQI;
Class Hour;
Output Out=AQIHour Mean=Mean STD=STD;
Run;
Proc Sort Data=AQIHour;
By descending Mean;
Run;
Look at the AQIHour data set.
The 5 highest AQI are reported at:
- Hour 23 at 125.64
- Hour 0 at 124.42
- Hour 22 at 123.22
- Hour 1 at 122.87
- Hour 2 at 122.06
On average, air pollution is at its worst around midnight (hours 22 to 2).
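Instead of opening the data set by hand, the top five rows can be printed directly. The sketch below assumes AQIHour has already been sorted by descending Mean as in the answer; the WHERE statement skips the overall summary row (_TYPE_ = 0) that Proc Means writes when NWAY is not specified.

```sas
/* Sketch: print the five hours with the highest mean AQI from the sorted AQIHour data set */
Proc Print Data=AQIHour (obs=5);
Where _TYPE_ = 1;   /* keep per-hour rows only */
Var Hour Mean STD;
Run;
```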
Exercise 4
Answer:
Proc Freq Data=Repair;
Table Brand / plots=freqplot;
Run;
Lenovo has the largest number of defective laptops between 2012 and 2015, at 153 units.
Exercise 5
Answer:
No. The result does not take into account the total number of laptops of each brand in the region.
For example, Lenovo might not have the highest defect rate if it has five times as many users as each of the other brands.
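The point can be illustrated with a small calculation. In the sketch below, the defect counts (153 for Lenovo, 72 for Dell) come from the exercises, but the Sold figures are purely hypothetical numbers chosen for illustration; they are not in the Repair data set.

```sas
/* Sketch: defect rate = defects / units sold.
   The Sold values are HYPOTHETICAL; only the defect counts come from the exercises. */
Data Rate;
Input Brand $ Defects Sold;
Rate = Defects / Sold;
Format Rate percent8.2;
Datalines;
Lenovo 153 5000
Dell 72 1000
;
Run;
Proc Print Data=Rate;
Run;
```

Under these assumed sales figures, Lenovo's defect rate would be about 3.06% against 7.20% for Dell, even though Lenovo has more than twice as many defective laptops.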
Exercise 6
Answer:
Proc Format;
Value period 1 = "< 1 year"
2 = "Between 1-2 years"
3 = "Between 2-3 years"
4 = "> 3 years";
Run;
Data Repair2;
Set Repair;
** Correct Incomplete Date **;
Format Purdatenum date9. Period Period.;
If length(Purdate) = 9 then PurdateNum = input(Purdate, date9.);
else if length(Purdate) = 7 then PurdateNum = input("15" || Purdate, date9.);
else if length(Purdate) = 4 then PurdateNum = input("15JUN" || Purdate, date9.);
Diff = Repdate - PurdateNum;
if Diff <365 then Period = 1;
else if Diff <730 then Period = 2;
else if Diff <1095 then Period = 3;
else Period = 4; * 3 years or more, with no upper limit;
Run;
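One caveat with the step above: SAS treats a missing numeric value as smaller than any number, so a row whose purchase date could not be converted (missing Diff) would fall into Period 1. A quick check, sketched here as an optional extra rather than part of the original answer, lists any such rows:

```sas
/* Sketch: list rows where the date difference could not be computed */
Proc Print Data=Repair2;
Where missing(Diff);
Var Purdate PurdateNum Repdate Diff Period;
Run;
```

If this prints no observations, every Period value was assigned from a valid date difference.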
Exercise 7
Answer:
Proc Freq Data=Repair2;
Table Period / nocum;
Run;
The majority of the laptops start experiencing problems between 1 and 2 years after the initial purchase.
Exercise 8
Answer:
Proc Freq Data=Repair2;
Table Brand * Parts;
Run;
Out of the 72 defective laptops from Dell, only 6 (8.33%) had software-related problems, which suggests software is a relative strength of their computers.
On the other hand, the keyboard is a weak point for Dell, with 26 of the defective laptops (36.11%) having keyboard-related problems.
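The percentages quoted above are row percentages (for example, 6 / 72 = 8.33%). As a sketch, the crosstab can be trimmed so that each cell shows only the count and the row percentage, which makes the per-brand comparison easier to read:

```sas
/* Sketch: suppress column and overall percentages, leaving counts and row percentages */
Proc Freq Data=Repair2;
Table Brand * Parts / nocol nopercent;
Run;
```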
Exercise 9
Answer:
Proc Means Data=Repair2;
Var Cost;
Class Brand;
Run;
All of the brands have comparable repair costs, with Lenovo having the highest average repair cost at $230.
Exercise 10
Answer:
ODS PDF file='/folders/myfolders/analysis.pdf';
proc means data=profile;
var age height weight income premium;
run;
proc freq data=profile;
table gender edu race region;
run;
ods pdf close;