**Statistical Analysis [7-7]**

**Correlation Analysis**

The correlation coefficient (r) measures the strength and direction of the linear association between two variables. It ranges from -1 to 1.
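For reference, the formula below is the standard definition of the Pearson correlation coefficient, which is the statistic Proc Corr reports by default. The symbols x_i, y_i, and n are notation introduced here for illustration (for example, the temperature and sales on day i, and the number of days); they do not appear in the original lesson.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

A value near 1 or -1 indicates a strong linear association, and a value near 0 indicates little or no linear association.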

The **ICECREAM** data set contains ice-cream sales and the daily temperature for 50 days.

The owner of the ice-cream truck wants to find out whether ice-cream sales are associated with the daily temperature.

To find out, compute the correlation coefficient (r) between ice-cream sales and temperature using Proc Corr.

__Example__

Proc Corr Data=Icecream;

var temp sales;

run;

Proc Corr reports a correlation coefficient of 0.31235 between temperature and sales.

This indicates a weak positive correlation between temperature and ice-cream sales.

When the temperature goes up, ice-cream sales tend to go up as well, although the relationship is not strong.
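If the ICECREAM data set is not at hand, the same steps can be tried on a small stand-in. The sketch below is only an illustration: the data set name ICECREAM_DEMO and its eight rows are made up for this example and are not the 50-day data used in the lesson, so the coefficient it produces will differ from 0.31235.

```sas
/* Hypothetical illustration data: these 8 rows are invented and are
   NOT the lesson's 50-day ICECREAM data set. */
data icecream_demo;
    input temp sales;
    datalines;
18 120
21 135
24 160
27 150
30 190
33 210
35 205
38 240
;
run;

/* Pearson correlation between temperature and sales */
proc corr data=icecream_demo;
    var temp sales;
run;
```

The output table lists each pair of variables with its correlation coefficient and, beneath it, the p-value for the test that the correlation is 0.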

**Correlation Matrix**

You can also display a scatter plot matrix of the variables using the PLOTS= option.

__Example__

Proc Corr Data=Icecream plots=matrix;

var temp sales;

run;

The PLOTS=MATRIX option adds a scatter plot matrix to the output:

A slight upward trend can be seen in the plot, which is consistent with the weak positive correlation.
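Plot requests in Proc Corr rely on ODS Graphics, which may not be enabled by default in every SAS environment. The sketch below assumes the ICECREAM data set from the lesson is already available in the session, turns ODS Graphics on explicitly, and adds the HISTOGRAM suboption as one possible variation.

```sas
/* Enable ODS Graphics so that the PLOTS= request produces output */
ods graphics on;

/* Scatter plot matrix with histograms of each variable on the diagonal */
proc corr data=icecream plots=matrix(histogram);
    var temp sales;
run;
```

The histograms on the diagonal give a quick look at how each variable is distributed, alongside the scatter plots of temperature against sales.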

**Hypothesis Testing**

Proc Corr also computes the p-value for the following hypothesis test, where r stands for the correlation in the population from which the sample was drawn:

H0: r=0

H1: r≠0

In our example, the p-value is less than 0.05.

The null hypothesis that r is 0 is therefore rejected at the 5% significance level.

The positive correlation found in the sample is unlikely to have happened by chance.
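As a rough check on this result, the test statistic can be worked out by hand. The calculation below assumes the usual t test for a Pearson correlation with n - 2 degrees of freedom, and uses the values from the lesson (r = 0.31235, n = 50); the symbols t and n are notation introduced here.

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  = \frac{0.31235\,\sqrt{48}}{\sqrt{1-0.31235^{2}}}
  \approx \frac{2.164}{0.950}
  \approx 2.28
```

With 48 degrees of freedom, the two-sided 5% critical value is about 2.01. Since 2.28 exceeds it, the p-value falls below 0.05, which agrees with the Proc Corr output.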

**Exercise**

Copy and run the WINE data set from the yellow box below:

The WINE data set contains the price and demand for a list of wines.

Compute the correlation coefficient between the price and demand of the wines.

Briefly describe the association between the two.

