The data center in Germany has submitted the German sales data for October 2011. Unlike the French data center, it has provided two data sets: GERMAN_SALES_201010 and GERMANY_PRODUCTS.
The sales data set does not contain the description or unit price of the products, but the GERMANY_PRODUCTS data set lists the description and price of every product offered at the German store, keyed by Stock Code.
- Using a DATA step, merge the sales and product data sets so that all variables end up in a single data set whose structure matches ONLINE_RETAIL. In the same step, create a new variable (a match flag) to indicate whether each record is:
(a) Found in both data sets.
(b) Found in the sales data set only.
(c) Found in the product list data set only.
- Use PROC FREQ on the match flag created in the merge step to determine the match rate in each of the three categories.
- Create a data set called GERMANY_201110_CLEAN that contains only the records found in both the sales and products data. Drop the match flag created in the merge step.
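
The three steps above could be approached along the following lines. This is a minimal sketch, not a definitive solution. It assumes that both data sets are in the WORK library and that the Stock Code variable is named StockCode in each; check the actual variable name with PROC CONTENTS. The names sales_sorted, products_sorted, germany_merged and match_flag are hypothetical and chosen only for illustration. Aligning variable names, order and lengths with ONLINE_RETAIL is not shown, because that layout is not given here.

```sas
/* Assumption: the key variable is named StockCode in both data sets. */
/* MERGE with BY requires both inputs to be sorted by the key.        */
proc sort data=GERMAN_SALES_201010 out=sales_sorted;
    by StockCode;
run;

proc sort data=GERMANY_PRODUCTS out=products_sorted;
    by StockCode;
run;

/* Step 1: merge and flag where each record came from.                */
/* The IN= data set option creates temporary 0/1 variables that       */
/* show which input contributed to the current observation.           */
data germany_merged;
    merge sales_sorted    (in=in_sales)
          products_sorted (in=in_products);
    by StockCode;

    length match_flag $13;  /* long enough for 'Products only' */
    if in_sales and in_products then match_flag = 'Both';
    else if in_sales            then match_flag = 'Sales only';
    else                             match_flag = 'Products only';
run;

/* Step 2: count and percentage of records in each match category.    */
proc freq data=germany_merged;
    tables match_flag;
run;

/* Step 3: keep only the records found in both inputs, then drop      */
/* the flag so that it does not appear in the final data set.         */
data GERMANY_201110_CLEAN;
    set germany_merged;
    where match_flag = 'Both';
    drop match_flag;
run;
```

Sorting into new data sets with OUT= leaves the original inputs untouched. The match flag is kept as a character variable so that the PROC FREQ output is readable without a format; a numeric code with a user-defined format would work equally well.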