Labels

Friday, August 13, 2010

Macro counting Missing Values in a data set

*Calculate all missing values for each variable in a database;
%macro miscount(dsn);
* Determine the number of misssing values
*of each variable in the submitted data set;
data temp_char(keep=_character_); *copy all character variables into a temp file;
set &dsn;
run;
data temp_num(keep=_numeric_); *copy all numeric variables into a temp file;
set &dsn;
run;
proc contents data= temp_char
out= cont noprint; *output the contents to a file;
run;

* make the variable names into Macro variables;
proc sql noprint;
select distinct name
into :varname1-:varname999
from cont;
quit;

*make character variables into numeric variables;
*then we can use the PROC MEANS to find the missing numbers;
*this is better then using the PROC FREQ for each variable;
data temp_char;
set temp_char;
%do i = 1 %to &sqlobs; *the &sqlobs tells you how many variables were read in previous SQL;
if compress(&&varname&i)='' then new&i=.;
else new&i=0;
*Drop and rename the variables;
drop &&varname&i;
rename new&i=&&varname&i;
%end;

*Use PROC MEANS to do the tables;
proc means data=temp_char n nmiss;
title 'Missing number table for Char Var of ' &dsn;
proc means data=temp_num n nmiss;
title 'Missing number table of Num Var of ' &dsn;
%mend miscount;

No comments:

Post a Comment