Tuesday, May 29, 2007

Where Did the Observation Come From?

Here is a little snippet of code I created to address the problem of assigning a value to a variable based on which data set an observation came from in a data step. Here is an example:

Suppose I have a whole bunch of data sets, each representing a different country. I want to set a lot of them in one data step and create one resulting data set with a variable called language. To create the language variable correctly, I need to know which data set each observation is coming from. Typically I would use the IN= option on each data set to create a flag and then check that flag using if/then logic.


data selectedCountries;
   set
      chile(in=chile)
      china(in=china)
      costa_rica(in=costa)
      egypt(in=egypt)
      fiji(in=fiji)
      turkey(in=turkey)
      usa(in=usa)
      saudi_arabia(in=saudi)
   ;

   /* One branch per IN= flag */
   if chile then language = 'SPANISH';
   else if china then language = 'CHINESE';
   else if costa then language = 'SPANISH';
   /* ...and so on, one else-if for each remaining data set */
run;

One of the major problems with this approach is that it does not scale well. Every country you add to the set statement needs its own IN= flag and its own branch in the if/then logic, so the chain gets longer and easier to get wrong.

Here is a slightly more elegant solution that uses arrays and variable information functions. You still use the IN= option on each data set, but you give the IN= variable the same name as the value you want to assign. Then you create an array of all those IN= variables. Finally, you loop through the array and check each element's boolean value. If it is true, you assign your new variable the value returned by the vname() function, which is the name of the variable behind that array element. Note that this only works when the value you want is a valid SAS variable name.

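data selectedCountries;
   /* Declare the length up front; otherwise a variable first   */
   /* assigned from vname() defaults to a length of 200.        */
   length language $32;
   set
      chile(in=SPANISH)
      china(in=CHINESE)
      costa_rica(in=SPANISH)
      egypt(in=ARABIC)
      fiji(in=ENGLISH)
      turkey(in=TURKISH)
      usa(in=ENGLISH)
      saudi_arabia(in=ARABIC)
   ;
   /* Array of the IN= flags; each flag's name is the value to assign */
   array names[*] SPANISH CHINESE ARABIC ENGLISH TURKISH;
   do i = 1 to dim(names);
      if names[i] eq 1
         then language = vname( names[i] );
   end;
   drop i;   /* keep the loop counter out of the output data set */
run;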
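
To try the array approach without the real country data, here is a minimal sketch that builds three tiny stand-in data sets first. The city variable and its values are made up purely for illustration, and each data set gets its own IN= name to keep the example simple. After the step runs, I would expect language to hold SPANISH, CHINESE, and ARABIC for the three observations.

```sas
/* Hypothetical sample data: one tiny data set per country. */
data chile;
   length city $20;
   city = 'Santiago';
run;

data china;
   length city $20;
   city = 'Beijing';
run;

data egypt;
   length city $20;
   city = 'Cairo';
run;

data selectedCountries;
   length language $32;
   set
      chile(in=SPANISH)
      china(in=CHINESE)
      egypt(in=ARABIC)
   ;
   /* Each IN= flag is named after the value it should produce */
   array names[*] SPANISH CHINESE ARABIC;
   do i = 1 to dim(names);
      if names[i] eq 1 then language = vname(names[i]);
   end;
   drop i;
run;

/* Inspect the result: one row per input observation */
proc print data=selectedCountries;
run;
```

Adding another country is then a matter of adding one line to the set statement and, if the language is new, one name to the array.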
