SAS Macro Do Loop Issues - loops

I have a very simple request: loop through a dataset, turn each observation into macro variables, and then do a comparison using those macro variables. Here's what my code looks like:
%do n = 1 %to &i2.;
data want;
set have;
%if _N_ = &n. %then %do;
call symputx("Var1",var1);
call symputx("var2",var2);
%end;
run;
data want;
retain FinalCount
set have;
where Variable1="&var1.";
by SomeVariable
if first.SomeVariable then FinalCount=0;
if final="FINAL" then FinalCount+1;
if Finalcount=&var2. then Final_Samples=1;
finalCount=FinalCount;
run;
%end
The part that is failing is the _N_ = &n. section. I keep getting the error "Variable N has been defined as both character and numeric." Basically I just need to set each observation as a macro variable once to do the next comparison, and then move on to the next one. So, if there's a better way of doing that, please let me know. Otherwise, could you help me figure out why that comparison is not working?

If you could explain your larger problem then you might get a better answer that does not require you to convert your data values into macro variables. Converting values to strings and then trying to compare them again introduces a number of sources of errors.
To your question of how to set macro variables based on the Nth observation in a dataset, try one of these.
If the dataset supports the FIRSTOBS= and OBS= dataset options, use them to read just that observation.
data _null_;
set have (firstobs=&n obs=&n);
call symputx("Var1",var1);
call symputx("var2",var2);
run;
If the dataset supports direct access then use that.
data _null_;
p = &n;
set have point=p;
call symputx("Var1",var1);
call symputx("var2",var2);
stop;
run;
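If not, then use a data step IF (not a macro %IF). The macro %IF is evaluated while the step's code is being generated, before any observation is read, so it compares the text _N_ with the value of &n. and never sees the data.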
data _null_;
set have ;
if _n_ = &n then do;
call symputx("Var1",var1);
call symputx("var2",var2);
stop;
end;
run;
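Putting the pieces together, the whole loop from the question could be sketched as below. This is only a sketch under assumptions: the sample HAVE dataset, the macro name loop_rows, and the %PUT line standing in for the real comparison step are all hypothetical, and it uses the FIRSTOBS=/OBS= variant shown above.

```sas
/* Hypothetical sample data standing in for the asker's HAVE dataset */
data have;
  input var1 $ var2;
  datalines;
A 2
B 1
C 3
;

%macro loop_rows;  /* hypothetical macro name */
  %local n nobs;

  /* Get the number of observations without reading any rows */
  data _null_;
    if 0 then set have nobs=nobs;
    call symputx('nobs', nobs);
    stop;
  run;

  %do n = 1 %to &nobs;
    /* Read only observation &n and copy its values to macro variables */
    data _null_;
      set have(firstobs=&n obs=&n);
      call symputx('var1', var1);
      call symputx('var2', var2);
    run;

    /* Placeholder for the comparison step that uses &var1 and &var2 */
    %put NOTE: row &n has var1=&var1 and var2=&var2;
  %end;
%mend loop_rows;

%loop_rows
```

The macro %DO loop only generates one small data step per row; all comparisons on the data itself still happen in data step code.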

Related

SAS: How to loop a macro over rows of data to change to missing

Can anyone help with this issue I'm having where the macro is only taking the final row value of the data?
I have some data that looks like this:
data data1 ;
infile datalines dsd dlm='|' truncover;
input id :$2. year_age_15 EDU_2000 EDU_2001 EDU_2002 ;
datalines4;
10|2000|3|4|5
11|2000|5|5|6
12|2001|1|2|3
13|2002|5|5|6
14|2001|2|2|2
15|2000|3|3|4
;;;;
However I need it to use the year variable to determine which data to keep, and then change all the values for the years after that value to missing, like so:
data data1 ;
infile datalines dsd dlm='|' truncover;
input id :$2. year_age_15 EDU_2000 EDU_2001 EDU_2002 ;
datalines4;
10|2000|3|.|.
11|2000|5|.|.
12|2001|1|2|.
13|2002|5|5|6
14|2001|2|2|.
15|2000|3|.|.
;;;;
I've been trying to get this macro to work, but it only works intermittently and works just for the final row of the data rather than looping through the rows.
%macro macro2 (output=, input=);
data &output;
set &input;
%DO I = 1 %TO 6;
%do; call symput('value2',trim(left(put(year_age_15,8.))));
temp_col=&value2.;
%let year_end=&value2.;
%put YEAR END IS: &year_end.;
%put EDU YEAR IS: EDU_&year_end.;
%do year = &year_end. %TO 2002;
%put &year.;
EDU_&year.=.;
%end;
%end;
%end;
run;
%MEND macro2;
%macro1(input=testset, output=output_testset);
In R it could be something simple like :
for(i in 1:6){
do this
}
Any advice? I can't figure out which bit is going wrong, thanks!
So, I think the issue here is your data is at the wrong level. You certainly can do what Reeza suggests, and I think it's probably reasonable to do so, but the reason why this is a bit complicated is that you have data in your variable name. That's not a best practice - your variable name should be "education" and your data should have a row for each year. Then this would be a simple WHERE statement!
Here's a simple PROC TRANSPOSE that turns it to the right structure, and then if you really need it the other way, a second one will turn it back. The where statement can be in the proc transpose or could be used somewhere else.
proc transpose data=data1 out=data_t (where=(year_Age_15 ge input(scan(_NAME_,2,'_'),4.)));
by id year_Age_15;
var edu_:;
run;
proc transpose data=data_t out=want;
by id year_age_15;
id _name_;
var col1;
run;
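To illustrate the point about structure: if the data were stored with one row per id and year, the filter really would be a single WHERE statement. The sketch below assumes a hypothetical long dataset with made-up variable names year and education; only the first two ids from the example are shown.

```sas
/* Hypothetical long-format version of the data: one row per id and year */
data long;
  input id :$2. year_age_15 year education;
  datalines;
10 2000 2000 3
10 2000 2001 4
10 2000 2002 5
11 2000 2000 5
11 2000 2001 5
11 2000 2002 6
;

/* Keep only the years up to and including year_age_15 */
data want_long;
  set long;
  where year <= year_age_15;
run;
```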
Create an array and index it by years rather than default 1:n
Loop through your array starting at year+1 and set to missing
data want;
set data1;
array educ(2000:2002) edu_2000-edu_2002;
if (year_age_15 +1) <= hbound(educ) then do i= (year_age_15 +1) to hbound(educ);
call missing(educ(i));
end;
run;
As @Joe mentions, the year to match is part of a variable name, which is tremor-inducing "data in the metadata".
You can use the VNAME function to retrieve the variable name of an array element accessed by index. Use that feature to compare against the expected variable name whilst looping over an array based on the variables named EDU*.
Example:
data have ;
infile datalines dsd dlm='|' truncover;
input id :$2. year_age_15 EDU_2000 EDU_2001 EDU_2002 ;
datalines4;
10|2000|3|4|5
11|2000|5|5|6
12|2001|1|2|3
13|2002|5|5|6
14|2001|2|2|2
15|2000|3|3|4
;;;;
data want;
set have;
array edus edu_:;
* find index of element corresponding to variable name having year;
do _n_ = 1 to dim(edus) until (upcase(vname(edus(_n_))) = cats('EDU_',year_age_15));
end;
* fill in elements at indices post the found one with missing values;
do _n_ = _n_+1 to dim(edus);
call missing(edus(_n_));
end;
run;

Trouble looping through variables in sas loop

Very simple: I'm trying to convert many character variables into numeric. The following code gives the "syntax error, expecting one of the following: a name, -, :, ;" for the drop and rename lines.
data ex; set ex;
array numeric{3} var1 var2 var3;
do i=1 to 8;
temp = input(strip(numeric(i)),10.);
drop numeric(i);
rename temp = numeric(i);
end;
run;
Can you not use drop or rename statements in do loops?
The dataset structure has to be decided when the data step is compiled. So there is no way you could use an array reference in a rename statement.
If you really have simple numerically suffixed variable names then you could use a simple RENAME statement.
rename new1-new3=var1-var3;
So your program might be as simple as this:
data want;
set have;
array ch var1-var3;
array new new1-new3;
do index=1 to dim(ch);
new[index]=input(left(ch[index]),32.);
end;
drop index var1-var3;
rename new1-new3=var1-var3;
run;
If the list of names is more complex, like AGE HEIGHT WEIGHT for example, then you will need to use a more complex RENAME statement like:
rename new1=AGE new2=HEIGHT new3=WEIGHT ;
So use some type of code generation method, such as macro code, or a data step that writes lines of code to a file that can then be included into the program using the %include statement.
For example you could make a macro like this:
%macro rename(varlist);
%local i;
rename
%do i=1 %to %sysfunc(countw(&varlist));
new&i=%scan(&varlist,&i)
%end;
;
%mend ;
And use it like this:
%let charvars=AGE HEIGHT WEIGHT;
data want;
set have;
array ch &charvars;
array new [%sysfunc(countw(&charvars))];
do index=1 to dim(ch);
new[index]=input(left(ch[index]),32.);
end;
drop index &charvars;
%rename(&charvars);
run;
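The other code generation route mentioned above, writing the RENAME statement to a file with a data step and pulling it in with %include, might look roughly like this. It is a sketch under assumptions: the sample HAVE data and the fileref name code are hypothetical, and the variable list is hard-coded for brevity.

```sas
/* Hypothetical sample data with character variables */
data have;
  input AGE $ HEIGHT $ WEIGHT $;
  datalines;
14 62.5 102
13 59 84
;

/* Temporary file that will hold the generated RENAME statement */
filename code temp;

data _null_;
  file code;
  length line $80;
  put 'rename';
  do i = 1 to 3;
    /* Builds new1=AGE, new2=HEIGHT, new3=WEIGHT */
    line = cats('new', i, '=', scan('AGE HEIGHT WEIGHT', i));
    put line;
  end;
  put ';';
run;

data want;
  set have;
  array ch AGE HEIGHT WEIGHT;
  array new [3];
  do index = 1 to dim(ch);
    new[index] = input(left(ch[index]), 32.);
  end;
  drop index AGE HEIGHT WEIGHT;
  /* Insert the generated RENAME statement here */
  %include code;
run;
```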
You're doing a couple of things incorrectly for SAS.
Don't use the same data set name in the DATA/SET statements. It's bad practice and makes it much harder to debug your code.
You cannot change the type of a variable while keeping the same name in the same data step. Often you can rename ahead of time to make this slightly easier.
Ideally, especially if the file was read from a text file you fix these issues at the data import stage, not after the fact.
I don't know if the DROP/RENAME statements will take the array variable appropriately, that's something that would need to be tested.
data ex1;
set ex;
*original character variables;
array _chars(3) var1-var3;
*new numeric variables;
array _nums{3} new_var1-new_var3;
do i=1 to 3; *should match size of array;
_nums(i) = input(strip(_chars(i)), 10.);
end;
drop var1-var3;
*not sure if this will work in the same step;
rename new_var1-new_var3 = var1-var3;
run;

reading a data set multiple times in SAS

I am new here. I am trying to read in a data set multiple times. For example, assume that I have 3 observations in a data set (called tempfile) for a variable called temp. The three observations are 4, 6, and 5. I want to read in the set x number of times, so the 4th observation would be 4, the fifth would be 6, and the sixth would be 5. The 7th would be 4, and so on. I have tried this literally a few dozen ways, by doing something like
data new;
do i=1 to 100;
set tempfile;
end;
output;
run;
I have tried this by moving the do statement, moving the output statement, omitting the output statement, every which way, and trying macros also. Can somebody help? Thanks, John
followup....
Hello:
Thanks for response. That did work. I would like to now do several things involving some “if then” statements inside the loop (more than just reading in the data set).
I want to read in a data set n number of times, and each time, there will be two if then statements
So, assume I read in 3 numbers any number of times; 7, 15, and 12
As each number is read, it will ask if it is less than 10. And each time it will create a random number.
If less than 10, then
If rand(uniform) < .4 then 1 is added to counter1, else 1 is added to counter2
And if >= 10,
Then
If rand(uniform) < .2 then 1 is added to counter1, else 1 is added to counter2
Any help is much appreciated.
Thanks
John
The way that most data steps actually stop is when SAS reads past the end of the input. So you need a method that prevents SAS from doing that.
The easiest way to replicate the data is to just execute multiple output statements. So the first record is repeated three times, then the second record is repeated three times, etc.
data want;
set tempfile ;
do i=1 to 3;
output;
end;
run;
Another method is to just list the dataset multiple times on the SET statement. So to read it in 3 times just use
data want;
set tempfile tempfile tempfile;
run;
You could probably use macro logic or even just a macro variable to make the number of repetitions variable.
data _null_; call symputx('list',repeat('tempfile ',3-1)); run;
data want; set &list; run;
Another method is to use the POINT= and NOBS= options on the SET statement so that SAS never reads past the end and you can jump back to the beginning. But since it never reads past the end of the input data you will need to manually tell it when to stop.
data want ;
do i=1 to 3;
do p=1 to nobs ;
set tempfile point=p nobs=nobs;
output;
end;
end;
stop;
run;
Or more in the spirit of your original post you might want to use the MOD() function to figure out which observation to read next.
data want;
if _n_ > 100 then stop;
p=1+mod(_n_-1,nobs);
set tempfile point=p nobs=nobs;
run;
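The same POINT=/NOBS= pattern can also carry the counter logic from the follow-up in the question. The sketch below is one possible reading of that follow-up, not a tested solution for the asker's real data: the sample values 7, 15 and 12 come from the question, while the seed, the 100 repetitions, and the names cutoff, counter1 and counter2 are assumptions.

```sas
/* Sample data from the follow-up question */
data tempfile;
  input temp;
  datalines;
7
15
12
;

data counts;
  call streaminit(12345);      /* arbitrary seed so the run is repeatable */
  do rep = 1 to 100;           /* assumed number of passes over the data */
    do p = 1 to nobs;
      set tempfile point=p nobs=nobs;
      /* Threshold depends on whether the value is below 10 */
      if temp < 10 then cutoff = 0.4;
      else cutoff = 0.2;
      /* Sum statements start at 0 and keep their value across iterations */
      if rand('uniform') < cutoff then counter1 + 1;
      else counter2 + 1;
    end;
  end;
  output;                      /* write one row holding the final counts */
  stop;                        /* required because POINT= never hits end of file */
  keep counter1 counter2;
run;
```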
If you have SAS/STAT software, you can use PROC SURVEYSELECT.
data have;
do temp=4,6,5;
output;
end;
run;
proc surveyselect reps=10 rate=1 out=temp2 noprint;
run;
The data step is designed for serial processing. In this case, you need to "remember" previous observations. You can do it using only the data step, but for that use case, there are other solutions in the SAS environment that are simpler. The one I suggest is a macro that appends the original file n times:
%macro replicate( data=, out=, n=)/des='&out is &data repeated &n times.';
data &out;
set
%do i=1 %to &n;
&data
%end;
; /* This ; ends the data step `set` statement */
run;
%mend;
You could test your example with this helper:
%macro test;
data have; /* create the example data set */
temp = 4; output;
temp = 6; output;
temp = 5; output;
run;
%replicate( data=have, out=want, n=4 );
proc print; quit;
%mend;
Here is a portion of the SAS doc that adds lots of detail with many examples.

How to resolve macro variable in a loop in SAS

I am trying to figure out how to call a macro variable in a loop within a data step in SAS, but I am lost; so I have 14 macro variables and I have to compare each of them to the entries of a vector. I tried:
data work.calendrier;
set projet.calendrier;
do i=1 to 3;
if date= "&vv&i"D then savinglight = 1;
end;
run;
But it is not working. The variable vv1 up to vv3 are date variables. For instance this code works:
data work.calendrier;
set projet.calendrier;
*do i=1 to 3;
if date= "&vv1"D then savinglight = 1;
*end;
run;
But with the loop it can not resolve the macro variable.
If you want to reference a macro variable with a number index like vv1,vv2,vv3 you need to resolve &i first.
SAS has a separate macro processor that resolves values before they reach the data step processor.
Essentially, you need to add extra ampersands at the beginning of your macro variable:
&&vv&i -> &vv1 -> "Value of vv1"
&&vv&i -> &vv2 -> "Value of vv2"
&&vv&i -> &vv3 -> "Value of vv3"
What happens here is that the macro processor scans the reference from left to right. On the first pass it resolves && to a single & and &i to its current value, which leaves &vv1 (or &vv2, &vv3). It then rescans that text and resolves it to the value of the variable. Note that &i must itself be a macro variable, for example the index of a macro %DO loop.
A couple of sources about this interesting topic:
http://www2.sas.com/proceedings/sugi29/063-29.pdf
http://www.lexjansen.com/nesug/nesug04/pm/pm07.pdf
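The two-pass resolution described above can be checked in the log with %PUT. This is a minimal sketch; the date values and the macro name show are made up for illustration.

```sas
/* Hypothetical values for the indexed macro variables */
%let vv1 = 29MAR2015;
%let vv2 = 25OCT2015;
%let vv3 = 27MAR2016;

%macro show;  /* hypothetical macro name */
  %local i;
  %do i = 1 %to 3;
    /* First pass: && becomes & and &i becomes 1, 2 or 3, leaving &vv1 etc. */
    /* Second pass: &vv1 resolves to its value                              */
    %put vv&i resolves to &&vv&i;
  %end;
%mend show;

%show
```

Turning on the SYMBOLGEN system option before the call makes the log show each resolution step.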
Macro variable references are resolved before SAS compiles and runs your data step. You need to first figure out how to do what you want using SAS statements then, if necessary, you can use macro code to help you generate those statements.
If you want to test if a variable's value matches one of a list of values then consider using the IN operator.
data work.calendrier;
set projet.calendrier;
savinglight = date in ("&vv1"d,"&vv2"d,"&vv3"d);
run;
You need to use a macro. Here's the basic approach:
%let vv1 = 9;
%let vv2 = 2;
%let vv3 = 10;
data have;
drop i;
do i = 1 to 5;
date = i;
output;
end;
run;
%macro test;
data test;
set have;
%do i=1 %to 3;
if date= &&vv&i then savinglight = 1;
%end;
run;
%mend test;
%test;

How to loop sas macro by date?

I defined a macro which gets the data for one day.
For example
%macro getOneday(stnd_d);
data _null_;
call symputx('numdate',&stnd_d.);
run;
data oneday&numdate;
set alldata;
where stdd = &stnd_d;
run;
%mend;
Now I want to loop that macro from start date to end date.
data _null_;
do i = '01mar2015'd to '30mar2015'd;
%getOneday(stnd_d = i)
end;
run;
I don't know how can I pass the date expression value to %getOneday() as a parameter.
Calling %getOneday there simply pastes all the code the macro generates into the middle of your data _null_ step, and since you cannot put a data step inside another data step, SAS throws an error. Replace the data _NULL_ loop with a macro %DO loop, as below.
Also, using the dates like that would not work, because the macro processor treats '01mar2015'd as plain text. You have to convert the dates to their numeric values before using them in a %DO loop.
%macro test;
data _null_;
date1='01mar2015'd;
date2='30mar2015'd;
call symputx("date1",date1);
call symputx("date2",date2);
run;
%put &date1.;
%put &date2.;
%do i = &date1. %to &date2.;
%getOneday(stnd_d = &i.)
%end;
%mend;
%test;
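As a variation on the approach above, the date literals can also be converted to numbers directly in macro code with %SYSEVALF, without a helper data step. The sketch below assumes a hypothetical macro loop_dates with start= and end= parameters, and uses %PUT in place of the call to %getOneday so that it runs on its own.

```sas
%macro loop_dates(start=, end=);  /* hypothetical macro and parameter names */
  %local d first last;

  /* %SYSEVALF understands date literals and returns the numeric SAS date */
  %let first = %sysevalf("&start"d);
  %let last  = %sysevalf("&end"d);

  %do d = &first %to &last;
    /* Replace this %PUT with %getOneday(stnd_d = &d) in real use */
    %put NOTE: day &d is %sysfunc(putn(&d, date9.));
  %end;
%mend loop_dates;

%loop_dates(start=01mar2015, end=03mar2015)
```

The %SYSFUNC(PUTN()) call shows how to turn the numeric date back into readable text, which can also be useful for building dataset names.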
