I am very new to SAS and have a basic question.
I am writing a macro containing a %DO-%TO loop from i = 1 to n. I want n to depend on whether a year is less than 2005 or not: if it is less, then n=10; otherwise n=11.
The year variable is already contained within the macro call so I feel like this should be easy but I'm struggling.
For example something like this code would be ideal:
%do i= 1 %to (if &year. < 2005 then 10; else 11)
This, however, does not seem to work. Is there another way I could easily implement this? Or use something similar to what's above?
Thanks! Your help is greatly appreciated!
For this problem you can take advantage of the fact that SAS converts logical expressions to 0/1 results.
%do i= 1 %to %eval(10 + (&year >= 2005)) ;
For a more general condition just make another variable for the upper bound and use %IF/%THEN logic to set it.
%if &year < 2005 %then %let upper=10;
%else %let upper=11;
%do i= 1 %to &upper;
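For completeness, here is a minimal sketch of the %IF/%THEN approach inside a full macro definition. The macro name loop_by_year and the %PUT line are assumptions for illustration; the real loop body would go where the %PUT is.

```sas
/* Hypothetical wrapper macro: the name loop_by_year is an assumption */
%macro loop_by_year(year);
    %local i upper;

    /* Pick the upper bound from the year before the loop starts */
    %if &year < 2005 %then %let upper = 10;
    %else %let upper = 11;

    %do i = 1 %to &upper;
        /* Placeholder body: write the counter to the log */
        %put NOTE: year=&year iteration &i of &upper;
    %end;
%mend loop_by_year;

%loop_by_year(2004)
%loop_by_year(2005)
```

Declaring i and upper with %LOCAL keeps them from overwriting macro variables of the same name outside the macro.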
Related
I need to loop through dates in a format like 'yyyy-mm-dd' in a SAS macro, since my main query body uses Teradata SQL Pass-Thru. However, my code below is not working: %let wk_first_dt is not picking up the 'yyyy-mm-dd' value. The code and the error are:
%MACRO DO_APPEND;
%let first_dt_list = '2020-03-11' '2020-03-18';
%local i wk_first_dt;
%do i=1 %to %sysfunc(countw(&first_dt_list));
%let wk_first_dt = %scan(&first_dt_list, &i);
...
proc sql
...
where BILL_DT >= Date &wk_first_dt
AND SL_INVC.BILL_DT <= (Date &wk_first_dt + 7)
...
quit;
...
%END;
%MEND;
%DO_APPEND;
ERROR: Literal contains unmatched quote.
ERROR: The macro DO_APPEND will stop executing.
I did a lot of research and I believe the issue is due to the single quotes in the 'yyyy-mm-dd' format, since single quotes get special treatment in the SAS macro language. However, the most popular recommendations like
%let first_dt_list = %str(%')yyyy-mm-dd.%str(%')
won't work in my case. Please kindly point me to the right direction. Thanks in advance!
Btw, in the code above, if I change %scan(&first_dt_list, &i) to '2020-03-11', the whole macro works, but I need to loop through multiple dates. This makes me believe that once 'yyyy-mm-dd' is passed to %let wk_first_dt correctly, the issue will be fixed.
Your %SCAN() function call is wrong.
%let list = '2020-03-11' '2020-03-18';
%put %qscan(&list,1);
'2020
Since you didn't tell %SCAN() what delimiter to use it used ANY of the default set of delimiters, which includes the hyphen.
Try telling it that only space should be used as the delimiter.
%do i=1 %to %sysfunc(countw(&first_dt_list, %str( )));
%let wk_first_dt = %scan(&first_dt_list, &i,%str( ));
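A small test macro can confirm that the space-only delimiter keeps each quoted date intact. This is only a sketch: the macro name show_dates and the %PUT line are assumptions, standing in for the PROC SQL step in the question.

```sas
/* Hypothetical test macro: show_dates is an illustrative name */
%macro show_dates;
    %local first_dt_list i wk_first_dt;
    %let first_dt_list = '2020-03-11' '2020-03-18';

    /* Count and scan using a blank as the only delimiter */
    %do i = 1 %to %sysfunc(countw(&first_dt_list, %str( )));
        %let wk_first_dt = %scan(&first_dt_list, &i, %str( ));
        /* The log should show each date with its quotes and hyphens intact */
        %put NOTE: i=&i wk_first_dt=&wk_first_dt;
    %end;
%mend show_dates;

%show_dates
```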
I have a bunch of character variables which I need to sort out from a large dataset. The unwanted variables all have entries that are the same or are all missing (meaning I want to drop these from the dataset before processing the data further). The data sets are very large so this cannot be done manually, and I will be doing it a lot of times so I am trying to create a macro which will do just this. I have created a list macro variable with all character variables using the following code (The data for my part is different but I use the same sort of code):
data test;
input Obs ID Age;
datalines;
1 2 3
2 2 1
3 2 2
4 3 1
5 3 2
6 3 3
7 4 1
8 4 2
run;
proc contents
data = test
noprint
out = test_info(keep=name);
run;
proc sql noprint;
select name into : testvarlist separated by ' ' from test_info;
quit;
My idea is then to just use a data step to drop this list of variables from the original dataset. Now, the problem is that I need to loop over each variable, and determine if the observations for that variable are all the same or not. My idea is to create a macro that loops over all variables, and for each variable counts the occurrences of the entries. Since the length of this table is equal to the number of unique entries I know that the variable should be dropped if the table is of length 1. My attempt so far is the following code:
%macro ListScanner (org_list);
%local i next_name name_list;
%let name_list = &org_list;
%let i=1;
%do %while (%scan(&name_list, &i) ne );
%let next_name = %scan(&name_list, &i);
%put &next_name;
proc sql;
create table char_occurrences as
select &next_name, count(*) as numberofoccurrences
from &name_list group by &next_name;
select count(*) as countrec from char_occurrences;
quit;
%if countrec = 1 %then %do;
proc sql;
delete &next_name from &org_list;
quit;
%end;
%let i = %eval(&i + 1);
%end;
%mend;
%ListScanner(org_list = &testvarlist);
However, I get syntax errors, and with my real data I get other kinds of problems with not being able to read the data correctly, but I am taking one step at a time. I may be overcomplicating things, so if anyone has an easier solution or can see what might be wrong, I would be very grateful.
There are many ways to do this posted around.
But let's just look at the issues you are having.
First for looping through your space delimited list of names it is easier to let the %do loop increment the index variable for you. Use the countw() function to find the upper bound.
%do i=1 %to %sysfunc(countw(&name_list,%str( )));
%let next_name = %scan(&name_list,&i,%str( ));
...
%end;
Second, where is your input dataset in your SQL code? Add another parameter to your macro definition. Where do you want to write the dataset without the empty columns? That calls for another parameter as well.
%macro ListScanner (dsname , out, name_list);
%local i next_name sep drop_list ;
Third, you can use a single query to count all of the variables at once. Just use count( distinct xxxx ) instead of group by.
proc sql noprint;
create table counts as
select
%let sep=;
%do i=1 %to %sysfunc(countw(&name_list,%str( )));
%let next_name = %scan(&name_list,&i,%str( ));
&sep. count(distinct &next_name) as &next_name
%let sep=,;
%end;
from &dsname
;
quit;
So this will get a dataset with one observation. You can use PROC TRANSPOSE to turn it into one observation per variable instead.
proc transpose data=counts out=counts_tall ;
var _all_;
run;
Now you can just query that table to find the names of the columns with 0 non-missing values.
proc sql noprint ;
select _name_ into :drop_list separated by ' '
from counts_tall
where col1=0
;
quit;
Now you can use the new DROP_LIST macro variable.
data &out ;
set &dsname ;
drop &drop_list;
run;
So now all that is left is to clean up after yourself.
proc delete data=counts counts_tall ;
run;
%mend;
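The macro above drops only columns with no non-missing values. The question also wanted to drop columns whose values are all the same. Since count(distinct ...) ignores missing values, such a column has a count of 1, so one possible variation is to test col1 <= 1 instead of col1=0. The sketch below runs the same steps outside the macro on toy data; the dataset test2 and its columns const and empty are assumptions for illustration.

```sas
/* Toy data (assumed): const never varies, empty is always missing */
data test2;
    input id age const empty;
    datalines;
1 3 7 .
2 1 7 .
3 2 7 .
;
run;

/* One row holding the number of distinct non-missing values per column */
proc sql noprint;
    create table counts as
    select count(distinct id)    as id,
           count(distinct age)   as age,
           count(distinct const) as const,
           count(distinct empty) as empty
    from test2;
quit;

/* One row per variable: _NAME_ holds the column name, COL1 the count */
proc transpose data=counts out=counts_tall;
    var _all_;
run;

/* Collect columns that are entirely missing (0) or constant (1) */
%let drop_list=;
proc sql noprint;
    select _name_ into :drop_list separated by ' '
    from counts_tall
    where col1 <= 1;
quit;

%put NOTE: drop_list=&drop_list;
```

With this toy data the log should list const and empty. Note that a column mixing one repeated value with missing values also has a count of 1, so whether it should be dropped depends on how you want to treat missing values.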
As far as your specific initial question, this is fairly straightforward. Assuming &testvarlist is your macro variable containing the variables you are interested in, and creating some test data in have:
%let testvarlist=x y z;
data have;
call streaminit(7);
do id = 1 to 1e6;
x = floor(rand('Uniform')*10);
y = floor(rand('Uniform')*10);
z = floor(rand('Uniform')*10);
if x=0 and y=4 and z=7 then call missing(of x y z);
output;
end;
run;
data want fordel;
set have;
if min(of &testvarlist.) = max(of &testvarlist.)
and (cmiss(of &testvarlist.)=0 or missing(min(of &testvarlist.)))
then output fordel;
else output want;
run;
This isn't particularly inefficient, but there are certainly better ways to do this, as referenced in comments.
I am creating multiple datasets named "Taxes&i." (&i denotes each new dataset according to the counter i). I then append all of the tables at the end into the table "want". When I finish going through the first macro loop, I would like to go through another loop that changes the dates. So the line
"DATE BETWEEN '14Feb2016:0:0:0'dt AND '16Feb2016:0:0:0'dt);
would look like
date between 'date1' and 'date2';
I don't know how to create that second loop so that it goes back into the first loop. The first loop should finish, then the second loop should change the dates and run the first loop again, and so on.
There may also be a way to make this less bulky: when the first loop is done executing, maybe the dates could automatically increase by one day without being declared. That would work too. I am just not sure which approach is best, or possible.
%macro loop(list1, list2);
%let n=%sysfunc(countw(&list1, %str('')));
%do i=1 %to &n;
%let O_list1 = %scan(&list1, &i, %str('');
%let O_list2 = %scan(&list2, &i, %str('');
/* another macro here called date_loop(date1, date2); */
proc sql;
create table taxes&i;
select t1.tax_info
FROM work.taxes&1 as t1
WHERE (t1.O_LIST1 = &O_List2) AND
(DATE BETWEEN '14Feb2016:0:0:0'dt AND '16Feb2016:0:0:0'dt);
%end;
%mend;
run;
%list('1' '2', '3' '4') /*( this is "O_List1", "O_List2") */
data want;
set abc.taxes: ;
run;
Thanks for help!
Currently, my macro is running to insert a constant number of rows:
%MACRO ADD_PERIOD;
%DO P = 1 %TO 39;
Would I be able to modify this macro or create a new macro to run this, not 39 times, but replace the number of loops with a variable I have from another table?
Thank you!
Use CALL SYMPUT to turn that variable (my_var) into a macro variable (loop_var):
data _null_;
set your_table;
call symput("loop_var", my_var);
run;
and use & to resolve the macro variable into your code
%MACRO ADD_PERIOD;
%DO P = 1 %TO &loop_var;
You could also pass that macro variable as parameters into your macro.
%MACRO ADD_PERIOD(loop_var);
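Putting the pieces together, a minimal sketch might look like the following. It uses CALL SYMPUTX rather than CALL SYMPUT, which is a swap on my part: SYMPUTX strips leading and trailing blanks and converts a numeric value without a conversion note in the log. The one-row table and the %PUT body are assumptions for illustration.

```sas
/* Hypothetical control table with a single row; names follow the answer above */
data your_table;
    my_var = 5;
run;

/* Copy the value into a macro variable; if the table has several rows, the last one wins */
data _null_;
    set your_table;
    call symputx("loop_var", my_var);
run;

/* The loop count arrives as a parameter instead of being hard-coded */
%macro add_period(n);
    %local p;
    %do p = 1 %to &n;
        /* Placeholder body: the real row-insert logic would go here */
        %put NOTE: inserting period &p of &n;
    %end;
%mend add_period;

%add_period(&loop_var)
```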
Following is some C code. How do I do the same in sas?
For(i=30, j=1; i<=41, j<=12; i++, j++)
(
closure(i,j) /*calling function with input parameters I and j */
);
I basically want to do the following macro calls using a loop with two counters I and J
%closure(30,201201);
%closure(31,201202);
%closure(32,201203);
%closure(33,201204);
%closure(34,201205);
%closure(35,201206);
%closure(36,201207);
%closure(37,201208);
%closure(38,201209);
%closure(39,201210);
%closure(40,201211);
%closure(41,201212);
Please note that I do not want to use a nested loop.
Tips are appreciated.
Doing this in SAS depends on how your data is structured. It's possible to do this:
%do i = 1 %to 12;
%closure(%eval(29+&i),%eval(201200+&i));
%end;
That's a bit odd, but it should work fine.
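Since a %DO loop has to run inside a macro definition, a runnable sketch needs a small wrapper. Here %closure is replaced by a stand-in that only writes its arguments to the log, because the real macro is not shown in the question; the wrapper name call_closure is also an assumption.

```sas
/* Stand-in for the real %closure macro, which the question does not show */
%macro closure(i, j);
    %put NOTE: closure called with i=&i j=&j;
%mend closure;

/* Hypothetical wrapper: one counter k drives both arguments */
%macro call_closure;
    %local k;
    %do k = 1 %to 12;
        /* k=1 gives (30, 201201) and k=12 gives (41, 201212) */
        %closure(%eval(29 + &k), %eval(201200 + &k))
    %end;
%mend call_closure;

%call_closure
```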
You could also do it in the %closure macro. Pass i and then determine the value of the other parameters inside the macro, if they always have this relationship. If they always have some relationship, but the 2012 and 29 parts are variable, then you have several options:
Define 2012 and 29 as macro variables ahead of this step, and replace them in the code with such.
%let year=2012;
%let startrec=29;
%do i = 1 %to 12;
%closure(%eval(&startrec.+&i.),%eval(&year.00+&i.));
%end;
Use date functions to determine the value of j, if it is not always 01-12.
%closure(30,%sysfunc(intnx(month,'01JUN2011'd,&i.-1)))
(you may want to format the result in YYYYMM format, or you may be just as well off using the date result, depending on %closure)
Define all of these terms in a dataset, and call the macro from the dataset.
data to_call;
input i j;
datalines;
30 201201
31 201202
.... ;
run;
proc sql;
select cats('%closure(',i,',',j,')') into :calllist separated by ' ' from to_call;
quit;
&calllist.
That's a more 'SAS'sy way to do things, making the process data driven. Most commonly used when the i and j parameters are stored as data element somewhere (in a control table, for example, or derived from some other data source).
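The date-function option can also be sketched as a complete macro. This is an illustration under assumptions, not the only way to do it: %closure is a stand-in that logs its arguments, the driver name call_by_month and its parameters start, n and first_i are hypothetical, and the YYMMN6. format is used to render each date as YYYYMM.

```sas
/* Stand-in for the real %closure macro, which the question does not show */
%macro closure(i, j);
    %put NOTE: closure called with i=&i j=&j;
%mend closure;

/* Hypothetical driver: start, n and first_i are illustrative parameter names */
%macro call_by_month(start=01JAN2012, n=12, first_i=30);
    %local k j;
    %do k = 1 %to &n;
        /* Advance the start date by k-1 months and format it as YYYYMM */
        %let j = %sysfunc(intnx(month, "&start"d, %eval(&k - 1)), yymmn6.);
        %closure(%eval(&first_i + &k - 1), &j)
    %end;
%mend call_by_month;

%call_by_month()
```

Because the second argument comes from a date calculation, the same macro keeps working when the range crosses a year boundary, which plain integer arithmetic on 201200 would not handle.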
So if you want
%closure(30,201201);
%closure(31,201202);
%closure(32,201203);
%closure(33,201204);
%closure(34,201205);
%closure(35,201206);
%closure(36,201207);
%closure(37,201208);
%closure(38,201209);
%closure(39,201210);
%closure(40,201211);
%closure(41,201212);
then it would be better to either compute the value of j from a base of 201200 (or something close to it),
or start the j loop at 201201 and end it at 201212.
Simply go for:
For(i=30, j=201201; i<=41, j<=201212; i++, j++)
(
closure(i,j)
);