I am very new to SAS so I apologize in advance. I am using the SAS university edition.
I have 20 datasets, each from a certain year (1997-2017), all containing information captured in 30 variables. I want to apply the same code to all of the datasets, but some code chunks should apply only to certain years. Therefore, I wanted to use a macro that ranges from 1997-2017 doing something like...
LIBNAME IN '/folders/myfolders/fake_data';
%let j= 1997 to 2017;
data fake_&j;
set fake_data;
proc import out= fake_&j datafile = "/folders/myfolders/fake_data/mz_&j.dta" replace
* Year;
year = j;
to access the dataset fake_1997.dta, create a year variable that takes on the value of the dataset's name (1997) apply the code (see below) to it, then do the same with mz_1998.dta and so on.
An example of the code that I want to apply to all of the data would be
* Weights;
if (j GE 1997 AND j LE 2004) then
shrf = x;
else if (j GE 2005 AND j LE 2017) then
shrf = y;
Thank you so much in advance!
Macro code, in part, is code that writes code. The writing is not so much an active process like a 'write' or 'print' or 'echo', but more akin to a boilerplate or template system.
The macro %DO loop can not exist in 'open code', so it must be coded inside a macro definition. The macro is 'invoked' in order to have it write (or generate) the code. You might sometimes see the term 'gencode' to mean generated code produced by invoking a macro.
Proc IMPORT is great for reading consistently structured data on a regular basis, or for first explorations. IMPORT does not do any data transformations or allow you to add new variables during the import. You will need a second step, a DATA step, to perform those actions.
Name your macros according to their purpose. Any macro variables used inside the macro should be declared as %LOCAL to prevent unwanted interaction with global macro variables.
Example:
%macro getData(fromYear=, toYear=);
%local year;
%DO year = &fromYear %to &toYear;
* step 1;
* get initial data set from raw data file;
* double dot needed because &<NAME>. is a token specifying macro variable resolution;
proc import
datafile = "/folders/myfolders/fake_data/mz_&YEAR..dta" /* double dot */
replace
out= import_fake_&YEAR.
;
* step 2;
data fake_&YEAR;
set import_fake_&YEAR.;
year = &YEAR;
%* macro %if codegens a data step statement specific to year;
%if &YEAR GE 1997 AND &YEAR LE 2004 %then %do;
/* anything that is not consumed by macro processing is emitted as a codegen */
/* so here the macro is emitting a data step assignment statement */
shrf = x;
%end;
%else
%if &YEAR GE 2005 AND &YEAR LE 2017 %then %do;
shrf = y;
%end;
%else %do;
shrf = 1; * uniform weighting ;
%end;
run;
%END;
%mend;
%* invoke;
%getData(fromYear=1997, toYear=2017);
%* at this point you might want to combine (stack) all the data sets together
%* so that other Procs can use the 'all' data and utilize CLASS, BY and WHERE
%* statements that are so effective in SAS;
data fake_duodecade;
set fake_1997-fake_2017; %* special data set name list construct;
run;
Specify macro parameters in your macro definition to make it more useful and reusable. Don't write macros that reproduce the capabilities of existing Procedures. Don't write macros when you don't need to. Don't gencode what you can't write yourself. Don't confuse the scope of macro variables, and remember that they are not the same as DATA step variables.
An approach that doesn't involve nesting macros is the magical data _null_ with call execute. I use this all the time. It's most helpful if the datasets are already in a SAS format.
libname HAVELIB "path-to-sas-datasets";
data _null_;
set sashelp.vtable(where=(libname="HAVELIB"));
call execute('%mymacro(HAVELIB.' || strip(memname) || ');'); /* single quotes keep the macro from resolving at compile time */
run;
Creating a loop that imports a bunch of .dta files as sas7bdat is just as easy: create a dataset from a directory listing (for example, an INFILE that pipes the output of a dir command) and do a similar loop using call execute.
More info here:
https://www.lexjansen.com/phuse/2014/cc/CC06.pdf
I need to loop through dates in 'yyyy-mm-dd' format in a SAS macro, since my main query body uses Teradata SQL pass-through. However, my code below is not working: %let wk_first_dt is not picking up the 'yyyy-mm-dd' value. The error messages are shown after the code.
%MACRO DO_APPEND;
%let first_dt_list = '2020-03-11' '2020-03-18';
%local i wk_first_dt;
%do i=1 %to %sysfunc(countw(&first_dt_list));
%let wk_first_dt = %scan(&first_dt_list, &i);
...
proc sql
...
where BILL_DT >= Date &wk_first_dt
AND SL_INVC.BILL_DT <= (Date &wk_first_dt + 7)
...
quit;
...
%END;
%MEND;
%DO_APPEND;
ERROR: Literal contains unmatched quote.
ERROR: The macro DO_APPEND will stop executing.
I did a lot of research and I believe the issue is due to the single quotes in the 'yyyy-mm-dd' format, since SAS macro treats single quotes specially. However, the most popular recommendations like
%let first_dt_list = %str(%')yyyy-mm-dd.%str(%')
won't work in my case. Please kindly point me to the right direction. Thanks in advance!
Btw, in the code above, if I change %scan(&first_dt_list, &i) to '2020-03-11', the whole Macro works - but i just need to loop through multiple dates. This makes me believe once 'yyyy-mm-dd' is passed to %let wk_first_dt, the issue would be fixed.
Your %SCAN() function call is wrong.
%let list = '2020-03-11' '2020-03-18';
%put %qscan(&list,1);
'2020
Since you didn't tell %SCAN() what delimiter to use it used ANY of the default set of delimiters, which includes the hyphen.
Try telling it that only space should be used as the delimiter.
%do i=1 %to %sysfunc(countw(&first_dt_list, %str( )));
%let wk_first_dt = %scan(&first_dt_list, &i,%str( ));
I currently have the macro below
%macro sqlloop (event_id);
...lots of code, mostly proc sql segments ...
%mend;
that generates an output table (named export_table2). I need to be able to run this code dozens of times, once for every row in another table (named vars). My trial code, testing what I want it to do, is below (basically manually typing in the first two rows of this 68-row table).
data ;
%let empl_nbr_var = '222';
%let fleet = '7ER';
%let position = 'A';
%let base = 'BWI';
%sqlloop(event_id = 1);
run;
data summary_pilots;
set work.export_table2;
run;
data;
%let empl_nbr_var = '111';
%let fleet = '320';
%let position = 'B';
%let base = 'CHS';
%sqlloop(event_id = 2);
run;
data summary_pilots;
set summary_pilots work.export_table2;
run;
This produces the final output of each execution stacked into one table called summary_pilots. How can I do this in a loop, perhaps using call execute to iterate through each row of vars? The columns of vars are exactly what I need for the macro variables, and I want to iterate through every single row to assign those macro variables and run my %sqlloop again. Thanks for the help!
EDIT:
I am currently figuring out how call execute works and I see how it's helpful here, but I'm still a bit stuck. The code below works exactly as you'd think, printing all the variables in the table vars to the log.
data ;
set work.vars;
call execute( '%put='|| strip(empl_nbr_var) || ';
%put = ' || strip(fleet) ||';
%put = '|| strip(position) ||';
%put = ' || strip(base) ||';' );
run;
I am trying to use the below code, but am getting a crazy amount of errors due to the macros being assigned weirdly. The types in the columns of vars match exactly what I want them to be in the macros, but it still looks like that might be the issue here?
data ;
set work.vars;
call execute( '
%let empl_nbr_var =' || strip(empl_nbr_var) || ';
%let fleet = ' || strip(fleet) ||';
%let position = '|| strip(position) ||';
%let base = ' || strip(base) ||';
%sqlloop(event_id = 17);' );
run;
The event ID doesn't actually matter here, so I just left it as a random number for now.
Assuming your work.Vars contain data like this:
empl_nbr_var  fleet  position  base
222           7ER    A         BWI
111           320    B         CHS
...           ...    ...       ...
Consider extending your macro to receive such input parameters:
%macro sqlloop(event_id, empl_nbr_var, fleet, position, base);
...lots of code, mostly proc sql segments ...
%mend;
Then build the macro call from the concatenated data values and run it via call execute. The code below passes 17 as the event_id parameter.
data _null_;
set Work.Vars;
args = catx("', '", empl_nbr_var, fleet, position, base);
args = '%sqlloop(17,'''|| strip(args) || ''');';
put args $char.; /* VIEW CALL COMMAND */
call execute(args); /* RUN CALL COMMAND */
run;
It makes no sense to code %LET statements in the middle of a data step. The macro processor evaluates them while the text of the data step is still being read, before SAS compiles and runs the step. Avoid confusing yourself by moving the %LET statements before the data step.
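As a small illustration of that point, here is a minimal sketch (the macro variable name demo and the dataset variable x are made up for this example). Both %LET statements are handled while the step is being read, so the IF condition has no influence on them.

```sas
data _null_;
  x = 1;
  %let demo = first;
  if x = 2 then do;
    /* this %LET still runs: the macro processor never sees the IF condition */
    %let demo = second;
  end;
  /* &demo is resolved when this line is read, after both %LET statements */
  put "demo resolves to &demo";
run;
```

Even though x is never 2, the log line should read "demo resolves to second", because the last %LET encountered in the source text wins.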
If the macro needs the values of macro variables, like FLEET, as input, then make those things parameters to the macro. Don't create a macro that references "magic" macro variables, that is, macro variables that are neither input parameters nor created by the macro, whose references just appear in the middle of the macro definition as if their values will arrive by magic.
%macro sqlloop(empl_nbr_var,fleet,position,base);
... code that uses &fleet.
%mend;
If you have a lot of combinations of parameters you want run through your macro then collect them into a dataset first.
data inputs ;
input empl_nbr_var fleet $ position $ base $ ;
cards;
222 7ER A BWI
111 320 B CHS
;
Then you can use those dataset variables to generate the calls to the macro. You could try using call execute() to do this, but personally I find it a lot easier to use a data step to write the code to a file. Then you can examine the file and make sure the code generation logic is correct. Plus you can use the power of the PUT statement to make the code generation easier. For example if the variable names match the parameter names you can use named output.
filename code temp;
data _null_;
set inputs;
file code ;
put '%sqlloop(' empl_nbr_var= ',' fleet= ',' position= ',' base= ')';
run;
Which will generate code like:
%sqlloop(empl_nbr_var=222 ,fleet=7ER ,position=A ,base=BWI )
%sqlloop(empl_nbr_var=111 ,fleet=320 ,position=B ,base=CHS )
Once you are confident that it is generating the right code, use the %INCLUDE statement to run the code it generates.
%include code / source2;
If the macro does not have its own step for aggregating the results you could include that step in the code generation.
filename code temp;
data _null_;
set inputs;
file code ;
put '%sqlloop(' empl_nbr_var= ',' fleet= ',' position= ',' base= ')';
put 'proc append base=summary_pilots data=export_table2 force; run;' ;
run;
%include code / source2;
I have a very simple request: loop through a dataset, turn each observation into a macro variable, and then do a comparison on that macro variable. Here's what my code looks like:
%do n = 1 %to &i2.;
data want;
set have;
%if _N_ = &n. %then %do;
call symputx("Var1",var1);
call symputx("var2",var2);
%end;
run;
data want;
retain FinalCount
set have;
where Variable1="&var1.";
by SomeVariable
if first.SomeVariable then FinalCount=0;
if final="FINAL" then FinalCount+1;
if Finalcount=&var2. then Final_Samples=1;
finalCount=FinalCount;
run;
%end
The part that is failing is the _N_ = &n. section. I keep getting the error "Variable N has been defined as both character and numeric." Basically I just need to set each observation as a macro variable once to do the next comparison, and then move on to the next one. So, if there's a better way of doing that, please let me know. Otherwise, could you help me figure out why that comparison is not working?
If you could explain your larger problem then you might get a better answer that does not require you to convert your data values into macro variables. Converting values to strings and then trying to compare them again introduces a number of sources of errors.
To your question of how to set macro variables based on the Nth observation in a dataset, try one of these.
If the dataset supports the FIRSTOBS= and OBS= dataset options, use them.
data _null_;
set have (firstobs=&n obs=&n);
call symputx("Var1",var1);
call symputx("var2",var2);
run;
If the dataset supports direct access then use that.
data _null_;
p = &n;
set have point=p;
call symputx("Var1",var1);
call symputx("var2",var2);
stop;
run;
If not then use an IF (not a macro %if).
data _null_;
set have ;
if _n_ = &n then do;
call symputx("Var1",var1);
call symputx("var2",var2);
stop;
end;
run;
I am trying to figure out how to call a macro variable in a loop within a data step in SAS, but I am lost; so I have 14 macro variables and I have to compare each of them to the entries of a vector. I tried:
data work.calendrier;
set projet.calendrier;
do i=1 to 3;
if date= "&vv&i"D then savinglight = 1;
end;
run;
But it is not working. The macro variables vv1 up to vv3 hold date values. For instance, this code works:
data work.calendrier;
set projet.calendrier;
*do i=1 to 3;
if date= "&vv1"D then savinglight = 1;
*end;
run;
But with the loop it can not resolve the macro variable.
If you want to reference a macro variable with a number index like vv1,vv2,vv3 you need to resolve &i first.
SAS has a separate macro processor that resolves values before they reach the data step processor.
Essentially, you need to add extra ampersands at the beginning of your macro variable:
&&vv&i -> &vv1 -> "Value of vv1"
&&vv&i -> &vv2 -> "Value of vv2"
&&vv&i -> &vv3 -> "Value of vv3"
What happens here is that the macro processor scans the reference from left to right. It resolves && to a single &, keeps the text vv, and resolves &i to its value. That leaves the reference &vv1 (when i is 1), which is then scanned again and resolved to the value of vv1.
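The double-ampersand resolution described above can be checked directly in the log. This is a minimal sketch with made-up values for vv1 and vv2; note that i here is a macro variable, not a data step variable.

```sas
%let vv1 = 08MAR2020;  /* hypothetical values for illustration */
%let vv2 = 01NOV2020;
%let i = 2;

options symbolgen;     /* log each resolution step */
%put Indirect reference resolves to: &&vv&i;
options nosymbolgen;
```

With i set to 2, the %PUT line should show 01NOV2020, and the SYMBOLGEN notes show && becoming & before &vv2 is resolved.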
A couple of sources about this interesting topic:
http://www2.sas.com/proceedings/sugi29/063-29.pdf
http://www.lexjansen.com/nesug/nesug04/pm/pm07.pdf
Macro variable references are resolved before SAS compiles and runs your data step. You need to first figure out how to do what you want using SAS statements then, if necessary, you can use macro code to help you generate those statements.
If you want to test if a variable's value matches one of a list of values then consider using the IN operator.
data work.calendrier;
set projet.calendrier;
savinglight = date in ("&vv1"d,"&vv2"d,"&vv3"d);
run;
You need to use a macro. Here's the basic approach:
%let vv1 = 9;
%let vv2 = 2;
%let vv3 = 10;
data have;
drop i;
do i = 1 to 5;
date = i;
output;
end;
run;
%macro test;
data test;
set have;
%do i=1 %to 3;
if date= &&vv&i then savinglight = 1;
%end;
run;
%mend test;
%test;
Following is some C code. How do I do the same in sas?
For(i=30, j=1; i<=41, j<=12; i++, j++)
(
closure(i,j) /*calling function with input parameters I and j */
);
I basically want to do the following macro calls using a loop with two counters I and J
%closure(30,201201);
%closure(31,201202);
%closure(32,201203);
%closure(33,201204);
%closure(34,201205);
%closure(35,201206);
%closure(36,201207);
%closure(37,201208);
%closure(38,201209);
%closure(39,201210);
%closure(40,201211);
%closure(41,201212);
Please note that I do not want to use a nested loop.
Tips are appreciated.
Doing this in SAS depends on how your data is structured. Inside a macro definition (a macro %DO loop cannot run in open code), it's possible to do this:
%do i = 1 %to 12;
%closure(%eval(29+&i),%eval(201200+&i));
%end;
That's a bit odd, but it should work fine.
You could also do it in the %closure macro. Pass i and then determine the value of the other parameters inside the macro, if they always have this relationship. If they always have some relationship, but the 2012 and 29 parts are variable, then you have several options:
Define 2012 and 29 as macro variables ahead of this step, and replace them in the code with such.
%let year=2012;
%let startrec=29;
%do i = 1 %to 12;
%closure(%eval(&startrec.+&i.),%eval(&year.00+&i.));
%end;
Use date functions to determine the value of j, if it is not always 01-12.
%closure(30,%sysfunc(intnx(month,'01JUN2011'd,&i.-1)))
(you may want to format the result in YYYYMM format, or you may be just as well off using the date result, depending on %closure)
Define all of these terms in a dataset, and call the macro from the dataset.
data to_call;
input i j;
datalines;
30 201201
31 201202
.... ;
run;
proc sql;
select cats('%closure(',i,',',j,')') into :calllist separated by ' ' from to_call;
quit;
&calllist.
That's a more 'SAS'sy way to do things, making the process data driven. It is most commonly used when the i and j parameters are stored as data elements somewhere (in a control table, for example, or derived from some other data source).
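For the date-function option above, here is a minimal sketch of how the result could be formatted as YYYYMM. The macro name demoClosureCalls is hypothetical, and the start date of 01JAN2012 is an assumption chosen to reproduce the 201201-201212 sequence from the question. It only writes the would-be calls to the log instead of invoking %closure.

```sas
%macro demoClosureCalls;
  %local i j;
  %do i = 1 %to 12;
    /* advance i-1 months from the assumed start date, formatted as YYYYMM */
    %let j = %sysfunc(intnx(month,'01JAN2012'd,%eval(&i-1)),yymmn6.);
    %put closure(%eval(29+&i),&j);
  %end;
%mend demoClosureCalls;

%demoClosureCalls;
```

This should write twelve lines to the log, from closure(30,201201) through closure(41,201212). Replacing the %PUT line with a real %closure call would run the macro with the same arguments.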
So if you want
%closure(30,201201);
%closure(31,201202);
%closure(32,201203);
%closure(33,201204);
%closure(34,201205);
%closure(35,201206);
%closure(36,201207);
%closure(37,201208);
%closure(38,201209);
%closure(39,201210);
%closure(40,201211);
%closure(41,201212);
then it would be better either to calculate the value of j by adding an offset such as 201200 to it,
or to start the j loop at 201201 and end it at 201212.
Simply go for
For(i=30, j=201201; i<=41, j<=201212; i++, j++)
(
closure(i,j)
);