SAS macro loop and dummy variable - loops

I just want to create a new dummy variable whenever a certain value is present.
Here is an example of my original data.
ID A1 A2... A10
1 10 1 5
2 20 8 4
...
...
I would like to add a dummy variable whenever a certain value appears in those attributes.
For example, subject ID 1 has the value "10", so a new variable, Add10, would be 1.
ID A1 A2.. A10 Add1..Add4 Add5...Add20
1 10 1.. 5 1 ...0 1 ... 0
2 20 8.. 4 0 ...1 0 ... 1
...
Here is my code..
%MACRO DO_LIST;
%DO I=1 %TO 20;
data aaaa;
set aa33;
if A1 =i or
A2 =i or
A3 =i or
...
A10 =i then Add&I=I ;
RUN;
%END;
%MEND DO_LIST;
%DO_LIST;
However, my result has only Add20, which is the last variable.
I think I made a mistake in the loop statement. Would you mind helping me out?
Thanks in advance.

Right now you're always using the same data set as the input to aaaa and you're not changing this dataset with each loop. Thus, you'll always get Add20 only as this is what the last iteration of the loop will do.
A simple fix to this would be:
data append;
set aa33;
run;
%MACRO DO_LIST;
%DO I=1 %TO 20;
data append;
set append;
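/* Use the macro counter &I; a bare i would be an uninitialized data step variable */
if A1 = &I or
A2 = &I or
A3 = &I or
.....
A10 = &I then Add&I = 1;
else Add&I = 0;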
RUN;
%END;
%MEND DO_LIST;
%DO_LIST;
You essentially want to add a column to your dataset each time the loop runs, rather than replacing it with the original dataset (aa33) plus the result of only the current iteration.
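As an alternative sketch, the %DO loop can sit inside a single data step, so the dataset is read only once and all twenty assignment statements are generated in one pass. This assumes the attributes are named A1 through A10 without gaps, and it swaps the chained OR test for the WHICHN function; the output name aaaa is taken from the question.

```sas
%macro do_list;
  data aaaa;
    set aa33;
    %do i = 1 %to 20;
      /* WHICHN returns the position of &i among A1-A10, or 0 if it is absent, */
      /* so the comparison yields 1 when the value is present and 0 otherwise  */
      Add&i = (whichn(&i, of A1-A10) > 0);
    %end;
  run;
%mend do_list;

%do_list
```

Because the macro loop only generates text, the data step that actually runs contains twenty plain assignment statements, Add1 through Add20.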
If you know the max # is 20, the following should work without a macro
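data test;
set aa33;
array add[20] add1-add20;
/* List a1-a10 explicitly: the a: wildcard would also pick up add1-add20 */
array a[*] a1-a10;
/* Start every dummy at 0 */
do i = 1 to dim(add);
add[i] = 0;
end;
do i = 1 to dim(a);
value = a[i];
/* Skip missing or out-of-range values to avoid a subscript error */
if 1 <= value <= 20 then add[value] = 1;
end;
drop i value;
run;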
I think that's what you're looking for, it'd help if you'd fill in at least the first two full rows of your example.

Related

DO loops to update allocation SAS

I am working on a clinical trial where I have to create an allocation variable (Zjt) based on the variable Xjk, the treatments T and K (1, 2, 3), and the age factor J (1, 2, 3). We assume that a patient who falls in age factor j can be assigned any of the K treatments (1, 2, 3). So a patient in age factor 1 can be assigned X11, X12, or X13; in factor 2, X21, X22, or X23; in factor 3, X31, X32, or X33. Z is the variable that keeps the count for each assigned treatment. T and K are both treatments used in different scenarios.
The sample data looks like this:
Subject J K T X
1 1 2 2 X12
2 2 2 2 X22
3 1 1 1 X11
4 2 1 1 X21
..............
2310 1 3 X13
data four;
set four;
If J = 1 and K=1 then X=X11
If J=2 and K=1 then X=X21
If J=3 and K=2 then X=X32
data four;
set four;
If J=1 and T=1 then Z11=X11+1 Z12=X12 Z13=X13
If J=1 and T=2 then Z11=X11 Z12=X12+1 Z13=X13
If J=1 and T=3 then Z11=X11 Z12=X12 Z13=X13+1
If J=2 and T=1 then Z21=X21+1 Z22=X22 Z23=X23
If J=2 and T=2 then Z21=X21 Z22=X22+1 Z23=X23
If J=2 and T=3 then Z21=X21 Z22=X22 Z23=X23+1
then it repeats for factor 3.
Each time T=K the Z(count for Xjk) increases by 1, if T is not equal to K then Z remains the same. I think I would need an array to check the condition each time and have no idea how to do it as I am very new to SAS. I have no idea how to program the Z as the arrays I have created have failed. Any help will be appreciated.
It really looks like you are making this much too complex. (or else I do not understand the question.)
It sounds like you have a list of subjects, what age group they are in and what two treatment groups they were assigned to. So it sounds like you have data like this:
data have ;
input Subj AgeGrp Trt1 Trt2 ;
cards;
1 1 2 2
2 2 2 2
3 1 1 1
4 2 1 1
;
You could just use PROC FREQ to count how many fall into each combination.
proc freq data=have; tables trt1*trt2 ; run;
If you want the counts separately for each age group then add that variable also.
proc freq data=have; tables agegrp*trt1*trt2 ; run;
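If the counts are needed as data rather than as a printed table, PROC FREQ can also write them to a dataset. The following is a minimal sketch using the have dataset above; the output name counts is only an illustrative choice.

```sas
proc freq data=have noprint;
  /* One output row per observed AgeGrp*Trt1*Trt2 combination; */
  /* the OUT= dataset holds the variables COUNT and PERCENT    */
  tables agegrp*trt1*trt2 / out=counts;
run;

proc print data=counts;
run;
```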

Using macrovariables in calculus with arrays SAS

I would like some help with SAS Arrays and do loops
I have some code which nearly works and would like an explanation of why it doesn't work as expected. I wish to calculate the max of the elements in an array, taking only a limited number of values, where that number is determined by a calculated variable.
DATA VALUES;
INPUT VAL;
DATALINES;
1
2
3
4
5
6
7
8
9
10
;
RUN;
%macro prueba(dataset);
DATA PRUEBA;
SET &dataset;
ARRAY DIAS(11) V1-V11 (4 5 6 7 8 9 10 88 75 46 71);
k = _n_ + 1;
IF k le dim(DIAS) THEN
%DO i = 1 %TO k;
Maxi = max(of V1 - V&i);
%END;
RUN;
%MEND;
%prueba(VALUES);
The error message:
A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was:
k
ERROR: The %TO value of the %DO I loop is invalid.
ERROR: The macro PRUEBA will stop executing.
Thanks
The error message is telling you that the upper bound for your %DO loop is a character string instead of a number. Macro code is resolved before the data step runs, so the macro processor sees k as constant text, not as the value of the data step variable k.
Just use a regular DO loop instead.
%macro prueba(dataset);
DATA PRUEBA;
SET &dataset;
ARRAY DIAS(11) V1-V11 (4 5 6 7 8 9 10 88 75 46 71);
k = _n_ + 1;
do i=1 to min(k,dim(DIAS));
Maxi = max(maxi,dias(i));
end;
RUN;
%MEND;
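The timing difference can be illustrated with a small sketch (the macro name show_timing and its parameter n are hypothetical). A %DO loop is fine when its bound is a macro value that is known while the code is being generated; it then simply writes out repeated data step statements.

```sas
%macro show_timing(n);
  data _null_;
    /* &n is resolved by the macro processor before the step compiles, */
    /* so this loop generates &n separate PUT statements               */
    %do i = 1 %to &n;
      put "generated statement &i";
    %end;
  run;
%mend show_timing;

%show_timing(3)
```

A data step variable such as k only receives a value while the step is executing, which is too late for %DO to use it as a bound.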

Dynamically set size of array without hardcoding

So I have datasets with variables and values like so:
A1 A2 A3 A4 A5 A6
1 3 5 6 10 2
The variables can go up to A2000 in certain cases. I want to perform the same operation on each variable using an array. Is there a way to dynamically set the size of the array without manually typing it?
Example code of what I am striving for is below
data A;
input A1-A6;
datalines;
1 3 5 6 10 2
;
run;
data A;
set A;
array a[*] a1-a&size;
do i=1 to &size;
{perform some operation here}
end;
run;
My question is how can I write code to get the parameter &size that represents the size of the array? In this example, &size = six.
Sure, use the : wildcard. Note that a: matches every variable whose name starts with a, and it only works if a1-a6 (or a-whatever) are already defined in the dataset.
data have;
input a1-a6;
datalines;
1 2 3 4 5 6
7 8 9 10 11 12
;;;;
run;
data want;
set have;
array a a:;
do i=1 to dim(a);
sum = sum(sum ,a[i]);
end;
run;
Otherwise, what you put above would absolutely work. You don't need the [*] bit, though, and I prefer to keep the dim instead of &size on the loop control in case you change the way this works in the future. Of course you need to have a way to determine &size which will depend on your data.
%let size=6;
data want;
set have;
array a a1-a&size.;
do i=1 to dim(a);
sum = sum(sum ,a[i]);
end;
run;
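One possible way to derive &size from the data, shown here only as a sketch, is to count the matching columns in the DICTIONARY.COLUMNS table. It assumes the dataset is WORK.HAVE and that the variables of interest are exactly those named a followed by digits.

```sas
proc sql noprint;
  /* Count variables named a1, a2, ... in WORK.HAVE;      */
  /* TRIMMED strips the blanks around the stored value    */
  select count(*) into :size trimmed
  from dictionary.columns
  where libname = 'WORK'
    and memname = 'HAVE'
    and prxmatch('/^a\d+\s*$/i', name) > 0;
quit;

%put NOTE: size=&size;
```

With the two-row have dataset above this should report a size of 6, which can then feed the array a a1-a&size. statement.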

Loop each variable into SAS macro

I have several variables in data set survey. I want to write a loop to load each variable into a SAS macro.
the code is below.
%let var= r1 r2 r3 ;
DATA survey;
INPUT id sex $ age inc r1 r2 r3 ;
DATALINES;
1 F 35 17 7 2 2
17 M 50 14 5 5 3
33 F 45 6 7 2 7
49 M 24 14 7 5 7
65 F 52 9 4 7 7
81 M 44 11 7 7 7
2 F 34 17 6 5 3
18 M 40 14 7 5 2
34 F 47 6 6 5 6
50 M 35 17 5 7 5
;
%MACRO bvars(input);
proc univariate data = "D:\hsb2" plots;
var &input.;
run;
%MEND bvars;
I just want the values in &var to be passed to the macro bvars one variable at a time, instead of writing the following.
%bvars(r1)
%bvars(r2)
%bvars(r3)
.....
This is time consuming when the number of variables is greater than 100.
This will run proc univariate for all the variables in survey which start with "r" (so r1, r2, etc.). Procedures with a var statement usually accept multiple variables.
proc univariate data = survey;
var r:;
run;
If you wish to run it for all numeric variables, replace r: with _NUMERIC_.
If you want to loop through the variables and call a macro separately each time, there are several approaches. Usually they involve a macro %do loop (which must be inside a macro) like so:
%macro looper(inData);
/* List all the variable names */
proc contents data = &inData. out = _colNames noprint;
run;
proc sql noprint;
select name
/* Put the variable names in a macro variable list */
into :colNames separated by " "
from _colNames
/* Get only numeric variables */
where type = 1
order by varnum;
quit;
/* Loop through the variable names */
%do i = 1 %to %sysfunc(countw(&colNames.));
%let colName = %scan(&colNAmes., &i.);
%put &colName.;
/* Your macro call or code here */
/* %bvars(&inData., &colName.) */
%end;
%mend looper;
%looper(sashelp.cars);
It might prove useful for you to become familiar with macro %do loops, proc contents (or better yet proc datasets), the %scan() function and the different ways to assign macro variables. The SAS documentation online is a great place to start.
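When the variable names are already held in a macro variable, as with &var in the question, the same %scan pattern can walk that list directly. This is a minimal sketch; the macro name run_each is hypothetical, and it assumes bvars is defined as in the question with a single positional parameter.

```sas
%macro run_each(varlist);
  %local i v;
  /* Count the blank-separated names, then pick them off one at a time */
  %do i = 1 %to %sysfunc(countw(&varlist, %str( )));
    %let v = %scan(&varlist, &i, %str( ));
    %bvars(&v)
  %end;
%mend run_each;

%run_each(&var)
```

With %let var= r1 r2 r3; this generates the calls %bvars(r1), %bvars(r2) and %bvars(r3).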
Updated answer.
You can use the SASHELP.VCOLUMN view, which SAS maintains automatically. It contains one row for each variable of each dataset in every assigned library, including the Work library.
I am assuming your survey dataset is in the Work library.
The code does the following:
1. Looks up your dataset in the VCOLUMN view, keeps only the variable names (that's all we need), and stores them in the dataset temp.
2. Runs the bvars macro for every variable via the call execute routine.
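data temp(keep=name);
set Sashelp.Vcolumn;
/* Keep numeric variables only, since proc univariate cannot analyze character ones */
where libname = 'WORK' and memname = 'SURVEY' and type = 'num';
run;
*Call macro using call execute;
data _null_;
set temp;
/* Single quotes stop the macro call from resolving while this step compiles; */
/* %nrstr delays its execution until after this step has finished */
call execute(cats('%nrstr(%bvars(', name, '))'));
run;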

Calculating moving average using do loop in SAS

I am trying to find a way to calculate a moving average using SAS do loops. I am having difficulty. I essentially want to calculate a 4 unit moving average.
DATA data;
INPUT a b;
CARDS;
1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
17 18
;
run;
data test(drop = i);
set data;
retain c 0;
do i = 1 to _n_-4;
c = (c+a)/4;
end;
run;
proc print data = test;
run;
One option is to use the merge-ahead:
DATA have;
INPUT a b;
CARDS;
1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
17 18
;
run;
data want;
merge have have(firstobs=2 keep=a rename=(a=a_1)) have(firstobs=3 keep=a rename=(a=a_2)) have(firstobs=4 keep=a rename=(a=a_3));
c = mean(of a:);
run;
Merge the data with itself, with each copy starting one observation later, so the second copy starts at observation 2, the third at observation 3, and so on. That puts four consecutive values of 'a' on one line.
SAS has a lag() function. It returns the value its argument had on an earlier observation (strictly, on an earlier call to the function). So, for example, if your data looked like this:
DATA data;
INPUT a ;
CARDS;
1
2
3
4
5
;
Then the following would create lag-one, lag-two, and lag-three variables:
data data2;
set data;
a_1=lag(a);
a_2=lag2(a);
a_3=lag3(a);
run;
This creates the following dataset:
a a_1 a_2 a_3
1 . . .
2 1 . .
3 2 1 .
4 3 2 1
etc.
Moving averages can be easily calculated from these.
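As a minimal sketch of that last step, a 4-unit moving average can be taken over the current value and its three lags. The output name moving and the variable c are illustrative choices, and the average is left missing until four values are available.

```sas
data moving;
  set data;
  /* Call the lag functions on every observation so their queues stay in step */
  a_1 = lag(a);
  a_2 = lag2(a);
  a_3 = lag3(a);
  /* Only average once four values exist */
  if _n_ >= 4 then c = mean(a, a_1, a_2, a_3);
run;
```

The lag calls are kept outside the IF on purpose: a lag function only updates its queue when it is executed, so calling it conditionally would return unexpected values.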
Check out http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212547.htm
(Please note, I did not get a chance to run the code, so it may have errors.)
Straight from Cody's Collection of Popular Programming Tasks and How to Tackle them.
*Presenting a macro to compute a moving average;
%macro Moving_ave(In_dsn=, /*Input data set name */
Out_dsn=, /*Output data set name */
Var=, /*Variable on which to compute
the average */
Moving=, /* Variable for moving average */
n= /* Number of observations on which
to compute the average */);
data &Out_dsn;
set &In_dsn;
***compute the lags;
_x1 = &Var;
%do i = 1 %to &n - 1;
%let Num = %eval(&i + 1);
_x&Num = lag&i(&Var);
%end;
***if the observation number is greater than or equal to the
number of values needed for the moving average, output;
if _n_ ge &n then do;
&Moving = mean (of _x1 - _x&n);
output;
end;
drop _x:;
run;
%mend Moving_ave;
*Testing the macro;
%moving_Ave(In_dsn=data,
Out_dsn=test,
Var=a,
Moving=Average,
n=4)
