I would like some help with SAS arrays and DO loops.
I have some code that nearly works and would like an explanation of why it doesn't work as expected. I want to calculate the max of the elements in an array, taking only a limited number of values, where that number is determined by a calculated variable.
DATA VALUES;
INPUT VAL;
DATALINES;
1
2
3
4
5
6
7
8
9
10
;
RUN;
%macro prueba(dataset);
DATA PRUEBA;
SET &dataset;
ARRAY DIAS(11) V1-V11 (4 5 6 7 8 9 10 88 75 46 71);
k = _n_ + 1;
IF k le dim(DIAS) THEN
%DO i = 1 %TO k;
Maxi = max(of V1 - V&i);
%END;
RUN;
%MEND;
%prueba(VALUES);
The error message:
A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was:
k
ERROR: The %TO value of the %DO I loop is invalid.
ERROR: The macro PRUEBA will stop executing.
Thanks
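The error message is telling you that the upper bound of your %DO loop is a character string instead of a number. Macro code is resolved before the DATA step runs, so the macro processor never sees the value of the variable k; it sees only the constant text k.
Just use a regular DO loop instead.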
%macro prueba(dataset);
DATA PRUEBA;
SET &dataset;
ARRAY DIAS(11) V1-V11 (4 5 6 7 8 9 10 88 75 46 71);
k = _n_ + 1;
do i=1 to min(k,dim(DIAS));
Maxi = max(maxi,dias(i));
end;
RUN;
%MEND;
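To see the timing issue in isolation, here is a minimal sketch. The macro name demo and the macro variable k_limit are made up for illustration. A %DO loop works only when its bounds resolve to numbers while the macro is generating code, so the bound has to come from a macro variable rather than from a DATA step variable.

```sas
/* Hypothetical example: k_limit is a macro variable, so its value is
   known while the macro generates code, before any DATA step runs */
%let k_limit = 3;

%macro demo;
  %do i = 1 %to &k_limit;
    %put NOTE: generating code for i=&i;
  %end;
%mend demo;

%demo
```

If the bound depends on values that exist only at run time, such as _n_, a regular DO loop like the one above is the natural fit.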
I want to get a data set with an array that saves the count of values greater than zero in a subset of an array.
My code:
%Macro Test(input_array, window);
array initial{*} &input_array;
array position[&window];
array cumulative[&window];
/* Fill array indicating position with value zero, previous value greater than zero */
do i = 1 to dim(initial) - 1;
if initial(i) gt 0 and initial(i+1) eq 0 then
position(i) = i + 1;
end;
/* Fill array indicating the count of values greater than zero until the index in the position array*/
%let j = 1;
%do %while (&j lt &window);
end_ = coalesce(of position&j - position&window);
if not missing(end_) then do;
gt_0_cnt = 0;
do k = &j to end_ - 1;
gt_0_cnt + ifn(initial(k) > 0,1,0);
end;
cumulative(end_ - 1) = gt_0_cnt;
end;
%let j = %eval(&j + end_);
%end;
%Mend;
DATA HAVE;
INPUT ID $ FM1-FM18;
DATALINES;
A 1 2 0 0 1 0 0 0 0 2 2 2 3 3 4 4 4 0
B 0 0 1 2 3 4 5 1 2 3 4 0 0 0 1 2 0 0
;
RUN;
DATA WANT;
SET HAVE;
%Test(FM: 18);
RUN;
The output I need:
But I have a problem when trying to evaluate this expression:
%let j = %eval(&j + end_)
I get the message: ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was:
1 + end_
I don't know of any other way to get the desired result.
If someone can help me, I will be grateful.
Doesn't seem like you need the macro language for this.
data want;
set have;
array fm fm:;
array cum cum_1-cum_18;
do _i = 1 to dim(fm);
if fm[_i] eq 0 then call missing(cum[_i]);
else do;
do count = 1 by 1 until (fm[_i+count] eq 0 or (count+_i eq dim(fm)));
end;
put _i= count=;
cum[_i+count-1] = count;
_i = _i + count - 1;
end;
end;
run;
You can of course supply the array size (18 here) or the variable names through a macro parameter, but everything you're doing is perfectly doable in the DATA step language, with at most simple macro variable parameters.
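As a small illustration of the simple macro variable route, the array size can come from a %LET while all the looping stays in the DATA step. This is a sketch only: the macro variable window, the output dataset counts, and the variable n_positive are hypothetical names, and it just counts the positive values in each row rather than reproducing the full run-length logic above.

```sas
%let window = 18;  /* hypothetical macro variable holding the array size */

data counts;
  set have;
  array fm[&window] fm1-fm&window;
  n_positive = 0;
  do _i = 1 to dim(fm);
    /* a true comparison evaluates to 1, a false one to 0 */
    n_positive = n_positive + (fm[_i] > 0);
  end;
  drop _i;
run;
```

The macro variable is substituted once, when the step is compiled; nothing in the loop itself needs the macro language.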
So I have datasets with variables and values like so:
A1 A2 A3 A4 A5 A6
1 3 5 6 10 2
The variables can go up to A2000 in certain cases. I want to perform the same operation on each variable using an array. Is there a way to dynamically set the size of the array without manually typing it?
Example code of what I am striving for is below
data A;
input A1-A6;
datalines;
1 3 5 6 10 2
;
run;
data A;
set A;
array a[*] a1-a&size;
do i=1 to &size;
{perform some operation here}
end;
run;
My question is: how can I write code to get the parameter &size that represents the size of the array? In this example, &size = 6.
Sure, use the : wildcard. This only works if a1-a6 (or however many there are) already exist in the dataset, though.
data have;
input a1-a6;
datalines;
1 2 3 4 5 6
7 8 9 10 11 12
;;;;
run;
data want;
set have;
array a a:;
do i=1 to dim(a);
sum = sum(sum ,a[i]);
end;
run;
Otherwise, what you wrote above would work. You don't need the [*] bit, though, and I prefer dim(a) to &size as the loop bound in case you change how the array is defined in the future. Of course, you need a way to determine &size, which will depend on your data.
%let size=6;
data want;
set have;
array a a1-a&size.;
do i=1 to dim(a);
sum = sum(sum ,a[i]);
end;
run;
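One possible way to derive &size from the data itself is to count the matching columns in DICTIONARY.COLUMNS. This is only a sketch: it assumes the dataset is WORK.HAVE and that every variable whose name starts with "A" belongs to the array, which may not hold for your data. The TRIMMED keyword needs SAS 9.3 or later.

```sas
proc sql noprint;
  /* Count the columns of WORK.HAVE whose names start with A;
     libname and memname are stored in upper case */
  select count(*) into :size trimmed
  from dictionary.columns
  where libname = 'WORK'
    and memname = 'HAVE'
    and upcase(name) like 'A%';
quit;

%put NOTE: size=&size;
```

Run this before the DATA step that declares the array, so that &size already holds a value when a1-a&size. is resolved.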
I have several variables in the data set survey. I want to write a loop that passes each variable to a SAS macro.
The code is below.
%let var= r1 r2 r3 ;
DATA survey;
INPUT id sex $ age inc r1 r2 r3 ;
DATALINES;
1 F 35 17 7 2 2
17 M 50 14 5 5 3
33 F 45 6 7 2 7
49 M 24 14 7 5 7
65 F 52 9 4 7 7
81 M 44 11 7 7 7
2 F 34 17 6 5 3
18 M 40 14 7 5 2
34 F 47 6 6 5 6
50 M 35 17 5 7 5
;
%MACRO bvars(input);
proc univariate data = "D:\hsb2" plots;
var &input.;
run;
%MEND bvars;
I just want each variable in &var to be passed to the macro bvars one at a time, instead of writing the following:
%bvars(r1)
%bvars(r2)
%bvars(r3)
.....
This is time-consuming when there are more than 100 variables.
This will run proc univariate for all the variables in survey whose names start with "r" (so r1, r2, etc.). Procedures with a var statement usually accept multiple variables.
proc univariate data = survey;
var r:;
run;
If you wish to run it for all numeric variables, replace r: with _NUMERIC_.
If you want to loop through the variables and call a macro separately each time, there are several approaches. Usually they involve a macro %do loop (which must be inside a macro), like so:
%macro looper(inData);
/* List all the variable names */
proc contents data = &inData. out = _colNames noprint;
run;
proc sql noprint;
select name
/* Put the variable names in a macro variable list */
into :colNames separated by " "
from _colNames
/* Get only numeric variables */
where type = 1
order by varnum;
quit;
/* Loop through the variable names */
%do i = 1 %to %sysfunc(countw(&colNames.));
%let colName = %scan(&colNAmes., &i.);
%put &colName.;
/* Your macro call or code here */
/* %bvars(&inData., &colName.) */
%end;
%mend looper;
%looper(sashelp.cars);
It might prove useful for you to become familiar with macro %do loops, proc contents (or better yet proc datasets), the %scan() function, and the different ways to assign macro variables. The SAS documentation online is a great place to start.
Updated answer.
You can use the SASHELP.VCOLUMN view, which SAS maintains automatically. It contains one row for each variable of each dataset in every assigned library, including the Work library.
I am assuming your survey dataset is in the Work library.
The code does the following:
1. Looks up your dataset in VCOLUMN, keeps only the variable name (that's all we need), and stores it in the dataset temp.
2. For every variable, runs the bvars macro via call execute.
data temp(keep=name);
set Sashelp.Vcolumn;
where libname = 'WORK' and memname = 'SURVEY';
run;
*Call macro using call execute;
data _null_;
set temp;
/* Single quotes stop the macro call from being resolved while this step compiles */
call execute('%bvars(' || strip(name) || ')');
run;
I am trying to find a way to calculate a moving average using SAS do loops. I am having difficulty. I essentially want to calculate a 4 unit moving average.
DATA data;
INPUT a b;
CARDS;
1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
17 18
;
run;
data test(drop = i);
set data;
retain c 0;
do i = 1 to _n_-4;
c = (c+a)/4;
end;
run;
proc print data = test;
run;
One option is to use the merge-ahead:
DATA have;
INPUT a b;
CARDS;
1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
17 18
;
run;
data want;
/* keep=a stops the look-ahead copies from overwriting b */
merge have
have(firstobs=2 keep=a rename=(a=a_1))
have(firstobs=3 keep=a rename=(a=a_2))
have(firstobs=4 keep=a rename=(a=a_3));
c = mean(of a:);
run;
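This merges the dataset with itself three times, each copy starting one observation later: the second copy starts at observation 2, the third at observation 3, and so on. That puts four consecutive values of a on one line, so c is the mean of the current value and the next three. For the last three observations fewer look-ahead values exist, so c there averages fewer than four values.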
SAS has a lag() function, which returns the value its argument had on an earlier call (normally the previous observation). So, for example, if your data looked like this:
DATA data;
INPUT a ;
CARDS;
1
2
3
4
5
;
Then the following would create lag-one, lag-two, and lag-three variables:
data data2;
set data;
a_1=lag(a);
a_2=lag2(a);
a_3=lag3(a);
run;
would create the following dataset
a a_1 a_2 a_3
1 . . .
2 1 . .
3 2 1 .
4 3 2 1
etc.
Moving averages can be easily calculated from these.
Check out http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212547.htm
(Please note, I did not get a chance to run the code, so it may have errors.)
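Following up on the lag variables above, here is a sketch of that last step: averaging the current value with its three lags once four observations are available. It assumes a dataset have with a numeric variable a; the names want and c are only illustrative. The lag calls sit outside the IF so that they run on every observation, which keeps their queues in step.

```sas
data want;
  set have;
  /* Call the lag functions unconditionally so each queue is updated on every row */
  a_1 = lag(a);
  a_2 = lag2(a);
  a_3 = lag3(a);
  /* A full 4-value window exists only from the 4th observation onward */
  if _n_ >= 4 then c = mean(a, a_1, a_2, a_3);
  drop a_1 a_2 a_3;
run;
```

This gives a trailing average (the current row and the three before it), with c left missing for the first three observations.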
Straight from Cody's Collection of Popular SAS Programming Tasks and How to Tackle Them.
*Presenting a macro to compute a moving average;
%macro Moving_ave(In_dsn=, /*Input data set name */
Out_dsn=, /*Output data set name */
Var=, /*Variable on which to compute
the average */
Moving=, /* Variable for moving average */
n= /* Number of observations on which
to compute the average */);
data &Out_dsn;
set &In_dsn;
***compute the lags;
_x1 = &Var;
%do i = 1 %to &n - 1;
%let Num = %eval(&i + 1);
_x&Num = lag&i(&Var);
%end;
***if the observation number is greater than or equal to the
number of values needed for the moving average, output;
if _n_ ge &n then do;
&Moving = mean (of _x1 - _x&n);
output;
end;
drop _x:;
run;
%mend Moving_ave;
*Testing the macro;
%moving_Ave(In_dsn=data,
Out_dsn=test,
Var=a,
Moving=Average,
n=4)
I just want to create a new dummy variable whenever a certain value is present.
Here is an example of my original data.
ID A1 A2... A10
1 10 1 5
2 20 8 4
...
...
I would like to add a dummy variable when a certain value appears in those attributes.
For example, if the subject with ID 1 has the value 10, a new variable Add10 would be 1.
ID A1 A2.. A10 Add1..Add4 Add5...Add20
1 10 1.. 5 1 ...0 1 ... 0
2 20 8.. 4 0 ...1 0 ... 1
...
Here is my code.
%MACRO DO_LIST;
%DO I=1 %TO 20;
data aaaa;
set aa33;
if A1 =i or
A2 =i or
A3 =i or
...
A10 =i then Add&I=I ;
RUN;
%END;
%MEND DO_LIST;
%DO_LIST;
However, my result has only Add20, which is the last variable.
I think I made a mistake in the loop statement. Would you mind helping me out?
Thanks in advance.
Right now you're always using the same data set as the input to aaaa and you're not changing this dataset with each loop. Thus, you'll always get Add20 only as this is what the last iteration of the loop will do.
A simple fix to this would be:
data append;
set aa33;
run;
%MACRO DO_LIST;
%DO I=1 %TO 20;
data append;
set append;
/* &I is the macro loop counter; a bare i would be an uninitialized DATA step variable */
if A1 =&I or
A2 =&I or
A3 =&I or
.....
A10 =&I then Add&I=1 ;
RUN;
%END;
%MEND DO_LIST;
%DO_LIST;
You essentially want to add a column to your dataset each time the loop runs, as opposed to replacing it entirely with the original dataset (aa33) plus the results of only the current iteration.
If you know the maximum value is 20, the following should work without a macro:
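data test;
set aa33;
array add[20] add1-add20;
/* List a1-a10 explicitly: the a: wildcard would also match add1-add20 */
array a[*] a1-a10;
/* Start every dummy at 0 */
do i = 1 to dim(add);
add[i] = 0;
end;
do i = 1 to dim(a);
value = a[i];
/* Skip missing or out-of-range values so the subscript stays valid */
if 1 <= value <= dim(add) then add[value] = 1;
end;
drop i value;
run;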
I think that's what you're looking for. It would help if you filled in at least the first two full rows of your example.