Dynamically set size of array without hardcoding - arrays

So I have datasets with variables and values like so:
A1 A2 A3 A4 A5 A6
1 3 5 6 10 2
The variables can go up to A2000 in certain cases. I want to perform the same operation on each variable using an array. Is there a way to dynamically set the size of the array without manually typing it?
Example code of what I am striving for is below
data A;
input A1-A6;
datalines;
1 3 5 6 10 2
;
run;
data A;
set A;
array a[*] a1-a&size;
do i=1 to &size;
/* perform some operation here */
end;
run;
My question is: how can I write code to derive the macro variable &size, which represents the size of the array? In this example, &size = 6.

Sure, use the : (colon) wildcard. This only works if a1-a6 (or however many there are) already exist in the dataset, though, and a: will also match any other variable whose name starts with a.
data have;
input a1-a6;
datalines;
1 2 3 4 5 6
7 8 9 10 11 12
;;;;
run;
data want;
set have;
array a a:;
do i=1 to dim(a);
sum = sum(sum ,a[i]);
end;
run;
Otherwise, what you put above would absolutely work. You don't need the [*] bit, though, and I prefer to keep the dim instead of &size on the loop control in case you change the way this works in the future. Of course you need to have a way to determine &size which will depend on your data.
%let size=6;
data want;
set have;
array a a1-a&size.;
do i=1 to dim(a);
sum = sum(sum ,a[i]);
end;
run;
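One way to determine &size from the data itself is to count the matching columns in the DICTIONARY.COLUMNS table. This is only a sketch: it assumes the dataset is WORK.HAVE and that the variables of interest are named a followed by digits. Note that if nothing matches, &size is 0 and the range a1-a0 would be invalid.

```sas
/* Count the variables in WORK.HAVE named a1, a2, ... (assumed naming scheme) */
proc sql noprint;
  select count(*) into :size trimmed
  from dictionary.columns
  where libname = 'WORK'
    and memname = 'HAVE'
    and prxmatch('/^a\d+$/i', strip(name)) > 0;
quit;

%put NOTE: size=&size;

data want;
  set have;
  array a a1-a&size.;
  do i = 1 to dim(a);
    sum = sum(sum, a[i]);
  end;
run;
```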

Related

merge or join unequal data and duplicate value of one of them in sas

I am trying to merge 2 datasets (df1, df2) in SAS. One of them, df2, has only 1 observation, and I want its value repeated on every row of df1.
I am aware that I can add that value manually, but I want an automated way, as this is just one step in a long program that runs on big data.
Here is a reproducible example and datasets:
data df1;
input a b c;
datalines;
1 2 3
6 7 8
5 6 9
;
run;
data df2;
input d ;
datalines;
4
;
run;
data df3;
merge df1 df2;
run;
/*I need the resulting df3 to be */;
a b c d
1 2 3 4
6 7 8 4
5 6 9 4
Any help will be greatly appreciated.
Then you don't want to MERGE the datasets, since there are no common variables that the merge could actually use.
Instead just SET both datasets, but take care not to read past the end of the single-observation dataset.
data want;
set long_dataset;
if _n_=1 then set short_dataset;
run;
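Applied to the df1 and df2 from the question, the same pattern might look like the sketch below. It relies on the fact that variables read with SET are retained across iterations of the data step, so d keeps the value read on the first iteration.

```sas
data df3;
  set df1;                     /* one iteration per row of df1 */
  if _n_ = 1 then set df2;     /* read the single row of df2 once; d is retained afterwards */
run;

proc print data=df3;
run;
```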

How to efficiently create arrays with runs of equal elements in SAS

Case One
Sample output:
`1,1,1,1,2,2,2,2......9,9,9,9,0,0,0,0`
A more general case would be:
The array starts with n1 elements valued x1, then followed by n2 elements valued x2...
In the sample output, n1 = n2 = n3 = .. = 4, x1=1, x2=2 ...
But I don't want to create it based on the element's position in array using if-else statement.
Here's what I have done:
%let nd = 80;
data _t(drop = i);
array a{&nd};
do i = 1 to &nd;
if i le 4 then a[i] = 1;
else ....;
end;
'other codes'
run;
Case Two
What if the order in the array doesn't matter as long as it contains all the elements I need (n1 x1, n2 x2 ...) ? In this scenario, is it easier to build up the array?
Provided that you know in advance an upper bound for how many elements you want to create, you can do this in the array statement - e.g.
data _null_;
array t{10} (1*1 2*2 3*3 4*4);
put t{*};
run;
Output:
1 2 2 3 3 3 4 4 4 4
N.B. This sort of assignment implicitly causes your array variables to be retained.
You can also nest brackets when creating runs of elements, e.g.
data _null_;
array t{10} (2*(1 2 3 4 5));
put t{*};
run;
Output:
1 2 3 4 5 1 2 3 4 5
However, consecutive repetition factors (the * signs) need to be separated by brackets, e.g.
data _null_;
array t{10} (2*(1 2) 2*(3*3));
put t{*};
run;
Output:
1 2 1 2 3 3 3 3 3 3
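If the run lengths and values are only available as data rather than as literals in the ARRAY statement, nested DO loops can fill the array without any position-based if-else logic. A minimal sketch, in which the arrays n and x and their values are purely illustrative:

```sas
data _null_;
  array n{3} _temporary_ (4 4 4);   /* run lengths n1, n2, n3 (illustrative) */
  array x{3} _temporary_ (1 2 3);   /* run values x1, x2, x3 (illustrative) */
  array t{12};                      /* 12 = sum of the run lengths */
  pos = 0;
  do j = 1 to dim(n);
    do k = 1 to n{j};
      pos = pos + 1;                /* next free slot in t */
      t{pos} = x{j};
    end;
  end;
  put t{*};
run;
```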

Loop each variable into SAS macro

I have several variables in the data set survey. I want to write a loop that passes each variable to a SAS macro.
The code is below.
%let var= r1 r2 r3 ;
DATA survey;
INPUT id sex $ age inc r1 r2 r3 ;
DATALINES;
1 F 35 17 7 2 2
17 M 50 14 5 5 3
33 F 45 6 7 2 7
49 M 24 14 7 5 7
65 F 52 9 4 7 7
81 M 44 11 7 7 7
2 F 34 17 6 5 3
18 M 40 14 7 5 2
34 F 47 6 6 5 6
50 M 35 17 5 7 5
;
%MACRO bvars(input);
proc univariate data = "D:\hsb2" plots;
var &input.;
run;
%MEND bvars;
I want each variable listed in &var to be passed to the macro bvars one at a time, instead of writing the following.
%bvars(r1)
%bvars(r2)
%bvars(r3)
.....
This is time-consuming when there are more than 100 variables.
This will run proc univariate for all the variables in survey which start with "r" (so r1, r2, etc.). Procedures with a var statement usually accept multiple variables.
proc univariate data = survey;
var r:;
run;
If you wish to run it for all numeric variables, replace r: with _NUMERIC_.
If you want to loop through the variables and call a macro separately each time, there are several approaches. Usually they involve a macro %do loop (which must be inside a macro) like so:
%macro looper(inData);
/* List all the variable names */
proc contents data = &inData. out = _colNames noprint;
run;
proc sql noprint;
select name
/* Put the variable names in a macro variable list */
into :colNames separated by " "
from _colNames
/* Get only numeric variables */
where type = 1
order by varnum;
quit;
/* Loop through the variable names */
%do i = 1 %to %sysfunc(countw(&colNames.));
%let colName = %scan(&colNames., &i.);
%put &colName.;
/* Your macro call or code here */
/* %bvars(&inData., &colName.) */
%end;
%mend looper;
%looper(sashelp.cars);
It might prove useful for you to become familiar with macro %do loops, proc contents (or better yet proc datasets), the %scan() function and the different ways to assign macro variables. The SAS documentation online is a great place to start.
Updated answer.
You can use the SASHELP.VCOLUMN view, which SAS maintains automatically. It contains one row for each variable of each dataset in every assigned library, including the Work library.
I am assuming your survey dataset is in the Work library.
The code below does the following:
1. Looks up your dataset in the VCOLUMN view, keeps only the variable names (that's all we need) and stores them in the dataset temp.
2. Runs the bvars macro for every variable via the call execute routine.
data temp(keep=name);
set Sashelp.Vcolumn;
where libname = 'WORK' and memname = 'SURVEY';
run;
*Call macro using call execute;
data _null_;
set temp;
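/* Single quotes keep the macro processor from resolving %bvars while this step compiles */
call execute('%bvars(' || strip(name) || ');');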
run;
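Since proc univariate only analyses numeric variables, it may be worth restricting the list to numeric columns. The sketch below assumes the bvars macro from the question is already defined, and wraps the macro name in %nrstr so that the call is queued as text and only runs after this data step has finished.

```sas
data _null_;
  set sashelp.vcolumn;
  /* type is 'num' or 'char' in this view */
  where libname = 'WORK' and memname = 'SURVEY' and type = 'num';
  /* cats() strips the trailing blanks from name */
  call execute(cats('%nrstr(%bvars)(', name, ')'));
run;
```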

Calculating moving average using do loop in SAS

I am trying to find a way to calculate a moving average using SAS do loops. I am having difficulty. I essentially want to calculate a 4 unit moving average.
DATA data;
INPUT a b;
CARDS;
1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
17 18
;
run;
data test(drop = i);
set data;
retain c 0;
do i = 1 to _n_-4;
c = (c+a)/4;
end;
run;
proc print data = test;
run;
One option is to use the merge-ahead:
DATA have;
INPUT a b;
CARDS;
1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
17 18
;
run;
data want;
/* keep=a stops the look-ahead copies from overwriting b */
merge have
have(firstobs=2 keep=a rename=(a=a_1))
have(firstobs=3 keep=a rename=(a=a_2))
have(firstobs=4 keep=a rename=(a=a_3));
c = mean(of a:);
run;
Merge the data to itself, each time the merged dataset advancing one - so the 2nd starts with 2, third starts with 3, etc. That gives you all 4 'a' on one line.
SAS has a lag() function, which returns the value its argument had on an earlier observation. So, for example, if your data looked like this:
DATA data;
INPUT a ;
CARDS;
1
2
3
4
5
;
Then the following would create a lag one, two, three etc variable;
data data2;
set data;
a_1=lag(a);
a_2=lag2(a);
a_3=lag3(a);
run;
would create the following dataset
a a_1 a_2 a_3
1 . . .
2 1 . .
3 2 1 .
4 3 2 1
etc.
Moving averages can be easily calculated from these.
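For instance, a 4-observation moving average could be computed from the lagged values roughly as follows. This is a sketch that assumes the input dataset is the one named data above; the lag functions are called on every observation, because calling them conditionally would change which values they return.

```sas
data data3;
  set data;
  /* call the lag functions unconditionally on every row */
  a_1 = lag(a);
  a_2 = lag2(a);
  a_3 = lag3(a);
  /* only rows 4 and later have a full window of 4 values */
  if _n_ >= 4 then c = mean(a, a_1, a_2, a_3);
run;
```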
Check out http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212547.htm
(Please note, I did not get a chance to run the code, so it may contain errors.)
Straight from Cody's Collection of Popular Programming Tasks and How to Tackle Them:
*Presenting a macro to compute a moving average;
%macro Moving_ave(In_dsn=, /*Input data set name */
Out_dsn=, /*Output data set name */
Var=, /*Variable on which to compute
the average */
Moving=, /* Variable for moving average */
n= /* Number of observations on which
to compute the average */);
data &Out_dsn;
set &In_dsn;
***compute the lags;
_x1 = &Var;
%do i = 1 %to &n - 1;
%let Num = %eval(&i + 1);
_x&Num = lag&i(&Var);
%end;
***if the observation number is greater than or equal to the
number of values needed for the moving average, output;
if _n_ ge &n then do;
&Moving = mean (of _x1 - _x&n);
output;
end;
drop _x:;
run;
%mend Moving_ave;
*Testing the macro;
%moving_Ave(In_dsn=data,
Out_dsn=test,
Var=a,
Moving=Average,
n=4)

SAS macro loop and dummy variable

I just want to create a new dummy variable whenever a certain value is present.
Here is an example of my original data.
ID A1 A2... A10
1 10 1 5
2 20 8 4
...
...
I would like to add a dummy variable for each value that appears in those attributes.
For example, subject ID 1 has the value 10, so a new variable Add10 would be 1.
ID A1 A2.. A10 Add1..Add4 Add5...Add20
1 10 1.. 5 1 ...0 1 ... 0
2 20 8.. 4 0 ...1 0 ... 1
...
Here is my code..
%MACRO DO_LIST;
%DO I=1 %TO 20;
data aaaa;
set aa33;
if A1 =i or
A2 =i or
A3 =i or
...
A10 =i then Add&I=I ;
RUN;
%END;
%MEND DO_LIST;
%DO_LIST;
However, my result has only Add20, which is the last variable.
I think I made a mistake in the loop statement. Would you mind helping me out?
Thanks in advance.
Right now you're always using the same data set as the input to aaaa and you're not changing this dataset with each loop. Thus, you'll always get Add20 only as this is what the last iteration of the loop will do.
A simple fix to this would be:
data append;
set aa33;
run;
%MACRO DO_LIST;
%DO I=1 %TO 20;
data append;
set append;
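/* compare against the macro variable &I; a bare i would be an uninitialized data step variable */
if A1 = &I or
A2 = &I or
A3 = &I or
.....
A10 = &I then Add&I = 1;
else Add&I = 0;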
RUN;
%END;
%MEND DO_LIST;
%DO_LIST;
You essentially want to add a column to your dataset each time the loop runs, as opposed to rebuilding it from the original dataset (aa33) with the results of only the current iteration.
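If you know the maximum value is 20, the following should work without a macro:
data test;
set aa33;
array add[20] add1-add20;
/* list a1-a10 explicitly: the a: wildcard would also match add1-add20 */
array a[*] a1-a10;
/* start every dummy at 0 */
do i = 1 to dim(add);
add[i] = 0;
end;
/* flag each value found, skipping missing or out-of-range values */
do i = 1 to dim(a);
if 1 <= a[i] <= dim(add) then add[a[i]] = 1;
end;
drop i;
run;
I think that's what you're looking for. It would help if you filled in at least the first two full rows of your example.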
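An alternative without a macro is to test each candidate value directly with the WHICHN function, which returns the position of the first matching argument, or 0 if there is none. This is a sketch only; it assumes the attributes are named A1-A10 as in the question and that the values of interest run from 1 to 20.

```sas
data test2;
  set aa33;
  array a[*] a1-a10;
  array add[20] add1-add20;
  do i = 1 to dim(add);
    /* the comparison evaluates to 1 if i occurs among a1-a10, otherwise 0 */
    add[i] = (whichn(i, of a[*]) > 0);
  end;
  drop i;
run;
```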

Resources