Retain the value of a variable within a group in SAS - arrays

I want to create a variable Var2 that equals 1 from the first observation where Var1 equals 1 through the end of the BY group defined by ID.
Here is the minimal working example:
ID Year Var1
1 1 .
1 2 0
1 3 .
1 4 1
1 5 .
And I want to create the following output:
ID Year Var1 Var2
1 1 . .
1 2 0 0
1 3 . 0
1 4 1 1
1 5 . 1
My current code is as follows:
DATA data1;
SET data0;
BY ID YEAR ;
IF LAST.ID THEN END = _N_;
IF Var1 > 0 THEN CNT=_N_;
RUN;
DATA data2;
SET data1;
BY ID YEAR ;
Var2 = 0;
IF Var1 = 1 THEN DO;
DO I = CNT TO END;
Var[I] = 1;
END;
END;
RUN;
However, SAS does not loop along observations.

I'm not sure what your example is doing, but this is fairly straightforward.
data want;
set have;
by id;
retain var2;
if first.id then var2=0;
if var1=1 then var2=1;
run;
Retain var2 so that it keeps its value across observations, reset it to 0 on the first row of each ID (first.id), and set it to 1 as soon as var1 is 1.
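The code above gives Var2 = 0 on rows that come before the first nonmissing Var1, whereas the desired output in the question shows a missing value there. A minimal sketch of a variant that reproduces the posted output, assuming Var1 only takes the values 0, 1, or missing and that the data are sorted by ID (the dataset names have and want follow the answer above):

```sas
/* Sample data copied from the question */
data have;
  input ID Year Var1;
  datalines;
1 1 .
1 2 0
1 3 .
1 4 1
1 5 .
;
run;

data want;
  set have;
  by id;
  retain var2;
  /* start each ID with a missing flag instead of 0 */
  if first.id then var2 = .;
  /* once a 1 is seen, the flag stays 1 for the rest of the group */
  if var1 = 1 then var2 = 1;
  /* a 0 only counts while no 1 has been seen yet */
  else if var1 = 0 and var2 ne 1 then var2 = 0;
run;
```

Missing Var1 rows fall through both conditions, so they simply inherit the retained value.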

Related

Creating Dataset with random Values in SAS

I want to create a random dataset. Something like this-
ptno visits sex race
1 1 1 0
1 2 1 0
1 3 1 0
2 1 2 1
2 2 2 1
2 3 2 1
3 1 1 0
3 2 1 0
3 3 1 0
The values should be randomly generated. I want to know if I can do this dynamically using do loops. Thanks in advance for helping.
data want ;
length ptno visits sex race 8. ;
do ptno = 1 to 100 ;
_visits = ceil(ranuni(0)*5) ; /* between 1 & 5 */
sex = ceil(ranuni(0)*2) ; /* between 1 & 2 */
race = floor(ranuni(0)*2) ; /* between 0 & 1 */
do visits = 1 to _visits ;
output ;
end ;
end ;
drop _visits ;
run ;
CALL RANUNI returns a random variate from the uniform distribution on (0,1); comparing it with 0.5 turns it into a two-valued variable. Here the seed is rebuilt from ptno (i) on every iteration, so every row of the same ptno gets the same sex and race.
data want;
do i=100 to 110;
do j=1 to 5;
seed1=i+4567;
call ranuni(seed1,x);
seed2=i+1234;
call ranuni(seed2,y);
ptno=i;
visit=j;
sex=(x>0.5)+1;
race=(y<0.5);
output;
end;
end;
keep ptno--race;
run;
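The same idea can also be sketched with the RAND function and CALL STREAMINIT, which are available in current SAS releases. The seed value below is an arbitrary assumption, and the 50/50 probabilities mirror the 0.5 cutoff used above rather than anything stated in the question:

```sas
data want;
  /* arbitrary seed (assumption) so that reruns give the same data */
  call streaminit(12345);
  do ptno = 1 to 100;
    /* 1 to 5 visits per patient */
    _visits = ceil(rand('uniform') * 5);
    /* sex is 1 or 2, race is 0 or 1; drawn once per patient */
    sex  = rand('bernoulli', 0.5) + 1;
    race = rand('bernoulli', 0.5);
    do visits = 1 to _visits;
      output;
    end;
  end;
  drop _visits;
run;
```

Because sex and race are drawn outside the inner loop, they stay constant across a patient's visits without needing a per-patient seed.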

How to get the count of values greater than zero from a subset of an array in SAS

I want to get a data set with an array that saves the count of values greater than zero in a subset of an array.
My code:
%Macro Test(input_array, window);
array initial{*} &input_array;
array position[&window];
array cumulative[&window];
/* Fill array indicating position with value zero, previous value greater than zero */
do i = 1 to dim(initial) - 1;
if initial(i) gt 0 and initial(i+1) eq 0 then
position(i) = i + 1;
end;
/* Fill array indicating the count of values greater than zero until the index in the position array*/
%let j = 1;
%do %while (&j lt &window);
end_ = coalesce(of position&j - position&window);
if not missing(end_) then do;
gt_0_cnt = 0;
do k = &j to end_ - 1;
gt_0_cnt + ifn(initial(k) > 0,1,0);
end;
cumulative(end_ - 1) = gt_0_cnt;
end;
%let j = %eval(&j + end_);
%end;
%Mend;
DATA HAVE;
INPUT ID $ FM1-FM18;
DATALINES;
A 1 2 0 0 1 0 0 0 0 2 2 2 3 3 4 4 4 0
B 0 0 1 2 3 4 5 1 2 3 4 0 0 0 1 2 0 0
;
RUN;
DATA WANT;
SET HAVE;
%Test(FM:, 18);
RUN;
The output I need:
But I have a problem when trying to evaluate this expression
%let j = %eval(&j + end_)
I get the message ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was:
1 + end_
I don't know of any other way to get the desired result.
If someone can help me I will be grateful.
Doesn't seem like you need the macro language for this.
data want;
set have;
array fm fm:;
array cum cum_1-cum_18;
do _i = 1 to dim(fm);
if fm[_i] eq 0 then call missing(cum[_i]);
else do;
do count = 1 by 1 until (fm[_i+count] eq 0 or (count+_i eq dim(fm)));
end;
put _i= count=;
cum[_i+count-1] = count;
_i = _i + count - 1;
end;
end;
run;
Obviously you can specify the 18 max on the cum array through a macro parameter, or what the variable names are, but all of the stuff you're doing is perfectly doable through the data step language or simple macro variable parameters.
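The loop above looks one element ahead, so a run of positive values that reaches the last array element needs separate care. A sketch of a single-pass variant under the assumption that a "run" means consecutive values greater than zero and that the count should be stored at the run's last position (run_len is a hypothetical helper variable, not part of the original code):

```sas
/* Sample data copied from the question; ID is read as character */
data have;
  input ID $ FM1-FM18;
  datalines;
A 1 2 0 0 1 0 0 0 0 2 2 2 3 3 4 4 4 0
B 0 0 1 2 3 4 5 1 2 3 4 0 0 0 1 2 0 0
;
run;

data want;
  set have;
  array fm fm:;
  array cum cum_1-cum_18;
  run_len = 0;
  do _i = 1 to dim(fm);
    /* extend the current run, or reset it on zero or missing */
    if fm[_i] > 0 then run_len = run_len + 1;
    else run_len = 0;
    if run_len > 0 then do;
      /* the run ends at the last element ... */
      if _i = dim(fm) then cum[_i] = run_len;
      /* ... or just before a value that is not positive */
      else if not (fm[_i + 1] > 0) then cum[_i] = run_len;
    end;
  end;
  drop _i run_len;
run;
```

The look-ahead to fm[_i + 1] is only evaluated when _i is below dim(fm), so the array subscript never goes out of range.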

Counting a value within a specific range in array in SAS

So my dataset looks like this:
ABC1 ABC2 ABC3 ABC4 ABC5 DEF1 DEF2 DEF3 DEF4 DEF5
1 0 0 1 . 0 1 1 0 .
I want my output to be:
XYZ1 XYZ2 XYZ3 XYZ4 XYZ5
0 1 1 0 .
Basically, if DEF2 = 1 and the count of 1s among ABC3, ABC4, and ABC5 is > 0, then XYZ2 is 1.
I have tried the following code, but it doesn't work:
data want;
set have;
array ABC ABC:;
array DEF DEF:;
array XYZ [5] $1;
do i = 1 to dim(ABC)-5;
if ABC(i) = . then XYZ(i) = '';
else if (DEF(i) = 1 and sum(ABC(i+1), ABC(i+3)) > 0) then XYZ(i) = 1;
else XYZ(i) = 0;
end;
drop i;
run;
Let's pivot things for a better understanding:
index  ABC  DEF  XYZ (wanted)
-----  ---  ---  ---
1      1    0    0   (because DEF=0)
2      0    1    1   (sum ABC index 2..5 because DEF=1 at index 2)
3      0    1    1   (sum ABC index 3..5 because DEF=1 at index 3)
4      1    0    0   (because DEF=0)
5      .    .    .   (because DEF=.)
Now apply that understanding to processing variables of the row when arrayed. The items will be processed from 5 to 1, so a running_sum can be computed and applied when necessary.
data want;
set have;
array abc abc:;
array def def:;
array xyz(5);
running_sum = .;
do index = dim(abc) to 1 by -1;
if not missing(abc(index)) then running_sum + abc(index);
if def(index) in (., 0)
then xyz(index) = def(index);
else xyz(index) = running_sum;
end;
run;
Not all processing rules are stated in the question, such as the case abc(j) = . and abc(k) ne . and k > j. Such a case may never happen.
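The code above stores the running sum itself in xyz, which can exceed 1 when several ABC values are 1. If XYZ is meant to be a 0/1 flag, as the wording of the question suggests, one possible sketch is the following. It assumes DEF only takes the values 0, 1, or missing:

```sas
/* Sample row copied from the question */
data have;
  input ABC1-ABC5 DEF1-DEF5;
  datalines;
1 0 0 1 . 0 1 1 0 .
;
run;

data want;
  set have;
  array abc abc:;
  array def def:;
  array xyz(5);
  running_sum = .;
  /* walk right to left so running_sum covers abc(index)..abc(5) */
  do index = dim(abc) to 1 by -1;
    if not missing(abc(index)) then running_sum = sum(running_sum, abc(index));
    /* DEF = 1: flag whether any 1 was counted; otherwise copy DEF (0 or missing) */
    if def(index) = 1 then xyz(index) = (running_sum > 0);
    else xyz(index) = def(index);
  end;
  drop index running_sum;
run;
```

The comparison (running_sum > 0) evaluates to 1 or 0, and a missing running_sum compares as not greater than zero, so the flag is 0 in that case.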

SAS: reshape data (stacking)

How do I reshape this data in SAS?
id q1a q2a q1b q2b q1c q2c
1 3 0 1 1 1 9
2 4 9 1 2 2 0
3 5 9 1 2 4 0
into this:
id q1 q2 type
1 3 0 a
1 1 1 b
1 1 9 c
.............
The simplest way is to transpose the data, split out the last letter from the q variables, then re-transpose.
data have;
input id q1a q2a q1b q2b q1c q2c;
datalines;
1 3 0 1 1 1 9
2 4 9 1 2 2 0
3 5 9 1 2 4 0
;
run;
proc transpose data=have out=temp1;
by id;
run;
data temp2;
set temp1;
length type $1;
type=substr(_NAME_,3);
_NAME_=substr(_NAME_,1,2);
run;
proc transpose data=temp2 out=want (drop=_:) ;
by id type;
id _NAME_;
var COL1;
run;
Seems simpler to me to just do it in one bit.
data want;
set have;
array qs q:;
do _t = 1 to dim(qs) by 2;
q1=qs[_t];
q2=qs[_t+1];
type = substr(vname(qs[_t]),3,1);
output;
end;
keep id q1 q2 type;
run;
That works for exactly your data; there are some assumptions (the substr and the relationship between q1/q2) that might need to be changed for a real-world example.

SAS ID increment by 2

I have a dataset like this
data ID;
input num;
datalines;
1
1
2
3
4
4
;
I want to create another variable to group them in increment of 2:
num id
1 1
1 1
2 1
3 2
4 2
4 2
Thanks!
If they're truly 1-2-3-4, then you can compute the group directly with CEIL, which maps 1 and 2 to 1, and 3 and 4 to 2:
data want;
set id;
id=ceil(num/2);
run;
If they're not truly 1-2-3-4, then add by num; and if first.num then iter+1; to make an iterator that is 1-2-3-4, and use ceil(iter/2) instead of ceil(num/2).
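The iterator idea for values that are not consecutive can be sketched as follows. The num values here are hypothetical, chosen only to show gaps, and the sketch assumes the data are sorted by num and that every two distinct num values should share one id. CEIL(iter/2) is used because it maps iterator values 1 and 2 to group 1, and 3 and 4 to group 2:

```sas
/* Hypothetical, non-consecutive values, sorted by num */
data id;
  input num;
  datalines;
10
10
25
30
47
47
;
run;

data want;
  set id;
  by num;
  /* iter counts distinct num values: 1, 2, 3, 4, ... */
  if first.num then iter + 1;
  /* every two distinct values form one group */
  id = ceil(iter / 2);
  drop iter;
run;
```

The sum statement iter + 1 retains iter automatically, so no separate RETAIN statement is needed.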
You can use a binary variable that flips between 0 and 1 on every new num, and add it to id each time it flips:
data result;
set id;
by num;
retain incr 0;
if FIRST.num then do;
incr=not incr;
id+incr;
end;
drop incr;
run;
