SAS ID increment by 2 - dataset

I have a dataset like this
data ID;
input num;
datalines;
1
1
2
3
4
4
;
I want to create another variable that groups them in increments of 2:
num id
1 1
1 1
2 1
3 2
4 2
4 2
Thanks!

If they're truly 1-2-3-4, then you can use MOD to round each value up to the next even number and halve it:
data want;
set have;
id=(num+mod(num,2))/2; /* 1,2 give 1 and 3,4 give 2; same result as ceil(num/2) */
run;
If they're not truly 1-2-3-4, then add by num; and if first.num then iter+1; to make an iterator that is 1-2-3-4, and apply the same formula to iter instead of num.

You can use a binary variable that flips between 0 and 1 at every new value of num, and add it to id each time it flips:
data result;
set id;
by num;
retain incr 0;
if FIRST.num then do;
incr=not incr;
id+incr;
end;
drop incr;
run;
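The two answers above can be combined into one step that also works when num is not consecutive. This is only a sketch: the macro variable size is a hypothetical parameter (not in the original question), and the input is assumed to be sorted by num.

```sas
%let size = 2;  /* hypothetical parameter: distinct values per group */

data result;
  set id;                        /* dataset from the question, sorted by num */
  by num;
  if first.num then iter + 1;    /* running count of distinct num values */
  id = ceil(iter / &size);       /* counts 1,2 map to 1; 3,4 map to 2; ... */
  drop iter;
run;
```

For the sample data, iter takes the values 1, 1, 2, 3, 4, 4, so id comes out as 1, 1, 1, 2, 2, 2.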

Related

Creating Dataset with random Values in SAS

I want to create a random dataset. Something like this-
ptno visits sex race
1 1 1 0
1 2 1 0
1 3 1 0
2 1 2 1
2 2 2 1
2 3 2 1
3 1 1 0
3 2 1 0
3 3 1 0
The values should be randomly generated. I want to know if I can do this dynamically using do loops. Thanks in advance for helping.
data want ;
length ptno visits sex race 8. ;
do ptno = 1 to 100 ;
_visits = ceil(ranuni(0)*5) ; /* between 1 & 5 */
sex = ceil(ranuni(0)*2) ; /* between 1 & 2 */
race = floor(ranuni(0)*2) ; /* between 0 & 1 */
do visits = 1 to _visits ;
output ;
end ;
end ;
drop _visits ;
run ;
CALL RANUNI returns a random variate from the uniform distribution on (0,1); comparing it with 0.5 turns it into a 0/1 value. Here the seed is reset from ptno (i) on every row, so the same ptno always gets the same sex and race.
data want;
do i=100 to 110;
do j=1 to 5;
seed1=i+4567;
call ranuni(seed1,x);
seed2=i+1234;
call ranuni(seed2,y);
ptno=i;
visit=j;
sex=(x>0.5)+1;
race=(y<0.5);
output;
end;
end;
keep ptno--race;
run;
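On newer SAS releases the RAND function with CALL STREAMINIT is a common alternative to RANUNI. The following is a minimal sketch under a few assumptions that are not in the original question: a seed of 2024, 100 patients, exactly three visits each, and a 50/50 split for sex and race.

```sas
data want;
  call streaminit(2024);                /* assumed seed, makes the run repeatable */
  do ptno = 1 to 100;
    sex  = rand('bernoulli', 0.5) + 1;  /* 1 or 2, drawn once per patient */
    race = rand('bernoulli', 0.5);      /* 0 or 1, drawn once per patient */
    do visits = 1 to 3;
      output;                           /* one row per visit */
    end;
  end;
run;
```

Because sex and race are drawn outside the inner loop, they stay constant across all visits of the same patient.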

SAS - Find and print first non-zero value from a dataset in columns

I have a data set with ID in rows and months in columns, as the one shown below.
I want to create an auxiliary column that records the first value that is not zero of each line.
ID M1 M2 M3 M4 M5 Auxiliary column
1 0 0 8 8 7 8
2 7 7 7 . . 7
3 0 0 0 0 9 9
4 0 9 9 9 8 9
5 1 1 1 1 1 1
6 0 2 2 1 1 2
Currently I am using this code, but I haven't been able to get the results I am looking for. Any ideas?
data new_ops04;
set new_ops03;
array MONTHS (24) M1-M24;
RETAIN AUXILIARY_COLUMN 0;
do i=1 to 24;
IF MONTHS(i) ne 0 and AUXILIARY_COLUMN = 0 THEN
AUXILIARY_COLUMN = MONTHS(i);
end;
drop i;
run;
Thanks a lot!
You're very close. Just drop the retain statement:
data new_ops04;
set new_ops03;
array MONTHS (24) M1-M24;
AUXILIARY_COLUMN = 0;
do i=1 to 24;
IF MONTHS(i) ne 0 and AUXILIARY_COLUMN = 0 THEN
AUXILIARY_COLUMN = MONTHS(i);
end;
drop i;
run;
You need to consider what happens if the first value(s) in a row are missing: a missing value also satisfies ne 0, so it would be picked up as the "first non-zero" value.
I would handle this use case in PROC SQL. But your problem is that you are not stopping when you reach the first value. So:
flag = 0;
do i=1 to 24 until (flag);
if MONTHS(i) ne 0 then do; /* first non-zero value found */
AUXILIARY_COLUMN = MONTHS(i);
flag = 1; /* ends the loop after this iteration */
end;
end;
drop i flag;
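To also cover the case of leading missing values raised in the comment, the test can skip both zeros and missing values. This is a sketch rather than the original poster's code; leaving AUXILIARY_COLUMN missing when a row has no non-zero value is an assumption on my part.

```sas
data new_ops04;
  set new_ops03;
  array MONTHS (24) M1-M24;
  AUXILIARY_COLUMN = .;                 /* assumed default when nothing qualifies */
  do i = 1 to 24;
    /* skip zeros and missing values alike */
    if not missing(MONTHS(i)) and MONTHS(i) ne 0 then do;
      AUXILIARY_COLUMN = MONTHS(i);
      leave;                            /* stop at the first qualifying month */
    end;
  end;
  drop i;
run;
```

The LEAVE statement exits the DO loop immediately, so no separate flag variable is needed.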

Retain the value of a variable within a group in SAS

I want to create a variable Var2 that becomes 1 at the first observation where Var1 equals 1 and stays 1 until the end of the BY group defined by ID.
Here is the minimal working example:
ID Year Var1
1 1 .
1 2 0
1 3 .
1 4 1
1 5 .
And I want to create the following output:
ID Year Var1 Var2
1 1 . .
1 2 0 0
1 3 . 0
1 4 1 1
1 5 . 1
My current code is as follows:
DATA data1;
SET data0;
BY ID YEAR ;
IF LAST.ID THEN END = _N_;
IF Var1 > 0 THEN CNT=_N_;
RUN;
DATA data2;
SET data1;
BY ID YEAR ;
Var2 = 0;
IF Var1 = 1 THEN DO;
DO I = CNT TO END;
Var[I] = 1;
END;
END;
RUN;
However, SAS does not loop along observations.
I'm not sure what your example is doing, but this is fairly straightforward.
data want;
set have;
by id;
retain var2;
if first.id then var2=0;
if var1=1 then var2=1;
run;
Retain var2 so that it keeps its value across observations, reset it to 0 on each first.id row, and set it to 1 when you see a 1 in var1.
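Note that the desired output in the question shows Var2 as missing until the first non-missing Var1, whereas the step above starts every group at 0. If that detail matters, one possible variant (a sketch, assuming Var1 only takes the values missing, 0 and 1) is:

```sas
data want;
  set have;
  by id;
  retain var2;
  if first.id then var2 = .;                      /* start each group as missing */
  if var1 = 1 then var2 = 1;                      /* switch on at the first 1 */
  else if var1 = 0 and var2 ne 1 then var2 = 0;   /* a 0 only counts before the first 1 */
run;
```

For the sample group this gives Var2 = ., 0, 0, 1, 1 for years 1 to 5.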

SAS: reshape data (stacking)

How do I reshape this data in SAS?
id q1a q2a q1b q2b q1c q2c
1 3 0 1 1 1 9
2 4 9 1 2 2 0
3 5 9 1 2 4 0
into this:
id q1 q2 type
1 3 0 a
1 1 1 b
1 1 9 c
.............
The simplest way is to transpose the data, split out the last letter from the q variables, then re-transpose.
data have;
input id q1a q2a q1b q2b q1c q2c;
datalines;
1 3 0 1 1 1 9
2 4 9 1 2 2 0
3 5 9 1 2 4 0
;
run;
proc transpose data=have out=temp1;
by id;
run;
data temp2;
set temp1;
length type $1;
type=substr(_NAME_,3);
_NAME_=substr(_NAME_,1,2);
run;
proc transpose data=temp2 out=want (drop=_:) ;
by id type;
id _NAME_;
var COL1;
run;
Seems simpler to me to just do it in one bit.
data want;
set have;
array qs q:;
do _t = 1 to dim(qs) by 2;
q1=qs[_t];
q2=qs[_t+1];
type = substr(vname(qs[_t]),3,1);
output;
end;
keep id q1 q2 type;
run;
That works for exactly your data; there are some assumptions (the substr and the relationship between q1/q2) that might need to be changed for a real-world example.

Calculating moving average using do loop in SAS

I am trying to find a way to calculate a moving average using SAS do loops. I am having difficulty. I essentially want to calculate a 4 unit moving average.
DATA data;
INPUT a b;
CARDS;
1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
17 18
;
run;
data test(drop = i);
set data;
retain c 0;
do i = 1 to _n_-4;
c = (c+a)/4;
end;
run;
proc print data = test;
run;
One option is to use the merge-ahead:
DATA have;
INPUT a b;
CARDS;
1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
17 18
;
run;
data want;
merge have have(firstobs=2 keep=a rename=(a=a_1)) have(firstobs=3 keep=a rename=(a=a_2)) have(firstobs=4 keep=a rename=(a=a_3));
c = mean(of a:);
run;
Merge the dataset with itself, with each additional copy starting one observation later (the second copy starts at observation 2, the third at observation 3, and so on). That gives you all four 'a' values on one line.
SAS has a lag() function, which returns the value that the variable it is applied to had on a previous observation. So, for example, if your data looked like this:
DATA data;
INPUT a ;
CARDS;
1
2
3
4
5
;
Then the following would create lag-one, lag-two and lag-three variables:
data data2;
set data;
a_1=lag(a);
a_2=lag2(a);
a_3=lag3(a);
run;
This would create the following dataset:
a a_1 a_2 a_3
1 . . .
2 1 . .
3 2 1 .
4 3 2 1
etc.
Moving averages can be easily calculated from these.
Check out http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212547.htm
(Please note, I did not get a chance to run the codes, so they may have errors.)
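To finish the lag approach, the moving average is simply the mean of the current value and its three lags. This is a sketch that assumes the input dataset is called data and has a numeric variable a, as in the question; restricting the result to rows 4 and later is my choice so that every average uses four values.

```sas
data want;
  set data;
  /* LAG functions must run on every row, so call them unconditionally */
  a_1 = lag(a);
  a_2 = lag2(a);
  a_3 = lag3(a);
  /* trailing 4-observation average, only once four values are available */
  if _n_ >= 4 then c = mean(a, a_1, a_2, a_3);
  drop a_1-a_3;
run;
```

Unlike the merge-ahead version, which averages the current row with the next three, this averages the current row with the previous three.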
Straight from Cody's Collection of Popular SAS Programming Tasks and How to Tackle Them.
*Presenting a macro to compute a moving average;
%macro Moving_ave(In_dsn=, /*Input data set name */
Out_dsn=, /*Output data set name */
Var=, /*Variable on which to compute
the average */
Moving=, /* Variable for moving average */
n= /* Number of observations on which
to compute the average */);
data &Out_dsn;
set &In_dsn;
***compute the lags;
_x1 = &Var;
%do i = 1 %to &n - 1;
%let Num = %eval(&i + 1);
_x&Num = lag&i(&Var);
%end;
***if the observation number is greater than or equal to the
number of values needed for the moving average, output;
if _n_ ge &n then do;
&Moving = mean (of _x1 - _x&n);
output;
end;
drop _x:;
run;
%mend Moving_ave;
*Testing the macro;
%moving_Ave(In_dsn=data,
Out_dsn=test,
Var=a,
Moving=Average,
n=4)

Resources