SAS: reshape data (stacking) - arrays

How do I reshape this data in SAS?
id q1a q2a q1b q2b q1c q2c
1 3 0 1 1 1 9
2 4 9 1 2 2 0
3 5 9 1 2 4 0
into this:
id q1 q2 type
1 3 0 a
1 1 1 b
1 1 9 c
.............

The simplest way is to transpose the data, split out the last letter from the q variables, then re-transpose.
data have;
input id q1a q2a q1b q2b q1c q2c;
datalines;
1 3 0 1 1 1 9
2 4 9 1 2 2 0
3 5 9 1 2 4 0
;
run;
proc transpose data=have out=temp1;
by id;
run;
data temp2;
set temp1;
length type $1;
type=substr(_NAME_,3);
_NAME_=substr(_NAME_,1,2);
run;
proc transpose data=temp2 out=want (drop=_:) ;
by id type;
id _NAME_;
var COL1;
run;

It seems simpler to me to do it all in one data step.
data want;
set have;
array qs q:;
do _t = 1 to dim(qs) by 2;
q1=qs[_t];
q2=qs[_t+1];
type = substr(vname(qs[_t]),3,1);
output;
end;
keep id q1 q2 type;
run;
That works for exactly your data; there are some assumptions (the substr and the relationship between q1/q2) that might need to be changed for a real-world example.
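If you'd rather not depend on the q variables being stored in q1a q2a q1b q2b ... order, one option is to list the two sets of variables in separate arrays. This is only a sketch under the assumption that the suffixes are exactly a, b and c; the array names q1s and q2s are made up for illustration.

```sas
data want;
  set have;
  /* one array per question, listed in matching suffix order */
  array q1s[*] q1a q1b q1c;
  array q2s[*] q2a q2b q2c;
  length type $1;
  do _t = 1 to dim(q1s);
    q1 = q1s[_t];
    q2 = q2s[_t];
    /* third character of the variable name is the suffix */
    type = substr(vname(q1s[_t]), 3, 1);
    output;
  end;
  keep id q1 q2 type;
run;
```

The substr call still assumes a one-character suffix in position 3, so that part would need to change for longer names.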

Related

Creating Dataset with random Values in SAS

I want to create a random dataset. Something like this-
ptno visits sex race
1 1 1 0
1 2 1 0
1 3 1 0
2 1 2 1
2 2 2 1
2 3 2 1
3 1 1 0
3 2 1 0
3 3 1 0
The values should be randomly generated. I want to know if I can do this dynamically using do loops. Thanks in advance for helping.
data want ;
length ptno visits sex race 8. ;
do ptno = 1 to 100 ;
_visits = ceil(ranuni(0)*5) ; /* between 1 & 5 */
sex = ceil(ranuni(0)*2) ; /* between 1 & 2 */
race = floor(ranuni(0)*2) ; /* between 0 & 1 */
do visits = 1 to _visits ;
output ;
end ;
end ;
drop _visits ;
run ;
CALL RANUNI produces a random variate from a uniform distribution on (0,1); the code below turns it into a binary value by comparing it with 0.5. Because the seed is rebuilt from ptno (i) on every iteration, the same ptno always gets the same sex and race.
data want;
do i=100 to 110;
do j=1 to 5;
seed1=i+4567;
call ranuni(seed1,x);
seed2=i+1234;
call ranuni(seed2,y);
ptno=i;
visit=j;
sex=(x>0.5)+1;
race=(y<0.5);
output;
end;
end;
keep ptno--race;
run;
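As an alternative to the RANUNI approaches above, the same idea can be sketched with the RAND function and CALL STREAMINIT. This is a swapped-in technique, not what either answer uses; the seed value 2024 and the helper variable n_visits are arbitrary choices for illustration.

```sas
data want;
  call streaminit(2024);                    /* fixed seed so the run is repeatable */
  do ptno = 1 to 100;
    sex      = 1 + rand('bernoulli', 0.5);  /* 1 or 2 */
    race     = rand('bernoulli', 0.5);      /* 0 or 1 */
    n_visits = ceil(5 * rand('uniform'));   /* between 1 and 5 */
    do visits = 1 to n_visits;
      output;                               /* sex and race repeat within a ptno */
    end;
  end;
  drop n_visits;
run;
```

Since sex and race are drawn once per ptno, outside the inner loop, they stay constant across that patient's visits.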

Loops to update allocation

I have data that looks like this:
Subject Treatment X
1 1 X12
2 2 X12
3 3 X13
4 1 X11
5 2 X13
6 3 X12
7 1 X11
8 2 X12
9 1 X11
10 3 X13
I have to count the number of X's using the variable Z, so Z11 = # of X11's, Z12 = # of X12's and so on, but if the last digit of X and the Treatment (T) are the same, then you add one to the allocation.
So Z11=X11+1 if T=1, Z12=X12+1 if T=2 and Z13=X13+1 if T=3, but if the last digit and T don't correspond to each other then it stays the same: Z11=X11, Z12=X12 and Z13=X13. I am using proc sql to count the allocations but don't know how to add 1 each time the loop goes through another subject.
proc sql;
create table new1 as select
sum(y="X11")+1 as z11,
sum(y="X12") as z12,
sum(y="X13") as z13
from dynamic;
quit;
Any help will be appreciated.
I'm not clear exactly what you want, but I would do this in a data step creating the variables you want, and then sum up the totals.
data have;
input Subject Treatment X $;
datalines;
1 1 X12
2 2 X12
3 3 X13
4 1 X11
5 2 X13
6 3 X12
7 1 X11
8 2 X12
9 1 X11
10 3 X13
;
data have;
set have;
z11 = 0;
z12 = 0;
z13 = 0;
if x="X11" then do;
z11 = z11 + 1;
if Treatment=1 then
z11 = z11+1;
end;
else if x="X12" then do;
z12 = z12 + 1;
if Treatment=2 then
z12 = z12 + 1;
end;
else if x="X13" then do;
z13 = z13 + 1;
if Treatment=3 then
z13 = z13+1;
end;
run;
proc summary data=have;
var z:;
output out=want sum=;
run;
produces:
_TYPE_ _FREQ_ z11 z12 z13
0 10 6 6 5
You can drop the _TYPE_ and _FREQ_ variables if you need.
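For example, a DROP= data set option on the output data set should do it. This is a small sketch that reuses the same have and want names as above.

```sas
proc summary data=have;
  var z:;
  /* drop the automatic variables from the output data set */
  output out=want(drop=_type_ _freq_) sum=;
run;
```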

SAS - Find and print first non-zero value from a dataset in columns

I have a data set with ID in rows and months in columns, as the one shown below.
I want to create an auxiliary column that records the first value that is not zero of each line.
ID M1 M2 M3 M4 M5 Auxiliary column
1 0 0 8 8 7 8
2 7 7 7 . . 7
3 0 0 0 0 9 9
4 0 9 9 9 8 9
5 1 1 1 1 1 1
6 0 2 2 1 1 2
Currently I am using this code, but I haven't been able to get the results I am looking for. Any ideas?
data new_ops04;
set new_ops03;
array MONTHS (24) M1-M24;
RETAIN AUXILIARY_COLUMN 0;
do i=1 to 24;
IF MONTHS(i) ne 0 and AUXILIARY_COLUMN = 0 THEN
AUXILIARY_COLUMN = MONTHS(i);
end;
drop i;
run;
Thanks a lot!
You're very close. Just drop the retain statement:
data new_ops04;
set new_ops03;
array MONTHS (24) M1-M24;
AUXILIARY_COLUMN = 0;
do i=1 to 24;
IF MONTHS(i) ne 0 and AUXILIARY_COLUMN = 0 THEN
AUXILIARY_COLUMN = MONTHS(i);
end;
drop i;
run;
You need to consider what happens if the first observation(s) are missing.
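One way to handle that case is to skip missing values as well as zeros. The sketch below assumes a missing month should be ignored the same way a zero is, which the question doesn't actually state; it also uses LEAVE to stop at the first hit.

```sas
data new_ops04;
  set new_ops03;
  array MONTHS[24] M1-M24;
  AUXILIARY_COLUMN = 0;
  do i = 1 to dim(MONTHS);
    /* ignore both zero and missing months */
    if MONTHS[i] not in (., 0) then do;
      AUXILIARY_COLUMN = MONTHS[i];
      leave;  /* stop at the first qualifying value */
    end;
  end;
  drop i;
run;
```

A row with no qualifying value keeps AUXILIARY_COLUMN = 0.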
I would do this use case in proc sql. But your problem is that you are not stopping when you reach the first non-zero value. So:
flag = 0;
do i=1 to 24 until (flag);
/* stop at the first non-zero month; note that a missing value also satisfies ne 0 */
if MONTHS(i) ne 0 then do;
AUXILIARY_COLUMN = MONTHS(i);
flag = 1;
end;
end;
drop i flag;

Retain the value of a variable within a group in SAS

I want to create a variable Var2 that becomes 1 at the first observation where Var1 is equal to 1 and stays 1 until the end of the by group defined by ID.
Here is the minimal working example:
ID Year Var1
1 1 .
1 2 0
1 3 .
1 4 1
1 5 .
And I want to create the following output:
ID Year Var1 Var2
1 1 . .
1 2 0 0
1 3 . 0
1 4 1 1
1 5 . 1
My current code is as follows:
DATA data1;
SET data0;
BY ID YEAR ;
IF LAST.ID THEN END = _N_;
IF Var1 > 0 THEN CNT=_N_;
RUN;
DATA data2;
SET data1;
BY ID YEAR ;
Var2 = 0;
IF Var1 = 1 THEN DO;
DO I = CNT TO END;
Var[I] = 1;
END;
END;
RUN;
However, SAS does not loop along observations.
I'm not sure what your example is doing, but this is fairly straightforward.
data want;
set have;
by id;
retain var2;
if first.id then var2=0;
if var1=1 then var2=1;
run;
Retain var2 to keep its value across observations, and then set it to 1 when you see a 1 in var1; finally, set it to 0 when you see a first.id row.
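If you also need Var2 to stay missing until the first non-missing Var1, as in the desired output in the question, a variation along these lines might work. It assumes Var1 only takes the values missing, 0 and 1, and that have is sorted by ID.

```sas
data want;
  set have;
  by id;
  retain var2;
  if first.id then var2 = .;   /* start each ID with a missing flag */
  if var1 = 1 then var2 = 1;   /* switch on and stay on */
  else if var1 = 0 and var2 ne 1 then var2 = 0;  /* an observed 0 sets the flag to 0 */
run;
```

With the example data this gives ., 0, 0, 1, 1 for years 1 to 5.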

SAS ID increment by 2

I have a dataset like this
data ID;
input num;
datalines;
1
1
2
3
4
4
;
I want to create another variable to group them in increments of 2:
num id
1 1
1 1
2 1
3 2
4 2
4 2
Thanks!
If they're truly 1-2-3-4, then you can divide by 2 and round up with CEIL:
data want;
set id;
id=ceil(num/2);
run;
*mod(num,2) on its own would alternate between groups rather than pairing 1-2 and 3-4;
If they're not truly 1-2-3-4, then add by num; and if first.num then iter+1; to make an iterator that is 1-2-3-4, and use ceil(iter/2) instead of ceil(num/2).
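For values that are not consecutive, the counter idea can be sketched as follows. This assumes the asker's ID data set is sorted by num, that the helper variable iter is free to use, and that each pair of distinct num values should share a group (hence CEIL of the counter divided by 2, which is my reading of the desired output).

```sas
data want;
  set id;
  by num;
  if first.num then iter + 1;  /* sum statement: counts distinct num values, retained automatically */
  id = ceil(iter / 2);         /* distinct values 1-2 map to 1, 3-4 map to 2, and so on */
  drop iter;
run;
```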
You can use a binary variable that flips between 0 and 1 for every new num:
data result;
set id;
by num;
retain incr 0;
if FIRST.num then do;
incr=not incr;
id+incr;
end;
drop incr;
run;
