I am working on a clinical trial where I have to create an allocation variable (Zjt) based on the variable Xjk, the treatments T and K (each 1, 2, 3) and the age factor J (1, 2, 3). We assume that a patient who falls in age factor j can be assigned any of the K treatments (1, 2, 3). So a patient in age factor 1 can be assigned X11, X12 or X13; factor 2: X21, X22, X23; factor 3: X31, X32, X33. Z is the variable that keeps the count for each assigned treatment. T and K are both treatments, used in different scenarios.
The sample data looks like this:
Subject J K T X
1 1 2 2 X12
2 2 2 2 X22
3 1 1 1 X11
4 2 1 1 X21
..............
2310 1 3 X13
data four;
set four;
if J=1 and K=1 then X=X11;
if J=2 and K=1 then X=X21;
if J=3 and K=2 then X=X32;
run;
data four;
set four;
if J=1 and T=1 then do; Z11=X11+1; Z12=X12; Z13=X13; end;
if J=1 and T=2 then do; Z11=X11; Z12=X12+1; Z13=X13; end;
if J=1 and T=3 then do; Z11=X11; Z12=X12; Z13=X13+1; end;
if J=2 and T=1 then do; Z21=X21+1; Z22=X22; Z23=X23; end;
if J=2 and T=2 then do; Z21=X21; Z22=X22+1; Z23=X23; end;
if J=2 and T=3 then do; Z21=X21; Z22=X22; Z23=X23+1; end;
then it repeats for factor 3.
Each time T=K, Z (the count for Xjk) increases by 1; if T is not equal to K, Z stays the same. I think I need an array to check the condition each time, but I am very new to SAS and the arrays I have created so far have failed. Any help will be appreciated.
It really looks like you are making this much too complex (or else I do not understand the question).
It sounds like you have a list of subjects, the age group each one is in, and the two treatment groups each one was assigned to. So it sounds like you have data like this:
data have ;
input Subj AgeGrp Trt1 Trt2 ;
cards;
1 1 2 2
2 2 2 2
3 1 1 1
4 2 1 1
;
You could just use PROC FREQ to count how many fall into each combination.
proc freq data=have; tables trt1*trt2 ; run;
If you want the counts separately for each age group then add that variable also.
proc freq data=have; tables agegrp*trt1*trt2 ; run;
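For readers who want to see what that cross-tabulation computes, here is a rough sketch of the same counting idea in Python rather than SAS. It uses the four sample rows above; the names `rows` and `count_allocations` are made up for this illustration and are not part of the original answer.

```python
from collections import Counter

# (subject, age_group, trt1, trt2) rows copied from the sample data above
rows = [
    (1, 1, 2, 2),
    (2, 2, 2, 2),
    (3, 1, 1, 1),
    (4, 2, 1, 1),
]

def count_allocations(rows):
    """Count subjects per (age group, trt1, trt2) combination."""
    return Counter((age, t1, t2) for _subj, age, t1, t2 in rows)

counts = count_allocations(rows)
for key in sorted(counts):
    print(key, counts[key])
```

Each key plays the role of one cell of the agegrp*trt1*trt2 table, so no per-cell Z variable has to be maintained by hand.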
I want to create a random dataset. Something like this-
ptno visits sex race
1 1 1 0
1 2 1 0
1 3 1 0
2 1 2 1
2 2 2 1
2 3 2 1
3 1 1 0
3 2 1 0
3 3 1 0
The values should be randomly generated. I want to know if I can do this dynamically using do loops. Thanks in advance for helping.
data want ;
length ptno visits sex race 8. ;
do ptno = 1 to 100 ;
_visits = ceil(ranuni(0)*5) ; /* between 1 & 5 */
sex = ceil(ranuni(0)*2) ; /* between 1 & 2 */
race = floor(ranuni(0)*2) ; /* between 0 & 1 */
do visits = 1 to _visits ;
output ;
end ;
end ;
drop _visits ;
run ;
CALL RANUNI returns a random variate from a uniform distribution on (0,1); comparing it with 0.5 turns it into a 0/1 value. Here the seed is rebuilt from ptno (i) before every call, so all visits for the same ptno get the same sex and race.
data want;
do i=100 to 110;
do j=1 to 5;
seed1=i+4567;
call ranuni(seed1,x);
seed2=i+1234;
call ranuni(seed2,y);
ptno=i;
visit=j;
sex=(x>0.5)+1;
race=(y<0.5);
output;
end;
end;
keep ptno--race;
run;
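The nested-loop idea in both answers (draw the patient-level values once, then emit one row per visit) can be sketched outside SAS as well. The following Python version is only an illustration under assumptions: the function name `make_patients`, the default of at most 5 visits, and the seed value are all invented here.

```python
import random

def make_patients(n_patients, max_visits=5, seed=12345):
    """Return rows (ptno, visit, sex, race); sex and race are fixed per patient."""
    rng = random.Random(seed)  # fixed seed so the data set is reproducible
    rows = []
    for ptno in range(1, n_patients + 1):
        n_visits = rng.randint(1, max_visits)  # 1..max_visits inclusive
        sex = rng.randint(1, 2)                # 1 or 2
        race = rng.randint(0, 1)               # 0 or 1
        for visit in range(1, n_visits + 1):
            rows.append((ptno, visit, sex, race))
    return rows

rows = make_patients(3)
for row in rows:
    print(row)
```

Drawing sex and race in the outer loop is what keeps them constant across a patient's visits; only the visit counter changes in the inner loop.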
Case One
Sample output:
`1,1,1,1,2,2,2,2......9,9,9,9,0,0,0,0`
A more general case would be:
The array starts with n1 elements valued x1, then followed by n2 elements valued x2...
In the sample output, n1 = n2 = n3 = .. = 4, x1=1, x2=2 ...
But I don't want to create it based on the element's position in array using if-else statement.
Here's what I have done:
%let nd = 80;
data _t(drop = i);
array a{&nd};
do i = 1 to &nd;
if i le 4 then a[i] = 1;
else ....;
end;
'other codes'
run;
Case Two
What if the order in the array doesn't matter as long as it contains all the elements I need (n1 x1, n2 x2 ...) ? In this scenario, is it easier to build up the array?
Provided that you know in advance an upper bound for how many elements you want to create, you can do this in the array statement - e.g.
data _null_;
array t{10} (1*1 2*2 3*3 4*4);
put t{*};
run;
Output:
1 2 2 3 3 3 4 4 4 4
N.B. This sort of assignment implicitly causes your array variables to be retained.
You can also nest brackets when creating runs of elements, e.g,
data _null_;
array t{10} (2*(1 2 3 4 5));
put t{*};
run;
Output:
1 2 3 4 5 1 2 3 4 5
However, * signs need to be separated by brackets, e.g.
data _null_;
array t{10} (2*(1 2) 2*(3*3));
put t{*};
run;
Output:
1 2 1 2 3 3 3 3 3 3
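The general case from the question (n1 elements valued x1, then n2 elements valued x2, and so on) is a run-length expansion. As a language-neutral illustration, here is a small Python sketch; `expand_runs` is a hypothetical helper name, not a SAS feature.

```python
def expand_runs(runs):
    """Expand (count, value) pairs into a flat list, e.g. (2, 7) -> 7, 7."""
    out = []
    for count, value in runs:
        out.extend([value] * count)
    return out

print(expand_runs([(1, 1), (2, 2), (3, 3), (4, 4)]))
# -> [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
print(expand_runs([(4, d) for d in (1, 2, 3)]))
# -> [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
```

The first call mirrors the `(1*1 2*2 3*3 4*4)` initial-value list above; the second mirrors the question's sample where every run has length 4.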
I have a dataset like this
data ID;
input num;
datalines;
1
1
2
3
4
4
;
I want to create another variable to group them in increment of 2:
num id
1 1
1 1
2 1
3 2
4 2
4 2
Thanks!
If they're truly 1-2-3-4, then you can compute the id directly from num with CEIL:
data want;
set id;
id=ceil(num/2);
run;
If they're not truly 1-2-3-4, then apply the same formula to an iterator instead of num: with by num; in effect, if first.num then iter+1; makes an iterator that is 1-2-3-4.
You can use a binary variable that flips between 0 and 1 for every new num:
data result;
set id;
by num;
retain incr 0;
if FIRST.num then do;
incr=not incr;
id+incr;
end;
drop incr;
run;
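To make the grouping rule explicit (a new id after every two distinct values of num), here is a hedged Python sketch of the same logic. It assumes the input is sorted so that equal values are adjacent, as BY num requires; `group_ids` and its `size` parameter are names invented for this example.

```python
def group_ids(nums, size=2):
    """Assign an id that increases after every `size` distinct values.

    Assumes `nums` is sorted so equal values are adjacent (like BY num).
    """
    ids = []
    distinct_seen = 0
    previous = object()  # sentinel that equals no real value
    for num in nums:
        if num != previous:      # same role as FIRST.num
            distinct_seen += 1
            previous = num
        ids.append((distinct_seen - 1) // size + 1)
    return ids

print(group_ids([1, 1, 2, 3, 4, 4]))  # -> [1, 1, 1, 2, 2, 2]
```

Because the id is derived from the count of distinct values rather than from num itself, the sketch also works when the values are not consecutive integers.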
I am trying to find a way to calculate a moving average using SAS do loops. I am having difficulty. I essentially want to calculate a 4 unit moving average.
DATA data;
INPUT a b;
CARDS;
1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
17 18
;
run;
data test(drop = i);
set data;
retain c 0;
do i = 1 to _n_-4;
c = (c+a)/4;
end;
run;
proc print data = test;
run;
One option is to use the merge-ahead:
DATA have;
INPUT a b;
CARDS;
1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
17 18
;
run;
data want;
merge have have(firstobs=2 rename=a=a_1) have(firstobs=3 rename=a=a_2) have(firstobs=4 rename=a=a_3);
c = mean(of a:);
run;
Merge the data to itself, each time the merged dataset advancing one - so the 2nd starts with 2, third starts with 3, etc. That gives you all 4 'a' on one line.
SAS has a lag() function, which returns the lagged value of the variable it is applied to. So, for example, if your data looked like this:
DATA data;
INPUT a ;
CARDS;
1
2
3
4
5
;
Then the following would create lag-one, lag-two and lag-three variables:
data data2;
set data;
a_1=lag(a);
a_2=lag2(a);
a_3=lag3(a);
run;
This would create the following dataset:
a a_1 a_2 a_3
1 . . .
2 1 . .
3 2 1 .
4 3 2 1
etc.
Moving averages can be easily calculated from these.
Check out http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212547.htm
(Please note, I did not get a chance to run the codes, so they may have errors.)
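As a cross-check on what a 4-unit trailing moving average should produce for column a of the question's data, here is a minimal Python sketch. It is not SAS and not the answerer's code; `moving_average` is a hypothetical helper, and it assumes no average is reported until a full window is available.

```python
from collections import deque

def moving_average(values, window=4):
    """Trailing moving average; yields nothing until `window` values are seen."""
    recent = deque(maxlen=window)  # keeps only the last `window` values
    averages = []
    for value in values:
        recent.append(value)
        if len(recent) == window:
            averages.append(sum(recent) / window)
    return averages

a = [1, 3, 5, 7, 9, 11, 13, 15, 17]
print(moving_average(a))  # -> [4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
```

The deque plays the same role as the current value plus the three lagged copies: it always holds the four most recent values.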
Straight from Cody's Collection of Popular SAS Programming Tasks and How to Tackle Them.
*Presenting a macro to compute a moving average;
%macro Moving_ave(In_dsn=, /*Input data set name */
Out_dsn=, /*Output data set name */
Var=, /*Variable on which to compute
the average */
Moving=, /* Variable for moving average */
n= /* Number of observations on which
to compute the average */);
data &Out_dsn;
set &In_dsn;
***compute the lags;
_x1 = &Var;
%do i = 1 %to &n - 1;
%let Num = %eval(&i + 1);
_x&Num = lag&i(&Var);
%end;
***if the observation number is greater than or equal to the
number of values needed for the moving average, output;
if _n_ ge &n then do;
&Moving = mean (of _x1 - _x&n);
output;
end;
drop _x:;
run;
%mend Moving_ave;
*Testing the macro;
%moving_Ave(In_dsn=data,
Out_dsn=test,
Var=a,
Moving=Average,
n=4)
Consider an area of size m*n, where m and n are unknown. I am extracting data from each point in the area. I scan along the x direction up to point m, then return to m=0 and n=1, i.e. the second row, and scan along the x direction again to the end of m. An example of the data is shown below; I get a value for each x,y coordinate during the scan. I can carry out an operation between the first two points in the x direction with
p1 = A{1}; %%reading the data from the text file
p2 = A{2};
LA=[p1 p2];
for m=1:length(y)
p= LA(m,1);
t= LA(m,2);
%%and
q=LA(m+1,1)
r=LA(m+1,2)
I want to do the same along the y axis. That is, I want to operate between the first point at x=0 and y=1, then between x=2 and y=1, and so on. I hope that is clear.
g x y
2 0 0
3 1 0
2 2 0
4 3 0
1 4 0
2 m 0
3 0 1
2 1 1
4 2 1
5 3 1
.
.
.
.
2 m 1
Now I was thinking of a logic where I first find the size of n by counting the number of zeros:
NUMX = 0;
while y((NUMX+1),:) == 0
NUMX = NUMX + 1;
end
NU= NUMX;
And then I was thinking of applying the following loop
for m=1:NU:n-1
%%and
p= LA(m,1);
t= LA(m,2);
%%and
q=LA(m+1,1)
r=LA(m+1,2)
But it is showing an error. Please help!
??? Attempted to access del2(99794,:); index out of bounds because
size(del2)=[99793,1].
Here NUMX=198
Comment: The nomenclature in your question is inconsistent, making it difficult to understand what you are doing. The variable del2 you mention in the error message is nowhere to be seen.
1.) Let's start off by creating a minimal working example that illustrates the data structure and provides knowledge of the dimensions we want to retrieve later. Your matrix is not m x n but m*n x 3.
The following example will set up a matrix with data similar to what you have shown in your question:
M = zeros(8,3);
for J=1:4
for I=1:2
M((J-1)*2+I,1) = rand(1);
M((J-1)*2+I,2) = I;
M((J-1)*2+I,3) = J-1;
end
end
M =
0.469 1 0
0.012 2 0
0.337 1 1
0.162 2 1
0.794 1 2
0.311 2 2
0.529 1 3
0.166 2 3
2.) Next, let's determine the number of x and y, to use the nomenclature of your question:
NUMX = 0;
while M(NUMX+1,3) == 0
NUMX = NUMX + 1;
end
NUMY = size(M,1)/NUMX;
NUMX =
2
NUMY =
4
3.) The data processing you want to do is still unclear, but here are two approaches that can be used for different purposes:
(a)
COUNT = 1;
for K=1:NUMX:size(M,1)
A(COUNT,1) = M(K,1);
COUNT = COUNT + 1;
end
In this case, you step through the first column of M with a step-size corresponding to NUMX. This will result in all the values for x=1:
A =
0.469
0.337
0.794
0.529
(b) You can also use NUMX and NUMY to reorder M:
for J=1:NUMY
for I=1:NUMX
NEW_M(I,J) = M((J-1)*NUMX+I,1);
end
end
NEW_M =
0.469 0.337 0.794 0.529
0.012 0.162 0.311 0.166
The matrix NEW_M now is of size m x n, with the values of constant y in the columns and the values of constant x in the rows.
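For comparison, steps 2 and 3(b) can be sketched in Python as a single function. This is only an illustration under assumptions: the rows are (g, x, y) tuples scanned x-first as in the example, and `to_grid` is a name invented here.

```python
def to_grid(rows):
    """Reshape flat (g, x, y) rows, scanned x-first, into grid[x_index][y_index]."""
    first_y = rows[0][2]
    # NUMX: how many leading rows share the first y value
    num_x = 0
    while num_x < len(rows) and rows[num_x][2] == first_y:
        num_x += 1
    if len(rows) % num_x != 0:
        raise ValueError("row count is not a multiple of the scan-line length")
    num_y = len(rows) // num_x
    grid = [[rows[j * num_x + i][0] for j in range(num_y)] for i in range(num_x)]
    return grid, num_x, num_y

# Values taken from the example matrix M above
rows = [(0.469, 1, 0), (0.012, 2, 0), (0.337, 1, 1), (0.162, 2, 1),
        (0.794, 1, 2), (0.311, 2, 2), (0.529, 1, 3), (0.166, 2, 3)]
grid, num_x, num_y = to_grid(rows)
print(num_x, num_y)  # -> 2 4
for line in grid:
    print(line)
```

The `num_x < len(rows)` guard stops the counting loop at the last row, so it cannot index past the end of the data the way an unguarded loop could if every y value were equal.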
Concluding remark: It is unclear how you define m and n in your code, so your specific error message cannot be resolved here.