Counting a value within a specific range in array in SAS

So my dataset looks like this:
ABC1 ABC2 ABC3 ABC4 ABC5 DEF1 DEF2 DEF3 DEF4 DEF5
1 0 0 1 . 0 1 1 0 .
I want my output to be:
XYZ1 XYZ2 XYZ3 XYZ4 XYZ5
0 1 1 0 .
Basically, if DEF2 = 1 and the count of 1s among ABC3, ABC4 and ABC5 is > 0, then XYZ2 is 1.
I have tried the following code, but it doesn't work:
data want;
set have;
array ABC ABC:;
array DEF DEF:;
array XYZ [5] $1;
do i = 1 to dim(ABC)-5;
if ABC(i) = . then XYZ(i) = '';
else if (DEF(i) = 1 and sum(ABC(i+1), ABC(i+3)) > 0) then XYZ(i) = 1;
else XYZ(i) = 0;
end;
drop i;
run;

Let's pivot things for a better understanding:
index  ABC  DEF  XYZ (wanted)
-----  ---  ---  ---
1      1    0    0 (because DEF=0)
2      0    1    1 (sum ABC index 2..5 because DEF=1 at index 2)
3      0    1    1 (sum ABC index 3..5 because DEF=1 at index 3)
4      1    0    0 (because DEF=0)
5      .    .    . (because DEF=.)
Now apply that understanding to processing variables of the row when arrayed. The items will be processed from 5 to 1, so a running_sum can be computed and applied when necessary.
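data want;
  set have;
  array abc abc:;
  array def def:;
  array xyz(5);
  running_sum = .;  /* reset for every row */
  do index = dim(abc) to 1 by -1;
    /* accumulate ABC from the last index back to the current one */
    if not missing(abc(index)) then running_sum + abc(index);
    if def(index) in (., 0)
      then xyz(index) = def(index);         /* missing stays missing, 0 stays 0 */
      else xyz(index) = (running_sum > 0);  /* 1 if any ABC from here on is 1 */
  end;
  drop index running_sum;
run;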
Not all processing rules are stated in the question, such as
the case abc(j) = . and abc(k) ne . and k > j
Such a case may never happen.
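
For readers who want to check the backward-scan rule outside SAS, here is a minimal Python sketch of the same idea. The function name `flag_row` and the use of `None` for a SAS missing value are assumptions made for illustration; the sketch returns a 0/1 flag rather than the raw running sum.

```python
def flag_row(abc, def_):
    """Scan one row from the last index to the first, keeping a running sum of abc."""
    xyz = [None] * len(abc)
    running_sum = 0
    for i in range(len(abc) - 1, -1, -1):
        if abc[i] is not None:          # missing values do not add to the sum
            running_sum += abc[i]
        if def_[i] is None or def_[i] == 0:
            xyz[i] = def_[i]            # missing stays missing, 0 stays 0
        else:
            xyz[i] = 1 if running_sum > 0 else 0
    return xyz

row_abc = [1, 0, 0, 1, None]
row_def = [0, 1, 1, 0, None]
print(flag_row(row_abc, row_def))  # [0, 1, 1, 0, None]
```

Scanning from the end means each ABC value is added exactly once, instead of re-summing the tail of the array for every index.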

Related

Creating Dataset with random Values in SAS

I want to create a random dataset. Something like this-
ptno visits sex race
1 1 1 0
1 2 1 0
1 3 1 0
2 1 2 1
2 2 2 1
2 3 2 1
3 1 1 0
3 2 1 0
3 3 1 0
The values should be randomly generated. I want to know if I can do this dynamically using do loops. Thanks in advance for helping.

data want ;
length ptno visits sex race 8. ;
do ptno = 1 to 100 ;
_visits = ceil(ranuni(0)*5) ; /* between 1 & 5 */
sex = ceil(ranuni(0)*2) ; /* between 1 & 2 */
race = floor(ranuni(0)*2) ; /* between 0 & 1 */
do visits = 1 to _visits ;
output ;
end ;
end ;
drop _visits ;
run ;

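CALL RANUNI returns a random variate from the uniform distribution on (0, 1); the code below maps a value above 0.5 to one category and anything else to the other. Because the seeds are derived from ptno (i) and reset for every visit, all rows for the same ptno get the same sex and race.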
data want;
do i=100 to 110;
do j=1 to 5;
seed1=i+4567;
call ranuni(seed1,x);
seed2=i+1234;
call ranuni(seed2,y);
ptno=i;
visit=j;
sex=(x>0.5)+1;
race=(y<0.5);
output;
end;
end;
keep ptno--race;
run;
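
The same pattern (draw the patient-level values once, then emit one row per visit) can be sketched in Python for comparison. The function name `make_visits`, its parameters, and the use of Python's `random` module in place of RANUNI are assumptions for illustration; the numbers it produces will not match SAS output.

```python
import random

def make_visits(n_patients, max_visits, seed=0):
    """Return a list of (ptno, visit, sex, race) rows with patient-level sex and race."""
    rng = random.Random(seed)           # fixed seed so the result is reproducible
    rows = []
    for ptno in range(1, n_patients + 1):
        n_visits = rng.randint(1, max_visits)   # between 1 and max_visits, inclusive
        sex = rng.randint(1, 2)                 # 1 or 2
        race = rng.randint(0, 1)                # 0 or 1
        for visit in range(1, n_visits + 1):
            rows.append((ptno, visit, sex, race))
    return rows

for row in make_visits(3, 3)[:5]:
    print(row)
```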

MATLAB: fill a zeros matrix based on an array

So I have this data
F =
1
1
2
3
1
2
1
1
and a zeros matrix
NM =
0 0 0
0 0 0
0 0 0
I have a rule: from the list in the array, make a connection between each pair of consecutive values. From the F data the connections should be
1&1, 1&2, 2&3, 3&1, 1&2, 2&1, 1&1
Each connection represents a row and column position in the NM matrix, and each occurrence of a connection adds 1 to that entry,
so from the connections above the new matrix should be
NNM=
2 2 0
1 0 1
1 0 0
I'm trying to code it like this
[G H]=size(NM)
for i=1:G
j=2:G
if F(i)==A(j)
(NM(i,j))+1
else
NM(i,j)=0
end
end
NNM=NM
but the NM matrix does not change.
What should I do?

Is this what you are trying to do?
F = [1 1 2 3 1 2 1 1];
NM = zeros(3, 3);
for i=1:(numel(F)-1)
NM(F(i), F(i+1))=NM(F(i), F(i+1))+1;
end
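
As a cross-check of the counting rule, here is a small Python sketch that tallies each consecutive pair of F as a (row, column) increment. The name `transition_counts` and the `size` parameter are assumptions for illustration; values are taken to be 1-based as in the question.

```python
def transition_counts(f, size):
    """Count how often each value in f is immediately followed by each other value."""
    nm = [[0] * size for _ in range(size)]
    for a, b in zip(f, f[1:]):          # consecutive pairs: (f[0], f[1]), (f[1], f[2]), ...
        nm[a - 1][b - 1] += 1           # values are 1-based, Python lists are 0-based
    return nm

F = [1, 1, 2, 3, 1, 2, 1, 1]
for row in transition_counts(F, 3):
    print(row)
# [2, 2, 0]
# [1, 0, 1]
# [1, 0, 0]
```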

You can use sparse (and then convert to full) as follows:
NM = full(sparse(F(1:end-1), F(2:end), 1));

list = [1 1 ; 1 2 ; 2 3 ; 3 1 ; 1 2 ; 2 1 ; 1 1 ] ;
[nx,ny] = size(list) ;
NM = zeros(3) ;
for i = 1:nx
for j = 1:ny
NM(list(i,1),list(i,2)) = NM(list(i,1),list(i,2)) + 1/2 ;
end
end

SAS - Find and print first non-zero value from a dataset in columns

I have a data set with ID in rows and months in columns, as the one shown below.
I want to create an auxiliary column that records the first value that is not zero of each line.
ID M1 M2 M3 M4 M5 Auxiliary column
1 0 0 8 8 7 8
2 7 7 7 . . 7
3 0 0 0 0 9 9
4 0 9 9 9 8 9
5 1 1 1 1 1 1
6 0 2 2 1 1 2
Currently I am using this code, but I haven't been able to get the results I am looking for. Any ideas?
data new_ops04;
set new_ops03;
array MONTHS (24) M1-M24;
RETAIN AUXILIARY_COLUMN 0;
do i=1 to 24;
IF MONTHS(i) ne 0 and AUXILIARY_COLUMN = 0 THEN
AUXILIARY_COLUMN = MONTHS(i);
end;
drop i;
run;
Thanks a lot!

You're very close. Just drop the retain statement:
data new_ops04;
set new_ops03;
array MONTHS (24) M1-M24;
AUXILIARY_COLUMN = 0;
do i=1 to 24;
IF MONTHS(i) ne 0 and AUXILIARY_COLUMN = 0 THEN
AUXILIARY_COLUMN = MONTHS(i);
end;
drop i;
run;
Note that a missing value also satisfies ne 0, so you need to consider what should happen if the first month(s) are missing.

I would handle this use case in PROC SQL. But your problem is that you are not stopping when you reach the first value. So:
flag = 0;
do i = 1 to 24 until (flag);
  if MONTHS(i) ne 0 then do;  /* first non-zero value: record it and stop */
    AUXILIARY_COLUMN = MONTHS(i);
    flag = 1;
  end;
end;
drop i flag;
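
The stop-at-the-first-hit logic is easy to check in a few lines of Python. This is only a sketch: the name `first_nonzero`, the use of `None` for a missing value, and the choice to skip missing values are assumptions, not something the question specifies.

```python
def first_nonzero(months, default=0):
    """Return the first value that is neither zero nor missing, else default."""
    for value in months:
        if value is not None and value != 0:
            return value                # stop at the first hit
    return default

rows = [
    [0, 0, 8, 8, 7],
    [7, 7, 7, None, None],
    [0, 0, 0, 0, 9],
    [0, 9, 9, 9, 8],
    [1, 1, 1, 1, 1],
    [0, 2, 2, 1, 1],
]
print([first_nonzero(r) for r in rows])  # [8, 7, 9, 9, 1, 2]
```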

matlab: inserting element after element?

Is there a way to insert an element into an array after verifying a certain element value? For example, take
A = [0 0 1 1 0 1 0]
After each 1 in the array, I want to insert another 1 to get
Anew = [0 0 1 1 1 1 0 1 1 0]
However I want a way to code this for a general case (any length 1 row array and the ones might be in any order).

A = [0 0 1 1 0 1 0];
i = (A == 1); % Test for the number you want to insert after
t = cumsum(i); % Number of insertions made up to each position
idx = [1 (2:numel(A)) + t(1:end-1)]; % New position of each original element
newSize = numel(A) + sum(i);
N = ones(1,newSize)*5; % Make this the number you want to insert
N(idx) = A
Output:
N =
0 0 1 5 1 5 0 1 5 0
I made the inserted number 5 and split things onto multiple lines so it's easy to see what's going on.
If you wanted to do it in a loop (and this is how I would do it in real life where no-one can see me showing off)
A = [0 0 1 1 0 1 0];
idx = (A == 1); % Test for the number you want to insert after
N = zeros(1, numel(A) + sum(idx));
j = 1;
for i = 1:numel(A)
    N(j) = A(i);
    if idx(i)
        j = j+1;
        N(j) = 5; % The number you want to insert
    end
    j = j+1;
end
N
Output:
N =
0 0 1 5 1 5 0 1 5 0
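
For comparison, the same insert-after-match loop can be sketched in Python, where appending to a list avoids the index bookkeeping. The function name `insert_after` and its parameters are assumptions for illustration.

```python
def insert_after(values, target, new_value):
    """Return a copy of values with new_value inserted after every element equal to target."""
    out = []
    for v in values:
        out.append(v)
        if v == target:
            out.append(new_value)
    return out

print(insert_after([0, 0, 1, 1, 0, 1, 0], 1, 5))  # [0, 0, 1, 5, 1, 5, 0, 1, 5, 0]
print(insert_after([0, 0, 1, 1, 0, 1, 0], 1, 1))  # [0, 0, 1, 1, 1, 1, 0, 1, 1, 0]
```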

This code is not the most elegant, but it'll answer your question...
A=[0 0 1 1 0 1 0];
AA=[];
for ii=1:length(A);
AA=[AA A(ii)];
if A(ii)
AA=[AA 1];
end
end
I'm sure there will be also a vectorized way...

This should do the trick:
>> A = [0 0 1 1 0 1 0]
>>
>> sumA = sum(A);
>> Anew = zeros(1, 2*sumA+sum(~A));
>> I = find(A) + (0:sumA-1);
>> Anew(I) = 1;
>> Anew(I+1) = 8.2;
Anew =
0 0 1 8.2 1 8.2 0 1 8.2 0

A question about matrix manipulation

Given a 1*N matrix or an array, how do I find the first 4 elements which have the same value and then store the index for those elements?
PS:
I'm just curious. What if we want to find the first 4 elements whose value differences are within a certain range, say below 2? For example, M=[10,15,14.5,9,15.1,8.5,15.5,9.5], the elements I'm looking for will be 15,14.5,15.1,15.5 and the indices will be 2,3,5,7.

If you want the first value present 4 times in the array 'tab' in Matlab, you can use
num_min = 4
val=NaN;
for i = tab
if sum(tab==i) >= num_min
val = i;
break
end
end
ind = find(tab==val, num_min);
For instance, with
tab = [2 4 4 5 4 6 4 5 5 4 6 9 5 5]
you get
val =
4
ind =
2 3 5 7

Here is my MATLAB solution:
array = randi(5, [1 10]); %# random array of integers
n = unique(array)'; %'# unique elements
[r,~] = find(cumsum(bsxfun(@eq,array,n),2) == 4, 1, 'first');
if isempty(r)
val = []; ind = []; %# no answer
else
val = n(r); %# the value found
ind = find(array == val, 4); %# indices of elements corresponding to val
end
Example:
array =
1 5 3 3 1 5 4 2 3 3
val =
3
ind =
3 4 9 10
Explanation:
First of all, we extract the list of unique elements. In the example used above, we have:
n =
1
2
3
4
5
Then using the BSXFUN function, we compare each unique value against the entire vector array we have. This is equivalent to the following:
result = zeros(length(n),length(array));
for i=1:length(n)
result(i,:) = (array == n(i)); %# row-by-row
end
Continuing with the same example we get:
result =
1 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 1 0 0
0 0 1 1 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 0
0 1 0 0 0 1 0 0 0 0
Next we call CUMSUM on the result matrix to compute the cumulative sum along the rows. Each row will give us how many times the element in question appeared so far:
>> cumsum(result,2)
ans =
1 1 1 1 2 2 2 2 2 2
0 0 0 0 0 0 0 1 1 1
0 0 1 2 2 2 2 2 3 4
0 0 0 0 0 0 1 1 1 1
0 1 1 1 1 2 2 2 2 2
Then we compare that against four, cumsum(result,2)==4 (since we want the location where an element appeared for the fourth time):
>> cumsum(result,2)==4
ans =
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
Finally we call FIND to look for the first appearing 1 according to a column-wise order: if we traverse the matrix from the previous step column-by-column, then the row of the first appearing 1 indicates the index of the element we are looking for. In this case, it was the third row (r=3), thus the third element in the unique vector is the answer val = n(r). Note that if we had multiple elements repeated 4 times or more in the original array, then the one first appearing for the fourth time will show up first as a 1 going column-by-column in the above expression.
Finding the indices of the corresponding answer value is a simple call to FIND...
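
The cumulative-count idea can also be written as a single pass with running counters, shown here as a Python sketch. The function name `first_repeated` is an assumption for illustration, and indices are 0-based, unlike MATLAB.

```python
from collections import defaultdict

def first_repeated(values, times=4):
    """Return (value, indices) for the first value to reach `times` occurrences, else None."""
    seen = defaultdict(list)            # value -> indices where it has appeared so far
    for i, v in enumerate(values):
        seen[v].append(i)
        if len(seen[v]) == times:
            return v, seen[v]
    return None

print(first_repeated([1, 5, 3, 3, 1, 5, 4, 2, 3, 3]))  # (3, [2, 3, 8, 9])
```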

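Here is C++ code (indices are 0-based):
#include <algorithm>
#include <iostream>
#include <map>
#include <vector>

int main()
{
    // my_vector is the vector to analyze (example values, replace with your own)
    std::vector<int> my_vector = {2, 4, 4, 5, 4, 6, 4, 5, 5, 4};
    std::map<int, std::vector<int> > dict; // value -> indices where it occurs
    std::vector<int> ans(4);               // the four indices are stored here
    bool noanswer = true;
    for (std::size_t i = 0; i < my_vector.size(); ++i)
    {
        std::vector<int> &temp = dict[my_vector[i]];
        temp.push_back(static_cast<int>(i));
        if (temp.size() == 4) // found the first value seen four times
        {
            std::copy(temp.begin(), temp.end(), ans.begin());
            noanswer = false;
            break;
        }
    }
    if (noanswer)
        std::cout << "No Answer!" << std::endl;
    return 0;
}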

Ignore this and use Amro's mighty solution . . .
Here is how I'd do it in MATLAB. The matrix can be any size and contain any range of values and this should work. This solution automatically finds a value and then the indices of its first 4 elements, without being given the search value a priori.
tab = [2 5 4 5 4 6 4 5 5 4 6 9 5 5]
% loop to find the indices of groups of 4 identical elements
tot = zeros(size(tab));
for nn = 1:numel(tab)
    idxs = find(tab == tab(nn), 4, 'first');
    if numel(idxs) < 4
        tot(nn) = Inf; % fewer than 4 occurrences: rule this value out
    else
        tot(nn) = sum(idxs);
    end
end
% find the first 4 identical (the group with the smallest index sum)
bestTot = find(tot == min(tot), 1, 'first');
% store the indices you are interested in
indiciesOfInterst = find(tab == tab(bestTot), 4, 'first')

Since I couldn't easily understand some of the solutions, I wrote this one:
l = 10; m = 5; array = randi(m, [1 l])
A = zeros(l,m); % m is the maximum value that may occur in array
A(sub2ind([l,m],1:l,array)) = 1;
s = sum(A,1);
b = find(s(array) == 4,1);
% now b holds the index of the first element
if (~isempty(b))
    find(array == array(b))
else
    disp('nothing found');
end
I find this easier to visualize. It fills an l-by-m matrix with 1 wherever a value in array exists, according to its position (row) and value (column). The matrix is then summed column by column and mapped back to the original array. Drawbacks: it only matches values that occur exactly 4 times, and if array contains very large values, A may get relatively large too.

Your PS question is more complicated. I didn't have time to check every case, but the idea is this:
M=[10,15,14.5,9,15.1,8.5,15.5,9.5]
val = NaN;
num_min = 4;
delta = 2;
[Ms, iMs] = sort(M);
dMs = diff(Ms);
ind_min=Inf;
n = 0;
for i = 1:length(dMs)
if dMs(i) <= delta
n=n+1;
else
n=0;
end
if n == (num_min-1)
if (iMs(i) < ind_min)
ind_min = iMs(i);
end
end
end
ind = sort(iMs(ind_min + (0:num_min-1)))
val = M(ind)
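
One way to pin down the PS question is to look for the group of 4 values that is completed earliest, where the largest and smallest value in the group differ by less than the threshold. The Python sketch below takes that interpretation; the function name `first_close_group`, the strict `<` comparison, and the 0-based indices are assumptions, and it favors clarity over speed.

```python
def first_close_group(values, count=4, delta=2):
    """Return sorted indices of the earliest-completed group of `count` values spanning < delta."""
    for end in range(count, len(values) + 1):
        # Sort the first `end` elements by value, remembering their original indices.
        prefix = sorted((v, i) for i, v in enumerate(values[:end]))
        for start in range(len(prefix) - count + 1):
            window = prefix[start:start + count]
            if window[-1][0] - window[0][0] < delta:
                return sorted(i for _, i in window)
    return None

M = [10, 15, 14.5, 9, 15.1, 8.5, 15.5, 9.5]
print(first_close_group(M))  # [1, 2, 4, 6]
```

In MATLAB's 1-based numbering that result corresponds to indices 2, 3, 5, 7, the group named in the question.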
