Standard SQL - Not able to loop over main loop - loops

When using a nested LOOP or WHILE loop in BigQuery, it seems I am not able to iterate over the outer loop again. The problem can be reproduced with the code below.
DECLARE i int64 DEFAULT 0;
DECLARE j int64 DEFAULT 0;
DECLARE k int64 DEFAULT 0;
WHILE i < 3 DO
  SET i = i + 1;
  WHILE j < 2 DO
    SET j = j + 1;
    IF j = 2 THEN
      SET k = k + 7;
    END IF;
    EXECUTE IMMEDIATE """
      WITH test AS (SELECT @i2 AS i, @j2 AS j, @k2 AS k)
      SELECT * FROM test
    """
    USING i AS i2, j AS j2, k AS k2;
  END WHILE;
END WHILE;
As output, BigQuery gives me back only two results, both from the inner loop:
Row  i  j  k
1    1  1  0

Row  i  j  k
1    1  2  7
I would expect that, when we end the inner while loop, we would go to the outer one and start over.
Ending up in something like:
Row  i  j  k
1    2  2  7
What is the right way to do this? When using the same set-up with a LOOP and a BREAK condition, the results are exactly the same as described above. When using CONTINUE instead of BREAK, my query runs forever and hangs at the second statement.

"I would expect that, when we end the inner while loop, we would go to the outer one and start over."
It actually behaves exactly as you expected. To check this, run the script below, which has one extra line that prints i on every pass of the outer loop:
DECLARE i int64 DEFAULT 0;
DECLARE j int64 DEFAULT 0;
DECLARE k int64 DEFAULT 0;
WHILE i < 3 DO
  SET i = i + 1;
  SELECT i; # insert this line to check correctness
  WHILE j < 2 DO
    SET j = j + 1;
    IF j = 2 THEN
      SET k = k + 7;
    END IF;
    EXECUTE IMMEDIATE """
      WITH test AS (SELECT @i2 AS i, @j2 AS j, @k2 AS k)
      SELECT * FROM test
    """
    USING i AS i2, j AS j2, k AS k2;
  END WHILE;
END WHILE;
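The outer loop does run three times, but j is never reset: after the first pass j is already 2, so for i = 2 and i = 3 the condition WHILE j < 2 DO evaluates to false and the inner loop is skipped.
"What is the right way to do this?"
It depends on what you are trying to achieve, but usually this is done by resetting j inside the outer loop, as in the example below: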
DECLARE i int64 DEFAULT 0;
DECLARE j int64 DEFAULT 0;
DECLARE k int64 DEFAULT 0;
WHILE i < 3 DO
  SET i = i + 1;
  SET j = 0; # reset j
  WHILE j < 2 DO
    SET j = j + 1;
    IF j = 2 THEN
      SET k = k + 7;
    END IF;
    EXECUTE IMMEDIATE """
      WITH test AS (SELECT @i2 AS i, @j2 AS j, @k2 AS k)
      SELECT * FROM test
    """
    USING i AS i2, j AS j2, k AS k2;
  END WHILE;
END WHILE;
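The effect of the missing reset can be traced outside BigQuery. The following is only a sketch in Python that mimics the control flow of the scripts above; the function name run and the reset_j flag are made up for illustration, and each appended tuple stands for one result of the EXECUTE IMMEDIATE statement.

```python
def run(reset_j):
    """Mimic the nested WHILE loops; return the (i, j, k) rows the inner query would emit."""
    i = j = k = 0
    rows = []
    while i < 3:
        i += 1
        if reset_j:
            j = 0  # corresponds to "SET j = 0" in the fixed script
        while j < 2:
            j += 1
            if j == 2:
                k += 7
            rows.append((i, j, k))  # stands in for EXECUTE IMMEDIATE
    return rows


without_reset = run(reset_j=False)
with_reset = run(reset_j=True)
print(without_reset)  # only the rows produced while i = 1
print(with_reset)     # two rows for each value of i
```

Without the reset only the two rows for i = 1 appear, which matches the output in the question; with the reset every pass of the outer loop produces two rows.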


SAS EG How to compare cell values in an array loop?

I am currently trying to compare cell values on the same row over multiple columns, but having issues with referencing the correct cells.
My data currently is this:
col1  col2  col3  col4  col5  col6
a     b     c     d     e     f
a     b     c     d     e     e
a     b     c     d     d     d
I would like to compare col{i} to col{i+1} and drop values when repeated to give:
col1  col2  col3  col4  col5  col6
a     b     c     d     e     f
a     b     c     d     e     -
a     b     c     d     -     -
My current code is:
data want;
set have;
array c{*} col;
do i = 1 to dim(c);
do j = i+1;
if c{i} = c{j} then .;
else c{i};
end;
end;
run;
TIA
data want;
  set have;
  array c{*} col:;
  do i = dim(c) to 2 by -1; *no reason to check #1;
    if c{i} = c{i-1} then call missing(c{i}); *if identical to prior, clear out;
  end;
run;
You don't need two loops, just one, as you're only comparing each column with the one "before" it (or "after", but "before" is easier to mentally comprehend, at least for me). Loop from the last column down to column 2, check the one prior, and if it is identical, clear the current one out.
Importantly, this goes in reverse order, so it handles the d d d row above. If you go left to right, the middle d is cleared first, and the last d is then compared with a missing value instead of a d, so it is kept.
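To see why the direction matters, here is a small sketch in Python rather than SAS. The function clear_repeats and its right_to_left flag are hypothetical names, and None stands in for a SAS missing value.

```python
def clear_repeats(row, right_to_left=True):
    """Blank out every cell that equals its left-hand neighbour."""
    out = list(row)
    if right_to_left:
        positions = range(len(out) - 1, 0, -1)  # like: do i = dim(c) to 2 by -1
    else:
        positions = range(1, len(out))          # like: do i = 2 to dim(c)
    for i in positions:
        if out[i] == out[i - 1]:
            out[i] = None  # like: call missing(c{i})
    return out


row = ["a", "b", "c", "d", "d", "d"]
backward = clear_repeats(row, right_to_left=True)
forward = clear_repeats(row, right_to_left=False)
print(backward)
print(forward)
```

Going backward clears both trailing d values, while going forward leaves the last d in place because its neighbour has already been cleared.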
If the data contains multiple segments of repeated values and you want to keep only the unique consecutive values, you will need to track an insertion index.
Example: the variable j tracks the insertion point.
data have;
  input (col1-col6) ($) @1 (kol1-kol6) ($);
  format col: kol: $1.;
datalines;
a b c d e f
a b c d e e
a b c d d d
a a b b c c
a a b b a a
. b b b c d
a a . . c c
run;
data want(keep=col: kol:);
  set have;
  array c col1-col6;
  j = 1;
  do i = 2 to dim(c);
    if c(i) ne c(j) then do;
      j = j + 1;
      if i ne j then do;
        c(j) = c(i);
        call missing(c(i));
      end;
    end;
  end;
  do j = j+1 to i-1;
    call missing(c{j});
  end;
run;
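The insertion-index idea can be traced on a single row. This is only an illustrative sketch in Python, not a translation meant to replace the SAS step; compact_runs is a hypothetical name, indices are 0-based, and None again stands in for a missing value.

```python
def compact_runs(row):
    """Keep one value per run of equal neighbours, packed to the left."""
    c = list(row)
    j = 0  # position of the last value kept so far (j = 1 in the SAS code)
    for i in range(1, len(c)):
        if c[i] != c[j]:
            j += 1
            if i != j:
                c[j] = c[i]   # move the new value to the insertion point
                c[i] = None   # and clear its old position
    for t in range(j + 1, len(c)):
        c[t] = None           # clear everything after the last kept value
    return c


packed = compact_runs(["a", "a", "b", "b", "a", "a"])
unchanged = compact_runs(["a", "b", "c", "d", "e", "f"])
print(packed)
print(unchanged)
```

Note that the second run of a values is kept, because only consecutive repeats are collapsed.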
For the case of wanting only unique values of the array, you can use a bubble sorting comparison approach when the number of elements is smallish, say <10.
/* uniqueness via a bubbly search */
data want_b;
  set have;
  array c col1-col6;
  j=0;
  do i = 1 to dim(c);
    if missing(c{i}) then continue;
    do k = 1 to j; * bubble, bubble;
      if c{k} = c{i} then do;
        call missing(c{i});
        leave;
      end;
    end;
    if missing(c{i}) then continue;
    j = j + 1;
    if j < i then do;
      c{j} = c{i};
      call missing(c{i});
    end;
  end;
run;
When the number of elements increases you can use a hash to be more efficient whilst ensuring uniqueness.
/* uniqueness via hash lookup */
data want_h(keep=col: kol:);
  set have;
  array c col1-col6;
  if _n_ = 1 then do;
    declare hash v();
    length value $20; * must be at least as long as longest of c{*} variable ;
    v.defineKey('value');
    v.defineData('i');
    v.defineDone();
    call missing(value);
  end;
  j = 0;
  do i = 1 to dim(c);
    if not missing(c{i}) then if v.check(key:c{i}) ne 0 then do;
      v.add(key:c{i},data:i);
      j = j + 1;
      if i ne j then
        c(j) = c(i);
    end;
  end;
  do j = j+1 to dim(c);
    call missing(c{j});
  end;
  v.clear();
run;

Iterating over all possible ways of dividing 12 objects over 4 groups

I am trying to find a way to iterate over all possible ways of dividing 12 objects over 4 equally sized groups (the order within a group doesn't matter, the order of the groups does matter).
I know the total number of combinations is 369600 = 12! / (3!)^4, but I have no idea how I would go about iterating over all these different combinations.
Label the objects O[0], ..., O[11] and the groups G[0], ..., G[3]. You can then assign objects to groups in the following steps:
1. Select 3 objects for G[0]:
for(i = 0 ; i<10 ; i++){
for(j = i+1 ; j<11 ; j++){
for(k = j+1 ; k<12 ; k++){
G[0] = {O[i] , O[j] , O[k]}
2. Inside those loops, create a new object list by removing O[i], O[j], O[k] from the object list, and apply the same pattern to the 9 remaining objects for G[1] (O now denotes the reduced list):
for(l = 0 ; l<7 ; l++){
for(m = l+1 ; m<8 ; m++){
for(n = m+1 ; n<9 ; n++){
G[1] = {O[l] , O[m] , O[n]}
3. Do the same thing as step 2 for G[2], using the 6 remaining objects.
4. Assign the 3 remaining objects to G[3].
5. Write out G[0], G[1], G[2], G[3].
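The steps above can be sketched compactly with recursion instead of nine hand-written loops. This is an illustrative Python version, assuming the objects are distinct; the generator name partitions and its group_size parameter are made up here, and itertools.combinations plays the role of the triple loop.

```python
from itertools import combinations


def partitions(objects, group_size=3):
    """Yield every ordered sequence of groups that uses each object exactly once."""
    objects = tuple(objects)
    if not objects:
        yield ()  # nothing left to assign: one (empty) way to finish
        return
    for group in combinations(objects, group_size):  # the i < j < k loops
        chosen = set(group)
        rest = tuple(o for o in objects if o not in chosen)  # the reduced list
        for tail in partitions(rest, group_size):
            yield (group,) + tail


first = next(partitions(range(12)))
count = sum(1 for _ in partitions(range(12)))
print(first)
print(count)
```

The count equals 220 * 84 * 20 * 1, the number of ways to choose each group in turn, which is the 369600 from the question.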

Can't understand 10th line instruction after inner FOR loop

I've been studying "Algorithms and Data Structures" by N. Wirth. He codes his algorithms in a language he created, Oberon. I finished the book, but I have one doubt about this algorithm on page 19, coded in Oberon:
PROCEDURE Power (VAR W: Texts.Writer; N: INTEGER);
  VAR i, k, r: INTEGER;
    d: ARRAY N OF INTEGER;
BEGIN
  FOR k := 0 TO N-1 DO
    Texts.Write(W, "."); r := 0;
    FOR i := 0 TO k-1 DO
      r := 10*r + d[i]; d[i] := r DIV 2; r := r MOD 2;
      Texts.Write(W, CHR(d[i] + ORD("0")))
    END;
    d[k] := 5; Texts.Write(W, "5"); Texts.WriteLn(W)
  END
END Power
The resulting output text for N = 10 is
.5
.25
.125
.0625
.03125
.015625
.0078125
.00390625
.001953125
.0009765625
I don't understand what the instructions in line 10, d[k] := 5; Texts.Write(W, "5"); Texts.WriteLn(W), do:
1) Why would you set d[k] := 5? The program has already printed all the required output (d[0] to d[k-1]).
2) Why would you print a 5 after that (Texts.Write(W, "5"))?
The computation relies on the fact that the last digit of every negative power of two is five.
Unless execution has finished, d[k] is read in the next iteration of the outer loop, when r becomes 10*r + d[i] in the last iteration of the inner loop (where i equals the previous k). So the assignment d[k] := 5 stores the digit that the next line of output is computed from.
The statement Texts.Write(W, "5") requires (marginally) less computation than writing the digit back out of the array.
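To watch d[k] being reused, the procedure can be transcribed almost line for line. The sketch below is in Python rather than Oberon, and the function name power_lines is made up; it collects the output lines in a list instead of writing to a Texts.Writer.

```python
def power_lines(n):
    """Return the decimal expansions of 2**-1 .. 2**-n as strings."""
    d = [0] * n
    lines = []
    for k in range(n):
        out = ["."]
        r = 0
        for i in range(k):
            r = 10 * r + d[i]   # reads d[i], including the 5 stored last time
            d[i] = r // 2       # halve digit by digit (long division by 2)
            r = r % 2           # carry the remainder to the next digit
            out.append(str(d[i]))
        d[k] = 5                # store the new last digit for the next pass
        out.append("5")
        lines.append("".join(out))
    return lines


lines = power_lines(10)
print("\n".join(lines))
```

Each pass halves the previous number digit by digit. Because the previous last digit is the odd digit 5, a remainder of 1 is always left over at the end, and that remainder contributes the new trailing 5.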

Vectorizing a code that requires to complement some elements of a binary array

I have a matrix A of dimension m-by-n composed of zeros and ones, and a matrix J of dimension m-by-1 reporting some integers from [1,...,n].
I want to construct a matrix B of dimension m-by-n such that for i = 1,...,m
B(i,j) = A(i,j) for j=1,...,n-1
B(i,n) = abs(A(i,n)-1)
If sum(B(i,:)) is odd then B(i,J(i)) = abs(B(i,J(i))-1)
This code does what I want:
m = 4;
n = 5;
A = [1 1 1 1 1; ...
     0 0 1 0 0; ...
     1 0 1 0 1; ...
     0 1 0 0 1];
J = [1;2;1;4];
B = zeros(m,n);
for i = 1:m
    B(i,n) = abs(A(i,n)-1);
    for j = 1:n-1
        B(i,j) = A(i,j);
    end
    if mod(sum(B(i,:)),2)~=0
        B(i,J(i)) = abs(B(i,J(i))-1);
    end
end
Can you suggest more efficient algorithms, that do not use the nested loop?
No for loops are required for your question. It just needs an effective use of the colon operator and logical-indexing as follows:
% First initialize B to all zeros
B = zeros(size(A));
% Assign all but last columns of A to B
B(:, 1:end-1) = A(:, 1:end-1);
% Assign the last column of B based on the last column of A
B(:, end) = abs(A(:, end) - 1);
% Set all cells to required value
% Original code which does not work: B(oddRow, J(oddRow)) = abs(B(oddRow, J(oddRow)) - 1);
% Correct code:
% Find all rows in B with an odd sum
oddRow = find(mod(sum(B, 2), 2) ~= 0);
for ii = 1:numel(oddRow)
    B(oddRow(ii), J(oddRow(ii))) = abs(B(oddRow(ii), J(oddRow(ii))) - 1);
end
For the last part I think a for loop is the simplest option.
Edit: See the neat trick by EBH for doing the last part without a for loop.
To add to @ammportal's good answer: the last part can also be done without a loop by using linear indices, for which sub2ind is useful. Adapting the last part of the previous answer:
% Find all rows in B with an odd sum
oddRow = find(mod(sum(B, 2), 2) ~= 0);
% convert the locations to linear indices
ind = sub2ind(size(B),oddRow,J(oddRow));
B(ind) = abs(B(ind)- 1);
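For readers unfamiliar with linear indices, the sketch below shows in plain Python what sub2ind computes for a column-major matrix. It is not MATLAB and the variable names are only illustrative; it uses 0-based positions, so the linear index of row i and column j in an m-row matrix is i + j*m.

```python
A = [[1, 1, 1, 1, 1],
     [0, 0, 1, 0, 0],
     [1, 0, 1, 0, 1],
     [0, 1, 0, 0, 1]]
J = [1, 2, 1, 4]  # 1-based column numbers, as in the question
m, n = len(A), len(A[0])

# Copy A and complement the last column (1 - x equals abs(x - 1) for 0/1 values).
B = [row[:-1] + [1 - row[-1]] for row in A]

# Rows with an odd sum, then their column-major linear indices (0-based).
odd_rows = [i for i in range(m) if sum(B[i]) % 2 != 0]
ind = [i + (J[i] - 1) * m for i in odd_rows]

# Flatten column by column, flip the selected cells, and rebuild the matrix.
flat = [B[i][j] for j in range(n) for i in range(m)]
for p in ind:
    flat[p] = 1 - flat[p]
B = [[flat[i + j * m] for j in range(n)] for i in range(m)]

for row in B:
    print(row)
```

Only the fourth row has an odd sum after the last column is complemented, so exactly one cell (row 4, column 4 in MATLAB numbering) is flipped.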

MATLAB: Finding the entry number of the first '1' in a logical array

I have created a logical array of 1's and 0's using the following code:
nWindow = 10;
LowerTotInitial = std(LowerTot(1:nWindow));
UpperTotInitial = std(UpperTot(1:nWindow));
flag = 0;
flagArray = zeros(length(LowerTot), 1);
for n = 1 : nData0 - nWindow
    for k = 0 : nWindow - 1
        if LowerTot(n + k) < 0.1*LowerTotInitial || UpperTot(n + k) < 0.1*UpperTotInitial
            flag = 1;
            flagArray(n) = 1;
        else
            flag = 0;
        end
    end
end
This returns flagArray, an array of 0's and 1's. I am trying to find the index of the first 1 in the array, i.e. the smallest index for which flagArray(index) == 1. I am confused as to what is the best way to accomplish this!
What you call an entry number is referred to as an index in MATLAB-speak. To find the index of the first matching element in an array you can use the FIND function:
>> x = [0 0 1 0 1 0];
>> find(x, 1, 'first')
ans =
3
Try ind = find(flagArray, k, 'first') with k = 1.
See the MATLAB documentation for find.
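For comparison, the same "first nonzero position" lookup can be sketched in Python. The helper first_one is a hypothetical name; it returns a 1-based position to match the MATLAB convention, and None where find would return an empty result.

```python
def first_one(flags):
    """Return the 1-based position of the first nonzero entry, or None if there is none."""
    return next((pos for pos, value in enumerate(flags, start=1) if value), None)


x = [0, 0, 1, 0, 1, 0]
print(first_one(x))       # same position that find(x, 1, 'first') reports above
print(first_one([0, 0]))  # no nonzero entry
```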
