Find the indices of a consecutive flag value in a 1D array

I am looking for the value 0 in a 1D array. The array contains several 0 values, most of the time consecutive. What I want to do is find the indices of the first and last zero in each run of consecutive zeros. Here is an example to make things clearer:
Imagine I have the following array :
A= 0.0 0.0 0.0 0.0 0.0 0.0 0.38458693526004206 0.37630968444637147 0.40920888023862656 0.37240138383511134 0.38032672100490084 0.37013107455599198 0.40263333907360693 0.36804456033540955 0.41199172743738527 0.42761170349633443 0.39300715826673704 0.39783513932402137 0.44013743441396674 0.435127008833611 0.48217350280280391 0.47501246018014148 0.49234819258730078 0.54559998531569354 0.47840534103437832 0.0 0.0 0.0 0.51927791704510429 0.0 0.0 0.0 0.0 0.0 0.45862555500619961 0.50158980306905965 0.45676444815553296 0.49679306608627022 0.53878698007533210 0.50186256107128602 0.51714780706878094 0.53005606067091249 0.48409168179213419 0.48594430950932133 0.50963106475909081 0.49300327248076087 0.50531667704394834 0.46415085995913757 0.51930900041928330
Looking for the first and last location of zero in each consecutive run, I should obtain the following:
min_loc_1=1
max_loc_1=6
min_loc_2=26
max_loc_2=28
min_loc_3=30
max_loc_3=34
Now I have tried a combination of any, minloc, maxloc, and forall, but I can't figure it out:
do ijk = 1, size(work1)
   if (work1(ijk) .eq. 0) then
      location1(ijk) = ijk
   end if
end do
min_loc=minloc(location1)
max_loc1=maxloc(location1)
I cannot use where, because I am calling a subroutine inside of it, and Fortran doesn't like it apparently.

A limited amount of testing has convinced me that this solves your immediate problem. I haven't tested it extensively; I'll leave that to you. It writes the indices of the start and end of each run of 0s into the array b:
INTEGER, DIMENSION(:), ALLOCATABLE :: b
INTEGER :: ix
LOGICAL :: zz   ! .TRUE. while scanning through a run of 0s
...
ALLOCATE(b(0))
zz = .FALSE.
DO ix = 1, SIZE(a)
   ! entering a run of 0s: record its first index
   IF (.NOT.zz .AND. a(ix)==0) THEN
      b = [b, ix]
      zz = .TRUE.
   END IF
   ! leaving a run of 0s: record its last index
   IF (zz .AND. a(ix)/=0) THEN
      b = [b, ix-1]
      zz = .FALSE.
   END IF
END DO
! note: if a ends with a run of 0s, the closing index of that run is not appended
This produces, when fed the array you show us,
b == [1 6 26 28 30 34]
If that doesn't appeal, this also seems to work:
b = [(ix,ix=1,SIZE(a))]
WHERE(a/=0.0) b = 0
c = PACK(b,b/=0)
b = PACK(c,(CSHIFT(c,1)-c)*(CSHIFT(c,-1)-c)/=-1)
The last PACK keeps only the indices of zeros that are not strictly interior to a run (for an interior index the differences to its two neighbours in c are +1 and -1, so their product is -1), i.e. it keeps the first and last index of each run. If you have trouble figuring this version out, stick to the explicit looping in the first snippet.


Replacing specific values in an array based on logic

I'm trying to replace the values in array1 following this logic:
If a value is greater than one, use just its decimal part.
If it's exactly 1, it stays 1.
If it's 0, it stays 0.
If it's negative, make it positive and then apply the same logic.
The code I used was:
array1=[0.5 1.3 1.0 0.0 -0.2 -2.78]
array1(array1>1)=mod(abs(array1),1)
what I expected to get is array1 = [0.5 0.3 1.0 0.0 0.2 0.78]
But I get an error: =: nonconformant arguments (op1 is 1x1, op2 is 1x5). How can I fix this?
PS: I'm using Octave 5.2, which is similar to MATLAB.
This causes the "Unable to perform assignment" error (or, in Octave, the nonconformant arguments error you see) because the left and right sides of the assignment differ in size: you need to use the logical index array1>1 on both sides.
array1=[0.5 1.3 1.0 0.0 -0.2]
% create logical vector for indexing
lg = array1 > 1
% replace elements
array1(lg) = mod( abs(array1(lg)) ,1)
This should work in MATLAB + Octave.
You can also split the different operations:
% make all values positive
array1 = abs(array1);
% keep only the decimal part of values greater than one
lg = array1 > 1;
array1(lg) = mod(array1(lg), 1);
This returns
array1 = 0.5000 0.3000 1.0000 0 0.2000
If you absolutely want to stick to your approach, you can use a little trick: add +1e-10 to the second input of the mod function to let 1 "survive" the operation ;)
array1 = mod( abs(array1) ,1+1e-10)
This trick will yield slightly different results because the modulus is 1.0000000001 and not 1. The larger the input number, the larger the error. However, judging from your example array, this should be acceptable.
max's answer got me where I needed to go; here's what I used:
array1=[0.5 1.3 1.0 0.0 -0.2 -2.63]
array1=abs(array1) %1) make array positive
lg = array1 > 1 %2) create logical vector for indexing
array1(lg) = mod( abs(array1(lg)) ,1) %3) replace elements
array1 =
0.50000 1.30000 1.00000 0.00000 -0.20000 -2.63000
array1 =
0.50000 1.30000 1.00000 0.00000 0.20000 2.63000
lg =
0 1 0 0 0 1
array1 =
0.50000 0.30000 1.00000 0.00000 0.20000 0.63000

Bounds Error in Julia When Working with Arrays

I'm trying to simulate a 3D random walk in Julia as a way to learn the ropes of Julia programming. I define all my variables and then initialize an (n_steps X 3) array of zeros that I want to use to store my coordinates when I do the walk. Here, "n_steps" is the number of steps in the walk, and the three columns correspond to the x, y, and z coordinates. When I try to update the array with my new coordinates, I get an error:
ERROR: LoadError: BoundsError: attempt to access 100×3 Array{Float64,2} at index [0, 1]
I don't understand why I'm getting this error. As far as I know, I'm looping through all the rows of the array and updating the x, y, and z coordinates. I never mentioned the index 0, as I specified that the loop start at row number 1 in my code. What is going on? Here is my code so far (I haven't plotted yet, since I can't progress further without resolving this problem):
using Plots
using Random
len_step = 1
θ_min, θ_max = 0, pi
ϕ_min, ϕ_max = 0, 2 * pi
n_steps = 100
init = zeros(Float64, n_steps, 3)
for jj = 1:1:length(init)
    θ_rand = rand(Float64) * (θ_max - θ_min)
    ϕ_rand = rand(Float64) * (ϕ_max - ϕ_min)
    x_rand = len_step * sin(θ_rand) * cos(ϕ_rand)
    y_rand = len_step * sin(θ_rand) * sin(ϕ_rand)
    z_rand = len_step * cos(θ_rand)
    init[jj, 1] += init[jj-1, 1] + x_rand
    init[jj, 2] += init[jj-1, 2] + y_rand
    init[jj, 3] += init[jj-1, 3] + z_rand
end
print(init)
If it's relevant, I'm running Julia Version 1.4.2 on 64-Bit on Windows 10. I'd greatly appreciate any help. Thanks.
The function length returns the total number of elements of an array, as if it were one-dimensional. What you want is size:
julia> init = zeros(3,5)
3×5 Array{Float64,2}:
0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0
julia> length(init)
15
julia> size(init)
(3, 5)
julia> size(init, 2)
5
julia> size(init, 1)
3
Note also that in Julia, array indices start at 1, and since you access index jj-1, you cannot start the loop at 1.
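For illustration, here is one way the loop could be written so that it never touches row 0 (a minimal sketch of my own, assuming the walk starts at the origin stored in row 1, which the zeros call already provides):
using Random
len_step = 1
θ_min, θ_max = 0, pi
ϕ_min, ϕ_max = 0, 2 * pi
n_steps = 100
init = zeros(Float64, n_steps, 3)
# loop over the rows (size(init, 1)), starting at 2 so that jj-1 is always a valid row
for jj = 2:size(init, 1)
    θ_rand = rand() * (θ_max - θ_min)
    ϕ_rand = rand() * (ϕ_max - ϕ_min)
    # new position = previous position + one random step of length len_step
    init[jj, 1] = init[jj-1, 1] + len_step * sin(θ_rand) * cos(ϕ_rand)
    init[jj, 2] = init[jj-1, 2] + len_step * sin(θ_rand) * sin(ϕ_rand)
    init[jj, 3] = init[jj-1, 3] + len_step * cos(θ_rand)
end
print(init)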

Julia way to write k-step look ahead function?

Suppose I have two arrays representing a probabilistic graph:
    2
   / \
 1 -> 4 -> 5 -> 6 -> 7
   \ /
    3
Where the probability of going to state 2 is 0.81 and the probability of going to state 3 is (1-0.81) = 0.19. My arrays represent the estimated values of the states as well as the rewards. (Note: Each index of the array represents its respective state)
V = [0, 3, 8, 2, 1, 2, 0]
R = [0, 0, 0, 4, 1, 1, 1]
The context doesn't matter so much; it's just to give an idea of where I'm coming from. I need to write a k-step look-ahead function where I sum the discounted values of the rewards and add them to the estimated value of the k-th state.
I have been able to do this so far by creating separate functions for each step of look-ahead. My goal in asking this question is to figure out how to refactor this code so that I don't repeat myself, using idiomatic Julia.
Here is an example of what I am talking about:
function E₁(R::Array{Float64,1}, V::Array{Float64,1}, P::Float64)
    V[1] + 0.81*(R[1] + V[2]) + 0.19*(R[2] + V[3])
end
function E₂(R::Array{Float64,1}, V::Array{Float64,1}, P::Float64)
    V[1] + 0.81*(R[1] + R[3]) + 0.19*(R[2] + R[4]) + V[4]
end
function E₃(R::Array{Float64,1}, V::Array{Float64,1}, P::Float64)
    V[1] + 0.81*(R[1] + R[3]) + 0.19*(R[2] + R[4]) + R[5] + V[5]
end
.
.
.
So on and so forth. It seems that if I were to ignore E₁(), this would be exceptionally easy to refactor. But because I have to discount the value estimate at two different states, I'm having trouble thinking of a way to generalize this to k steps.
Obviously I could write a single function that takes an integer and then uses a bunch of if-statements, but that doesn't seem in the spirit of Julia. Any ideas on how I could refactor this? A closure of some sort? A different data type to store R and V?
It seems like you essentially have a discrete Markov chain. So the standard way would be to store the graph as its transition matrix:
T = zeros(7,7)
# T[i,j] is the probability of moving from state i to state j
T[1,2] = 0.81
T[1,3] = 0.19
T[2,4] = 1
T[3,4] = 1
T[4,5] = 1
T[5,6] = 1
T[6,7] = 1
Then you can calculate the probabilities of ending up at each state, given an initial distribution, by multiplying with T' from the left (the transpose is needed because T is defined here row-wise, i.e. T[i,j] is the probability of going from i to j, while the column-vector convention expects the transposed matrix):
julia> T' * [1,0,0,0,0,0,0] # starting from (1)
7-element Array{Float64,1}:
0.0
0.81
0.19
0.0
0.0
0.0
0.0
Likewise, the probability of ending up at each state after k steps can be calculated by using powers of T':
julia> T' * T' * [1,0,0,0,0,0,0]
7-element Array{Float64,1}:
0.0
0.0
0.0
1.0
0.0
0.0
0.0
Now that you have all the probabilities after k steps, you can easily calculate expectations as well. It might also pay off to define T as a sparse matrix.
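As an illustration of how this could be rolled into a single k-step look-ahead function, here is a rough sketch of my own (not from the answer above); exactly how the rewards R and the values V from the question enter the sum is an assumption, so adapt it to the E₁, E₂, ... definitions, and the name lookahead is hypothetical:
using LinearAlgebra
# Accumulate the expected reward collected at each of the first k steps, then add
# the expected value of the state reached after k steps (no discounting here; add
# a discount factor if you need one). T, R and V are as defined above, p0 is the
# initial state distribution.
function lookahead(T::AbstractMatrix, R::AbstractVector, V::AbstractVector,
                   p0::AbstractVector, k::Integer)
    p = float(p0)
    total = 0.0
    for _ in 1:k
        p = T' * p            # state distribution after one more step
        total += dot(R, p)    # expected reward collected at that step
    end
    return total + dot(V, p)  # plus the expected value of the k-th state
end
# starting from state 1 with certainty, look two steps ahead
lookahead(T, R, V, [1, 0, 0, 0, 0, 0, 0], 2)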

Correct way to get weighted average of concrete array values along a continuous interval

I've been searching the web for a while, but I am probably missing the right terminology.
I have arbitrarily sized arrays of scalars:
array = [n_0, n_1, n_2, ..., n_m]
I also have a function f: x -> y, with 0 <= x <= 1, where y is a value interpolated from array. Examples:
array = [1,2,9]
f(0) = 1
f(0.5) = 2
f(1) = 9
f(0.75) = 5.5
My problem is that I want to compute the average value over some interval r = [a..b], where a ∈ [0..1] and b ∈ [0..1], i.e. I want to generalize my interpolation function f: x -> y to compute the average along r.
I'm struggling slightly to find the right weighting. Imagine I want to compute f([0.2, 0.8]):
array     -->    1      |      2      |      9
[0..1]    -->  0.00   0.25   0.50   0.75   1.00
[0.2,0.8] -->       ^_____________________^
The latter being the range of values I want to compute the average of.
Would it be mathematically correct to compute the average like this?

        1 * (1-0.8)   <- 0.2 'translated' to [0..0.25]
      + 2 * 1
avg = + 9 * 0.2       <- 0.8 'translated' to [0.75..1]
      -----------
          1.4         <-- the sum of weights
This looks correct.
In your example, your interval's length is 0.6. In that interval, your number 2 takes up (0.75-0.25)/0.6 = 0.5/0.6 = 10/12 of the space. Your number 1 takes up (0.25-0.2)/0.6 = 0.05/0.6 = 1/12 of the space, and likewise your number 9.
This sums up to 10/12 + 1/12 + 1/12 = 1.
For better intuition, think about it like this: the problem is to determine how much space each array element covers along the interval. The rest is just plugging into the machinery described in http://en.wikipedia.org/wiki/Weighted_average#Mathematical_definition .
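A small Julia sketch of this weighting (my own illustration, not part of the original answer; the name segment_average is hypothetical, and I assume each element owns an equal-width cell of [0, 1] centred on its sample position):
# Weighted average of `array` over [a, b], weighting each element by how much of
# [a, b] falls into the cell that the element covers.
function segment_average(array::AbstractVector, a::Real, b::Real)
    m = length(array)
    pos = range(0, 1; length = m)     # sample positions 0, 1/(m-1), ..., 1
    half = step(pos) / 2              # cell boundaries sit halfway between samples
    total = 0.0
    total_weight = 0.0
    for (i, p) in enumerate(pos)
        lo = max(a, p - half)         # clip the cell to [a, b]
        hi = min(b, p + half)
        w = max(hi - lo, 0.0)         # overlap length is the weight
        total += w * array[i]
        total_weight += w
    end
    return total / total_weight
end
segment_average([1, 2, 9], 0.2, 0.8)  # 2.5, matching the 1/12, 10/12, 1/12 weights above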

Finding the row with max separation between elements of an array in matlab

I have an array of size m x n. Each row has n elements, each of which represents a probability (between 0 and 1). I want to find the row with the largest differences between its elements; ideally its nonzero elements should also be large.
For example in array Arr:
Arr = [0.1 0 0.33 0 0.55 0;
0.01 0 0.10 0 0.2 0;
1 0.1 0 0 0 0;
0.55 0 0.33 0 0.15 0;
0.17 0.17 0.17 0.17 0.17 0.17]
the best row would be the 3rd row, because its values are both more distinct and larger. How can I compute this using MATLAB?
It seems that you're looking for the row with the greatest standard deviation, which is basically a measure of how much the values vary from the average.
If you want to ignore zero elements, use Shai's useful suggestion to replace zero elements with NaN. Indeed, some of MATLAB's built-in functions allow ignoring them:
Arr2 = Arr;
Arr2(~Arr) = NaN;
To find the standard deviation we'll employ nanstd (not std, which does not ignore NaN values) along the rows, i.e. the 2nd dimension:
nanstd(Arr2, 0, 2)
To find the greatest standard deviation and its corresponding row index, we'll apply nanmax and obtain both output variables:
[stdmax, idx] = nanmax(nanstd(Arr2, 0, 2));
Now idx holds the index of the desired row.
Example
Let's run this code on the input that you provided in your question:
Arr = [0.1 0 0.33 0 0.55 0;
0.01 0 0.10 0 0.2 0;
1 0.1 0 0 0 0;
0.55 0 0.33 0 0.15 0;
0.17 0.17 0.17 0.17 0.17 0.17];
Arr2 = Arr;
Arr2(~Arr) = NaN;
[maxstd, idx] = nanmax(nanstd(Arr2, 0, 2))
idx =
3
Note that the values in row #3 differ one from another much more than those in row #1, and therefore the standard deviation of row #3 is greater. This also corresponds to your comment:
... ergo a row with 3 zero and 3 non-zero but close values is worse than a row with 4 zeros and 2 very different values.
For this reason I believe that in this case 3 is indeed the correct answer.
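As a quick sanity check (my own arithmetic, not part of the original answer): the sample standard deviation of row #3's nonzero elements [1, 0.1] is |1 - 0.1|/sqrt(2) ≈ 0.636, while that of row #1's nonzero elements [0.1, 0.33, 0.55] is only ≈ 0.225, so nanstd indeed ranks row #3 first.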
It seems like you wish to ignore 0s in your matrix. You may achieve this by setting them to NaN and proceeding with the special built-in functions that ignore NaNs (e.g., nanmin, nanmax, etc.).
Here is sample code for finding the row (ri) with the largest difference between the minimal (nonzero) response and the maximal response:
nArr = Arr;
nArr( Arr == 0 ) = NaN;      % replace zeros with NaNs
mn = nanmin(nArr, [], 2);    % minimal nonzero response in each row
mx = nanmax(nArr, [], 2);    % maximal response in each row
[~, ri] = nanmax( mx - mn ); % find the row with the maximal difference
