Why are the current and "old" values of the same expression in a postcondition equal? - eiffel

I'm trying to get the value of an entry in a 2D-array before the implementation and afterwards. But the following postcondition is failing because the 2 entries are somehow the same (and yes, I have redefined is_equal, so that ~ will be object equality):
ensure
designated_cell_changed:
get_entry (row + 1, column + 1) /~ old get_entry (row + 1, column + 1)
Why do I get a postcondition violation designated_cell_changed?

There could be several reasons:
It's suspicious that the indexes are row + 1 and column + 1 instead of row and column.
If the feature in question takes a new value explicitly, e.g. put (value: G; row, column: ...), it should have a precondition
require
different_value: value /~ entry (row, column)
Side note: for queries it's recommended to use nouns or adjectives, not verbs, thus entry instead of get_entry.
If the feature does not take a new value as an argument, it should update the corresponding value itself.
There could be mistakes in the code of the feature:
It does not change the value all the time (e.g., in some conditional branches).
It changes the value but at some other indices.
If the values of entry (row + 1, column + 1) at the beginning and at the end of the feature are different, the implementation of is_equal may miss some cases that make the objects different.

Related

Excel: #CALC! error (Nested Array) when using MAP functions for counting interval overlaps

I am struggling with the following formula, it works for some scenarios but not in all of them. The name input has the data set that is failing, getting an #CALC! error with the description "Nested Array":
=LET(input, {"N1",0,0;"N1",0,10;"N1",10,20},
names, INDEX(input,,1), namesUx, UNIQUE(names), dates, FILTER(input, {0,1,1}),
byRowResult, BYROW(namesUx, LAMBDA(name,
LET(set, FILTER(dates, names=name),
startDates, INDEX(set,,1), endDates, INDEX(set,,2), onePeriod, IF(ROWS(startDates)=1, TRUE, FALSE),
IF(onePeriod, IF(startDates <= IF(endDates > 0, endDates, startDates + 1),0, 1),
LET(seq, SEQUENCE(ROWS(startDates)),
mapResult, MAP(startDates, endDates, seq, LAMBDA(start,end,idx,
LET(incIdx, 1-N(ISNUMBER(XMATCH(seq,idx))),
startInc, FILTER(startDates, incIdx), endInc, FILTER(endDates, incIdx),
MAP(startInc, endInc,LAMBDA(ss,ee, N(AND(start <= ee, end >= ss))))
))),
SUM(mapResult)))
))), HSTACK(namesUx, byRowResult)
)
If we replace the input values in the previous formula with the range A2:C4, the expected output would appear in G1:H1. [Screenshot: input range, expected output, and a graphical representation of the intervals and their overlap; there are 2 overlaps.]
If we use the above formula for the same range, we get a #CALC! error. [Screenshot: formula output.]
Hovering over the #CALC! cell shows the specific error, "Nested Array". [Screenshot: error tooltip.]
Let's explain the input data and what the formula does:
Input data
First column: N1, N2, N3, represents names
Second Column: Start of the interval (I am using numeric values, but in my real situation will be dates)
Third Column: End of the interval (I am using numeric values, but in my real situation will be dates)
Formula
The purpose of the formula is to identify for each unique names, how many intervals overlap. The calculation goes by each row (BYROW) of the unique names and for each pair of start-end values, counts the overlaps with respect to the other start-end values. I use FILTER to exclude the current start-end pair with the following condition: FILTER(startDates, incIdx) and I tested it works properly.
The condition to exclude the start data of the current name of the iteration of BYROW is the following:
1-N(ISNUMBER(XMATCH(seq,idx)))
and used as second input argument of the FILTER function.
The rest is just to check the overlap range condition.
I separate the case where a name has only one interval from the rest because the calculation is different. For a single interval I just want to check that the end date comes after the start date and treat the special case of 0. I tested this particular case and it works.
Testing and workarounds
I have already isolated where the issue is and when it happens. The problem occurs in the following call:
MAP(startInc, endInc,LAMBDA(ss,ee, N(AND(start <= ee, end >= ss))))
when startInc and endInc have more than one row. It has nothing to do with the content of the LAMBDA function. I can use:
MAP(startInc, endInc,LAMBDA(ss,ee, 1))
and it still fails. The problem is with the input arrays startInc and endInc. If I use any other array, for example the following one, it doesn't work either:
MAP(seq,LAMBDA(ss, 1))
I get a similar result using names, startDates, etc.; even {1;2;3} fails. If I use idx it works, because it is not an array. Therefore the error happens with any type of array or range.
I have also tested that the input arguments are correct having the correct shape and values. For example replacing the MAP function with: TEXTJOIN(",",, startInc)&" ; " (and also with endInc) and replacing SUM with CONCAT to concatenate the result.
In terms of input data I tested the following scenarios:
{"N1",0,0;"N1",0,10} -> Works
{"N1",0,0;"N1",0,10;"N2",10,0;"N2",10,20;"N3",20,10} -> Works
{"N1",0,0;"N1",0,10;"N1",10,20} -> Error
{"N1",0,0;"N1",0,10;"N1",10,0} -> Error
{"N1",0,0;"N1",0,10;"N1",10,0;"N1",20,10} -> Error
{"N1",0,0;"N1",0,10;"N2",10,0;"N2",10,20;"N2",20,10} -> Error
The cases that work do so because the inner MAP function receives an array of size 1 (no name is repeated three or more times).
I did some research on the internet about the #CALC! error, but there are not many details about it and only a very trivial case is documented. I didn't find any indication of a limit on nested calls of the new array functions: BYROW, MAP, etc.
In conclusion, it seems that the following nested structure produces this error:
=MAP({1;2;3}, LAMBDA(n, MAP({4;5;6}, LAMBDA(s, TRUE))))
even for a trivial case like this.
On the contrary, the following works:
=MAP({1;2;3}, LAMBDA(n, REDUCE("",{4;5;6}, LAMBDA(a,s, TRUE))))
because the output of REDUCE is not an array.
Any suggestions on how to circumvent this limitation in my original formula? Is this really a case where an array function cannot use another array as input? Is it a bug?
As @JosWoolley pointed out:
LAMBDA's calculation parameter should return a single value and not an
array
I hadn't seen it that way, nor deduced it from the definition of the #CALC! Nested Array error:
The nested array error occurs when you try to input an array formula
that contains an array. To resolve the error, try removing the second
array...For example, =MUNIT({1,2}) is asking Excel to return
a 1x1 array, and a 2x2 array, which isn't currently supported.
=MUNIT(2) would calculate as expected
So the alternative is to remove the second MAP call. The post "Identify overlapping dates and times in Excel" gave me an idea of how to do it: SUMPRODUCT or SUM can serve the purpose.
=LET(input, {"N1",0,0;"N1",0,10;"N1",10,20},
names, INDEX(input,,1), namesUx, UNIQUE(names), dates, FILTER(input, {0,1,1}),
byRowResult, BYROW(namesUx, LAMBDA(name,
LET(set, FILTER(dates, names=name),
startDates, INDEX(set,,1), endDates, INDEX(set,,2),
onePeriod, IF(ROWS(startDates)=1, TRUE, FALSE),
IF(onePeriod, IF(startDates <= IF(endDates > 0, endDates, startDates + 1),0, 1),
LET(seq, SEQUENCE(ROWS(startDates)),
mapResult, MAP(startDates, endDates, seq, LAMBDA(start,end,idx,
LET(incIdx, 1-N(ISNUMBER(XMATCH(seq,idx))),
startInc, FILTER(startDates, incIdx), endInc, FILTER(endDates, incIdx),
SUMPRODUCT((startInc <= end) * (endInc >= start ))
))),SUM(mapResult)))/2
))), HSTACK(namesUx, byRowResult)
)
We need to divide the result by 2 because each overlap is counted in both directions: A overlaps with B and vice versa.
It can be simplified further, because there is no need to build startInc and endInc to exclude the interval being checked. We can include it and subtract one overlap, since an interval whose start is not after its end always overlaps with itself. This is the way to do it:
=LET(input, {"N1",0,0;"N1",0,10;"N1",10,20},
names, INDEX(input,,1), namesUx, UNIQUE(names), dates, FILTER(input, {0,1,1}),
byRowResult, BYROW(namesUx, LAMBDA(name,
LET(set, FILTER(dates, names=name),
startDates, INDEX(set,,1), endDates, INDEX(set,,2),
onePeriod, IF(ROWS(startDates)=1, TRUE, FALSE),
IF(onePeriod, IF(startDates <= IF(endDates > 0,
endDates, startDates + 1),0, 1),
SUM(MAP(startDates, endDates, LAMBDA(start,end,
SUMPRODUCT((startDates <= end) * (endDates >= start ))-1)))/2)
))), HSTACK(namesUx, byRowResult)
)
Here is the output after replacing the array constant with the corresponding range A2:C4, with the previous formula in cell G2 and a graphical representation of the intervals (highlighted):
Note: Since we are using SUMPRODUCT with a single input, it can be replaced with SUM.
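To check the counting logic outside Excel, here is a minimal Python sketch of the same idea: count every match including the self-match, subtract one per interval, then halve the total. The function name count_overlaps is hypothetical, and the sketch assumes closed intervals with start <= end, as in the sample data.

```python
def count_overlaps(intervals):
    """Count unordered pairs of overlapping closed intervals."""
    total = 0
    for start, end in intervals:
        # Matches against all intervals, the current one included.
        hits = sum(1 for s, e in intervals if s <= end and e >= start)
        # Drop the self-match (assumes start <= end for this interval).
        total += hits - 1
    # Every overlapping pair was counted once from each side.
    return total // 2


# Same data as the failing input: {"N1",0,0;"N1",0,10;"N1",10,20}
intervals = [(0, 0), (0, 10), (10, 20)]
print(count_overlaps(intervals))  # 2
```

The result agrees with the 2 overlaps stated for the sample range.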

Python: Finding the row index of a value in 2D array when a condition is met

I have a 2D array PointAndTangent of dimension 8500 x 5. The data is row-wise with 8500 data rows and 5 data values for each row. I need to extract the row index of an element in 4th column when this condition is met, for any s:
abs(PointAndTangent[:,3] - s) <= 0.005
I just need the row index of the first match for the above condition. I tried using the following:
index = np.all([[abs(s - PointAndTangent[:, 3])<= 0.005], [abs(s - PointAndTangent[:, 3]) <= 0.005]], axis=0)
i = int(np.where(np.squeeze(index))[0])
which doesn't work. I get the following error:
i = int(np.where(np.squeeze(index))[0])
TypeError: only size-1 arrays can be converted to Python scalars
I am not very proficient with NumPy in Python. Any suggestions would be great. I am trying to avoid a for loop, as this is a small part of a huge simulation that I am running.
Thanks!
Possible Solution
I used the following
idx = (np.abs(PointAndTangent[:,3] - s)).argmin()
It seems to work. It returns the row index of the nearest value to s in the 4th column.
You were almost there. np.where is one of the most abused functions in numpy. Half the time, you really want np.nonzero, and the other half, you want to use the boolean mask directly. In your case, you want np.flatnonzero or np.argmax:
mask = abs(PointAndTangent[:,3] - s) <= 0.005
mask is a 1D boolean array that is True where the condition is met and False elsewhere. You can get the indices of all the True entries with flatnonzero and select the first one:
index = np.flatnonzero(mask)[0]
Alternatively, you can select the first one directly with argmax:
index = np.argmax(mask)
The solutions behave differently when no rows meet your condition. The former indexes into the result, so it will raise an error. The latter will return zero, which can also be a real result.
Both can be written as a one-liner by replacing mask with the expression that was assigned to it.
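If the no-match case matters, one way to make it explicit is to check the size of the flatnonzero result before indexing. This is only a sketch: first_match_index and the sample column are hypothetical stand-ins, with column playing the role of PointAndTangent[:, 3].

```python
import numpy as np


def first_match_index(values, s, tol=0.005):
    """Return the index of the first entry within tol of s, or None."""
    mask = np.abs(values - s) <= tol
    hits = np.flatnonzero(mask)
    # An empty result means no row met the condition.
    return int(hits[0]) if hits.size else None


# Hypothetical stand-in for PointAndTangent[:, 3]
column = np.array([0.0, 0.5, 1.0, 1.002, 2.0])

print(first_match_index(column, 1.0))  # 2
print(first_match_index(column, 5.0))  # None
# argmax on an all-False mask returns 0, which looks like a real match.
print(int(np.argmax(np.abs(column - 5.0) <= 0.005)))  # 0
```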

How do I use an across loop in post condition to compare an old array and new array at certain indices?

I have a method that shifts all the items in an array to the left by one position. In my postcondition I need to ensure that my items have shifted to the left by one. I have already compared the first element of the old array to the last element of the new array. How do I loop with across through the old array from 2 until count and through the new array from 1 until count-1, and compare them? This is my implementation so far.
items_shifted:
old array.deep_twin[1] ~ array[array.count]
and
across 2 |..| (old array.deep_twin.count) as i_twin all
across 1 |..| (array.count-1) as i_orig all
i_twin.item ~ i_orig.item
end
end
end
I expected the result to be true but instead I get a contract violation pointing to this post condition. I have tested the method out manually by printing out the array before and after the method and I get the expected result.
In the postcondition that fails, the loop cursors i_twin and i_orig iterate over sequences 2 .. array.count and 1 .. array.count - 1 respectively, i.e. their items are indexes 2, 3, ... and 1, 2, .... So, the loop performs comparisons 2 ~ 1, 3 ~ 2, etc. (at run-time, it stops on the first inequality). However, you would like to compare elements, not indexes.
One possible solution is shown below:
items_shifted:
across array as c all
c.item =
if c.target_index < array.upper then
(old array.twin) [c.target_index + 1]
else
old array [array.lower]
end
end
The loop checks that all elements are shifted. If the cursor points to the last element, it compares it against the old first element. Otherwise, it tests whether the current element is equal to the old element at the next index.
Cosmetics:
The postcondition does not assume that the array starts at 1, and uses array.lower and array.upper instead.
The postcondition does not perform a deep twin of the original array. This allows for comparing elements using = rather than ~.
Edit: To avoid potential confusion caused by precedence rules, and to highlight that comparison is performed for all items between old and new array, a better variant suggested by Eric Bezault looks like:
items_shifted:
across array as c all
c.item =(old array.twin)
[if c.target_index < array.upper then
c.target_index + 1
else
array.lower
end]
end
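The difference between comparing indexes and comparing elements can be illustrated outside Eiffel. The following Python sketch is not the contract itself; is_left_rotation and index_comparison are hypothetical helpers, the first mirroring the corrected check and the second mimicking the failing nested loop over index ranges.

```python
def is_left_rotation(old, new):
    """True if new equals old shifted left by one, first item moved to the end."""
    if len(old) != len(new):
        return False
    n = len(new)
    # new[i] must equal old[i + 1]; the last slot wraps around to old[0].
    return all(new[i] == old[(i + 1) % n] for i in range(n))


def index_comparison(old, new):
    """Mimics the failing contract: compares index values, not elements."""
    return all(i == j
               for i in range(2, len(old) + 1)
               for j in range(1, len(new)))


old = ["a", "b", "c", "d"]
new = old[1:] + old[:1]  # shift left by one position

print(is_left_rotation(old, new))  # True
print(index_comparison(old, new))  # False
```

The second function fails on its very first comparison (2 against 1), regardless of what the arrays contain.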

Using Index and Match but changing the column when I drag down instead of the row

I am using this formula to pull the first non 0 value in a column:
{=INDEX(FT!D$16:D$610,MATCH(TRUE,FT!D$16:D$610 >0,0))}
I want to be able to drag down this formula and shift the column. For example, when I drag down, the next formula would be:
{=INDEX(FT!E$16:E$610,MATCH(TRUE,FT!E$16:E$610 >0,0))}
and then:
{=INDEX(FT!F$16:F$610,MATCH(TRUE,FT!F$16:F$610 >0,0))}
Please note that I had to do control-shift-enter when applying these formulas.
I read somewhere that I may need to use offset but I couldn't get it to work.
Thanks!
The formula shown does not take the first non-0 value in a column but the first value that is greater than 0. Even text will be greater than 0, and negative values will be excluded.
But if you have the columns D up to H, then the following will do the same as your formula and is fillable downwards 5 Rows (D up to H).
{=INDEX(FT!$D$16:$H$610,MATCH(TRUE,INDEX(FT!$D$16:$H$610,0,ROW(A1))>0,0),ROW(A1))}
Row(A1) is in the first formula. While filling down it changes to Row(A2), Row(A3), ... which leads to 1, 2, 3, ..., 5 as the column parameter in INDEX.
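As a rough illustration of what the filled-down formula computes, here is a Python sketch that walks the columns of a small grid and returns the first value greater than 0 in each. The grid and the name first_positive_per_column are hypothetical, and the sketch handles numbers only; it does not model Excel's comparison of text against 0.

```python
def first_positive_per_column(grid):
    """For each column, return the first value greater than 0, or None."""
    results = []
    for col in range(len(grid[0])):  # plays the role of ROW(A1), ROW(A2), ...
        column = [row[col] for row in grid]
        results.append(next((v for v in column if v > 0), None))
    return results


# Hypothetical data standing in for FT!$D$16:$H$610
grid = [
    [0, -1, 0],
    [3, 0, 0],
    [5, 2, 0],
]
print(first_positive_per_column(grid))  # [3, 2, None]
```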

Combine two rows of cell which includes same element

For example I have a cell array like:
Column1----Column2
'aaaa'--------4
'bbbb'--------5
'cccc'---------2
'cccc'---------0
'dddd'--------0
'dddd'--------3
'eeee'--------0
'ffff'-----------0
And what I want is to merge the rows that have the same element in the first column. What I want to obtain is:
'aaaa'--------4
'bbbb'--------5
'cccc'---------2
'dddd'--------3
'eeee'--------0
'ffff'-----------0
I'm looking for an answer without for loops.
Find all completely unique strings (i.e., ffff_0 and ffff__1 are unique, but aaaa_1 and aaaa___1 are obviously not unique; apparently the underscores represent formatting?).
Once you have that, do the same thing with just the letters.
I am pretty sure you will have to do that (above) in some capacity to get your desired output, and if that's the case, I think you are right on the edge of the speed tradeoff between for loops and all that extra memory allocation and sorting through values finding unique ones.
Try this:
arr([arr{:,2}] ~= 0,:)
arr -
rows: all rows of arr such that the second column does not equal 0,
columns: all columns
Might be a syntax error in there somewhere, been a while since I used Matlab...
Edit: New answer
non_zero = transpose([arr{:,2}] ~= 0);
arr = arr(non_zero | ~ismember(arr(:,1),arr(non_zero,1)),:)
Essentially what I'm doing: Get all rows such that the right hand side is not zero, OR the left-hand side is not a member of the left-hand sides of the non-zero rows. The latter condition will only be satisfied for rows with zero with no matching left-hand side in the non-zero rows (and hence not repeats). Now keep in mind that this will still not work if you have any duplicate rows (same left and right-hand side). If that's a possibility then do this:
non_zero = transpose([arr{:,2}] ~= 0);
arr = arr(non_zero | ~ismember(arr(:,1),arr(non_zero,1)),:);
[~,U] = unique(arr(:,1));
arr = arr(U,:)
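The selection logic described above can be sketched in Python to check it against the sample data from the question. The name merge_rows is hypothetical. One assumption differs from the MATLAB version: this sketch keeps the first occurrence of each name in the original order, whereas the ordering returned by unique may differ.

```python
def merge_rows(rows):
    """Keep non-zero rows, plus zero rows whose name has no non-zero row."""
    nonzero_names = {name for name, value in rows if value != 0}
    kept = [(name, value) for name, value in rows
            if value != 0 or name not in nonzero_names]
    # Remove any remaining duplicates by name, keeping the first occurrence.
    seen = set()
    result = []
    for name, value in kept:
        if name not in seen:
            seen.add(name)
            result.append((name, value))
    return result


rows = [("aaaa", 4), ("bbbb", 5), ("cccc", 2), ("cccc", 0),
        ("dddd", 0), ("dddd", 3), ("eeee", 0), ("ffff", 0)]
for name, value in merge_rows(rows):
    print(name, value)
```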
