How to add into map in range loop - loops

package main

import (
	"fmt"
)

func main() {
	m := make(map[int]int, 4)
	m[1] = 0
	m[2] = 0
	for k, _ := range m {
		i := 10 + k
		m[i] = 0
	}
	fmt.Println(m)
	fmt.Println("len:", len(m))
}
This code prints 6, 8, or 10 as the length of the map after the loop.
Video is here, playground here.
I see that newly added elements are picked up by the range, but I can't explain why the loop stops at a random point.

Spec: For statements:
The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If a map entry that has not yet been reached is removed during iteration, the corresponding iteration value will not be produced. If a map entry is created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next. If the map is nil, the number of iterations is 0.
The spec states that if you add entries to the map you are ranging over, the added entries may or may not be visited by the loop. Moreover, which ones are visited is not deterministic and may change from one run to the next.

You are modifying the map you are iterating over. That is the cause.
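If the goal is to add exactly one derived entry per original key, a common way to get a deterministic result is to take a snapshot of the keys before the loop and iterate over the snapshot instead of the live map. The following is a minimal sketch of that idea, written in Python rather than Go; the function name add_shifted_keys and its offset parameter are made up for illustration. Note that Python, unlike Go, raises a RuntimeError when a dict changes size during iteration, so the snapshot is mandatory there.

```python
def add_shifted_keys(m, offset=10):
    """Add key + offset for every key that existed before the loop started."""
    # list(m) copies the keys up front, so entries added below
    # are never visited by this loop.
    for k in list(m):
        m[k + offset] = 0
    return m


m = {1: 0, 2: 0}
add_shifted_keys(m)
print(m)               # {1: 0, 2: 0, 11: 0, 12: 0}
print("len:", len(m))  # len: 4
```

With the snapshot the length is always 4, because only the two original keys are ever visited.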

Related

Why doesn't my code compare the first and last number?

I have to compare up to 5 numbers in my script. That also works quite well. However, my first and last number are not compared. What's the problem in my code?
local loaded = game.Workspace.TrommelValue.Value
local randomkugel = {}
for i = 0,loaded-1 do
    randomkugel[i] = math.random(1,6)
    for index = 0, i+1 do
        if i ~= index then
            if randomkugel[i]==randomkugel[index] then
                randomkugel[i]=math.random(1,6)
                index = 0
            end
        end
    end
end
Thank you for helping me!
I don't know what game.Workspace.TrommelValue.Value is, but let's assume it's a positive number.
First iteration of the outer loop (i is 0):
-- assign a random integer from [1-6] to `randomkugel[0]`
randomkugel[0] = math.random(1,6)
Now the inner loop runs for the first time. The first cycle (index is 0) is skipped because i == index.
The second cycle (index is 1) does nothing because randomkugel[i] is the random number and randomkugel[index] is nil.
Second iteration of the outer loop: i is 1, and randomkugel[1] is assigned a random value from [1-6].
Inner loop:
First cycle: i is 1 and index is 0, so the first if statement is entered.
randomkugel[1] may happen to equal randomkugel[0]. In that case you assign a new random value to randomkugel[1] and set index to 0, which has no effect.
If the two values are not equal, nothing happens.
Second cycle: index is 1, which equals i, so it is skipped.
Last cycle: i is still 1 and index is 2.
As there is no randomkugel[2] yet, this cycle does nothing.
You will always skip the last cycle of the inner loop, because you are comparing against nil every time.
So your inner loop is effectively
for index = 0, i do
    if i ~= index then
        if randomkugel[i]==randomkugel[index] then
            randomkugel[i]=math.random(1,6)
        end
    end
end
I guess your greatest misconception here is that you can somehow reset the loop counter variable index inside the loop body.
Any assignment to index only lasts for the rest of the current iteration; the next iteration continues with the next value in the sequence.
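The same rule holds for Python's for loop, so it can be shown with a short sketch in Python rather than Lua. Assuming the goal is a set of distinct random values, an explicit re-draw loop replaces the attempt to rewind the counter. The helper name draw_distinct and its parameters are hypothetical and not part of the original script.

```python
import random


def draw_distinct(count, low=1, high=6, rng=random):
    """Draw `count` distinct integers from [low, high], re-drawing on collision."""
    if count > high - low + 1:
        raise ValueError("not enough distinct values in the range")
    drawn = []
    for _ in range(count):
        value = rng.randint(low, high)
        # Re-draw until the value differs from every earlier one.
        while value in drawn:
            value = rng.randint(low, high)
        drawn.append(value)
    return drawn


# Reassigning the loop variable only affects the current iteration.
visited = []
for index in range(3):
    visited.append(index)
    index = 0  # has no effect on the next iteration
print(visited)  # [0, 1, 2]

print(draw_distinct(5))  # five distinct values from 1..6; the order varies
```

The while loop re-checks every new draw against all earlier values, which is what resetting index to 0 was meant to achieve.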

Delete and sort elements in a object [duplicate]

I have a list of strings and I want to keep only the most unique strings. Here is how I have implemented this (maybe there's an issue with the loop),
def filter_descriptions(descriptions):
    MAX_SIMILAR_ALLOWED = 0.6 #40% unique and 60% similar
    i = 0
    while i < len(descriptions):
        print("Processing {}/{}...".format(i + 1, len(descriptions)))
        desc_to_evaluate = descriptions[i]
        j = i + 1
        while j < len(descriptions):
            similarity_ratio = SequenceMatcher(None, desc_to_evaluate, descriptions[j]).ratio()
            if similarity_ratio > MAX_SIMILAR_ALLOWED:
                del descriptions[j]
            j += 1
        i += 1
    return descriptions
Please note that the list might have around 110K items which is why I am shortening the list every iteration.
Can anyone please identify what is wrong with this current implementation?
Edit 1:
The current results are "too similar". The filter_descriptions function returned 16 items (from a list of ~110K items). When I tried the following,
SequenceMatcher(None, descriptions[0], descriptions[1]).ratio()
The ratio was 0.99, and with SequenceMatcher(None, descriptions[1], descriptions[2]).ratio() it was around 0.98. But with SequenceMatcher(None, descriptions[0], descriptions[15]).ratio() it was around 0.65 (which is better)
I hope this helps.
If you invert your logic, you can avoid modifying the list in place and still reduce the number of comparisons needed. That is, start with an empty output (unique) list and iterate over your descriptions, checking whether each one can be added. The first description can be added immediately, as it cannot be similar to anything in an empty list. The second description only needs to be compared to the first, as opposed to all other descriptions. Later iterations can short-circuit as soon as they find a previously accepted description that the candidate is similar to, in which case the candidate is discarded. I.e.:
import operator

def unique(items, compare=operator.eq):
    # compare is a function that returns True if its two arguments are deemed similar to
    # each other and False otherwise.
    unique_items = []
    for item in items:
        if not any(compare(item, uniq) for uniq in unique_items):
            # any will stop as soon as compare(item, uniq) returns True
            # you could also use `if all(not compare(item, uniq) ...` if you prefer
            unique_items.append(item)
    return unique_items
Examples:
assert unique([2,3,4,5,1,2,3,3,2,1]) == [2, 3, 4, 5, 1]
# note that order is preserved
assert unique([1, 2, 0, 3, 4, 5], compare=(lambda x, y: abs(x - y) <= 1)) == [1, 3, 5]
# using a custom comparison function we can exclude items that are too similar to previous
# items. Here 2 and 0 are excluded because they are too close to 1 which was accepted
# as unique first. Change the order of 3 and 4, and then 5 would also be excluded.
With your code your comparison function would look like:
from difflib import SequenceMatcher

MAX_SIMILAR_ALLOWED = 0.6 #40% unique and 60% similar

def description_cmp(candidate_desc, unique_desc):
    # use unique_desc as first arg as this keeps the argument order the same as with your filter
    # function where the first description is the one that is retained if the two descriptions
    # are deemed to be too similar
    similarity_ratio = SequenceMatcher(None, unique_desc, candidate_desc).ratio()
    return similarity_ratio > MAX_SIMILAR_ALLOWED

def filter_descriptions(descriptions):
    # This would be the new definition of your filter_descriptions function
    return unique(descriptions, compare=description_cmp)
The number of comparisons should be exactly the same. That is, in your implementation the first element is compared to all the others, and the second element is only compared to elements that were deemed not similar to the first element and so on. In this implementation the first item is not compared to anything initially, but all other items must be compared to it to be allowed to be added to the unique list. Only items deemed not similar to the first item will be compared to the second unique item, and so on.
The unique implementation will do less copying as it only has to copy the unique list when the backing array runs out of space. Whereas, with the del statement parts of the list must be copied each time it is used (to move all subsequent items into their new correct position). This will likely have a negligible impact on performance though, as the bottleneck is probably the ratio calculation in the sequence matcher.
The problem with your logic is that each time you delete an item from the list, the remaining items are re-indexed and the next string is skipped. For example, assume this is the list:
descriptions: ["A","A","A","B","C"]
Iteration 1:
i=0 -------------0
descriptions[i]="A"
j=i+1 -------------1
descriptions[j]="A"
similarity_ratio>0.6
del descriptions[j]
Now the list is re-indexed like this:
descriptions: ["A","A","B","C"]. The next step is:
j=j+1 ------------1+1= 2
descriptions[2]="B"
You have skipped descriptions[1]="A"
To fix this:
Replace
j+=1
With
j=i+1
if an item was deleted. Otherwise do the normal j=j+1 step.
The value of j should not change when an item from the list is deleted (since a different list item will be present on that spot in the next iteration). Doing j=i+1 restarts the iteration every time an item is deleted (which is not what is desired). The updated code now only increments j in the else condition.
def filter_descriptions(descriptions):
    MAX_SIMILAR_ALLOWED = 0.6 #40% unique and 60% similar
    i = 0
    while i < len(descriptions):
        print("Processing {}/{}...".format(i + 1, len(descriptions)))
        desc_to_evaluate = descriptions[i]
        j = i + 1
        while j < len(descriptions):
            similarity_ratio = SequenceMatcher(None, desc_to_evaluate, descriptions[j]).ratio()
            if similarity_ratio > MAX_SIMILAR_ALLOWED:
                del descriptions[j]
            else:
                j += 1
        i += 1
    return descriptions
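To see the skipped element concretely, here is a small sketch that runs both loop shapes on the ["A","A","A","B","C"] example from above. Plain equality stands in for the SequenceMatcher ratio so that the result is easy to check by hand. The names dedupe_buggy, dedupe_fixed and same are made up for this illustration.

```python
def dedupe_buggy(items, similar):
    """Original loop shape: j advances even after a deletion."""
    items = list(items)  # work on a copy
    i = 0
    while i < len(items):
        j = i + 1
        while j < len(items):
            if similar(items[i], items[j]):
                del items[j]
            j += 1  # skips the element that slid into position j
        i += 1
    return items


def dedupe_fixed(items, similar):
    """Corrected loop shape: j advances only when nothing was deleted."""
    items = list(items)  # work on a copy
    i = 0
    while i < len(items):
        j = i + 1
        while j < len(items):
            if similar(items[i], items[j]):
                del items[j]
            else:
                j += 1
        i += 1
    return items


def same(a, b):
    return a == b


data = ["A", "A", "A", "B", "C"]
print(dedupe_buggy(data, same))  # ['A', 'A', 'B', 'C']
print(dedupe_fixed(data, same))  # ['A', 'B', 'C']
```

The buggy version keeps a second "A" because the element that moved into the freed slot is never compared.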

How to find a number that is repeated (n/3) times in an array of size n, in O(n) time and O(n) space?

I have this question that I just can't figure out! Any hints would mean a lot. Thank you in advance.
I have an array, A. Its size is n, and I want to find an algorithm that finds an x that appears in this array at least n/3 times. If there is no such x in the array, then we print that we didn't find one!
I need to find an algorithm that does this in O(n) time and takes O(n) space.
For example:
A=[1 1 2 2 1 1 1 5 6 7]
For the above array, the algorithm should return 1.
If I were you, I would write an algorithm that:
Instantiates a map (i.e. key/value pairs) in whatever language you're using. The key will be the integer you find; the value will be the number of times it has been seen so far.
Iterates over the array. For the current integer, check whether the number exists as a key in your map. If it exists, increment the map's value. If it doesn't exist, insert a new element with a count of 1.
After the iteration is complete, iterates over your map. If any element has a count greater than n/3, print it out. Handle the case where none are found, etc.
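The steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function name find_frequent is made up, and the threshold is taken as "at least n/3" from the question, which means up to three values can qualify.

```python
from collections import Counter


def find_frequent(values):
    """Return every value that occurs at least len(values) / 3 times."""
    n = len(values)
    counts = Counter(values)  # one O(n) pass, O(n) extra space
    # Comparing 3 * count with n avoids floating-point division.
    return [value for value, count in counts.items() if 3 * count >= n]


print(find_frequent([1, 1, 2, 2, 1, 1, 1, 5, 6, 7]))  # [1]
print(find_frequent([1, 2, 3, 4, 5, 6, 7]))           # []
```

An empty result corresponds to the "we didn't find one" case from the question.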
Here is my solution in pseudocode; note that it is possible to have two solutions as well as one or none:
func anna(A, n)                        # array and length
    ht := {}                           # create empty hash table
    for k in [0,n)                     # iterate over array
        if A[k] in ht                  # previously seen
            ht{A[k]} := ht{A[k]} + 1   # increment count
        else                           # not previously seen
            ht{A[k]} := 1              # initialize count
    solved := False                    # flag if solution found
    for k in keys(ht)                  # iterate over hash table
        if ht{k} > n / 3               # found solution
            solved := True             # update flag
            print k                    # write it
    if not solved                      # no solution found
        print "No solution"            # report failure
The first for loop takes O(n) time. The second for loop potentially takes O(n) time if all items in the array are distinct, though most often the second for loop will take much less time. The hash table takes O(n) space if all items in the array are distinct, though most often it takes much less space.
It is possible to optimize the solution so it stops early and reports failure if there are no possible solutions. To do that, keep a variable max in the first for loop, increment it every time it is exceeded by a new hash table count, and check after each element is added to the hash table if max + n - k < n / 3.
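The early-stop idea can be sketched in Python like this. It is a sketch under stated assumptions: the function name find_frequent_early_stop and the variable names best and remaining are made up, the threshold is "at least n/3" as in the question, and the check uses the exact number of elements still unseen (n - k - 1), which is slightly tighter than the bound given above.

```python
def find_frequent_early_stop(values):
    """Like a plain counting pass, but gives up once no value can reach n/3."""
    n = len(values)
    counts = {}
    best = 0  # highest count seen so far
    for k, value in enumerate(values):
        counts[value] = counts.get(value, 0) + 1
        best = max(best, counts[value])
        remaining = n - k - 1  # elements not yet examined
        # Even if every remaining element matched the current leader,
        # it would still fall short of n/3, so no solution can exist.
        if 3 * (best + remaining) < n:
            return []
    return [value for value, count in counts.items() if 3 * count >= n]


print(find_frequent_early_stop([1, 1, 2, 2, 1, 1, 1, 5, 6, 7]))  # [1]
print(find_frequent_early_stop(list(range(30))))                 # []
```

For the 30 distinct values the loop stops at index 21, where the leader's count of 1 plus the 8 remaining elements can no longer reach 10.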

Filling a row and columns of a ndarray with a loop

I'm starting with Python and I have a basic question about the "for" loop.
I have two arrays which contain values of the same variables:
A = data_lac[:,0]
The first array holds values of area and the second one holds values of mean depth.
I would like to find a way to automate my calculation for different values of a parameter. The equation is the following one:
g= (np.sqrt(A/pi))/n
Here I can calculate my "g" for each row. Now I want a loop over different values of "n". I did this:
i=0
while i <= len(A)-1:
    for n in range(2,6):
        g[i] = (np.sqrt(A[i]/pi))/n
        i += 1
        break
In this case, I just get one column with the calculation for n = 2 but not the following ones. I tried to add a second dimension to my array, but I get an error message saying that I have too many indices for the array.
In other words, I would like this array:
g[len(A),5]
g has 5 columns, each one calculated with a different "n".
Any tips would be very helpful.
Thanks
Update of the code:
data_lac=np.zeros((106,7))
data_lac[:,0:2]=np.loadtxt("/home...", delimiter=';', skiprows=1, usecols=(0,1))
data_lac[:,1]=data_lac[:,1]*0.001
#Initialisation
A = data_lac[:,0]
#example for A with 4 elements
A=[2.1, 32.0, 4.6, 25]
g = np.zeros((len(A),))
I believe you are sharing the index between both loops: you were incrementing i (the index of the outer while loop) inside the inner for loop (which iterates over n).
I guess you have A (a 1-D array) and want to produce g (a 2-D array) of shape (len(A), 4), one column per value of n in range(2, 6).
I am not sure I fully understand your required output, but I believe you want something like:
i=0
while i <= len(A)-1:
    for n in range(2,6):
        g[i][n-2] = (np.sqrt(A[i]/pi))/n # n-2 maps n = 2..5 to column indexes 0..3
    i += 1 # note that i is incremented in the outer while loop
Important: indentation is significant in Python, so make sure i += 1 is at the level of the while body and not indented into the for loop. There should also be no break, since it would exit the loop after the first pass.
Note that g should be defined as:
g = np.zeros((len(A),4), dtype=float)
The way you defined it (without the 4) makes it a 1-D array, not a 2-D one.
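As an alternative to explicit loops, the same table can be built in one expression with NumPy broadcasting. This is a sketch assuming A is the 4-element example from the question and that n runs over range(2, 6) as in the loop; np.pi is used in place of the pi name from the question.

```python
import numpy as np

A = np.array([2.1, 32.0, 4.6, 25.0])  # example areas from the question
n = np.arange(2, 6)                   # n = 2, 3, 4, 5

# A[:, None] has shape (4, 1) and n has shape (4,), so the division
# broadcasts to shape (len(A), len(n)): one row per area, one column per n.
g = np.sqrt(A[:, None] / np.pi) / n

print(g.shape)  # (4, 4)
```

Each column g[:, j] then holds the values for n = j + 2, matching the n-2 column index used in the loop version.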

Loop through items and sum items in SPSS

I have two sets of variables called ITEM 1 to ITEM 47, and another called L1 to L47. What I want to do is to calculate the sum of Ls if any ITEM#i=1. What I wrote is as following:
COMPUTE LSUM=0.
LOOP #i=1 to 47.
IF (ITEM(#i)=1) LSUM=LSUM+L(#i).
END LOOP.
But I got an error message saying the characters do not match any existing function or vector. What should I do then? Your inputs will be very appreciated.
Thanks.
Sincerely,
Lucy
COMPUTE LSUM=0.
exe.
vector vitems = ITEM 1 to ITEM 47.
vector vl = L1 to L47.
LOOP #vecid = 1 to 47.
do IF ( vitems(#vecid) eq 1 and not missing(vl(#vecid)) ).
compute LSUM=LSUM+vl(#vecid).
end if.
END LOOP.
exe.
See the VECTOR command in SPSS. You cannot just write a loop and treat variables as if they were an array; they must first be put into vectors. Also, check the COMPUTE command. I think SUM would be more appropriate, because if you write "compute v1 = v2 + v3" and v2 has data but v3 is blank, v1 will be blank.
