Defining pointers vs accessing values in an Array in Ruby - arrays

Let's say I'm passing the following array to a method:
input = [1,9,10,3,2,3,11,0,99,30,40,50]
I need to work with sets of 4 numbers from that array as follows:
OPCODE = input[0] # first of the 4 numbers
pos1_pointer = will always be opcode position + 1 position to the right
pos2_pointer = will always be opcode position + 2 positions to the right
output = will always be opcode position + 3 positions to the right
pos1 and pos2 numbers are actually pointers to the actual values (e.g. pos1_pointer = 9, one position from the opcode; its actual value is 30, the number at position 9 in the array).
How do I define the pointer based on where the OPCODE is sitting?
I've tried:
pos1_pointer = input[input[opcode] + 1] # points to 40 which is wrong (because opcode = input[0] which is 1, and it sums 9 positions to that, position 10 being value 40)
pos1_pointer = input[opcode + 1] # is also wrong because it assigns a value of 2 to it (it sums 1 to the value of opcode which is 1)

Use Iterators Rather Than Indexing
In Ruby, if you want to work with subsets of an array, there's a method for that! Array#each_slice can be used to feed your slices directly into a method, or to deconstruct them further. For example:
input = [1, 9, 10, 3, 2, 3, 11, 0, 99, 30, 40, 50]
input.each_slice(4) do |slice|
  opcode, pos1, pos2, output = slice
  pp slice
end
You could replace pp slice with a call to your method(s) of choice, passing in either the whole slice or the deconstructed values as positional or keyword arguments. Let Ruby manage the indexing so you can focus on more important things.

Why not try this. The first argument, arr, is the array. The second argument, i, is the index at which you want to start from.
The method below takes the array and the index and returns the four elements from index i through index i+3; in this case, everything between index 0 and index 3.
A double-dot range x..y runs from x up to and including y. A triple-dot range x...y starts from x and goes up to BUT excludes y.
def four(arr, i)
  return arr[i..i+3]
end
print four([1,9,10,3,2,3,11,0,99,30,40,50],0)
# input = [1,9,10,3,2,3,11,0,99,30,40,50]
# output = [1,9,10,3]
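Slicing gives you the four numbers, but the question also asks how to follow pos1 and pos2 to the values they point at. Here is a minimal sketch of that lookup, written in Python for illustration. The function name operands_at and the variable name program are hypothetical and not part of the original code. The offsets are added to the opcode's position, not to its value.

```python
def operands_at(memory, opcode_pos):
    """Return (opcode, value1, value2, output) for the group starting at opcode_pos."""
    opcode = memory[opcode_pos]
    # The numbers one and two places to the right of the opcode are pointers.
    pos1_pointer = memory[opcode_pos + 1]
    pos2_pointer = memory[opcode_pos + 2]
    output = memory[opcode_pos + 3]
    # Dereference each pointer: use it as an index into the same array.
    return opcode, memory[pos1_pointer], memory[pos2_pointer], output


program = [1, 9, 10, 3, 2, 3, 11, 0, 99, 30, 40, 50]
print(operands_at(program, 0))  # (1, 30, 40, 3)
print(operands_at(program, 4))  # (2, 3, 50, 0)
```

Both attempts in the question mix up the opcode's value with its position; passing the position explicitly keeps the two apart.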

Related

Most Efficient Algorithm to Align Multiple Ordered Sequences

I have a strange feeling this is a very easy problem to solve but I'm not finding a good way of doing this without using brute force or dynamic programming. Here it goes:
Given N arrays of ordered and monotonic values, find the set of positions for each array i1, i2 ... in that minimises pair-wise difference of values at those indexes between all arrays. In other words, find the positions for all arrays whose values are closest to each other. Multiple solutions may exist and arrays may or may not be equally sized.
If A denotes the list of all arrays, the pair-wise difference is the sum of absolute differences between the values at the chosen indexes, taken over all pairs of different arrays: $D(i_1, \dots, i_N) = \sum_{1 \le j < k \le N} |A_j[i_j] - A_k[i_k]|$
An example, 3 arrays a, b and c:
a = [20 29 30 32 33]
b = [28 29 30 32 33]
c = [10 12 28 31 32 33]
The best alignment for these arrays would be a[3] b[3] c[4] or a[4] b[4] c[5], because (32,32,32) and (33,33,33) are all equal values and therefore have the minimum pairwise difference. (Assuming array indexes start at 0.)
This is a common problem in bioinformatics that's usually solved with Dynamic Programming, but because these are ordered sequences, I think there's a way of exploiting that order. I first thought about doing this pairwise, but that does not guarantee the global optimum, because the best local answer might not be the best global answer.
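As a reference point, the example can be checked by brute force. The sketch below is not the efficient algorithm being asked for; it simply tries every index combination and keeps the cheapest ones. The function names are hypothetical.

```python
from itertools import combinations, product


def pairwise_difference(values):
    """Sum of absolute differences over every unordered pair of values."""
    return sum(abs(x - y) for x, y in combinations(values, 2))


def brute_force_alignments(arrays):
    """Try every index combination; return (best cost, index tuples reaching it)."""
    best_cost, best = None, []
    for indexes in product(*(range(len(arr)) for arr in arrays)):
        cost = pairwise_difference([arr[i] for arr, i in zip(arrays, indexes)])
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, [indexes]
        elif cost == best_cost:
            best.append(indexes)
    return best_cost, best


a = [20, 29, 30, 32, 33]
b = [28, 29, 30, 32, 33]
c = [10, 12, 28, 31, 32, 33]
print(brute_force_alignments([a, b, c]))  # (0, [(3, 3, 4), (4, 4, 5)])
```

The number of combinations is the product of the array lengths, so this is only useful for validating a faster method on small inputs.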
This is meant to be language agnostic, but I don't really mind an answer for a specific language, as long as there is no loss of generality. I know Dynamic Programming is an option here, but I have a feeling there's an easier way to do this?
The tricky thing is parsing the arrays so that at some point you're guaranteed to be considering the set of indices that realize the pairwise min. Using a min heap on the values doesn't work. Counterexample with 4 arrays: [0,5], [1,2], [2], [2]. We start with a d(0,1,2,2) = 7, optimal is d(0,2,2,2) = 6, but the min heap moves us from 7 to d(5,1,2,2) = 12, then d(5,2,2,2) = 9.
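The counterexample can be replayed with a short Python sketch. It assumes one particular min-heap strategy (always advance the array whose current value is smallest, and stop once that array is exhausted); the answer does not spell out the exact variant, so treat the stopping rule and the function names as assumptions.

```python
import heapq
from itertools import combinations


def pairwise_difference(values):
    """Sum of absolute differences over every unordered pair of values."""
    return sum(abs(x - y) for x, y in combinations(values, 2))


def min_heap_sweep(arrays):
    """Advance the array holding the smallest current value; record each cost seen."""
    idx = [0] * len(arrays)
    heap = [(arr[0], k) for k, arr in enumerate(arrays)]
    heapq.heapify(heap)
    costs = [pairwise_difference([arr[i] for arr, i in zip(arrays, idx)])]
    while True:
        _, k = heapq.heappop(heap)
        if idx[k] + 1 >= len(arrays[k]):
            break  # assumed stopping rule: the smallest value cannot advance
        idx[k] += 1
        heapq.heappush(heap, (arrays[k][idx[k]], k))
        costs.append(pairwise_difference([arr[i] for arr, i in zip(arrays, idx)]))
    return costs


print(min_heap_sweep([[0, 5], [1, 2], [2], [2]]))  # [7, 12, 9]
print(pairwise_difference([0, 2, 2, 2]))           # 6, never visited by the sweep
```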
I believe (but haven't proved) that if we always increment the index that improves pairwise distance the most (or degrades it the least), we're guaranteed to visit every local min and the global min.
Assuming n total elements across k arrays:
Simple approach: we repeatedly compute the pairwise distance deltas (the change from incrementing each index), increment the best one, and any time doing so switches us from improvement to degradation (i.e. we have just passed a local minimum) we calculate the pairwise distance. All this is O(k^2) per increment, for a total running time of O((n-k) * (k^2)).
With O(k^2) storage, we could keep a table where entry (i,j) stores the pairwise distance delta achieved by incrementing the index of array i with respect to array j. We also store the column sums. Then, on incrementing an index, we can update the appropriate row, column and column sums in O(k). This gives us a running time of O((n-k)*k).
To just complete Dave's answer, here is the pseudocode of the delta algorithm:
initialise index_table to 0's, where entry i holds the current index into the ith array
initialise delta_table with the cost of incrementing the index of the ith array while keeping the other indexes at their current values
cur_cost <- cost of current index table
best_solutions <- list with the current index table
while (can_at_least_one_index_increase)
    i <- index whose delta is lowest
    increment i-th entry of the index_table
    if cost(index_table) < cur_cost
        cur_cost <- cost(index_table)
        best_solutions <- {index_table}
    else if cost(index_table) = cur_cost
        best_solutions <- best_solutions U {index_table}
    update delta_table
Important Note: During an iteration, some index_table entries might have already reached the maximum value for that array. Whenever updating the delta_table, it is necessary to never pick those entries, otherwise this will result in an array-out-of-bounds error, a segmentation fault or undefined behaviour. A neat trick is to simply check which indexes are already at max and give them a sufficiently large value, so they are never picked. If no index can increase anymore, the loop will end.
Here's an implementation in Python:
def align_ordered_sequences(arrays: list):
    n = len(arrays)

    def get_cost(index_table):
        # Sum of absolute differences over every pair of arrays
        # (0 when there is only one array)
        total = 0
        for i in range(n - 1):
            for j in range(i + 1, n):
                v1 = arrays[i][index_table[i]]
                v2 = arrays[j][index_table[j]]
                total += abs(v1 - v2)
        return total

    def compute_delta_table(index_table):
        # For each array, temporarily advance its index by one, record the
        # resulting cost and then revert the change; this avoids having to
        # create copies, which decreases performance unnecessarily
        delta_table = []
        for i in range(n):
            if index_table[i] + 1 >= len(arrays[i]):
                # Implementation detail: if the index would fall outside the
                # bounds of array i, use infinity so it is never picked
                delta_table.append(float("inf"))
            else:
                index_table[i] = index_table[i] + 1
                delta_table.append(get_cost(index_table))
                index_table[i] = index_table[i] - 1
        return delta_table

    def can_at_least_one_index_increase(index_table):
        return any(index_table[i] < len(arrays[i]) - 1 for i in range(n))

    index_table = [0] * n
    delta_table = compute_delta_table(index_table)
    best_solutions = [index_table.copy()]
    cur_cost = get_cost(index_table)

    while can_at_least_one_index_increase(index_table):
        i = delta_table.index(min(delta_table))
        index_table[i] = index_table[i] + 1
        new_cost = get_cost(index_table)

        # A new best solution was found
        if new_cost < cur_cost:
            cur_cost = new_cost
            best_solutions = [index_table.copy()]
        # A new solution with the same cost was found
        elif new_cost == cur_cost:
            best_solutions.append(index_table.copy())

        # Update the delta table
        delta_table = compute_delta_table(index_table)

    return best_solutions
And here are some examples:
>>> print(align_ordered_sequences([[0,5], [1,2], [2], [2]]))
[[0, 1, 0, 0]]
>>> print(align_ordered_sequences([[3, 5, 8, 29, 40, 50], [1, 4, 14, 17, 29, 50]]))
[[3, 4], [5, 5]]
Note 2: this outputs indexes not the actual values of each array.
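If the aligned values are needed rather than the indexes, each index table can be mapped back onto the arrays. A small sketch, with values_at as a hypothetical helper name:

```python
def values_at(arrays, index_table):
    """Look up, for every array, the value at its index from index_table."""
    return [arr[i] for arr, i in zip(arrays, index_table)]


arrays = [[3, 5, 8, 29, 40, 50], [1, 4, 14, 17, 29, 50]]
for index_table in [[3, 4], [5, 5]]:  # the index tables from the second example
    print(values_at(arrays, index_table))
# [29, 29]
# [50, 50]
```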

In MATLAB how can I write out a multidimensional array as a string that looks like a raw numpy array?

The Goal
(Forgive me for length of this, it's mostly background and detail.)
I'm contributing to a TOML encoder/decoder for MATLAB and I'm working with numerical arrays right now. I want to input (and then be able to write out) the numerical array in the same format. This format is the nested square-bracket format that is used by numpy.array. For example, to make multi-dimensional arrays in numpy:
The following is in python, just to be clear. It is a useful example though my work is in MATLAB.
1D and 2D arrays
>>> x = np.array([1,2])
>>> x
array([1, 2])
>>> x = np.array([[1],[2]])
>>> x
array([[1],
       [2]])
3D array
>>> x = np.array([[[1,2],[3,4]],[[5,6],[7,8]]])
>>> x
array([[[1, 2],
        [3, 4]],
       [[5, 6],
        [7, 8]]])
4D array
>>> x = np.array([[[[1,2],[3,4]],[[5,6],[7,8]]],[[[9,10],[11,12]],[[13,14],[15,16]]]])
>>> x
array([[[[ 1,  2],
         [ 3,  4]],
        [[ 5,  6],
         [ 7,  8]]],
       [[[ 9, 10],
         [11, 12]],
        [[13, 14],
         [15, 16]]]])
The input is a logical construction of the dimensions by nested brackets. Turns out this works pretty well with the TOML array structure. I can already successfully parse and decode any size/any dimension numeric array with this format from TOML to MATLAB numerical array data type.
Now, I want to encode that MATLAB numerical array back into this char/string structure to write back out to TOML (or whatever string).
So I have the following 4D array in MATLAB (same 4D array as with numpy):
>> x = permute(reshape([1:16],2,2,2,2),[2,1,3,4])
x(:,:,1,1) =
1 2
3 4
x(:,:,2,1) =
5 6
7 8
x(:,:,1,2) =
9 10
11 12
x(:,:,2,2) =
13 14
15 16
And I want to turn that into a string that has the same format as the 4D numpy input (with some function named bracketarray or something):
>> str = bracketarray(x)
str =
'[[[[1,2],[3,4]],[[5,6],[7,8]]],[[[9,10],[11,12]],[[13,14],[15,16]]]]'
I can then write out the string to a file.
EDIT: I should add, that the function numpy.array2string() basically does exactly what I want, though it adds some other whitespace characters. But I can't use that as part of the solution, though it is basically the functionality I'm looking for.
The Problem
Here's my problem. I have successfully solved this problem for up to 3 dimensions using the following function, but I cannot for the life of me figure out how to extend it to N-dimensions. I feel like it's an issue of the right kind of counting for each dimension, making sure to not skip any and to nest the brackets correctly.
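To make the target concrete, here is the nesting logic sketched in Python on plain nested lists (for instance what numpy's ndarray.tolist() returns). It is only an illustration of the recursion, one bracket level per dimension, not MATLAB code; bracket_string is a hypothetical name.

```python
def bracket_string(nested):
    """Serialise nested lists as '[[1,2],[3,4]]' with no whitespace."""
    if isinstance(nested, list):
        # One pair of brackets per nesting level, items joined by commas.
        return "[" + ",".join(bracket_string(item) for item in nested) + "]"
    return str(nested)  # a scalar: the recursion bottoms out here


x = [[[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
     [[[9, 10], [11, 12]], [[13, 14], [15, 16]]]]
print(bracket_string(x))
# [[[[1,2],[3,4]],[[5,6],[7,8]]],[[[9,10],[11,12]],[[13,14],[15,16]]]]
```

The recursion never needs to know the number of dimensions in advance, which is the property the MATLAB version has to reproduce.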
Current bracketarray.m that works up to 3D
function out = bracketarray(in, internal)
in_size = size(in);
in_dims = ndims(in);
% if array has only 2 dimensions, create the string
if in_dims == 2
storage = cell(in_size(1), 1);
for jj = 1:in_size(1)
storage{jj} = strcat('[', strjoin(split(num2str(in(jj, :)))', ','), ']');
end
if exist('internal', 'var') || in_size(1) > 1 || (in_size(1) == 1 && in_dims >= 3)
out = {strcat('[', strjoin(storage, ','), ']')};
else
out = storage;
end
return
% if array has more than 2 dimensions, recursively send planes of 2 dimensions for encoding
else
out = cell(in_size(end), 1);
for ii = 1:in_size(end) %<--- this doesn't track dimensions or counts of them
out(ii) = bracketarray(in(:,:,ii), 'internal'); %<--- this is limited to 3 dimensions atm. and out(indexing) need help
end
end
% bracket the final bit together
if in_size(1) > 1 || (in_size(1) == 1 && in_dims >= 3)
out = {strcat('[', strjoin(out, ','), ']')};
end
end
Help me Obi-wan Kenobis, y'all are my only hope!
EDIT 2: Added test suite below and modified current code a bit.
Test Suite
Here is a test suite to check whether the output is what it should be. Basically just copy and paste it into the MATLAB command window. For my current posted code, they all return true except the ones with more than 3 dimensions. My current code outputs a cell. If your solution outputs something different (like a string), then you'll have to remove the curly brackets from the test suite.
isequal(bracketarray(ones(1,1)), {'[1]'})
isequal(bracketarray(ones(2,1)), {'[[1],[1]]'})
isequal(bracketarray(ones(1,2)), {'[1,1]'})
isequal(bracketarray(ones(2,2)), {'[[1,1],[1,1]]'})
isequal(bracketarray(ones(3,2)), {'[[1,1],[1,1],[1,1]]'})
isequal(bracketarray(ones(2,3)), {'[[1,1,1],[1,1,1]]'})
isequal(bracketarray(ones(1,1,2)), {'[[[1]],[[1]]]'})
isequal(bracketarray(ones(2,1,2)), {'[[[1],[1]],[[1],[1]]]'})
isequal(bracketarray(ones(1,2,2)), {'[[[1,1]],[[1,1]]]'})
isequal(bracketarray(ones(2,2,2)), {'[[[1,1],[1,1]],[[1,1],[1,1]]]'})
isequal(bracketarray(ones(1,1,1,2)), {'[[[[1]]],[[[1]]]]'})
isequal(bracketarray(ones(2,1,1,2)), {'[[[[1],[1]]],[[[1],[1]]]]'})
isequal(bracketarray(ones(1,2,1,2)), {'[[[[1,1]]],[[[1,1]]]]'})
isequal(bracketarray(ones(1,1,2,2)), {'[[[[1]],[[1]]],[[[1]],[[1]]]]'})
isequal(bracketarray(ones(2,1,2,2)), {'[[[[1],[1]],[[1],[1]]],[[[1],[1]],[[1],[1]]]]'})
isequal(bracketarray(ones(1,2,2,2)), {'[[[[1,1]],[[1,1]]],[[[1,1]],[[1,1]]]]'})
isequal(bracketarray(ones(2,2,2,2)), {'[[[[1,1],[1,1]],[[1,1],[1,1]]],[[[1,1],[1,1]],[[1,1],[1,1]]]]'})
isequal(bracketarray(permute(reshape([1:16],2,2,2,2),[2,1,3,4])), {'[[[[1,2],[3,4]],[[5,6],[7,8]]],[[[9,10],[11,12]],[[13,14],[15,16]]]]'})
isequal(bracketarray(ones(1,1,1,1,2)), {'[[[[[1]]]],[[[[1]]]]]'})
I think it would be easier to just loop and use join. Your test cases pass.
function out = bracketarray_matlabbit(in)
out = permute(in, [2 1 3:ndims(in)]);
out = string(out);
dimsToCat = ndims(out);
if iscolumn(out)
dimsToCat = dimsToCat-1;
end
for i = 1:dimsToCat
out = "[" + join(out, ",", i) + "]";
end
end
This also seems to be faster than the route you were pursuing:
>> x = permute(reshape([1:16],2,2,2,2),[2,1,3,4]);
>> tic; for i = 1:1e4; bracketarray_matlabbit(x); end; toc
Elapsed time is 0.187955 seconds.
>> tic; for i = 1:1e4; bracketarray_cris_luengo(x); end; toc
Elapsed time is 5.859952 seconds.
The recursive function is almost complete. What is missing is a way to index the last dimension. There are several ways to do this, the neatest, I find, is as follows:
n = ndims(x);
index = cell(n-1, 1);
index(:) = {':'};
y = x(index{:}, ii);
It's a little tricky at first, but this is what happens: index is a set of n-1 strings ':'. index{:} is a comma-separated list of these strings. When we index x(index{:},ii) we actually do x(:,:,:,ii) (if n is 4).
The completed recursive function is:
function out = bracketarray(in)
n = ndims(in);
if n == 2
% Fill in your n==2 code here
else
% if array has more than 2 dimensions, recursively send planes of 2 dimensions for encoding
index = cell(n-1, 1);
index(:) = {':'};
storage = cell(size(in, n), 1);
for ii = 1:size(in, n)
storage(ii) = bracketarray(in(index{:}, ii)); % last dimension automatically removed
end
end
out = { strcat('[', strjoin(storage, ','), ']') };
Note that I have preallocated the storage cell array, to prevent it from being resized in every loop iteration. You should do the same in your 2D case code. Preallocating is important in MATLAB for performance reasons, and the MATLAB Editor should warn you about this too.

Find index corresponding between two different sized array lists with same total elements

How can corresponding array indices be found within two differently shaped arrays of arrays that are the same size?
For example, an array x of size 36 is split into 11 arrays. Another array y of size 36 is split into 4 arrays. Then some modifications happen on the 4 arrays making up y.
N = 6 #some size param
x = np.zeros(N*N,dtype=int) #make empty array (np.int was removed in NumPy 1.24; the builtin int is equivalent)
s1 = np.array_split(x,11) #split array into arbitrary parts
y = np.random.randint(5, size=(N, N)) #make another same size array (and modify it)
s2 = np.array_split(y,4) #split array into different number of parts
Then iterating through the 4 arrays of y, I need to find the start index in the first array (array_num) of s1, to the end index of the last array of s1 that the values in s2 correspond to.
for sub_s2 in s2:
    array_num = ?
    s_idx = ?
    e_idx = ?
    s2_idx = ?
    e2_idx = ?
    #put the array into the correct ordered indexes of the other array
    s1[array_num][s_idx,e_idx] = sub_s2[s2_idx,e2_idx]
res = np.concatenate(s1)
I made this image to try and illustrate the issue. In this case, 'data' means the size of x and y to start. Then s1 and s2 are broken into different chunks, and the problem is finding the index within each chunk that the arrays in s2 correspond to.
Here is how to find the correct indices:
# create example use same data for both splits for easy validation
a = np.arange(36)
s1 = np.array_split(a, 11)
s2 = np.array_split(a, 4)
# recover absolute offsets of bit boundaries
l1 = np.cumsum([0, *map(len,s1)])
l2 = np.cumsum([0, *map(len,s2)])
# find bits in s1 into which the first ...
start_n = l1[1:].searchsorted(l2[:-1], 'right')
# ... and last elements of bits of s2 fall
end_n = l1[1:].searchsorted(l2[1:]-1, 'right')
# find the corresponding indices into bits of s1
start_idx = l2[:-1] - l1[start_n]
end_idx = l2[1:]-1 - l1[end_n]
# check
[s[0] for s in s2]
# [0, 9, 18, 27]
[s1[n][i] for n, i in zip(start_n, start_idx)]
# [0, 9, 18, 27]
[s[-1] for s in s2]
# [8, 17, 26, 35]
[s1[n][i] for n, i in zip(end_n, end_idx)]
# [8, 17, 26, 35]
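The same boundary arithmetic can be sketched with the standard library only, which may make the logic easier to follow. The helper names split_sizes and locate are hypothetical, and split_sizes assumes the chunk lengths that np.array_split produces for a 1-D array (the first total % parts chunks get one extra element).

```python
from bisect import bisect_right
from itertools import accumulate


def split_sizes(total, parts):
    """Chunk lengths for splitting `total` items into `parts` chunks."""
    base, extra = divmod(total, parts)
    return [base + 1] * extra + [base] * (parts - extra)


def locate(global_index, boundaries):
    """Map an absolute position to (chunk number, index inside that chunk)."""
    chunk = bisect_right(boundaries, global_index) - 1
    return chunk, global_index - boundaries[chunk]


# Absolute offsets at which each chunk starts, plus the final end offset.
bounds1 = [0, *accumulate(split_sizes(36, 11))]
bounds2 = [0, *accumulate(split_sizes(36, 4))]

# Where the first and last element of every s2 chunk land inside s1.
starts = [locate(b, bounds1) for b in bounds2[:-1]]
ends = [locate(b - 1, bounds1) for b in bounds2[1:]]
print(starts)  # [(0, 0), (2, 1), (5, 0), (8, 0)]
print(ends)    # [(2, 0), (4, 2), (7, 2), (10, 2)]
```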

How to get average of values in array between two given indexes in Swift

I'm trying to get the average of the values between two indexes in an array. The solution I first came to reduces the array to the required range, before taking the sum of values divided by the number of values. A simplified version looks like this:
let array = [0, 2, 4, 6, 8, 10, 12]
// The aim is to take the average of the values between array[n] and array[.count - 1].
I attempted with the following code:
func avgOf(x: Int) throws -> String {
let avgforx = solveList.count - x
// Error handling to check if x in average of x does not overstep bounds
guard avgforx > 0 else {
throw FuncError.avgNotPossible
}
solveList.removeSubrange(ClosedRange(uncheckedBounds: (lower: 0, upper: avgforx - 1)))
let avgx = (solveList.reduce(0, +)) / Double(x)
// Rounding
let roundedAvgOfX = (avgx * 1000).rounded() / 1000
print(roundedAvgOfX)
return "\(roundedAvgOfX)"
}
where avgforx is used to represent the lower bound:
array[(.count - 1) - x]
The guard statement makes sure that if the index is out of range, the error is handled properly.
solveList.removeSubrange was my initial solution, as it removes the values outside of the needed index range (and subsequently delivers the needed result), but this has proved to be problematic as the values not taken in the average should remain.
The line in removeSubrange basically takes a needed index field (e.g. array[5] to array[10]), removes all the values from array[0] to array[4], and then takes the sum of the resulting array divided by the number of elements.
Instead, the values in array[0] to array[4] should remain.
I would appreciate any help.
(Swift 4, Xcode 10)
Apart from the fact that the original array is modified, the error in your code is that it divides the sum of the remaining elements by the count of the removed elements (x) instead of dividing by the count of remaining elements.
A better approach might be to define a function which computes the average of a collection of integers:
func average<C: Collection>(of c: C) -> Double where C.Element == Int {
precondition(!c.isEmpty, "Cannot compute average of empty collection")
return Double(c.reduce(0, +))/Double(c.count)
}
Now you can use that with slices, without modifying the original array:
let array = [0, 2, 4, 6, 8, 10, 12]
let avg1 = average(of: array[3...]) // Average from index 3 to the end
let avg2 = average(of: array[2...4]) // Average from index 2 to 4
let avg3 = average(of: array[..<5]) // Average of first 5 elements
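For comparison, the same idea sketched in Python: average a slice, divide by the number of elements actually in the slice, and leave the original list untouched. The names average and average_of_last are hypothetical; the rounding to three decimals mirrors the code in the question.

```python
def average(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    if not values:
        raise ValueError("cannot average an empty sequence")
    return sum(values) / len(values)


def average_of_last(values, x):
    """Average of the last x elements, rounded to 3 decimals."""
    if not 0 < x <= len(values):
        raise ValueError("x must be between 1 and len(values)")
    # Slicing copies the range, so `values` itself is not modified.
    return round(average(values[-x:]), 3)


array = [0, 2, 4, 6, 8, 10, 12]
print(average(array[3:]))         # 9.0
print(average_of_last(array, 3))  # 10.0
print(array)                      # [0, 2, 4, 6, 8, 10, 12]
```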

Append new variables to IDL for loop array

If I have the following array:
x = double([1, 1, 1, 10, 1, 1, 50, 1, 1, 1 ])
I want to do the following:
Group the array into groups of 5 which will each be evaluated separately.
Identify the MAX value each of the groups of the array
Remove that MAX value and put it into another array.
Finally, I want to print the updated array x without the MAX values, and the new array containing the MAX values.
How can I do this? I am new to IDL and have had no formal training in coding.
I understand that I can write the code to group and find the max values this way:
FOR i = 1, (n_elements(x)-4) do begin
  print, "MAX of array", MAX(x[i-1:i+3])
ENDFOR
However, how do I implement all of what I specified above? I know I have to create an empty array that will append the values found by the for loop, but I don't know how to do that.
Thanks
I changed your x to have unique elements to make sure I wasn't fooling myself. In this version, the number of elements of x must be divisible by group_size:
x = double([1, 2, 3, 10, 4, 5, 50, 6, 7, 8])
group_size = 5
maxes = max(reform(x, group_size, n_elements(x) / group_size), ind, dimension=1)
all = bytarr(n_elements(x))
all[ind] = 1
x_without_maxes = x[where(all eq 0)]
print, maxes
print, x_without_maxes
Lists are good for this, because they allow you to pop out values at specific indices, rather than rewriting the whole array again. You might try something like the following. I've used a while loop here, rather than a for loop, because it makes it a little easier in this case.
x = List(1, 1, 1, 10, 1, 1, 50, 1, 1, 1)
maxValues = List()
pos = 4
while (pos le x.length) do begin
maxValues.add, max(x[pos-4:pos].toArray(), iMax)
x.Remove, iMax+pos-4
pos += 5-1
endwhile
print, "Max Values : ", maxValues.toArray()
print, "Remaining Values : ", x.toArray()
This allows you to do what you want I think. At the end, you have a List object (which can easily be converted to an array) with the max values for each group of 5, and another containing the remaining values.
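For readers who want to check the expected result independently of IDL, the grouping logic can be sketched in Python. This is an illustration under the assumption that groups are consecutive, non-overlapping runs of group_size elements; split_group_maxes is a hypothetical name.

```python
def split_group_maxes(values, group_size):
    """Return (maxes, remaining): one max per group, and everything else in order."""
    maxes, remaining = [], []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        top = group.index(max(group))  # position of the first max in this group
        maxes.append(group[top])
        remaining.extend(group[:top] + group[top + 1:])
    return maxes, remaining


x = [1, 1, 1, 10, 1, 1, 50, 1, 1, 1]
maxes, remaining = split_group_maxes(x, 5)
print(maxes)      # [10, 50]
print(remaining)  # [1, 1, 1, 1, 1, 1, 1, 1]
```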
Also, please tag this as idl-programming-language rather than idl. They are two different tags.
