Repetition of array elements in MATLAB - arrays

I have a MATLAB array and want to make a repetition based on the number of array elements. Below is the example that I want.
a = [2, 4, 6, 8]
If I want 7 elements, the result is
aa = [2, 4, 6, 8, 2, 4, 6]
Or if I want 5 elements,
aa = [2, 4, 6, 8, 2]
Is there a MATLAB function that produces this kind of result?

You can use "modular indexing":
a = [2, 4, 6, 8]; % data vector
n = 7; % desired number of elements
aa = a(mod(0:n-1, numel(a))+1);
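For readers working outside MATLAB, the same modular-indexing idea can be sketched in Python. This is only an illustration: `cyclic_take` is a hypothetical helper name, and because Python indexing is 0-based the `+1` offset from the MATLAB version is not needed.

```python
def cyclic_take(values, n):
    """Return the first n elements of `values` repeated cyclically."""
    if n > 0 and not values:
        raise ValueError("cannot take elements from an empty sequence")
    # i % len(values) wraps the index back to 0 after the last element.
    return [values[i % len(values)] for i in range(n)]


a = [2, 4, 6, 8]
print(cyclic_take(a, 7))  # [2, 4, 6, 8, 2, 4, 6]
print(cyclic_take(a, 5))  # [2, 4, 6, 8, 2]
```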

One simple option will be to use a temporary variable for that:
a = [2 4 6 8];
k = 7;
tmp = repmat(a,1,ceil(k/numel(a)));
aa = tmp(1:k)
First, you repeat the vector the smallest whole number of times that makes the result at least k elements long, and then you remove the excess elements.
If you do this often, you can write a small helper function for it:
function out = semi_repmat(arr,k)
tmp = repmat(arr,1,ceil(k/numel(arr)));
out = tmp(1:k);
end
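The repeat-then-truncate approach carries over to other languages as well. Below is a rough Python sketch, assuming a plain list as input; `semi_repeat` is a made-up name that mirrors the helper above and is not a library function.

```python
import math


def semi_repeat(values, k):
    """Return the first k elements of `values` tiled end to end."""
    if k <= 0:
        return []
    if not values:
        raise ValueError("cannot repeat an empty sequence")
    reps = math.ceil(k / len(values))  # enough whole copies to cover k elements
    return (list(values) * reps)[:k]   # drop the excess


print(semi_repeat([2, 4, 6, 8], 7))  # [2, 4, 6, 8, 2, 4, 6]
```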

Related

How would you split a numpy array where the elements give the partition size?

For a NumPy 1-D array such as:
In [1]: A = np.array([2,5,1,3,9,0,7,4,1,2,0,11])
In [2]: A
Out[2]: array([2,5,1,3,9,0,7,4,1,2,0,11])
I need to split the array by using the values as a sub-array length.
For the example array:
The first index has a value of 2, so I need the first split to occur at index 0 + 2, so it would result in ([2,5,1]).
Skip to index 3 (since indices 0-2 were gobbled up in step 1).
The value at index 3 = 3, so the second split would occur at index 3 + 3, and result in ([3,9,0,7]).
Skip to index 7
The value at index 7 = 4, so the third and final split would occur at index 7 + 4, and result in ([4,1,2,0,11])
I'm using this simple array as an example, because I think it will help in my actual use case, which is reading data from binary files (either as bytes or unsigned shorts). I'm guessing that numpy will be the fastest way to do it, but I could also use struct/bytearray/lists or whatever would be best.
I hope this makes sense. I had a hard time trying to figure out how best to word the question.
Here is an approach using standard python lists and a while loop:
def custom_partition(arr):
    partitions = []
    i = 0
    while i < len(arr):
        # the element at i says how many further elements belong to this partition
        partition_size = arr[i]
        next_i = i + partition_size + 1
        partitions.append(arr[i:next_i])
        i = next_i
    return partitions
a = [2, 5, 1, 3, 9, 0, 7, 4, 1, 2, 0, 11]
b = custom_partition(a)
print(b)
Output:
[[2, 5, 1], [3, 9, 0, 7], [4, 1, 2, 0, 11]]
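Since the question mentions reading unsigned shorts from binary data, here is a hedged sketch of the same loop applied to values unpacked with the standard `struct` module. The little-endian layout and the name `split_length_prefixed` are assumptions made for illustration, and the extra bounds check is an addition that rejects truncated input instead of silently returning a short final chunk.

```python
import struct


def split_length_prefixed(values):
    """Split so that each chunk starts with a count of the items that follow it."""
    chunks = []
    i = 0
    while i < len(values):
        end = i + values[i] + 1
        if end > len(values):
            raise ValueError(f"chunk at index {i} runs past the end of the data")
        chunks.append(values[i:end])
        i = end
    return chunks


# Assumed layout: little-endian unsigned 16-bit integers.
raw = struct.pack("<12H", 2, 5, 1, 3, 9, 0, 7, 4, 1, 2, 0, 11)
shorts = struct.unpack(f"<{len(raw) // 2}H", raw)
print(split_length_prefixed(shorts))
# [(2, 5, 1), (3, 9, 0, 7), (4, 1, 2, 0, 11)]
```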

Rearrange an array A so that A wins maximum number of comparisons with array B when comparison is done one-on-one

Let's say I have an array A = [3, 6, 7, 5, 3, 5, 6, 2, 9, 1] and B = [2, 7, 0, 9, 3, 6, 0, 6, 2, 6]
Rearrange elements of array A so that when we do comparison element-wise like 3 with 2 and 6 with 7 and so on, we have maximum wins (combinations where A[i] > B[i] are maximum (0<=i<len(A))).
I tried below approach:
def optimal_reorder(A,B,N):
    tagged_A = [('d',i) for i in A]
    tagged_B = [('a',i) for i in B]
    merged = tagged_A + tagged_B
    merged = sorted(merged,key=lambda x: x[1])
    max_wins = 0
    for i in range(len(merged)-1):
        print (i)
        if set((merged[i][0],merged[i+1][0])) == {'a','d'}:
            if (merged[i][0] == 'a') and (merged[i+1][0] == 'd'):
                if (merged[i][1] < merged[i+1][1]):
                    print (merged[i][1],merged[i+1][1])
                    max_wins += 1
    return max_wins
as referenced from here, but this approach doesn't seem to give the correct answer for the given A and B, i.e. if A = [3, 6, 7, 5, 3, 5, 6, 2, 9, 1] and B = [2, 7, 0, 9, 3, 6, 0, 6, 2, 6], then the maximum number of wins is 7 but my algorithm gives 5.
Is there something I am missing here?
Revised solution, as suggested by @chqrlie:
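def optimal_reorder2(A,B):
    arrA = A.copy()
    C = [None] * len(B)
    for i in range(len(B)):
        k = 0  # scan every remaining element of A (arrA shrinks on each pass)
        all_ele = []
        while (k < len(arrA)):
            if arrA[k] > B[i]:
                all_ele.append(arrA[k])
            k += 1
        if all_ele:
            e = min(all_ele)  # smallest element that still beats B[i]
        else:
            e = min(arrA)  # nothing beats B[i]: give up the smallest element
        C[i] = e
        arrA.remove(e)
    return C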
How about this algorithm:
start with an empty array C.
for each index i in range(len(B)).
if at least one of the remaining elements of A is larger than B[i], choose e as the smallest of these elements, otherwise choose e as the smallest element of A.
set C[i] = e and remove e from A.
C should be a reordering of A that maximises the number of true comparisons C[i] > B[i].
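The steps above can be sketched compactly by keeping the remaining elements of A sorted and locating the smallest winner with a binary search. This is an illustrative sketch rather than a tuned implementation: `reorder_for_max_wins` is a hypothetical name, and `list.pop` from the middle of a list is still linear time.

```python
from bisect import bisect_right


def reorder_for_max_wins(A, B):
    """Greedy reordering: beat each B[i] as cheaply as possible, else discard the minimum."""
    remaining = sorted(A)
    C = []
    for b in B:
        # Index of the smallest remaining element strictly greater than b.
        pos = bisect_right(remaining, b)
        if pos == len(remaining):
            pos = 0  # nothing beats b, so give up the smallest element
        C.append(remaining.pop(pos))
    return C


A = [3, 6, 7, 5, 3, 5, 6, 2, 9, 1]
B = [2, 7, 0, 9, 3, 6, 0, 6, 2, 6]
C = reorder_for_max_wins(A, B)
wins = sum(c > b for c, b in zip(C, B))
print(wins)  # 7
```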
There’s probably a much better algorithm than this, but you can think of this as a maximum bipartite matching problem. Think of the arrays as the two groups of nodes in the bipartite graph, then add an edge from A[i] to B[j] if A[i] > B[j]. Then any matching tells you how to pair elements of A with elements of B such that the A element “wins” against the B element, and a maximum matching tells you how to do this to maximize the number of wins.
I’m sure there’s a better way to do this, and I’m excited to see what other folks come up with. But this at least shows you can solve this in polynomial time.
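To make the matching view concrete, here is a small sketch that counts the maximum number of wins with a simple augmenting-path search (often called Kuhn's algorithm). It is meant only to illustrate the bipartite formulation; the function name is invented for this example, and no attempt is made at efficiency.

```python
def max_wins_by_matching(A, B):
    """Size of a maximum matching where A[i] may be paired with B[j] only if A[i] > B[j]."""
    match_of_b = [None] * len(B)  # index into A currently matched to each B[j]

    def try_assign(i, visited):
        # Try to find a B[j] for A[i], re-routing earlier matches if needed.
        for j, b in enumerate(B):
            if A[i] > b and j not in visited:
                visited.add(j)
                if match_of_b[j] is None or try_assign(match_of_b[j], visited):
                    match_of_b[j] = i
                    return True
        return False

    return sum(try_assign(i, set()) for i in range(len(A)))


A = [3, 6, 7, 5, 3, 5, 6, 2, 9, 1]
B = [2, 7, 0, 9, 3, 6, 0, 6, 2, 6]
print(max_wins_by_matching(A, B))  # 7
```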

Finding (multiset) difference between two arrays

Given arrays (say row vectors) A and B, how do I find an array C such that merging B and C will give A?
For example, given
A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 3, 5, 5];
then
C = multiset_diff(A, B) % Should be [4, 6, 4, 3, 1, 5]
(the order of the result does not matter here).
For the same A, if B = [2, 4, 5], then the result should be [6, 4, 3, 3, 1, 5, 5].
(Since there were two 4s in A and one 4 in B, the result C should have 2 - 1 = 1 4 in it. Similarly for the other values.)
PS: Note that setdiff would remove all instances of 2, 3, and 5, whereas here they need to be removed just however many times they appear in B.
Performance: I ran some quick-n-dirty benchmarks locally, here are the results for future reference:
@heigele's nested loop method performs best for small lengths of A (say up to N = 50 or so elements). It does 3x better for small (N=20) As, and 1.5x better for medium-sized (N=50) As, compared to the next best method, which is:
@obchardon's histc-based method. This is the one that performs best once A's size N reaches 100 and above. For example, it does 3x better than the above nested loop method when N = 200.
@matt's for+find method does comparably to the histc method for small N, but quickly degrades in performance for larger N (which makes sense since the entire C == B(x) comparison is run every iteration).
(The other methods are either several times slower or invalid at the time of writing.)
Still another approach using the histc function:
A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 3, 5, 5];
uA = unique(A);
hca = histc(A,uA);
hcb = histc(B,uA);
res = repelem(uA,hca-hcb)
We simply count how many times each unique value of A occurs in each vector, then use repelem to build the result from the difference of the counts.
This solution does not preserve the original order, but that doesn't seem to be a problem for you.
I use histc for Octave compatibility, but this function is deprecated, so you can also use histcounts.
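The counting idea is not specific to MATLAB. As a rough cross-language illustration, Python's standard `collections.Counter` subtracts per-value counts directly; the function below reuses the question's `multiset_diff` name purely for the example, and like the histc version it does not preserve the original order.

```python
from collections import Counter


def multiset_diff(A, B):
    """Remove each value of B from A as many times as it occurs in B."""
    remaining = Counter(A) - Counter(B)  # subtraction drops counts <= 0
    return list(remaining.elements())


A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5]
print(sorted(multiset_diff(A, [2, 3, 5, 5])))  # [1, 3, 4, 4, 5, 6]
print(sorted(multiset_diff(A, [2, 4, 5])))     # [1, 3, 3, 4, 5, 5, 6]
```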
Here's a vectorized way. Memory-inefficient, mostly for fun:
tA = sum(triu(bsxfun(@eq, A, A.')), 1);
tB = sum(triu(bsxfun(@eq, B, B.')), 1);
result = setdiff([A; tA].', [B; tB].', 'rows', 'stable');
result = result(:,1).';
The idea is to make each entry unique by tagging it with an occurrence number. The vectors become 2-column matrices, setdiff is applied with the 'rows' option, and then the tags are removed from the result.
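The tagging trick can be illustrated without matrices. The sketch below is a Python analogue under the same idea (pair each value with its occurrence number, then take an ordinary set difference on the pairs); the helper names are hypothetical.

```python
from collections import defaultdict


def tag_occurrences(values):
    """Pair each value with its running occurrence number: 1st, 2nd, 3rd, ..."""
    seen = defaultdict(int)
    tagged = []
    for v in values:
        seen[v] += 1
        tagged.append((v, seen[v]))
    return tagged


def multiset_diff_stable(A, B):
    """Multiset difference that keeps the surviving elements of A in order."""
    to_remove = set(tag_occurrences(B))
    # Tagged pairs are unique, so a plain membership test is enough.
    return [v for v, k in tag_occurrences(A) if (v, k) not in to_remove]


A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5]
print(multiset_diff_stable(A, [2, 3, 5, 5]))  # [4, 6, 4, 3, 1, 5]
```

Note that this removes the first occurrences of each value, which matches the 'stable' setdiff on tagged rows.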
You can use the second output of ismember to find the indexes where elements of B are in A, and diff to remove duplicates:
This answer assumes that B is already sorted. If that is not the case, B has to be sorted before running the code below.
For the first example:
A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 3, 5, 5];
%B = sort(B); Sort if B is not sorted.
[~,col] = ismember(B,A);
indx = find(diff(col)==0);
col(indx+1) = col(indx)+1;
A(col) = [];
C = A;
>>C
4 6 4 3 1 5
For the second example:
A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 4, 5, 5];
%B = sort(B); Sort if B is not sorted.
[~,col] = ismember(B,A);
indx = find(diff(col)==0);
col(indx+1) = col(indx)+1;
A(col) = [];
C = A;
>>C
6 4 3 3 1 5
I'm not a fan of loops, but for random permutations of A this was the best I came up with.
C = A;
for x = 1:numel(B)
C(find(C == B(x), 1, 'first')) = [];
end
I was curious about the effect of different orderings of A on a solution approach, so I set up a test like this:
Ctruth = [1 3 3 4 5 5 6];
for testNumber = 1:100
Atest = A(randperm(numel(A)));
C = myFunction(Atest,B);
C = sort(C);
assert(all(C==Ctruth));
end
Strongly inspired by Matt, but on my machine 40% faster:
function A = multiDiff(A,B)
    for j = 1:numel(B)
        for i = 1:numel(A)
            if A(i) == B(j)
                A(i) = [];
                break;
            end
        end
    end
end

How do I find the complement of an array?

If I have a sorted array of numerical values such as Double, Integer, and Time, what is the general logic to finding a complement?
Over my CS career in college, I've gotten better at understanding complements and edge cases for ranges. As I help students whose skill levels and understanding match mine when I wrote this, I need help finding a generalized way to convey this concept to them for singular elements and ranges.
Try something like this:
def complement(l, universe=None):
    """
    Return the complement of a list of integers, as compared to
    a given "universe" set. If no universe is specified,
    consider the universe to be all integers between
    the minimum and maximum values of the given list.
    """
    if universe is not None:
        universe = set(universe)
    else:
        universe = set(range(min(l), max(l)+1))
    return sorted(universe - set(l))
then
l = [1,3,5,7,10]
complement(l)
yields:
[2, 4, 6, 8, 9]
Or you can specify your own universe:
complement(l, range(12))
yields:
[0, 2, 4, 6, 8, 9, 11]
To add another option - using a data type that is always useful to learn about, for these types of operations.
a = set([1, 3, 5, 7, 10])
b = set(range(1, 11))
c = sorted(list(b.symmetric_difference(a)))
print(c)
[2, 4, 6, 8, 9]
>>> nums = [1, 3, 5, 7, 10]
>>> [n + ((n&1)*2-1) for n in nums]
[2, 4, 6, 8, 9]
The easiest way is to iterate from the beginning of your list to the second-to-last element. Set j to the current element plus 1. While j is less than the next number in your list, append it to your list of complements and increment it.
# find the skipped numbers in a list sorted in ascending order
def getSkippedNumbers(arr):
    complement = []
    for i in range(len(arr) - 1):
        j = arr[i] + 1
        while j < arr[i + 1]:
            complement.append(j)
            j += 1
    return complement
test = [1, 3, 5, 7, 10]
print(getSkippedNumbers(test)) # prints [2, 4, 6, 8, 9]
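Because the question also asks about ranges, the same walk over neighbouring elements can report each gap as a range instead of listing every missing number. This is a minimal sketch assuming sorted integer input; `missing_ranges` is a hypothetical name, and other value types (doubles, times) would need their own notion of "next value".

```python
def missing_ranges(sorted_values):
    """Return the gaps between consecutive values as inclusive (start, end) pairs."""
    gaps = []
    for lo, hi in zip(sorted_values, sorted_values[1:]):
        if hi - lo > 1:
            gaps.append((lo + 1, hi - 1))
    return gaps


print(missing_ranges([1, 3, 5, 7, 10]))  # [(2, 2), (4, 4), (6, 6), (8, 9)]
```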
You can find the complement of one list relative to another using a list comprehension. Here we keep the elements of x that do not appear in y:
>>> x = [1, 3, 5, 7, 10]
>>> y = [1, 2, 3, 4, 8, 9, 20]
>>> z = [n for n in x if n not in y]
>>> z
[5, 7, 10]
>>>

Find indices of elements in an array based on a search from another array

Imagine that I have two arrays:
a = [1, 2, 5, 7, 6, 9, 8, 3, 4, 7, 0];
b = [5, 9, 6];
I want to find the indices of the values of b in a (only the first hit), i.e.:
c = [3, 6, 5];
Is there an easy, native MATLAB way to do this without looping and searching?
I have tried to use find() with:
find(a == b)
but that only works if you loop over the elements of b like this:
c = [];
for i = 1:length(b)
    index = find(a == b(i));
    c = [c, index(1)]
end
But it would be ideal for it to be easier than this.
You can compact your for loop easily with arrayfun into a simple one-liner:
arrayfun(@(x) find(a == x,1,'first'), b )
Also see Scenia's answer for newer MATLAB versions (>R2012b).
This is actually built into ismember. You just need to set the right flag, then it's a one liner and you don't need arrayfun. Versions newer than R2012b use this behavior by default.
Originally, ismember would return the last occurrence if there were several; the R2012a flag makes it return the first one.
Here's my testing results:
a = [1, 2, 5, 7, 6, 9, 8, 3, 4, 7, 0, 6];
b = [5, 9, 6];
[~,c] = ismember(b,a,'R2012a');
>> c
c =
3 6 5
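For comparison only, the "index of the first occurrence" lookup can be sketched in Python with a dictionary that records where each value is first seen. The function name is made up for this example; indices are reported 1-based, and 0 is used for missing values, to mirror ismember's output.

```python
def first_indices(a, b):
    """For each value in b, return the 1-based index of its first occurrence in a (0 if absent)."""
    first_seen = {}
    for idx, value in enumerate(a, start=1):
        first_seen.setdefault(value, idx)  # keep only the first index per value
    return [first_seen.get(value, 0) for value in b]


a = [1, 2, 5, 7, 6, 9, 8, 3, 4, 7, 0, 6]
b = [5, 9, 6]
print(first_indices(a, b))  # [3, 6, 5]
```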
This is a fix to the ismember approach that @Pursuit suggested. This way it handles multiple occurrences of one of the numbers, and returns the result in the correct order:
[tf,loc] = ismember(a,b);
tf = find(tf);
[~,idx] = unique(loc(tf), 'first');
c = tf(idx);
The result:
>> c
c =
3 6 5
a = [1, 2, 5, 7, 6, 9, 8, 3, 4, 7, 0, 6];
b = [5, 9, 6];
[r c]=find(bsxfun(@eq,a,b')');
[~,ia,~]=unique(c,'first');
>> r(ia)
ans =
3
6
5
Note: I added an extra 6 at the end of a to demonstrate finding only the first occurrence of each value.
You could try this:
[c,ind_a] = intersect(a,b)
Have you tried ismember?
c_logical_mask = ismember(a, b);
or
c_indexes = find(ismember(a, b));
a = [1, 2, 5, 7, 6, 9, 8, 3, 4, 7, 0];
b = [5, 9, 6];
c = dsearchn(a',b');
MATLAB requires a and b to be column vectors, hence the transpose.
Similar to @tmpearce's answer, but possibly faster:
[valid, result] = max(bsxfun(@eq, a(:), b(:).')); %'// max finds first occurrence
result = result(valid); %// this is necessary in case some value of b is not in a

Resources