I have p = [1,2,3,4]. I would like a 100x4 numpy matrix with p in each row. What's the best way to create that?
I tried pvect = np.array(p for i in range(10)) but that doesn't seem to be right.
Use numpy.tile:
pvect = np.tile(p, (100, 1))
output:
array([[1, 2, 3, 4],
[1, 2, 3, 4],
...
[1, 2, 3, 4],
[1, 2, 3, 4]])
Using matrix algebra: the multiplication of a column vector of ones times your row vector p effectively places the vector p in each row.
p = np.array([[1,2,3,4]])
OutputArray = np.ones((100, 1)) @ p
you can try:
pvect = np.array(p*100).reshape((100,4))
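As a quick sanity check, here is a minimal sketch comparing the approaches above on the same p. The np.broadcast_to line is not from any of the answers; it is added here on the assumption that a read-only view without copying may be enough in some cases.

```python
import numpy as np

p = [1, 2, 3, 4]

# Physical copy of p in each of the 100 rows.
tiled = np.tile(p, (100, 1))

# Column vector of ones times the 1x4 row vector p.
outer = np.ones((100, 1), dtype=int) @ np.array([p])

# Repeat the Python list 100 times, then reshape to 100x4.
reshaped = np.array(p * 100).reshape((100, 4))

# Assumption (not in the answers): a read-only broadcast view, no data copied.
view = np.broadcast_to(p, (100, 4))

print(tiled.shape)                      # (100, 4)
print(np.array_equal(tiled, outer))     # True
print(np.array_equal(tiled, reshaped))  # True
print(np.array_equal(tiled, view))      # True
```

If the matrix has to be modified afterwards, the tiled or reshaped copies are the safer choice, since the broadcast view is not writable.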
I had two lists:
a=[0,0,0,1,1,1,1,2,2]
b=[2,5,12,2,3,8,9,4,6]
And I wanted to get:
c=[[0,2,5,12],[1,2,3,8,9],[2,4,6]]
a and b correspond to each other: a[i] is paired with b[i]. Whenever the value in a changes (for example from 0 to 1), a new inner list starts in c, so 12 is the last element of the first inner list.
I tried it with if/else statements, but it failed.
How can I get c in Python?
This code produces c well enough, provided equal values in a are always grouped together as in the example:
a=[0,0,0,1,1,1,1,2,2]
b=[2,5,12,2,3,8,9,4,6]
c = []
i = 0
while i < len(a):
    d = a.count(a[i])
    c.append([a[i]] + b[i:i + d])
    i += d
print(c) # ==> [[0, 2, 5, 12], [1, 2, 3, 8, 9], [2, 4, 6]]
We can zip the lists, group by first value from a, and make lists with the second:
from itertools import groupby
from operator import itemgetter
a=[0,0,0,1,1,1,1,2,2]
b=[2,5,12,2,3,8,9,4,6]
[list(map(itemgetter(1), group)) for _, group in groupby(zip(a, b), key=itemgetter(0))]
#[[2, 5, 12], [2, 3, 8, 9], [4, 6]]
Similar to @Thierry Lathuille's answer, but does actually prepend the keys to the sublists as requested by OP:
import itertools as it
ib = iter(b)
[[k, *(next(ib) for _ in gr)] for k, gr in it.groupby(a)]
# [[0, 2, 5, 12], [1, 2, 3, 8, 9], [2, 4, 6]]
Here's my simple solution. Notice that you are splitting the list b by the counts of the elements in the list a. A deque is used for popping elements from the left in O(1) time.
from collections import Counter, deque
a = [0,0,0,1,1,1,1,2,2]
b = deque([2,5,12,2,3,8,9,4,6])
c = Counter(a)  # maps each value in a to its number of occurrences
new_list=[]
for x in c:
    # prepend the key x, then take c[x] items from the front of b
    new_list.append([x] + [b.popleft() for _ in range(c[x])])
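The answers above rely on equal values of a sitting next to each other. A minimal sketch that drops that assumption, collecting the paired values in a dict keyed by the value from a (the name groups is only illustrative):

```python
a = [0, 0, 0, 1, 1, 1, 1, 2, 2]
b = [2, 5, 12, 2, 3, 8, 9, 4, 6]

# Map each key from a to the list of values paired with it in b.
groups = {}
for key, value in zip(a, b):
    groups.setdefault(key, []).append(value)

# Prepend each key to its values; dicts keep insertion order (Python 3.7+).
c = [[key, *values] for key, values in groups.items()]
print(c)  # [[0, 2, 5, 12], [1, 2, 3, 8, 9], [2, 4, 6]]
```

For the example input this gives the same result as the groupby-based answers; it only behaves differently when the same key appears in separate runs.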
Given arrays (say row vectors) A and B, how do I find an array C such that merging B and C will give A?
For example, given
A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 3, 5, 5];
then
C = multiset_diff(A, B) % Should be [4, 6, 4, 3, 1, 5]
(the order of the result does not matter here).
For the same A, if B = [2, 4, 5], then the result should be [6, 4, 3, 3, 1, 5, 5].
(Since there were two 4s in A and one 4 in B, the result C should have 2 - 1 = 1 4 in it. Similarly for the other values.)
PS: Note that setdiff would remove all instances of 2, 3, and 5, whereas here they need to be removed just however many times they appear in B.
Performance: I ran some quick-n-dirty benchmarks locally, here are the results for future reference:
@heigele's nested loop method performs best for small lengths of A (say up to N = 50 or so elements). It does 3x better for small (N=20) As, and 1.5x better for medium-sized (N=50) As, compared to the next best method, which is:
@obchardon's histc-based method. This is the one that performs best when A's size N reaches 100 and above. E.g., it does 3x better than the above nested loop method when N = 200.
@matt's for+find method does comparably to the histc method for small N, but quickly degrades in performance for larger N (which makes sense since the entire C == B(x) comparison is run every iteration).
(The other methods are either several times slower or invalid at the time of writing.)
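For readers who want to check the counting rule from the question (each value appears count-in-A minus count-in-B times in the result), here is a small Python sketch using collections.Counter. It is an illustration of the rule, not one of the MATLAB answers, and the function name simply mirrors the one used in the question.

```python
from collections import Counter

def multiset_diff(a, b):
    """Remove each element of b from a once per occurrence in b."""
    # Counter subtraction drops counts that fall to zero or below.
    remaining = Counter(a) - Counter(b)
    return list(remaining.elements())

A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5]

print(sorted(multiset_diff(A, [2, 3, 5, 5])))  # [1, 3, 4, 4, 5, 6]
print(sorted(multiset_diff(A, [2, 4, 5])))     # [1, 3, 3, 4, 5, 5, 6]
```

The results are sorted only to make them easy to compare, since the question says the order does not matter.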
Still another approach using the histc function:
A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 3, 5, 5];
uA = unique(A);
hca = histc(A,uA);
hcb = histc(B,uA);
res = repelem(uA,hca-hcb)
We simply count the occurrences of each unique value of A in both vectors, then use repelem to create the result.
This solution does not preserve the initial order, but that doesn't seem to be a problem for you.
I use histc for Octave compatibility, but this function is deprecated, so you can also use histcounts.
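The same count-then-repeat idea can be sketched with NumPy, where np.unique plays the role of unique/histc and np.repeat the role of repelem. This is a rough analogue for illustration; the clipping step is an added assumption to guard against B containing more copies of a value than A.

```python
import numpy as np

A = np.array([2, 4, 6, 4, 3, 3, 1, 5, 5, 5])
B = np.array([2, 3, 5, 5])

# Unique values of A and how often each occurs in A.
uA, count_a = np.unique(A, return_counts=True)

# Count how often each of those unique values occurs in B.
count_b = (B[:, None] == uA[None, :]).sum(axis=0)

# Repeat each unique value by the leftover count (never negative).
res = np.repeat(uA, np.clip(count_a - count_b, 0, None))
print(res)  # [1 3 4 4 5 6]
```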
Here's a vectorized way. Memory-inefficient, mostly for fun:
tA = sum(triu(bsxfun(@eq, A, A.')), 1);
tB = sum(triu(bsxfun(@eq, B, B.')), 1);
result = setdiff([A; tA].', [B; tB].', 'rows', 'stable');
result = result(:,1).';
The idea is to make each entry unique by tagging it with an occurrence number. The vectors become 2-column matrices, setdiff is applied with the 'rows' option, and then the tags are removed from the result.
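The tagging idea can be illustrated in Python as follows. This is a sketch of the same technique rather than a translation of the MATLAB code, and tag_occurrences is a hypothetical helper name.

```python
from collections import defaultdict

def tag_occurrences(values):
    """Pair each value with its running occurrence number (1, 2, 3, ...)."""
    seen = defaultdict(int)
    tagged = []
    for v in values:
        seen[v] += 1
        tagged.append((v, seen[v]))
    return tagged

A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5]
B = [2, 3, 5, 5]

# Once tagged, every entry is unique, so an ordinary set difference works.
tagged_b = set(tag_occurrences(B))
result = [v for v, k in tag_occurrences(A) if (v, k) not in tagged_b]
print(result)  # [4, 6, 4, 3, 1, 5]
```

As with the 'stable' option in the MATLAB version, the surviving elements keep their original order from A.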
You can use the second output of ismember to find the indices where elements of B occur in A, and diff to handle repeated values:
This answer assumes that B is already sorted. If that is not the case, B has to be sorted before running the solution below.
For the first example:
A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 3, 5, 5];
%B = sort(B); Sort if B is not sorted.
[~,col] = ismember(B,A);
indx = find(diff(col)==0);
col(indx+1) = col(indx)+1;
A(col) = [];
C = A;
>>C
4 6 4 3 1 5
For the second example:
A = [2, 4, 6, 4, 3, 3, 1, 5, 5, 5];
B = [2, 4, 5, 5];
%B = sort(B); Sort if B is not sorted.
[~,col] = ismember(B,A);
indx = find(diff(col)==0);
col(indx+1) = col(indx)+1;
A(col) = [];
C = A;
>>C
6 4 3 3 1 5
I'm not a fan of loops, but for random permutations of A this was the best I came up with.
C = A;
for x = 1:numel(B)
    C(find(C == B(x), 1, 'first')) = [];
end
I was curious about the effect of different orderings of A on a solution approach, so I set up a test like this:
Ctruth = [1 3 3 4 5 5 6];
for testNumber = 1:100
    Atest = A(randperm(numel(A)));
    C = myFunction(Atest,B);
    C = sort(C);
    assert(all(C==Ctruth));
end
Strongly inspired by Matt, but on my machine 40% faster:
function A = multiDiff(A,B)
    for j = 1:numel(B)
        for i = 1:numel(A)
            if A(i) == B(j)
                A(i) = [];
                break;
            end
        end
    end
end
If I have a sorted array of numerical values such as Double, Integer, and Time, what is the general logic to finding a complement?
Over my CS career in college, I've gotten better at understanding complements and edge cases for ranges. As I help students whose skill level and understanding match mine when I wrote this, I need a generalized way to convey this concept to them for single elements and ranges.
Try something like this:
def complement(l, universe=None):
    """
    Return the complement of a list of integers, as compared to
    a given "universe" set. If no universe is specified,
    consider the universe to be all integers between
    the minimum and maximum values of the given list.
    """
    if universe is not None:
        universe = set(universe)
    else:
        universe = set(range(min(l), max(l)+1))
    return sorted(universe - set(l))
then
l = [1,3,5,7,10]
complement(l)
yields:
[2, 4, 6, 8, 9]
Or you can specify your own universe:
complement(l, range(12))
yields:
[0, 2, 4, 6, 8, 9, 11]
To add another option - using a data type that is always useful to learn about, for these types of operations.
a = set([1, 3, 5, 7, 10])
b = set(range(1, 11))
c = sorted(list(b.symmetric_difference(a)))
print(c)
[2, 4, 6, 8, 9]
>>> nums = [1, 3, 5, 7, 10]
>>> [n + ((n&1)*2-1) for n in nums]
[2, 4, 6, 8, 9]
The easiest way is to iterate from the beginning of your list to the second-to-last element. Set j equal to the current element + 1. While j is less than the next number in your list, append it to your list of complements and increment it.
# find the skipped numbers in a list sorted in ascending order
def getSkippedNumbers(arr):
    complement = []
    for i in range(len(arr) - 1):
        # walk through the gap between this element and the next one
        j = arr[i] + 1
        while j < arr[i + 1]:
            complement.append(j)
            j += 1
    return complement
test = [1, 3, 5, 7, 10]
print(getSkippedNumbers(test))  # prints [2, 4, 6, 8, 9]
You can find the complement of one list relative to another using a list comprehension. Here we keep the elements of x that do not appear in y:
>>> x = [1, 3, 5, 7, 10]
>>> y = [1, 2, 3, 4, 8, 9, 20]
>>> z = [n for n in x if n not in y]
>>> z
[5, 7, 10]
>>>
Imagine that I have two arrays:
a = [1, 2, 5, 7, 6, 9, 8, 3, 4, 7, 0];
b = [5, 9, 6];
I want to find the indices of the values of b in a (only the first hit), i.e.:
c = [3, 6, 5];
Is there an easy, native MATLAB way to do this without looping and searching?
I have tried to use find() with:
find(a == b)
and it would work if you did this:
c = [];
for i = 1:length(b)
    index = find(a == b(i));
    c = [c, index(1)];
end
But ideally it would be easier than this.
You can compact your for loop easily with arrayfun into a simple one-liner:
arrayfun(@(x) find(a == x,1,'first'), b )
Also see Scenia's answer for newer MATLAB versions (>R2012b).
This is actually built into ismember. You just need to set the right flag; then it's a one-liner and you don't need arrayfun. Versions newer than R2012b use this behavior by default.
Originally, ismember would return the last occurrence if there were several; the R2012a flag makes it return the first one.
Here's my testing results:
a = [1, 2, 5, 7, 6, 9, 8, 3, 4, 7, 0, 6];
b = [5, 9, 6];
[~,c] = ismember(b,a,'R2012a');
>> c
c =
3 6 5
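To make the "first occurrence" semantics concrete, here is a small Python sketch of the same lookup. It is only an illustration of what the result means, with 1-based positions to match the MATLAB output; the name first_index is hypothetical.

```python
a = [1, 2, 5, 7, 6, 9, 8, 3, 4, 7, 0, 6]
b = [5, 9, 6]

# Record the first (1-based) position at which each value appears in a.
first_index = {}
for i, value in enumerate(a, start=1):
    first_index.setdefault(value, i)  # later duplicates are ignored

# 0 marks "not found", mirroring ismember's second output.
c = [first_index.get(value, 0) for value in b]
print(c)  # [3, 6, 5]
```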
This is a fix to the ismember approach that @Pursuit suggested. This way it handles multiple occurrences of one of the numbers, and returns the result in the correct order:
[tf,loc] = ismember(a,b);
tf = find(tf);
[~,idx] = unique(loc(tf), 'first');
c = tf(idx);
The result:
>> c
c =
3 6 5
a = [1, 2, 5, 7, 6, 9, 8, 3, 4, 7, 0, 6];
b = [5, 9, 6];
[r c]=find(bsxfun(@eq,a,b')');
[~,ia,~]=unique(c,'first');
>> r(ia)
ans =
3
6
5
Note: I added an extra 6 at the end of a to demonstrate finding only the first occurrence of each value.
You could try this:
[c,ind_a] = intersect(a,b)
Have you tried ismember?
c_logical_mask = ismember(a, b);
or
c_indexes = find(ismember(a, b));
a = [1, 2, 5, 7, 6, 9, 8, 3, 4, 7, 0];
b = [5, 9, 6];
c = dsearchn(a',b');
MATLAB requires a and b to be column vectors here, hence the transposes.
Similar to @tmpearce's answer, but possibly faster:
[valid, result] = max(bsxfun(@eq, a(:), b(:).')); %'// max finds first occurrence
result = result(valid); %// this is necessary in case some value of b is not in a
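The "max finds the first occurrence" trick has a close NumPy analogue in argmax, which also returns the first maximum along an axis. A minimal sketch, assuming 0-based indices as usual in Python:

```python
import numpy as np

a = np.array([1, 2, 5, 7, 6, 9, 8, 3, 4, 7, 0])
b = np.array([5, 9, 6])

# eq[i, j] is True where a[i] equals b[j].
eq = a[:, None] == b[None, :]

# A column with no True entries means that value of b is absent from a.
valid = eq.any(axis=0)

# argmax returns the first True in each column; drop the invalid columns.
result = eq.argmax(axis=0)[valid]
print(result)  # [2 5 4]
```

Adding 1 to result gives the 1-based indices 3, 6, 5 shown in the MATLAB answers.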