Scipy/Numpy: summation over multiple indices - arrays

Suppose I have an expression of which I need to find the sum, of the form

S = Σ_a Σ_b Σ_c x_a · y_b · z_c

where a runs from 0 up to some a_max, b runs from 0 up to a, c runs from c_min to c_max, and the bounds are finite and known. What is the fastest or most efficient way to go about calculating such a sum in scipy/numpy? It could be done with nested for loops, but is there a better way?

How about
np.dot(x[:amax], np.cumsum(y[:amax] * np.sum(z[cmin:cmax])))
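A quick way to convince yourself that this matches the nested loops is a small check with made-up arrays; the index ranges below (0 ≤ a < amax, 0 ≤ b ≤ a, cmin ≤ c < cmax) are assumptions read off the slicing above:

import numpy as np

rng = np.random.default_rng(0)
x, y, z = rng.random(50), rng.random(50), rng.random(50)
amax, cmin, cmax = 40, 10, 30

# naive nested loops, for reference
naive = sum(x[a] * y[b] * z[c]
            for a in range(amax)
            for b in range(a + 1)
            for c in range(cmin, cmax))

# vectorized one-liner from above
fast = np.dot(x[:amax], np.cumsum(y[:amax] * np.sum(z[cmin:cmax])))

print(np.isclose(naive, fast))  # True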

np.einsum may be an option for this kind of sum as well. As nevsan showed, though, for b, which is bounded by a, you need to use np.cumsum first, and np.einsum should not be faster in the given example.
It could look like this:
y_acc = np.add.accumulate(y[:amax]) # same as cumsum
result = np.einsum('i,i,j->', x[:amax], y_acc, z[cmin:cmax])
However, this is incredibly slow, because einsum does not exploit the fact that the z summation only needs to be done once, so you need to reformulate it by hand:
result = np.einsum('i,i->', x[:amax], y_acc) * z[cmin:cmax].sum()
In this case, however, this should still be slower than nevsan's np.dot-based approach, since dot should normally be better optimized (i.e. np.einsum('i,i->', a, b) is slower than np.dot(a, b)). However, if you have more arrays to sum over, it may be a nice option.
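For reference, a quick check (reusing the placeholder arrays from the snippet above) that the reformulated einsum agrees with the np.dot-based expression:

y_acc = np.cumsum(y[:amax])
via_einsum = np.einsum('i,i->', x[:amax], y_acc) * z[cmin:cmax].sum()
via_dot = np.dot(x[:amax], y_acc) * z[cmin:cmax].sum()
print(np.isclose(via_einsum, via_dot))  # True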

Related

Optimal combination of “baskets” to best approximate target basket?

Say I have some array of length n where arr[k] represents how much of object k I want. I also have some arbitrary number of arrays which I can sum integer multiples of in any combination - my goal being to minimise the sum of the absolute differences across each element.
So as a dumb example, if my target was [2,1] and my options were A = [2,3] and B = [0,1], then I could take A - 2B and have a cost of 0.
I’m wondering if there is an efficient algorithm for approximating something like this? It has a weird knapsack-y flavour to it - is it maybe just intractable for large n? It doesn’t seem very amenable to DP methods.
This is the (NP-hard) closest vector problem. There's an algorithm due to Fincke and Pohst ("Improved methods for calculating vectors of short length in a lattice, including a complexity analysis"), but I haven't personally worked with it.
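For very small n and a bounded coefficient range you can at least brute-force the stated objective directly; the sketch below (function name and coefficient range are made up) reproduces the A - 2B example from the question:

import itertools
import numpy as np

def best_combination(target, baskets, coeff_range=range(-3, 4)):
    # try every integer coefficient vector in the given range and keep the
    # one minimising the sum of absolute differences to the target
    target = np.asarray(target)
    baskets = [np.asarray(b) for b in baskets]
    best_cost, best_coeffs = None, None
    for coeffs in itertools.product(coeff_range, repeat=len(baskets)):
        approx = sum(c * b for c, b in zip(coeffs, baskets))
        cost = np.abs(target - approx).sum()
        if best_cost is None or cost < best_cost:
            best_cost, best_coeffs = cost, coeffs
    return best_coeffs, best_cost

print(best_combination([2, 1], [[2, 3], [0, 1]]))  # -> coefficients (1, -2), cost 0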

computing function of neighbors efficiently on lattice

I'm studying the Ising model, and I'm trying to efficiently compute a function H(σ) where σ is the current state of an LxL lattice (that is, σ_ij ∈ {+1, -1} for i,j ∈ {1,2,...,L}). To compute H for a particular σ, I need to perform the following calculation:

H(σ) = −J Σ_{⟨i j⟩} σ_i σ_j

where ⟨i j⟩ indicates that sites σ_i and σ_j are nearest neighbors and (suppose) J is a constant.
A couple of questions:
Should I store my state σ as an LxL matrix or as a length-L² list? Is one better than the other for memory access in RAM (which I guess depends on the way I'm accessing elements...)?
In either case, how can I best compute H?
Really, I think this boils down to how I can access (and manipulate) the neighbors of every site most efficiently.
Some thoughts:
I see that if I loop through each element in the list or matrix I'll be double counting, so is there a "best" way to return the unique neighbors?
Is there a better data structure that I'm not thinking of?
Your question is a bit broad and a bit confusing for me, so excuse me if my answer is not the one you are looking for, but I hope it will help (a bit).
An array is faster than a list when it comes to indexing. A matrix is a 2D array a[N][M] (where N and M are both L for you). That means that you first access a[i] and then a[i][j].
However, you can avoid this double access by emulating the 2D array with a 1D array. In that case, if you want to access element a[i][j] in your matrix, you would now do a[i * L + j].
That way you load only once, at the cost of a multiplication and an addition on the index, which may still be faster in some cases.
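As a minimal numpy illustration of that index arithmetic (the array contents are arbitrary):

import numpy as np

L = 4
a2d = np.arange(L * L).reshape(L, L)  # the L x L "matrix" view
a1d = a2d.ravel()                     # the same data as a flat, row-major array
i, j = 2, 3
print(a2d[i, j] == a1d[i * L + j])    # True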
Now as for the Nearest Neighbor question, it seems that you are using a square-lattice Ising model, which means that you are working in 2 dimensions.
A very efficient data structure for Nearest Neighbor Search in low dimensions is the kd-tree. The construction of that tree takes O(n log n), where n is the size of your dataset.
Now you should think if it's worth it to build such a data structure.
PS: There is a plethora of libraries implementing the kd-tree, such as CGAL.
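For what it's worth, a minimal sketch using scipy's kd-tree on the lattice coordinates (note that for a regular lattice the neighbors are already known from the indices, so this mainly pays off for irregular point sets):

import numpy as np
from scipy.spatial import cKDTree

L = 5
points = np.array([(i, j) for i in range(L) for j in range(L)], dtype=float)
tree = cKDTree(points)
# the 5 closest lattice points to site (2, 2): the site itself plus its 4 nearest neighbors
dists, idx = tree.query([2.0, 2.0], k=5)
print(points[idx])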
I encountered this problem during one of my school assignments and I think the solution depends on which programming language you are using.
In terms of efficiency, there is no better way than to write a for loop to sum neighbours (which are the set of 4 points {(i±1, j), (i, j±1)} for a given (i, j)). However, when SIMD (SSE etc.) functions are available, you can re-express this as a convolution with a 2D kernel {0 1 0; 1 0 1; 0 1 0}. So if you use a numerical library which exploits SIMD functions, you can obtain a significant performance increase. You can see an example implementation of this here (https://github.com/zawlin/cs5340/blob/master/a1_code/denoiseIsingGibbs.py).
Note that in this case, the performance improvement is huge, because evaluating it in plain Python requires an expensive for loop.
In terms of work, there is in fact some waste, namely the unnecessary multiplications and additions with the zeros at the corners and center of the kernel. So whether you see a performance improvement depends quite a bit on your programming environment (if you are already in C/C++, it can be difficult, and you may need to use MKL etc. to obtain a good improvement).
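A rough sketch of that idea in numpy/scipy; periodic boundaries and the sign convention H(σ) = -J Σ σ_i σ_j are assumptions, and each bond is counted twice by the kernel, hence the factor 0.5:

import numpy as np
from scipy.ndimage import convolve

kernel = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])

def ising_energy(sigma, J=1.0):
    # neighbor_sum[i, j] = sum of the four nearest neighbors of site (i, j)
    neighbor_sum = convolve(sigma, kernel, mode='wrap')
    return -0.5 * J * np.sum(sigma * neighbor_sum)

sigma = np.random.choice([-1, 1], size=(8, 8))
print(ising_energy(sigma))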

Updating values in an array with logical indexing with a non-constant value

A common problem I encounter when I want to write concise/readable code:
I want to update all the values of a vector matching a logical expression with a value that depends on the previous value.
For example, double all even entries:
weights = [10 7 4 8 3];
weights(mod(weights,2)==0) = weights(mod(weights,2)==0) * 2;
% weights = [20 7 8 16 3]
Is it possible to write the second line in a more concise fashion (i.e. avoiding the double use of the logical expression, something like i += 3 for i = i + 3 in other languages)? Since I often use this kind of vector operation in different contexts/variables, and I have long conditionals, I feel that my code is less concise and readable than it could be.
Thanks!
How about
ind = mod(weights,2)==0;
weights(ind) = weights(ind)*2;
This way you avoid calculating the indices twice and it's easy to read.
Regarding your other comment to Wauzl: such powerful operation capabilities are the Fortran side of things. This is purely MATLAB's design, and it is quickly getting obsolete. Let's take this horribleness further:
for i=1:length(weights),if (mod(weights(i),2)==0)weights(i)=weights(i)*2;end,end
It is even slightly faster than your two-liner, because there you are doing the conditional indexing twice, once on each side. In general, consider switching to Python 3.
Well, after more searching around, I found this link that deals with this issue (I used search before posting, I swear!), and there is interesting further discussion regarding this topic in the links in that thread. So apparently there are issues with ambiguity when introducing such an operator.
Looks like that is the price we have to pay in terms of syntactic limitations for having such powerful matrix operation capabilities.
Thanks a lot anyway, Wauzl!

Optimising Matrix Updating and Multiplication

Consider as an example the matrix
X(a,b) = [ a  b
           a  a ]
I would like to perform some relatively intensive matrix algebra computations with X, update the values of a and b and then repeat.
I can see two ways of storing the entries of X:
1) As numbers (i.e. floats). Then after our matrix algebra operation we update all the values in X to the correct values of a and b.
2) As pointers to a and b, so that after updating them, the entries of X are automatically updated.
Now, I initially thought method (2) was the way to go as it skips the updating step. However, I believe that using method (1) allows better use of the cache when doing, for example, matrix multiplication in parallel (although I am no expert, so please correct me if I'm wrong).
My hypothesis is that for inexpensive matrix computations you should use method (2), and that there is some threshold, as the computation becomes more complex, beyond which you should switch to (1).
I imagine this is not too uncommon a problem and my question is which is the optimal method to use for general matrices X?
Neither approach sounds particularly hard to implement. The simplest answer is make a test calculation, try both, and benchmark them. Take the faster one. Depending on what sort of operations you're doing (matrix multiplication, inversion, etc?) you can potentially reduce the computation by simplifying the operations given the assumptions you can make about your matrix structure. But I can't speak to that any more deeply since I'm not sure what types of operations you're doing.
But from experience, with a matrix that size, you probably won't see a performance difference. With larger matrices, you will, since the CPU's cache starts to fill. In that case, doing things like separating multiplication and addition operations, using pointer indexing, and passing inputs as const enables the compiler to make significant performance enhancements.
See
Optimized matrix multiplication in C and
Cache friendly method to multiply two matrices
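In that spirit, a rough benchmarking sketch in numpy (the per-iteration work is made up, and since method (2)'s aliasing has no direct numpy analogue, overwriting a preallocated buffer stands in for it):

import timeit
import numpy as np

def rebuild_each_time(n=50_000):
    a, b = 1.0, 2.0
    for _ in range(n):
        X = np.array([[a, b], [a, a]])   # method (1)-style: rebuild X from plain floats
        a = float((X @ X)[0, 0]) % 10.0  # some arbitrary update of a and b
        b = a + 1.0

def reuse_buffer(n=50_000):
    a, b = 1.0, 2.0
    X = np.empty((2, 2))
    for _ in range(n):
        X[0, 0] = X[1, 0] = X[1, 1] = a  # overwrite the entries in place
        X[0, 1] = b
        a = float((X @ X)[0, 0]) % 10.0
        b = a + 1.0

print(timeit.timeit(rebuild_each_time, number=1))
print(timeit.timeit(reuse_buffer, number=1))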

Minimize function in adjacent items of an array

I have an array (arr) of elements, and a function (f) that takes 2 elements and returns a number.
I need a permutation of the array such that f(arr[i], arr[i+1]) is as small as possible for each i in arr (and it should wrap around, i.e. it should also minimize f(arr[arr.length - 1], arr[0])).
Also, f works sort of like a distance, so f(a,b) == f(b,a)
I don't need the optimum solution if it's too inefficient, but one that works reasonably well and is fast, since I need to calculate it pretty much in real time (I don't know what the length of arr will be, but I think it could be somewhere around 30).
What does "such that f(arr[i], arr[i+1]) is as little as possible for each i in arr" mean? Do you want minimize the sum? Do you want to minimize the largest of those? Do you want to minimize f(arr[0],arr[1]) first, then among all solutions that minimize this, pick the one that minimizes f(arr[1],arr[2]), etc., and so on?
If you want to minimize the sum, this is exactly the Traveling Salesman Problem in its full generality (well, "metric TSP", maybe, if your f's indeed form a metric). There are clever optimizations to the naive solution that will give you the exact optimum and run in reasonable time for about n=30; you could use one of those, or one of the heuristics that give you approximations.
If you want to minimize the maximum, it is a simpler problem, although still NP-hard: you can do binary search on the answer; for a particular value d, draw edges between pairs which have f(x,y) ≤ d and check whether the resulting graph has a Hamiltonian cycle.
If you want to minimize it lexicographically, it's trivial: pick the pair with the shortest distance and put it as arr[0],arr[1], then pick the arr[2] that is closest to arr[1], and so on.
Depending on where your f(,)s are coming from, this might be a much easier problem than TSP; it would be useful for you to mention that as well.
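For completeness, a brute-force sketch of the "minimize the maximum" objective for very small n; this just enumerates cycles directly rather than doing the binary search described above, and all names are made up:

from itertools import permutations

def min_max_cycle(arr, f):
    # fix arr[0] (the cycle can be rotated freely) and try every ordering of the rest
    first, rest = arr[0], arr[1:]
    best_cost, best_cycle = None, None
    for perm in permutations(rest):
        cycle = [first, *perm]
        cost = max(f(cycle[i], cycle[(i + 1) % len(cycle)]) for i in range(len(cycle)))
        if best_cost is None or cost < best_cost:
            best_cost, best_cycle = cost, cycle
    return best_cycle, best_cost

print(min_max_cycle([5, 1, 9, 3, 7], lambda a, b: abs(a - b)))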
You're not entirely clear what you're optimizing - the sum of the f(a[i],a[i+1]) values, the max of them, or something else?
In any event, with your speed limitations, greedy is probably your best bet - pick an element to make a[0] (it doesn't matter which due to the wraparound), then choose each successive element a[i+1] to be the one that minimizes f(a[i],a[i+1]).
That's going to be O(n^2), but with 30 items that will be fine unless this is in an inner loop or something. If your f() really is symmetric (commutative) and otherwise well-behaved, then you might be able to do it in O(n log n); clearly no faster than that, by reduction from sorting.
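A minimal sketch of that greedy approach, assuming f is a plain Python function and arr a list (both names are placeholders):

def greedy_order(arr, f):
    remaining = list(arr)
    order = [remaining.pop(0)]  # the starting element is arbitrary since the tour wraps around
    while remaining:
        last = order[-1]
        nxt = min(remaining, key=lambda x: f(last, x))  # closest remaining element
        remaining.remove(nxt)
        order.append(nxt)
    return order

# e.g. ordering numbers so that adjacent (absolute) differences stay small
print(greedy_order([5, 1, 9, 3, 7], lambda a, b: abs(a - b)))  # [5, 3, 1, 7, 9]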
I don't think the problem is well-defined in this form:
Let's instead define n functions g_i : Perms → Reals by
g_i(p) = f(arr[p(i)], arr[p(i+1)]), with the index wrapping around when i+1 > n.
To say you want to minimize f over all permutations really implies you can pick a value of i and minimize g_i over all permutations, but for any p which minimizes g_i, a related but different permutation minimizes g_j (just conjugate the permutation). So it makes no sense to speak of minimizing f over permutations for each i.
Unless we know something more about the structure of f(x,y) this is an NP-hard problem. Given a graph G and any vertices x,y let f(x,y) be 1 if there is no edge and 0 if there is an edge. What the problem asks is an ordering of the vertices so that the maximum f(arr[i],arr[i+1]) value is minimized. Since for this function it can only be 0 or 1, returning a 0 is equivalent to finding a Hamiltonian path in G and 1 is saying that no such path exists.
The function would have to have some sort of structure that disallows this example for it to be tractable.

Resources