Matlab applying operations based on element indices - arrays

In Matlab, what is the preferred way to apply operations that make use of the indices of elements they're accessing? Some simple scenarios:
A(i, j) = A(i, j) + 2*i + 3*j
A(i,j) = A(i,j) + A(i+1,j+1)
Besides using loops, is there any straightforward way to make use of the matrix elements' indices when carrying out operations like these? Answers to similar questions, such as "Initialize MATLAB matrix based on indices", make extensive use of repmat(). While solutions involving repmat() work, they are not easy for someone who is not proficient in Matlab (such as myself) to develop when the problem gets more complicated.

There's often nothing wrong with using a for loop, so bear that in mind.
For your first case, I can't think of a solution that avoids repmat, arrayfun or similar tricks entirely. Something that could work is the following:
[m,n] = size(A);
A = A + 2*(1:m).'*ones(1,n) + 3*ones(m,1)*(1:n); % (1:m).'*ones(1,n) has i at entry (i,j); ones(m,1)*(1:n) has j
using matrix multiplication, but I agree that this is not very obvious!
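An equivalent but perhaps more readable sketch builds the index matrices explicitly with meshgrid (I and J are just illustrative names for the row- and column-index matrices):
[m,n] = size(A);
[J,I] = meshgrid(1:n,1:m); % I(i,j) == i (row index), J(i,j) == j (column index)
A = A + 2*I + 3*J;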
In your second example, Matlab's indexing can help. It's not clear what you want to happen to elements along the last row/column of A, but you can do something like this:
A(1:end-1,1:end-1)=A(1:end-1,1:end-1)+A(2:end,2:end)
although you may want to make a new matrix or save your old one if you need to do more things to it.
Your question is quite broad and there are many techniques; hopefully these two give you some ideas, and don't automatically reject using a for loop either. There are also lots of handy Matlab functions that can help with this sort of thing, and you will see them pop up in answers here.

For the first case, you could make use of bsxfun, which carries out a binary operation with singleton expansion on two arrays: it virtually replicates each input along any dimension where its size is 1 so that the two arrays end up the same size, then applies the binary operation element-wise. For your first example, you can do the following:
i = (1:10)'; % range for the first dimension (rows); note the transpose
j = 1:5; % range for the second dimension (columns)
A(i, j) = A(i, j) + bsxfun(@plus, 2*i, 3*j);
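As an aside, on newer MATLAB releases (R2016b and later) implicit expansion performs the same singleton expansion automatically, so the bsxfun call can be dropped; a minimal sketch for the whole matrix:
i = (1:size(A,1)).'; % column vector of row indices
j = 1:size(A,2); % row vector of column indices
A = A + 2*i + 3*j; % the two vectors expand to the size of A automatically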
For the second case, it's a simple matter of doing exactly what you've got there:
% define i and j - make sure that you won't get an out of bounds error
i = 1:10; % range for the first dimension
j = 2:8; % range for the second dimension
A(i,j) = A(i,j) + A(i+1,j+1)

Related

Good Hashing Function

I'm looking to make a hashtable to store some data that I need to access quickly, instead of iterating through a linked list, and I'm having problems defining a good hash function.
Consider S as the hashtable.
I initialize S[1,0] with labels (0,...,0) and S[1,w1] = (v11,v12),
then I have two loops, j=2 to N, a=0 to W.
N and W can be any positive integer.
In there, I do S[j,a] = addSomeDifferentStuff(S[j-1,a]), creating the node S[j,a].
I really can't find a hash function that doesn't create collisions; a friend of mine has suggested hash = j + a * W.
Any suggestions?
UPDATE:
OK, to clarify: this was an implementation of a solution to the bi-criteria 0-1 knapsack problem, based on a labeling algorithm that converts the knapsack problem into a shortest-path problem. W is my capacity, and N is the number of items. Consider wj to be the weight of item j.
Inside the loops, I check whether the item can be added; if it can, I set S[j,a] = S[j-1,a-wj] + values[j1,j2], and otherwise I just copy S[j,a] = S[j-1,a]. But accessing the labels in S[j-1,a] or S[j-1,a-wj] is expensive with linked lists, since I need to iterate through every element until I find the one I want. That is the purpose of the hashtable.
N and W can be any positive integer.
Well that's surely going to present a computability problem. You seem to be asking how to construct a perfect hash function for objects consisting of pairs of integers drawn from the ranges 0 ... N and 0 ... W, respectively. Such a function must compute (N + 1) * (W + 1) distinct values, and the bounds on N and W affect the suitable data types and algorithms.
Note, too, that it is probably most useful to consider the keys to be integer pairs, not integer powers, because N and W don't need to get very large before the powers involved are too large to be represented by any built-in type offered by your implementation. The pairs will be easier to work with on several levels.
a friend of mine has suggested hash = j + a * W.
I suppose your friend meant hash(j,a) = j + a * (N + 1). Provided that it does not overflow, that will produce a different value for each pair (j, a) drawn from the ranges specified. Alternatively, you could also use hash(j,a) = j * (W + 1) + a, subject to the same proviso about overflow. If indeed you need a perfect hash function over the full domain you've described, then I don't see much room for improvement over that on the performance side, except possibly by replacing the multiplication with a suitably-large left shift.
The values of those functions do vary with a and j in a completely systematic way, however, and that would be an undesirable characteristic for some uses of such a function. Finding a perfect hash function that does not have that property is a difficult problem. One typically would use a program such as gperf for such a task but that's not amenable to dynamic adaptation to different values of N and W.
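A quick sanity check of that pairing formula, sketched here in MATLAB (the bounds N and W below are arbitrary small values chosen just for illustration), showing that every (j, a) pair in the domain maps to a distinct value:
N = 7; W = 11; % arbitrary illustrative bounds
hashfun = @(j,a) j + a*(N + 1); % the suggested pairing function
[J,A] = meshgrid(0:N, 0:W); % every (j,a) pair in the domain
h = hashfun(J, A);
disp(numel(unique(h(:))) == numel(h)) % displays 1: all hash values are distinct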
Note that although that answers the question that I think you actually asked, I'm not certain it's what you are really looking for. Inasmuch as you seem to have rejected my characterization of S as an array of hashtables, instead going back to it being a singular hashtable, I suspect that you mean something different by the term "hashtable" than I do. Nevertheless, I take the question to be about the hash function, and the use to which you put that function is a separate concern.
Maybe look at https://github.com/Cyan4973/xxHash for both xxHash and its list of competing hash functions.

VBA vector output returning 0

I'm trying to brush up a little on my VBA skills and I got stuck on arrays. I have the very simple code below that takes in a few numbers in a vector, multiplies them by two and then returns the numbers. But the cells are all 0. In Locals the calculations are right and TestVector is populated correctly, so what seems to be the problem?
Function test(Vec)
    n = Vec.Rows.Count
    Dim TestVector
    ReDim TestVector(n, 1)
    For i = 1 To n
        A = Vec(i) * 2
        TestVector(i, 1) = A
    Next i
    test = TestVector
End Function
VBA arrays are 0-based by default. It is possible to override this with Option Base 1 at the top of the module, but that is generally frowned upon among VBA programmers. Instead, just declare the lower bounds explicitly:
ReDim TestVector(1 To n, 1 To 1)
Then your code will work as intended.
Even though Option Base 1 is probably not a good idea, using Option Explicit is an extremely good idea. It will save you a great deal of debugging time. You can do this once and for all by enabling Require Variable Declarations in the VBA editor options.

What is the algorithm to find K for finding the median of two sorted arrays in LeetCode

The solution for finding the median of two sorted arrays is awesome. However, I am still very confused about the code that calculates K:
var aMid = aLength * k / (aLength + bLength)
var bMid = k - aMid - 1
I guess this is the key part of the algorithm, and I really don't know why it is calculated like this. To explain more clearly what I mean: the core logic is divide and conquer, and lists of different sizes should be divided differently. I wonder why this formula works perfectly.
Can someone give me some insight into it? I searched lots of online documents and it is very hard to find material that explains this part well.
Many thanks in advance
The link shows two different ways of computing the comparison points in each array: one always uses k/2, even if the array doesn't have that many elements; the other (which you quote) tries to distribute the comparison points based on the size of the arrays.
As can be seen from these two examples, neither of which is optimal, it doesn't make much difference how you compute the comparison points, as long as the sizes of the two parts are generally linear in K (using a fixed size of 5 for one of the comparison points won't work, for example).
The algorithm effectively reduces the problem size by either aMid or bMid on each iteration. Ideally, the problem size would be reduced by k/2, and that's the computation you should use if both arrays have at least k/2 members. If one has too few members, you can set the comparison point for that array to its last element, and compute the other comparison point so that the total is k - 1. If you end up discarding all of the elements from some array, you can then immediately return element k of the other array.
That strategy will usually perform fewer iterations than either of the proposals in your link, but it is still O(log k).
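To make that concrete, here is a minimal sketch in MATLAB of the "reduce by roughly k/2 per step" strategy described above. It is not the linked solution; the function name, the 1-based indexing and the i + j == k convention are illustrative choices.
function val = kthSmallest(a, b, k)
% Return the k-th smallest element of the merged arrays a and b,
% both sorted ascending, without actually merging them.
% e.g. kthSmallest([1 3 5], [2 4], 3) returns 3.
    if numel(a) > numel(b) % keep a as the shorter array
        val = kthSmallest(b, a, k);
        return
    end
    if isempty(a) % only b is left
        val = b(k);
        return
    end
    if k == 1
        val = min(a(1), b(1));
        return
    end
    i = min(numel(a), floor(k/2)); % comparison point in a
    j = k - i;                     % comparison point in b, so i + j == k
    if a(i) <= b(j)
        % a(1:i) all sit at or below the k-th smallest: discard them
        val = kthSmallest(a(i+1:end), b, k - i);
    else
        % b(1:j) all sit at or below the k-th smallest: discard them
        val = kthSmallest(a, b(j+1:end), k - j);
    end
end
Each call discards roughly k/2 elements, which is where the O(log k) behaviour comes from; the median is then the k-th smallest (or the average of the k-th and (k+1)-th smallest for an even total) with k = floor((aLength + bLength + 1)/2).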

How to add up the elements of an array on the GPU? Any function similar to cublasDasum in CUBLAS?

I know I can do a parallel reduction to sum up the elements of an array in parallel.
But it is a little difficult for me to follow. I saw that in CUBLAS there is a function called cublasDasum that sums up the absolute values of the elements. It seems there should be a very similar function that sums up the elements themselves, not their absolute values. Is there any way I can find the source code of cublasDasum and see how this is done?
Adding up an array is such a basic operation. I can't believe there is no function that does it...
Take a look at the answers here for some good ideas. Thrust has pretty easy to use reduction operations.
You can sum all the elements of a matrix by treating it as a 1 x N array, creating an N x 1 array of ones, and doing a cublasDgemm operation.
I don't think you're going to find the source code for cublas anywhere.
You can use cublasDaxpy (the AXPY BLAS equivalent) with alpha = 1, which performs:
y = alpha*x + y
And if you work on matrices, you can use cublasDgeam (which has no BLAS equivalent).

Making a for-loop in Matlab faster by using arrayfun?

Currently I have the following portion of code:
for i = 2:N-1
res(i) = k(i)/m(i)*x(i-1) -(c(i)+c(i+1))/m(i)*x(N+i) +e(i+1)/m(i)*x(i+1);
end
where the variables k, m, c and e are vectors of size N and x is a vector of size 2*N. Is there any way to do this a lot faster using something like arrayfun? I couldn't figure it out. :( I especially want to make it faster by running it on the GPU later, so arrayfun would also be helpful, since Matlab doesn't support parallelizing for-loops and I don't want to buy the Jacket package...
Thanks a lot!
You don't have to use arrayfun. It works if you use some smart indexing:
clear all
N=20;
k=rand(N,1);
m=rand(N,1);
c=rand(N,1);
e=rand(N,1);
x=rand(2*N,1);
% for-based implementation
%Watch out, you are not filling the first element of forres!
forres=zeros(N-1,1); %Initialize array first to gain some speed.
for i = 2:N-1
forres(i) = k(i)/m(i)*x(i-1) -(c(i)+c(i+1))/m(i)*x(N+i) +e(i+1)/m(i)*x(i+1);
end
%vectorized implementation
parres=k(2:N-1)./m(2:N-1).*x(1:N-2) -(c(2:N-1)+c(3:N))./m(2:N-1).*x(N+2:2*N-1) +e(3:N)./m(2:N-1).*x(3:N);
%compare results; strip the first element from forres
difference=forres(2:end)-parres %#ok<NOPTS>
Firstly, MATLAB does support parallel for loops via PARFOR. However, that doesn't have much chance of speeding up this sort of computation since the amount of computation is small compared to the amount of data you're reading and writing.
To restructure things for gpuArray arrayfun, you need to make every array reference in the loop body depend only on the current loop index, and have the loop run across the full range. You should be able to do this by offsetting some of the arrays and padding with dummy values: for example, to stand in for x(i-1) you can build a shifted copy of x so that x_1(i) equals x(i-1) (e.g. x_1 = [NaN; x(1:N-1)] for a column vector x). A sketch of that idea follows.
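A minimal sketch of that approach, assuming the Parallel Computing Toolbox is available; rather than padding, it simply restricts the computation to i = 2:N-1 and builds the shifted copies up front (the helper names are my own):
idx = 2:N-1; % the interior indices the loop covers
k_g = gpuArray(k(idx)); m_g = gpuArray(m(idx));
c_g = gpuArray(c(idx)); cp1 = gpuArray(c(idx+1));
ep1 = gpuArray(e(idx+1));
xm1 = gpuArray(x(idx-1)); % stands in for x(i-1)
xNi = gpuArray(x(N+idx)); % stands in for x(N+i)
xp1 = gpuArray(x(idx+1)); % stands in for x(i+1)
% arrayfun on gpuArray inputs runs this element-wise function as a GPU kernel
f = @(k,m,c,cp,ep,xm,xn,xp) k./m.*xm - (c+cp)./m.*xn + ep./m.*xp;
res = zeros(N-1,1);
res(idx) = gather(arrayfun(f, k_g, m_g, c_g, cp1, ep1, xm1, xNi, xp1)); % copy the result back to the CPU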
