I have a matrix S(n x m) and a vector Sigma(n), and I would like to multiply each row S(i) by Sigma(i).
I have thought of 3 things:
-> Convert Sigma to a square diagonal matrix and compute S = Sigma * S, but it seems those functions exist only for general or triangular matrices...
-> Multiply each row by the scalar Sigma[i] using DSCAL, in a loop
-> mkl_ddiamm, but it seems kind of obscure to me.
Any advice on how I should implement that? Thank you!
This is a very simple operation, but MKL/BLAS does not provide a dedicated function for it. You could implement it yourself with two for loops:
for(int i=0; i<nrow; ++i) {
    for(int j=0; j<ncols; ++j) {
        s[i][j] *= sigma[i];   /* scale every element of row i by sigma[i] */
    }
}
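If you want to stay with BLAS calls, the DSCAL-per-row idea from the question also works; a minimal sketch, assuming S is stored row-major as a flat array of doubles with n rows and m columns:

#include <mkl.h>   /* or <cblas.h> for a generic CBLAS */

/* Scale row i of S (n x m, row-major) by Sigma[i]. */
void scale_rows(double *S, const double *Sigma, int n, int m)
{
    for (int i = 0; i < n; ++i)
        cblas_dscal(m, Sigma[i], S + i * m, 1);
}

For column-major storage the elements of row i are n apart, so the call would be cblas_dscal(m, Sigma[i], S + i, n) instead.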
//Cannot understand use of this function
for(int i=0; i<n; i++) {
    for(int j=0; j<n; j++) {
        double sum = 0;
        for(int k=0; k<n; k++) {
            //Why is i*n+k used here?
            sum += A[i*n+k]*A[j*n+k];
        }
        C[i*n+j] = sum;
    }
}

int main() {
    double *m4 = (double*)malloc(sizeof(double)*n*n);
    //Why was gemm_ATA function used here?
    gemm_ATA(m3, m4, n); //make a positive-definite matrix
    printf("\n");
    //show_matrix(m4,n);
}
I am making a project for parallelizing the Cholesky method and found some useful code. In the given project this function is used, and I have no idea why.
Also, can someone help me understand the code and the functions used in it? The code is given at this link:
http://coliru.stacked-crooked.com/a/6f5750c20d456da9
The function gemm_ATA takes an input matrix A and calculates C = A^T * A, which is positive semi-definite by construction (whether it is strictly positive definite depends on the properties of the input matrix).
Mathematically, calculating this matrix would be:
c_i,j = sum_k a_k,i * a_k,j
c_i,j is the entry of C in the i-th row and j-th column. The expressions i*n+k and j*n+k transform these 2D indices (row and column) to a 1D index of the underlying array.
gemm_ATA calculates A*A^T and stores it in C. A^T is A flipped over its diagonal, so A*A^T multiplies each row of A (call it A[i,-]) with each column of A^T, which is itself a row of A (A[j,-]).
Also, if you flatten an n*n 2D matrix into a 1D array, the first element of the i-th row sits at index i*n + 0, so the k-th element of the i-th row is A[i*n+k].
Note that since you pass C by reference to your function, after calling the function, m4 is your positive definite matrix created from m3.
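As a quick sanity check of that indexing on a concrete matrix (this small driver is mine, not from the linked code):

#include <stdio.h>

int main(void)
{
    int n = 2;
    double A[4] = {1, 2, 3, 4};   /* A = [1 2; 3 4], stored row-major */
    double C[4];

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            double sum = 0;
            for (int k = 0; k < n; k++)
                sum += A[i*n + k] * A[j*n + k];   /* row i of A dotted with row j */
            C[i*n + j] = sum;
        }

    /* expected output: 5 11 / 11 25 - symmetric, as described above */
    for (int i = 0; i < n; i++)
        printf("%g %g\n", C[i*n + 0], C[i*n + 1]);
    return 0;
}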
I'm working on a demo that requires a lot of vector math, and in profiling, I've found that it spends the most time finding the distances between given vectors.
Right now, it loops through an array of X^2 vectors and finds the distance between each pair, meaning it runs the distance function X^4 times, even though (I think) only about half of those distances are unique.
It works something like this: (pseudo c)
#define MATRIX_WIDTH 8
typedef float vec2_t[2];
vec2_t matrix[MATRIX_WIDTH * MATRIX_WIDTH];
...
for(int i = 0; i < MATRIX_WIDTH; i++)
{
for(int j = 0; j < MATRIX_WIDTH; j++)
{
float xd, yd;
float distance;
for(int k = 0; k < MATRIX_WIDTH; k++)
{
for(int l = 0; l < MATRIX_WIDTH; l++)
{
int index_a = (i * MATRIX_WIDTH) + j;
int index_b = (k * MATRIX_WIDTH) + l;
xd = matrix[index_a][0] - matrix[index_b][0];
yd = matrix[index_a][1] - matrix[index_b][1];
distance = sqrtf(powf(xd, 2) + powf(yd, 2));
}
}
// More code that uses the distances between each vector
}
}
What I'd like to do is create and populate an array of (X^2) / 2 distances without redundancy, then reference that array when I finally need it. However, I'm drawing a blank on how to index this array in a way that would work. A hash table would do it, but I think it's much too complicated and slow for a problem that seems like it could be solved by a clever indexing method.
EDIT: This is for a flocking simulation.
Performance ideas:
a) if possible, work with the squared distance, to avoid the root calculation
b) never use pow for constant integer powers - use xd*xd instead
I would consider changing your algorithm - O(n^4) is really bad. When dealing with pairwise interactions in physics (also O(n^4) for distances on a 2D field) one would use spatial partitioning (quadtrees, k-d trees and the like) and neglect the interactions with low impact. But it will depend on what the "more code that uses the distances..." really does.
Just some considerations: the number of unique distances is 0.5*n*(n+1) with n = w*h.
If you write down where unique distances occur, you will see that both inner loops can be reduced by starting them at i and j.
Additionally, if you only need to access those distances via the matrix indices, you can set up a 4D distance matrix.
If memory is limited we can save nearly 50%, as mentioned above, with a lookup function that accesses a triangular matrix, as Code-Guru said. We would probably precalculate the line indices to avoid summing them up on every access (a sketch of how to fill these arrays follows after the code):
float distanceArray[(H*W+1)*H*W/2];   /* packed triangular matrix of unique distances */
int lineIndices[H*W];                 /* lineIndices[j] = j*(j+1)/2, start of line j */

float searchDistance(int i, int j)
{
    return i<j ? distanceArray[i+lineIndices[j]] : distanceArray[j+lineIndices[i]];
}
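A sketch of how those two arrays could be filled up front (the buildDistances name and the points array are mine, not from the original code; W and H are the grid dimensions, so there are N = W*H points):

#include <math.h>

#define W 8
#define H 8
#define N (W * H)

static float distanceArray[N * (N + 1) / 2];
static int   lineIndices[N];

/* Precompute the start offset of each line of the packed triangular matrix
   and store every unique pairwise distance exactly once. */
static void buildDistances(const float points[N][2])
{
    for (int i = 0; i < N; i++)
        lineIndices[i] = i * (i + 1) / 2;   /* line i starts after i*(i+1)/2 entries */

    for (int j = 0; j < N; j++) {
        for (int i = 0; i <= j; i++) {
            float xd = points[i][0] - points[j][0];
            float yd = points[i][1] - points[j][1];
            distanceArray[i + lineIndices[j]] = sqrtf(xd * xd + yd * yd);
        }
    }
}

With that in place, searchDistance(i, j) from the answer above returns the precomputed distance for any pair of points, each distance having been computed only once.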
I'm trying to concatenate a matrix with itself in C, and the only idea that crossed my mind is addition, but it doesn't work. For example, if I have {1,1;2,2}, my new matrix should be {1,1,1,1;2,2,2,2}: I want to double the length of each row. I Googled, but I didn't find anything.
Here is my code:
matrix2 = realloc(matrix1, sizeof(int*) * row);
int i, j;
for(i = 0; i < row; i++){
    for(j = 0; j < col; j++){
        matrix2[i][j] = matrix1[i][j] + matrix1[i][j];
    }
}
Use the pseudocode I provide below. Note that for any C before C99, you cannot declare arrays as int matrix[2*W][H] (if W and H are not #defines).
Given matrix1 and matrix2 of equal W,H
make matrix3 of 2*W,H
for h to H
    for i to W
        matrix3[h][i] = matrix1[h][i]
        matrix3[h][i+W] = matrix2[h][i]
Making the matrix will require 1 malloc per row, plus 1 malloc to store the array of row pointers.
Note how you will need 2 assignments in the loop instead of the one you had before. This is because you are setting in two places.
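A minimal C sketch of that pseudocode, assuming int matrices represented as an array of row pointers (the name concat_cols is mine):

#include <stdlib.h>

/* Build a new H x 2*W matrix whose left half is matrix1 and right half is matrix2. */
int **concat_cols(int **matrix1, int **matrix2, int H, int W)
{
    int **matrix3 = malloc(H * sizeof *matrix3);          /* one pointer per row */
    for (int h = 0; h < H; h++) {
        matrix3[h] = malloc(2 * W * sizeof *matrix3[h]);  /* one malloc per row */
        for (int i = 0; i < W; i++) {
            matrix3[h][i]     = matrix1[h][i];
            matrix3[h][i + W] = matrix2[h][i];
        }
    }
    return matrix3;
}

To concatenate a matrix with itself, as in the question, pass the same pointer for matrix1 and matrix2. Error checking of malloc is omitted for brevity.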
You sound like you have a background in higher-level languages like MATLAB. In C the plus operator does not concatenate matrices; it adds the values element-wise and stores the result in the new matrix.
Here we copy the input matrix into the new matrix twice:
for(int i = 0; i < m; i++){
    for(int j = 0; j < n; j++){
        mat2[i][j] = mat[i][j];
    }
}
for(int i = 0; i < m; i++){
    for(int j = n; j < 2*n; j++){
        mat2[i][j] = mat[i][j-n];
    }
}
I am trying to calculate the Mahalanobis distance between two vectors a and b. Eventually, I will be using this as a distance measure in statistical algorithms. I am using gsl to implement them. The formula for the mahalanobis distance is sqrt((a-b)'c^-1(a-b)), where c is the covariance matrix. According to this gsl documentation, it takes in two data sets and returns one covariance value. I am not sure how to calculate the covariance matrix using that.
Any help is appreciated.
Thanks.
I think you need to understand the calculation of a covariance matrix first. Second, here is some sample code to get you started:
gsl_vector_view a, b;
for (i = 0; i < A->size2; i++) {
    for (j = i; j < A->size2; j++) {
        a = gsl_matrix_column (A, i);
        b = gsl_matrix_column (A, j);
        double cov = gsl_stats_covariance(a.vector.data, a.vector.stride,
                                          b.vector.data, b.vector.stride, a.vector.size);
        gsl_matrix_set (C, i, j, cov);   /* fills the upper triangle of C */
    }
}
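Once C holds the full covariance matrix (the loop above only fills the upper triangle, so mirror it into the lower triangle first), one way to get the Mahalanobis distance itself is to invert C and apply the formula from the question. A sketch, with the function name and temporaries being my own choices:

#include <math.h>
#include <gsl/gsl_blas.h>
#include <gsl/gsl_linalg.h>

/* Returns sqrt((a-b)^T C^{-1} (a-b)). */
double mahalanobis(const gsl_vector *a, const gsl_vector *b, const gsl_matrix *C)
{
    size_t k = a->size;
    gsl_vector *diff = gsl_vector_alloc(k);
    gsl_vector *tmp  = gsl_vector_alloc(k);
    gsl_matrix *LU   = gsl_matrix_alloc(k, k);
    gsl_matrix *Cinv = gsl_matrix_alloc(k, k);
    gsl_permutation *p = gsl_permutation_alloc(k);
    int signum;
    double result;

    gsl_vector_memcpy(diff, a);
    gsl_vector_sub(diff, b);                  /* diff = a - b */

    gsl_matrix_memcpy(LU, C);
    gsl_linalg_LU_decomp(LU, p, &signum);
    gsl_linalg_LU_invert(LU, p, Cinv);        /* Cinv = C^-1 */

    gsl_blas_dgemv(CblasNoTrans, 1.0, Cinv, diff, 0.0, tmp);  /* tmp = Cinv * diff */
    gsl_blas_ddot(diff, tmp, &result);                        /* result = diff^T * tmp */

    gsl_vector_free(diff);
    gsl_vector_free(tmp);
    gsl_matrix_free(LU);
    gsl_matrix_free(Cinv);
    gsl_permutation_free(p);
    return sqrt(result);
}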
A is an MxK matrix, B is a vector of size K, and C is a KxN matrix. What set of BLAS routines should I use to compute the matrix below?
M = A*diag(B)*C
One way to implement this would be using three for loops like below (note the accumulation into M(i,j), which is assumed to start at zero):
for (int i=0; i<M; ++i)
    for (int j=0; j<N; ++j)
        for (int k=0; k<K; ++k)
            M(i,j) += A(i,k)*B(k)*C(k,j);
Is it actually worth implementing this in BLAS in order to gain better speed efficiency?
First compute D = diag(B)*C, then use the appropriate BLAS matrix-multiply to compute A*D.
You can implement diag(B)*C with a loop over the elements of B, calling the appropriate BLAS scaling routine (DSCAL) on each row of C.
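A sketch of that approach with CBLAS, assuming row-major storage (note that it scales C in place to form D = diag(B)*C, so copy C first if you still need it afterwards):

#include <cblas.h>   /* or mkl.h when using MKL */

/* Out = A * diag(B) * C, with A (MxK), B (K) and C (KxN), all row-major. */
void a_diagb_c(int M, int N, int K,
               const double *A, const double *B, double *C, double *Out)
{
    /* D = diag(B)*C: scale row k of C by B[k], in place. */
    for (int k = 0; k < K; ++k)
        cblas_dscal(N, B[k], C + k * N, 1);

    /* Out = A * D */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                M, N, K, 1.0, A, K, C, N, 0.0, Out, N);
}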