Intel MKL Sparse BLAS mm CSR one-based indexing not working - c

I am testing the functions of Intel MKL in a C test program, and I just can't make one-based CSR indexing work with the Sparse BLAS function mkl_scsrmm. I am using the CSR variation with the val, columns, pntrb and pntre arrays. The original examples were placed in:
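This is the first code (only the one-based variant is shown here; the zero-based variant uses columns = {0,1,0,1}, rowIndex = {0,2,4} and matdescra[3] = 'C'):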
Example #1
#include <stdio.h>
#include "mkl_types.h"
#include "mkl_spblas.h"
int main() {
#define M 2
#define NNZ 4
    MKL_INT m = M, nnz = NNZ;
    float values[NNZ] = {2.0,4.0,4.0,2.0};
    MKL_INT columns[NNZ] = {1,2,1,2};
    MKL_INT rowIndex[M+1] = {1,3,5};
#define N 2
    MKL_INT n = N;
    float b[M][N] = {2.0, 1.0, 5.0, 2.0};
    float c[M][N] = {0.0, 0.0, 0.0, 0.0};
    float alpha = 1.0, beta = 0.0;
    char transa, uplo, nonunit;
    char matdescra[6];
    MKL_INT i, j, is;
    transa = 'N';
    matdescra[0] = 'S';  // symmetric matrix
    matdescra[1] = 'L';  // lower triangle is used
    matdescra[2] = 'N';  // non-unit diagonal
    matdescra[3] = 'F';  // one-based (Fortran-style) indexing
    mkl_scsrmm(&transa, &m, &n, &m, &alpha, matdescra, values, columns, rowIndex, &(rowIndex[1]), &(b[0][0]), &n, &beta, &(c[0][0]), &n);
    printf(" \n");
    for (i = 0; i < m; i++) {
        for (j = 0; j < n; j++) {
            printf("%7.1f", c[i][j]);
        }
        printf("\n");
    }
    return 0;
}
These are the results I get:
Zero-based indexing (the right one):
24.0 10.0
18.0 8.0
One-based indexing:
8.0 10.0
18.0 24.0
Although it looks as if only the positions of the diagonal elements change, with a 3x3 matrix the solution is completely different from the right one. I suspected that it might have something to do with the input format of the matrix b; I think the description of array b for mkl_scsrmm in the MKL reference manual lacks clarity. So I changed the format of b in this example and it worked (I placed the elements in this order: {2.0, 5.0, 1.0, 2.0}). But I did the same for another 3x3 example I coded and it didn't work, so I think it may be just a coincidence. I don't really know what to do about this problem; I would like to understand what happens here.
CSR format
Sparse BLAS mkl_scsrmm function
Sparse BLAS Interface Considerations
And here is another example, the 3x3 one:
Example #2
// matrix A
// 2 4 3
// 4 2 1
// 3 1 6
// matrix B
// 2 1 3
// 4 5 6
// 7 8 9
// Zero-based indexing:
// a = {2 4 3 4 2 1 3 1 6}
// columns = {0 1 2 0 1 2 0 1 2}
// rowIndex = {0 3 6 9}
// b = {2 1 3 4 5 6 7 8 9} (row-major array)
// We print the array in row-major order
// One-based indexing:
// a = {2 4 3 4 2 1 3 1 6}
// columns = {1 2 3 1 2 3 1 2 3}
// rowIndex = {1 4 7 10}
// b = {2 4 7 1 5 8 3 6 9} (column-major array)
// We print the array in column-major order (because the result is in column-major order, i.e. transposed)
#include <stdio.h>
#include "mkl_types.h"
#include "mkl_spblas.h"
int main()
{
#define M 3
#define NNZ 9
#define N 3
    MKL_INT m = M, nnz = NNZ, n=N;
    float a[NNZ] = {2.0,4.0,3.0,4.0,2.0,1.0,3.0,1.0,6.0};
    MKL_INT columns[NNZ] = {0,1,2,0,1,2,0,1,2};
    MKL_INT rowIndex[M+1] = {0,3,6,9};
    float b[M][N] = {2.0, 1.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0};
    float c[M][N] = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
    float alpha = 1.0, beta = 0.0;
    MKL_INT i, j;
    char transa;
    char matdescra[6];
    float a1[NNZ] = {2.0,4.0,3.0,4.0,2.0,1.0,3.0,1.0,6.0};
    MKL_INT columns1[NNZ] = {1,2,3,1,2,3,1,2,3};
    MKL_INT rowIndex1[M+1] = {1,4,7,10};
    float b1[M][N] = {2.0, 4.0, 7.0, 1.0, 5.0, 8.0, 3.0, 6.0, 9.0};
    float c1[M][N] = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
    transa = 'n';
    matdescra[0] = 's';
    matdescra[1] = 'l';
    matdescra[2] = 'n';
    matdescra[3] = 'c';  // zero-based indexing
    mkl_scsrmm(&transa, &m, &n, &m, &alpha, matdescra, a, columns, rowIndex, &(rowIndex[1]), &(b[0][0]), &n, &beta, &(c[0][0]), &n);
    printf(" \n");
    printf(" Right Solution: ZERO-BASED: C \n");
    for (i = 0; i < m; i++) {
        for (j = 0; j < n; j++) {
            printf("%7.1f", c[i][j]);
        }
        printf(" \n");
    }
    printf(" ZERO-BASED: C' \n");
    for (i = 0; i < m; i++) {
        for (j = 0; j < n; j++) {
            printf("%7.1f", c[j][i]);
        }
        printf(" \n");
    }
    matdescra[3] = 'f';  // one-based indexing
    mkl_scsrmm(&transa, &m, &n, &m, &alpha, matdescra, a1, columns1, rowIndex1, &(rowIndex1[1]), &(b1[0][0]), &n, &beta, &(c1[0][0]), &n);
    printf(" \n");
    printf(" ONE-BASED: C \n");
    for (i = 0; i < m; i++) {
        for (j = 0; j < n; j++) {
            printf("%7.1f", c1[i][j]);
        }
        printf(" \n");
    }
    printf(" ONE-BASED: C' \n");
    for (i = 0; i < m; i++) {
        for (j = 0; j < n; j++) {
            printf("%7.1f", c1[j][i]);
        }
        printf(" \n");
    }
    return 0;
}

I asked the same question on Intel's forum, got some help there, and arrived at the solution. The point is that when you call the routine from the C interface with zero-based indexing, you can pass the dense matrices stored in row-major order (native C array storage), but when you call it with one-based indexing you have to store them in column-major order. This changes the way the matrices B and C need to be stored, and so also the way the result is stored. For the matrix A only the indexing changes (from 0 to 1). From Intel's documentation you might think that the C interface always accepts row-major ordering for both types of indexing.
Note that, in general, column-major ordering is not the same as storing the transposed matrix in row-major ordering (it is the same if the matrices are square).
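To make the storage difference concrete, here is a minimal sketch of a helper that copies a dense matrix from row-major to column-major order before a one-based call. The name row_to_col_major is hypothetical and not part of MKL; it only illustrates the index mapping.

```c
#include <assert.h>

/* Hypothetical helper (not an MKL routine): copy a rows x cols matrix
   stored row-major in src into dst in column-major order.
   Both buffers hold rows * cols floats and must not overlap. */
void row_to_col_major(int rows, int cols, const float *src, float *dst)
{
    assert(rows > 0 && cols > 0);
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            dst[j * rows + i] = src[i * cols + j];  /* element (i, j) */
}
```

Applied to the 3x3 b of Example #2, {2,1,3,4,5,6,7,8,9}, this yields {2,4,7,1,5,8,3,6,9}, which is the b1 array used in the one-based call.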


set of directions for an array to follow in C

So for my coding class (in C), we had to write a program that fills an N x N square (N >= 4) in a certain manner: from left to right, then down, then right to left, then up, ending back at the origin.
After a while, we arrived at a solution that repeats the same set of instructions for each layer (ring), and got this program:
#include <stdio.h>
#define N 4
int main(void){
    int map[N][N];
    int dirs[4][2] = {
        {0, 1},
        {1, 0},
        {0, -1},
        {-1, 0}
    };
    for (int layer=0; layer < (N+1)/2; layer++){
        // for each layer the starting point is (layer, layer)
        // for each layer and each direction the number of repeat is N - layer*2 -1
        int x=layer, y=layer;
        int number = 1;
        map[x][y] = number; // in case of N is odd
        for (int dir=0; dir < 4; dir ++){
            for (int i=0; i<N-layer*2-1; i++){
                map[x][y] = number;
                number ++;
                x = x + dirs[dir][0];
                y = y + dirs[dir][1];
            }
        }
    }
    printf("Final map is: \n");
    for (int i = 0; i < N; i++){
        for (int j = 0; j < N; j++){
            printf("%4d ", map[i][j]);
        }
        printf("\n");
    }
    return 0;
}
Final map is:
1 2 3 4
12 1 2 5
11 4 3 6
10 9 8 7
Process finished with exit code 0
But I don't understand why we put the supposed last instruction {0,1} (meaning go up 1 element and stay in the same column) as the first instruction, considering we start counting from 1 and end up at 4 by repeating the {0,1} instruction N-layer*2-1 times (3 if N=4 and we are in the first layer).
Shouldn't dirs be:
int dirs[4][2] = {
    {1, 0},
    {0, -1},
    {-1, 0},
    {0, 1}
};
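{0, 1} does not mean "go up". Each entry of dirs is {dx, dy}, and in map[x][y] the first index x selects the row while the second index y selects the column. So {0, 1} keeps the row and moves from column y=0 to column y=1, i.e. rightwards (ryyker's comment points this out).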
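As an illustration of the {row step, column step} convention, here is a sketch of a variant of the program above. It is not the class solution: the function name spiral_fill is made up, and unlike the original it keeps counting across layers instead of restarting at 1 for each ring.

```c
#include <assert.h>

/* Sketch: fill an n x n array in a clockwise spiral starting at (0, 0).
   dirs[k] = {row step, column step}: right, down, left, up. */
void spiral_fill(int n, int map[n][n])
{
    static const int dirs[4][2] = { {0, 1}, {1, 0}, {0, -1}, {-1, 0} };
    int number = 1;

    assert(n > 0);
    for (int layer = 0; layer < (n + 1) / 2; layer++) {
        int x = layer, y = layer;          /* each ring starts on the diagonal */
        int steps = n - 2 * layer - 1;     /* cells written per side of this ring */

        if (steps == 0) {                  /* odd n: single centre cell */
            map[x][y] = number++;
            break;
        }
        for (int dir = 0; dir < 4; dir++) {
            for (int i = 0; i < steps; i++) {
                map[x][y] = number++;
                x += dirs[dir][0];         /* x is the row index */
                y += dirs[dir][1];         /* y is the column index */
            }
        }
    }
}
```

With n = 4 the first row becomes 1 2 3 4, because the first direction {0, 1} leaves the row unchanged and increments the column.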

Moving through each element of a (potentially non square) 2d array, diagonally

I was playing around with 2d arrays in C, and I am wondering how to traverse a 2d array fully and diagonally.
Horizontally, in a matrix of dimensions width, height,
you can just move through each index i and inspect the elements at index j.
Something like:
#define WIDTH 10
#define HEIGHT 10
int mat[WIDTH][HEIGHT] = {0};
for (int i = 0; i < WIDTH; i++) {
    for (int j = 0; j < HEIGHT; j++) {
        mat[i][j] = j;
    }
}
I just added in something random so the loop did something; the key is that I was traversing in the correct direction.
Vertically would be similar, with some flipped parameters.
Diagonally, however, I am a bit lost; I cannot think of a way to traverse in a diagonal way. Conceptually I may want to hit the 4x3 matrix in the following order:
1 2 4 7
3 5 8 10
6 9 11 12
Or with indices i,j :
0,0 ->
1,0 -> 0,1 ->
2,0 -> 1,1 -> 0,2 ->
2,1 -> 1,2 -> 0,3 ->
2,2 -> 1,3 ->
Is there a straightforward way to hit these elements (not necessarily in that order per se, but I think it would be useful to increment diagonally)?
Also, is it possible to check the diagonals in the opposite direction?
To traverse only the main diagonal, a single for loop is enough, as i and j are identical there:
for (i = 0; i < min(width, height); i++){
    // something to do with element[i][i]
}
This is a duplicate of "Traverse Matrix in diagonal strips", where something like this is suggested:
#include <stdio.h>
int main(void)
{
    int m[3][4] = { 1, 2, 4, 7,
                    3, 5, 8,10,
                    6, 9,11,12 };
    const int h = 3; /* height*/
    const int w = 4; /* width */
    for (int diag = 0; diag < (w + h - 1); ++diag)
    {
        printf("diag %d:",diag);
        /* first and one-past-last row index that lie on this anti-diagonal */
        int i_start = diag < w ? 0 : diag - w + 1;
        int i_end = diag < h ? diag + 1 : h;
        for (int i = i_start; i < i_end; ++i)
        {
            printf("%d ",m[i][diag - i]);
        }
        printf("\n");
    }
    return 0;
}
diag 0:1
diag 1:2 3
diag 2:4 5 6
diag 3:7 8 9
diag 4:10 11
diag 5:12
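For the diagonals in the opposite direction (running from top-left to bottom-right), the same idea works if each strip is identified by the constant difference d = j - i instead of the sum i + j. The following is only a sketch under that assumption; the function name and the output-array interface are made up for illustration.

```c
/* Sketch: visit an h x w matrix along its top-left to bottom-right
   diagonals, starting with the strip in the bottom-left corner.
   Every cell on one strip shares the same d = j - i.
   Visited values are written to out[] (which needs room for h * w ints);
   the number of values written is returned. */
int collect_main_diagonals(int h, int w, int m[h][w], int out[])
{
    int count = 0;
    for (int d = -(h - 1); d <= w - 1; d++) {
        int i_start = d < 0 ? -d : 0;            /* keep j = i + d >= 0 */
        int i_end = (w - d < h) ? w - d : h;     /* keep j < w and i < h (exclusive) */
        for (int i = i_start; i < i_end; i++)
            out[count++] = m[i][i + d];
    }
    return count;
}
```

For the 3x4 matrix above, this visits 6, then 3 9, then 1 5 11, then 2 8 12, then 4 10, and finally 7.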

Program to print and display identity matrix in C

I am having trouble writing a program that prints a matrix and then generates the identity matrix. Here is my code below; any help would be greatly appreciated.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int PrintMatrix(int dim, double matrix[dim][dim]);
int main()
int PrintMatrix(int dim, double matrix[dim][dim]) {
int aa, bb;
for (aa = 0; aa <= dim; aa++) {
for (bb = 0; bb <= dim; bb++) {
printf("%lf ", matrix[aa][bb]);
double TestMatrix[7][7] = {
PrintMatrix(7, TestMatrix);
return 0;
Your code won't compile successfully.
After main there is no opening brace.
You are defining function inside main, which is an issue.
Check for parentheses in whole code.
Fixed the loop controls from <= to <.
Here is the modified code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int PrintMatrix(int dim, double matrix[dim][dim]);
int main()
{
    double TestMatrix[7][7] = {
        /* ... rows omitted in the source ... */
    };
    PrintMatrix(7, TestMatrix);
    return 0;
}
int PrintMatrix(int dim, double matrix[dim][dim]) {
    int aa, bb;
    for (aa = 0; aa < dim; aa++) {
        for (bb = 0; bb < dim; bb++) {
            printf("%lf ", matrix[aa][bb]);
        }
        printf("\n");
    }
    return 0;
}
The code in the question is an appalling non-compiling mess. One of the comments is:
It still isn't returning the identity for dim = 2 up to 7; any thoughts?
As BluePixy hinted, if you lie to your compiler about the size of the input matrix to the function, for example by passing a 7x7 matrix but telling it that it has a 3x3 matrix, it gets its revenge by printing different information from what you wanted. Don't lie to the compiler!
If you want to print identity matrices of sizes 1..7 from a 7x7 matrix, tell the compiler (function) both the actual size of the matrix and the size you want printed. For an identity matrix, you don't actually need the original matrix — you could synthesize the data.
#include <stdio.h>
static void printIdentityMatrix(int size)
{
    for (int i = 0; i < size; i++)
    {
        for (int j = 0; j < size; j++)
            printf("%4.1f", (i == j) ? 1.0 : 0.0);
        putchar('\n');
    }
}
int main(void)
{
    for (int i = 1; i < 8; i++)
        printIdentityMatrix(i);
    return 0;
}
For printing the top left square subset of an arbitrarily sized square matrix, you must pass both the size of the data to be printed and the actual size of the matrix.
#include <assert.h>
#include <stdio.h>
static void PrintMatrix(int size, int dim, double matrix[dim][dim])
{
    assert(size <= dim);
    for (int aa = 0; aa < size; aa++)
    {
        for (int bb = 0; bb < size; bb++)
            printf("%lf ", matrix[aa][bb]);
        putchar('\n');
    }
}
int main(void)
{
    double TestMatrix[7][7] =
    {
        /* ... rows omitted in the source ... */
    };
    for (int i = 1; i < 8; i++)
        PrintMatrix(i, 7, TestMatrix);
    return 0;
}
Printing an arbitrary rectangular submatrix of an arbitrarily sized rectangular matrix requires many more function parameters (7 if I am counting correctly):
void PrintSubMatrix(int x_off, int y_off, int x_len, int y_len, int x_size, int y_size,
double matrix[x_size][y_size]);
and that's before you specify the file stream to write on.
#include <assert.h>
#include <stdio.h>
static void PrintSubMatrix(int x_off, int y_off, int x_len, int y_len, int x_size, int y_size,
                           double matrix[x_size][y_size])
{
    assert(x_off >= 0 && x_off < x_size && x_off + x_len <= x_size);
    assert(y_off >= 0 && y_off < y_size && y_off + y_len <= y_size);
    printf("SubMatrix size %dx%d at (%d,%d) in M[%d][%d]\n",
           x_len, y_len, x_off, y_off, x_size, y_size);
    for (int x = x_off; x < x_off + x_len; x++)
    {
        for (int y = y_off; y < y_off + y_len; y++)
            printf("%4.1f ", matrix[x][y]);
        putchar('\n');
    }
}
int main(void)
{
    double TestMatrix[7][9] =
    {
        { 1, 2, 3, 4, 3, 2, 1, 2, 3 },
        { 2, 1, 9, 8, 4, 6, 0, 0, 1 },
        { 3, 0, 8, 7, 5, 5, 0, 0, 1 },
        { 4, 0, 5, 6, 6, 8, 4, 4, 4 },
        { 5, 0, 1, 4, 7, 9, 0, 0, 1 },
        { 6, 0, 1, 0, 8, 1, 0, 0, 1 },
        { 7, 0, 0, 0, 9, 0, 1, 0, 1 },
    };
    PrintSubMatrix(0, 0, 7, 9, 7, 9, TestMatrix);
    for (int i = 1; i < 4; i++)
    {
        for (int j = 2; j < 4; j++)
            PrintSubMatrix(i, j, 3 + j - i, i + j, 7, 9, TestMatrix);
    }
    return 0;
}
Sample run:
SubMatrix size 7x9 at (0,0) in M[7][9]
1.0 2.0 3.0 4.0 3.0 2.0 1.0 2.0 3.0
2.0 1.0 9.0 8.0 4.0 6.0 0.0 0.0 1.0
3.0 0.0 8.0 7.0 5.0 5.0 0.0 0.0 1.0
4.0 0.0 5.0 6.0 6.0 8.0 4.0 4.0 4.0
5.0 0.0 1.0 4.0 7.0 9.0 0.0 0.0 1.0
6.0 0.0 1.0 0.0 8.0 1.0 0.0 0.0 1.0
7.0 0.0 0.0 0.0 9.0 0.0 1.0 0.0 1.0
SubMatrix size 4x3 at (1,2) in M[7][9]
9.0 8.0 4.0
8.0 7.0 5.0
5.0 6.0 6.0
1.0 4.0 7.0
SubMatrix size 5x4 at (1,3) in M[7][9]
8.0 4.0 6.0 0.0
7.0 5.0 5.0 0.0
6.0 6.0 8.0 4.0
4.0 7.0 9.0 0.0
0.0 8.0 1.0 0.0
SubMatrix size 3x4 at (2,2) in M[7][9]
8.0 7.0 5.0 5.0
5.0 6.0 6.0 8.0
1.0 4.0 7.0 9.0
SubMatrix size 4x5 at (2,3) in M[7][9]
7.0 5.0 5.0 0.0 0.0
6.0 6.0 8.0 4.0 4.0
4.0 7.0 9.0 0.0 0.0
0.0 8.0 1.0 0.0 0.0
SubMatrix size 2x5 at (3,2) in M[7][9]
5.0 6.0 6.0 8.0 4.0
1.0 4.0 7.0 9.0 0.0
SubMatrix size 3x6 at (3,3) in M[7][9]
6.0 6.0 8.0 4.0 4.0 4.0
4.0 7.0 9.0 0.0 0.0 1.0
0.0 8.0 1.0 0.0 0.0 1.0
It would be better if the code was fixed not to print a blank at the end of each line; that's left as an exercise for the reader.
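One possible way to approach that exercise is to emit the separator before every value except the first, rather than after every value. The sketch below formats one row into a caller-supplied buffer so the result is easy to inspect; the name format_row and its interface are assumptions for illustration, not part of the code above.

```c
#include <stdio.h>

/* Sketch: format n doubles into buf, separated by single blanks, with no
   trailing blank. Returns the number of characters written (excluding
   the terminating '\0'), or -1 if buf is too small. */
int format_row(const double *row, int n, char *buf, size_t size)
{
    int used = 0;

    if (size > 0)
        buf[0] = '\0';
    for (int i = 0; i < n; i++) {
        size_t room = size - (size_t)used;
        /* the separator goes in front of every value except the first */
        int w = snprintf(buf + used, room, "%s%.1f", i ? " " : "", row[i]);
        if (w < 0 || (size_t)w >= room)
            return -1;
        used += w;
    }
    return used;
}
```

A caller would then print each row with something like printf("%s\n", buf), so the newline follows the last value directly.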

Gauss Seidel (Specific equation solver) in C

Apologies for posting a badly formed question prior to this attempt.
I'm trying to get a Gauss-Seidel method to work in C, to check how much quicker it is than in higher-level interpreted languages (e.g. Python), but I'm having some issues with the results that I'm obtaining.
My input matrix is
Symmetric Positive-Definite
& Diagonally dominant
so I believe it should converge.
The problem attempts to solve "Ax=b" ,
(Where 'A' = 'a[ ][ ]' ,'b' = 'b[ ]', and 'x'= 'x[ ]')
The final array 'check [ ]' is obtained via a dot product between 'a' and 'x' to see if it returns something close to 'b'.
The below code is fully executable.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main(void)
int i=0,j=0;
int num=0;
double h = 1.0/(3+1);
double h2 = pow(h,2);
double w=1.5, sum=0.0;
long double x[9],y[9], check[9];
long double tol = pow(10, -10);
long double a[9][9] = {{-4, 1, 0, 1, 0, 0, 0, 0, 0} ,
{1, -4, 1, 0, 1, 0, 0, 0, 0} ,
{0, 1, -4, 0, 0, 1, 0, 0, 0} ,
{1, 0, 0, -4, 1, 0, 1, 0, 0} ,
{0, 1, 0, 1, -4, 1, 0, 1, 0} ,
{0, 0, 1, 0, 1, -4, 0, 0, 1} ,
{0, 0, 0, 1, 0, 0, -4, 1, 0} ,
{0, 0, 0, 0, 1, 0, 1, -4, 1} ,
{0, 0, 0, 0, 0, 1, 0, 1, -4}};
long double b[9] = {0.000000,
0.000000 };
for(i=0;i<9;i++){ // initialise the arrays to 0
for(i=0;i<9;i++){ // prints 'a' matrix, to check if its right
printf("a[%d][%d] = %LF ",i,j,a[i][j]);
printf("\n" );
for(i=0;i<9;i++){ // prints b matrix
printf("b[%d] = %LF \n",i,b[i]);
do{ // the Gauss seidel Solver
for(i =0;i<9;i++){
for(j=0; j<9; j++){
sum += a[i][j]*x[j];
x[i] = (w/a[i][i])* (b[i] - sum + a[i][i]*x[i]) + (1-w)*x[i];
}while (fabs(y[i]-x[i])>tol);
for(i=0;i<9;i++){ // Prints the solution X
printf("x[%d] = %LF \n",i,x[i]);
for(i=0;i<9;i++){ // Ititialises matrix
for (i = 0; i < 9; i++){ // performs a dot product of
// 'a' and 'x', to see if we get
// 'b' as the result
for(j = 0; j< 9; j++){
check[i]+= a[i][j] * x[j];
check[i] = check[i]/h2; // the 'b' matrix was multiplied by h2,
// hence to get it back, a division is necessary
printf("check[%d] = %LF\n",i,check[i]);
printf("num=%d\n",num );
return 0;
The output i.e 'x' that I get is:
x[0] = -0.000000
x[1] = -0.000000
x[2] = -0.000000
x[3] = -0.000000
x[4] = -0.421875
x[5] = -0.791016
x[6] = -1.423828
x[7] = -3.816650
x[8] = -11.702087
and the output for 'Check' I get is:
check[0] = 0.000000
check[1] = -4.500000
check[2] = -5.625000
check[3] = -14.625000
check[4] = -10.968750
check[5] = -42.328125
check[6] = 17.156250
check[7] = 18.421875
check[8] = 212.343750
Ideally, if everything works, check[4] should output 2 (the reason for which is given in a comment in the code when outputting 'check'), and every other element of check should be 0.
Any suggestions?
sum should be reinitialized to 0 inside the for-loop before starting the next row, and the update equation is incorrect. The equation you took from the Python implementation assumes that a[i][i]*x[i] is included in the full dot product: that implementation used numpy for the dot product instead of looping, so it had no opportunity to skip the j == i term. Also, I'm not sure the equation in that implementation is the Gauss-Seidel method; it looks more like Successive Over-Relaxation (SOR) because of the w and (1 - w) terms. Anyway, here is my modified solution. I check for convergence using the error, |Ax - b| < tol for all entries. The inner for-loop is split in two as a small optimization, so no i != j test is needed. a[i][i] * x[i] is added to sum to get the current value of (Ax)i in the error check.
int converged;
do {
    converged = 1;
    for (i = 0; i < 9; i++) {
        sum = 0;
        for (j = 0; j < i; j++) {
            sum += a[i][j] * x[j];
        }
        for (j = i + 1; j < 9; j++) {
            sum += a[i][j] * x[j];
        }
        x[i] += w * ((b[i] - sum) / a[i][i] - x[i]);
        if (fabs(sum + a[i][i] * x[i] - b[i]) > tol) {
            converged = 0;
        }
    }
} while (!converged);
which gives the output:
x[0] = -0.007812
x[1] = -0.015625
x[2] = -0.007812
x[3] = -0.015625
x[4] = -0.046875
x[5] = -0.015625
x[6] = -0.007812
x[7] = -0.015625
x[8] = -0.007813
check[0] = 0.000000
check[1] = -0.000000
check[2] = -0.000000
check[3] = -0.000000
check[4] = 2.000000
check[5] = 0.000000
check[6] = -0.000000
check[7] = 0.000000
check[8] = 0.000000
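The loop above has no iteration limit, so it never terminates if the iteration does not converge. As a hedged sketch, the same relaxation update can be wrapped in a self-contained function with an iteration cap. The name sor_solve, its parameter list, and the choice to check the residual after each complete sweep are assumptions for illustration, not the answerer's code.

```c
/* Sketch: SOR iteration for the n x n system a*x = b, starting from the
   values already stored in x. w = 1 gives plain Gauss-Seidel.
   Returns the number of sweeps used, or -1 if max_iter sweeps were not
   enough to bring every residual component below tol. */
int sor_solve(int n, double a[n][n], const double b[], double x[],
              double w, double tol, int max_iter)
{
    for (int iter = 1; iter <= max_iter; iter++) {
        /* one sweep: update each unknown in place, using the newest values */
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++)
                if (j != i)
                    sum += a[i][j] * x[j];
            x[i] += w * ((b[i] - sum) / a[i][i] - x[i]);
        }
        /* largest absolute residual |(a*x - b)_i| after the sweep */
        double worst = 0.0;
        for (int i = 0; i < n; i++) {
            double r = -b[i];
            for (int j = 0; j < n; j++)
                r += a[i][j] * x[j];
            if (r < 0.0)
                r = -r;
            if (r > worst)
                worst = r;
        }
        if (worst < tol)
            return iter;
    }
    return -1;
}
```

A call such as sor_solve(9, a, b, x, 1.5, 1e-10, 10000) would correspond to the w = 1.5 used in the question, assuming a, b and x are declared as double rather than long double.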
For the benefit of those following along at home, I suggest reading along with the Wikipedia article on Gauss-Seidel. I will attempt to explain what the algorithm is doing, and provide C code that implements the algorithm.
The Python example in the Wikipedia page uses this simple example for matrix A and B
    | 10  -1   2   0 |        |   6 |
A = | -1  11  -1   3 |    B = |  25 |
    |  2  -1  10  -1 |        | -11 |
    |  0   3  -1   8 |        |  15 |
Those matrices represent the following system of equations
10*x1 - x2 + 2*x3 = 6
-x1 + 11*x2 - x3 + 3*x4 = 25
2*x1 - x2 + 10*x3 - x4 = -11
3*x2 - x3 + 8*x4 = 15
The solution that we're trying to find with Gauss-Seidel is
x1=1 x2=2 x3= -1 x4=1
So how does the algorithm work? Well, first take a wild guess at the answer, e.g.
x1=0 x2=0 x3=0 x4=0
Then plug those guesses into the equations and try to improve the guesses. Specifically, plug the values for x2,x3,x4 into the first equation, and then compute a new value for x1.
10*x1 - 0 + 0 = 6 ==> x1 = 6/10 = 0.6
Then plug the new value of x1, and the old values of x3,x4 into the second equation to get an improved guess for x2
-0.6 + 11*x2 - 0 + 0 = 25 ==> 11*x2 = 25.6 ==> x2 = 2.327273
And for x3 and x4
2*0.6 - 2.327273 + 10*x3 - 0 = -11 ==> 10*x3 = -9.872727 ==> x3 = -0.987273
3*2.327273 + 0.987273 + 8*x4 = 15 ==> 8*x4 = 7.030908 ==> x4 = 0.878864
So after one iteration of Gauss-Seidel, the improved guess at the answer is
x1=0.6 x2=2.327273 x3= -0.987273 x4=0.878864
The algorithm continues until either the solution converges or the maximum number of iterations is exceeded.
Here's what the code looks like in C. The counter k limits the number of iterations (just in case the solution doesn't converge). The Gauss-Seidel method is applied by evaluating each of the equations while skipping X[i]. Then the new value for X[i] is computed. The code displays the new values of X[], and then checks whether the answer is good enough by evaluating each equation and verifying that the sum is within epsilon of B[i].
#include <stdio.h>
#include <math.h>
#define SIZE 4
double A[SIZE][SIZE] = {
    { 10, -1,  2,  0 },
    { -1, 11, -1,  3 },
    {  2, -1, 10, -1 },
    {  0,  3, -1,  8 }
};
double B[SIZE] = { 6, 25, -11, 15 };
double X[SIZE] = { 0, 0, 0, 0 };
int main( void )
{
    int i, j, k, done;
    double sum;
    done = 0;
    for ( k = 0; k < 100 && !done; k++ )
    {
        // perform the next iteration of Gauss-Seidel
        for ( i = 0; i < SIZE; i++ )
        {
            sum = 0;
            for ( j = 0; j < SIZE; j++ )
                if ( j != i )
                    sum += A[i][j] * X[j];
            X[i] = (B[i] - sum) / A[i][i];
        }
        // print the k'th iteration of X[]
        printf( "%2d --", k );
        for ( i = 0; i < SIZE; i++ )
            printf( " %lf", X[i] );
        printf( "\n" );
        // check for convergence
        done = 1;
        for ( i = 0; i < SIZE; i++ )
        {
            sum = 0;
            for ( j = 0; j < SIZE; j++ )
                sum += A[i][j] * X[j];
            if ( fabs( B[i] - sum ) > 1e-6 )
                done = 0;
        }
    }
    return 0;
}

How to rotate a matrix 90 degrees without using any extra space? [duplicate]

By 90 degrees I mean that if:
A = {1,2,3,
then after 90 degree rotation A becomes:
A = {7,4,1,
Transpose and interchange rows or columns (depends whether you want to rotate left or right).
e. g.
1) original matrix
1 2 3
4 5 6
7 8 9
2) transpose
1 4 7
2 5 8
3 6 9
3-a) change rows to rotate left
3 6 9
2 5 8
1 4 7
3-b) or change columns to rotate right
7 4 1
8 5 2
9 6 3
All these operations can be done without allocating memory.
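A minimal C sketch of steps 2 and 3-b (transpose, then swap columns to rotate right) for a square matrix might look like this. The function name rotate_right is an assumption; the only extra storage is one temporary int.

```c
#include <assert.h>

/* Sketch: rotate an n x n matrix 90 degrees to the right in place by
   transposing it and then reversing the order of its columns. */
void rotate_right(int n, int a[n][n])
{
    assert(n > 0);
    /* step 2: transpose (swap each element above the diagonal with its mirror) */
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            int t = a[i][j];
            a[i][j] = a[j][i];
            a[j][i] = t;
        }
    }
    /* step 3-b: swap column j with column n-1-j in every row */
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n / 2; j++) {
            int t = a[i][j];
            a[i][j] = a[i][n - 1 - j];
            a[i][n - 1 - j] = t;
        }
    }
}
```

Swapping rows instead of columns in the second loop nest would give the left rotation of step 3-a.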
let a be an nxn array with 0-based indexing
f = floor(n/2)
c = ceil(n/2)
for x = 0 to f - 1
    for y = 0 to c - 1
        temp = a[x,y]
        a[x,y] = a[y,n-1-x]
        a[y,n-1-x] = a[n-1-x,n-1-y]
        a[n-1-x,n-1-y] = a[n-1-y,x]
        a[n-1-y,x] = temp
Edit: If you want to avoid using temp, this works (it also rotates in the correct direction), this time in Python.
def rot2(a):
    n = len(a)
    c = (n + 1) // 2
    f = n // 2
    for x in range(c):
        for y in range(f):
            # XOR-swap a[x][y] with a[n-1-y][x]
            a[x][y] = a[x][y] ^ a[n-1-y][x]
            a[n-1-y][x] = a[x][y] ^ a[n-1-y][x]
            a[x][y] = a[x][y] ^ a[n-1-y][x]
            # XOR-swap a[n-1-y][x] with a[n-1-x][n-1-y]
            a[n-1-y][x] = a[n-1-y][x] ^ a[n-1-x][n-1-y]
            a[n-1-x][n-1-y] = a[n-1-y][x] ^ a[n-1-x][n-1-y]
            a[n-1-y][x] = a[n-1-y][x] ^ a[n-1-x][n-1-y]
            # XOR-swap a[n-1-x][n-1-y] with a[y][n-1-x]
            a[n-1-x][n-1-y] = a[n-1-x][n-1-y]^a[y][n-1-x]
            a[y][n-1-x] = a[n-1-x][n-1-y]^a[y][n-1-x]
            a[n-1-x][n-1-y] = a[n-1-x][n-1-y]^a[y][n-1-x]
Note: This only works for matrices of integers.
The algorithm is to rotate each "ring", working from the outermost to the innermost.
The algorithm would rotate all the A's first, then B's then C's. Rotating a ring requires moving 4 values at once.
The index i ranges from 0..ring-width-1, e.g. for A the width is 5.
(i,0) -> temp
(0, N-i-1) -> (i, 0)
(N-i-1, N-1) -> (0, N-i-1)
(N-1, i) -> (N-i-1, N-1)
temp -> (N-1, i)
This is then repeated for each successive inner ring, offsetting the co-ordinates reducing the ring width by 2.
[Another answer has appeared with the code, so I'll not repeat that.]
Complete implementation in C using the method described by @Narek above
#include <stdio.h>
int n;
unsigned int arr[100][100];
void rotate() {
int i,j,temp;
for(i=0; i<n; i++) {
for(j=i; j<n; j++) {
if(i!=j) {
for(i=0; i<n/2; i++) {
for(j=0; j<n; j++) {
void display(){
int i,j;
for(i=0;i<n;i++) {
for(j=0;j<n;j++) {
int main(int argc, char *argv[]){
int i,j;
printf("%s","Enter size of matrix:");
printf("%s","Enter matrix elements\n");
for(i=0;i<n;i++) {
for(j=0;j<n;j++) {
return 0;
See this article for in-place matrix transposition; also google for "in-place matrix transposition". It can be easily adapted to perform rotation by 90 degrees. To transpose square matrices, you just interchange b[i][j] with b[j][i] where b[k][l] is a[n*k+l]. On nonsquare matrices, it's considerably more difficult. "Without any extra space" is a rather strong requirement, maybe you meant O(1) space? (assuming integers are fixed size) Implementation in C++: here.
You need one temp variable, then it is just to jump elements around.
temp = A[0];
A[0] = A[6];
A[6] = A[8];
A[8] = A[2];
A[2] = temp;
temp = A[1];
A[1] = A[3];
A[3] = A[7];
A[7] = A[5];
A[5] = temp;
I came across the following implementation:
For square matrices:
for n = 0 to N - 2
    for m = n + 1 to N - 1
        swap A(n,m) with A(m,n)
For rectangular matrices:
for each length>1 cycle C of the permutation
    pick a starting address s in C
    let D = data at s
    let x = predecessor of s in the cycle
    while x ≠ s
        move data from x to successor of x
        let x = predecessor of x
    move data from D to successor of s
For more info, one can refer here:
