I have a trained neural network for classification problems in Matlab. I want to use the trained weights and apply them in C. The output of the neural network is a vector of 7 values (output2[i]).
How can I reproduce Matlab's vec2ind function, which takes a matrix of vectors, each containing a single 1, and returns the indices of the ones, stopping as soon as it finds the 1?
I want to implement it in C.
I have attached part of the code.
Thank you
double sum = 0;
/// matrix multiplication
for (int i = 0; i < 29; i++)
{
for (int k = 0; k < 2; k++)
{
sum += inputs[k] * weights[i][k];
}
/// apply activation function
output[i] = tanh_func(sum + biases[i]);
sum = 0;
}
/// output layer
for (int i = 0; i < 7; i++)
{
for (int k = 0; k < 29; k++)
{
sum += output[k] * weights2[i][k];
}
/// apply activation function
output2[i] = sigmoid(sum + biases2[i]);
sum = 0;
}
For writing just vec2ind in C, see other answers.
Alternatively, if you have access to MATLAB Coder, that can convert MATLAB code (the entire algorithm) to C code automatically:
https://www.mathworks.com/help/coder/index.html
To get a better understanding you could use a triple pointer double*** tensor3d
initialize it with
tensor3d = malloc( dim0 * sizeof( double** ));
for( int i = 0; i < dim0; ++i )
{
tensor3d[i] = malloc( dim1 * sizeof( double* ));
for( int j = 0; j < dim1; ++j )
{
tensor3d[i][j] = malloc( dim2 * sizeof( double ));
}
}
Once you have copied the matrix of vectors into that 3-dimensional array, you can find the indices like this:
int indices[dim0][dim1];
for( int i = 0; i < dim0; ++i )
{
for( int j = 0; j < dim1; ++j )
{
for( int k = 0; k < dim2; ++k )
{
if( tensor3d[i][j][k] > 0.99 && tensor3d[i][j][k] < 1.01 )
{
indices[i][j] = k;
break;
}
}
}
}
In real code, though, you would use a one-dimensional array flattensor3d[], where
tensor3d[i][j][k] == flattensor3d[i * dim1 * dim2 + j * dim2 + k]
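As a rough sketch of the flat-array variant: with a row-major layout, the vector at position [i][j] starts at offset (i * dim1 + j) * dim2, and the search stops at the first element close to 1. The function name first_one_in_vector and the tolerance parameter tol are assumptions made for this illustration. Note that the result is 0-based, whereas MATLAB's vec2ind reports 1-based indices.

```c
#include <assert.h>
#include <stddef.h>

/* Search the vector stored at [i][j] of a flat, row-major array whose
   inner dimensions are dim1 and dim2. Returns the 0-based position of
   the first element within tol of 1.0, or -1 if there is none. */
int first_one_in_vector(const double *flat, size_t dim1, size_t dim2,
                        size_t i, size_t j, double tol)
{
    assert(flat != NULL && j < dim1);
    const double *vec = flat + (i * dim1 + j) * dim2;  /* start of the vector */
    for (size_t k = 0; k < dim2; ++k) {
        if (vec[k] > 1.0 - tol && vec[k] < 1.0 + tol) {
            return (int)k;  /* stop as soon as the 1 is found */
        }
    }
    return -1;
}
```

For the 7-element network output in the question, a call such as first_one_in_vector(output2, 1, 7, 0, 0, 0.01) would treat output2 as a single vector; if the outputs are not exactly one-hot, picking the largest element may be the more robust choice.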
I am writing a program that creates arrays of a given length and manipulates them. Other libraries cannot be used.
First, an array M1 of length N is formed, and then an array M2 of length N/2.
In the M1 array, each element is divided by Pi and then raised to the third power.
Then, in the M2 array, each element is added to the previous one, and the absolute value of the tangent is applied to the sum.
After that, the elements of M1 are raised to the power of the M2 elements with the same indexes, and the resulting array is sorted with gnome sort (dwarf_sort in the code).
Finally, the sum of the sines is calculated over those elements of M2 which, when divided by the minimum non-zero element of M2, give an even number.
The problem is that the resulting X is -nan(ind). I can't figure out exactly where the error is.
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
const int A = 441;
const double PI = 3.1415926535897931159979635;
inline void dwarf_sort(double* array, int size) {
size_t i = 1;
while (i < size) {
if (i == 0) {
i = 1;
}
if (array[i - 1] <= array[i]) {
++i;
}
else
{
long tmp = array[i];
array[i] = array[i - 1];
array[i - 1] = tmp;
--i;
}
}
}
inline double reduce(double* array, int size) {
size_t i;
double min = RAND_MAX, sum = 0;
for (i = 0; i < size; ++i) {
if (array[i] < min && array[i] != 0) {
min = array[i];
}
}
for (i = 0; i < size; ++i) {
if ((int)(array[i] / min) % 2 == 0) {
sum += sin(array[i]);
}
}
return sum;
}
int main(int argc, char* argv[])
{
int i, N, j;
double* M1 = NULL, * M2 = NULL, * M2_copy = NULL;
double X;
unsigned int seed = 0;
N = atoi(argv[1]); /* N equals the first command-line argument */
M1 = malloc(N * sizeof(double));
M2 = malloc(N / 2 * sizeof(double));
M2_copy = malloc(N / 2 * sizeof(double));
for (i = 0; i < 100; i++)
{
seed = i;
srand(i);
/*generate*/
for (j = 0; j < N; ++j) {
M1[j] = (rand_r(&seed) % A) + 1;
}
for (j = 0; j < N / 2; ++j) {
M2[j] = (rand_r(&seed) % (10 * A)) + 1;
}
/*map*/
for (j = 0; j < N; ++j)
{
M1[j] = pow(M1[j] / PI, 3);
}
for (j = 0; j < N / 2; ++j) {
M2_copy[j] = M2[j];
}
M2[0] = fabs(tan(M2_copy[0]));
for (j = 0; j < N / 2; ++j) {
M2[j] = fabs(tan(M2[j] + M2_copy[j]));
}
/*merge*/
for (j = 0; j < N / 2; ++j) {
M2[j] = pow(M1[j], M2[j]);
}
/*sort*/
dwarf_sort(M2, N / 2);
/*sort*/
X = reduce(M2, N / 2);
}
printf("\nN=%d.\n", N);
printf("X=%f\n", X);
return 0;
}
Knowledgeable people, does anyone see where my mistake is? I think I'm putting the wrong data types to the variables, but I still can't solve the problem.
Replace the /* merge */ part with this:
/*merge*/
for (j = 0; j < N / 2; ++j) {
printf("%f %f ", M1[j], M2[j]);
M2[j] = pow(M1[j], M2[j]);
printf("%f\n", M2[j]);
}
This will print the values and the results of the pow operation. You'll see that some of these values are huge, which overflows the range of a double.
Something like pow(593419.97, 31.80) will not end well.
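Once pow overflows, it returns an infinite value, and later steps can turn that into NaN: on IEEE 754 implementations the sine of an infinity is NaN. A small check along these lines, run on M2 right after the merge step, could point at the first offending element. The helper name first_non_finite is an assumption made for this sketch and is not part of the original program.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Return the index of the first element that is infinite or NaN,
   or -1 if every element is finite. */
long first_non_finite(const double *array, size_t size)
{
    assert(array != NULL);
    for (size_t i = 0; i < size; ++i) {
        if (!isfinite(array[i])) {
            return (long)i;
        }
    }
    return -1;
}
```

In the program above it could be called as first_non_finite(M2, N / 2) after the merge loop; any result other than -1 means the sort and reduce steps are already working with unusable values.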
I'm learning intrinsics. I don't know how to load a matrix correctly. I want to do matrix multiplication.
This is my code:
int i, j, k;
__m128 mat2values = _mm_setzero_ps();
__m128 mat1values = _mm_setzero_ps();
__m128 r = _mm_setzero_ps();
for (i = 0; i < N; ++i)
{
for (j = 0; j < N - 3; j += 4)
{
for (k = 0; k < N - 3; k += 4)
{
mat1values = _mm_load_ps(&mat1[i][k]);
mat2values = _mm_load_ps(&mat2[k][j]);
r = _mm_add_ps(r, _mm_mul_ps(mat1values, mat2values));
}
result[i][j] = r.m128_f32[0] + r.m128_f32[1] + r.m128_f32[2] + r.m128_f32[3];
for (; k < N; k++)
result[i][j] += mat1[i][j] * mat2[k][j];
}
}
When debugging, result still holds all 0 values after the loop.
Are you sure that the argument of
_mm_load_ps(&mat1[i][k])
is the correct float* address (and 16-byte aligned, as _mm_load_ps requires)?
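On the alignment point: _mm_load_ps needs a 16-byte aligned address, while _mm_loadu_ps accepts any address. Separately, the loop in the question appears never to reset r and multiplies four consecutive elements of a row of mat2 rather than a column. One common way around both, shown here only as a sketch, is to broadcast a single element of mat1 and multiply it with four adjacent elements of a row of mat2. The fixed size N = 8 and the function name matmul_sse are assumptions made for this example, and it only handles N as a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

#define N 8  /* assumed matrix size; must be a multiple of 4 in this sketch */

/* result = mat1 * mat2 for row-major N x N float matrices. */
void matmul_sse(float mat1[N][N], float mat2[N][N], float result[N][N])
{
    assert(N % 4 == 0);
    for (size_t i = 0; i < N; ++i) {
        for (size_t j = 0; j < N; j += 4) {
            __m128 acc = _mm_setzero_ps();  /* fresh accumulator per output block */
            for (size_t k = 0; k < N; ++k) {
                __m128 a = _mm_set1_ps(mat1[i][k]);    /* broadcast one scalar */
                __m128 b = _mm_loadu_ps(&mat2[k][j]);  /* mat2[k][j..j+3], unaligned load */
                acc = _mm_add_ps(acc, _mm_mul_ps(a, b));
            }
            _mm_storeu_ps(&result[i][j], acc);  /* writes result[i][j..j+3] */
        }
    }
}
```

Each pass of the inner loop updates four output columns at once, so no horizontal sum over the lanes of the register is needed.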
I am trying to find the location of a target inside a 1-D array that acts like a table with rows and columns. I could do it using division and modulo, but I am stuck on finding it using nested loops. Specifically, I can't seem to assign values inside the nested loop.
Here is my code:
#include <stdio.h>
int main()
{
int arr[9] = // act as a 3 X 3 table
{ 2, 34, 6,
7, 45, 45,
35,65, 2
};
int target = 7;// r = 1; c = 0
int r = 0; // row of the target
int c = 0; // col of the target
int rows = 3;
int cols = 3;
for (int i = 0; i < rows; i++){
for (int j = 0; j + i * cols < cols + i * cols; i++ ){
if (arr[j] == target){
c = j; // columns of the target
r = i; // rows of the target
}
}
}
printf ("%d, %d",c, r);
return 0;
}
The code outputs: 0,0.
The problem isn't with the assignment, it's with the wrong loop and if condition.
The outer loop should loop over the i rows
The inner loop should loop over the j columns
within both loops, the cell to evaluate is i * cols + j
Put it all together and you'll get:
for (int i = 0; i < rows; i++) {
for (int j = 0; j < cols; j++ ) {
if (arr[i * cols + j] == target) {
c = j; // columns of the target
r = i; // rows of the target
}
}
}
Since arr is a 1-D array and, for any value of i, the inner index j only goes up to cols - 1, the comparison arr[j] never looks beyond the first row.
To avoid this problem, take an int pointer that points to arr and advance it by one row after each pass of the inner loop, as below:
int *p = arr;
for (i = 0; i < rows; i++){
for ( j = 0; j < cols ; j++ ){
if (p[j] == target){
c = j; // columns of the target
r = i; // rows of the target
}
}
p = p + j;/*make p to points to next row */
}
A better solution would use only one loop:
for (int i = 0; i < rows * cols; i++){
if (arr[i] == target){
r = i / cols;
c = i % cols;
}
}
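Putting the row-major indexing into a small function makes it easier to test. The sketch below is one possible arrangement; the name find_in_table and the choice to return as soon as the first match is found are assumptions made here (the loops above keep going and therefore report the last match).

```c
#include <assert.h>
#include <stddef.h>

/* Search a rows x cols table stored row by row in a 1-D array.
   On success, store the position in *r and *c and return 1;
   return 0 if the target does not occur. */
int find_in_table(const int *arr, size_t rows, size_t cols, int target,
                  size_t *r, size_t *c)
{
    assert(arr != NULL && r != NULL && c != NULL);
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            if (arr[i * cols + j] == target) {
                *r = i;
                *c = j;
                return 1;  /* stop at the first match */
            }
        }
    }
    return 0;
}
```

With the array from the question and target 7, this reports row 1, column 0.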
I'm working on a class assignment and I've run into an issue I haven't been able to figure out. I'm implementing the Ford-Fulkerson algorithm using BFS to find max flow. But while trying to set my Residual Capacity matrix to the given capacity, I hit a segmentation fault. In the test code we received, I can see that the original capacity matrix was passed by value by its address, but I have a feeling that in my code I'm not interacting with it the way I think I am? Which leads me to believe that I may have the same issue recurring elsewhere. I worked with gdb and saw that I hit a segmentation fault on this line here in my nested for loop :
resCap[i][j] = *(capacity + i*n + j);
However, nothing I have tried has worked, so I am pretty stumped.
void maximum_flow(int n, int s, int t, int *capacity, int *flow)
{
int i, j, resCap[n][n], path[n]; // residual capacity and BFS augmenting path
int min_path = INT_MAX; // min of the augmenting path
// Assign residual capacity equal to the given capacity
for (i = 0; i < n; i++)
for (j = 0; j < n; j++)
{
resCap[i][j] = *(capacity + i*n + j);
*(flow + i*n + j) = 0; // no initial flow
}
// Augment path with BFS from source to sink
while (bfs(n, s, t, &(resCap[0][0]), path))
{
// find min of the augmenting path
for (j = t; j != s; j = path[j])
{
i = path[j];
min_path = min(min_path, resCap[i][j]);
}
// update residual capacities and flows on both directions
for (j = t; j != s; j = path[j])
{
i = path[j];
if(*(capacity + i*n + j) > 0)
*(flow + i*n + j) += min_flow_path;
else
*(flow + j*n + i) -= min_flow_path;
resCap[i][j] -= min_flow_path;
resCap[j][i] += min_flow_path;
}
}
}
And here is the test code provided to us in case it is needed:
int main(void)
{ int cap[1000][1000], flow[1000][1000];
int i,j, flowsum;
for(i=0; i< 1000; i++)
for( j =0; j< 1000; j++ )
cap[i][j] = 0;
for(i=0; i<499; i++)
for( j=i+1; j<500; j++)
cap[i][j] = 2;
for(i=1; i<500; i++)
cap[i][500 + (i/2)] =4;
for(i=500; i < 750; i++ )
{ cap[i][i-250]=3;
cap[i][750] = 1;
cap[i][751] = 1;
cap[i][752] = 5;
}
cap[751][753] = 5;
cap[752][753] = 5;
cap[753][750] = 20;
for( i=754; i< 999; i++)
{ cap[753][i]=1;
cap[i][500]=3;
cap[i][498]=5;
cap[i][1] = 100;
}
cap[900][999] = 1;
cap[910][999] = 1;
cap[920][999] = 1;
cap[930][999] = 1;
cap[940][999] = 1;
cap[950][999] = 1;
cap[960][999] = 1;
cap[970][999] = 1;
cap[980][999] = 1;
cap[990][999] = 1;
printf("prepared capacity matrix, now executing maxflow code\n");
maximum_flow(1000,0,999,&(cap[0][0]),&(flow[0][0]));
for(i=0; i<=999; i++)
for(j=0; j<=999; j++)
{ if( flow[i][j] > cap[i][j] )
{ printf("Capacity violated\n"); exit(0);}
}
flowsum = 0;
for(i=0; i<=999; i++)
flowsum += flow[0][i];
printf("Outflow of 0 is %d, should be 10\n", flowsum);
flowsum = 0;
for(i=0; i<=999; i++)
flowsum += flow[i][999];
printf("Inflow of 999 is %d, should be 10\n", flowsum);
printf("End Test\n");
}
This line is likely to segfault; it does when compiled with Clang.
int i, j, resCap[n][n], path[n];
You're declaring a very large array on the stack. Just how big can be seen when you try to allocate it using calloc. Try this instead, and don't forget to free it using the same sort of loop.
int **resCap2 = calloc(1, n * sizeof(int *));
assert(resCap2);
for (i = 0; i < n; i++) {
resCap2[i] = calloc(1, n * sizeof(int));
assert(resCap2[i]);
}
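This is a lot of space, i.e.
(n * sizeof(int *)) + (n * n * sizeof(int))
bytes for the calloc version, and n * n * sizeof(int) bytes (about 4 MB for n = 1000 with a 4-byte int) for the array on the stack.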
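One thing to watch with the row-by-row allocation: the question passes &(resCap[0][0]) to bfs, which suggests bfs expects a single contiguous block of n * n ints, and separately allocated rows do not provide that. A possible alternative, sketched here with hypothetical helper names alloc_square_matrix and copy_matrix, is to allocate the whole matrix as one heap block and index it as m[i * n + j], the same way the question already indexes capacity and flow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate an n x n int matrix as one zero-initialised heap block.
   Element (i, j) lives at m[i * n + j]. Returns NULL on failure. */
int *alloc_square_matrix(size_t n)
{
    if (n != 0 && n > SIZE_MAX / n) {
        return NULL;  /* n * n would overflow size_t */
    }
    return calloc(n * n, sizeof(int));
}

/* Copy an n x n matrix stored as a flat array. */
void copy_matrix(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            dst[i * n + j] = src[i * n + j];
        }
    }
}
```

Inside maximum_flow this would mean something like int *resCap = alloc_square_matrix(n); followed by copy_matrix(resCap, capacity, n);, passing resCap directly to bfs, and a single free(resCap) at the end.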
I'd like to allocate a 3D matrix in one big chunk. It should be possible to access this matrix in the [i][j][k] fashion, without having to calculate the linearized index every time.
I think it should be something like below, but I'm having trouble filling the ...
double ****matrix = (double ****) malloc(...)
for (int i = 0; i < imax; i++) {
matrix[i] = &matrix[...]
for (int j = 0; j < jmax; j++) {
matrix[i][j] = &matrix[...]
for (int k = 0; k < kmax; k++) {
matrix[i][j][k] = &matrix[...]
}
}
}
For the single allocation to be possible and work, you need to lay out the resulting memory like this:
imax units of double **
imax * jmax units of double *
imax * jmax * kmax units of double
Further, the 'imax units of double **' must be allocated first; you can reorder the other two sections, but it is most sensible to deal with them in the order listed.
You also need to be able to assume that double and double * (and double **, but that's not much of a stretch) are sufficiently well aligned that you can simply allocate the chunks contiguously. That is going to hold OK on most 64-bit systems with type double, but be aware of the possibility that it does not hold on 32-bit systems or for other types than double (basically, the assumption could be problematic when sizeof(double) != sizeof(double *)).
With those caveats made, then this code works cleanly (tested on Mac OS X 10.10.2 with GCC 4.9.1 and Valgrind version valgrind-3.11.0.SVN):
#include <stdio.h>
#include <stdlib.h>
typedef double Element;
static Element ***alloc_3d_matrix(size_t imax, size_t jmax, size_t kmax)
{
size_t i_size = imax * sizeof(Element **);
size_t j_size = imax * jmax * sizeof(Element *);
size_t k_size = imax * jmax * kmax * sizeof(Element);
Element ***matrix = malloc(i_size + j_size + k_size);
if (matrix == 0)
return 0;
printf("i = %zu, j = %zu, k = %zu; sizes: i = %zu, j = %zu, k = %zu; "
"%zu bytes total\n",
imax, jmax, kmax, i_size, j_size, k_size, i_size + j_size + k_size);
printf("matrix = %p .. %p\n", (void *)matrix,
(void *)((char *)matrix + i_size + j_size + k_size));
Element **j_base = (void *)((char *)matrix + imax * sizeof(Element **));
printf("j_base = %p\n", (void *)j_base);
for (size_t i = 0; i < imax; i++)
{
matrix[i] = &j_base[i * jmax];
printf("matrix[%zu] = %p (%p)\n",
i, (void *)matrix[i], (void *)&matrix[i]);
}
Element *k_base = (void *)((char *)j_base + imax * jmax * sizeof(Element *));
printf("k_base = %p\n", (void *)k_base);
for (size_t i = 0; i < imax; i++)
{
for (size_t j = 0; j < jmax; j++)
{
matrix[i][j] = &k_base[(i * jmax + j) * kmax];
printf("matrix[%zu][%zu] = %p (%p)\n",
i, j, (void *)matrix[i][j], (void *)&matrix[i][j]);
}
}
/* Diagnostic only */
for (size_t i = 0; i < imax; i++)
{
for (size_t j = 0; j < jmax; j++)
{
for (size_t k = 0; k < kmax; k++)
printf("matrix[%zu][%zu][%zu] = %p\n",
i, j, k, (void *)&matrix[i][j][k]);
}
}
return matrix;
}
int main(void)
{
size_t i_max = 3;
size_t j_max = 4;
size_t k_max = 5;
Element ***matrix = alloc_3d_matrix(i_max, j_max, k_max);
if (matrix == 0)
{
fprintf(stderr, "Failed to allocate matrix[%zu][%zu][%zu]\n", i_max, j_max, k_max);
return 1;
}
for (size_t i = 0; i < i_max; i++)
{
for (size_t j = 0; j < j_max; j++)
{
for (size_t k = 0; k < k_max; k++)
matrix[i][j][k] = (i + 1) * 100 + (j + 1) * 10 + k + 1;
}
}
for (size_t i = 0; i < i_max; i++)
{
for (size_t j = 0; j < j_max; j++)
{
for (size_t k = k_max; k > 0; k--)
printf("[%zu][%zu][%zu] = %6.0f\n", i, j, k-1, matrix[i][j][k-1]);
}
}
free(matrix);
return 0;
}
Example output (with some boring bits omitted):
i = 3, j = 4, k = 5; sizes: i = 24, j = 96, k = 480; 600 bytes total
matrix = 0x100821630 .. 0x100821888
j_base = 0x100821648
matrix[0] = 0x100821648 (0x100821630)
matrix[1] = 0x100821668 (0x100821638)
matrix[2] = 0x100821688 (0x100821640)
k_base = 0x1008216a8
matrix[0][0] = 0x1008216a8 (0x100821648)
matrix[0][1] = 0x1008216d0 (0x100821650)
matrix[0][2] = 0x1008216f8 (0x100821658)
matrix[0][3] = 0x100821720 (0x100821660)
matrix[1][0] = 0x100821748 (0x100821668)
matrix[1][1] = 0x100821770 (0x100821670)
matrix[1][2] = 0x100821798 (0x100821678)
matrix[1][3] = 0x1008217c0 (0x100821680)
matrix[2][0] = 0x1008217e8 (0x100821688)
matrix[2][1] = 0x100821810 (0x100821690)
matrix[2][2] = 0x100821838 (0x100821698)
matrix[2][3] = 0x100821860 (0x1008216a0)
matrix[0][0][0] = 0x1008216a8
matrix[0][0][1] = 0x1008216b0
matrix[0][0][2] = 0x1008216b8
matrix[0][0][3] = 0x1008216c0
matrix[0][0][4] = 0x1008216c8
matrix[0][1][0] = 0x1008216d0
matrix[0][1][1] = 0x1008216d8
matrix[0][1][2] = 0x1008216e0
matrix[0][1][3] = 0x1008216e8
matrix[0][1][4] = 0x1008216f0
matrix[0][2][0] = 0x1008216f8
…
matrix[2][2][4] = 0x100821858
matrix[2][3][0] = 0x100821860
matrix[2][3][1] = 0x100821868
matrix[2][3][2] = 0x100821870
matrix[2][3][3] = 0x100821878
matrix[2][3][4] = 0x100821880
[0][0][4] = 115
[0][0][3] = 114
[0][0][2] = 113
[0][0][1] = 112
[0][0][0] = 111
[0][1][4] = 125
[0][1][3] = 124
[0][1][2] = 123
[0][1][1] = 122
[0][1][0] = 121
[0][2][4] = 135
…
[2][2][0] = 331
[2][3][4] = 345
[2][3][3] = 344
[2][3][2] = 343
[2][3][1] = 342
[2][3][0] = 341
There is a lot of diagnostic output in the code shown.
The allocation technique itself does not depend on variable-length arrays (VLAs), so it can be used with C89 as well as C99 and C11. As written, though, the code requires C99 or later, because it declares variables in for loops and prints with %zu; moving the declarations outside the for loops and changing the size_t output takes care of that for C89.
This can be done with one simple malloc() call in C (not in C++, though, there are no variable length arrays in C++):
void foo(int imax, int jmax, int kmax) {
double (*matrix)[jmax][kmax] = malloc(imax*sizeof(*matrix));
//Allocation done. Now fill the matrix:
for(int i = 0; i < imax; i++) {
for(int j = 0; j < jmax; j++) {
for(int k = 0; k < kmax; k++) {
matrix[i][j][k] = ...
}
}
}
}
Note that C allows jmax and kmax to be dynamic values that are only known at runtime. That is the ability that's missing in C++, which makes C arrays much more powerful than their C++ counterpart.
The only drawback of this approach, as WhozCraig rightly notes, is that you can't return the resulting matrix as the return value of the function without resorting to a void*. However, you can return it by reference like this:
void foo(int imax, int jmax, int kmax, double (**outMatrix)[jmax][kmax]) {
*outMatrix = malloc(imax*sizeof(**outMatrix));
double (*matrix)[jmax][kmax] = *outMatrix; //avoid having to write (*outMatrix)[i][j][k] everywhere
... //as above
}
This function would need to be called like this:
int imax = ..., jmax = ..., kmax = ...;
double (*myMatrix)[jmax][kmax];
foo(imax, jmax, kmax, &myMatrix);
That way you get full type checking on the inner two dimension sizes even though they are runtime values.
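For a self-contained illustration of the pointer-to-VLA technique, the sketch below allocates the block, fills it and sums it inside one function. The function name vla_fill_and_sum and the fill pattern are assumptions made for this example, and it needs a compiler with VLA support (C99, or C11 and later where VLAs are available).

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate imax x jmax x kmax doubles in a single block, set each
   element to i*100 + j*10 + k, and return the sum of all elements.
   Returns -1.0 if the allocation fails. */
double vla_fill_and_sum(size_t imax, size_t jmax, size_t kmax)
{
    assert(imax > 0 && jmax > 0 && kmax > 0);
    double (*matrix)[jmax][kmax] = malloc(imax * sizeof(*matrix));
    if (matrix == NULL) {
        return -1.0;
    }
    double sum = 0.0;
    for (size_t i = 0; i < imax; i++) {
        for (size_t j = 0; j < jmax; j++) {
            for (size_t k = 0; k < kmax; k++) {
                matrix[i][j][k] = (double)(i * 100 + j * 10 + k);
                sum += matrix[i][j][k];
            }
        }
    }
    free(matrix);  /* one malloc, one free */
    return sum;
}
```

The compiler derives the index arithmetic from the pointer type, so matrix[i][j][k] needs no hand-written offset calculation.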
Note: This was intended to be a comment but it got too long, until it turned into a proper answer.
You can't use a single chunk of memory without performing some calculations.
Note that the offset at which each row begins is given by the formula
// row_begin is the offset (in elements) of the row at index row_idx
row_begin = row_idx * jmax * kmax
And then, each column depends on where the row starts:
// column_begin is the offset of the column
// at index column_idx of the row starting at row_begin
column_begin = row_begin + column_idx * kmax
Which, using offsets relative to the matrix pointer, translates to:
column_begin = (row_idx * jmax * kmax) + column_idx * kmax
Finally, getting the offset of an element from its k-index is very straightforward and follows the previous rule (which could be repeated for any number of further dimensions):
// element offset = column offset + k-index of the element
element_offset = column_begin + k_idx
Which translates to
element_offset = (row_idx * jmax * kmax) + column_idx * kmax + k_idx
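The derivation above can be written as a small helper. This is only a sketch; the name element_offset is made up for the example, and the offset is counted in elements from the start of the block, not in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Offset (in elements) of element (i, j, k) in a flat block that
   holds an imax x jmax x kmax matrix in row-major order. */
size_t element_offset(size_t i, size_t j, size_t k, size_t jmax, size_t kmax)
{
    assert(j < jmax && k < kmax);
    size_t row_begin = i * jmax * kmax;          /* start of row i */
    size_t column_begin = row_begin + j * kmax;  /* start of column j in that row */
    return column_begin + k;
}
```

With a block allocated as double *data = malloc(imax * jmax * kmax * sizeof *data), element (i, j, k) is then data[element_offset(i, j, k, jmax, kmax)].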
This works for me:
void foo(int imax, int jmax, int kmax)
{
// Allocate memory for all the numbers.
// Think of this as (imax*jmax) number of memory chunks,
// with each chunk containing kmax doubles.
double* data_0 = malloc(imax*jmax*kmax*sizeof(double));
// Allocate memory for the previous dimension of pointers.
// Think of this as imax number of memory chunks,
// with each chunk containing jmax double*.
double** data_1 = malloc(imax*jmax*sizeof(double*));
// Allocate memory for the previous dimension of pointers.
double*** data_2 = malloc(imax*sizeof(double**));
for (int i = 0; i < imax; i++)
{
data_2[i] = &data_1[i*jmax];
for (int j = 0; j < jmax; j++)
{
data_1[i*jmax+j] = &data_0[(i*jmax+j)*kmax];
}
}
// That is the matrix.
double ***matrix = data_2;
for (int i = 0; i < imax; i++)
{
for (int j = 0; j < jmax; j++)
{
for (int k = 0; k < kmax; k++)
{
matrix[i][j][k] = i+j+k;
}
}
}
for (int i = 0; i < imax; i++)
{
for (int j = 0; j < jmax; j++)
{
for (int k = 0; k < kmax; k++)
{
printf("%lf ", matrix[i][j][k]);
}
printf("\n");
}
}
// Deallocate memory
free(data_2);
free(data_1);
free(data_0);
}