SDSoC: Using only part of an array as sequential input in a function in HLS - vivado-hls

This question concerns the data movement in Xilinx SDSoC and HLS.
I have a large 1D array in my main function, which is being allocated using sds_alloc. It is basically a 2D array (of N rows and M columns) transformed into a 1D array of N*M elements.
I also have a function that accepts two arrays of size N as inputs, on the PL part.
I want this function to process two columns of the original 2D array - so, two parts of N elements stored sequentially in the 1D array, which has been allocated using sds_alloc in the main function.
Is there an efficient way to access these two parts of the array sequentially as a stream in the accelerated function?

As far as I know, sds_alloc contiguously allocates memory buffers and SDSoC infers DMA transfers over those buffers (that is what I assume you're aiming for).
I'm not entirely sure whether SDSoC is able to infer parallel accesses to a "shared" array, but my gut feeling is that it can.
I'm sure you can call your hardware function over pointers pointing at different locations of the same array (e.g. an argument looking like: &(x[i * N])).
I would try something like this approach:
void kernel(const data_t* col_1st,
const data_t* col_2nd,
// [...]
) {
// [...]
}
// [...]
data_t* x = sds_alloc(sizeof(data_t) * N * M);
for (int i = 0; i < M; i = i + 2) {
kernel(&(x[i * N]), &(x[(i + 1) * N]), ...);
}
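To make the pointer arithmetic above concrete, here is a minimal host-side sketch in plain C. It is only an illustration under assumptions that are not in the question: the matrix is stored column by column (so the N elements of one column are adjacent), data_t is int, and sum_two_columns is a hypothetical stand-in for the real hardware function. An ordinary buffer stands in for the one returned by sds_alloc.

```c
#include <assert.h>
#include <stddef.h>

typedef int data_t;  /* assumption: the real element type is not given */

/* Hypothetical stand-in for the hardware function. Both columns are read
   strictly in order, element 0 to n - 1, which is what a stream needs. */
long sum_two_columns(const data_t *col_1st, const data_t *col_2nd, size_t n) {
    assert(col_1st != NULL && col_2nd != NULL);
    long acc = 0;
    for (size_t r = 0; r < n; ++r) {
        acc += col_1st[r] + col_2nd[r];
    }
    return acc;
}

/* Walks an n-by-m matrix stored column by column, two columns per call.
   Assumes m is even; a trailing odd column would be skipped. */
long process_all_columns(const data_t *x, size_t n, size_t m) {
    assert(x != NULL);
    long total = 0;
    for (size_t c = 0; c + 1 < m; c += 2) {
        total += sum_two_columns(&x[c * n], &x[(c + 1) * n], n);
    }
    return total;
}
```

If the matrix is stored row by row instead, a column is not contiguous and this slicing does not apply as written. In SDSoC itself the tool also has to know how many elements each pointer argument covers and that they are accessed sequentially; as far as I know that is expressed with the SDS data pragmas, so check the SDSoC documentation for the exact form.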

Related

Struct of 2D Variable Length Array in C

I am trying to create a struct containing 2 Variable Length Array (buffer_size is the variable parameter acquired at run time).
Here is my code:
struct data {
float *c; //2D array
float *mtdt; //1D array
};
struct data c_matrice;
c_matrice.c = malloc((90 * sizeof (float*) * buffer_size));
c_matrice.mtdt = malloc(90 * sizeof (float*));
The idea is to link the structure's members to arrays that are dynamically allocated.
Here is the compiler error
expected « = », « , », « ; », « asm » or « __attribute__ » before « . » token
c_matrice.c = malloc((90 * sizeof (float*) * buffer_size));
And when I try to access those members, I get
subscripted value is neither array nor pointer nor vector
I haven't been able to find a solution to my problem in the previous questions. Frankly as a beginner I don't get everything. What am I missing?
EDIT 1: Ok I got rid of the first error by moving the last two lines into my main.c rather than a .h file (This was a basic stupid mistake). Now I still face the
subscripted value is neither array nor pointer nor vector
when I try to access the struct with something like this
pmoy = pow(10,(c_matrice->c[i][curve2apply]/20))*pmax;
And by the way, the whole code is really big, and what I presented you was a small part of the actual code.
What you've done here:
c_matrice.c = malloc((90 * sizeof (float*) * buffer_size));
Is allocate one long buffer of size 90 * size of pointer-to-float * buffer_size.
You have a bunch of options in how you implement a 2D array in C.
One approach is to change what you have there to:
c_matrice.c = malloc((90 * sizeof (float) * buffer_size));
So you've allocated space for 90*buffer_size floats (rather than pointers to floats).
You then need to calculate indexes yourself:
float get_matrix_element(struct data *c_matrix, size_t row, size_t column) {
return c_matrix->c[row*buffer_size+column];
}
That's a very popular and very efficient way of storing the data because it's stored as one block of memory and you can do useful things like allocate it in a single block and iterate through it without concern for structure:
float get_matrix_sum(struct data *c_matrix) {
size_t sz=buffer_size*90;
float sum=0.0f;
for(size_t i=0;i<sz;++i){
sum+=c_matrix->c[i];
}
return sum;
}
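Both helpers above rely on buffer_size being visible as a global. One possible variation, sketched here with hypothetical names (struct matrix, matrix_init, matrix_at, matrix_free), is to store the dimensions next to the pointer so that the index calculation never depends on outside state:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical variant of the question's struct that remembers its shape. */
struct matrix {
    size_t rows;
    size_t cols;
    float *c;     /* rows * cols floats in one block, row after row */
};

/* Allocates zero-initialised storage. Returns 0 on success, -1 on failure. */
int matrix_init(struct matrix *m, size_t rows, size_t cols) {
    assert(m != NULL);
    m->c = calloc(rows * cols, sizeof *m->c);
    if (m->c == NULL) {
        return -1;
    }
    m->rows = rows;
    m->cols = cols;
    return 0;
}

/* Pointer to element (row, col), usable for both reading and writing. */
float *matrix_at(struct matrix *m, size_t row, size_t col) {
    assert(row < m->rows && col < m->cols);
    return &m->c[row * m->cols + col];
}

/* Releases the storage and clears the pointer. */
void matrix_free(struct matrix *m) {
    free(m->c);
    m->c = NULL;
}
```

For the question's data that would be matrix_init(&c_matrice, 90, buffer_size), after which *matrix_at(&c_matrice, i, curve2apply) reads or writes one element.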
There are other ways of doing this including:
Declare a 90 long 1D array of pointers to float and then allocate rows of floats.
The downside is 91 malloc()/free() operations instead of 1.
The upside is you could allocate a ragged array with different length rows.
Declare a static (compile time) sized array float c[90][buffer_size];
Where buffer_size is a compile time constant.
The downside is it's compile time fixed (and if large and a local variable may break the stack).
The upside is that managing the internal row*buffer_size+column index calculation is taken off you.
If c is a member of the struct, then you must use c_matrice.c, not c_matrice->c. And carefully note everything people tell you about c not being a two dimensional array. To allocate these, there's a ton of question/answers on SO and you must not ask that question once again. :-)

Equivalence between Subscript Notation and Pointer Dereferencing

This is more than one question. I need to deal with an NxN matrix A of integers in C. How can I allocate the memory in the heap? Is this correct?
int **A=malloc(N*sizeof(int*));
for(int i=0;i<N;i++) *(A+i)= malloc(N*sizeof(int));
I am not absolutely sure if the second line of the above code should be there to initiate the memory.
Next, suppose I want to access the element A[i, j] where i and j are the row and column indices starting from zero. Is it possible to do it via dereferencing the pointer **A somehow? For example, something like *(A+n*i+j)? I know I have some conceptual gap here and some help will be appreciated.
not absolutely sure if the second line of the above code should be there to initiate the memory.
It needs to be there, as it actually allocates the space for the N rows carrying the N ints each that you need.
The 1st allocation only allocates the row-indexing pointers.
to access the element A[i, j] where i and j are the row and column indices starting from zero. Is it possible to do it via dereferencing the pointer **
Sure, just do
A[1][1]
to access the 2nd element of the 2nd row.
This is identical to
*(*(A + 1) + 1)
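The equivalence holds for any pair of indices, not just [1][1]. A small sketch with two hypothetical helper functions (get_subscript and get_deref are names made up for this illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Reads element (i, j) of a pointer-to-pointer matrix with subscripts. */
int get_subscript(int **a, size_t i, size_t j) {
    assert(a != NULL);
    return a[i][j];
}

/* Same element, written as the pointer arithmetic that a[i][j] stands for:
   step i row pointers forward, follow that pointer, step j ints forward. */
int get_deref(int **a, size_t i, size_t j) {
    assert(a != NULL);
    return *(*(a + i) + j);
}
```

Both functions return the same value for every valid (i, j); the subscript form is simply the more readable spelling.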
Unrelated to your question:
Although the code you show is correct, a more robust way to code this would be:
int ** A = malloc(N * sizeof *A);
for (size_t i = 0; i < N; i++)
{
A[i] = malloc(N * sizeof *A[i]);
}
size_t is the type of choice for indexing, as it is guaranteed to be large enough to hold any index value possible for the system the code is compiled for.
Also you want to add error checking to the two calls of malloc(), as it might return NULL in case of failure to allocate the amount of memory requested.
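A sketch of what that error checking could look like. The function names alloc_square and free_square are made up for this example, and returning NULL on failure is just one possible policy:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocates an n-by-n matrix as n separately allocated rows.
   Returns NULL if any malloc() fails, after freeing what was allocated. */
int **alloc_square(size_t n) {
    assert(n > 0);
    int **a = malloc(n * sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = malloc(n * sizeof *a[i]);
        if (a[i] == NULL) {
            while (i > 0) {
                free(a[--i]);   /* undo the rows that did succeed */
            }
            free(a);
            return NULL;
        }
    }
    return a;
}

/* Frees a matrix obtained from alloc_square(); NULL is accepted. */
void free_square(int **a, size_t n) {
    if (a == NULL) {
        return;
    }
    for (size_t i = 0; i < n; i++) {
        free(a[i]);
    }
    free(a);
}
```

The caller then only has to test one return value: int **A = alloc_square(N); if (A == NULL) { /* report and bail out */ }.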
The declaration is correct, but the matrix won't occupy contiguous memory. It is an array of pointers, where each pointer can point to whatever location was returned by malloc. For that reason, addressing like *(A+n*i+j) does not make sense.
Assuming that the compiler has support for VLA (which became optional in C11), the idiomatic way to define a contiguous matrix would be:
int (*matrixA)[N] = malloc(N * sizeof *matrixA);
In general, the syntax for a matrix with N rows and M columns is as follows:
int (*matrix)[M] = malloc(N * sizeof *matrix);
Notice that M and N do not have to be given as constant expressions (thanks to VLA pointers). That is, they can be ordinary (e.g. automatic) variables.
Then, to access elements, you can use ordinary index syntax like:
matrixA[0][0] = 100;
Finally, to release memory for such matrices use a single free, e.g.:
free(matrixA);
free(matrix);
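Put together, the whole life cycle of such a matrix might look like the sketch below. fill_and_sum is a hypothetical function written for this illustration; it needs a compiler with VLA support, and it spells the row size as sizeof(int[cols]), which yields the same value as sizeof *matrix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocates a rows-by-cols matrix in one contiguous block, fills element
   (i, j) with i * cols + j, and returns the sum of all elements.
   Returns -1 if the allocation fails. */
long fill_and_sum(size_t rows, size_t cols) {
    assert(rows > 0 && cols > 0);
    int (*matrix)[cols] = malloc(rows * sizeof(int[cols]));
    if (matrix == NULL) {
        return -1;
    }
    long sum = 0;
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            matrix[i][j] = (int)(i * cols + j);  /* ordinary 2D subscripts */
            sum += matrix[i][j];
        }
    }
    free(matrix);                                /* one malloc, one free */
    return sum;
}
```

Note that rows and cols are plain function parameters here, not compile-time constants.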
You need to understand that 2D and higher arrays do not work well in C89. Beginner books usually introduce 2D arrays in a very early chapter, just after 1D arrays, which leads people to assume that the natural way to represent 2-dimensional data is via a 2D array. In fact they have many tricky characteristics and should be considered an advanced feature.
If you don't know array dimensions at compile time, or if the array is large, it's almost always easier to allocate a 1D array and access via the logic
array[y*width+x];
so in your case, just call
int *A;
A = malloc(N * N * sizeof(int));
A[3*N+2] = 123; // set element A[3][2] to 123, but you can't use this syntax
It's important to note that the suggestion to use a flat array is just a suggestion, not everyone will agree with it, and 2D array handling is better in later versions of C. However I think you'll find that this method works best.

Split C Array on element

Say, I have an array T*array and a predicate p, I want to split the array into different sub-arrays T**subs on every element matching p.
So something like:
typedef bool (*P) (T element);
T**subs(T*array,P p){....}
How can the code for subs() look like?
Note, that the code is just pseudo code, you can use variables like array_length and so on in your example, because I just want to get the idea on how to implement subs().
Of more importance than "what would the code look like" is the question "what data structure do you want/need to use?"
For example, if you need to change the sub arrays without changing the original values, you need to copy the array elements into new arrays. If you do not change the values of the sub arrays, you can just return an array of pointers or indices into the original array. Or the array of pointers is a list.
Once you have decided on a data structure that matches your requirements, you can develop the algorithm. But if your algorithm turns out to be cumbersome or slow, you might need to adapt your data structure to allow faster processing.
So you see, your question needs a lot of "design" and decisions from you, based on your requirements.
Assuming there will be no gaps between the sub-arrays, you can return a pointer to a dynamically created array T ** result with size N for M=N-2 detected elements.
This array needs to be NULL terminated to indicate its size, that is result[N-1] needs to be NULL.
Each element of result points into the source array indicating the start (the 1st element) of a sub-array.
The result[N-2] points just beyond the last element.
The size of sub-array i (for i = {0 ... M}) can then be derived by doing result[i+1]-result[i].
No copying and no additional array to indicate the sub-arrays' sizes are needed. Just the source array's size needs to be passed to subs().
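A minimal sketch of this boundary-pointer idea follows. It makes several assumptions of its own: T is int, every element matching p starts a new sub-array, the predicate has no side effects (it is called twice per element, once to count and once to record), and is_negative is just an example predicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef int T;                    /* assumption: T is int for this sketch */
typedef bool (*P)(T element);

/* Returns a NULL-terminated array of boundary pointers into `array`.
   Entry 0 is the start of the array, each later entry is the position of
   an element matching p (which begins a new sub-array), and the last
   non-NULL entry points one past the final element.
   The caller frees the result; NULL is returned if malloc() fails. */
T **subs(T *array, size_t length, P p) {
    assert(array != NULL && p != NULL);
    size_t splits = 0;
    for (size_t i = 1; i < length; ++i) {
        if (p(array[i])) {
            ++splits;
        }
    }
    /* splits + 1 start pointers, 1 end pointer, 1 NULL terminator */
    T **result = malloc((splits + 3) * sizeof *result);
    if (result == NULL) {
        return NULL;
    }
    size_t k = 0;
    result[k++] = array;
    for (size_t i = 1; i < length; ++i) {
        if (p(array[i])) {
            result[k++] = array + i;
        }
    }
    result[k++] = array + length;
    result[k] = NULL;
    return result;
}

/* Example predicate: split in front of every negative value. */
bool is_negative(T element) {
    return element < 0;
}
```

With this layout, sub-array i starts at result[i] and has result[i+1] - result[i] elements, for as long as result[i+2] is not NULL.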
We call that a callback function, not predicate.
typedef bool (*P) (T element);
T * * subs(T * array, P callback) {
T * * retval = malloc(sizeof(T*) * max_groups); // either count before, or realloc as needed
size_t group = 0;
retval[group] = array;
for (size_t i = 0; i < array_length; ++i) {
if (callback(array[i])) {
retval[++group]=array + i;
}
}
return retval;
}
This reuses the memory of the argument array and doesn't return any information about the lengths of the groups, but since you only wanted a general idea on how to solve this, I think this should be enough starting point for you to get exactly what you want.

Passing multi-dimensional arrays to functions in C

Why is it necessary to specify the number of elements of a C-array when it is passed as a parameter to a function (10 in the following example)?
void myFun(int arr[][10]) {}
Is it so because the number of elements is needed to determine the address of the cell being accessed?
Yes. It's because arr[i][j] means ((int *)arr)[i * N + j] if arr is an int [][N]: the pointer-arithmetic requires the length of a row.
The compiler needs to have an idea when the next row starts in memory (as a 2D array is just a contiguous chunk of memory, one row after the other). The compiler is not psychic!
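As a small illustration of why the row length is needed, the two hypothetical functions below fetch the same element, once with two subscripts and once with the flat-index formula quoted above. COLS plays the role of the 10 in the question.

```c
#include <assert.h>
#include <stddef.h>

enum { COLS = 10 };   /* the row length the compiler must know */

/* Ordinary two-index access; the parameter is really int (*arr)[COLS]. */
int get_2d(int arr[][COLS], size_t i, size_t j) {
    assert(j < COLS);
    return arr[i][j];
}

/* The same lookup done by hand on the flat storage: skip i whole rows of
   COLS ints, then move j more elements. */
int get_flat(int arr[][COLS], size_t i, size_t j) {
    assert(j < COLS);
    return ((int *)arr)[i * COLS + j];
}
```

Without COLS in the parameter type, there would be nothing to multiply i by.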
It is only necessary if you use static allocation for your array, though, because the generated code creates a contiguous memory block for the array, as ruakh pointed out.
However if you use dynamic allocation it is not necessary, you only need to pass pointers.
Regards

What is the reason C compiler demands that number of columns in a 2d array will be defined?

given the following function signature:
void readFileData(FILE* fp, double inputMatrix[][], int parameters[])
this doesn't compile.
and the corrected one:
void readFileData(FILE* fp, double inputMatrix[][NUM], int parameters[])
My question is, why does the compiler demand that the number of columns be defined when handling a 2D array in C? Is there a way to pass a 2D array with unknown dimensions to a function?
thank you
Built-in multi-dimensional arrays in C (and in C++) are implemented using the "index-translation" approach. That means that a 2D (3D, 4D etc.) array is laid out in memory as an ordinary 1D array of sufficient size, and the access to the elements of such an array is implemented through recalculating the multi-dimensional indices onto a corresponding 1D index. For example, if you define a 2D array of size M x N
double inputMatrix[M][N]
in reality, under the hood the compiler creates an array of size M * N
double inputMatrix_[M * N];
Every time you access the element of your array
inputMatrix[i][j]
the compiler translates it into
inputMatrix_[i * N + j]
As you can see, in order to perform the translation the compiler has to know N, but doesn't really need to know M. This translation formula can easily be generalized for arrays with any number of dimensions. It will involve all sizes of the multi-dimensional array except the first one. This is why every time you declare an array, you are required to specify all sizes except the first one.
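To show the generalization for one more dimension, here is a sketch of the same translation for a 3D array. flat_index_3d is a hypothetical helper, with d2 and d3 standing for the second and third declared sizes:

```c
#include <assert.h>
#include <stddef.h>

/* Flat index of element [i][j][k] in an array declared as a[D1][D2][D3].
   D1 never appears in the formula, which is why the first size is the
   only one that may be left out of a declaration. */
size_t flat_index_3d(size_t d2, size_t d3, size_t i, size_t j, size_t k) {
    assert(j < d2 && k < d3);
    return (i * d2 + j) * d3 + k;
}
```

For int a[2][3][4], element a[1][2][3] lands at flat index (1 * 3 + 2) * 4 + 3 = 23, the last of the 24 elements.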
As the array in C is purely memory without any meta information about dimensions, the compiler needs to know how to apply the row and column index when addressing an element of your matrix.
inputMatrix[i][j] is internally translated to something equivalent to *(inputMatrix + i * NUM + j)
and here you see that NUM is needed.
C doesn't have any specific support for multidimensional arrays. A two-dimensional array such as double inputMatrix[N][M] is just an array of length N whose elements are arrays of length M of doubles.
There are circumstances where you can leave off the number of elements in an array type. This results in an incomplete type — a type whose storage requirements are not known. So you can declare double vector[], which is an array of unspecified size of doubles. However, you can't put objects of incomplete types in an array, because the compiler needs to know the element size when you access elements.
For example, you can write double inputMatrix[][M], which declares an array of unspecified length whose elements are arrays of length M of doubles. The compiler then knows that the address of inputMatrix[i] is i*sizeof(double[M]) bytes beyond the address of inputMatrix[0] (and therefore the address of inputMatrix[i][j] is i*sizeof(double[M])+j*sizeof(double) bytes). Note that it needs to know the value of M; this is why you can't leave off M in the declaration of inputMatrix.
A theoretical consequence of how arrays are laid out is that inputMatrix[i][j] denotes the same address as inputMatrix + M * i + j.¹
A practical consequence of this layout is that for efficient code, you should arrange your arrays so that the dimension that varies most often comes last. For example, if you have a pair of nested loops, you will make better use of the cache with for (i=0; i<N; i++) for (j=0; j<M; j++) ... than with loops nested the other way round. If you need to switch between row access and column access mid-program, it can be beneficial to transpose the matrix (which is better done block by block rather than in columns or in lines).
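The two loop orders can be sketched as follows, on a matrix stored row after row in a flat buffer. The function names are made up for this illustration, and no timing is claimed here; how large the difference is depends on the matrix size and the machine.

```c
#include <assert.h>
#include <stddef.h>

/* Sums a rows-by-cols matrix stored row after row in `m`.
   The inner loop varies the last index, so it walks adjacent addresses. */
double sum_row_order(const double *m, size_t rows, size_t cols) {
    assert(m != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            sum += m[i * cols + j];
        }
    }
    return sum;
}

/* Visits the same elements with the loops swapped: the inner loop now
   jumps `cols` elements on every step, which uses the cache less well. */
double sum_column_order(const double *m, size_t rows, size_t cols) {
    assert(m != NULL);
    double sum = 0.0;
    for (size_t j = 0; j < cols; j++) {
        for (size_t i = 0; i < rows; i++) {
            sum += m[i * cols + j];
        }
    }
    return sum;
}
```

Both functions touch every element exactly once; only the order of memory accesses differs.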
C89 references: §3.5.4.2 (array types), §3.3.2.1 (array subscript expressions)
C99 references: §6.7.5.2 (array types), §6.5.2.1-3 (array subscript expressions).
¹ Proving that this expression is well-defined is left as an exercise for the reader. Whether inputMatrix[0][M] is a valid way of accessing inputMatrix[1][0] is not so clear, though it would be extremely hard for an implementation to make a difference.
This is because in memory, this is just a contiguous area, a single-dimension array if you will. And to get the real offset of inputMatrix[x][y] the compiler has to calculate (x * elementsPerColumn) + y. So it needs to know elementsPerColumn and that in turn means you need to tell it.
No, there's not. The situation's pretty simple really: what the function receives is really just a single, linear block of memory. Telling it the number of columns tells it how to translate something like block[x][y] into a linear address in the block (i.e., it needs to do something like address = row * column_count + column).
Other people have explained why, but the way to pass a 2D array with unknown dimensions is to pass a pointer. The compiler demotes array parameters to pointers anyway. Just make sure it's clear what you expect in your API docs.
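A sketch of that pointer-based interface, with a hypothetical function matrix_max: the caller passes the first element's address plus both dimensions, and the function does the row * cols + column arithmetic itself.

```c
#include <assert.h>
#include <stddef.h>

/* Largest value in a rows-by-cols matrix passed as a plain pointer.
   Nothing in the type records the shape, so both dimensions are
   explicit parameters and must be documented for callers. */
double matrix_max(const double *m, size_t rows, size_t cols) {
    assert(m != NULL && rows > 0 && cols > 0);
    double best = m[0];
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            if (m[r * cols + c] > best) {
                best = m[r * cols + c];
            }
        }
    }
    return best;
}
```

A caller holding a flat buffer passes it directly. A caller holding double m[R][C] would commonly pass &m[0][0], R and C.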
