About matrix operations - c

This is a theoretical question, there is no need to show code.
So, I would like to know how to return the result of a sum function, which will add two arrays using pointers.
First, should matrices be declared as a pointer right at the beginning?
So, in the function we will have
void sumMatrix (int ** m) (?)
From here, how to proceed to return the result of this sum, since the matrix itself cannot be returned

Options include:
Pass the function a pointer to where you want the result matrix stored.
Write code in the function to allocate space for the result matrix and return a pointer to that space.
Create a structure type to hold the result matrix and have the function return that structure by value.
Write the results into one of the input matrices.
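As a rough sketch of the third option, assuming a fixed 3x4 size and the hypothetical names Matrix3x4 and sum_by_value (neither comes from the question), wrapping the array in a structure lets a function return the whole matrix by value:

```c
#define ROWS 3
#define COLS 4

/* Hypothetical wrapper type: a struct that contains an array can be
   returned by value, even though a bare array cannot. */
typedef struct {
    int cell[ROWS][COLS];
} Matrix3x4;

/* Returns the element-wise sum of *a and *b as a new structure. */
Matrix3x4 sum_by_value(const Matrix3x4 *a, const Matrix3x4 *b)
{
    Matrix3x4 result;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            result.cell[i][j] = a->cell[i][j] + b->cell[i][j];
    return result;
}
```

The whole structure is copied on return, so this is probably only attractive for small, fixed-size matrices.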

It is impossible to pass arrays of any dimension to functions in C. It is impossible even to express the concept, because in most circumstances, including function-call expressions, values of array type are automatically converted to pointers. Thus, your function has no alternative but to receive its arguments in the form of pointers.
However, you should give some thought to the specific pointer types. C multidimensional arrays (e.g. int arr[3][4]) are structured as arrays of arrays, and the aforementioned automatic conversions yield pointers to arrays (int (*p)[4]), not pointers to pointers. On the other hand, you can construct arrays of pointers (int *arr[3]) and use the same syntax to access them as one does with multidimensional arrays. The automatic conversion of these to pointers does yield double pointers (int **p). Despite the matching access syntax, these alternatives are very different in terms of memory layout and access efficiency.
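To make the difference concrete, here is a small sketch. The object and function names (grid, rows, get_from_grid, get_from_rows) are made up for illustration. Both accessors use the same p[i][j] syntax, but they take different pointer types and reach the element differently:

```c
/* Array of arrays: a single contiguous block of 3 * 4 ints. */
int grid[3][4];

/* Array of pointers: a table of 3 pointers to separately stored rows. */
int row0[4], row1[4], row2[4];
int *rows[3] = { row0, row1, row2 };

/* grid converts to a pointer to its first row, i.e. int (*)[4]. */
int get_from_grid(int (*p)[4], int i, int j)
{
    return p[i][j];   /* one address computation from the base address */
}

/* rows converts to a pointer to its first element, i.e. int **. */
int get_from_rows(int **p, int i, int j)
{
    return p[i][j];   /* loads the row pointer p[i] first, then indexes it */
}
```

Passing grid to get_from_rows, or rows to get_from_grid, is a type error that a conforming compiler must diagnose, which is one practical reason to settle on a layout early.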
It depends. Ignoring for the moment the question of returning the sum, you have at least three good alternatives:
void sumMatrix(int r, int c, int **m1, int **m2); This is appropriate for array of pointers data layout. You can express the same thing as
void sumMatrix(int r, int c, int *m1[], int *m2[]);, and I would probably be inclined to do that myself.
void sumMatrix(int r, int c, int m1[r][c], int m2[r][c]); This is equivalent to
void sumMatrix(int r, int c, int m1[][c], int m2[][c]); and to
void sumMatrix(int r, int c, int (*m1)[c], int (*m2)[c]);
These rely on the variable-length array feature added to C in C99, and it is worth knowing that this feature became optional in C11. It assumes compact, efficient array-of-array data layout.
void sumMatrix(int r, int c, int *m1, int *m2); or, equivalently,
void sumMatrix(int r, int c, int m1[], int m2[]); This supposes the same array-of-array data layout as the previous, but requires you to perform the index calculations manually (x * c + y). It is useful if you want to have array-of-array layout with variable array dimensions, without depending on VLAs.
Personally, I would be inclined to choose array-of-arrays layout and one of the variations on the second signature option.
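For comparison, a minimal sketch of the third signature style, with the index calculation done by hand. The name addInPlace and the choice to write the sum back into the first matrix are assumptions made for illustration, not part of the answer:

```c
/* Adds m2 into m1 element by element. Both matrices are r x c and are
   passed as flat pointers to their first element, so element (x, y)
   lives at index x * c + y. */
void addInPlace(int r, int c, int *m1, const int *m2)
{
    for (int x = 0; x < r; x++)
        for (int y = 0; y < c; y++)
            m1[x * c + y] += m2[x * c + y];
}
```

A caller holding int a[2][3] and int b[2][3] would pass &a[0][0] and &b[0][0].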
From here, how to proceed to return the result of this sum, since the matrix itself cannot be returned
You again have multiple options, but I would be inclined to add a fifth parameter, of the same type as the third and fourth, representing the result matrix. Because, again, it is necessarily a pointer, the data written into the pointed-to object by the function will be visible to the caller. The caller will then be responsible for passing a pointer to an existing object, which is convenient because it allows (but does not require) using an automatically allocated object.
Thus one complete possibility would be
void sumMatrix(int r, int c, int m1[r][c], int m2[r][c], int result[r][c]) {
// ...
}
which could be called like this:
int a[3][4], b[3][4], c[3][4];
// ... fill arrays a and b ...
sumMatrix(3, 4, a, b, c);
// the result is in matrix c
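With the body filled in, the function might look like the sketch below. It assumes a compiler that supports variable-length array parameters (C99, or C11 and later where the feature is available):

```c
/* Writes the element-wise sum of m1 and m2 into result.
   r and c come first so they are in scope for the array declarators. */
void sumMatrix(int r, int c, int m1[r][c], int m2[r][c], int result[r][c])
{
    for (int i = 0; i < r; i++)
        for (int j = 0; j < c; j++)
            result[i][j] = m1[i][j] + m2[i][j];
}
```

Because result is a pointer to the caller's array, nothing needs to be returned; the caller reads the sum from the array it passed in.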

Related

is there an advantage writing a single function that takes void*, instead of multiple functions each of which takes a different type?

I've been recently trying to make both a dynamic array library + a matrix library to get my head wrapped around C more and especially pointers.
Lately however I found I've been doing things in a different way than some other libraries such as gsl for example. I've been trying to make a single function/struct that can handle every type in c + user defined ones, however when I look at gsl and specifically that matrices part of it they define it in a much different way. The gsl libraries has multiple structs for varying data types (matrix_int, matrix_float, matrix_double, etc.) as well as a set of functions that would only work with that struct (matrix_int_add, etc). My question is, is there an advantage in having a function/struct for each data type? Why not just use a void pointer instead to only have one set of those structs/functions?
You could easily and correctly write a function matrixop(void *arg, int type_of_arg, etc) and then cast arg as necessary according to type_of_arg. Of course, as has already been said, for the function to work correctly, some operations might have to be performed differently for different type_of_arg's. But the user wouldn't have to know that, and would see just the one function. And since you say, "get my head wrapped around C more and especially pointers", I'd definitely recommend you give it a try this way. Your particular matrix example may not be the very, very best situation where void pointers are most useful, but it's definitely fine and dandy for practice.
Addressing the question: yes, there are several advantages to implementing a function for each type, and in some cases it is mandatory. Some libraries aimed at high performance
use very specific instructions to fetch, process, and write the data efficiently based on the variable's type.
A clear example is float versus int: even when they have the same size (on 32-bit processors),
the representation is completely different and the operations are handled by different execution units, the ALU for int and the FPU for float.
Also, only with a C11 compiler can you use _Generic(); if you are using C99 there's no way to know the type of the variable (AFAIK).
_Generic() works at compile time, so either way you end up with a function for each type.
My question is, is there an advantage in having a function/struct for each data type?
Yes. It adds compile-time type checking.
The code used to implement the same operation for different types does differ, and not necessarily by just the element type used. (For matrix operations, the optimum caching strategy may differ between integer and floating-point types of the same size, for example; especially if the hardware supports vectorization.) This means each element type requires its own version of each operation.
It is possible to use some templating techniques to generate element type specific versions of operations that only differ by the type, but usually the end result is more complicated (and thus harder to maintain) than just maintaining the slightly differing implementations separately.
It is quite possible to add an additional layer -- no modifications, just an additional header file included after GSL --, using the preprocessor and either GCC extensions (__typeof__) or C11 _Generic() to present a single "function" for each matrix operation, that chooses the function called at compile time based on the type of the parameter(s).
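A minimal sketch of such a layer using C11 _Generic follows. The types mat_int and mat_double and their add functions are simplified stand-ins invented for this example; they are not GSL's actual structures or function names:

```c
#include <stddef.h>

/* Hypothetical per-type matrix structures (flat storage, n elements). */
typedef struct { size_t n; int *data; } mat_int;
typedef struct { size_t n; double *data; } mat_double;

/* One type-specific implementation per element type. */
void mat_int_add(mat_int *dst, const mat_int *src)
{
    for (size_t i = 0; i < dst->n; i++)
        dst->data[i] += src->data[i];
}

void mat_double_add(mat_double *dst, const mat_double *src)
{
    for (size_t i = 0; i < dst->n; i++)
        dst->data[i] += src->data[i];
}

/* A single name that selects the implementation at compile time,
   based on the static type of the first argument. */
#define mat_add(dst, src) _Generic((dst), \
        mat_int *:    mat_int_add,        \
        mat_double *: mat_double_add)((dst), (src))
```

The selection costs nothing at run time, and passing a pointer of an unlisted type is rejected at compile time, so the type checking is kept.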
Why not just use a void pointer instead to only have one set of those structs/functions?
Because not only do you lose the compile-time type checking -- the user can supply say a literal string, and the compiler won't warn about it, no matter what warnings are enabled --, but it would also add run-time overhead.
Instead of choosing the proper function (implementation) to call at compile time, the data type field would have to be examined and the correct function called at run time. The generic matrix multiply function, for example, might look like
status_code_type matrix_multiply(void *dest, void *left, void *right)
{
const element_type tleft = ((struct generic_matrix_type *)left)->type;
const element_type tright = ((struct generic_matrix_type *)right)->type;
if (tleft != tright)
return ERROR_TYPES_MISMATCH;
switch (tleft) {
case ELEMENT_TYPE_INT:
return matrix_mul_int_int(dest, left, right);
case ELEMENT_TYPE_FLOAT:
return matrix_mul_float_float(dest, left, right);
case ELEMENT_TYPE_DOUBLE:
return matrix_mul_double_double(dest, left, right);
case ELEMENT_TYPE_COMPLEX_FLOAT:
return matrix_mul_cfloat_cfloat(dest, left, right);
case ELEMENT_TYPE_COMPLEX_DOUBLE:
return matrix_mul_cdouble_cdouble(dest, left, right);
default:
return ERROR_UNSUPPORTED_TYPE;
}
}
All of the above code is pure overhead, with the sole purpose of making it "slightly easier" on the programmer. The GSL developers, for example, didn't find it necessary or useful.
Quite a lot of C code -- including most C libraries' FILE implementation -- does utilize a related approach, however: the data structure itself contains function pointers for each operation the data type supports, in an object-oriented fashion.
For example, you could have
struct matrix {
long rows;
long cols;
long rowstep; /* Number of bytes to next row */
long colstep; /* Number of bytes to next element */
size_t size; /* Size of each element */
int type; /* Type of each element */
char *data; /* Logically void*, but allows pointer arithmetic */
int (*supports)(int, int);
int (*get)(struct matrix *, long, long, int, void *);
int (*set)(struct matrix *, long, long, int, const void *);
int (*mul)(struct matrix *, long, long, int, const void *);
int (*div)(struct matrix *, long, long, int, const void *);
int (*add)(struct matrix *, long, long, int, const void *);
int (*sub)(struct matrix *, long, long, int, const void *);
};
where the
int supports(int source_type, int target_type);
is used to find out whether the other callbacks support the necessary operations between the two types, and the rest of the member functions,
int get(struct matrix *m, long row, long col, int to_type, void *to);
int set(struct matrix *m, long row, long col, int from_type, const void *from);
int mul(struct matrix *m, long row, long col, int by_type, const void *by);
int div(struct matrix *m, long row, long col, int by_type, const void *by);
int add(struct matrix *m, long row, long col, int by_type, const void *by);
int sub(struct matrix *m, long row, long col, int by_type, const void *by);
operate on a single element of a given matrix. Note how we need to pass a reference to the matrix itself; if we call e.g. some->get(...), the function that the get function pointer points to, does not automatically get a pointer to the structure via which it was called.
Also note how the value read from the matrix (get), or otherwise used in the operation, is provided via a pointer; and the type of the data specified by the pointer is separately provided. This is needed, if you want a function that say initializes a matrix to identity to work, without the user implementing every single matrix operation function for their custom type themselves.
Because access to an element involves an indirect call, the overhead of the function pointers is quite significant -- especially if you consider how simple and fast the single-element operations actually are. (For example, a 5 clock cycle indirect call overhead on an operation that itself only takes 10 clock cycles adds 50% overhead!)
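The point about passing the structure explicitly can be seen in a much smaller, hypothetical example. The accumulator type below is invented for illustration and is unrelated to the matrix structure above:

```c
/* Hypothetical miniature of the callbacks-in-a-struct pattern. */
struct accumulator {
    long total;
    /* C has no implicit "this": the callback must be handed the object. */
    void (*add)(struct accumulator *self, long amount);
};

/* An implementation that can be stored in the add member. */
void accumulator_add(struct accumulator *self, long amount)
{
    self->total += amount;
}
```

A call then reads acc.add(&acc, 5): the object appears twice, once to find the function pointer and once as the argument the function operates on.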
It depends what your functions do. If they do not actually use the data at all then void* could be correct, whereas if they do need to know anything about the data then specifying the types is the correct way to go.
For example, your dynamic array library probably does not need separate functions to add, remove, sort (etc) int and float data items to the array. In this instance the functions do not need to know anything about the type of the object being stored, just its location; in which case passing a void* is correct.
On the other hand, the matrix library may need different prototypes for the different data types, because int and float (etc) data use different instructions to manipulate them.
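As an illustration of the first case, here is a sketch of an operation that never interprets the elements and so works for any type through void *. The function name reverse_elements is an assumption, not something from the question:

```c
#include <stddef.h>

/* Reverses an array of count elements, each size bytes long.
   The elements are only moved, never interpreted, so the element
   type does not matter. */
void reverse_elements(void *base, size_t count, size_t size)
{
    unsigned char *bytes = base;
    if (count < 2)
        return;
    for (size_t lo = 0, hi = count - 1; lo < hi; lo++, hi--) {
        unsigned char *a = bytes + lo * size;
        unsigned char *b = bytes + hi * size;
        for (size_t k = 0; k < size; k++) {   /* swap byte by byte */
            unsigned char tmp = a[k];
            a[k] = b[k];
            b[k] = tmp;
        }
    }
}
```

Adding two elements, by contrast, cannot be written this way, because the function would have to know how to interpret the bytes.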

Array lengths in array parameters

I am reading C Programming: A Modern Approach by K.N.King to learn the C programming language and the current chapter tells about functions, and also array parameters. It is explained that one can use constructs like these to express the length of array parameters:
1.
void myfunc(int a, int b, int[a], int[b], int[*]); /* prototype */
void myfunc(int a, int b, int n[a], int m[b], int c[a+b+other_func()]) {
... /* body */
}
2.
void myfunc(int[static 5]); /* prototype */
void myfunc(int a[static 5]) {
... /* body */
}
So the question(s) are:
a. Are the constructs in example 1 purely cosmetic or do they have an effect on the compiler?
b. Is the static modifier in this context only of cosmetic nature? what exactly does it mean and do?
c. Is it also possible to declare an array parameter like this; and is it as cosmetic as example 1 is?
void myfunc(int[4]);
void myfunc(int a[4]) { ... }
The outermost (first) dimension of a function array parameter is always rewritten to a pointer, so the value that you give there doesn't have much importance, unfortunately. This changes for multidimensional arrays: starting from the second dimension, the values are used by the compiler to compute things like A[i][j].
The static in that context means that a caller has to provide at least as many elements. Most compilers ignore the value itself. Some recent compilers deduce from it that a null pointer is not allowed as an argument and warn you accordingly, if possible.
Also observe that the prototype may have * so clearly the value isn't important there. In case of multidimensional arrays the concrete value is the one computed with the expression for the definition.
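A short sketch of both points, using the hypothetical function names sum_first_five and element, and assuming a C99 compiler:

```c
#include <stddef.h>

/* [static 5]: the caller promises a non-null pointer to at least
   5 ints. The parameter itself is still just a const int *. */
int sum_first_five(const int a[static 5])
{
    int total = 0;
    for (int i = 0; i < 5; i++)
        total += a[i];
    return total;
}

/* In a 2-D parameter only the first dimension is rewritten to a
   pointer; cols is really used, to locate a[i][j]. */
int element(size_t rows, size_t cols, int a[rows][cols], size_t i, size_t j)
{
    (void)rows;   /* not needed for the address computation */
    return a[i][j];
}
```

Passing an array with fewer than 5 elements to sum_first_five is undefined behavior; whether a compiler warns about it varies.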

Using qsort() with multiple types

I'd like to use C's qsort() function to sort arrays each having different types, like these:
int a[] = {1, 2, 3};
const char *b[] = {"foo", "bar", "bas"};
my_defined_type_t *c[100]; for (i=0; i<100; i++) { fill(c[i]); }
Is it necessary to write comparison functions for each type, like intComparator(), stringComparitor(), myDefinedTypeComparitor() and make calls to qsort with each comparison function in turn, or can something like this be done in C:
int myGrandUnifiedComparisonFunction(const void* a, const void* b) {
if *a, *b are integers: intComparatorCode;
if *a, *b are strings: stringComparitorCode;
if *a, *b are my_defined_type_t's: myDefinedTypeComparitorCode;
/* etc. */
}
There are two problems to consider:
1) The Problem of Information
Your comparison function gets handed two void pointers. That's just some bit patterns which could mean anything. C attaches no information to, for example, floats or character pointers, so it's impossible to tell if some piece of data is the one or the other if you don't know beforehand.
That said, you can attach this information yourself by wrapping your data in a struct together with an enum value telling you what's inside. But you wouldn't technically be comparing floats or char pointers, but wrapped floats and wrapped char pointers. Something like:
typedef enum { Float, String, MyType } typ;
typedef struct {
typ t; /* tag: which union member is valid */
union {
float f;
char *s;
myType mt;
} u;
} wrappedData;
Then you can just write one function which compares wrappedData *.
That's just about what every dynamic language does.
And then, even your grand unified function would still have to compare them appropriately, that is, differently for each type, so you wouldn't have gained much. On the contrary, you would mould logic together into one function which doesn't really belong together.
2) The Problem of Efficiency
While this may not bother you, unwrapping a pointer and checking its type would be done with every single comparison operation, which may increase the runtime of your sort by a lot.
Conclusion:
You'd have to go some way and wrap your data, for a dubious advantage and a significant disadvantage (efficiency). Don't do it.
C has no introspection, so there's no way of knowing what type a void* is pointing to.
You need one comparison function per type, and have to call qsort with the correct callback.
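A sketch of what that looks like for the int and string arrays from the question. The comparator names compare_ints and compare_strings are chosen here for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* For an int array, qsort passes pointers to int elements. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

/* For an array of const char *, qsort passes pointers to the
   char pointers, hence the extra level of indirection. */
int compare_strings(const void *a, const void *b)
{
    const char *const *x = a;
    const char *const *y = b;
    return strcmp(*x, *y);
}
```

Each array is then sorted with its own callback, for example qsort(a, 3, sizeof a[0], compare_ints) and qsort(b, 3, sizeof b[0], compare_strings).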
Your idea of
int myGrandUnifiedComparisonFunction(const void* a, const void* b) {
if *a, *b are integers: intComparatorCode;
if *a, *b are strings: stringComparitorCode;
if *a, *b are my_defined_type_t's: myDefinedTypeComparitorCode;
/* etc. */
}
is pretty great. Have you tried implementing it?
The problem is that there is no way of implementing it in C or C++. There is no way of determining what kind of variable the void* points to.
No, you cannot have a generic function, because types are not passed at runtime in C (unlike in object-oriented languages). The type has to be known at compile time.
Thus you need to have a function that knows how to compare each type and tell qsort that.
I guess you can if you somehow magically know the types, but why bother? Also, you've got the size to pass in as well so you'd need to do the check in 2 places.
Not sure what the advantage of this is.

Not able to understand this function definition

How to interpret this function definition? How should I pass arguments to it?
void matmul(float (*A)[N],int BlockX, int BlockY)
The first argument is a pointer to an array of N elements:
float a[N];
matmul(&a, 2, 3);
(Note that N has to be a compile-time constant in C89 and C++; in C89 it would essentially have to be #defined as some literal value. In C99 you have variable-length arrays.)
Since arrays decay to pointers, you can also feed it an array:
float b[M][N];
matmul(b, 2, 3);
Another way of writing the same prototype would be
void matmul(float A[][N],int BlockX, int BlockY)
which better shows what this is usually supposed to receive: a two-dimensional array, for which N is
a compile time integer constant (not a const variable!) if you only have C89
any integer expression which can be evaluated at the point of the definition if you have modern C99
The other dimension is not specified and you have to know or transmit it somehow.
It looks to me that this interface is an oldish one, since it seems to use int parameters to pass size information. The modern way to do this (and avoid 32/64 bit problems and stuff like that) would be to use size_t for such quantities.
If by chance the two parameters would correspond to the "real" matrix dimension, in modern C your definition should look like
void matmul(size_t m, size_t n, float A[m][n]) {
...
}
where it is important that m and n come before A, such that they are already known, there.
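As an illustration of that parameter ordering, here is a sketch of a function with a complete body. It is not the original matmul, whose body is not shown; the name matsum and its purpose are made up:

```c
#include <stddef.h>

/* Sums all elements of an m x n matrix. m and n are declared before A
   so that they are already in scope in the declarator A[m][n]. */
float matsum(size_t m, size_t n, float A[m][n])
{
    float total = 0.0f;
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            total += A[i][j];
    return total;
}
```

Given float B[2][3], the call is simply matsum(2, 3, B); the array converts to a pointer to its first row, which is exactly what the parameter expects.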

Prototype for variable-length arrays

I am trying to write a function that takes an array of a variable size in C.
void sort(int s, int e, int arr[*]){
...
}
It says that for variable length arrays, it needs to be bounded in the function declaration. What does that mean? I am using xcode 4.0, with the LLVM compiler 2.0.
Thanks for the help.
As I see that no one answers the real question, here I give mine.
In C99 you have variable-length arrays (VLAs), which are declared with a length that is evaluated at run time, and not only at compile time as in previous versions of C. But passing arrays to functions is a bit tricky.
A one dimensional array is always just passed as a pointer so
void sort(size_t n, int arr[n]) {
}
is equivalent to
void sort(size_t n, int *arr){
}
Higher dimensions, however, are carried through to the function:
void toto(size_t n, size_t m, int arr[n][m]){
}
is equivalent to
void toto(size_t n, size_t m, int (*arr)[m]){
}
With such a definition in the inside of such a function you can access the elements with expressions as arr[i][j] and the compiler knows how to compute the correct element.
Now comes the syntax that you discovered, which is only useful for prototypes, that is, places where you forward-declare the interface of the function:
void toto(size_t, size_t, int arr[*][*]);
So here you may replace the array dimensions with * as placeholders. But this is only useful when you don't have the names of the dimensions at hand, and it is much clearer to use exactly the same version as for the definition:
void toto(size_t n, size_t m, int arr[n][m]);
In general, for consistent use of this, it is important that the dimensions come first in the parameter list. Otherwise they would not be known when the compiler parses the declaration of arr.
If you're not using the C99 variable length arrays (it appears you are, so see below), the usual solution is to pass in a pointer to the first element, along with any indexes you want to use for accessing the elements.
Here's a piece of code that prints out a range of an array, similar to what you're trying to do with your sort.
#include <stdio.h>
static void fn (int *arr, size_t start, size_t end) {
size_t idx;
for (idx = start; idx <= end; idx++) {
printf ("%d ", arr[idx]);
}
putchar ('\n');
}
int main (void) {
int my_array[] = {9, 8, 7, 6, 5, 4, 3, 2, 1, 0};
fn (my_array, 4, 6);
return 0;
}
This outputs elements four through six inclusive (zero-based), giving:
5 4 3
A couple of points to note.
Using my_array in that function call to fn automatically "decays" the array into a pointer to its first element. This actually happens under most (not all) circumstances when you use arrays, so you don't have to explicitly state &(my_array[0]).
C already has a very good sort function built in to the standard library, called qsort. In many cases, that's what you should be using (unless either you have a specific algorithm you want to use for sorting, or you're doing a homework/self-education exercise).
If you are using real VLAs, you should be aware that the [*] construct is only valid in the function prototype, not in an actual definition of the function.
So, while:
void xyzzy(int, int[*]);
is valid, the following is not:
void xyzzy(int sz, int plugh[*]) { doSomething(); }
That's because, while you don't need the size parameter in the prototype, you do very much need it in the definition. And, since you have it, you should just use it:
void xyzzy(int sz, int plugh[sz]) { doSomething(); }
The gcc compiler actually has a reasonably clear error message for this, far better than the "needs to be bounded in the function declaration" one you saw:
error: ‘[*]’ not allowed in other than function prototype scope
What you want to do is make your argument an int * and pass in the length of the array (which the caller presumably knows, but this routine does not) as a separate argument. You can pass an array as such an argument.
The usage of * inside of array brackets for variable-length arrays is limited to prototypes, and serves merely as a placeholder. When the function is later defined, the array's size should be stored in a variable available at either file scope or as one of the parameters. Here's a simple example:
void foo(int, int[*]);
/* asterisk is placeholder */
void foo(int size, int array[size]) {
/* note size of array is specified now */
}

Resources