Is defining a function this way legal or helpful in any way?
void f(int arr[ARR_SIZE])
It's legal (assuming that ARR_SIZE is a positive integer constant expression), and is perhaps a useful annotation, although it is confusing because it seems to make a false promise. On balance I wouldn't use it.
It doesn't mean that arr is an array of that size: arr is still an int* (due to pointer decay) and all size information is lost.
Far better then to write
void f(int* arr, size_t n)
with f(arr, ARR_SIZE) at the calling site, or f(arr, sizeof(arr)/sizeof(arr[0])) if the non-decayed type of arr is available.
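A minimal sketch of that pattern, assuming a hypothetical helper named sum_ints and an illustrative ARR_SIZE of 4: inside the function the parameter is only a pointer, so the element count is computed where the real array type is still visible and passed along.

```c
#include <stddef.h>

#define ARR_SIZE 4  /* assumed value, for illustration only */

/* Hypothetical helper: arr is just a pointer here, so n carries the length. */
long sum_ints(const int *arr, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += arr[i];
    return total;
}

/* The caller still sees the array type, so sizeof yields the element count. */
long sum_whole_array(void)
{
    int data[ARR_SIZE] = {1, 2, 3, 4};
    return sum_ints(data, sizeof data / sizeof data[0]);
}
```

The sizeof expression only works in the scope where data is declared as an array; applied to the parameter inside sum_ints it would measure a pointer.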
I see one place where passing the (minimum) size can be useful:
void bar(int myArray[static 10]){...}
This tells the compiler that it should assume that the array passed to bar has at least 10 elements and can emit a warning if it is not the case:
int a[9];
bar(a);
produces:
warning: array argument is too small; contains 9 elements, callee requires at least 10 [-Warray-bounds]
bar(a);
^ ~
Legal - yes. Helpful? Not really. Look at this:
#define ARR_SIZE 4
int arr[ARR_SIZE], arr2[345];
f(arr); //works
f(arr2); //also works, compiler doesn't care about what size you specified in the []
This means that you still need to pass the size as a separate parameter:
void f2(int arr[ARR_SIZE], int size);
You can then call this function like this:
f2(arr, ARR_SIZE); //works
f2(arr2, 345); //works
So, don't use this syntax. The recommended prototypes are:
int f(int arr[], int size);
int f(int* arr, int size);
arr in this function is a pointer; for bounds checking, it is better to pass the size of the array as another parameter.
int fun(int *arr, int siz_arr) //function definition
{
//.....
}
fun(arr,sizeof(arr)/sizeof(arr[0])); //function call
In void f(int arr[ARR_SIZE]);, ARR_SIZE is practically as good as a comment, except that the compiler will also verify that the brackets are either empty or contain an integer expression that, if constant, is greater than zero.
Otherwise the declaration is 100% equivalent to void f(int *arr);.
C99 added void f(int arr[static ARR_SIZE]);, with which you can effectively require that the function only be passed pointers to the first element of arrays that have at least ARR_SIZE members (where ARR_SIZE must be a positive integer constant). E.g., void take_nonnull(int arr[static 1]); or void take_at_least2(int arr[static 2]);. Compilers may or may not issue diagnostics about subsequent calls to f that violate such a requirement (clang routinely does; gcc historically did not).
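As a small sketch of the static form, assuming a hypothetical function named sum_first3: the qualifier is a promise made by the caller, and the function body may rely on it without checking.

```c
/* Hypothetical function: every caller promises at least 3 readable elements. */
int sum_first3(const int v[static 3])
{
    /* No length check: the declaration states that v[0]..v[2] exist. */
    return v[0] + v[1] + v[2];
}
```

Passing a null pointer or a shorter array is undefined behavior; whether the compiler reports it depends on the compiler and on how much it can see at the call site.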
Yes, but ARR_SIZE must be defined at compile time, like this: #define ARR_SIZE 3
void f(int arr[ARR_SIZE]){}
I want to use quicksort in C, which has a function signature of void qsort(void *base, size_t nitems, size_t size, int (*compar)(const void *, const void*)), but the signature of my comparison function is int (*compar)(const void *, const void*, const int) with the third argument being constant during one call to quicksort.
As an illustration, assume that I want to sort an array of vectors according to different norms (L0, L1, L2 and L-infinity norm, for example). Which norm it actually is gets passed as a third argument to the comparison function, but remains constant during the call of qsort. Is it possible to do an assignment in a form like
//Function declaration for parametric comparison
int cmp3(int* a_vec, int* b_vec, int x);
// Somewhere in main
int (*cmp2)(int, int);
cmp2 = cmp3(int*, int*, 2);//2 could mean L2 norm
to be able to call something like
qsort(a, 100, sizeof(a), cmp2);
I know this does not work, but I hope it gives an idea what I want to accomplish. Also it is not possible to make different comparison functions and calls to qsort as the number of different ways of comparing is too big.
This is called partial function application, and you can only achieve something like this with wrappers in C:
int cmp2(const void *a, const void *b) {
    /* two-argument wrapper: binds the third argument of cmp3 to 2 */
    return cmp3((int *)a, (int *)b, 2);
}
If you're into partial function application or maybe mappings or pure functions, you may want to look into functional programming languages, like Haskell.
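When the number of modes is too large to write one wrapper per value, one common workaround is to keep the bound argument in a file-scope variable. The sketch below assumes a hypothetical cmp3 whose mode 0 compares plain values and whose other modes compare absolute values; the names current_mode and sort_ints_by_mode are also made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical three-argument comparator: mode 0 compares plain values,
   any other mode compares absolute values. */
static int cmp3(const int *a, const int *b, int mode)
{
    int x = *a, y = *b;
    if (mode != 0) {
        x = abs(x);
        y = abs(y);
    }
    return (x > y) - (x < y);
}

/* File-scope state that stands in for the "bound" third argument. */
static int current_mode;

/* Two-argument wrapper with the exact signature qsort requires. */
static int cmp2(const void *a, const void *b)
{
    return cmp3(a, b, current_mode);
}

/* Sorts n ints using the chosen mode; not reentrant and not thread-safe. */
void sort_ints_by_mode(int *v, size_t n, int mode)
{
    current_mode = mode;
    qsort(v, n, sizeof v[0], cmp2);
}
```

Some C libraries also offer a qsort_r or qsort_s variant that passes a context pointer to the comparator, but the signatures differ between platforms, so check your library's documentation before relying on one.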
The main problem is that cmp3 expects three arguments, whereas qsort calls its comparator with only two. Passing a function with the wrong signature is undefined behavior in C.
Even if the compiler accepts it (for example, after a cast), the third parameter will hold an indeterminate value rather than the value you are expecting.
You should do a "proxy" function as a previous comment said:
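int cmp2(const void *a, const void *b) {
    /* proxy with the two-argument signature; the third argument is fixed */
    return cmp3((int *)a, (int *)b, 2);
}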
When using a variable-length array as a function parameter
int sum(int n, int a[n]);
it is easy to understand that the first parameter (n) specifies the length of the second parameter (a). But I encountered another prototype used for VLA parameters
int sum(int n, int a[*]);
and it is really difficult to understand why * is used instead of n inside [].
The [*] syntax is intended to be used when declaring function prototypes. The key detail here is that in function prototypes you are not required to name your parameters, you just have to specify the type of each parameter.
In your example, if you leave the first parameter unnamed, then obviously you won't be able to use n in your second (array) parameter declaration. Yet, in many cases you have to tell the compiler that some parameter is a VLA. This is when the [*] syntax comes to the rescue.
In your case, if you omit the parameter names, the prototype might look as
int sum(int, int [*]);
However, it is important to note that in your specific example this syntax is legal, but it is not exactly necessary. Just like with non-VLA arrays, an int [n] parameter is still equivalent to int * parameter (even for non-constant n). This means that you can simply prototype your function as
int sum(int, int []);
or as
int sum(int, int *);
and the prototype will still serve its purpose, i.e. it will properly match the function definition. In other words, VLA properties of a parameter declared as a 1D array are completely inconsequential and the [*] feature is not really needed with such VLA arrays.
The [*] becomes important in situations when the "variable arrayness" of the type is not lost, as would be the case with 2D VLA (or a pointer to a VLA). E.g. a function defined as
int sum2d(int n, int m, int a[n][m])
{
...
}
might be prototyped as any of the following
int sum2d(int, int, int a[*][*]);
int sum2d(int n, int, int a[n][*]);
int sum2d(int, int m, int a[*][m]);
int sum2d(int n, int m, int a[n][m]);
All of the above prototypes properly match the function definition.
Of course, if you have the habit of always naming all parameters in function prototypes, then you'll never need this [*] syntax, since you will be able to use the last prototype in the above list.
P.S. Again, as is the case with all arrays in parameter declarations, the first [] is always inconsequential and always decays to a pointer, meaning that the following are also valid equivalent prototype declarations for the above sum2d
int sum2d(int, int, int a[][*]);
int sum2d(int, int, int (*a)[*]);
int sum2d(int n, int m, int (*a)[m]);
It is the second [] that really matters and has to be declared as "variable length".
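A compact sketch of how such a prototype and its definition fit together, assuming a compiler with VLA support (required in C99, optional from C11 onward):

```c
/* Prototype with unnamed size parameters: [*] marks the dimensions as variable. */
int sum2d(int, int, int [*][*]);

/* The definition must name the sizes and use them in the array declarator. */
int sum2d(int n, int m, int a[n][m])
{
    int total = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            total += a[i][j];
    return total;
}
```

A caller with int g[2][3] would write sum2d(2, 3, g); the compiler accepts the definition as compatible with the [*] prototype.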
When you put the star in an actual function definition, gcc gives this error: test.c:3: error: ‘[*]’ not allowed in other than function prototype scope. After some research, it turns out this is a way to declare a VLA in a function prototype, with the * in place of the variable name.
The issue at hand is that if you put a variable instead of the star for a VLA in a prototype with unnamed parameters, the compiler will tell you that it is undeclared, so the star is the mechanism C99 provides to get around that.
How to interpret this function definition? How should I pass arguments to it?
void matmul(float (*A)[N],int BlockX, int BlockY)
The first argument is a pointer to an array of N elements:
float a[N];
matmul(&a, 2, 3);
(Note that N has to be a compile-time constant in C89 and C++; in C89 it would essentially have to be #defined as some literal value. In C99 you have variable-length arrays.)
Since arrays decay to pointers, you can also feed it an array:
float b[M][N];
matmul(b, 2, 3);
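To make the two call forms concrete, here is a sketch with a hypothetical function total that has the same kind of first parameter; N is assumed to be 3 purely for illustration.

```c
#define N 3  /* assumed compile-time constant */

/* Hypothetical function: A points to the first of `rows` arrays of N floats. */
float total(float (*A)[N], int rows)
{
    float sum = 0.0f;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < N; j++)
            sum += A[i][j];
    return sum;
}

float demo_single_row(void)
{
    float a[N] = {1.0f, 2.0f, 3.0f};
    return total(&a, 1);   /* &a has type float (*)[N] */
}

float demo_matrix(void)
{
    float b[2][N] = {{1.0f, 2.0f, 3.0f}, {4.0f, 5.0f, 6.0f}};
    return total(b, 2);    /* b decays to float (*)[N] */
}
```

In both calls the function receives the same pointer type; only the number of rows behind it differs, which is why that count has to be passed separately.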
Another way of writing the same prototype would be
void matmul(float A[][N],int BlockX, int BlockY)
which better shows what this function is usually supposed to receive: a two-dimensional array, for which N is a compile-time integer constant (not a const variable!) if you only have C89, or any integer expression that can be evaluated at the point of the definition if you have C99.
The other dimension is not specified and you have to know or transmit it somehow.
It looks to me that this interface is an oldish one, since it seems to use int parameters to pass size information. The modern way to do this (and avoid 32/64 bit problems and stuff like that) would be to use size_t for such quantities.
If by chance the two parameters would correspond to the "real" matrix dimension, in modern C your definition should look like
void matmul(size_t m, size_t n, float A[m][n]) {
...
}
where it is important that m and n come before A, so that they are already known at that point.
I am trying to write a function that takes an array of variable size in C.
void sort(int s, int e, int arr[*]){
...
}
It says that for variable length arrays, it needs to be bounded in the function declaration. What does that mean? I am using xcode 4.0, with the LLVM compiler 2.0.
Thanks for the help.
As I see that no one answers the real question, here I give mine.
In C99 you have variable length arrays (VLAs), which are declared with a length that is evaluated at run time, and not only at compile time as in previous versions of C. But passing arrays to functions is a bit tricky.
A one dimensional array is always just passed as a pointer so
void sort(size_t n, int arr[n]) {
}
is equivalent to
void sort(size_t n, int *arr){
}
Higher dimensions, however, are passed through to the function, so
void toto(size_t n, size_t m, int arr[n][m]){
}
is equivalent to
void toto(size_t n, size_t m, int (*arr)[m]){
}
With such a definition, inside the function you can access the elements with expressions such as arr[i][j], and the compiler knows how to compute the correct element.
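What "compute the correct element" means can be sketched by putting the VLA lookup next to the same lookup written by hand on a flat pointer. Both helper names below are hypothetical.

```c
#include <stddef.h>

/* Reads element (i, j) through the VLA parameter; the compiler uses m as the row length. */
int get2d(size_t n, size_t m, int arr[n][m], size_t i, size_t j)
{
    return arr[i][j];
}

/* The same lookup with the index arithmetic spelled out. */
int get_flat(size_t m, const int *flat, size_t i, size_t j)
{
    return flat[i * m + j];
}
```

Only m takes part in the address computation, which is why the first dimension can be dropped from the parameter type while the second cannot.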
Now comes the syntax that you discovered, which is only useful for prototypes, that is, places where you forward-declare the interface of the function:
void toto(size_t, size_t, int arr[*][*]);
Here you may replace the array dimensions with * as placeholders. But this is only useful when you don't have the names of the dimensions at hand, and it is much clearer to use exactly the same version as for the definition:
void toto(size_t n, size_t m, int arr[n][m]);
In general, for consistent use it is just important that the dimensions come first in the parameter list. Otherwise they would not be known when the compiler parses the declaration of arr.
If you're not using the C99 variable length arrays (it appears you are, so see below), the usual solution is to pass in a pointer to the first element, along with any indexes you want to use for accessing the elements.
Here's a piece of code that prints out a range of an array, similar to what you're trying to do with your sort.
#include <stdio.h>
static void fn (int *arr, size_t start, size_t end) {
size_t idx;
for (idx = start; idx <= end; idx++) {
printf ("%d ", arr[idx]);
}
putchar ('\n');
}
int main (void) {
int my_array[] = {9, 8, 7, 6, 5, 4, 3, 2, 1, 0};
fn (my_array, 4, 6);
return 0;
}
This outputs elements four through six inclusive (zero-based), giving:
5 4 3
A couple of points to note.
Using my_array in that function call to fn automatically "decays" the array into a pointer to its first element. This actually happens under most (not all) circumstances when you use arrays, so you don't have to explicitly state &(my_array[0]).
C already has a very good sort function built in to the standard library, called qsort. In many cases, that's what you should be using (unless either you have a specific algorithm you want to use for sorting, or you're doing a homework/self-education exercise).
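For reference, a minimal sketch of sorting an int array with qsort; the names compare_ints and sort_ints are made up for this example.

```c
#include <stdlib.h>

/* Comparator with the signature qsort expects. */
static int compare_ints(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    return (a > b) - (a < b);   /* avoids the overflow that a - b could cause */
}

/* Sorts the first n elements of v in ascending order. */
void sort_ints(int *v, size_t n)
{
    qsort(v, n, sizeof v[0], compare_ints);
}
```

To sort only a sub-range, as in the start/end interface above, pass a pointer to the first element of the range and the number of elements in it.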
If you are using real VLAs, you should be aware that the [*] construct is only valid in the function prototype, not in an actual definition of the function.
So, while:
void xyzzy(int, int[*]);
is valid, the following is not:
void xyzzy(int sz, int plugh[*]) { doSomething(); }
That's because, while you don't need the size parameter in the prototype, you do very much need it in the definition. And, since you have it, you should just use it:
void xyzzy(int sz, int plugh[sz]) { doSomething(); }
The gcc compiler actually has a reasonably clear error message for this, far better than the "needs to be bounded in the function declaration" one you saw:
error: ‘[*]’ not allowed in other than function prototype scope
What you want to do is make your argument an int * and pass in the length of the array (which the caller presumably knows, but this routine does not) as a separate argument. You can pass an array as such an argument.
The usage of * inside of array brackets for variable-length arrays is limited to prototypes, and serves merely as a placeholder. When the function is later defined, the array's size should be stored in a variable available at either file scope or as one of the parameters. Here's a simple example:
void foo(int, int[*]);
/* asterisk is placeholder */
void foo(int size, int array[size]) {
/* note size of array is specified now */
}
There are the following declarations:
void qsort(void *lineptr[], int left, int right, int (*comp)(void *, void *));
int numcmp(char *, char *);
int strcmp(char *s, char *t);
Then, somewhere in the program there is the following call:
qsort((void**) lineptr, 0, nlines-1,
(int (*)(void*,void*))(numeric ? numcmp : strcmp));
(Ignore the first three arguments and numeric).
I ask what is this:
(int (*)(void*,void*))(numeric ? numcmp : strcmp)
I understand that qsort is expecting a "pointer to function that gets two void pointers and returns an int" as its 4th argument, but how does what's written above satisfy that?
It seems to me like some sort of cast because it is made of two parentheses, but that would be a very odd cast. Because it takes a function and makes this function a "pointer to function that gets two void pointers and returns an int". Which is meaningless.
(I followed here the rule that a type in parentheses before a variable converts the variable to that type).
So I think I just get it wrong, maybe someone can tell me how to read this, what's the order?
What's happening here is indeed a cast. Let's ignore the ternary for a second and pretend that numcmp is always used. For the purpose of this question, function names act as function pointers in C. So if you look at the type of numcmp, it is actually
(int (*)(char*,char*))
For it to be used with this qsort, it needs to have void* parameters. Because the two function types have parameters and return types of the same size, it is possible in practice to substitute one for the other. All that's needed is a cast to make the compiler happy.
(int (*)(void*,void*))(numcmp )
You've missed the trick here - the portion
(numeric ? numcmp : strcmp)
is using the ternary operator to choose which function is being called inside of qsort. If the data is numeric, it uses numcmp. If not, it uses strcmp. A more readable implementation would look like this:
int (*comparison_function)(void*,void*) =
(int (*)(void*,void*))(numeric ? numcmp : strcmp);
qsort((void**) lineptr, 0, nlines-1, comparison_function);
As others have pointed out, for
(int (*)(void*,void*))(numeric ? numcmp : strcmp)
then the following is a type cast
(int (*)(void*,void*))
and the expression is
(numeric ? numcmp : strcmp)
C declarations can be quite difficult to read, but it is possible to learn. The method is to start at the inner part and then go right one step, then left one step, continuing right, left, right, left, etc outwards until finished. You do not cross outside a parenthesis before everything inside has been evaluated. For instance for the type cast above, (*) indicates this is a pointer. Pointer was the only thing inside the parenthesis so then we evaluate to the right side outside it. (void*,void*) indicates that is a pointer to a function with two pointer arguments. Finally int indicates the return type of the function. The outer parenthesis makes this a type cast.
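A short sketch applying the same reading method to two declarations; add, sub, ops, p_ops and apply are hypothetical names chosen for the example.

```c
static int add(int a, int b) { return a + b; }
static int sub(int a, int b) { return a - b; }

/* Start at ops, go right: array of 2; go left: of pointers;
   leave the parentheses, go right: to functions taking (int, int);
   go left: returning int. */
static int (*ops[2])(int, int) = { add, sub };

/* Start at p_ops, go left inside the parentheses: pointer;
   go right: to an array of 2; go left: of pointers;
   go right: to functions taking (int, int); go left: returning int. */
static int (*(*p_ops)[2])(int, int) = &ops;

/* Picks one function through the pointer-to-array and calls it. */
int apply(int which, int a, int b)
{
    return (*p_ops)[which](a, b);
}
```

The expression in apply mirrors the declaration: dereference p_ops, index the array, then call the resulting function pointer.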
Update: two more detailed articles: The Clockwise/Spiral Rule and Reading C Declarations: A Guide for the Mystified.
However, the good news is that although the above is extremely useful to know, there is an extremely simple way to cheat: the cdecl program can convert from C to English description and vice versa:
cdecl> explain (int (*)(void*,void*))
cast unknown_name into pointer to function (pointer to void, pointer to void) returning int
cdecl> declare my_var as array 5 of pointer to int
int *my_var[5]
cdecl>
Exercise: What kind of variable is i?
int *(*(*i)[])(int *)
Answer in rot13 in case you do not have cdecl installed on your machine (but you really should!):
pqrpy> rkcynva vag *(*(*v)[])(vag *)
qrpyner v nf cbvagre gb neenl bs cbvagre gb shapgvba (cbvagre gb vag) ergheavat cbvagre gb vag
pqrpy>
You can do it without the function pointer cast. Here's how. In my experience, in most places, if you are using a cast, you are doing it wrong.
Note that the standard definition of qsort() includes const:
void qsort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
Note that the string comparator is given two 'char **' values, not 'char *' values.
I write my comparators so that casts are unnecessary in the calling code:
#include <stdlib.h> /* qsort() */
#include <string.h> /* strcmp() */
int num_cmp(const void *v1, const void *v2)
{
int i1 = *(const int *)v1;
int i2 = *(const int *)v2;
if (i1 < i2)
return -1;
else if (i1 > i2)
return +1;
else
return 0;
}
int str_cmp(const void *v1, const void *v2)
{
const char *s1 = *(const char * const *)v1;
const char *s2 = *(const char * const *)v2;
return(strcmp(s1, s2));
}
Forcing people to write casts in the code using your functions is ugly. Don't.
The two functions I wrote match the function prototype required by the standard qsort(). The name of a function when not followed by parentheses is equivalent to a pointer to the function.
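With comparators of that shape, the selection from the original question needs no cast at all. The sketch below repeats the two comparators so that it stands alone; sort_lines is a hypothetical wrapper, not part of the original program.

```c
#include <stdlib.h>
#include <string.h>

static int num_cmp(const void *v1, const void *v2)
{
    int i1 = *(const int *)v1;
    int i2 = *(const int *)v2;
    return (i1 > i2) - (i1 < i2);
}

static int str_cmp(const void *v1, const void *v2)
{
    const char *s1 = *(const char * const *)v1;
    const char *s2 = *(const char * const *)v2;
    return strcmp(s1, s2);
}

/* Hypothetical wrapper: both comparators already have the type qsort wants,
   so the conditional expression needs no cast. */
void sort_lines(void *base, size_t n, size_t size, int numeric)
{
    int (*cmp)(const void *, const void *) = numeric ? num_cmp : str_cmp;
    qsort(base, n, size, cmp);
}
```

The caller decides what the elements are: an array of int with numeric set, or an array of char pointers without it.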
You will find in older code, or code written by those who were brought up on older compilers, that pointers to functions are used using the notation:
result = (*pointer_to_function)(arg1, arg2, ...);
In modern style, that is written:
result = pointer_to_function(arg1, arg2, ...);
Personally, I find the explicit dereference clearer, but not everyone agrees.
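Both notations call the same function, as this small sketch with a hypothetical function twice shows:

```c
static int twice(int x) { return 2 * x; }

/* Calls twice through a pointer using both notations and checks they agree. */
int call_both_ways(int x)
{
    int (*pointer_to_function)(int) = twice;
    int older = (*pointer_to_function)(x);  /* explicit dereference */
    int newer = pointer_to_function(x);     /* direct call through the pointer */
    return older == newer ? newer : -1;
}
```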
Whoever wrote that code snippet was trying to be too clever. In his mind, he probably thinks he is being a good programmer by making a clever "one-liner". In reality, he is making code that is less readable and is obnoxious to work with over the long term and should be rewritten in a more obvious form similar to Harper Shelby's code.
Remember the adage from Brian Kernighan:
Debugging is twice as hard as writing
the code in the first place.
Therefore, if you write the code as
cleverly as possible, you are, by
definition, not smart enough to debug
it.
I do lots of performance critical coding with hard real time deadlines... and I have still not seen a place where a dense one-liner is appropriate.
I have even messed around with compiling and checking the asm to see if the one-liner has a better compiled asm implementation but have never found the one-liner to be worth it.
I would probably read it like this:
typedef int (*PFNCMP)(void *, void *);
PFNCMP comparison_function;
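if (numeric)
{
    comparison_function = (PFNCMP)numcmp;
}
else
{
    comparison_function = (PFNCMP)strcmp;
}
qsort((void**) lineptr, 0, nlines-1, comparison_function);
The example in the question has an explicit cast; the assignments here need the same cast because numcmp and strcmp take char * parameters.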
Your logic is correct, I think. It is indeed casting to "pointer to function that gets two void pointers and returns an int", which is the type required by the function signature.
Both numcmp and strcmp are functions that take two char* parameters and return an int. The qsort routine expects a pointer to a function that takes two void* parameters and returns an int. Hence the cast. This works in practice on common platforms, since void* acts as a generic pointer with the same representation as char*, although the C standard does not guarantee that calling a function through a pointer of a different type is well defined. Now, on to reading the declaration. Let's take your strcmp's declaration:
int strcmp(char *, char *);
The compiler reads it as strcmp is actually:
int (strcmp)(char *, char *)
a function (decaying to a pointer to a function in most cases) which takes two char * arguments. The type of the pointer strcmp is therefore:
int (*)(char *, char *)
Hence, when you need to cast another function to be compatible with strcmp, you'd use the above as the type to cast to.
Similarly, qsort's comparator parameter takes two void *s, hence the odd cast!