Why do GCC and Clang produce different output with variable length array? - c

Why do GCC and Clang produce different output with this conforming C code:
int (puts) (); int (main) (main, puts) int main;
char *puts[(&puts) (&main["\0April 1"])]; <%%>
Neither compiler produces any warning or error even with -Wall -std=c18 -pedantic, but the program produces no output when built with GCC but prints the current date when built with Clang.

Why do GCC and Clang produce different output with this conforming C
code:
int (puts) (); int (main) (main, puts) int main;
char *puts[(&puts) (&main["\0April 1"])]; <%%>
In the first place, it is conforming code, though it does make use of a variable-length array, which is an optional language feature in C11 and C17. Some of the obfuscations are
use of the obscure digraphs <% and %>, which mean the same thing as { and }, respectively.
parenthesizing the function identifiers in function declarations
a forward declaration of function puts that is not a prototype
a K&R-style definition of function main
with a VLA parameter
whose dimension expression contains a function call
and a reference to another parameter
use of unconventional identifiers for the parameters to function main()
use of identifiers (puts and main) in declarations of an object and a function, respectively, with the same identifier
use of the identifier main for something more than the program's entry-point function
inversion of the conventional order of the operands of the indexing operator ([])
plus, indexing a string literal
calling a function via an explicit function pointer constant expression
A string literal with an explicit null character within
Unconventional placement (and omission) of line breaks
A less obfuscated equivalent would be
int puts();
int main(
int argc,
char *argv[ puts("\0April 1" + argc) ]
) {
}
But the central question about the difference in behavior between the version compiled with GCC and the one built with Clang comes down to whether the expression for the size of the VLA function parameter is evaluated at runtime.
The language spec says that when a function parameter is declared with array type, its type is "adjusted" to the corresponding pointer type. That applies equally to complete, incomplete, and variable-length array types, but the spec does not explicitly say that the expression(s) for the dimension(s) are not evaluated. It does specify that expressions go unevaluated in certain other cases, and it even makes an exception to such a rule in the case of sizeof expressions involving VLAs, so the omission in this case could be interpreted as meaningful.
That makes a difference only for parameters of VLA type, because only for those can evaluation of the dimension expression(s) produce side effects on the machine state, including, but not limited to, observable program behavior.
GCC does not evaluate the VLA parameter's size expression at runtime, and I am inclined to take this as conforming to the intent of the standard. As a result, the GCC-compiled program does nothing but exit with status 0.
Clang does evaluate the VLA parameter's size expression at runtime. Although I disfavor this interpretation of the spec, I cannot rule it out. When it does evaluate the size expression, it uses the passed value of the first parameter. When the program is run without arguments, then the first parameter has value 1, with the result that the standard library's puts function is called with a pointer to the 'A' in "\0April 1".

int (puts) ();
int (main) (main, puts)
int main;
char *puts[(&puts) (&main["\0April 1"])];
{
}
Somebody's got a compiler bug; I'm just not sure who anymore. I don't understand why any compiler would emit code to evaluate the size parameter of a VLA as an argument.
The clang output is rather bizarre. For it to work, it would have had to find main in the function's scope but puts in the global scope despite having already encountered the declaration for puts. Normally, you can access a variable in its own declaration.
If somebody did this in production code my answer would be rather: "Stop using K&R function definitions."

Related

Why does a function that returns int not need a prototype?

I tried this program on DevC++ that would call two different types of function, one returning int and one returning char. I'm confused why the int function doesn't need a prototype, while the char one and any other type of function does.
#include <stdio.h>
//int function1();
char function2() ;
int main (){
int X = function1() ;
char Y = function2() ;
printf("%d", X) ;
printf ("%c", Y) ;
return 0 ;
}
int function1(){
return 100 ;
}
char function2(){
return 'B' ;
}
The output:
100B
If I remove the prototype for the char function, it results in:
[Error] conflicting types for 'function2'
[Note] previous implicit declaration of 'function2' was here
In the old days of C, any function that was not declared explicitly was assumed to return int when you called it.
If the compiler then finds the function implementation and sees an int return type, everything is fine.
If the function returns anything other than int, you get the error message you saw with the second function.
This implicit int declaration was removed from the standard with C99. Now you should at least get a warning from your compiler when you use a function without a prior declaration.
If you did not get any diagnostic message for the first function, you should turn up the warning level in your compiler or switch to at least C99 mode instead of ancient versions.
Edit:
Empty parameter lists in function definitions are an obsolescent feature as well.
You should not use them.
Always provide a prototype for every function, with return type and parameter list.
If a function is used before it is declared, the usage becomes an implicit declaration of the function. When a function f is implicitly declared, the declaration given to it is int f(), i.e. a function which accepts an unspecified number of arguments and returns an int.
This implicit declaration matches the actual definition of function1 but not function2. So calling function1 this way gives no error, but attempting to call function2 this way results in the implicit declaration not matching the actual definition, giving an error.
This behavior goes back to pre-standardized versions of C where all objects (and a function's return type) had a default type of int if not declared. This was still present in the C89 standard but removed in the C99 standard, although some compilers such as gcc still support this obsolescent usage as an extension.
It's just an ancient relic from when C was first designed. It was actually removed as early as C99, but many compilers still support this type of declaration. But it's not recommended to use it.
I'm not sure if there was any real rationale behind it, but C was heavily inspired by the language B. And in B you did not have to specify the return type for functions. That actually made perfect sense, because there was only one type, word.
In the same way you did not have to specify the type of variables either. You only specified if it had automatic or static storage. And that's where the completely useless keyword auto in C comes from. It does not mean the same as in C++. It just means "not static".

Strange integers in c language

I have code:
#include <stdio.h>
int main() {
int a = sum(1, 3);
return 0;
}
int sum(int a, int b, int c) {
printf("%d\n", c);
return a + b + c;
}
I know that I have to declare functions first, and only after that can I call them, but I want to understand what happens.
(Compiled by gcc v6.3.0)
I ignored implicit declaration of function warning and ran program several times, output was this:
1839551928
-2135227064
41523672
// And more strange numbers
I have 2 questions:
1) What do these numbers mean?
2) How does function main know how to call function sum without its declaration?
I'll assume that the code in your question is the code you're actually compiling and running:
int main() {
int a = sum(1, 3);
return 0;
}
int sum(int a, int b, int c) {
printf("%d\n", c);
return a + b + c;
}
The call to printf is invalid, since you don't have the required #include <stdio.h>. But that's not what you're asking about, so we'll ignore it. The question was edited to add the include directive.
In standard C, since the 1999 standard, calling a function (sum in this case) with no visible declaration is a constraint violation. That means that a diagnostic is required (but a conforming compiler can still successfully compile the program if it chooses to). Along with syntax errors, constraint violations are the closest C comes to saying that something is illegal. (Except for #error directives, which must cause a translation unit to be rejected.)
Prior to C99, C had an "implicit int" rule, which meant that if you call a function with no visible declaration an implicit declaration would be created. That declaration would be for a function with a return type of int, and with parameters of the (promoted) types of the arguments you passed. Your call sum(1, 3) would create an implicit declaration int sum(int, int), and generate a call as if the function were defined that way.
Since it isn't defined that way, the behavior is undefined. (Most likely the value of one of the parameters, perhaps the third, will be taken from some arbitrary register or memory location, but the standard says nothing about what the call will actually do.)
C99 (the 1999 edition of the ISO C standard) dropped the implicit int rule. If you compile your code with a conforming C99 or later compiler, the compiler is required to diagnose an error for the sum(1, 3) call. Many compilers, for backward compatibility with old code, will print a non-fatal warning and generate code that assumes the definition matches the implicit declaration. And many compilers are non-conforming by default, and might not even issue a warning. (BTW, if your compiler did print an error or warning message, it is tremendously helpful if you include it in your question.)
Your program is buggy. A conforming C compiler must at least warn you about it, and possibly reject it. If you run it in spite of the warning, the behavior is undefined.
This is undefined behavior per 6.5.2.2 Function calls, paragraph 9 of the C standard:
If the function is defined with a type that is not compatible with the type (of the expression) pointed to by the expression that denotes the called function, the behavior is undefined.
Functions without prototypes are allowed under 6.5.2.2 Function calls, paragraph 6:
If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. If the number of arguments does not equal the number of parameters, the behavior is undefined. ...
Note again: if the arguments passed don't match the parameters expected, the behavior is undefined.
In older versions of C, if you don't declare a function before using it, the compiler will assume certain default argument types for the function. This is based on early versions of C with a weaker type system, and retained only for backwards compatibility. It should not be used generally.
I'll skip the details here, but in your case it assumes sum takes 2 ints and returns an int.
Calling a function with the wrong number of parameters, as you are doing here, is undefined behaviour. When you call sum, the compiler thinks that it takes two integers, so it passes two integers to it. When the function is actually called, however, it tries to read one more integer, c. Since you only passed 2 ints, the space for c contains random crap, which is what you're seeing when you print out. Note that it doesn't have to do this, since this is undefined behaviour, it could do anything. It could have given values for b & c, for example.
Obviously this behaviour is confusing, and you should not rely on undefined behaviour, so you'd be better off compiling with stricter compiler settings so this program wouldn't compile. (The proper version would declare sum above main.)
1) Since you haven't provided a value for parameter "c" when calling function "sum", its value inside the function is undefined. If you had declared the function before main, your program wouldn't even compile, and you would get an error such as "error: too few arguments to function call".
2) Normally, it doesn't. A function has to be declared before the call so the compiler can check the function signature. In this case the compiler fell back on an implicit declaration, which is why the call compiled with only a warning.
I'm not 100% sure that C works exactly like this, but your function calls work like a stack in memory. When you call a function, your arguments are put on that stack, so inside the function you can access them by stepping back a number of positions in memory. So:
You call sum(1, 3).
The stack will have the 1 and, on top of it, the 3.
When executing, the function reads the last position of memory for the 1st argument (recovering the 1), then the position before that for the 2nd argument (recovering the 3); however, there is a 3rd parameter, so it accesses the position before that as well.
That position holds garbage, since you never put anything there, and it is different every time you run the program.
Hope that was clear enough. Remember that the stack is inverted, so every time you add something it goes to the previous memory position, not the next.

What are the semantics of function pointers with empty parentheses in each C standard?

Answers to this and this question say that function pointers of the form return-type (*pointer)() are pointers to a function which takes any number of arguments, though the latter says they obsolesced in C11.
On an i386 system with GCC, “extra” arguments passed in a call to an empty-parentheses-type’d function pointer are ignored, because of how stack frames work; e.g.,
/* test.c */
#include <stdio.h>
int foo(int arg) { return arg; }
int main(void)
{
int (*fp)() = foo;
printf("%d\n", fp(267239151, 42, (struct { int x, y, z; }){ 1, 2, 3 }));
return 0;
}
$ gcc -o test test.c && ./test
267239151
$
In which C standards are empty-parentheses’d function pointers allowed? and wherever so, what are they specified to mean?
N1570 6.11.6:
The use of function declarators with empty parentheses (not
prototype-format parameter type declarators) is an obsolescent
feature.
This same wording appears in the 1990, 1999, and 2011 editions of the ISO C standard. There has been no change. The word obsolescent says that the feature may be removed in a future edition of the Standard, but so far the committee has not done so. (Function pointer declarations are just one of several contexts where function declarators can appear.)
The Introduction section of the C standard explains what obsolescent means:
Certain features are obsolescent, which means that they may be
considered for withdrawal in future revisions of this International
Standard. They are retained because of their widespread use, but their
use in new implementations (for implementation features) or new
programs (for language [6.11] or library features [7.31]) is
discouraged.
A call to a function declared with an old-style declarator is still required to pass the correct number and type(s) of arguments (after promotion) as defined by the function's actual definition. A call with incorrect arguments has undefined behavior, which means that the compiler is not required to diagnose the error; the burden is entirely on the programmer.
This is why prototypes were introduced, so that the compiler could check correctness of arguments.
On an i386 system with GCC, “extra” arguments passed in a call to an
empty-parentheses-type’d function pointer are ignored, because of how
stack frames work ...
Yes, that's well within the bounds of undefined behavior. The worst symptom of undefined behavior is having the program work exactly as you expect it to. It means that you have a bug that hasn't exhibited itself yet, and it will be difficult to track it down.
You should not depend on that unless you have a very good reason to do so.
If you change
int (*fp)() = foo;
to
int (*fp)(int) = foo;
the compiler will diagnose the incorrect call.
Any function declarator can have empty parentheses (unless it's a function declaration where there is already a non-void prototype in scope). This isn't deprecated, although it is "obsolescent".
In a function pointer, it means the pointer can point to a function with any argument list.
Note that when actually calling a function through the pointer, the arguments must be of correct type and number according to the function definition, otherwise the behaviour is undefined.
Although C allows you to declare a function (or pointer to function) with an empty parameter list, that does not change the fact that the function must be defined with a precise number of parameters, each with a precise type. [Note 1]
If the parameter declaration is not visible at a call site, the compiler will obviously not be able to perform appropriate conversions to the provided arguments. It is, therefore, the programmer's responsibility to ensure that there are a correct number of arguments, all of them with the correct type. For some parameter types, this will not be possible because the compiler will apply the default argument promotions. [Note 2]
Calling a function with an incorrect number of arguments or with an argument whose type is not compatible with the corresponding parameter type is Undefined Behaviour.
The fact that the visible declaration has an empty parameter list does not change the way the function is called. It just puts more burden on the programmer to ensure that the call is well-defined.
This is equally true of pointer to function declarations.
In short, the sample code in the question is Undefined Behaviour. It happens to "work" on certain platforms, but it is neither portable nor is it guaranteed to keep working if you recompile. So the only possible advice is "Don't do that."
If you want to create a function which can accept extra arguments, use a varargs declaration. (See open for an example.) But be aware of the limitations: the called function must have some way of knowing the precise number and types of the provided arguments.
Notes
[Note 1] With the exception of varargs functions, whose prototypes end with .... But a declaration with an empty parameter list cannot be used to call a varargs function.
[Note 2] Integer types narrower than int are converted to int and float values to double.

Why does gcc allow arguments to be passed to a function defined to be with no arguments?

I don't get why this code compiles.
#include <stdio.h>
void foo() {
printf("Hello\n");
}
int main() {
const char *str = "bar";
foo(str);
return 0;
}
gcc doesn't even throw a warning that I am passing too many arguments to foo(). Is this expected behavior?
In C, a function declared with an empty parameter list can be called with any number of arguments as far as the compiler is concerned; the arguments are subject to the default argument promotions. It is the responsibility of the caller to ensure that the arguments supplied are appropriate for the definition of the function.
To declare a function taking zero arguments, you need to write void foo(void);.
This is for historic reasons; originally, C functions didn't have prototypes, as C evolved from B, a typeless language. When prototypes were added, the original typeless declarations were left in the language for backwards compatibility.
To get gcc to warn about empty parameter lists, use -Wstrict-prototypes:
Warn if a function is declared or defined without specifying the argument types. (An old-style function definition is permitted without a warning if preceded by a declaration which specifies the argument types.)
For legacy reasons, declaring a function with () for a parameter list essentially means “figure out the parameters when the function is called”. To specify that a function has no parameters, use (void).
Edit: I feel like I am racking up reputation in this problem for being old. Just so you kids know what programming used to be like, here is my first program. (Not C; it shows you what we had to work with before that.)
void foo() {
printf("Hello\n");
}
foo(str);
In C, this code does not violate a constraint (it would if foo were defined in prototype form, as void foo(void) {/*...*/}), and since there is no constraint violation, the compiler is not required to issue a diagnostic.
But this program has undefined behavior according to the following C rules:
From:
(C99, 6.9.1p7) "If the declarator includes a parameter type list, the list also specifies the types of all the parameters; such a declarator also serves as a function prototype for later calls to the same function in the same translation unit. If the declarator includes an identifier list, the types of the parameters shall be declared in a following declaration list."
the foo function does not provide a prototype.
From:
(C99, 6.5.2.2p6) "If the expression that denotes the called function has a type that does not include a prototype [...] If the number of arguments does not equal the number of parameters, the behavior is undefined."
the foo(str) function call is undefined behavior.
C does not mandate the implementation to issue a diagnostic for a program that invokes undefined behavior but your program is still an erroneous program.
Both the C99 Standard (6.7.5.3) and the C11 Standard (6.7.6.3) state:
An identifier list declares only the identifiers of the parameters
of the function. An empty list in a function declarator that is part
of a definition of that function specifies that the function has no
parameters. The empty list in a function declarator that is not part
of a definition of that function specifies that no information about
the number or types of the parameters is supplied.
Since the declaration of foo is part of a definition, the declaration
specifies that foo takes 0 arguments, so the call foo(str) is at
least morally wrong. But as described below, there are different
degrees of "wrong" in C, and compilers may differ in how they deal
with certain kinds of "wrong".
To take a slightly simpler example, consider the following program:
int f() { return 9; }
int main() {
return f(1);
}
If I compile the above using Clang:
tmp$ cc tmp3.c
tmp3.c:4:13: warning: too many arguments in call to 'f'
return f(1);
~ ^
1 warning generated.
If I compile with gcc 4.8 I don't get any errors or warnings, even with -Wall. A previous answer suggested using -Wstrict-prototypes, which correctly reports that the definition of f is not in prototype form, but this is really not the point. The C Standard(s) allow a function definition in a non-prototype form such as the one above and the Standards clearly state that this definition specifies that the function takes 0 arguments.
Now there is a constraint (C11 Sec. 6.5.2.2):
If the expression that denotes the called function has a type that includes a prototype, the number of arguments shall agree with the number of parameters.
However, this constraint does not apply in this case, since the type of the function does not include a prototype. But here is a subsequent statement in the semantics section (not a "constraint"), that does apply:
If the expression that denotes the called function has a type that does not include a prototype
... If the number of arguments does not equal the number of parameters, the behavior is undefined.
Hence the function call does result in undefined behavior (i.e., the program is not "strictly conforming"). However, the Standard only requires an implementation to report a diagnostic message when an actual constraint is violated, and in this case, there is no violation of a constraint. Hence gcc is not required to report an error or warning in order to be a "conforming implementation".
So I think the answer to the question, why does gcc allow it?, is that gcc is not required to report anything, since this is not a constraint violation. Moreover gcc does not claim to report every kind of undefined behavior, even with -Wall or -Wpedantic. It is undefined behavior, which means the implementation can choose how to deal with it, and gcc has chosen to compile it without warnings (and apparently it just ignores the argument).

Why is this legal in C?

I am writing a toy C compiler for a compiler/language course at my university.
I'm trying to flesh out the semantics for symbol resolution in C, and came up with this test case which I tried against regular compilers clang & gcc.
void foo() { }
int main() { foo(5); } // foo has extraneous arguments
Most compilers only seem to warn about extraneous arguments.
Question: What is the fundamental reasoning behind this?
For my symbol table generation/resolution phase, I was considering a function to be a symbol with a return type, and several parametrized arguments (based on the grammar) each with a respective type.
Thanks.
A function with no listed arguments in the prototype is deemed to have an indeterminate number, not zero.
If you really want zero arguments, it should be:
void foo (void);
The empty-list variant is a holdover from ancient C, even before ANSI got their hands on it, where you had things like:
add_one(val)
int val;
{
return val + 1;
}
(with int being the default return type and parameter types specified outside the declarator).
If you're doing a toy compiler and you're not worried about compliance with every tiny little piece of C99, I'd just toss that option out and require a parameter list of some sort.
It'll make your life substantially easier, and I question the need for people to use that "feature" anyway.
It's for backward compatibility with ancient C compilers. Back before the earth cooled, all C function declarations looked roughly like:
int foo();
long bar();
and so on. This told the compiler that the name referred to a function, but did not specify anything about the number or types of parameters. Probably the single biggest change in the original (1989) C standard was adding "function prototypes", which allowed the number and type(s) of parameters to be declared, so the compiler could check what you passed when you called a function. To maintain compatibility for existing code, they decided that an empty parameter list would retain its existing meaning, and if you wanted to declare a function that took no parameters, you'd have to add void in place of the parameter list: int f(void);.
Note that in C++ the same is not true -- C++ eliminates the old style function declarations, and requires that the number and type(s) of all parameters be specified [1]. If you declare the function without parameters, that means it doesn't take any parameters, and the compiler will complain if you try to pass any (unless you've also overloaded the function so there's another function with the same name that can take parameters).
[1] Though you can still use an ellipsis for a function that takes a variable parameter list -- but when/if you do so, you can only pass POD types as parameters.
You haven't provided a prototype for the foo function, so the compiler can't enforce it.
If you wrote:
void foo(void) {}
then you would be providing a prototype of a function that takes no parameters.
gcc's -Wstrict-prototypes will catch this. For an error, use -Werror=strict-prototypes. The standard never specifies whether something should be a warning or an error.
Why is this legal in C?
First, just to clarify: the C Standard does not use the word legal.
In the C terminology, this program is not strictly conforming:
void foo() { }
int main() { foo(5); } // foo has extraneous arguments
When compiling this program, no diagnostic is required for the function call foo(5): there is no constraint violation. But calling the function foo with an argument invokes undefined behavior. Like any program that invokes undefined behavior, it is not strictly conforming, and a compiler has the right to refuse to translate the program.
In the C Standard, a function declaration with an empty list of parameters means the function has an unspecified number of parameters. But a function definition with an empty list of parameters means the function has no parameters.
Here is the relevant paragraph in the C Standard (all emphasis mine):
(C99, 6.7.5.3p14) "An identifier list declares only the identifiers of the parameters of the function. An empty list in a function declarator that is part of a definition of that function specifies that the function has no parameters."
The paragraph of the C Standard that says the foo(5) call is undefined behavior is this one:
(C99, 6.5.2.2p6) "If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument
promotions. If the number of arguments does not equal the number of parameters, the
behavior is undefined."
And from (C99, 6.9.1p7), we know the definition of foo does not provide a prototype.
(C99, 6.9.1p7) "If the declarator includes a parameter type list, the list also specifies the types of all the parameters; such a declarator also serves as a function prototype for later calls to the same function in the same translation unit. If the declarator includes an identifier list, the types of the parameters shall be declared in a following declaration list."
See the Committee answer to the Defect Report #317 for an authoritative answer on the subject:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_317.htm
