Does the C language guarantee that pointers to differently-named standard functions must compare not-equal?
Per 6.5.9 Equality Operators, ¶6,
Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, ...
I seem to recall seeing an interpretation claiming that aliases (multiple identifiers for the "same function") are permissible for standard functions, the canonical candidates for such treatment being getc==fgetc and putc==fputc; however, I don't know where I might have seen it, and I'm skeptical of the concept.
Is there any official interpretation or well-accepted argument for or against this possibility?
No, I don't believe there is any such guarantee.
I believe this whole discussion originates from the part of the standard which allows a function to also be defined as a macro with the same name.
From C17 7.1.4:
Any function declared in a header may be additionally implemented as a function-like macro defined in the header, so if a library function is declared explicitly when its header is included, one of the techniques shown below can be used to ensure the declaration is not affected by such a macro. Any macro definition of a function can be suppressed locally by enclosing the name of the function in parentheses, because the name is then not followed by the left parenthesis that indicates expansion of a macro function name. For the same syntactic reason, it is permitted to take the address of a library function even if it is also defined as a macro. 189)
189) This means that an implementation shall provide an actual function for each library function, even if it also provides a macro for that function.
The text goes on describing how users may #undef the macro name if they want to be guaranteed that they get an actual function.
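For instance, here is a minimal sketch of both techniques the quoted text alludes to:

#include <stdio.h>

int read_one(FILE *fp)
{
    /* Parenthesizing the name suppresses function-like macro expansion,
       so this is guaranteed to call the actual getc function.           */
    return (getc)(fp);
}

#undef getc   /* alternatively, remove any macro definition outright */

int read_two(FILE *fp)
{
    return getc(fp);   /* now certainly the actual function */
}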
So it is allowed for the implementation to have a standard function and a macro with the same name. But what the macro then expands to is implementation-defined. It may very well be an internal function with the same address as what another library macro expands to.
Based on that, I don't believe there are any guarantees that different functions have different addresses.
In the specific case of getc, the standard says (C17 7.21.7.5):
The getc function is equivalent to fgetc, except that if it is implemented as a macro, it may evaluate stream more than once, so the argument should never be an expression with side effects.
I would say it is somewhat likely that the implementation calls the same actual function for fgetc and getc when these are implemented as macros. (Or that atoi and strtol call the same function, and so on.) The standard library implementations I have peeked at don't seem to do it this way, but I don't think there is anything in the standard stopping them.
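As an illustration, here is a minimal sketch probing that very question; the standard guarantees neither outcome:

#include <stdio.h>

int main(void)
{
    /* Taking the address suppresses any macro (C17 7.1.4), so these
       pointers refer to the actual functions.                        */
    int (*pf)(FILE *) = &fgetc;
    int (*pg)(FILE *) = &getc;

    printf("&fgetc %s &getc\n", pf == pg ? "==" : "!=");
    return 0;
}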
(As a side note, taking the address of library functions may not be a good idea for other reasons, namely that it may block inlining of that function within the same translation unit.)
Well, you are straying into implementation details here. The standard only specifies the behaviour of the functions of the standard library.
For getc, the spec says (emphasis mine):
The getc function is equivalent to fgetc, except that if it is implemented as a macro, it may evaluate stream more than once, so the argument should never be an expression with side effects.
So the implementation may implement getc as a macro, but it may also implement it as an alias for fgetc (another name for the same function) or as a different function with the same behaviour. Long story short, you cannot rely on &getc == &fgetc being either true or false.
The only thing the standard requires is that &getc must be defined, per 7.1.4 § 1:
... it is permitted to take the address of a library function even if it is also defined as a macro...
That just means that the implementation must have an actual function of that name, but it could (see the hypothetical fragments below):
be the fgetc function itself - then &fgetc == &getc is true
be a separate function hidden behind the macro - then &fgetc == &getc is false
be a separate function that merely calls fgetc - then &fgetc == &getc is false
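Hypothetical implementation fragments for each case (illustration only, not taken from any real library):

/* Case 1: at link time getc is made an alias of fgetc; there is no
   separate body, so &getc == &fgetc.                                 */

/* Case 2: getc is additionally a macro forwarding to fgetc, so call
   sites never reach the actual getc -- yet &getc must still name a
   real, separate function, so &getc != &fgetc:
       #define getc(stream) fgetc(stream)                             */

/* Case 3: the actual getc is a distinct wrapper, so &getc != &fgetc: */
int getc(FILE *stream) { return fgetc(stream); }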
Related
In C, when a function is declared like void main();, passing an argument to it (as the first and only argument) doesn't cause a compilation error; to prevent this, the function can be declared like void main(void);. By the way, I think this also applies to Objective-C but not to C++. (With Objective-C I am referring to functions outside classes.) Why is this? Thanks for reading. I imagine it's something like how, in Fortran, variables whose names start with i, j, k, l, m or n are implicitly of integer type (unless you add an implicit none).
Edit: Does Objective-C allow this because of greater compatibility with C, or for a reason similar to the one C has for allowing it?
Note: I've kept the mistake in the question so that answers and comments wouldn't need to be changed.
Another note: As pointed out by @Steve Summit and @matt (here), Objective-C is a strict superset of C, which means that all C code is also valid Objective-C code and thus has to show this behavior regarding functions.
Because function prototypes were not a part of pre-standard C, functions could be declared only with empty parentheses:
extern double sin();
All existing code used that sort of notation. The standard would have failed had such code been made invalid, or made to mean “zero arguments”.
So, in standard C, a function declaration like that means “takes an undefined list of zero or more arguments”. The standard does specify that all functions with a variable argument list must have a prototype in scope, and the prototype will end with , ...). So, a function declared with an empty argument list is not a variadic function (whereas printf() is variadic).
Because the compiler is not told about the number and types of the arguments, it cannot complain when the function is called, regardless of the arguments in the call.
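A minimal sketch of the difference (pre-C23 semantics; the calls are deliberately wrong to show what is and is not diagnosed):

/* Old-style declaration: the argument list is left unspecified. */
extern double sin();
/* Prototype: arguments are checked and converted.               */
extern double cos(double);

int main(void)
{
    sin(1, 2, 3);   /* accepted without diagnostic -- but undefined behavior */
    cos(1);         /* fine: the int 1 is converted to 1.0                   */
    /* cos(1, 2);      constraint violation: too many arguments              */
    return 0;
}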
In early (pre-ANSI) C, a correct match of function arguments between a function's definition and its calls was not checked by the compiler.
I believe this was done for two reasons:
It made the compiler considerably simpler
C was always designed for separate compilation, and checking consistency across translation units (that is, across multiple source files) is a much harder problem.
So, in those early days, making sure that a function's call(s) matched its definition was the responsibility of the programmer, or of a separate program, lint.
The lax checking of function arguments also made varargs functions like printf possible.
At any rate, in the original C, when you wrote
extern int f();
, you were not saying "f is a function accepting no arguments and returning int". You were simply saying "f is a function returning int". You weren't saying anything about the arguments.
Basically, early C's type system didn't even have a way of recording the parameters expected by a function. And that was especially true when separate compilation came into play, because the linker resolved external symbols based pretty much on their names only.
C++ changed this, of course, by introducing function prototypes. In C++, when you say extern int f();, you are declaring a function that explicitly takes 0 arguments. (Also a scheme of "name mangling" was devised, which among other things let the linker do some consistency checking at link time.)
Now, this was all somewhat of a deficiency in old C, and the biggest change that ANSI C introduced was to adopt C++'s function prototype notation into C. It was slightly different, though: to maintain compatibility, in C saying extern int f(); had to be interpreted as meaning "function returning int and taking unspecified arguments". If you wanted to explicitly say that a function took no arguments, you had to (and still have to) say extern int f(void);.
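A small sketch of the difference as it stands today (pre-C23; f and g are hypothetical names):

int f();          /* C: unspecified arguments (old style)          */
int g(void);      /* C and C++: exactly zero arguments             */

int call_them(void)
{
    f(42);        /* accepted: no prototype to check against       */
    /* g(42);        constraint violation: g takes no arguments    */
    return g();
}

int f(n) int n; { return n; }   /* old-style (K&R) definition      */
int g(void)     { return 0; }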
There was also a new ... notation to explicitly mark a function as taking variable arguments, like printf, and the process of getting rid of "implicit int" in declarations was begun.
All in all it was a significant improvement, although there are still a few holes. In particular, there's still some responsibility placed on the programmer, namely to ensure that accurate function prototypes are always in scope, so that the compiler can check them. See also this question.
Two additional notes: You asked about Objective-C, but I don't know anything about that language, so I can't address that point. And you said that for a function without a prototype, "trying to input an argument to it (as the first and the only argument) doesn't cause a compilation error", but in fact, you can pass any number of arguments to such a function, without error.
In C11, K.3.7.4.1 The memset_s function, I found this bit of rather confusing text:
Unlike memset, any call to the memset_s function shall be evaluated strictly according to the rules of the abstract machine as described in (5.1.2.3). That is, any call to the memset_s function shall assume that the memory indicated by s and n may be accessible in the future and thus must contain the values indicated by c.
This implies that memset is not (necessarily) "evaluated strictly according to the rules of the abstract machine". (The chapter referenced is 5.1.2.3 Program execution.)
I fail to understand the leeway the standard gives to memset that is explicitly ruled out here for memset_s, and what that would mean for an implementor of either function.
Imagine you have read a password:
{
    char password[128];
    if (fgets(password, sizeof(password), stdin) != 0)
    {
        password[strcspn(password, "\n\r")] = '\0';   /* strip the newline */
        validate_password(password);
        memset(password, '\0', sizeof(password));     /* zap the password  */
    }
}
You've carefully zapped the password so it can't be found accidentally.
Unfortunately, the compiler is allowed to omit that memset() call because password is not used again. The rule for memset_s() means that the call cannot be omitted; the password variable must be zeroed, regardless of optimization.
memset_s(password, sizeof(password), '\0', sizeof(password));
This is one of the few really useful features in Annex K. (We can debate the merits of having to repeat the size. However, in a more general case, the second size can be a variable, not a constant, and then the first size becomes a runtime protection against the variable being out of control.)
Note that this requirement is placed on the compiler rather than the library. The memset_s() function will behave correctly if it is called, just as memset() will behave correctly if it is called. The rule under discussion says that the compiler must call memset_s(), even though it might be able to omit the call to memset() because the variable is never used again.
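Since Annex K is optional and rarely shipped, a common fallback idiom is to call memset through a volatile function pointer, which the optimizer cannot prove side-effect-free. This is a sketch of a widely used idiom (the helper name secure_clear is mine), not something the standard mandates:

#include <string.h>

/* Reading a volatile object cannot be optimized away, so the compiler
   must perform the call even if the buffer is never read again.       */
static void *(*const volatile memset_vp)(void *, int, size_t) = memset;

static void secure_clear(void *p, size_t n)
{
    memset_vp(p, 0, n);
}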
I came across a puzzle in a presentation where the solution involves calling printf without using any header files at all. The presenter asserts that in C, if you call a function that has not been declared, the compiler implicitly takes its parameter list to be the promoted arguments it sees you pass.
As I recall, early C (e.g. K&R) allowed anything to be passed on any function call, so the calling convention had to be that the arguments are pushed right-to-left and the caller clears the stack after the function returns. But the prototype-based calls introduced in the run-up to ANSI C can use a more efficient calling convention, in which the called function clears the stack, so the cleanup code is not repeated at every call site.
In my recollection, the two forms were given different linker-visible names, so they were incompatible, and mismatches were caught at link time. His example worked, I maintained, because printf purposely uses the old form, allowing anything at all to be passed on a call-by-call basis.
He says that the two uses must be compatible, mandated by the standard. I don’t see how that can work unless the compiler always generates the old-style calls.
What is the real situation according to the standard? And, what is the history of this — has it changed over time?
The C standard says nothing about calling conventions.
Starting with the 1989 ANSI C standard (equivalent to the 1990 ISO C standard), calling a variadic function like printf without a correct declaration in scope has undefined behavior. That declaration must be a prototype, and it must include the , ... sequence to indicate that a variable number and type(s) of arguments are accepted.
Starting with the 1999 ISO C standard, calling a function with no visible declaration is a constraint violation, requiring a diagnostic. (This is about as close as C gets to saying a construct is illegal.) Prior to C99, a called function would be implicitly declared with a return type of int and whatever (promoted) arguments appear in the call.
Many C compilers will accept (perhaps with a warning) a call with no declaration, and many probably use a calling convention that makes a call to printf with no visible declaration "work". But the language doesn't define the behavior of such a call, and a conforming compiler is free to reject it or to generate code that misbehaves arbitrarily badly.
If you want to call printf, just add #include <stdio.h> at the top of your source file. That's a lot easier than thinking about what you might be able to get away with for a given compiler.
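For concreteness, the construct under discussion looks like this (deliberately broken; don't do it):

/* No #include <stdio.h>.  Under C89 rules the call below implicitly
   declares "int printf();", and calling the variadic printf through
   that declaration is undefined behavior.  From C99 on, the missing
   declaration is itself a constraint violation requiring a diagnostic. */
int main(void)
{
    printf("hello, world\n");
    return 0;
}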
Here is sample code:
#include <ctype.h>
int main(void)
{
    isalpha("X");
}
My question is: Is this code a constraint violation? Equivalently, is an implementation non-conforming if it does not issue a diagnostic?
Motivation: Multiple major compilers don't warn for this code, even in conforming mode. C11 6.5.2.2/2 makes passing a char * to a function whose prototype expects an int a constraint violation.
However it is not clear to me whether the provisions in 7.1.4 allowing a library function to be additionally defined as a macro supersede the requirement of 6.5.2.2/2. Footnote 187 suggests that the macro hides the prototype, but footnotes are non-normative.
The code (isalpha)("X"); does give a diagnostic of course.
I think the key here is whether isalpha is allowed to be defined as a macro or not. C11 7.1.4 briefly mentions
Any function declared in a header may be additionally implemented as a function-like macro defined in the header
although this chapter is mostly concerned with naming collisions and multi-threading issues etc. On the other hand, C11 7.4 says:
The header <ctype.h> declares several functions useful for classifying and mapping characters.
and C11 7.4.1.2:
int isalpha(int c);
The isalpha function...
My take is that isalpha is to be regarded as a function. Or if implemented as a macro, some manner of type check must be ensured by the implementation.
Given that it is a function, it is pretty clear from there. For all functions, the rules for function call are specified in C11 6.5.2.2:
If the expression that denotes the called function has a type that does include a prototype, the arguments are implicitly converted, as if by assignment, to the types of the corresponding parameters, taking the type of each parameter to be the unqualified version of its declared type.
Note the "as if by assignment" part. This leads us to the rules for simple assignment, C11 6.5.16.1, constraints. Behind the scenes, the code in the question would be equivalent to an assignment expression such as int c = (char[]){"X"};, where the left operand has arithmetic type and the right operand is a pointer. No such case can be found anywhere in C11 6.5.16.1.
Therefore the code is a constraint violation of 6.5.16.1.
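A quick way to see the constraint actually enforced is to bypass any macro, as the question itself notes:

#include <ctype.h>

int main(void)
{
    (isalpha)("X");   /* the parentheses suppress any macro, so this is a
                         plain call through the prototype int isalpha(int);
                         a conforming compiler must diagnose it             */
    return 0;
}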
If a library chooses to implement isalpha as a macro and thereby somehow loses the type check (by not performing the usual conversion of the argument to the parameter type, as in assignment), then that library may very well be non-conforming if the compiler fails to produce a diagnostic message.
My interpretation is that although the standard requires that there is an isalpha function, in 7.1.4 it specifically allows the implementation to additionally define a macro with the same name that hides the function declaration.
This means that calling isalpha in a program (without #undef'ing it first) is allowed to result in a macro expansion to something other than the literal function call for which 6.5.2.2 would require a diagnostic.
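For example, here is a purely hypothetical macro definition (not taken from any real library) under which the bad argument slips through:

/* Hypothetical <ctype.h> fragment -- illustration only: */
extern const unsigned short __ctype_table[256];
#define __ALPHA_MASK 0x0001u
#define isalpha(c) (__ctype_table[(int)(c)] & __ALPHA_MASK)

/* Now isalpha("X") expands to an array index on a pointer explicitly
   cast to int.  An explicit pointer-to-integer cast is not a constraint
   violation (merely implementation-defined, 6.3.2.3), so no diagnostic
   is required and the prototype's check never comes into play.          */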
It has been my understanding that C variadic arguments are handled entirely on the callee's side, i.e. that if you called a function f with
f(1, 2, 3.0)
The compiler would generate the same code for the call, whether you had declared f as
void f(int, int, double);
or
void f(int, int, ...);
The context for this question is this issue with calling a not-truly-variadic C function from Rust with a variadic FFI definition. If variadics do not matter from the caller's perspective (aside of course from type checking), then it seems odd to me that Rust would generate different code for a call where the function had been declared as variadic.
If this is not in fact decided by the C specification but is rather ABI-dependent, I would be most interested in the answer for the System V ABI, which from what I read of it didn't seem to indicate any special handling of variadics on the caller's side.
This is a non-ABI-specific answer.
Yes, formally the caller can (and, in the general case, will) treat functions with variadic arguments in a special way. This is actually why, from the very first standard, the C language has required all variadic functions to be declared with a prototype before the point of the call. Note that even though it was possible to safely call undeclared functions in C89/90, that permission did not extend to variadic functions: those always had to be declared in advance. Otherwise, the behavior was undefined.
In a slightly different form the rule still stands in modern C. Even though post-C99 C no longer allows calling undeclared functions, it still does not require prototype declarations. Yet, variadic functions have to be declared with prototype before the point of the call. The rationale is the same: the caller has to know that it is calling a variadic function and, possibly, handle the call differently.
And historically, there were implementations that used completely different calling conventions when calling variadic functions.
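For completeness, a well-defined variadic call always has the prototype in scope; the sketch below is ordinary portable C. (On the System V x86-64 ABI, for instance, the caller of a variadic function must additionally pass the number of vector registers used in %al, which is exactly the kind of caller-side special handling described above.)

#include <stdarg.h>
#include <stdio.h>

/* The ", ..." in the prototype tells every caller that it is making a
   variadic call and may therefore need a different calling sequence.   */
static void log_ints(const char *tag, int n, ...)
{
    va_list ap;
    va_start(ap, n);
    printf("%s:", tag);
    while (n-- > 0)
        printf(" %d", va_arg(ap, int));
    printf("\n");
    va_end(ap);
}

int main(void)
{
    log_ints("vals", 3, 1, 2, 3);   /* prototype visible: well-defined */
    return 0;
}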