Why nested functions in C are against C standards


Nested functions (function declarations in block scope) are not allowed by the C standards (ANSI C89, C99, C11).
But I couldn't find where the C standards state this.
Edit:
Why can't a function definition appear inside another function definition (compound statement)?

There is a difference between a function declaration and a function definition. A declaration merely declares the existence of a function and a definition defines the function.
int f(void) { /* ... */ } // function definition
int f(void); // function declaration
In 6.9.1 the syntax for a function is defined as
function-definition:
declaration-specifiers declarator declaration-list(opt) compound-statement
In 6.8.2, the things you can put in a compound statement are defined as a declaration or a statement. A function definition isn't considered to be either of these syntactically.
So yes, a function declaration is legal in a function but a function definition is not e.g.
int main(int argc, char*argv[])
{
int f(void); // legal
int g(void) { return 1; } ; // ILLEGAL
// blah blah
}

It may not be stated directly but if you work through the grammar for function-definition you'll find they're not accepted within the grammar.
Why? According to Dennis Ritchie (who is a bit of an authority on the matter) it appears they were excluded from the start:
"Procedures can be nested in BCPL, but may not refer to non-static objects defined in containing procedures. B and C avoid this restriction by imposing a more severe one: no nested procedures at all."
https://www.bell-labs.com/usr/dmr/www/chist.html
It's humorous to speak of avoiding a restriction by imposing a more severe one. I read this as a simplifying maneuver. Nested procedures add complexity to the compiler (which Ritchie was very keen to limit on machines of the time) and add little value.
The standardization process was (wisely) never seen as an opportunity to extend C willy-nilly and (from the same document):
"From the beginning, the X3J11 committee took a cautious, conservative view of language extensions."
It's difficult to put a case that nested functions offer significant benefits so it's not surprising that even if some implementations were supporting them they weren't adopted as standard.
In general the standards efforts ever since have been at least equally conservative and again it's difficult to see a lot of support amongst implementers to add such a feature.
At the end of the day, if you're worried that some function might be used outside its intended purpose and it is (logically) a sub-function of exactly one given function, then give it static linkage and, if necessary, move both into another source file or even a whole translation unit of their own.
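As a minimal sketch of that advice (the names square and sum_of_squares are made up for illustration), the helper is defined at file scope with static, so it has internal linkage and cannot be called from any other translation unit:

```c
/* Hypothetical example: square() is a helper used only by sum_of_squares(). */

/* static gives the helper internal linkage: it is invisible outside this file. */
static int square(int x)
{
    return x * x;
}

/* The only function this translation unit exports. */
int sum_of_squares(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += square(i);
    }
    return total;
}
```

If the file contains nothing else, square is effectively private to sum_of_squares, which is most of what a nested function would have given you.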

Nested or private functions are something that many C compilers used to allow, but are not part of the C standard, and now it's quite rare to find compilers that support them, certainly by default.
The standard is determined by a committee, and nested functions will be something that they have discussed, and there will be a rationale, but I don't know what it is offhand, nor do most C programmers. Nested functions aren't inherently a bad idea, but you can achieve virtually all of the benefits by writing a static file scope function, which is the method of creating private functions which was standardised.

Related

Why does a function in C(or Objective C) with no listed arguments allow inputting one argument?

In C when a function is declared like void main(); trying to input an argument to it(as the first and the only argument) doesn't cause a compilation error and in order to prevent it, function can be declared like void main(void);. By the way, I think this also applies to Objective C and not to C++. With Objective C I am referring to the functions outside classes. Why is this? Thanks for reaching out. I imagine it's something like that in Fortran variables whose names start with i, j, k, l, m or n are implicitly of integer type(unless you add an implicit none).
Edit: Does Objective C allow this for greater compatibility with C, or for a reason similar to C's own reason for having this?
Note: I've kept the mistake in the question so that answers and comments wouldn't need to be changed.
Another note: As pointed out by @Steve Summit and @matt, Objective-C is a strict superset of C, which means that all C code is also valid Objective-C code and thus has to show this behavior regarding functions.

Because function prototypes were not a part of pre-standard C, functions could be declared only with empty parentheses:
extern double sin();
All existing code used that sort of notation. The standard would have failed had such code been made invalid, or made to mean “zero arguments”.
So, in standard C, a function declaration like that means “takes an undefined list of zero or more arguments”. The standard does specify that all functions with a variable argument list must have a prototype in scope, and the prototype will end with , ...). So, a function declared with an empty argument list is not a variadic function (whereas printf() is variadic).
Because the compiler is not told about the number and types of the arguments, it cannot complain when the function is called, regardless of the arguments in the call.
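For contrast, here is a rough sketch of a genuinely variadic function, whose prototype ends with , ...) as described above. The name sum_ints and the convention of passing the count first are assumptions made for this illustration only:

```c
#include <stdarg.h>

/* Hypothetical example. The prototype ends in ", ..." so the function is variadic. */
int sum_ints(int count, ...);

/* Adds up `count` int arguments that follow the first parameter. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);   /* the caller must really pass ints */
    }
    va_end(args);
    return total;
}
```

Even with the prototype in scope, the compiler can only check the named parameter (count here); the trailing arguments are still the caller's responsibility.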

In early (pre-ANSI) C, a correct match of function arguments between a function's definition and its calls was not checked by the compiler.
I believe this was done for two reasons:
It made the compiler considerably simpler
C was always designed for separate compilation, and checking consistency across translation units (that is, across multiple source files) is a much harder problem.
So, in those early days, making sure that a function's call(s) matched its definition was the responsibility of the programmer, or of a separate program, lint.
The lax checking of function arguments also made varargs functions like printf possible.
At any rate, in the original C, when you wrote
extern int f();
you were not saying "f is a function accepting no arguments and returning int". You were simply saying "f is a function returning int". You weren't saying anything about the arguments.
Basically, early C's type system didn't even have a way of recording the parameters expected by a function. And that was especially true when separate compilation came into play, because the linker resolved external symbols based pretty much on their names only.
C++ changed this, of course, by introducing function prototypes. In C++, when you say extern int f();, you are declaring a function that explicitly takes 0 arguments. (Also a scheme of "name mangling" was devised, which among other things let the linker do some consistency checking at link time.)
Now, this was all somewhat of a deficiency in old C, and the biggest change that ANSI C introduced was to adopt C++'s function prototype notation into C. It was slightly different, though: to maintain compatibility, in C saying extern int f(); had to be interpreted as meaning "function returning int and taking unspecified arguments". If you wanted to explicitly say that a function took no arguments, you had to (and still have to) say extern int f(void);.
There was also a new ... notation to explicitly mark a function as taking variable arguments, like printf, and the process of getting rid of "implicit int" in declarations was begun.
All in all it was a significant improvement, although there are still a few holes. In particular, there's still some responsibility placed on the programmer, namely to ensure that accurate function prototypes are always in scope, so that the compiler can check them. See also this question.
Two additional notes: You asked about Objective C, but I don't know anything about that language, so I can't address that point. And you said that for a function without a prototype, "trying to input an argument to it (as the first and the only argument) doesn't cause a compilation error", but in fact, you can pass any number of arguments to such a function, without error.

Why a declaration is not a statement in C?

The following example is an illegal C program, which is confusing; it shows that a declaration is not a statement in the C language.
int main() {
if (1) int x;
}
I've read the C specification (N2176) and I know that the C language distinguishes declarations from statements in its syntax specification. I told my teacher, who teaches a compilers course, but he doesn't seem to believe it, and I cannot convince him unless I show him the specification.
So, I am also really confused. Why is C designed like this? Why is a declaration not a statement in C? How can I convince someone of the reasoning behind this design?

Because there is no apparent grammatical or technical semantic reason that a declaration cannot appear wherever a statement may appear, this appears to be largely due to history and lack of utility.
Considering the Grammar
Statements enter the C grammar in the function-definition rule, in which a compound-statement appears. A compound-statement allows a statement. Then inspecting the replacements for a statement reveals the places where a statement may appear but a declaration may not:
In a labeled-statement, after a label followed by :.
In a selection-statement, after the ) of an if or a switch or after an else.
In an iteration-statement, after the ) of a while or for or after a do.
The following is not a formal analysis of the grammar, but it appears the places where a statement may appear but a declaration may not are quite limited: After the : that ends a label, after a keyword (else or do) or after the closing ) for a ( that immediately follows a keyword (if, switch, while, or for). These seem to me like unambiguous points in the grammar, where it should be as easy to distinguish declarations and statements as it is to do so after a ; in a compound-statement.
Therefore, I do not think there is a grammatical reason not to allow a declaration to appear anywhere a statement may appear (or, equivalently, to define a declaration as a kind of statement).
Considering the Semantics
Now consider the semantic effects of allowing declarations in the places where currently a statement may appear but a declaration may not.
In the case of a labeled-statement, where we desired to have label: declaration, we can use label: ; declaration, where we have inserted a null statement after the :. The result is a defined code sequence with semantic effect equivalent to what we would desire to have by allowing a declaration immediately after the label.
In the other cases, where we desire to have declaration, we can use { declaration }. Again, the result is a defined code sequence with semantic effect equivalent to what we would desire to have by allowing a bare declaration. That effect is minimal; any expressions in the declaration (in array declarators or initializers) will be evaluated, but anything that is declared goes out of scope immediately. Note that even if the scope were not ended by the closing }, it is ended by the fact that the C standard defines each of these places to be a block. (C 2018 6.8.4 3 says the substatement of a selection-statement is a block, and 6.8.5 5 says the loop body of an iteration-statement is a block.)
Nonetheless, this shows there is no technical semantic impediment to allowing a declaration wherever a statement may appear.
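The two workarounds described above can be sketched as follows. The function demo and its variable names are invented for this illustration:

```c
/* Hypothetical example showing both workarounds in one function. */
int demo(int n)
{
    int total = 0;

    /* "if (n > 0) int doubled = n * 2;" is rejected; braces make it legal. */
    if (n > 0) {
        int doubled = n * 2;
        total += doubled;
    }   /* doubled goes out of scope here */

    goto done;

done: ;   /* null statement, so a declaration may follow the label */
    int result = total + 1;
    return result;
}
```

The null statement after done: costs nothing at run time; it is only there to satisfy the grammar.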
Conclusion
Since the grammar and semantics of C apparently do not preclude allowing a declaration to be a type of statement, we are left with reasons of history and utility. In C as described in the first edition of Kernighan and Ritchie’s The C Programming Language, the locations of declarations were limited. Inside functions, they could only appear at the start of a compound statement. A declaration could not follow a statement. As we see from modern C, there was no grammatical or semantic reason for this limitation; we can allow declarations anywhere within a compound statement. So it seems simply that, around 1978, work on the language had not progressed that far.
Similarly, it seems that current work on C has not gone to the point of allowing a declaration to appear anywhere a statement may appear, as if it were a type of statement, even though there may be no technical impediment. However, in this case, there is less motivation for loosening the rules. Of the above cases, the only one that is of much use is allowing a declaration in a labeled statement. And, as its desired effect is easily accomplished by inserting a null statement, there is likely insufficient motivation to change compilers and to advocate for the changes in the C committee.

That's because a declaration doesn't instruct the compiler to do anything, it's purely informative for the compiler; at least by the standard. Compilers may do something if they see a declaration, the standard does not forbid it either but it doesn't require them to do anything if they only see a declaration and whatever it declares is never used within any statement.
Consider this code:
int main ( )
{
int x;
printf("Hello World!\n");
return 0;
}
What do you think int x; will do? You are declaring that x is of type int but you are never using x anywhere in the rest of the code. The compiler doesn't even have to reserve any memory on the stack for it. It may do so but it isn't required to do so.
The standard allows the compiler to create exactly the same code as if you had written:
int main ( )
{
printf("Hello World!\n");
return 0;
}
There is simply nothing a compiler must do if you let it know the type of a variable. This variable doesn't have to exist anywhere at all unless it is ever used by a statement.
C is not an interpreted language where every piece of code instructs the interpreter to directly do something. C is a compiled language which means you tell the compiler to generate CPU code for you that performs the actions you described in a predefined language. So there is no one-to-one relationship between the code you write and the CPU code the compiler generates.
You may write
int x = a / 8;
but the CPU code that the compiler generates may be equivalent to
int x = a >> 3;
For non-negative a that is exactly the same thing, and if shifting is faster than division (and you can bet it is), the compiler does not have to generate a division just because you told it to do so. What you told the compiler is "I want x to be one eighth of a" and the compiler will be like "okay, I'll generate code that makes this happen" but how the compiler is making it happen is up to the compiler.
Thus the compiler only needs to translate statements to CPU code. Strictly, only statements that have an effect need translating, but finding that out may require expensive analysis, so it is no standard violation to translate all statements to code, even those that do nothing. A declaration on its own never has an effect; it just lets the compiler know the type of a variable or function, which may become important in statements later on, but only if the variable/function is ever actually used.
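A small sketch of the division/shift point, using unsigned operands so the two forms agree for every input (with a negative signed value, division truncates toward zero while a right shift typically does not). The function names are made up for illustration:

```c
/* Hypothetical example: two ways of writing "one eighth of a". */

/* What the programmer wrote. */
unsigned eighth_by_division(unsigned a)
{
    return a / 8u;
}

/* What a compiler is free to generate instead, since the result is identical. */
unsigned eighth_by_shift(unsigned a)
{
    return a >> 3;
}
```

Whether a particular compiler really emits the same instructions for both is something to check in its assembly output rather than assume.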

If it were valid, what would you like this program to do ? :
#include <stdio.h>
int main (int argc, char **argv)
{
if (argc > 1) int x=42;
printf("%d\n", x);
return 0;
}
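For comparison, a sketch of a version that is well defined: the variable is declared in the enclosing block and only assigned inside the if. The function name pick_value and the default of 0 are assumptions made for this illustration:

```c
/* Hypothetical rewrite: x lives in the enclosing block, so it is still in scope after the if. */
int pick_value(int argc)
{
    int x = 0;          /* declared (and given a default) before the branch */

    if (argc > 1) {
        x = 42;         /* a plain assignment is a statement, so this is legal */
    }
    return x;
}
```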

In C, should I define (not declare/prototype) a function that takes no arguments with void or with an empty list?

There may or may not be a duplicate to this question, although I tried to find one but everyone's answer seemed to only be referring to the declaration/prototype. They specify that a definition void foo() { } is the same as void foo(void) { }, but which way should I actually use? In C89? In C99? I believe I should start using void foo(void); for my prototype declarations, but is there any difference at all if I use void or not for the definition?

They are different. void foo(void) declares foo as a function that takes NO arguments and returns nothing,
while with void foo() the function foo takes an UNSPECIFIED number of arguments and returns void.
You should always use the first one for standard conforming C.

They are semantically different
Given the following functions:
void f(void);
void g();
It is a compile-time error to call f with arguments:
error: too many arguments to function "f"
However, that declaration of g means it takes an unspecified number of arguments. To the compiler, this means it can take any number of arguments, from zero to some implementation-defined upper bound. The compiler will accept:
g();
g(argument);
g(argument1, argument2, ... , argumentN);
Essentially, because g did not specify its arguments, the compiler doesn't really know how many arguments g accepts. So the compiler will accept anything and emit code according to the actual usage of g. If you pass one argument, it will emit code to push one argument, call g and then pop it off the stack.
It's the difference between explicitly saying "no, I don't take any arguments" and not saying anything when questioned. Remaining silent keeps the issue ambiguous, to the point where the statement which calls g is the only concrete information the compiler has regarding which parameters the function accepts. So, it will emit machine code according to that specification.
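A minimal sketch of the explicit form (the names answer and call_answer are invented here). The rejected call is left in a comment so that the snippet still compiles:

```c
/* Hypothetical example: (void) states explicitly that there are no parameters. */
int answer(void);

int answer(void)
{
    return 42;
}

int call_answer(void)
{
    /* return answer(1);   would be diagnosed: too many arguments to function */
    return answer();
}
```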
Recommendations
which way should I actually use?
According to the SEI CERT C Coding Standard, it is recommended to explicitly specify void when a function accepts no arguments.
The article cites, as the basis of its recommendation, the C11 standard, subclause 6.11.6:
The use of function declarators with empty parentheses
(not prototype-format parameter type declarators)
is an obsolescent feature.
Declaring a function with an unspecified parameter list is classified as medium severity. Concrete examples of problems that may arise are presented. Namely:
Ambiguous Interface
Compiler will not perform checks
May hide errors
Information Outflow
Potential security flaw
Information Security has a post exploring not just the security but also the programming and software development implications of both styles.
The issue is more about quality assurance.
Old-style declarations are dangerous, not because of evil programmers,
but because of human programmers, who cannot think of everything
and must be helped by compiler warnings. That's all the point of function
prototypes, introduced in ANSI C, which include type information for
the function parameters.

I'll try to answer simply and practically.
From the practice and reference I'm familiar with, c89 and c99 should treat declaration/definition/call of functions which take no arguments and return no value equally.
In case one omits the prototype declaration (usually in the header file), the definition has to specify the number and type of arguments taken (i.e. it must take the form of prototype, explicitly void foo(void) for taking no arguments)
and should precede the actual function call in the source file (if used in the same program). I've always been advised to write prototypes and decently segmented code as part of good programming practice.
Declaration:
void foo (void); /*not void foo(), so that the declaration is a proper prototype !*/
Definition:
void foo (void) /*must match its prototype from the declaration !*/
{
/*code for what this function actually does*/
return;
}
Function call from within main() or another function:
...
foo();
...

Yes, there is a difference. It is better to define functions like void foo(void){} because the compiler will then reject any call that passes arguments, with an error like: too many arguments to function 'foo'
EDIT: If you want to add this compiler validation to existing code, it can probably be done by changing the prototypes in the headers, without changing the function definitions. But that looks awkward IMHO. So for newly created programs (as pointed out by the skillful commentators above) it's better to make the definition and the declaration match explicitly; declaring and defining with empty parentheses is a bad and ancient practice.

Is avoiding prototype declaration for private function (defined before its use) a MISRA violation?

Writing prototype declarations for all functions defined in a C file is considered good programming practice. It also satisfies the MISRA guidelines.
But I have seen developers skip prototype declarations for functions which are defined before they are used. It seems a prototype declaration is unnecessary in such cases.
So can somebody please tell me if it's a MISRA violation?

Rule 8.1 of MISRA 2004 says that
Functions shall have prototype declarations and the prototype shall be visible at both the function definition and call.
The explanation given is as follows
The use of prototypes enables the compiler to check the integrity of function definitions and calls. Without prototypes the compiler is not obliged to pick up certain errors in function calls. (e.g. different number of arguments from the function body, mismatch in types of arguments between call and definition).
Function interfaces have been shown to be a cause of considerable
problems, and therefore this rule is considered very important.
So, yes, you would violate MISRA
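A rough sketch of the pattern the quoted rule asks for, applied to a private (static) function. The names clamp_percent and report_level are hypothetical, and how a particular MISRA checker treats static functions should be confirmed against the edition in use:

```c
/* Hypothetical example: the prototype is visible at both the call and the definition. */
static int clamp_percent(int value);

int report_level(int raw)
{
    return clamp_percent(raw);   /* call is checked against the prototype above */
}

static int clamp_percent(int value)
{
    if (value < 0) {
        return 0;
    }
    if (value > 100) {
        return 100;
    }
    return value;
}
```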

It breaks rule 8.1 of MISRA C: http://caxapa.ru/thumbs/468328/misra-c-2004.pdf

Forward declare entities in C standard library?

Is it legal to forward declare structs and functions provided by the C standard library?
My background is C++ in which the answer is no. The primary reason for this is that a struct or class mandated by the C++ standard library can be a template behind the scenes and may have "secret" template parameters and so cannot be properly declared with a naive non-template declaration. Even if a user does figure out exactly how to forward declare a particular entity in a particular version of a particular implementation, the implementation is not obliged to not break that declaration in future versions.
I don't have a copy of any C standard at hand but obviously there are no templates in C.
So is it legal to forward declare entities in the C standard library?
Another reason that entities in the C++ standard library may not be forward declared is that headers provided by the implementation need not follow the normal rules. For example, in a recent question I asked if a C++ header provided by the implementation need be an actual file and the answer was no. I don't know if any of that applies to C.
The C standard library is used by both C and C++ but for this question I'm only asking about C.

Forward declarations of structs are always permissible in C. However, not very many types can be used this way. For example, you can't use a forward declaration for FILE simply because the tag name of the struct is not specified (and theoretically, it may not be a struct at all).
Section 7.1.4 paragraph 2 of n1570 gives you permission to do the same with functions:
Provided that a library function can be declared without reference to any type defined in a
header, it is also permissible to declare the function and use it without including its
associated header.
This used to be rather common. I think the reasoning here is that hard drives are slow, and fewer #include means faster compile times. But this isn't the 1980s any more, and we all have fast CPUs and fast hard drives, so a few #include aren't even noticed.
#include <stddef.h> /* size_t is defined in a header, so malloc still needs one */
void *malloc(size_t);
void abort(void);
/* my code here */
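A sketch that stays strictly within the quoted permission, using only library functions whose declarations need no header-defined type. The wrappers distance and parse_number are made up for illustration:

```c
/* Declared by hand instead of including <stdlib.h>; neither needs a header type. */
int abs(int);
int atoi(const char *);

/* Hypothetical wrappers that use the hand-written declarations. */
int distance(int a, int b)
{
    return abs(a - b);
}

int parse_number(const char *text)
{
    return atoi(text);
}
```

The hand-written declarations must match the standard ones exactly; a mismatch is not something the compiler is guaranteed to catch.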

Yes, you can; this is perfectly valid.
This can be done with the standard library too.
double atof(const char *);
int main() {
double t = atof("13.37");
return 0;
}
#include <stdio.h>
Similar things can be done with structs, variables etc.
I would recommend you read the wiki page, which features some C examples:
http://en.wikipedia.org/wiki/Forward_declaration
This is specified in the C standard, section 7.1.4 paragraph 2 of n1570:
Provided that a library function can be declared without reference to any type defined in a header, it is also permissible to declare the function and use it without including its associated header.
