This is not a lambda function question, I know that I can assign a lambda to a variable.
What's the point of allowing us to declare, but not define a function inside code?
For example:
#include <iostream>
int main()
{
// This is illegal
// int one(int bar) { return 13 + bar; }
// This is legal, but why would I want this?
int two(int bar);
// This gets the job done but man it's complicated
class three{
int m_iBar;
public:
three(int bar):m_iBar(13 + bar){}
operator int(){return m_iBar;}
};
std::cout << three(42) << '\n';
return 0;
}
So what I want to know is why would C++ allow two which seems useless, and three which seems far more complicated, but disallow one?
EDIT:
From the answers it seems that there in-code declaration may be able to prevent namespace pollution, what I was hoping to hear though is why the ability to declare functions has been allowed but the ability to define functions has been disallowed.
It is not obvious why one is not allowed; nested functions were proposed a long time ago in N0295 which says:
We discuss the introduction of nested functions into C++. Nested
functions are well understood and their introduction requires little
effort from either compiler vendors, programmers, or the committee.
Nested functions offer significant advantages, [...]
Obviously this proposal was rejected, but since we don't have meeting minutes available online for 1993 we don't have a possible source for the rationale for this rejection.
In fact this proposal is noted in Lambda expressions and closures for C
++ as a possible alternative:
One article [Bre88] and proposal N0295 to the C
++ committee [SH93] suggest adding nested functions to C
++ . Nested functions are similar to lambda expressions, but are defined as statements within a function body, and the resulting
closure cannot be used unless that function is active. These proposals
also do not include adding a new type for each lambda expression, but
instead implementing them more like normal functions, including
allowing a special kind of function pointer to refer to them. Both of
these proposals predate the addition of templates to C
++ , and so do not mention the use of nested functions in combination with generic algorithms. Also, these proposals have no way to copy
local variables into a closure, and so the nested functions they
produce are completely unusable outside their enclosing function
Considering we do now have lambdas we are unlikely to see nested functions since, as the paper outlines, they are alternatives for the same problem and nested functions have several limitations relative to lambdas.
As for this part of your question:
// This is legal, but why would I want this?
int two(int bar);
There are cases where this would be a useful way to call the function you want. The draft C++ standard section 3.4.1 [basic.lookup.unqual] gives us one interesting example:
namespace NS {
class T { };
void f(T);
void g(T, int);
}
NS::T parm;
void g(NS::T, float);
int main() {
f(parm); // OK: calls NS::f
extern void g(NS::T, float);
g(parm, 1); // OK: calls g(NS::T, float)
}
Well, the answer is "historical reasons". In C you could have function declarations at block scope, and the C++ designers did not see the benefit in removing that option.
An example usage would be:
#include <iostream>
int main()
{
int func();
func();
}
int func()
{
std::cout << "Hello\n";
}
IMO this is a bad idea because it is easy to make a mistake by providing a declaration that does not match the function's real definition, leading to undefined behaviour which will not be diagnosed by the compiler.
In the example you give, void two(int) is being declared as an external function, with that declaration only being valid within the scope of the main function.
That's reasonable if you only wish to make the name two available within main() so as to avoid polluting the global namespace within the current compilation unit.
Example in response to comments:
main.cpp:
int main() {
int foo();
return foo();
}
foo.cpp:
int foo() {
return 0;
}
no need for header files. compile and link with
c++ main.cpp foo.cpp
it'll compile and run, and the program will return 0 as expected.
You can do these things, largely because they're actually not all that difficult to do.
From the viewpoint of the compiler, having a function declaration inside another function is pretty trivial to implement. The compiler needs a mechanism to allow declarations inside of functions to handle other declarations (e.g., int x;) inside a function anyway.
It will typically have a general mechanism for parsing a declaration. For the guy writing the compiler, it doesn't really matter at all whether that mechanism is invoked when parsing code inside or outside of another function--it's just a declaration, so when it sees enough to know that what's there is a declaration, it invokes the part of the compiler that deals with declarations.
In fact, prohibiting these particular declarations inside a function would probably add extra complexity, because the compiler would then need an entirely gratuitous check to see if it's already looking at code inside a function definition and based on that decide whether to allow or prohibit this particular declaration.
That leaves the question of how a nested function is different. A nested function is different because of how it affects code generation. In languages that allow nested functions (e.g., Pascal) you normally expect that the code in the nested function has direct access to the variables of the function in which it's nested. For example:
int foo() {
int x;
int bar() {
x = 1; // Should assign to the `x` defined in `foo`.
}
}
Without local functions, the code to access local variables is fairly simple. In a typical implementation, when execution enters the function, some block of space for local variables is allocated on the stack. All the local variables are allocated in that single block, and each variable is treated as simply an offset from the beginning (or end) of the block. For example, let's consider a function something like this:
int f() {
int x;
int y;
x = 1;
y = x;
return y;
}
A compiler (assuming it didn't optimize away the extra code) might generate code for this roughly equivalent to this:
stack_pointer -= 2 * sizeof(int); // allocate space for local variables
x_offset = 0;
y_offset = sizeof(int);
stack_pointer[x_offset] = 1; // x = 1;
stack_pointer[y_offset] = stack_pointer[x_offset]; // y = x;
return_location = stack_pointer[y_offset]; // return y;
stack_pointer += 2 * sizeof(int);
In particular, it has one location pointing to the beginning of the block of local variables, and all access to the local variables is as offsets from that location.
With nested functions, that's no longer the case--instead, a function has access not only to its own local variables, but to the variables local to all the functions in which it's nested. Instead of just having one "stack_pointer" from which it computes an offset, it needs to walk back up the stack to find the stack_pointers local to the functions in which it's nested.
Now, in a trivial case that's not all that terrible either--if bar is nested inside of foo, then bar can just look up the stack at the previous stack pointer to access foo's variables. Right?
Wrong! Well, there are cases where this can be true, but it's not necessarily the case. In particular, bar could be recursive, in which case a given invocation of bar might have to look some nearly arbitrary number of levels back up the stack to find the variables of the surrounding function. Generally speaking, you need to do one of two things: either you put some extra data on the stack, so it can search back up the stack at run-time to find its surrounding function's stack frame, or else you effectively pass a pointer to the surrounding function's stack frame as a hidden parameter to the nested function. Oh, but there's not necessarily just one surrounding function either--if you can nest functions, you can probably nest them (more or less) arbitrarily deep, so you need to be ready to pass an arbitrary number of hidden parameters. That means you typically end up with something like a linked list of stack frames to surrounding functions, and access to variables of surrounding functions is done by walking that linked list to find its stack pointer, then accessing an offset from that stack pointer.
That, however, means that access to a "local" variable may not be a trivial matter. Finding the correct stack frame to access the variable can be non-trivial, so access to variables of surrounding functions is also (at least usually) slower than access to truly local variables. And, of course, the compiler has to generate code to find the right stack frames, access variables via any of an arbitrary number of stack frames, and so on.
This is the complexity that C was avoiding by prohibiting nested functions. Now, it's certainly true that a current C++ compiler is a rather different sort of beast from a 1970's vintage C compiler. With things like multiple, virtual inheritance, a C++ compiler has to deal with things on this same general nature in any case (i.e., finding the location of a base-class variable in such cases can be non-trivial as well). On a percentage basis, supporting nested functions wouldn't add much complexity to a current C++ compiler (and some, such as gcc, already support them).
At the same time, it rarely adds much utility either. In particular, if you want to define something that acts like a function inside of a function, you can use a lambda expression. What this actually creates is an object (i.e., an instance of some class) that overloads the function call operator (operator()) but it still gives function-like capabilities. It makes capturing (or not) data from the surrounding context more explicit though, which allows it to use existing mechanisms rather than inventing a whole new mechanism and set of rules for its use.
Bottom line: even though it might initially seem like nested declarations are hard and nested functions are trivial, more or less the opposite is true: nested functions are actually much more complex to support than nested declarations.
The first one is a function definition, and it is not allowed. Obvious, wt is the usage of putting a definition of a function inside another function.
But the other twos are just declarations. Imagine you need to use int two(int bar); function inside the main method. But it is defined below the main() function, so that function declaration inside the function makes you to use that function with declarations.
The same applies to the third. Class declarations inside the function allows you to use a class inside the function without providing an appropriate header or reference.
int main()
{
// This is legal, but why would I want this?
int two(int bar);
//Call two
int x = two(7);
class three {
int m_iBar;
public:
three(int bar):m_iBar(13 + bar) {}
operator int() {return m_iBar;}
};
//Use class
three *threeObj = new three();
return 0;
}
This language feature was inherited from C, where it served some purpose in C's early days (function declaration scoping maybe?).
I don't know if this feature is used much by modern C programmers and I sincerely doubt it.
So, to sum up the answer:
there is no purpose for this feature in modern C++ (that I know of, at least), it is here because of C++-to-C backward compatibility (I suppose :) ).
Thanks to the comment below:
Function prototype is scoped to the function it is declared in, so one can have a tidier global namespace - by referring to external functions/symbols without #include.
Actually, there is one use case which is conceivably useful. If you want to make sure that a certain function is called (and your code compiles), no matter what the surrounding code declares, you can open your own block and declare the function prototype in it. (The inspiration is originally from Johannes Schaub, https://stackoverflow.com/a/929902/3150802, via TeKa, https://stackoverflow.com/a/8821992/3150802).
This may be particularily useful if you have to include headers which you don't control, or if you have a multi-line macro which may be used in unknown code.
The key is that a local declaration supersedes previous declarations in the innermost enclosing block. While that can introduce subtle bugs (and, I think, is forbidden in C#), it can be used consciously. Consider:
// somebody's header
void f();
// your code
{ int i;
int f(); // your different f()!
i = f();
// ...
}
Linking may be interesting because chances are the headers belong to a library, but I guess you can adjust the linker arguments so that f() is resolved to your function by the time that library is considered. Or you tell it to ignore duplicate symbols. Or you don't link against the library.
This is not an answer to the OP question, but rather a reply to several comments.
I disagree with these points in the comments and answers: 1 that nested declarations are allegedly harmless, and 2 that nested definitions are useless.
1 The prime counterexample for the alleged harmlessness of nested function declarations is the infamous Most Vexing Parse. IMO the spread of confusion caused by it is enough to warrant an extra rule forbidding nested declarations.
2 The 1st counterexample to the alleged uselessness of nested function definitions is frequent need to perform the same operation in several places inside exactly one function. There is an obvious workaround for this:
private:
inline void bar(int abc)
{
// Do the repeating operation
}
public:
void foo()
{
int a, b, c;
bar(a);
bar(b);
bar(c);
}
However, this solution often enough contaminates the class definition with numerous private functions, each of which is used in exactly one caller. A nested function declaration would be much cleaner.
Specifically answering this question:
From the answers it seems that there in-code declaration may be able to prevent namespace pollution, what I was hoping to hear though is why the ability to declare functions has been allowed but the ability to define functions has been disallowed.
Because consider this code:
int main()
{
int foo() {
// Do something
return 0;
}
return 0;
}
Questions for language designers:
Should foo() be available to other functions?
If so, what should be its name? int main(void)::foo()?
(Note that 2 would not be possible in C, the originator of C++)
If we want a local function, we already have a way - make it a static member of a locally-defined class. So should we add another syntactic method of achieving the same result? Why do that? Wouldn't it increase the maintenance burden of C++ compiler developers?
And so on...
Just wanted to point out that the GCC compiler allows you to declare functions inside functions. Read more about it here. Also with the introduction of lambdas to C++, this question is a bit obsolete now.
The ability to declare function headers inside other functions, I found useful in the following case:
void do_something(int&);
int main() {
int my_number = 10 * 10 * 10;
do_something(my_number);
return 0;
}
void do_something(int& num) {
void do_something_helper(int&); // declare helper here
do_something_helper(num);
// Do something else
}
void do_something_helper(int& num) {
num += std::abs(num - 1337);
}
What do we have here? Basically, you have a function that is supposed to be called from main, so what you do is that you forward declare it like normal. But then you realize, this function also needs another function to help it with what it's doing. So rather than declaring that helper function above main, you declare it inside the function that needs it and then it can be called from that function and that function only.
My point is, declaring function headers inside functions can be an indirect method of function encapsulation, which allows a function to hide some parts of what it's doing by delegating to some other function that only it is aware of, almost giving an illusion of a nested function.
Nested function declarations are allowed probably for
1. Forward references
2. To be able to declare a pointer to function(s) and pass around other function(s) in a limited scope.
Nested function definitions are not allowed probably due to issues like
1. Optimization
2. Recursion (enclosing and nested defined function(s))
3. Re-entrancy
4. Concurrency and other multithread access issues.
From my limited understanding :)
Related
I read plenty of questions regarding
declaration of functions inside structure?
But I did not get a satisfactory answer.
My question is not about whether functions can be declared inside a structure or not?
Instead, my question is
WHY function can not be declared inside structure?
Well that is the fundamental difference between C and C++ (works in C++). C++ supports classes (and in C++ a struct is a special case of a class), and C does not.
In C you would implement the class as a structure with functions that take a this pointer explicitly, which is essentially what C++ does under the hood. Coupled with a sensible naming convention so you know what functions belong to which classes (again something C++ does under then hood with name-mangling), you get close to object-based if not object-oriented programming. For example:
typedef struct temp
{
int a;
} classTemp ;
void classTemp_GetData( classTemp* pThis )
{
printf( "Enter value of a : " );
scanf( "%d", &(pThis->a) );
}
classTemp T ;
int main()
{
classTemp_GetData( &T );
}
However, as you can see without language support for classes, implementing then can become tiresome.
In C, the functions and data structures are more or less bare; the language gives a minimum of support for combining data structures together, and none at all (directly) for including functions with those data structures.
The purpose of C is to have a language that translates as directly as possible into machine code, more like a portable assembly language than a higher-level language such as C++ (not that C++ is all that high-level). C let's you get very close to the machine, getting into details that most languages abstract away; the down side of this is that you have to get close to the machine in C to use the language to its utmost. It takes a completely different approach to programming from C++, something that the surface similarities between them hide.
Check out here for more info (wonderful discussions there).
P.S.: You can also accomplish the functionality by using function pointers, i.e.
have a pointer to a function (as a variable) inside the struct.
For example:
#include <stdio.h>
struct t {
int a;
void (*fun) (int * a); // <-- function pointers
} ;
void get_a (int * a) {
printf (" input : ");
scanf ("%d", a);
}
int main () {
struct t test;
test.a = 0;
printf ("a (before): %d\n", test.a);
test.fun = get_a;
test.fun(&test.a);
printf ("a (after ): %d\n", test.a);
return 0;
}
WHY function can not be declared inside structure?
Because C standard does not allow to declare function/method inside a structure. C is not object-oriented language.
6.7.2.1 Structure and union specifiers:
A structure or union shall not contain a member with incomplete or function type(hence,
a structure shall not contain an instance of itself, but may contain a pointer to an instance
of itself), except that the last member of a structure with more than one named member
may have incomplete array type; such a structure (and any union containing, possibly
recursively, a member that is such a structure) shall not be a member of a structure or an
element of an array.
I suppose there were and are many reasons, here are several of them:
C programming language was created in 1972, and was influenced by pure assembly language, so struct was supposed as "data-only" element
As soon as C is NOT object oriented language - there is actually no sense to define functions inside "data structure", there are no such entity as constructor/method etc
Functions are directly translated to pure assembly push and call instructions and there are no hidden arguments like this
I guess, because it wouldn't make much sence in C. If you declare a function inside structure, you expect it to be somehow related to that structure, right? Say,
struct A {
int foo;
void hello() {
// smth
}
}
Would you expect hello() to have access to foo at least? Because otherwise hello() only got something like namespace, so to call it we would write A.hello() - it would be a static function, in terms of C++ - not much difference from normal C function.
If hello() has access to foo, there must be a this pointer to implement such access, which in C++ always implicitly passed to functions as first argument.
If a structure function has access to structure variables, it must be different somehow from access that have other functions, again, to add some sence to functions inside structures at all. So we have public, private modificators.
Next. You don't have inheritance in C (but you can simulate it), and this is something that adds lots of sence to declaring functions inside structutes. So here we'd like to add virtual and protected.
Now you can add namespaces and classes, and here you are, invented C++ (well, without templates).
But C++ is object-oriented, and C is not. First, people created C, then they wrote tons of programs, understood some improvements that could be made, and then, following reasonings that I mentioned earlier, they came up with C++. They did not change C instead to separate concepts - C is procedure-oriented, and C++ is object-oriented.
C was designed so that it could be processed with a relatively simple compilation system. To allow function definitions to appear within anything else would have required the compiler to keep track of the context in which the function appeared while processing it. Given that members of a structure, union, or enum declaration do not enter scope until the end of the declaration, there would be nothing that a function declared within a structure could do which a function declared elsewhere could not.
Conceptually, it might have been nice to allow constant declarations within a structure; a constant pointer to a literal function would then be a special case of that (even if the function itself had to be declared elsewhere), but that would have required the C compiler to keep track of more information for each structure member--not just its type and its offset, but also whether it was a constant and, if so, what its value should be. Such a thing would not be difficult in today's compilation environments, but early C compilers were expected to run with orders of magnitude less memory than would be available today.
Given that C has been extended to offer many features which could not very well be handled by a compiler running on a 64K (or smaller) system, it might reasonably be argued that it should no longer be bound by such constraints. Indeed, there are some aspect of C's heritage which I would like to lose, such as the integer-promotion rules which require even new integer types to follow the inconsistent rules for old types, rather than allowing the new types to have explicitly specified behavior [e.g. have a wrap32_t which could be converted to any other integer type without a typecast, but when added to any "non-wrap" integer type of any size would yield a wrap32_t]. Being able to define functions within a struct might be nice, but it would be pretty far down on my list of improvements.
I'd like to be able to generically pass a function to a function in C. I've used C for a few years, and I'm aware of the barriers to implementing proper closures and higher-order functions. It's almost insurmountable.
I scoured StackOverflow to see what other sources had to say on the matter:
higher-order-functions-in-c
anonymous-functions-using-gcc-statement-expressions
is-there-a-way-to-do-currying-in-c
functional-programming-currying-in-c-issue-with-types
emulating-partial-function-application-in-c
fake-anonymous-functions-in-c
functional-programming-in-c-with-macro-higher-order-function-generators
higher-order-functions-in-c-as-a-syntactic-sugar-with-minimal-effort
...and none had a silver-bullet generic answer, outside of either using varargs or assembly. I have no bones with assembly, but if I can efficiently implement a feature in the host language, I usually attempt to.
Since I can't have HOF easily...
I'd love higher-order functions, but I'll settle for delegates in a pinch. I suspect that with something like the code below I could get a workable delegate implementation in C.
An implementation like this comes to mind:
enum FUN_TYPES {
GENERIC,
VOID_FUN,
INT_FUN,
UINT32_FUN,
FLOAT_FUN,
};
typedef struct delegate {
uint32 fun_type;
union function {
int (*int_fun)(int);
uint32 (*uint_fun)(uint);
float (*float_fun)(float);
/* ... etc. until all basic types/structs in the
program are accounted for. */
} function;
} delegate;
Usage Example:
void mapint(struct fun f, int arr[20]) {
int i = 0;
if(f.fun_type == INT_FUN) {
for(; i < 20; i++) {
arr[i] = f.function.int_fun(arr[i]);
}
}
}
Unfortunately, there are some obvious downsides to this approach to delegates:
No type checks, save those which you do yourself by checking the 'fun_type' field.
Type checks introduce extra conditionals into your code, making it messier and more branchy than before.
The number of (safe) possible permutations of the function is limited by the size of the 'fun_type' variable.
The enum and list of function pointer definitions would have to be machine generated. Anything else would border on insanity, save for trivial cases.
Going through ordinary C, sadly, is not as efficient as, say a mov -> call sequence, which could probably be done in assembly (with some difficulty).
Does anyone know of a better way to do something like delegates in C?
Note: The more portable and efficient, the better
Also, Note: I've heard of Don Clugston's very fast delegates for C++. However, I'm not interested in C++ solutions--just C .
You could add a void* argument to all your functions to allow for bound arguments, delegation, and the like. Unfortunately, you'd need to write wrappers for anything that dealt with external functions and function pointers.
There are two questions where I have investigated techniques for something similar providing slightly different versions of the basic technique. The downside of this is that you lose compile time checks since the argument lists are built at run time.
The first is my answer to the question of Is there a way to do currying in C. This approach uses a proxy function to invoke a function pointer and the arguments for the function.
The second is my answer to the question C Pass arguments as void-pointer-list to imported function from LoadLibrary().
The basic idea is to have a memory area that is then used to build an argument list and to then push that memory area onto the stack as part of the call to the function. The result is that the called function sees the memory area as a list of parameters.
In C the key is to define a struct which contains an array which is then used as the memory area. When the called function is invoked, the entire struct is passed by value which means that the arguments set into the array are then pushed onto the stack so that the called function sees not a struct value but rather a list of arguments.
With the answer to the curry question, the memory area contains a function pointer as well as one or more arguments, a kind of closure. The memory area is then handed to a proxy function which actually invokes the function with the arguments in the closure.
This works because the standard C function call pushes arguments onto the stack, calls the function and when the function returns the caller cleans up the stack because it knows what was actually pushed onto the stack.
New EE with very little software experience here.
Have read many questions on this site over the last couple years, this would be my first question/post.
Haven't quite found the answer for this one.
I would like to know the difference/motivation between having a function modify a global variable within the body (not passing it as a parameter), and between passing the address of a variable.
Here is an example of each to make it more clear.
Let's say that I'm declaring some functions "peripheral.c" (with their proper prototypes in "peripheral.h", and using them in "implementation.c"
Method 1:
//peripheral.c
//macros, includes, etc
void function(*x){
//modify x
}
.
//implementation.c
#include "peripheral.h"
static uint8 var;
function(&var); //this will end up modifying var
Method 2:
//peripheral.c
//macros, includes, etc
void function(void){
//modify x
}
.
//implementation.c
#include "peripheral.h"
static uint8 x;
function(); //this will modify x
Is the only motivation to avoid using a "global" variable?
(Also, is it really global if it just has file scope?)
Hopefully that question makes sense.
Thanks
The function that receives a parameter pointing to the variable is more general. It can be used to modify a global, a local or indeed any variable. The function that modifies the global can do that task and that task only.
Which is to be preferred depends entirely on the context. Sometimes one approach is better, sometimes the other. It's not possible to say definitively that one approach is always better than the other.
As for whether your global variable really is global, it is global in the sense that there is one single instance of that variable in your process.
static variables have internal linkage, they cannot be accessed beyond the translation unit in which they reside.
So if you want to modify a static global variable in another TU it will be have to be passed as an pointer through function parameter as in first example.
Your second example cannot work because x cannot be accessed outside implementation.c, it should give you an compilation error.
Good Read:
What is external linkage and internal linkage?
First of all, in C/C++, "global" does mean file scope (although if you declare a global in a header, then it is included in files that #include that header).
Using pointers as parameters is useful when the calling function has some data that the called function should modify, such as in your examples. Pointers as parameters are especially useful when the function that is modifying its input does not know exactly what it is modifying. For example:
scanf("%d", &foo);
scanf is not going to know anything about foo, and you cannot modify its source code to give it knowledge of foo. However, scanf takes pointers to variables, which allows it to modify the value of any arbitrary variable (of types it supports, of course). This makes it more reusable than something that relies on global variables.
In your code, you should generally prefer to use pointers to variables. However, if you notice that you are passing the same chunk of information around to many functions, a global variable may make sense. That is, you should prefer
int g_state;
int foo(int x, int y);
int bar(int x, int y);
void foobar(void);
...
to
int foo(int x, int y, int state);
int bar(int x, int y, int state);
void foobar(int state);
...
Basically, use globals for values that should be shared by everything in the file they are in (or files, if you declare the global in a header). Use pointers as parameters for values that should be passed between a smaller group of functions for sharing and for situations where there may be more than one variable you wish to do the same operations to.
EDIT: Also, as a note for the future, when you say "pointer to function", people are going to assume that you mean a pointer that points to a function, rather than passing a pointer as a parameter to a function. "pointer as parameter" makes more sense for what you're asking here.
Several different issues here:
In general, "global variables are bad". Don't use them, if you can avoid it. Yes, it preferable to pass a pointer to a variable so a function can modify it, than to make it global so the function can implicitly modify it.
Having said that, global variables can be useful: by all means use them as appropriate.
And yes, "global" can mean "between functions" (within a module) as well as "between modules" (global throughout the entire program).
There are several interesting things to note about your code:
a) Most variables are allocated from the "stack". When you declare a variable outside of a function like this, it's allocated from "block storage" - the space exists for the lifetime of the program.
b) When you declare it "static", you "hide" it from other modules: the name is not visible outside of the module.
c) If you wanted a truly global variable, you would not use the keyword "static". And you might declare it "extern uint8 var" in a header file (so all modules would have the definition).
I'm not sure your second example really works, since you declared x as static (and thus limiting its scope to a file) but other then that, there are some advantages of the pointer passing version:
It gives you more flexibility on allocation and modularity. While you can only have only one copy of a global variable in a file, you can have as many pointers as you want and they can point to objects created at many different places (static arrays, malloc, stack variables...)
Global variables are forced into every function so you must be always aware that someone might want to modify them. On the other hands, pointers can only be accessed by functions you explicitely pass them to.
In addition to the last point, global variables all use the same scope and it can get cluttered with too many variables. On the other hand, pointers have lexical scoping like normal varialbes and their scope is much more restricted.
And yes, things can get somewhat blurry if you have a small, self contained file. If you aren't going to ever instantiate more then one "object" then sometimes static global variables (that are local to a single file) work just as well as pointers to a struct.
The main problem with global variables is that they promote what's known as "tight coupling" between functions or modules. In your second design, the peripheral module is aware of and dependent on the design of implementation, to the point that if you change implementation by removing or renaming x, you'll break peripheral, even without touching any its code. You also make it impossible to re-use peripheral independently of the implementation module.
Similarly, this design means function in peripheral can only ever deal with a single instance of x, whatever x represents.
Ideally, a function and its caller should communicate exclusively through parameters, return values, and exceptions (where appropriate). If you need to maintain state between calls, use a writable parameter to store that state, rather than relying on a global.
I'm very curious to know why exactly C89 compilers will dump on you when you try to mix variable declarations and code, like this for example:
rutski#imac:~$ cat test.c
#include <stdio.h>
int
main(void)
{
printf("Hello World!\n");
int x = 7;
printf("%d!\n", x);
return 0;
}
rutski#imac:~$ gcc -std=c89 -pedantic test.c
test.c: In function ‘main’:
test.c:7: warning: ISO C90 forbids mixed declarations and code
rutski#imac:~$
Yes, you can avoid this sort of thing by staying away from -pedantic. But then your code is no longer standards compliant. And as anybody capable of answering this post probably already knows, this is not just a theoretical concern. Platforms like Microsoft's C compiler enforce this quick in the standard under any and all circumstances.
Given how ancient C is, I would imagine that this feature is due to some historical issue dating back to the extraordinary hardware limitations of the 70's, but I don't know the details. Or am I totally wrong there?
The C standard said "thou shalt not", because it was not allowed in the earlier C compilers which the C89 standard standardized. It was a radical enough step to create a language that could be used for writing an operating system and its utilities. The concept probably simply wasn't considered - no other language at the time allowed it (Pascal, Algol, PL/1, Fortran, COBOL), so C didn't need to either. And it probably makes the compiler slightly harder to handle (bigger), and the original compilers were space-constrained by the 64 KiB code and 64 KiB data space allowed in the PDP 11 series (the big machines; the littler ones only allowed 64 KiB for both code and data, AFAIK). So, extra complexity was not a good idea.
It was C++ that allowed declarations and variables to be interleaved, but C++ is not, and never has been, C. However, C99 finally caught up with C++ (it is a useful feature). Sadly, though, Microsoft never implemented C99.
It was probably never implemented that way, because it was never needed.
Suppose you want to write something like this in plain C:
int myfunction(int value)
{
if (value==0)
return 0;
int result = value * 2;
return result;
}
Then you can easily rewrite this in valid C, like this:
int myfunction(int value)
{
int result;
if (value==0)
return 0;
result = value * 2;
return result;
}
There is absolutely no performance impact by first declaring the variable, then setting its value.
However, in C++, this is not the case anymore.
In the following example, function2 will be slower than function1:
double function1(const Factory &factory)
{
if (!factory.isWorking())
return 0;
Product product(factory.makeProduct());
return product.getQuantity();
}
double function2(const Factory &factory)
{
Product product;
if (!factory.isWorking())
return 0;
product = factory.makeProduct();
return product.getQuantity();
}
In function2 the product variable needs to be constructed, even when the factory is not working.
Later, the factory makes the product and then the assignment operator needs to copy the product (from the return value of makeProduct to the product variable). In function1, product is only constructed when the factory is working, and even then, the copy constructor is called, not the normal constructor and assignment operator.
However, I would expect nowadays that a good C++ compiler would optimize this code, but in the first C++ compilers this probably wasn't the case.
A second example is the following:
double function1(const Factory &factory)
{
if (!factory.isWorking())
return 0;
Product &product = factory.getProduct();
return product.getQuantity();
}
double function2(const Factory &factory)
{
Product &product;
if (!factory.isWorking())
return 0;
product = factory.getProduct(); // Invalid. You can't assign to a reference.
return product.getQuantity();
}
In this example, function2 is simply invalid. References can only be assigned a value at declaration time, not later.
This means that in this example, the only way to write valid code is to write the declaration at the moment where the variable is really initialized. Not sooner.
Both examples show why it was really needed in C++ to allow variable declarations after other executable statements, and not in the beginning of the block like in C.
This explains why this was added to C++, and not to C (and other languages) where it isn't really needed.
It is similar to requiring functions to be declared before they are used - it allows a simple-minded compiler to operate in one pass, from top to bottom, emitting object code as it goes.
In this particular case, the compiler can go through the declarations, adding up the stack space required. When it reaches the first statement, it can output the code to adjust the stack, allocating space for the locals, immediately before the start of the function code proper.
It is much easier to write a compiler for language which requires all variables to be declared at the start of function. Some languages even require variables to be declared in specific clause outside of function code (Pascal and Smalltalk come to mind).
The reason is it's easier to map this variables to stack (or registers if your compiler is smart enough) if they are known and don't change.
Any other statements (esp. function calls) may modify stack/registers, making variable mapping more complex.
Im very new to C programing, but have limited experience with Python, Java, and Perl. I was just wondering what the advantage is of having the prototype of a function above main() and the definition of that function below main() rather than just having the definition of said function above main(). From what I have read, the definition can act as a prototype as well.
Thanks in advance,
The use of a prototype above main() (within a single module) is mostly a matter of personal preference. Some people like to see main() at the bottom of the file; other people like to see it at the top. I had a professor in university who complained that I wrote my programs "upside-down" because I put main() at the bottom (and avoided having to write and maintain prototypes for everything).
There is one situation where a prototype may be required:
void b(); // prototype required here
void a()
{
b();
}
void b()
{
a();
}
In this mutually recursive situation, you need at least one of the prototypes to appear prior to the definition of the other function.
When someone else wants to use your library, you can give them the header (containing the prototypes), without giving them your code.
Example? Windows SDK.
If two functions call each other, you will need to have a prototype for at least one of them.
There is no performance impact* on the compiled code.
*Note: Well, there could be, I guess. If the compiler decides to put the method somewhere else in the binary, then you could theoretically see locality issues affecting the performance of your application. It's not something to normally worry about, though.
There's no advantage in splitting a function into a standalone prototype and a definition. In fact, there's a clear and explicit disadvantage to that approach, since it requires extra maintenance efforts to keep the prototype and the definition in perfect sync. For this reason, a reasonable approach would be to provide a separate prototype only when you have to. For example, with external functions declared in header files you have no other choice but to keep the prototype and definition separate.
As for functions internal to a single translation unit (the ones that we'd normally declare static) there's absolutely no reason to create a standalone prototype, since the definition of the function itself includes a prototype. Just define your functions in the natural order: low level functions first, high level functions last, and that's it. If your program consists of a single translation unit, the main function will end up at the very bottom.
The only case you'd have to declare a standalone prototype in this approach would be if some form of recursion was involved.
If you write your code in the correct order, i.e. main last, then there is only one definition in your code for each function. This prevents having to change the code in two, or more, places if changes ever need to be made. That prevents errors and reduces the work to maintain it. The principle is called "single source of truth"