I'm reading The C Book to try to get a better foundation in C. While I think I'm generally getting the concept of pointers, one thing that sticks out to me is that a pointer seems to turn whatever it points to into a kind of global variable (e.g. the ability to use pointers to return values from void functions), which I assume carries all the attendant dangers.
Aside from the fact that a pointer references a specific variable or index in an array, what is the difference between a pointer and a global variable?
They're quite different beasts. To better explain, let me define both.
Pointers:
A variable holds some piece of data. A pointer is a type of data that refers to another piece of memory. Think of it as a sign that says "Over there ---->" pointing at an object of some sort. For example, a string in C is just a pointer to a character, and by convention you know that more characters follow it until a \0 character. C uses pointers extensively, since there's no other mechanism for sharing common information between parts of the program, except for....
Global Variables:
In a program, you have variables in each function. These can be the parameters to the function, and ones defined inside. As well, you have what are known as global variables. These variables store information that all the functions in a file can access. This can be useful to pass things like a global state around, or configuration. For example, you might have one called debug that your code checks before printing some messages, or to store a global state object, like the score in a video game.
What I think is confusing you: Both can be used to share information between parts of code. Because function arguments are passed by value in C, a function can't modify the variables of its caller. There are two ways to "fix" that problem. The first (and correct) way is to pass a pointer to the variable into the function. That way, the function knows where to find the caller's variable and can modify it.
Another approach is to just use a global variable. That way, instead of passing around pointers, they just edit the global variables directly.
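To make the two approaches concrete, here is a minimal sketch. The names add_to_copy, add_to, add_to_global and g_total are made up for illustration; they don't come from The C Book.

```c
#include <assert.h>
#include <stddef.h>

int g_total = 0;   /* hypothetical global, reachable from every function in the file */

/* By value: `total` is a private copy, so the caller's variable is untouched. */
void add_to_copy(int total, int amount)
{
    total += amount;
    (void)total;   /* the result is simply discarded when the function returns */
}

/* By pointer: the caller chooses which variable gets modified. */
void add_to(int *total, int amount)
{
    assert(total != NULL);
    *total += amount;
}

/* Via a global: no argument needed, but it can only ever touch g_total. */
void add_to_global(int amount)
{
    g_total += amount;
}
```

Calling add_to_copy(mine, 5) leaves mine alone, add_to(&mine, 5) changes it, and add_to_global(5) changes g_total no matter who calls it.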
So you can use both of them to accomplish the same thing, but how they work is quite separate. In fact, a global variable can be a pointer.
A global variable is any variable that is accessible in any scope. A pointer is a variable that contains the address where something lives.
They aren't directly related to each other in any way.
A pointer variable can be in global or local scope and can also point to a variable that is in global, local, or no scope (as if it were coming off of the heap or addressing some DIO lines).
There's a huge difference. Aside from the "other" uses of pointers (which include dealing with strings and arrays, and building dynamic data structures like trees and linked lists), using a pointer to give another function access to a local variable is much more flexible and controlled than sharing a global variable between these two functions.
Firstly, it allows the called function to be provided access to different variables at different times. Think how much more laborious it would be to use scanf() if it always saved its results into the same global variables.
Secondly, passing a pointer to another function makes you much more aware of the fact that that function will be able to modify the object. If you use a global variable for the same purpose, it is easy to forget which functions modify the global and which do not.
Thirdly, global variables consume memory for the life of your program. Local variables are released when their containing function ends, and dynamically-allocated data is released when it is freed. So global variables can at times be a considerable waste of memory.
Using pointers leads to the danger of referring to variables that no longer exist, so care has to be taken. But this is most often a problem when there are complicated global or long-lived data structures which in itself is often a design weakness.
Globals just get in the way of good, modular program design and pointers often provide a better way to achieve the same things.
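As a rough illustration of the first point (the same function serving different variables at different times), here is a small scanf-like helper. parse_int is a hypothetical name, and the sketch assumes the input is a plain decimal string.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses a decimal integer from text into *out.
   Returns 1 on success, 0 if text is not a valid int (*out is then left alone). */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    assert(text != NULL && out != NULL);
    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE ||
        value < INT_MIN || value > INT_MAX) {
        return 0;
    }
    *out = (int)value;   /* write into whichever variable the caller pointed at */
    return 1;
}
```

The caller can write parse_int("640", &width) and then parse_int("480", &height); with a single global result variable, each call would have to be followed by a manual copy.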
"Pointer" is a variable that tells you how to get to a value: it's the address of the value you care about. You dereference it (with *) to get to the value.
"Global" defines the scope of the variable: anywhere in the program can say the name and get the value.
You can have local pointers, or global non-pointers. The concepts are completely orthogonal.
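A small sketch of the four combinations, with made-up names (score, score_ptr, demo_scopes), might look like this:

```c
#include <assert.h>

int score = 0;             /* global scope, not a pointer */
int *score_ptr = &score;   /* global scope AND a pointer */

int demo_scopes(void)
{
    int lives = 7;         /* local, not a pointer */
    int *p = &lives;       /* local pointer to a local variable */

    *p = 8;                /* dereference: writes to lives */
    assert(lives == 8);

    p = &score;            /* the same local pointer can point at a global */
    *p = 3;
    assert(*score_ptr == 3);

    return lives + score;
}
```

Whether a variable is a pointer and where it is visible are decided independently, which is why all four declarations compile side by side.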
The term pointer refers to a variable's type; it is a variable used to refer to another. The term global refers to a variable's scope, i.e. its visibility from any part of a program. Therefore the question is somewhat nonsensical, since the two terms refer to different kinds of variable attribute; a pointer variable may in fact have global scope, and so have both attributes simultaneously.
While a pointer may indeed refer to an object that is not directly in scope (which is what I think you are referring to), it still allows restricted control of scope, because the pointer itself has scope (unless of course it is a global pointer!).
Moreover, a global variable always has static storage duration, whereas a pointer may refer to a static, dynamically allocated, or automatic object. And because the pointer is itself a variable, it may be static or automatic, or, in the case of a dynamically allocated array of pointers, dynamic as well.
I think perhaps that you are considering only a very specific use of pointers when in fact they have far greater utility and can be used in many ways. For example, you would almost invariably use pointers to implement the links in a linked list data structure; a global variable will not help you do that.
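For instance, a singly linked list could be sketched as follows. The type and function names (node, list_push, list_sum, list_free) are illustrative only.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;   /* link to the following node, or NULL at the end */
};

/* Prepends value to the list at *head. Returns 1 on success, 0 if out of memory. */
int list_push(struct node **head, int value)
{
    struct node *n;

    assert(head != NULL);
    n = malloc(sizeof *n);
    if (n == NULL) {
        return 0;
    }
    n->value = value;
    n->next = *head;
    *head = n;
    return 1;
}

/* Follows the links from node to node and adds up the values. */
int list_sum(const struct node *head)
{
    int sum = 0;

    for (; head != NULL; head = head->next) {
        sum += head->value;
    }
    return sum;
}

/* Releases every node; the list must not be used afterwards. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

The number of nodes is only known at run time, so there is no fixed set of global variables that could stand in for the next pointers.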
Clifford
Completely different concepts. You can have pointers to both global and local variables. There's nothing associating the two.
Also, from a function, you can certainly return a pointer to a variable scoped within that function. But that's a bad idea: the variable lived in the function's stack frame, which is gone once the function returns, so using the pointer afterwards is undefined behavior.
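Two common ways around the dangling-pointer problem are sketched below; the function names are made up, and the broken variant is shown only inside a comment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* BAD (comment only): buf stops existing when the function returns,
   so the caller would receive a dangling pointer.

   char *greeting_dangling(void) { char buf[16] = "hello"; return buf; }
*/

/* Option 1: the caller owns the storage and passes a pointer to it. */
void greeting_into(char *dest, size_t size)
{
    assert(dest != NULL && size > 0);
    strncpy(dest, "hello", size - 1);
    dest[size - 1] = '\0';   /* strncpy does not always terminate the string */
}

/* Option 2: heap storage that outlives the call; the caller must free() it. */
char *greeting_alloc(void)
{
    char *p = malloc(sizeof "hello");

    if (p != NULL) {
        memcpy(p, "hello", sizeof "hello");   /* copies the '\0' too */
    }
    return p;
}
```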
Related
I've actually seen some results by testing it, but I want to know which way is better and why.
Question #1: Do local variables get declared every time when I call that function again and again? I know that it is better to declare variables in the narrowest scope possible. But I can not stop myself thinking about declaring it as a global variable and make it get declared only once, not in every function call. Or, does it get declared again in every function call? I know that the scope of a local variable is only that function. So when it leaves that function, it must forget that variable as it is going out of its scope right?
Question #2: When I have function variables that need to keep their previous contents (e.g. timer counter variables), which way is better: declaring them as global variables or as static local variables? I don't need them to be reset to their initial values whenever I call the function; I already set them to zero, etc., whenever I need to.
Question #1: Do local variables get declared every time when I call that function again and again?
A1: Yes, but it's not an issue really.
Declaring a local variable means that space is made for that variable on the stack, within the stack frame of that function. Declaring a variable global means that space is made for that variable in the data section of the executable (if the variable is initialized), or the BSS section (if not).
Allocating on the stack comes at virtually zero cost. At function entry, the stack frame is sized to make room for all local variables of the function. One more or less does not matter. Statically allocating (for a global variable) is a tad quicker, but you only get that one variable. This can become a huge issue at some later point, e.g. if you want to make your program multithreaded, your function re-entrant, or your algorithm recursive. It can also become a major hassle during debugging, costing you hours of unproductive time while you hunt down a bug.
(This is the main point of it all: The performance difference is really negligible. The time you can waste on a suboptimal design riddled with globals, on the other hand, can be quite significant.)
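To illustrate the "you only get that one variable" problem, here is a deliberately broken sketch next to a working one. Both functions and the global g_current are invented for this example.

```c
#include <assert.h>

/* Locals: every active call has its own `n` and `rest` in its own stack frame. */
int sum_to(int n)
{
    int rest;

    assert(n >= 0);
    if (n == 0) {
        return 0;
    }
    rest = sum_to(n - 1);
    return n + rest;
}

static int g_current;   /* hypothetical single shared slot */

/* Deliberately broken: the value saved before the recursive call is
   overwritten by the inner calls, because they all share g_current. */
int sum_to_shared(int n)
{
    int rest;

    assert(n >= 0);
    if (n == 0) {
        return 0;
    }
    g_current = n;
    rest = sum_to_shared(n - 1);   /* inner calls overwrite g_current */
    return g_current + rest;       /* no longer adds this call's n */
}
```

sum_to(4) gives the expected 10, while sum_to_shared(4) gives 4, because by the time each call adds g_current it has been reset to 1 by the innermost call.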
Question #2: [...] which way is better: to declare them as a global variable or to declare them as a static local variable?
A2: From an architectural standpoint, avoid globals wherever possible. There are a few specific cases where they make sense, but you know them when you see them. If you can make it work without globals, avoid them. (The same is true, actually, for static locals. They are better than globals as they are limited in scope, and there are cases where they make sense, but local variables should really be the "default" in your mind.)
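If a function really does need to remember something between calls, a static local might look like this minimal sketch (next_ticket is a made-up example, not the asker's timer code):

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 on the first call, 2 on the second, and so on. */
unsigned next_ticket(void)
{
    static unsigned issued = 0;   /* initialized once; keeps its value between calls */

    assert(issued < UINT_MAX);    /* guard against wrapping around to 0 */
    issued++;
    return issued;
}
```

Unlike a global, issued cannot be read or changed by any other function, but it still shares the drawbacks mentioned above for threads and re-entrancy.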
Global variable - declared at the start of the program, their global scope means they can be used in any procedure or subroutine in the program
It is seldom advisable to use Global variables as they are liable to cause bugs, waste memory and can be hard to follow when tracing code. If you declare a global variable it will continue to use memory whilst a program is running even if you no longer need/use it.
Local variable - declared within subroutines or programming blocks, their local scope means they can only be used within the subroutine or program block they were declared in
Local variables are created within a limited scope: they come into existence when a function or subroutine is called, and once the function ends, the memory taken up by the variable is released. This contrasts with global variables, which keep their memory for as long as the program runs.
Question #1: Yes. Local variables are created each time you call the function, and once the function returns they are gone. Note that in C, names are resolved at compile time, so a narrowly scoped variable is not found any faster at run time; the benefit of declaring it in the narrowest scope is that the code is easier to follow and less prone to bugs and mistakes.
Question #2: If you use the same variable in different functions, I suggest declaring it as a global (or using a define), so that the program carries your "counter" along and every scope that needs it can reach it quickly.
But after these conditions I must strongly suggest you to:
avoid globals wherever possible (as @DevSolar said)
Q2:
It is usually preferable to use static variables in your function. The main reason is that, since all functions can access global variables, globals make it very hard to keep track of state and to debug your program.
Q1:
Yes, local variables are created every time the function they belong to is run, and destroyed when the function ends.
Suppose your program has 5 functions (that are rarely used), and each function uses 6 local variables. If you change them all to global variables, you will have all 30 variables taking up space for the entire duration of your program, instead of only one function's 6 variables occasionally being created and destroyed. Moreover, allocation does not really take much time.
This is probably going to be a stupid question, but I am reviewing some code and I just don't see the point in what this guy is doing. In one C file he has defined a global structure that has many elements of many types. From function "A" there is a call to function "B". The call passes a pointer to the global structure, and then in function "B" some stuff is done and part of the global is updated. Now this all seems like superfluous overkill since it is already a global. If the structure were local to function "A", I could totally see passing the address of the structure into function "B". However, the memory is already permanently allocated at the very top of the C file. In fact, I can argue that there is a potential problem: someone else could come in, change something, and not realize they have created a bug.
So I am sure there is a "good coding practice" BKM or something like that for doing this but I just can't see it. So in short, why create an address pointer and pass that to a function unnecessarily when the variable is already a global?
Passing the pointer is good style, primarily because globals are bad style. Perhaps the original developer is thinking about the possibility that the global may not be global, or the function that accepts it might possibly operate on a different variable (which may or may not also be global, but still needs to be identified).
If the structure instance is global and both code files can access it, then passing it around is obviously unnecessary. But the previous developer may have planned to create other instances, in which case a function tied to the global could not have been reused.
It's good practice to pass references to the structure between functions, but if there is no plan for a large code change in the future, using the global directly is not a bad idea.
Function B was most likely being written with an eye towards reusability, and for whatever reason was never actually re-used.
Ideally, functions should communicate with each other exclusively through parameters and return values (and exceptions, where supported), rather than sharing global data. This allows you to more easily re-use code in other programs where the global data variables are not present (or have different names).
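A minimal sketch of that style, using an invented struct config and function name since the reviewed code isn't shown:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure standing in for the one in the reviewed code. */
struct config {
    int retries;
    int verbose;
};

struct config g_config = { 3, 0 };   /* the file-scope instance */

/* Works on whichever instance the caller hands in; it never names g_config. */
void config_set_retries(struct config *cfg, int retries)
{
    assert(cfg != NULL);
    cfg->retries = retries < 0 ? 0 : retries;   /* clamp negative values to 0 */
}
```

Because the function only sees its parameter, the same code can update g_config today and a second instance (for a unit test, say) tomorrow, without any edits.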
If you're really squeezed for stack space, or have some other real technical limitation that makes using global data a significantly more attractive / less expensive option than passing arguments around, then globals are the right answer, but that should be rare.
I am creating a decompiler from IL (Compiled C#\VB code). Is there any way to create reference in C?
Edit:
I want something faster than pointer like stack. Is there a thing like that?
A reference is just a syntactically sugar-coated pointer; a pointer will do just fine.
Stack and pointer are two completely independent concepts.
A reference is just like a pointer, a way to access/pass a variable without copying it.
On the other hand, stack and heap are two different places where variables live.
The decision whether or not a variable should live on the stack or on the heap is totally independent from the way you pass it around.
If you need a local variable, with a lifetime automatically coupled to your function scope declare it on the stack. Allocation is fast, but the object is gone when the function scope ends. Taking this into account, you can pass the variable by value or by pointer to other functions.
If you need a variable that survives the function scope, you need to make it global (or static), or allocate it dynamically on the heap. Heap allocation is a bit slower, but once the object exists you can use it like any other. Again, you can pass it by value or by pointer. (Bear in mind that you need to de-allocate dynamically created objects eventually.)
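The two lifetimes can be sketched side by side like this (sum_squares and make_squares are illustrative names only):

```c
#include <assert.h>
#include <stdlib.h>

/* Automatic storage: `total` exists only during the call; its value is copied out. */
int sum_squares(int n)
{
    int total = 0;
    int i;

    for (i = 1; i <= n; i++) {
        total += i * i;
    }
    return total;
}

/* Heap storage: the array outlives the call. Returns NULL if allocation
   fails; otherwise the caller must free() the result. */
int *make_squares(size_t count)
{
    int *squares;
    size_t i;

    assert(count > 0);
    squares = malloc(count * sizeof *squares);
    if (squares == NULL) {
        return NULL;
    }
    for (i = 0; i < count; i++) {
        squares[i] = (int)((i + 1) * (i + 1));
    }
    return squares;
}
```

Either result can then be handed to other functions by value or through a pointer; how it is passed has nothing to do with where it was allocated.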
If heap allocation is indeed a performance bottleneck, you should make sure that you use automatic variables (on stack) where possible. Then, do profiling of your allocation patterns. And finally optimize your allocation strategy.
Yesterday I had an interview where the interviewer asked me about the storage classes where variables are stored.
My answer was:
Local variables are stored on the stack.
Register variables are stored in registers.
Global and static variables are stored in the data segment.
Dynamically allocated memory is stored on the heap.
The next question he asked me was: why are they getting stored in those specific memory area? Why is the Local variable not getting stored in register (though I need an auto variable getting used very frequently in my program)? Or why global or static variables are not getting stored in stack?
Then I was clueless. Please help me.
Because the storage area determines the scope and the lifetime of the variables.
You choose a storage specification depending on your requirement, i.e:
Lifetime: The duration you expect the particular variable needs to be alive and valid.
Scope: The scope(areas) where you expect the variable to be accessible.
In short, each storage area provides a different functionality and you need various functionality hence different storage areas.
The C language does not define where any variables are stored, actually. It does, however, define three storage durations (often loosely called storage classes): static, automatic, and allocated (dynamic). C11 adds a fourth, thread storage duration.
Static variables are created during program initialization (prior to main()) and remain in existence until program termination. File-scope ('global') and static variables fall under this category. While these are commonly stored in the data segment, the C standard does not require this to be the case, and in some cases (e.g. C interpreters) they may be stored in other locations, such as the heap.
Automatic variables are local variables declared in a function body. They are created when or before program flow reaches their declaration, and destroyed when they go out of scope; new instances of these variables are created for recursive function invocations. A stack is a convenient way to implement these variables, but again, it is not required. You could implement automatics in the heap as well, if you chose, and they're commonly placed in registers as well. In many cases, an automatic variable will move between the stack and registers during its lifetime.
Note that the register annotation for automatic variables is a hint - the compiler is not obligated to do anything with it, and indeed many modern compilers ignore it completely.
Finally, dynamic objects (there is no such thing as a dynamic variable in C) refer to values created explicitly using malloc, calloc or other similar allocation functions. They come into existence when explicitly created, and are destroyed when explicitly freed. A heap is a convenient place to put these - or rather, one defines a heap based on the ability to do this style of allocation. But again, the compiler implementation is free to do whatever it wants. If the compiler can perform static analysis to determine the lifetime of a dynamic object, it might be able to move it to the data segment or stack (however, few C compilers do this sort of 'escape analysis').
The key takeaway here is that the C language standard only defines how long a given value is in existence for. And a minimum bound for this lifetime at that - it may remain longer than is required. Exactly how to place this in memory is a subject in which the language and library implementation is given significant freedom.
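The three lifetimes can be observed without knowing anything about where the objects actually live. This is only a sketch with made-up names:

```c
#include <assert.h>
#include <stdlib.h>

int calls_total;   /* static storage duration; starts out as 0 */

/* Returns how many times this function has been called so far. */
int count_call(void)
{
    static int calls_static;   /* static duration, but visible only in here */
    int calls_auto = 0;        /* automatic: a fresh object on every call */

    calls_total++;
    calls_static++;
    calls_auto++;

    assert(calls_auto == 1);   /* the automatic variable never accumulates */
    return calls_static;
}

/* Allocated storage: lives from malloc() until free(), regardless of scope.
   Returns NULL if allocation fails. */
int *new_counter(void)
{
    int *c = malloc(sizeof *c);

    if (c != NULL) {
        *c = 0;
    }
    return c;
}
```

Nothing in this code depends on a stack, a heap or a data segment existing; it relies only on the lifetimes the standard guarantees.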
It is actually just an implementation detail that is convenient.
The compiler could, if it wanted to, create local variables on the heap.
It is just easier to create them on the stack since when leaving a function you can adjust the frame pointer with a simple add/subtract depending on the growth direction of the stack and so automatically free the used space for the next function. Creating locals on the heap however would mean more house-keeping work.
Another point is that local variables need not be created on the stack at all; they can be stored and used just in a register if the compiler thinks that's more appropriate and has enough registers to do so.
Local variables are stored in registers in most cases. Because registers are pushed onto and popped from the stack when you make function calls, it looks like they are on the stack.
There is actually no such thing as register variables, because register is just a rarely used keyword in C that tells the compiler to try to put the variable in a register. I think that most compilers just ignore this keyword.
That's why he asked you more: he was not sure whether you understood the topic deeply. The fact is that register variables are virtually on the stack.
In embedded systems we have different types of memory to use: read-only non-volatile (ROM, PROM), read-write non-volatile (EEPROM, NVRAM, flash), and volatile (RAM, such as SRAM). We also have different requirements for our data: it cannot change and must persist across power cycles, it can change and must persist across power cycles, or it can change at any time. We have different sections because we have to map those data requirements onto the available memory types optimally.
I have seen some code in which the arguments passed to the function by value was being modified or assigned a new value and was being used like a local variable.
Is it a good thing to do? Are there any pitfalls of doing this or is it Ok to code like this?
Essentially, a parameter of a function is a local variable, so this practice is not bad in principle.
On the other hand, doing this can lead to maintenance headaches. If another programmer comes along later, he might expect the variable to hold the passed in value, and the change will cause a bug.
One justification for reusing the variable is for a misguided notion of efficiency of memory usage. Actually, it can't improve efficiency, and can decrease it. The reason is that the compiler can automatically detect if it is useful to use the same register for two different variables at two different times, and will do it if it is better. But the programmer should not make that decision for the compiler. That will limit the choices the compiler can make.
The safest practice is to use a new variable if it needs a new value, and rely on the compiler to make it efficient.
No problems at all that I can think of. The arguments will either be placed in the current stack frame or in registers just like any other local variable. Make sure that the arguments really are passed by value, however. In particular, an array argument is not copied: the function receives a pointer to the array's first element, so changes to the elements are visible to the caller.
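A short sketch of the difference, using a made-up zero_fill function: the count parameter is consumed like a local variable, while writes through the array parameter reach the caller.

```c
#include <assert.h>
#include <stddef.h>

/* `count` is a private copy, so using it as a loop counter is invisible to
   the caller. `values` is a copy of a pointer to the caller's first element,
   so writes through it change the caller's array. */
void zero_fill(int *values, size_t count)
{
    assert(values != NULL || count == 0);
    while (count > 0) {
        count--;              /* modifies only the local copy */
        values[count] = 0;    /* modifies the caller's array */
    }
}
```

After size_t n = 3; zero_fill(data, n); the caller's n is still 3, but every element of data is 0.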