Where does local const variable will get stored? I have verified that, every where in function where const variable is used, get replaced with its value(like immediate value addressing mode). But if pointer is assigned to it then it gets stored on stack. Here I do not understand one thing how processor knows its constant value. Is there any read only section in stack like it present in .data section?
Generally, the processor does not know that an object is declared const in C.
Systems commonly have regions of memory that are marked read-only after a program is loaded, and static const objects are stored in such memory. For these objects, the processor enforces the read-only property.
Systems generally do not have read-only memory used for stack. This would be inherently difficult—the memory would need to be read-write when a function is starting, so that its stack frame can be constructed, but read-only at other times. So the program would be frequently changing the hardware memory protection settings. This would impair performance and is generally not considered worth while.
So programs generally have only a read-write stack available. When you declare an automatic (rather than static) const object, where can the compiler put it? As you note, it is often optimized into an immediate operand in instructions. However, when you take its address, it must have an address, so it must be in memory.
One idea might be that, since it is const, it will not chamge, so we only need one copy, so it can be stored in the static read-only section instead of on the stack. However, the C standard says that each different object has a different address. To comply with that requirement, the compiler has to create a different instance of the object in memory each time it is created in the C code. Putting it on the stack is an easy way to do this.
I think it totally depends on your tool-chain specific implementation. Variables are stored in RAM, program in Flash memory and constants either in RAM or Flash.
Correct me if I'm wrong.
Related
I have a weird question, and i am not sure if i will be able to explain it but here we go. While learning C and using it you usually come across the term "trash" or "garbage" value, my first question to that is, is data left over data in that memory address from some different program or of anything or is it actually some 'random' value , if i take that it is true that is leftover value in that memory address why are we still able to read from such memory address, i mean lets assume we just declare int x; and it is now stored in bss on some memory address #, and we were to output its value we would get the value of that resides on that address, so if all the things i said are true, doesnt that allow for us to declare many many many variables but only declare and not initialize perhaps we can map all the values previously stored in bss from some program from before etc.
I am mostly likely sure that this would be a big security threat and thus i know there is probably some measure against it but i want to know what prevents this?
No, the contents of the .bss section are zeroed out before your program starts. This is to satisfy C's guarantee that global and static variables, if not explicitly initialized, will be initialized to zero.
Indeed, on a typical multitasking system, all memory allocated by your process will be zeroed by the operating system before you are given access to it. This is to avoid precisely the security hole you mention.
The values of local (auto) variables, on the stack, do typically contain "garbage" if not initialized, but it would be garbage left over from the execution of your own program up to this point. If your program happens not to have written anything to that particular location on the stack, then it will still contain zero (again on a typical OS); it will never contain memory contents from other programs.
The same goes for memory allocated by malloc. If it is coming straight from the OS, it contains zeros. If it happens to be a block that was previously allocated and freed, it might contain garbage from your previous use of that memory, or from malloc's internal data, but again it will never contain another program's data.
Nothing in the C language itself prevents you from doing almost exactly as you say. The only thing you said that was wrong, considering only the requirements of C standard, was talking about variables "in bss". Objects with static storage duration and no initializer (which is the standardese equivalent of variables in bss) are guaranteed to be initialized to zero at program startup, so you cannot access the data of no-longer-running programs that way. But, in an environment like good old-fashioned MS-DOS or CP/M, there was nothing whatsoever to stop you from setting a pointer to the base of physical RAM, scanning to the end, and finding data from previous programs.
All modern operating systems for full-featured computers, however, provide memory protection which means, among other things, that they guarantee that no process can read another process's memory, whether or not the other process is still running, except via well-defined APIs that enforce security policy. The "Spectre" family of hardware bugs are a big deal just because they break this guarantee.
The details of how memory protection work are too complex to fit into this answer box, but one of the things that's almost always done is, whenever you allocate more memory from the operating system, that memory is initialized, either to all-bits-zero or to the contents of a file on disk. Either way you can't get at "garbage".
I've been working with a program and I've been trying to conserve bytes and storage space.
I have many variables in my C program, but I wondered if I could reduce the program's size by making some of the variables that don't change throughout the program const or final.
So my questions are these:
Is there any byte save when identifying static variables as constant?
If bytes are saved by doing this, why are they saved? How does the program store the variable differently if it is constant, and why does this way need less storage space?
If bytes are not saved by defining variables as constant, then why would a developer define a variable this way in the first place? Could we not just leave out the const just in case we need to change the variable later (especially if there is no downfall in doing so)?
Are there only some IDEs/Languages that save bytes with constant variables?
Thanks for any help, it is greatly appreciated.
I presume you're working on deeply embedded system (like cortex-M processors).
For these, you know that SRAM is a scarce resource whereas you have plenty of FLASH memory.
Then as much as you can, use the const keyword for any variable that doesn't change. Doing this will tell compiler to store the variable in FLASH memory and not in SRAM.
For example, to store a text on your system you can do this:
const char* const txtMenuRoot[] = { "Hello, this is the root menu", "Another text" };
Then not only the text is stored in FLASH, but also its pointer.
All your questions depend heavily on compiler and environment. A C compiler intended for embedded environment can do a great job about saving memory, while others maybe not.
Is there any byte save when identifying static variables as constant?
Yes, it may be possible. But note that "const", generally, isn't intended to specify how to store a variable - instead its meaning is to help the programmer and the compiler to better understand the source code (when the compiler "understand better", it can produce better object code). Some compiler can use that information to also store the variable in read-only memory, or delete it and turn it into literals in object code. But in the context of your question, may be that a #define is more suitable.
If bytes are saved by doing this, why are they saved? How does the program store the variable differently if it is constant, and why does this way need less storage space?
Variables declared in source code can go to different places in the object code, and different places when an object file is loaded in memory and executed. Note that, again, there are differences on various architectures - for example in a small 8/16 bits MCU (cpu for electronic devices), generally there is no "loading" of an object file. So the value of a variable is stored somewhere - anyway. But at low level the compiler can use literals instead of addresses, and this mostly saves some memory. Suppose you declare a constant variable GAIN=5 in source code. When that variable is used in some formula, the compiler emits something like "LD R12,GAIN" (loads register R12 with the content of the address GAIN, where variable GAIN is stored). But the compiler can also emit "LD R12,#5" (loads the value "5" in R12). In both cases an instruction is needed, but in the second case there is no memory for variables involved. This is a saving, and can also be faster.
If bytes are not saved by defining variables as constant, then why would a developer define a variable this way in the first place? Could we not just leave out the const just in case we need to change the variable later (especially if there is no downfall in doing so)?
As told earlier, the "const" keyword is meant to better define what operations will be done on the variable. This is useful for programmers, for clarity. It is useful to clearly state that a variable is not intended to be modified, especially when the variable is a formal parameter. In some environments, there is actually some read-only memory that can only be read and not written to and, if a variable (maybe a "system variable") is marked as "const", all is clear to the programmer -and- the compiler, which can warn if it encounters code trying to modify that variable.
Are there only some IDEs/Languages that save bytes with constant variables?
Definitely yes. But don't talk about IDEs: they are only editors. And about languages, things are complicated: it depends entirely on implementation and optimization. Likely this kind of saving is used only in compilers (not interpreters), and depends a lot on optimization options/capabilities of the compiler.
Think of const this way (there is no such thing as final or constant in C, so I'll just ignore that). If it's possible for the compiler to save memory, it will (especially when you compile optimizing for size). const gives the compiler more information about the properties of an object. The compiler can make smarter decisions when it has more information and it doesn't prevent the compiler from making the exact same decision as before it had that information.
It can't hurt and may help and it also helps the programmers working with the code to easier reason about it. Both the compiler and the programmer are helped, no one gets hurt. It's a win-win.
Compilers can reduce the memory used based on the knowledge of the code, const help compiler to know the real code behaviour (if you activate warnings you can have suggestions of where to put const).
But a struct can contains unused byte due to alignment restrictions of the hw used and compilers cannot alter the inner order of a struct. This can be done only changing the code.
struct wide struct compact
{ {
int_least32_t i1; int_least32_t i1,
int_least8_t b; i2;
int_least32_t i2; int_least8_t b;
} }
Due to the alignment restrictions the struct wide can have an empty space between members 'b' and 'i2'.
This is not the case in struct compact because the elements are listed from the widest, which can require greater alignments, to the smaller.
In same cases the struct compact leads even to faster code.
I read some topics over optimization and it is mentioned that global variables can not be stored in registers and hence if we need to optimize we use register variable to store the global data and modify that register variable. Is this applies to static variables too?
For auto storage, what if we store auto variables in register variables? Won't it faster the access from register instead of stack?
Both global variables and static variables exist in the data segment, which includes the data, BSS, and heap sections. If the static variable is initialized to 0 or not initialized to anything, it goes in the BSS section. If it is given a non-zero initialization value, then it is in the "data" section. See:
http://en.wikipedia.org/wiki/Data_segment
As for auto vs. register variables: register does not guarantee that the variable will be stored in a register, it is more providing a hint from the programmer. See:
http://www.lix.polytechnique.fr/~liberti/public/computing/prog/c/C/CONCEPT/storage_class.html
Yes, it is (much) faster to access a register than to access the stack memory, but this optimization nowadays is left up to the compiler (the problem of register allocation) as well as the CPU architecture (which has a great many optimizations too complex to explain here).
Unless you're programming for a really simple or old architecture and/or using a really outdated compiler, you probably should not worry about this kind of optimization.
global variables' values can be held in registers for so long as the compiler can prove there is no other access to the stored value. With values that can't be held in a register themselves, declaring a pointer with the restrict keyword declares that a value isn't being accessed via any other means for that pointer's lifetime; just don't give away any copies and the compiler will take care of the rest. For scalars declaring thistype localval=globalval; works at least as well if you're not changing the value or you've got good control over scope exits -- or even better.
You can only use the restrict declaration if the value really won't be accessed otherwise. Optimizers these days can for instance deduce from your declaring the object won't be accessed in one function that a code path that does access it in another won't be executed, and from that deduce the content of the expression used to take that code path, and so on. "If you lie to the compiler, it will have its revenge" is more true today than ever.
Higher level languages such as javascript don't give the programmer a
choice as to where variables are stored. But C does. My question is:
are there any guidelines as to where to store variables, eg dependent
on size, usage, etc.
As far as I understand, there are three possible locations to store
data (excluding code segment used for actual code):
DATA segment
Stack
Heap
So transient small data items should be stored on the stack?
What about data items which must be shared between functions. These
items could be stored on the heap or in the data segment. How do you
decide which to choose?
You're looking through the wrong end of the telescope. You don't specify particular memory segments in which to store a variable (particularly since the very concept of a "memory segment" is highly platform-dependent).
In C code, you decide a variable's lifetime, visibility, and modifiability based on what makes sense for the code, and based on that the compiler will generate the machine code to store the object in the appropriate segment (if applicable)
For example, any variables declared at file scope (outside of any function) or with the keyword static will have static storage duration, meaning they are allocated at program startup and held until the program terminates; these objects may be allocated in a data segment or bss segment. Variables declared within a function or block without the static keyword have automatic storage duration, and are (typically) allocated on the stack.
String literals and other compile-time constant objects are often (but not always!) allocated in a readonly segment. Numeric literals like 3.14159 and character constants like 'A' are not objects, and do not (typically) have memory allocated for them; rather, those values are embedded directly in the machine code instructions.
The heap is reserved for dynamic storage, and variables as such are not stored there; instead, you use a library call like malloc to grab a chunk of the heap at runtime, and assign the resulting pointer value to a variable allocated as described above. The variable will live in either the stack or a data segment, while the memory it points to lives on the heap.
Ideally, functions should communicate solely through parameters, return values, and exceptions (where applicable); functions should not share data through an external variable (i.e., a global). Function parameters are usually allocated on the stack, although some platforms may pass parameters via registers.
You should prefer local/stack variables to global or heap variables when those variables are small, used often and in a relatively small/limited scope. That will give the compiler more opportunities to optimize the code using them as it'll know they aren't going to change between function calls unless you pass around pointers to them.
Also, the stack is usually relatively small and allocating large structures or arrays on it may lead to stack overflows, especially so in recursive code.
Another thing to consider is the use of global variables in multithreaded programs. You want to minimize chances of race conditions and one strategy for that is maiking functions thread-safe and re-enterant by not using any global resources in them directly (if malloc() is thread-safe, if errno is per-thread, etc you can use them, of course).
Btw, using local variables instead of global variables also improves code readability as the variables are located close to the place where they're used and you can quickly find out their type and where and how they're used.
Other than that, if your code is correct, there shouldn't be much practical difference between making variables local or global or in the heap (of course, malloc() can fail and you should remember about it:).
C only allows you to specify where data is stored indirectly... via the scope of the variable and/or allocation. i.e., a local variable to a function is typically a stack variable unless it is declared static in which case it will likely be DATA/BSS. Variables created dynamically via new/malloc will typically be heap.
However, there's no guarantee of any of that... only the implication of it.
That said, the one thing that is guaranteed to be a bad idea is to declare large local variables in functions... common source of strange errors and stack overflows. Very large arrays and structures are best suited to dynamic allocation and keep the pointers in local/global as required.
Yesterday I had an interview where the interviewer asked me about the storage classes where variables are stored.
My answer war:
Local Variables are stored in Stack.
Register variables are stored in Register
Global & static variables are stored in data segment.
The memory created dynamically are stored in Heap.
The next question he asked me was: why are they getting stored in those specific memory area? Why is the Local variable not getting stored in register (though I need an auto variable getting used very frequently in my program)? Or why global or static variables are not getting stored in stack?
Then I was clueless. Please help me.
Because the storage area determines the scope and the lifetime of the variables.
You choose a storage specification depending on your requirement, i.e:
Lifetime: The duration you expect the particular variable needs to be alive and valid.
Scope: The scope(areas) where you expect the variable to be accessible.
In short, each storage area provides a different functionality and you need various functionality hence different storage areas.
The C language does not define where any variables are stored, actually. It does, however, define three storage classes: static, automatic, and dynamic.
Static variables are created during program initialization (prior to main()) and remain in existence until program termination. File-scope ('global') and static variables fall under the category. While these commonly are stored in the data segment, the C standard does not require this to be the case, and in some cases (eg, C interpreters) they may be stored in other locations, such as the heap.
Automatic variables are local variables declared in a function body. They are created when or before program flow reaches their declaration, and destroyed when they go out of scope; new instances of these variables are created for recursive function invocations. A stack is a convenient way to implement these variables, but again, it is not required. You could implement automatics in the heap as well, if you chose, and they're commonly placed in registers as well. In many cases, an automatic variable will move between the stack and heap during its lifetime.
Note that the register annotation for automatic variables is a hint - the compiler is not obligated to do anything with it, and indeed many modern compilers ignore it completely.
Finally, dynamic objects (there is no such thing as a dynamic variable in C) refer to values created explicitly using malloc, calloc or other similar allocation functions. They come into existence when explicitly created, and are destroyed when explicitly freed. A heap is a convenient place to put these - or rather, one defines a heap based on the ability to do this style of allocation. But again, the compiler implementation is free to do whatever it wants. If the compiler can perform static analysis to determine the lifetime of a dynamic object, it might be able to move it to the data segment or stack (however, few C compilers do this sort of 'escape analysis').
The key takeaway here is that the C language standard only defines how long a given value is in existence for. And a minimum bound for this lifetime at that - it may remain longer than is required. Exactly how to place this in memory is a subject in which the language and library implementation is given significant freedom.
It is actually just an implementation detail that is convenient.
The compiler could, if he wanted to, generate local variables on the heap if he wishes.
It is just easier to create them on the stack since when leaving a function you can adjust the frame pointer with a simple add/subtract depending on the growth direction of the stack and so automatically free the used space for the next function. Creating locals on the heap however would mean more house-keeping work.
Another point is local variables must not be created on the stack, they can be stored and used just in a register if the compiler thinks that's more appropriate and has enough registers to do so.
Local variables are stored in registers in most cases, because registers are pushed and poped from stack when you make function calls It looks like they are on stack.
There is actually no such tings as register variables because it is just some rarely used keyword in C that tells compiler to try to put this in registers. I think that most compilers just ignore this keyword.
That why asked you more, because he was not sure if you deeply understand topic. Fact is that register variables are virtually on stack.
in embedded systems we have different types of memories(read only non volatile(ROM), read write non volatile(EEPROM, PROM, SRAM, NVRAM, flash), volatile(RAM)) to use and also we have different requirements(cannot change and also persist after power cycling, can change and also persist after power cycling, can change any time) on data we have. we have different sections because we have to map our requirements of data to different types of available memories optimistically.