difference between stack segment and uninitialized data segment - c

I was trying to get a handle on memory allocation in C.
According to the following link, the stack and the uninitialized data segment are different, and the uninitialized data of a local function goes to the uninitialized data segment.
If that is the case, what is stored in the stack segment for code with uninitialized local variables? Is it empty?

I would not recommend reading "geeksforgeeks" tutorials. You have some misconceptions.
What they call "uninitialized data", the .bss segment, is in fact a store for variables of static storage duration that are zero-initialized, including any such variable that is explicitly initialized to zero.
An explanation of static storage duration and the different common segments, with examples, can be found here.
Only variables with static storage duration end up in .bss and .data. Local variables always end up on the stack, or in CPU registers, no matter if they are initialized or not.
(Please note that none of this is specified by the ISO C standard, but rather by industry de facto standards.)
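To make the distinction concrete, here is a minimal sketch of the three cases described above. The names zeroed_counter, preset_counter and sum_storage_demo are made up for illustration, and the section names in the comments describe typical toolchains only, not anything the C standard requires.

```c
/* Static storage duration, no explicit initializer: typically placed in .bss. */
static int zeroed_counter;

/* Static storage duration, non-zero initializer: typically placed in .data. */
static int preset_counter = 42;

/* Adds the two static objects to one automatic object. */
int sum_storage_demo(void)
{
    /* Automatic storage duration: typically on the stack or in a register,
       whether or not it has an initializer. */
    int local = 1;

    return zeroed_counter + preset_counter + local;
}
```

On an ELF system, running a tool such as readelf or objdump on the object file is one way to see where the two static objects actually ended up; the local variable will not appear in any data section.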

the uninitialized data of the local function goes to the uninitialized data segment.
Well, that is not entirely true.
Read carefully, (from the same link, emphasis mine)
[...] uninitialized data starts at the end of the data segment and contains all global variables and static variables that are initialized to zero or do not have explicit initialization in source code. [...]
So, variables with automatic storage duration still reside in the stack segment, regardless of whether they are initialized.
That said, a word of caution: this is "a typical memory representation", not a universal one. The C standard does not mandate a stack segment (or any other segment), for that matter.

Related

Location of a dereferenced uninitialized pointer in memory?

I have this code example in C from an introductory embedded systems course quiz:
#include <stdlib.h>
#include <stdint.h>
//cross-compiled for MSP432 with cortex-m0plus
int main() {
    int *l2;
    return 0;
}
I want to know the memory segment, sub-segment, permissions and lifetime of *l2 in memory.
What I understand is that the pointer l2 is allocated in the stack sub-segment first; then, because it is uninitialized, it holds a garbage value, which in this case is whatever value happens to be on the stack. I assumed it was in .text or .const with a static lifetime, but neither of these answers was right, so am I missing something here?
Edit:
After I passed the quiz without solving this point correctly, the solution table says it's in the heap with indefinite lifetime. What I got from this answer is that, because the pointer itself is stored on the stack and the object it points to is uninitialized (it's not auto or static), it's stored in the heap... I guess?
It depends on the implementation.
Usually, as it is a local automatic variable, it will be located on the stack. Its lifetime is the same as the lifetime of the main function, and it can only be accessed from within main.
But in real life, as you do not do anything with it, the compiler will simply remove it as unneeded, even if you compile with no optimizations https://godbolt.org/z/1Y6W5j . In this case its location is "nowhere".
Objects can also be kept in registers and never be placed in memory https://godbolt.org/z/8nWxxz
Most modern C implementations place code in the .text segment, initialized variables of static storage duration in the .data segment, uninitialized variables of static storage duration in the .bss segment, and read-only data in the .rodata segment. Your program may have plenty of other memory segments, as there are many options. You can also define your own segments and place objects there.
Stack and heap locations are 100% implementation defined.
The value stored in l2 is indeterminate - it can even be a trap representation. The l2 object itself has auto storage duration and its lifetime is limited to the lifetime of the enclosing function. What that translates into in terms of memory segment depends on the specific implementation.
You can’t say anything about the value of *l2, unless your specific implementation documents exactly how uninitialized pointers are handled.
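The difference between the pointer object and the object it points to can be sketched as follows. This is only an illustration under the assumption of a hosted implementation with malloc available; the function name pointer_vs_pointee is made up. Unlike the quiz code, the pointer is given a valid value before it is dereferenced.

```c
#include <stdlib.h>

/* Returns 7 on success, or -1 if the allocation fails. */
int pointer_vs_pointee(void)
{
    int *p = NULL;          /* p itself has automatic storage duration */
    int result = -1;

    p = malloc(sizeof *p);  /* the pointee has allocated storage duration */
    if (p != NULL) {
        *p = 7;             /* safe: p now points to a live object */
        result = *p;
        free(p);            /* ends the pointee's lifetime; p itself still exists */
    }
    return result;
}
```

Only an object obtained this way could be said to live "in the heap". Declaring a pointer without initializing it creates no pointee at all.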

scope of variables

Why do local variables use Stack in C/C++?
Technically, C does not use a stack. If you look at the C99 standard, you'll find no reference to the stack. It's probably the same for the C++ standard, although I haven't checked it.
Stacks are just implementation details used by most compilers to implement the C automatic storage semantics.
The question you're actually asking is, "why do C and C++ compilers use the hardware stack to store variables with auto extent?"
As others have mentioned, neither the C nor C++ language definitions explicitly say that variables must be stored on a stack. They simply define the behavior of variables with different storage durations:
6.2.4 Storage durations of objects
1 An object has a storage duration that determines its lifetime. There are three storage
durations: static, automatic, and allocated. Allocated storage is described in 7.20.3.
2 The lifetime of an object is the portion of program execution during which storage is
guaranteed to be reserved for it. An object exists, has a constant address, and retains
its last-stored value throughout its lifetime. If an object is referred to outside of its
lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when
the object it points to reaches the end of its lifetime.
3 An object whose identifier is declared with external or internal linkage, or with the
storage-class specifier static has static storage duration. Its lifetime is the entire
execution of the program and its stored value is initialized only once, prior to program
startup.
4 An object whose identifier is declared with no linkage and without the storage-class
specifier static has automatic storage duration.
5 For such an object that does not have a variable length array type, its lifetime extends
from entry into the block with which it is associated until execution of that block ends in
any way. (Entering an enclosed block or calling a function suspends, but does not end,
execution of the current block.) If the block is entered recursively, a new instance of the
object is created each time. The initial value of the object is indeterminate. If an
initialization is specified for the object, it is performed each time the declaration is
reached in the execution of the block; otherwise, the value becomes indeterminate each
time the declaration is reached.
C language standard, draft n1256.
No doubt that paragraph 5 was written with hardware stacks in mind, but there are oddball architectures out there that don't use a hardware stack, at least not in the same way as something like x86. The hardware stack simply makes the behavior specified in paragraph 5 easy to implement.
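The behavior in paragraph 5 can be observed with a small recursive function. This is a sketch with a made-up name (sum_down); it relies only on the quoted rules, not on any particular stack layout.

```c
/* Returns n + (n-1) + ... + 1, keeping one automatic object alive per call. */
int sum_down(int n)
{
    int local = n;            /* a new instance for every activation */
    int rest;

    if (n <= 0)
        return 0;

    rest = sum_down(n - 1);   /* the call suspends, but does not end, this block */
    return local + rest;      /* local still holds the value stored before the call */
}
```

Each nested call gets its own local, and the caller's copy retains its value while the callee runs. A hardware stack is a convenient way to provide that, but any mechanism that keeps one instance per live activation would satisfy the standard.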
Local data storage – A subroutine frequently needs memory space for storing the values of local variables, the variables that are known only within the active subroutine and do not retain values after it returns. It is often convenient to allocate space for this use by simply moving the top of the stack by enough to provide the space. This is very fast compared to heap allocation. Note that each separate activation of a subroutine gets its own separate space in the stack for locals.
Stack allocation is much faster since all it really does is move the stack pointer. Using memory pools you can get comparable performance out of heap allocation, but that comes with slight added complexity and its own headaches.
With the heap there is another layer of indirection, since you have to go from stack -> heap before you reach the object. Also, each thread has its own stack, so locals are not shared between threads unless their addresses are handed out, whereas the heap is free-for-all memory.
It depends on the implementation where variables are stored.
Some computers might not even have a "stack" :D
Other than that, it is usual to do some housekeeping when calling functions, to keep track of the return address and maybe a few other things. Instead of creating another housekeeping method for local variables, many compiler implementations choose to reuse the existing mechanism, the stack, with only minimal changes.
Local variables are local to frames in the call stack.
Using a stack allows recursion.
Because the stack is a part of memory that is automatically discarded when the scope ends. This is why local variables are sometimes called "automatic". Local variables in a call are insulated from recursive or multithreaded calls to the same function.
Local variables are limited to the scope in which they can be accessed.
Using a stack lets control jump from one scope to another and, on returning, continue with the local variables that were present initially.
When there is a jump, the local variables are pushed and the jump is executed. On returning to the scope, the local variables are popped.

how about .bss section not zero initialized

As we know, .bss contains uninitialized variables. If the programmer initializes the variables in C code before using them, then .bss does not need to be zeroed before the C code executes.
Am I right?
Thanks
In C code, any variable with static storage duration is defined to be initialized to 0 by the spec (Section 6.7.8 Initialization, paragraph 10):
If an object that has static storage duration is not initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned) zero;
if it is an aggregate, every member is initialized (recursively) according to these rules;
if it is a union, the first named member is initialized (recursively) according to these rules.
Some program loaders will fill the whole section with zeroes to start with, and others will fill it 'on demand' as a performance improvement. So while you are technically correct that the .bss section may not really contain all zeroes when the C code starts executing, it logically does. In any case, assuming you have a standard-compliant toolchain, you can think of it as being all zero.
Any variables that are initialized to non-zero values will never end up in the .bss section; they are handled in the .data or .rodata sections, depending on their particular characteristics.
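The initialization rules quoted above can be checked directly. The following is a minimal sketch with made-up names (struct sample, the plain_* objects, statics_start_zeroed); it assumes a conforming toolchain whose startup code has done its job, and it must be called before anything writes to these objects.

```c
#include <stddef.h>

struct sample {
    int id;
    double weight;
    const char *name;
};

/* None of these has an explicit initializer; all have static storage duration. */
static int plain_int;
static double plain_double;
static char *plain_pointer;
static struct sample plain_struct;
static int plain_array[4];

/* Returns 1 if every object above holds zero or a null pointer, else 0. */
int statics_start_zeroed(void)
{
    size_t i;

    if (plain_int != 0 || plain_double != 0.0 || plain_pointer != NULL)
        return 0;
    if (plain_struct.id != 0 || plain_struct.weight != 0.0 || plain_struct.name != NULL)
        return 0;
    for (i = 0; i < sizeof plain_array / sizeof plain_array[0]; i++) {
        if (plain_array[i] != 0)
            return 0;
    }
    return 1;
}
```

A program that relies on this behavior is therefore correct C, which is why startup code on bare-metal targets is expected to clear .bss before main() runs.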
The ELF specification says:
.bss This section holds uninitialized data that contribute to the program’s memory image. By definition, the system initializes the data with zeros when the program begins to run. The section occupies no file space, as indicated by the section type, SHT_NOBITS.
It therefore follows that a C global variable which has a non-zero value assigned to it cannot be put into the .bss section and will have to go into the .data section. The .data section contains the initial values for all the variables assigned to it.
That depends on where the variable is in code. For instance if you're talking about a local variable in main() or any other function, then variables are pushed onto the stack (unless you use other modifying keywords). If your variable is global AND uninitialized then it should be kept in .bss. Note that compiler optimization and so forth may change things around a bit. If you want to know for sure, use readelf to investigate an ELF binary on Linux.
It seems like you might be confused about the mechanism by which the .bss section ends up zero initialized. The code that you compile doesn't have to explicitly initialize the region to zero, because when the operating system first allocates a new page of memory to a process, the OS makes sure that the page is zero initialized. This is done for security reasons, so that a process can't go looking for secrets that were left in memory when other processes exited.

Where are static variables stored (data segment or heap or BSS)?

I obtained conflicting opinions about static variable storage.
Opinion 1 : "A stack static variable stores its value in the heap"
Opinion 2 : "A stack static variable stores its value in the data segment".
I am confused by these conflicting answers.
Where exactly are static variables stored?
I am expecting answers with references (text books, authentic tutorials, etc.).
Static variables have two types:
static variables declared inside a function.
global (declared outside function) static variable.
I would also like to know if there is any difference in the storage of these two types of variables?
The 'stack variables' are usually stored on 'the stack', which is separate from the text, data, bss and heap sections of your program.
The second half of your question is about 'static' variables, which are different from stack variables - indeed, static variables do not live on the stack at all. Classically, static variables would all be in the data or bss sections of your program. With modern compilers, if the data is const-qualified, then the data may be stored in the text section of your program, which has a variety of benefits (including enforced non-modifiability).
The C standard does not dictate that there is a stack, nor a bss section. It just requires storage space to be available for variables with appropriate durations.
Stack memory is typically reserved when you launch your application, and on many systems its maximum size stays fixed during execution. It is not stored in the DATA segment. The DATA segment holds initialized static data, while constant values such as string literals usually go into a read-only section.
Both local and global static variables are kept in the data segments.
There are two data segments: the initialized data segment and the uninitialized data segment.
The uninitialized data segment is also called BSS.
When we say "data segment", we mean the initialized data segment by default; this section gets copied from the loaded image of the program. (All global variables and local static variables initialized to a non-zero value go here, e.g. int var1_global = 10;)
The uninitialized data segment, aka BSS, is generally initialized to zero just before main() gets called. All uninitialized global and local static variables go here.
http://www.geeksforgeeks.org/memory-layout-of-c-program/
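As for the difference between the two kinds of static variable: it is one of scope and linkage, not of storage. A minimal sketch, with made-up names (file_scope_calls, next_id, calls_so_far) and section names that are typical rather than guaranteed:

```c
/* File scope, internal linkage, static storage duration.
   No initializer, so typically placed in .bss. */
static int file_scope_calls;

/* Returns 100 on the first call, 101 on the second, and so on. */
int next_id(void)
{
    /* Block scope, no linkage, but still static storage duration.
       Non-zero initializer, so typically placed in .data. */
    static int next = 100;

    file_scope_calls++;
    return next++;
}

/* Reports how many times next_id() has been called so far. */
int calls_so_far(void)
{
    return file_scope_calls;
}
```

Both objects exist for the whole run of the program and keep their values between calls. The only difference is that next is visible solely inside next_id(), while file_scope_calls is visible to every function in the file.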

Where are constant variables stored in C?

I wonder where constant variables are stored. Is it in the same memory area as global variables? Or is it on the stack?
How they are stored is an implementation detail (depends on the compiler).
For example, in the GCC compiler, on most machines, read-only variables, constants, and jump tables are placed in the text section.
Depending on the data segmentation that a particular processor follows, we have five segments:
Code segment - stores only code, ROM
Data segment - stores initialised global and static variables
Stack segment - stores all the local variables and other information such as function return addresses
Heap segment - all dynamic allocations happen here
BSS (or Block Started by Symbol) segment - stores uninitialised global and static variables
Note that the difference between the data and BSS segments is that the former stores initialised global and static variables and the latter stores uninitialised ones.
Now, Why am I talking about the data segmentation when I must be just telling where are the constant variables stored... there's a reason to it...
Every segment has a write protected region where all the constants are stored.
For example:
If I have a const int which is local variable, then it is stored in the write protected region of stack segment.
If I have a global that is initialised const var, then it is stored in the data segment.
If I have an uninitialised const var, then it is stored in the BSS segment...
To summarize, "const" is just a data QUALIFIER, which means that first the compiler has to decide which segment the variable has to be stored and then if the variable is a const, then it qualifies to be stored in the write protected region of that particular segment.
Consider the code:
void totherfunc(const int *p); /* assumed prototype; defined elsewhere */

const int i = 0;
static const int k = 99;

int function(void)
{
    const int j = 37;
    totherfunc(&j);
    totherfunc(&i);
    //totherfunc(&k);
    return j + 3;
}
Generally, i can be stored in the text segment (it's a read-only variable with a fixed value). If it is not in the text segment, it will be stored beside the global variables. Given that it is initialized to zero, it might be in the 'bss' section (where zeroed variables are usually allocated) or in the 'data' section (where initialized variables are usually allocated).
If the compiler is convinced the k is unused (which it could be since it is local to a single file), it might not appear in the object code at all. If the call to totherfunc() that references k was not commented out, then k would have to be allocated an address somewhere - it would likely be in the same segment as i.
The constant (if it is a constant, is it still a variable?) j will most probably appear on the stack of a conventional C implementation. (If you were asking in the comp.std.c news group, someone would mention that the standard doesn't say that automatic variables appear on the stack; fortunately, SO isn't comp.std.c!)
Note that I forced the variables to appear because I passed them by reference - presumably to a function expecting a pointer to a constant integer. If the addresses were never taken, then j and k could be optimized out of the code altogether. To remove i, the compiler would have to know all the source code for the entire program - it is accessible in other translation units (source files), and so cannot as readily be removed. Doubly not if the program indulges in dynamic loading of shared libraries - one of those libraries might rely on that global variable.
(Stylistically - the variables i and j should have longer, more meaningful names; this is only an example!)
Depends on your compiler, your system capabilities, your configuration while compiling.
gcc puts read-only constants in the .text section, unless instructed otherwise.
Usually they are stored in a read-only data section (while the global variables' section has write permissions). So, trying to modify a constant by taking its address may result in an access violation, aka segfault.
But it depends on your hardware, OS and compiler really.
Of course not, because:
1) The bss segment stores uninitialized variables, and there is more than one kind of section in it:
(I) Large static and global variables that are non-constant and uninitialized are stored in the .BSS section.
(II) Small static and global variables that are non-constant and uninitialized are stored in the .SBSS section, which is included in the .BSS segment.
2) The data segment stores initialized variables, and it has 3 kinds of section:
(I) Large static and global variables that are initialized and non-constant are stored in the .DATA section.
(II) Small static and global variables that are non-constant and initialized are stored in the .SDATA1 section.
(III) Small static and global variables that are constant, whether initialized or uninitialized, are stored in the .SDATA2 section.
The "small" and "large" mentioned above depend on the compiler; for example, small means less than 8 bytes and large means 8 bytes or more.
But my doubt is: where are local constants stored?
This is mostly an educated guess, but I'd say that constants are usually stored in the actual CPU instructions of your compiled program, as immediate data. So in other words, most instructions include space for the address to get data from, but if it's a constant, the space can hold the value itself.
This is specific to Win32 systems.
It is compiler dependent, but please be aware that the constant may not even be stored at all, since the compiler may optimize it away and place its value directly into the expressions that use it.
I added this code to a program and compiled it with gcc for an ARM Cortex-M4; check the difference in memory usage.
Without const:
int someConst[1000] = {0};
With const:
const int someConst[1000] = {0};
Global and constant are two completely separate properties. A variable can have one or the other, neither, or both.
Where your variable, then, is stored in memory depends on the configuration. Read up a bit on the heap and the stack, that will give you some knowledge to ask more (and if I may, better and more specific) questions.
It may not be stored at all.
Consider some code like this:
#define PI 3.14159265358979323846 /* ISO C's <math.h> does not define PI, so define it here */
double toRadian(int degree){
    return degree*PI*2/360.0;
}
This enables the programmer to gather the idea of what is going on, but the compiler can optimize away some of that, and most compilers do, by evaluating constant expressions at compile time, which means that the value PI may not be in the resulting program at all.
Just as an add-on: as you know, the memory layout of the final executable is decided during linking. There is one more section called COMMON, in which the common symbols from different input files are placed. This COMMON section actually falls under the .bss section.
Some constants aren't even stored.
Consider the following code:
int x = foo();
x *= 2;
Chances are that the compiler will turn the multiplication into x = x+x; as that reduces the need to load the number 2 from memory.
I checked on an x86_64 GNU/Linux system. By writing through a pointer to the 'const' variable, the value can be changed (formally this is undefined behavior, even though it appeared to work here). I used objdump and didn't find the 'const' variable in the text segment; the 'const' variable was stored on the stack.
'const' is a type qualifier in C that is enforced by the compiler. The compiler reports an error when it comes across a statement that directly modifies a 'const' variable.

Resources