Why is the data segment of C separated into two sections? - c

All globally initialised values are stored in the .data segment (i.e. the initialised data segment), and uninitialised values are stored in .bss, where the compiler automatically initialises them to zero. Why, then, is the data segment separated into .data and .bss?
Is there an advantage or benefit to this?

The C programming language (it is a specification written in English) does not know about .bss or .data or data segments. Check that by reading n1570 (the latest draft of C11).
In some cases (e.g. embedded computing) you don't have any data segment. For example when you cross-compile for an Arduino, the resulting code gets uploaded to the flash memory of that microcontroller (and data is in RAM, which your program would perhaps explicitly clear).
(most of my answer below is focused on Linux, but you could adapt it to other OSes)
Regarding the data segment on Unix-like systems like Linux, read the specification of ELF. It is convenient to avoid spending file space in the executable file for something (the .bss) which is all zeros. The kernel code for execve(2) is able to set up an initial virtual address space with some cleared data (which is not mapped to any file segment). See also mmap(2) & elf(5) & ld-linux(8). Try a command like cat /proc/$$/maps to understand the virtual address space of your shell. See proc(5).
So the executable contains segments (some of which are all zeros and don't take any disk space), and is constructed by the linker -which also handles relocations- from several object files produced by the compiler from source code files. On Linux, use objdump(1) and readelf(1) (and nm(1)...) to explore and inspect ELF executables and ELF object files.
BTW a cleared data segment doesn't need to be fetched from disk by the virtual memory subsystem, and that might make things slightly faster. The kernel just clears a page in RAM when it is paged in.
So the .bss exists to avoid wasting disk space in executables (and to speed up the initialization of zeroed data). Obviously, some mutable data is explicitly initialized to non-zero content (and that needs to sit in .data and takes some disk space in the executable). Constant immutable read-only data goes into .rodata (into the text segment, generally shared by several processes running the same program).
You could configure your linker (e.g. with a linker script) to put all the data (even the cleared ones) in some explicit data segment, but doing so just wastes disk space... Historically, Unix was developed on machines with small but costly disks (so wasting space was unthinkable at the time, hence the need for .bss; today you care less!)
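To see this concretely, here is a minimal sketch (the file name and array sizes are arbitrary) contrasting where each kind of object lands:

/* bss_vs_data.c - which section does each object land in? */
int zeroed[4096];                  /* zero-initialized: goes to .bss, costs no file space */
int filled[4096] = {1};            /* explicitly initialized: goes to .data, stored on disk */
const int table[4] = {1, 2, 3, 4}; /* read-only: typically placed in .rodata */

int main(void) { return zeroed[0] + filled[0] + table[0]; }

Compiling with gcc bss_vs_data.c -o bss_vs_data and running size bss_vs_data should show a large bss column while the file itself stays small; readelf -S bss_vs_data shows .bss with type NOBITS, i.e. it occupies no bytes in the file.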
Read Levine's book Linkers and Loaders for more, and Advanced Linux Programming and Operating Systems: Three Easy Pieces.

Related

Where is it documented that a global array in C, compiled by gcc, is initialized like "copy-on-write"?

For this C code:
foobar.c:
static int array[256];
int main() {
    return 0;
}
the array is initialized to all 0's by the C standard. However, when I compile with
gcc -S foobar.c
this produces the assembly file foobar.s that I can inspect, and nowhere in foobar.s is there any initialization of the contents of the array.
Hence I reason that the contents are not initialized; only when an element of the array is inspected is it initialized, kind of like the "copy-on-write" mechanism for fork.
Is my reasoning correct? If so, is this a documented feature, and if so where can I find that documentation?
There's kind of a lot of levels here. This answer addresses Linux in particular, but the same concepts are likely to apply on other systems, possibly with different names.
The compiler requires that the object be "zero initialized". In other words, when a memory read instruction is executed with an address in that range, the value that it reads must be zero. As you say, this is necessary to achieve the behavior dictated by the C standard.
The compiler accomplishes this by asking the assembler to fill the space with zeros, one way or another. It may use the .space or .zero directive, which implicitly requests this. It will also place the object in a section with the special name .bss (the reasons for this name are historical). If you look further up in the assembly output, you should see a directive like .bss or .section .bss. The assembler and linker promise that this entire section will be (somehow) initialized to zero. This is documented:
The bss section is used for local common variable storage. You may allocate address space in the bss section, but you may not dictate data to load into it before your program executes. When your program starts running, all the contents of the bss section are zeroed bytes.
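For the foobar.c above, on x86-64 Linux with a recent gcc (which defaults to -fno-common), the relevant part of foobar.s looks roughly like this (exact directives vary by gcc version and target):

	.bss
	.align 32
	.type	array, @object
	.size	array, 1024
array:
	.zero	1024

Older gcc versions may instead emit .local array together with .comm array,1024,32, which asks the linker to allocate the object in .bss. Either way, no instructions initialize the array; the zeroing is a property of the section.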
Okay, so now what do the assembler and linker do to make it happen? Well, an ELF executable file has a segment header, which specifies how and where code and data from the file should be mapped into the program's memory. (Please note that the use of the word "segment" here has nothing to do with the x86 memory segmentation model or segment registers, and is only vaguely related to the term "segmentation fault".) The size of the segment, and the amount of data to be mapped, are specified separately. If the size is greater, then all remaining bytes are to be initialized to zero. This is also documented in the above-linked man page:
PT_LOAD
The array element specifies a loadable segment, described by p_filesz and p_memsz. The bytes from the file are mapped to the beginning of the memory segment. If the segment's memory size p_memsz is larger than the file size p_filesz, the "extra" bytes are defined to hold the value 0 and to follow the segment's initialized area.
So the linker ensures that the ELF executable contains such a segment, and that all objects in the .bss section are in this segment, but not within the part that is mapped to the file.
Once all this is done, then the observable behavior is guaranteed: as above, when an instruction attempts to read from this object before it has been written, the value it reads will be zero.
Now as to how that behavior is ensured at runtime: that is the job of the kernel. It could do it by pre-allocating actual physical memory for that range of virtual addresses, and filling it with zeros. Or by an "allocate on demand" method, like what you describe, by leaving those pages unmapped in the CPU's page tables. Then any access to those pages by the application will cause a page fault, which will be handled by the kernel, which will allocate zero-filled physical memory at that time, and then restart the faulting instruction. This is completely transparent to the application. It just sees that the read instruction got the value zero. If there was a page fault, then it just seems to the application like the read instruction took a long time to execute.
The kernel normally uses the "on demand" method, because it is more efficient in case not all of the "zero initialized" memory is actually used. But this is not going to be documented as guaranteed behavior; it is an implementation detail. An application programmer need not care, and in fact must not care, how it works under the hood. If the Linux kernel maintainers decide tomorrow to switch everything to the pre-allocate method, every application will work exactly as it did before, just maybe a little faster or slower.
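Here is a sketch (Linux-specific, assuming 4 KiB pages and the /proc filesystem) that makes the on-demand behavior visible by watching the resident set grow as .bss pages are touched:

#include <stdio.h>

static char big[64 * 1024 * 1024];   /* 64 MiB of zero-initialized .bss */

/* Resident set size in KiB, read from /proc/self/statm (Linux-specific). */
static long resident_kb(void) {
    long pages_total, pages_resident;
    FILE *f = fopen("/proc/self/statm", "r");
    if (!f || fscanf(f, "%ld %ld", &pages_total, &pages_resident) != 2) {
        if (f) fclose(f);
        return -1;
    }
    fclose(f);
    return pages_resident * 4;       /* assumes 4 KiB pages */
}

int main(void) {
    printf("resident before touching .bss: %ld KiB\n", resident_kb());
    for (size_t i = 0; i < sizeof big; i += 4096)
        big[i] = 1;                  /* each write faults in one fresh zero page */
    printf("resident after touching .bss:  %ld KiB\n", resident_kb());
    return 0;
}

The second number should come out roughly 64 MiB larger than the first, even though the executable on disk is tiny.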

What is the mechanism of read-only memory?

In general, memory can be readable and writable. When the C compiler marks memory as const, what is the mechanism behind it? Who blocks the memory from being written? If you mistakenly force a write to memory marked const, who reports the segmentation error?
It's the operating system which marks pages of virtual memory as either readable, writable or executable (or a combination of all).
The compiler and linker work together to mark special sections of the executable file, and then the operating system loader handles setting up the memory itself.
None of this is part of the C standard, which only specifies that attempting to modify a const variable is undefined behavior.
There is no mechanism specified in the C11 standard for read-only memory. Check by reading n1570. But be scared of undefined behavior (e.g. writing into some const data).
In practice, on many C implementations running on current operating systems (e.g. Linux, Windows, Android, MacOSX, ...) and desktops, tablets or servers with an x86-64 or an ARM processor, a process has some virtual address space, with various segments, some being read only (and managed by the operating system kernel with the help of the MMU). Read also about virtual memory & segmentation fault. Take several days to read a book like Operating Systems: Three Easy Pieces (freely downloadable).
On embedded microcontrollers (e.g. Arduino like), some memory might be a hardware ROM. And some compilers might (but are not required to!) use it for some of your constant data.
You might use linker scripts (with GNU ld) to arrange for some read-only segments to go into read-only memory. This is very implementation specific.
However, some platforms don't have any kind of user-programmable read-only memory (e.g. some embedded systems have a factory ROM containing a fixed boot loader in firmware, and everything else is in RAM). YMMV.
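As a concrete illustration (a sketch; the behavior is undefined by the standard, but on a typical Linux/x86-64 desktop the write traps):

/* const_write.c - writing to a const object is undefined behavior */
const int answer = 42;            /* usually placed in .rodata, mapped read-only */

int main(void) {
    int *p = (int *)&answer;      /* casting away const: undefined behavior */
    *p = 0;                       /* on typical Linux/x86-64 this raises SIGSEGV */
    return 0;
}

The compiler happily emits the store; it is the MMU, configured by the kernel when it mapped .rodata read-only, that faults, and the kernel then delivers SIGSEGV to the process.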
The compiler and linker implement it. For instance, in an embedded system, data in RAM is changeable, while data in Flash/ROM is not.
So if the data is defined with const, it can be placed into a non-volatile storage area, e.g. Flash/ROM or disk.
Defining a variable with const has two benefits:
- It prevents the variable from being changed by a coding error (the compiler reports an error).
- It reduces RAM usage; e.g. a long text string can be placed in Flash/ROM or on disk.

Where are memory segments defined?

I just learned about different memory segments like: Text, Data, Stack and Heap. My question is:
1- Where are the boundaries between these sections defined? In the compiler or the OS?
2- How do the compiler or OS know which addresses belong to each section? Should we define them anywhere?
This answer is from the point of view of a more special-purpose embedded system rather than a more general-purpose computing platform running an OS such as Linux.
Where are the boundaries between these sections defined? In the compiler or the OS?
Neither the compiler nor the OS does this. It's the linker that determines where the memory sections are located. The compiler generates object files from the source code. The linker uses the linker script file to locate the object files in memory. The linker script (or linker directive) file is a file that is part of the project and identifies the type, size and address of the various memory types such as ROM and RAM. The linker program uses the information from the linker script file to know where each memory starts. Then the linker places each type of section from the object files into an appropriate memory region. For example, code goes in the .text section, which is usually located in ROM. Variables go in the .data or .bss sections, which are located in RAM. The stack and heap also go in RAM. As the linker fills one section it learns the size of that section and can then know where to start the next section. For example, the .bss section may start where the .data section ended.
The size of the stack and heap may be specified in the linker script file or as project options in the IDE.
IDEs for embedded systems typically provide a generic linker script file automatically when you create a project. The generic linker file is suitable for many projects so you may never have to customize it. But as you customize your target hardware and application further you may find that you also need to customize the linker script file. For example, if you add an external ROM or RAM to the board then you'll need to add information about that memory to the linker script so that the linker knows how to locate stuff there.
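As a sketch, a generic linker script for a hypothetical small microcontroller might look like this (all names, origins and lengths are invented for illustration):

MEMORY
{
  FLASH (rx)  : ORIGIN = 0x08000000, LENGTH = 256K
  RAM   (rwx) : ORIGIN = 0x20000000, LENGTH = 64K
}

SECTIONS
{
  .text : { *(.text*) *(.rodata*) } > FLASH   /* code and constants live in ROM */
  .data : { *(.data*) } > RAM AT > FLASH      /* initialized data: runs from RAM, load image kept in FLASH */
  .bss  : { *(.bss*) *(COMMON) } > RAM        /* zero-initialized data: RAM only, no load image */
}

Startup code then copies the .data load image from FLASH to RAM and zeroes .bss before main runs; adding an external RAM or ROM chip means adding another entry to MEMORY and directing sections into it.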
The linker can generate a map file which describes how each section was located in memory. The map file may not be generated by default and you may need to turn on a build option if you want to review it.
How do the compiler or OS know which addresses belong to each section?
Well I don't believe the compiler or OS actually know this information, at least not in the sense that you could query them for the information. The compiler has finished its job before the memory sections are located by the linker so the compiler doesn't know the information. The OS, well how do I explain this? An embedded application may not even use an OS. The OS is just some code that provides services for an application. The OS doesn't know and doesn't care where the boundaries of memory sections are. All that information is already baked into the executable code by the time the OS is running.
Should we define them anywhere?
Look at the linker script (or linker directive) file and read the linker manual. The linker script is input to the linker and provides the rough outlines of memory. The linker locates everything in memory and determines the extent of each section.
For your query:
Where are the boundaries between these sections defined? In the compiler or the OS?
The answer is: the OS.
There is no universally common addressing scheme for the layout of the .text segment (executable code), .data segment (variables) and other program segments. However, the layout of the program itself is well-formed according to the system (OS) that will execute the program.
How do the compiler or OS know which addresses belong to each section? Should we define them anywhere?
I have divided this question into three parts:
What about the text (code) and data sections and their limits?
Text and data are prepared by the compiler. The requirement for the compiler is to make sure that they are accessible and packed into the lower portion of the address space. The accessible address space will be limited by the hardware, e.g. if the instruction pointer register is 32-bit, then the text address space would be 4 GiB.
What about the heap section and its limit? Is it the total available RAM?
After text and data, the area above them is the heap. With virtual memory, the heap can practically grow up to nearly the top of the address space.
Do the stack and the heap have a static size limit?
The final segment in the process address space is the stack. The stack takes the end of the address space; it starts from the end and grows down.
Because the heap grows up and the stack grows down, they basically limit each other. Also, because both types of segment are writable, it wasn't always a violation for one of them to cross the boundary, so you could have a buffer or stack overflow. Now there are mechanisms to stop these from happening.
There is a set limit on the heap (and stack) for each process to start with. This limit can be changed at runtime (using brk()/sbrk()). Basically what happens is that when the process needs more heap space and has run out of allocated space, the standard library issues the call to the OS. The OS allocates a page, which is usually then managed by the user-space library for the program to use. I.e. if the program wants 1 KiB, the OS will give an additional 4 KiB, and the library will give 1 KiB to the program and keep 3 KiB for use when the program asks for more next time.
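The brk()/sbrk() mechanism can be observed directly (a Linux sketch; sbrk() is a legacy interface, and modern allocators mostly use mmap() for large blocks):

#define _DEFAULT_SOURCE      /* for sbrk() with glibc */
#include <stdio.h>
#include <unistd.h>

int main(void) {
    void *before = sbrk(0);  /* current program break (top of the heap) */
    sbrk(4096);              /* ask the OS to grow the heap by one page */
    void *after = sbrk(0);
    printf("break before: %p\nbreak after:  %p\n", before, after);
    return 0;
}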
Most of the time the layout will be Text, Data, Heap (grows up), unallocated space and finally Stack (grows down). They all share the same address space.
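A quick way to see this layout from a running program is to print addresses drawn from each segment (a sketch; exact addresses vary from run to run because of ASLR, but the relative grouping is visible):

#include <stdio.h>
#include <stdlib.h>

int initialized = 1;   /* .data */
int uninitialized;     /* .bss  */

int main(void) {
    int local;                          /* stack */
    int *heap = malloc(sizeof *heap);   /* heap  */

    printf("text  (main):          %p\n", (void *)main);
    printf(".data (initialized):   %p\n", (void *)&initialized);
    printf(".bss  (uninitialized): %p\n", (void *)&uninitialized);
    printf("heap  (malloc):        %p\n", (void *)heap);
    printf("stack (local):         %p\n", (void *)&local);

    free(heap);
    return 0;
}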
The sections are defined by a format which is loosely tied to the OS. For example on Linux you have ELF and on Mac OS you have Mach-O.
You do not define the sections explicitly as a programmer, in 99.9% of cases. The compiler knows what to put where.

Location of variables in C

I'm trying to understand how C allocates memory to global variables.
I'm working on a simple Kernel. So far it can't do much more than print to screen and enable interrupts. I'm now working on a basic physical memory manager.
My memory manager is a bitmap that sets a bit to 1 or 0 depending on whether memory is allocated or available. I need to add the memory that my Kernel is using to the bitmap as 'allocated', so nothing overwrites it.
I can easily find out the start of the Kernel, as it's statically loaded to 0x100000. Figuring out the length shouldn't be too difficult either. The part I'm not sure about is where global variables are put in memory.
Let's say my Kernel is 12K, I can then allocate these 3x 4K blocks of memory to it for protection. Do I need to allocate more to cover the variables it uses? Or are the variables part of that 12K?
Thank you for your help, I hope I am making enough sense.
Have a look at
http://www.geeksforgeeks.org/archives/14268
Your globals are mostly in the BSS.
As the previous answer says, most variables are stored in the .bss section, but they can also be stored in the .data or .rodata section, depending on whether you initialized the global variables or declared them const. After compiling you can use readelf -S kernel.bin to see exactly how much space each section will use. For the .bss section the memory is only occupied when the binary is loaded into memory and does not take any space on disk. This means that your compiled kernel binary will be smaller than the actual size it will later occupy when brought into memory (usually by grub).
A simple way to figure out exactly how much data your kernel will use, besides using readelf, is to place the .bss section inside the .data section within your linker script. The size of the kernel binary will then be the same both on disk and in memory (or actually it will be a bit smaller in memory, since not all sections are copied by grub), but then at least you know the minimum amount of memory you need to allocate.
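A sketch of that trick in a GNU ld linker script (the 0x100000 load address is taken from the question; everything else is illustrative):

SECTIONS
{
  . = 0x100000;                              /* kernel load address */
  .text   : { *(.text*) }
  .rodata : { *(.rodata*) }
  .data   : { *(.data*) *(.bss*) *(COMMON) } /* .bss folded into .data */
}

Because the output .data section has file contents (PROGBITS), the zero-initialized input sections now contribute real zero bytes to the binary instead of a NOBITS region.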
I'd recommend using a custom linker script (assuming you use gcc): it makes the layout of kernel sections explicit and customizable (to read more about linker scripts, read info ld). You can see an example of my OS's linker script here.
To see the default linker script, use the -v/--verbose option of ld.
Most global variables are located in the .data.* and .rodata.* sections; variables initialized with 0 go in .bss.

What segments does a compiled C program use?

I read on the OSDev wiki that protected mode of the x86 architecture allows you to create separate segments for code and data, and that you cannot write into the code section. That Windows (yes, this is the platform) loads new code into a code segment, and data is created in a data segment. But if this is the case, how does a program know it must switch to the data segment? Because if I understand it correctly, all address instructions point to the segment you run the code from, unless you switch the descriptor. But I also read that the so-called flat memory model allows you to keep code and data within one segment. But I read this only in connection with assembler. So, please, what is the case with compiled C code on Windows? Thanks.
There are two meanings for segment in the explanation:
an 8086 memory address segment
an object module program section segment
The first is related to what is loaded into an 80386+ segment register; it contains a physical memory start address, memory allocation length, permitted read/write/execute access, and whether it grows from low to high or vice versa (plus some more obscure flags, like "copy on reference").
The second meaning is part of the object module language. Basically, there is a segment named code, a segment named data (which contains initialized data), and a segment for uninitialized data named bss (named for the pseudo-instruction of 1960s assemblers meaning Block Starting with Symbol). When the linker combines object modules, it arranges all the code segments together, all the data segments together elsewhere, and the bss together as well. When the loader maps memory addresses, it looks at the total code space, allocates a memory region of at least that size, and maps the segment to the code (in a virtual memory situation) or reads the code into the allocated memory, for which it has to temporarily set the memory as writable. The write-protection is done through the CPU's paging mechanism, as well as the segment register. This is to protect against code-writing attempts through, for example, an errant data address. The loader does similar setup for the two data segment groups. (Besides those, there is setting up a stack segment and allocating it, and mapping shared images.)
As far as the x86 executing instructions, each operand has an associated segment register. Sometimes these are explicit, and sometimes they are implicit. Code is implicitly accessed through CS, stack through SS which is implied whenever the ESP or EBP register is involved, and DS is implied for most other operands. ES, FS, and GS must be specified as an override in all other cases, except for some of the string instructions like movs and cmps. In flat model, all the segment registers map to the same address space, though CS doesn't allow writing.
So, to answer your last question, the CPU has four (or more) segment registers set up at once to access the flat virtual memory space of the process. Each operand access is checked for being appropriate to the instruction (like not incrementing a CS address) and also is checked by the paging protection unit for being allowed.
The info you read is outdated. Windows versions since ~1993 use a flat 32-bit virtual memory space. The values of the CS and DS segment registers no longer matter and cannot be changed. There is still a notion of code vs data, now implemented by memory page attributes. Review the allowed values passed in the flNewProtect argument for the VirtualProtectEx() API function.
You very rarely use this API yourself; the attributes are set by the executable image loader and the heap manager.
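As a sketch of the page-attribute mechanism (calling VirtualAlloc/VirtualProtect directly; the image loader normally does the equivalent for the executable's sections):

#include <windows.h>
#include <stdio.h>

int main(void) {
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    /* Commit one read/write page */
    void *page = VirtualAlloc(NULL, si.dwPageSize,
                              MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (!page) return 1;

    *(int *)page = 42;            /* fine: the page is writable */

    DWORD old;
    VirtualProtect(page, si.dwPageSize, PAGE_READONLY, &old);

    printf("value: %d\n", *(int *)page);    /* reads still work */
    /* *(int *)page = 0;  would now raise an access-violation exception */

    VirtualFree(page, 0, MEM_RELEASE);
    return 0;
}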
