Ambiguous behaviour of .bss segment in C program - c

I wrote the simple C program (test.c) below:-
#include<stdio.h>
int main()
{
return 0;
}
and executed the following commands to see how the size of the .bss segment changes:
gcc test.c -o test
size test
The output came out as:-
text data bss dec hex filename
1115 552 8 1675 68b test
I didn't declare anything global or static, so please explain why the bss segment is 8 bytes.
I made the following change:-
#include<stdio.h>
int x; //declared global variable
int main()
{
return 0;
}
But to my surprise, the output was the same as before:
text data bss dec hex filename
1115 552 8 1675 68b test
Please explain.
I then initialized the global:-
#include<stdio.h>
int x=67; //initialized global variable
int main()
{
return 0;
}
The data segment size increased as expected, but I didn't expect the bss segment to shrink to 4 bytes (as opposed to 8 when nothing was declared). Please explain.
text data bss dec hex filename
1115 556 4 1675 68b test
I also tried the objdump and nm commands; they too showed the variable x occupying .bss (in the 2nd case). However, the size command shows no change in the bss size.
I followed the procedure according to:
http://codingfox.com/10-7-memory-segments-code-data-bss/
where the outputs come out exactly as expected.

When you compile even a simple main program, you are also linking in startup code.
Among other things, this code is responsible for initializing .bss.
The startup code is what "uses" the 8 bytes you are seeing in the .bss section.
You can leave that code out with the -nostartfiles gcc option:
-nostartfiles
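Do not use the standard system startup files when linking. The standard system libraries are used normally, unless -nostdlib or -nodefaultlibs is used.
To test this, use the following code. It is only meant to be linked and inspected with size, not run: _start is entered without a return address, so returning from it would typically crash.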
#include<stdio.h>
int _start()
{
return 0;
}
and compile it with
gcc -nostartfiles test.c
You'll see that .bss is now 0:
text data bss dec hex filename
206 224 0 430 1ae test
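The zero-initialization of .bss mentioned above can be observed from inside a program. The following is a minimal sketch, not part of the original answer: scratch and count_nonzero_bytes are names made up for illustration, and whether the zeroing is done by the startup code or by the operating system's loader depends on the platform. The C language only guarantees the result.

```c
#include <stddef.h>

/* Uninitialized object with static storage duration.
   Toolchains typically place it in .bss (assumption about the toolchain). */
static unsigned char scratch[64];

/* Counts the bytes of scratch that are not zero. C guarantees that objects
   with static storage duration start out zeroed, so this returns 0 as long
   as nothing has written to scratch. */
size_t count_nonzero_bytes(void)
{
    size_t nonzero = 0;
    for (size_t i = 0; i < sizeof scratch; i++) {
        if (scratch[i] != 0) {
            nonzero++;
        }
    }
    return nonzero;
}
```

Calling count_nonzero_bytes() at the start of main should report no non-zero bytes, even though the program never assigned anything to scratch.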

Your first two snippets are identical since you aren't using the variable x.
Try this
#include<stdio.h>
volatile int x;
int main()
{
x = 1;
return 0;
}
and you should see a change in .bss size.
Please note that those 4/8 bytes are something inside the start-up code. What it is and why it varies in size isn't possible to tell without digging into all the details of mentioned start-up code.
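As a cross-check that does not depend on the totals printed by size, a program can ask where x was placed. This is a minimal sketch under stated assumptions: it relies on a GNU-style toolchain on Linux whose default linker script defines the symbols __bss_start and _end, and x_is_in_bss is a name made up here. It will not link on toolchains that lack those symbols.

```c
#include <stdint.h>

/* Symbols provided by the default GNU ld linker script (assumption:
   GNU-style toolchain). Only their addresses are meaningful. */
extern char __bss_start;
extern char _end;

volatile int x; /* uninitialized global, as in the snippet above */

/* Returns 1 if the address of x lies in the range [__bss_start, _end),
   which covers the .bss section in the default layout, and 0 otherwise. */
int x_is_in_bss(void)
{
    uintptr_t addr  = (uintptr_t)&x;
    uintptr_t start = (uintptr_t)&__bss_start;
    uintptr_t end   = (uintptr_t)&_end;
    return addr >= start && addr < end;
}
```

If this reports that x is inside the range while size still prints the same bss total, the variable is most likely sharing space that was previously padding.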

Related

Why different size of memory gets allocated to integer in BSS and in Data segment? [duplicate]

Please go through the following program -
#include <stdio.h>
void main()
{
}
Memory allocated for each segment is as follows (using the size command on Unix):
text data bss dec hex filename
1040 484 16 1540 604 try
After declaring a global variable:
#include <stdio.h>
int i;
void main()
{
}
Memory allocated for each segment is as follows (using the size command on Unix).
Here the variable 'i' has received memory in BSS (previously it was 16 and now it is 24):
text data bss dec hex filename
1040 484 24 1548 60c try
After declaring a global variable and initializing it with 10:
#include <stdio.h>
int i=10;
void main()
{
}
Memory allocated for each segment is as follows (using the size command on Unix).
Here the variable 'i' has received memory in the data segment (previously it was 484 and now it is 488):
text data bss dec hex filename
1040 488 16 1544 608 try
My question is: why did the global variable 'i' get 8 bytes of memory when it was stored in BSS, but 4 bytes when it was stored in the data segment?
Why is there a difference in the memory allocated to an integer in BSS and in the data segment?
Why did the global variable 'i' get 8 bytes of memory when it was stored in BSS, but 4 bytes when it was stored in the data segment?
First, why 4 bytes in data segment?
As many people have already answered, the .data segment contains any global or static variables that are explicitly initialized. An int is 4 bytes in size here, and that is reflected in the data segment size when you have a global int i=10; in your program.
Now, why 8 bytes in .bss segment?
You are observing this behavior because of the default linker script of the GNU linker (GNU ld). Linker scripts are described in the GNU ld manual.
Unless told otherwise, GNU ld links with its default linker script.
The default linker script specifies the alignment for the .bss segment.
To see the default linker script, use this command:
gcc -Wl,-verbose main.c
The output of this gcc command will contain following statement:
using internal linker script:
==================================================
// The content between these two lines is the default linker script
==================================================
In the default linker script, you can find the .bss section:
.bss :
{
*(.dynbss)
*(.bss .bss.* .gnu.linkonce.b.*)
*(COMMON)
/* Align here to ensure that the .bss section occupies space up to
_end. Align after .bss to ensure correct alignment even if the
.bss section disappears because there are no input sections.
FIXME: Why do we need it? When there is no .bss section, we don't
pad the .data section. */
. = ALIGN(. != 0 ? 64 / 8 : 1);
}
Here you can see . = ALIGN(. != 0 ? 64 / 8 : 1); which rounds the end of the .bss section up to an 8-byte boundary.
The program:
#include <stdio.h>
int i;
void main()
{
}
when built with the default linker script, adding 'i' grows BSS by 8 bytes because of this 8-byte alignment (the 4 bytes of 'i' plus 4 bytes of padding):
# size a.out
text data bss dec hex filename
1040 484 24 1548 60c a.out
[bss = 24 bytes (16 + 8)]
The GNU linker lets you pass your own linker script, in which case it uses that script to build the target instead of the default one.
To try this, copy the content of the default linker script into a file and pass it to GNU ld with this command:
gcc -Xlinker -T my_linker_script main.c
Since the script is now your own, you can make changes in it and observe the change in behavior.
In the .bss section, change . = ALIGN(. != 0 ? 64 / 8 : 1); to . = ALIGN(. != 0 ? 32 / 8 : 1);. This changes the alignment from 8 bytes to 4 bytes. Now build your target using the linker script with this change.
The output is:
# size a.out
text data bss dec hex filename
1040 484 20 1544 608 a.out
Here you can see that the bss size is 20 bytes (16 + 4) because of the 4-byte alignment.
Hope this answers your question.
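The arithmetic behind the 24 and 20 byte results can be sketched in a few lines of C. This is an illustration only, not what the linker literally runs: align_up is a name made up here, and the sketch assumes the section starts on a boundary that is itself a multiple of the alignment, so that rounding the end address is the same as rounding the size.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds value up to the next multiple of alignment, which must be a
   power of two. This mirrors what ALIGN() does to the linker's location
   counter at the end of .bss. */
size_t align_up(size_t value, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

With 16 bytes already in BSS plus a 4-byte int, align_up(20, 8) gives 24 and align_up(20, 4) gives 20, matching the two size outputs above.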

Why the int type takes up 8 bytes in BSS section but 4 bytes in DATA section

I am trying to learn the structure of the executable files of C programs. My environment is GCC on a 64-bit Intel processor.
Consider the following C code a.cc.
#include <cstdlib>
#include <cstdio>
int x;
int main(){
printf("%zu\n", sizeof(x)); // sizeof yields a size_t, so use %zu rather than %d
return 10;
}
The size -o a shows
text data bss dec hex filename
1134 552 8 1694 69e a
Then I added another initialized global variable, y:
int y=10;
Now size a shows (where a is the name of the executable built from a.cc):
text data bss dec hex filename
1134 556 12 1702 6a6 a
As we know, the BSS section holds uninitialized global variables and DATA holds initialized ones.
Why does the int take up 8 bytes in BSS? The sizeof(x) in my code shows that the int actually takes up 4 bytes.
The int y=10 added 4 bytes to DATA, which makes sense since an int should take 4 bytes. But why does it also add 4 bytes to BSS?
The difference between the two size outputs stays the same after deleting the two #include lines.
Update:
I think my understanding of BSS is wrong. It may not store the uninitialized global variables. As Wikipedia says, "The size that BSS will require at runtime is recorded in the object file, but BSS (unlike the data segment) doesn't take up any actual space in the object file." For example, even the one-line C program int main(){} has a bss of 8.
Does the 8 or 16 of BSS come from alignment?
It doesn't; it takes up 4 bytes regardless of which segment it's in. You can use the nm tool (from the GNU binutils package) with the -S argument to get the names and sizes of all of the symbols in the object file. You're likely seeing secondary effects of the compiler including or not including certain other symbols for whatever reasons.
For example:
$ cat a1.c
int x;
$ cat a2.c
int x = 1;
$ gcc -c a1.c a2.c
$ nm -S a1.o a2.o
a1.o:
0000000000000004 0000000000000004 C x
a2.o:
0000000000000000 0000000000000004 D x
One object file has a 4-byte object named x in the uninitialized data segment (C), while the other object file has a 4-byte object named x in the initialized data segment (D).
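The same point can be checked from C itself, without nm. This is a small sketch, with size_of_x and size_of_y being names invented for the example; the section each variable lands in is an assumption about typical GCC behaviour, not something the language specifies.

```c
#include <stddef.h>

int x;      /* uninitialized: typically .bss, or a common symbol (C in nm) */
int y = 10; /* initialized: typically .data (D in nm) */

/* The size of an object is a property of its type, not of the section
   the toolchain happens to place it in. */
size_t size_of_x(void) { return sizeof x; }
size_t size_of_y(void) { return sizeof y; }
```

Both functions return the same value, sizeof(int), so any difference in the totals printed by size has to come from something other than the variable itself, such as padding.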

Memory map of C program with no global and local variables

I wrote a basic program:
#include<stdio.h>
int main(void)
{
return 0;
}
and checked its size:
gcc -Wall test1.c
size a.out
text data bss dec hex filename
988 260 8 1256 4e8 a.out
Out of curiosity: I did not declare any variable, global or local, initialized or uninitialized, so why are data and bss shown as 260 and 8 respectively?
Is this for the stack pointer and other variables required for code execution?

Data section in a.out

Here is some simple code that I compiled:
int a;
int main()
{
return 0;
}
Then, after compiling with gcc, I ran
size a.out
and got some output for the bss and data sections. Then I changed my code to this:
int a;
int main()
{
char *p = "hello";
return 0;
}
When I looked at the output of size a.out again after compiling, the size of the data section was unchanged. But we know that the string hello will be allocated in a read-only initialized area. So why did the size of the data section stay the same?
#include <stdio.h>
int main()
{
return 0;
}
It gives
text data bss dec hex filename
960 248 8 1216 4c0 a.out
when you do
int a;
int main()
{
char *p = "hello";
return 0;
}
it gives
text data bss dec hex filename
982 248 8 1238 4d6 a.out
Here "hello" is stored in .rodata and its address is stored in the char pointer p, but p itself is on the stack,
and size doesn't show the stack. In the default (Berkeley) output format of GNU size, read-only data such as .rodata is counted in the text column.
when you write
int a;
char *p = "hello";
int main()
{
return 0;
}
it gives
text data bss dec hex filename
966 252 8 1226 4ca a.out
Here "hello" is again stored in .rodata, but the char pointer takes 4 bytes and is stored in .data, so data grows by 4.
For more info http://codingfreak.blogspot.in/2012/03/memory-layout-of-c-program-part-2.html
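The distinction between the pointer and the characters it points to can be made explicit with sizeof. A minimal sketch, assuming nothing beyond standard C; pointer_bytes and literal_bytes are names chosen for this example, and the section names in the comments describe typical toolchain behaviour rather than a guarantee.

```c
#include <stddef.h>

/* Initialized global pointer: the pointer object itself typically goes
   into .data, while the characters of the literal typically go into
   .rodata. */
const char *p = "hello";

/* Bytes occupied by the pointer object (4 on the 32-bit build above). */
size_t pointer_bytes(void) { return sizeof p; }

/* Bytes occupied by the literal: five characters plus the terminating NUL. */
size_t literal_bytes(void) { return sizeof "hello"; }
```

This is why data grows by the size of one pointer rather than by the 6 bytes of the string.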
Actually, that's an implementation detail. The compiler works under the as-if rule: as long as the observable behavior of the program is the same, it is free to leave out any piece of code it wants. In this case, it can skip char* p = "hello" altogether.
The string "hello" is allocated in the .rodata section.
Even if the total size didn't change, that doesn't mean the code didn't.
I tested your example.
The string "hello" is constant data, so it is stored in the read-only .rodata section.
You can see this particular section using objdump, for example:
objdump -s -j .rodata <yourbinary>
With gcc 4.6.1 without any options, I got for your second code:
Contents of section .rodata:
4005b8 01000200 68656c6c 6f00 ....hello.
Since you don't use that char * in your code, the compiler optimized it away.

size of text area

Consider the following program:
#include <stdio.h>
int main(void)
{
return 0;
}
When I run the following commands:
gcc memory-layout.c -o memory-layout
size memory-layout
I get the output as:
text data bss dec hex filename
960 248 8 1216 4c0 memory-layout
Since the text area contains the executable instructions of a program, why is its size shown as 960? That seems far too big for the few instructions in this program, as far as I can count.
The reason is probably that the actual start of a program isn't really the main function, but a piece of code added at the linking stage. This code sets up the libraries, clears the BSS segment, and performs other initialization before calling your main function. There is also code to make sure that everything is cleaned up properly when you return from main.
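That code runs before main can be made visible with a constructor function. This is a hedged sketch that relies on __attribute__((constructor)), a GCC and Clang extension rather than standard C; early_init and started_before_main are hypothetical names used only here.

```c
/* Set to 1 by a function that the initialization code invokes before main. */
static int ran_before_main = 0;

/* GCC/Clang extension: functions marked as constructors are called during
   program initialization, before main is entered. */
__attribute__((constructor))
static void early_init(void)
{
    ran_before_main = 1;
}

/* Returns 1 if early_init has already run. */
int started_before_main(void)
{
    return ran_before_main;
}
```

Calling started_before_main() as the first statement of main returns 1, which shows that the linked-in initialization code has already done work by the time main starts.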

Resources