Why has the .bss segment not increased when variables are added? - c

Recently, I learned that the .bss segment stores uninitialized data. However, when I try a small program as below and run the size(1) command in a terminal, the .bss segment doesn't change, even if I add some global variables. Do I misunderstand something?
jameschu#aspire-e5-573g:~$ cat test.c
#include <stdio.h>
int main(void)
{
printf("hello world\n");
return 0;
}
jameschu#aspire-e5-573g:~$ gcc -c test.c
jameschu#aspire-e5-573g:~$ size test.o
text data bss dec hex filename
89 0 0 89 59 test.o
jameschu#aspire-e5-573g:~$ cat test.c
#include <stdio.h>
int a1;
int a2;
int a3;
int main(void)
{
printf("hello world\n");
return 0;
}
jameschu#aspire-e5-573g:~$ gcc -c test.c
jameschu#aspire-e5-573g:~$ size test.o
text data bss dec hex filename
89 0 0 89 59 test.o

This is because of the way global variables work.
The problem being solved is that a global variable can be declared, without an initializer, in several .c files without causing a duplicate symbol error. That is, every uninitialized global declaration behaves like a weak definition: if another file defines the same variable with an initializer, that definition wins; otherwise the uninitialized copies are merged into one. (This is GCC's -fcommon behavior, which was the default before GCC 10; newer versions default to -fno-common and put such variables directly in .bss.)
How is this implemented by the compiler? Easy:
When compiling, instead of adding the variable to the bss segment, it is added to the COMMON segment.
When linking, all the COMMON variables with the same name are merged, and any that is already defined in another section is discarded. The remaining ones are moved to the bss of the executable.
And that is why you don't see your variables in the bss of the object file, but you do in the executable file.
You can check the contents of the object file's sections with a more detailed tool than size, such as objdump -x. Note how the variables are placed in *COM*.
It is worth noting that if you declare your global variable as static, you are saying that the variable belongs to that compilation unit, so COMMON is not used and you get the behavior you expect:
int a;
int b;
static int c;
$ size test.o
text data bss dec hex filename
91 0 4 95 5f test.o
Initializing to 0 will get a similar result.
int a;
int b;
int c = 0;
$ size test.o
text data bss dec hex filename
91 0 4 95 5f test.o
However, initializing to anything other than 0 will move that variable to data:
int a;
int b = 1;
int c = 0;
$ size test.o
text data bss dec hex filename
91 4 4 99 63 test.o

Related

Shellcode not running, despite disabling stack protections

I am exploring shellcode. I wrote an example program as part of my exploration.
Using objdump, I got the following shellcode:
\xb8\x0a\x00\x00\x00\xc
for the simple function:
int boo()
{
return(10);
}
I then wrote the following program to attempt to run the shellcode:
#include <stdio.h>
#include <stdlib.h>
unsigned char code[] = "\xb8\x0a\x00\x00\x00\xc3";
int main(int argc, char **argv) {
int foo_value = 0;
int (*foo)() = (int(*)())code;
foo_value = foo();
printf("%d\n", foo_value);
}
I am compiling using gcc, with the options:
-fno-stack-protector -z execstack
However, when I attempt to run, I still get a segfault.
What am I messing up?
You're almost there!
You have placed your code[] array outside of main, so it is a global array. Global variables are not placed on the stack. They can be placed:
In the BSS section if they are not initialized
In the data section if they are initialized and need both read and write access
In the rodata section if they are read-only (declared const)
Let's verify this. You can use the readelf command to check all the sections of your binary (I only show the ones we are interested in):
$ readelf -S --wide <your binary>
There are 31 section headers, starting at offset 0x39c0:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[...]
[16] .text PROGBITS 0000000000001060 001060 0001a5 00 AX 0 0 16
[...]
[18] .rodata PROGBITS 0000000000002000 002000 000008 00
[...]
[25] .data PROGBITS 0000000000004000 003000 000017 00 WA 0 0 8
[...]
[26] .bss NOBITS 0000000000004017 003017 000001 00 WA 0 0 1
Then we can look for your symbol code in your binary:
$ readelf -s <your binary> | grep code
66: 0000000000004010 7 OBJECT GLOBAL DEFAULT 25 code
This confirms that your array code is in the .data section (section 25), which does not have the X flag, so you cannot execute code from it.
From there, the solution is obvious: place your array inside your main function (uint8_t requires #include <stdint.h>):
int main(int argc, char **argv) {
uint8_t code[] = "\xb8\x0a\x00\x00\x00\xc3";
int foo_value = 0;
int (*foo)() = (int(*)())code;
foo_value = foo();
printf("%d\n", foo_value);
}
However, this may not work either!
Your C compiler may notice that, although you use code, you never read anything from it, so it optimizes the initialization away and simply allocates the array on the stack without initializing it. This is what happens with my version of GCC.
To stop the compiler from optimizing the array away, use the volatile keyword.
int main(int argc, char **argv) {
volatile uint8_t code[] = "\xb8\x0a\x00\x00\x00\xc3";
int foo_value = 0;
int (*foo)() = (int(*)())code;
foo_value = foo();
printf("%d\n", foo_value);
}
In a real use case, your array would be allocated on the stack and passed as a parameter to another function, which would itself fill the array with shellcode. So you would not run into this compiler optimization.

Ambiguous behaviour of .bss segment in C program

I wrote the simple C program (test.c) below:-
#include<stdio.h>
int main()
{
return 0;
}
and executed the following to understand size changes in the .bss segment.
gcc test.c -o test
size test
The output came out as:-
text data bss dec hex filename
1115 552 8 1675 68b test
I didn't declare anything globally or with static scope. So please explain why the bss segment size is 8 bytes.
I made the following change:-
#include<stdio.h>
int x; //declared global variable
int main()
{
return 0;
}
But to my surprise, the output was the same as before:-
text data bss dec hex filename
1115 552 8 1675 68b test
Please explain.
I then initialized the global:-
#include<stdio.h>
int x=67; //initialized global variable
int main()
{
return 0;
}
The data segment size increased as expected, but I didn't expect the size of the bss segment to drop to 4 (as opposed to 8 when nothing was declared). Please explain.
text data bss dec hex filename
1115 556 4 1675 68b test
I also tried the objdump and nm commands, and they too showed variable x occupying .bss (in the 2nd case). However, the size command shows no change in bss size.
I followed the procedure according to:
http://codingfox.com/10-7-memory-segments-code-data-bss/
where the outputs are coming perfectly as expected.
When you compile a simple main program, you are also linking startup code.
Among other things, this code is responsible for initializing bss.
That startup code is what "uses" the 8 bytes you are seeing in the .bss section.
You can leave that code out with the -nostartfiles gcc option:
-nostartfiles
Do not use the standard system startup files when linking. The standard system libraries are used normally, unless -nostdlib or -nodefaultlibs is used.
To test this, use the following code
#include<stdio.h>
int _start()
{
return 0;
}
and compile it with
gcc -nostartfiles test.c
You'll see .bss set to 0:
text data bss dec hex filename
206 224 0 430 1ae test
Your first two snippets are identical since you aren't using the variable x.
Try this
#include<stdio.h>
volatile int x;
int main()
{
x = 1;
return 0;
}
and you should see a change in .bss size.
Please note that those 4/8 bytes belong to something inside the start-up code. What it is and why it varies in size cannot be determined without digging into the details of that start-up code.

Why the int type takes up 8 bytes in BSS section but 4 bytes in DATA section

I am trying to learn the structure of the executable files of C programs. My environment is GCC on a 64-bit Intel processor.
Consider the following code, a.cc:
#include <cstdlib>
#include <cstdio>
int x;
int main(){
printf("%zu\n", sizeof(x)); /* sizeof yields a size_t, so %zu rather than %d */
return 10;
}
Running size a shows
text data bss dec hex filename
1134 552 8 1694 69e a
Then I added another global variable, an initialized one, y:
int y=10;
The size a shows (where a is the name of the executable file from a.cc)
text data bss dec hex filename
1134 556 12 1702 6a6 a
As we know, the BSS section holds uninitialized global variables and DATA holds initialized ones.
Why does int take up 8 bytes in BSS? sizeof(x) in my code shows that the int actually takes up 4 bytes.
The int y=10 added 4 bytes to DATA, which makes sense since an int should take 4 bytes. But why does it also add 4 bytes to BSS?
The difference between the two size outputs stays the same after deleting the two #include lines.
Update:
I think my understanding of BSS is wrong. It may not store the uninitialized global variables. As Wikipedia says, "The size that BSS will require at runtime is recorded in the object file, but BSS (unlike the data segment) doesn't take up any actual space in the object file." For example, even the one-line C program int main(){} has a bss of 8.
Does the 8 or 16 of BSS come from alignment?
It doesn't; an int takes up 4 bytes regardless of which segment it's in. You can use the nm tool (from the GNU binutils package) with the -S argument to get the names and sizes of all of the symbols in the object file. You're likely seeing side effects of the toolchain including or not including certain other symbols for whatever reason.
For example:
$ cat a1.c
int x;
$ cat a2.c
int x = 1;
$ gcc -c a1.c a2.c
$ nm -S a1.o a2.o
a1.o:
0000000000000004 0000000000000004 C x
a2.o:
0000000000000000 0000000000000004 D x
One object file has a 4-byte object named x as a common, uninitialized symbol (C), while the other object file has a 4-byte object named x in the initialized data segment (D).

Memory map of C program with no global and local variables

I wrote a basic program:
#include<stdio.h>
int main(void)
{
return 0;
}
and checked its size:
gcc -Wall test1.c
size a.out
text data bss dec hex filename
988 260 8 1256 4e8 a.out
Just for my own knowledge: I did not declare any variable, global or local, initialized or uninitialized, so why are data and bss shown as 260 and 8 respectively?
Is this for the stack pointer and other variables required for code execution?

If a global variable is initialized to 0, will it go to BSS?

All initialized global/static variables go to the initialized data section.
All uninitialized global/static variables go to the uninitialized data section (BSS). The variables in BSS get the value 0 at program load time.
If a global variable is explicitly initialized to zero (int myglobal = 0), where will that variable be stored?
The compiler is free to put such a variable in bss or in data. For example, GCC has a special option controlling this behavior:
-fno-zero-initialized-in-bss
If the target supports a BSS section, GCC by default puts variables that are initialized to zero into BSS. This
can save space in the resulting code. This option turns off this
behavior because some programs explicitly rely on variables going to
the data section. E.g., so that the resulting executable can find the
beginning of that section and/or make assumptions based on that.
The default is -fzero-initialized-in-bss.
Tried with the following example (test.c file):
int put_me_somewhere = 0;
int main(int argc, char* argv[]) { return 0; }
Compiling with no options (implicitly -fzero-initialized-in-bss):
$ touch test.c && make test && objdump -x test | grep put_me_somewhere
cc test.c -o test
0000000000601028 g O .bss 0000000000000004 put_me_somewhere
Compiling with -fno-zero-initialized-in-bss option:
$ touch test.c && make test CFLAGS=-fno-zero-initialized-in-bss && objdump -x test | grep put_me_somewhere
cc -fno-zero-initialized-in-bss test.c -o test
0000000000601018 g O .data 0000000000000004 put_me_somewhere
It's easy enough to test for a specific compiler:
$ cat bss.c
int global_no_value;
int global_initialized = 0;
int main(int argc, char* argv[]) {
return 0;
}
$ make bss
cc bss.c -o bss
$ readelf -s bss | grep global_
32: 0000000000400420 0 FUNC LOCAL DEFAULT 13 __do_global_dtors_aux
40: 0000000000400570 0 FUNC LOCAL DEFAULT 13 __do_global_ctors_aux
55: 0000000000601028 4 OBJECT GLOBAL DEFAULT 25 global_initialized
60: 000000000060102c 4 OBJECT GLOBAL DEFAULT 25 global_no_value
We're looking for the location of 0000000000601028 and 000000000060102c:
$ readelf -S bss
There are 30 section headers, starting at offset 0x1170:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[24] .data PROGBITS 0000000000601008 00001008
0000000000000010 0000000000000000 WA 0 0 8
[25] .bss NOBITS 0000000000601018 00001018
0000000000000018 0000000000000000 WA 0 0 8
It looks like both values are stored in the .bss section on my system: gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4).
The behavior depends on the C implementation. The variable may end up in either .data or .bss, and to increase the chances that it does not end up in .data, taking up space needlessly, it is better not to initialize it to 0 explicitly, since it will be set to 0 anyway if the object has static storage duration.
