Is a variable that is stored in the .data section by definition a global variable that has program scope? In other words are these two words synonymous and one implies the other, or, for example would it be possible to have a global variable that is not stored in the .data section, or a label/variable that is not global?
Just to give a basic example:
// this is compiled as in the .data section with a .globl directive
char global_int = 11;
int main(int argc, char * argv[])
{
}
Would compile to something like:
global_int:
.byte 11
main:
...
But I'm seeing if the two terms -- global and "in the .data section" are the same thing or if there are counterexamples.
There are two different concepts: Which "section" a variable goes into and its "visibility"
For comparison, I've added a .bss section variable:
char global_int = 11;
char nondata_int;
int
main(int argc, char *argv[])
{
}
Compiling with cc -S produces:
.file "fix1.c"
.text
.globl global_int
.data
.type global_int, @object
.size global_int, 1
global_int:
.byte 11
.comm nondata_int,1,1
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (GNU) 8.3.1 20190223 (Red Hat 8.3.1-2)"
.section .note.GNU-stack,"",@progbits
Note the .data to put the global_int variable in the data section. And, .comm to put nondata_int into the .bss section
Also, note the .globl to make the variables have global visibility (i.e. can be seen by other .o files).
Loosely, .data and/or .bss are the sections that the variables are put into, and global [.globl] is the visibility. If you did:
static int foobar = 63;
Then, foobar would go into the .data section but be local. In the nm output below, instead of D, it would be d to indicate local/static visibility. Other .o files would not be able to see this [or link to it].
An nm of the .o program produces:
0000000000000000 D global_int
0000000000000000 T main
0000000000000001 C nondata_int
And, an nm -g of the final executable produces:
000000000040401d B __bss_start
0000000000404018 D __data_start
0000000000404018 W data_start
0000000000401050 T _dl_relocate_static_pie
0000000000402008 R __dso_handle
000000000040401d D _edata
0000000000404020 B _end
0000000000401198 T _fini
000000000040401c D global_int
w __gmon_start__
0000000000401000 T _init
0000000000402000 R _IO_stdin_used
0000000000401190 T __libc_csu_fini
0000000000401120 T __libc_csu_init
U __libc_start_main@@GLIBC_2.2.5
0000000000401106 T main
000000000040401e B nondata_int
0000000000401020 T _start
0000000000404020 D __TMC_END__
UPDATE:
Thanks for this answer. Regarding "And, .comm to put nondata_int into the .bss section": could you please explain that a bit? I don't see any reference to .bss, so how are those two related?
Sure. There's probably a more rigorous explanation, but loosely, when you do:
int nondata_int;
You are defining a "common" section variable [the historical origin is from Fortran's common].
When linking [to create the final executable], if no other .o [or .a] has declared a value for it, it will be put into the .bss section as a B symbol.
But, if another .o has defined it (e.g. define_it.c):
int nondata_int = 43;
There, define_it.o will put it in the .data section as a D symbol
Then, when you link the two:
gcc -o executable fix1.o define_it.o
Then, in executable, it will go to the .data section as a D symbol.
So, .o files have/use .comm [the assembler directive] and C common section.
Executables have only .data and .bss. So, when the .o files are linked, a common symbol goes to [is promoted to] .bss if it has never been initialized, and to .data if any .o has initialized it.
Loosely, .comm/C is a suggestion and .data and .bss is a "commitment"
This is a nicety of sorts. Technically, in fix1.c, if we knew beforehand that we were going to be linked with define_it.o, we would [probably] want to do:
extern char nondata_int;
Then, in fix1.o, the symbol would be marked as "undefined" (i.e. nm would show U).
But, then, if fix1.o were not linked to anything that defined the symbol, the linker would complain about an undefined symbol.
The common symbol allows us to have multiple .o files that each do:
int nondata_int;
They all produce C symbols. The linker combines all to produce a single symbol.
So, again common C symbols are:
I want a global named X and I want it to be the same X as found in any other .o files, but don't complain about the symbol being multiply defined. If one [and only one] of those .o files gives it an initialized value, I'd like to benefit from that value.
Historically ...
IIRC [and I could be wrong about this], common was added [to the linker] to support Fortran COMMON declarations/variables.
That is, all fortran .o files just declared a symbol as common [its concept of global], but the fortran linker was expected to combine them.
Classic/old fortran could only specify a variable as COMMON (i.e. in C, equivalent to int val;) but fortran did not have global initializers (i.e. it did not have extern int val; or int val = 1;)
This common was useful for C, so, at some point it was added.
In the good old days (tm), the common linker type did not exist and one had to have an explicit extern in all but one .o file and one [and only one] that declared it. That .o that declared it could define it with a value (e.g.) int val = 1; or without (e.g.) int val; but all other .o files had to use extern int val;
I'm writing a simple program that converts brainfuck code into x86_64 assembly. Part of that involves creating a large zero-initialized array at the beginning of the program. Thus, each compiled program starts with the following assembly code:
.data
ARR:
.space 32430
.text
.globl _start
.type _start, @function
_start:
... #code as compiled from the brainfuck program
...
From there the compiled program is supposed to be able to access any part of that array, but it should segfault if it tries to access memory before or after it.
Because the array is followed directly by a .text section, which by my understanding is read only, and because it is the first section of the program, I expected that my desired behavior would follow naturally. Unfortunately, this is not the case: compiled programs are able to access non-zero initialized data to the left of (that is, at lower addresses than) the beginning of the array.
Why is this the case and is there anything I can include in the assembly code that would prevent it?
This is, of course, highly system-dependent, but since your observations suit a typical Linux/GNU system, I'll refer to such a system.
what I assume is that the linker isn't putting my segments where I think it is.
True, the linker puts the segments not in the order they appear in your code snippet, but rather .text first, .data second. We can see this e. g. with
> objdump -h ARR
ARR: file format elf32-i386
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000042 08048074 08048074 00000074 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00007eae 080490b8 080490b8 000000b8 2**2
CONTENTS, ALLOC, LOAD, DATA
compiled programs are able to access non-zero initialized data to the left of (that is, at lower addresses than) the beginning of the array.
Why is this the case …
As we also see in the above example, the .data section is linked at memory address 080490b8. Although memory pages have the length PAGE_SIZE (here getconf PAGE_SIZE yields 4096, i.e. 0x1000) and start at multiples of that size, the data starts at an address offset equal to the file offset 000000b8 (where the data is stored in the disk file), because the file pages containing the .data section are mapped into memory as copy-on-write pages. The non-zero initialized data below the .data section is just what happens to be in the first file page at bytes 0 to 0xb7, including .text.
… is there anything I can include in the assembly code that would prevent it?
I'd prefer a solution that places my segments such that a bad array access causes a segfault.
As Margaret Bloom and Ped7g hinted at, you could allocate additional data below ARR and create an inaccessible guard page. This can be achieved with minimal effort by aligning ARR to the next page address. The example program below implements this and lets you test it by accepting an index argument (optionally negative) with which the ARR data is accessed; if the index is within bounds, the program should exit with status 0, otherwise it should segfault. Note: This method works only if the .text section does not end at a page boundary, because if it does, the .align 4096 has no effect; but since the assembly code is created with a converter program, that program should be able to check this and add a few extra .text bytes if needed.
.data
.align 4096
ARR:
.space 30000 # we'll actually get 32768
.text
.globl _start
.type _start, @function
_start:
mov (%esp),%ebx # argc
cmp $1,%ebx
jbe 9f
mov $0,%ax
mov $1,%ebx # sign 1
mov 8(%esp),%esi # argv[1]
0: movb (%esi),%cl # convert argument string to integer
jcxz 1f
sub $'0',%cl
js 2f
mov $10,%dx
mul %dx
add %cx,%ax
jmp 3f
2: neg %ebx # change sign
3: add $1,%esi
jmp 0b
1: mul %ebx # multiply with sign 1 or -1
movzx ARR(%eax),%ebx # load ARR[atoi(argv[1])]
9: mov $1,%eax
int $128 # _exit(ebx);
I tried to read an input into a buffer with fgets. I pushed the 3 parameters, but got segmentation fault. I tried to see the problem with GDB, but I didn't understand the message that I got there.
This is the code:
section .rodata
buffer: db 10
section .text
align 16
global main
extern fgets
extern stdin
main:
push ebp
mov ebp, esp
pushad
push dword[stdin]
push 10;
push buffer;
call fgets;
add esp, 12
popad
mov esp, ebp
pop ebp
ret
And this is the message that I got:
Program received signal SIGSEGV, Segmentation fault.
__GI__IO_getline_info (fp=fp@entry=0xf7fb1c20 <_IO_2_1_stdin_>,
buf=buf@entry=0x80484f0 "\n", n=8, n@entry=9, delim=delim@entry=10,
extract_delim=extract_delim@entry=1, eof=eof@entry=0x0) at iogetline.c:86
86 iogetline.c: No such file or directory.
What is wrong with my code?
You segfault because you ask fgets to write to an address in the .rodata section. It's of course read-only.
Put your buffer in the .bss section, and use resb 10 to reserve 10 bytes. Your current version is one byte, initialized to { 10 }. You don't want to store a bunch of zeros in your executable for no reason; that's what the bss is for.
section .bss
buffer: resb 10
buffer_length equ $ - buffer
section .text
align 16
global main
extern fgets
extern stdin
main:
push dword [stdin]
push buffer_length
push buffer ; 3 pushes gets the stack back to 16B-alignment
call fgets
add esp, 12
ret
You don't need pusha, or a stack frame (the stuff with ebp) in this function. Normally you only save/restore call-preserved registers you want to use, not all of them every time.
As Michael Petch points out, it would also be better to reserve space on the stack for the buffer, instead of using static storage. Have a look at compiler output for an equivalent C function that uses a local array. (e.g. on http://gcc.godbolt.org/).
static int func_name (const uint8_t * address)
{
int result;
asm ("movl $1f, %0; movzbl %1, %0; 1:"
: "=&a" (result) : "m" (*address));
return result;
}
I have gone through inline assembly references over internet.
But I am unable to figure out what this code is doing, e.g. what is $1f?
And what does "m" mean? Isn't the normal inline convention to use "=r" and "r"?
The code is functionally identical to return *address, but not exactly equivalent to it with respect to the generated binary / object file.
In ELF, the usage of the forward reference (i.e. the mov $1f, ... to retrieve the address of the assembly local label) results in the creation of what's called a relocation. A relocation is an instruction to the linker (either at executable creation or later to the dynamic linker at executable/library loading) to insert a value only known at link/load time. In the object code, this looks like:
Disassembly of section .text:
0000000000000000 :
0: b8 00 00 00 00 mov $0x0,%eax
5: 0f b6 07 movzbl (%rdi),%eax
8: c3 retq
Notice the value (at offset 1 into the .text section) is zero here even though that's actually not correct - it depends on where in the running code the function will end up. Only the (dynamic) linker can ultimately know this, and the information that this piece of memory needs to be updated as it is loaded is actually placed into the object file:
$ readelf -a xqf.o
ELF Header:
[ ... ]
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000000000 00000040
0000000000000009 0000000000000000 AX 0 0 16
[ 2] .rela.text RELA 0000000000000000 000004e0
0000000000000018 0000000000000018 10 1 8
[ ... ]
Relocation section '.rela.text' at offset 0x4e0 contains 1 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000001 00020000000a R_X86_64_32 0000000000000000 .text + 8
[ ... ]
This ELF section entry says:
look at offset 1 into the .text section
there's a 32bit value that will be zero-extended to 64-bit (R_X86_64_32). This may have been intended for use in 32-bit code, but in a 64-bit non-PIE executable that's still the most efficient way to put an address into a register; smaller than lea 1f(%rip), %0 for a R_X86_64_PC32 RIP-relative relocation. And yes a RIP-relative LEA into a 32-bit register is legal, and saves a byte of machine code if you don't care about truncating the address.
the value you (as the linker) need to put there is that of .text + 8 (which will have to be computed at link / load time)
This entry is created thanks to the mov $1f, %0 instruction. If you leave that out (or just write return *address), it won't be there.
I've forced code generation for the above by removing the static qualifier; without doing so, a simple compile actually creates no code at all (static code gets eliminated if not used, and, a lot of the time, inlined if used).
Due to the fact that the function is static, as said, it'll normally be inlined at the call site by the compiler. The information where it's used therefore usually gets lost, as does the ability of a debugger to instrument it. But the trick shown here can recover this (indirectly), because there will be one relocation entry created per use of the function. In addition to that, methods like this can be used to establish instrumentation points within the binary; insert well-known/strictly-defined but functionally-meaningless small assembly statements at locations recoverable through the object file format, and then let e.g. the debugger / tracing utilities replace them with "more useful" things when needed.
$1f is the address of the 1 label. The f specifies to look for the first label named 1 in the forward direction. "m" is an input operand that is in memory. "=&a" is an output operand that uses the eax register. a specifies the register to use, = makes it an output operand, and & guarantees that other operands will not share the same register.
Here, %0 will be replaced with the first operand (the eax register) and %1 with the second operand (the memory that address points to).
All these and more are explained in the GCC documentation on inline assembly and asm constraints.
This piece of code (apart from being non-compilable due to two typos) is hardly useful.
This is what it turns into (use the -S switch):
_func_name:
movl 4(%esp), %edx ; edx = the "address" parameter
movl $1f, %eax ; eax = the address of the "1" label
movzbl (%edx), %eax; eax = byte from address in edx, IOW, "*address"
1:
ret
So the entire body of the function can be replaced with just
return *address;
This is a code snippet from the PintOS project.
The function here is used by the OS kernel to read a byte at address from the user address space. The read itself is done by movzbl %1, %0, where %0 is result and %1 is the byte at address. Before that, the kernel moves $1f (the address of the instruction right after movzbl %1, %0) into the eax register. This move seems useless only because some context is missing: the kernel does it for the benefit of the page fault handler. address may be an invalid pointer supplied by the user, so the read might cause a page fault. When that happens, the handler takes over, sets eip to the value in eax (the address of the 1 label), and sets eax to -1 to indicate that the read failed. The kernel can then return from the handler to the 1 label and move on. Without the saved address, the handler would have no idea where to return to, and could only go back to movzbl %1, %0 again and again.
I have written the following code. Can you explain what the assembly is telling me here?
typedef struct
{
int abcd[5];
} hh;
void main()
{
printf("%d", ((hh*)0)+1);
}
Assembly:
.file "aa.c"
.section ".rodata"
.align 8
.LLC0:
.asciz "%d\n"
.section ".text"
.align 4
.global main
.type main, #function
.proc 020
main:
save %sp, -112, %sp
sethi %hi(.LLC0), %g1
or %g1, %lo(.LLC0), %o0
mov 20, %o1
call printf, 0
nop
return %i7+8
nop
.size main, .-main
.ident "GCC: (GNU) 4.2.1"
Oh wow, SPARC assembly language, I haven't seen that in years.
I guess we go line by line? I'm going to skip some of the uninteresting boilerplate.
.section ".rodata"
.align 8
.LLC0:
.asciz "%d\n"
This is the string constant you used in printf (so obvious, I know!) The important things to notice are that it's in the .rodata section (sections are divisions of the eventual executable image; this one is for "read-only data" and will in fact be immutable at runtime) and that it's been given the label .LLC0. Labels that begin with a dot are private to the object file. Later, the compiler will refer to that label when it wants to load the address of the string constant.
.section ".text"
.align 4
.global main
.type main, #function
.proc 020
main:
.text is the section for actual machine code. This is the boilerplate header for defining the global function named main, which at the assembly level is no different from any other function (in C -- not necessarily so in C++). I don't remember what .proc 020 does.
save %sp, -112, %sp
Save the previous register window and adjust the stack pointer downward. If you don't know what a register window is, you need to read the architecture manual: http://sparc.org/wp-content/uploads/2014/01/v8.pdf.gz. (V8 is the last 32-bit iteration of SPARC, V9 is the first 64-bit one. This appears to be 32-bit code.)
sethi %hi(.LLC0), %g1
or %g1, %lo(.LLC0), %o0
This two-instruction sequence has the net effect of loading the address .LLC0 (that's your string constant) into register %o0, which is the first outgoing argument register. (The arguments to this function are in the incoming argument registers.)
mov 20, %o1
Load the immediate constant 20 into %o1, the second outgoing argument register. This is the value computed by ((hh *)0)+1. It's 20 because your struct hh is 20 bytes long (five 4-byte ints) and you asked for the second one within the array starting at address zero.
Incidentally, computing an offset from a pointer is only well-defined in C when there is actually a sufficiently large array at the address of the base pointer; ((hh *)0) is a null pointer, so there isn't an array there, so the expression ((hh *)0)+1 technically has undefined behavior. GCC 4.2.1, targeting hosted SPARC, happens to have interpreted it as "pretend there is an arbitrarily large array of hh structs at address zero and compute the expected offset for array member 1", but other (especially newer) compilers may do something completely different.
call printf, 0
nop
Call printf. I don't remember what the zero is for. The call instruction has a delay slot (again, read the architecture manual) which is filled in with a do-nothing instruction, nop.
return %i7+8
nop
Jump to the address in register %i7 plus eight. This has the effect of returning from the current function.
return also has a delay slot, which is filled in with another nop. There is supposed to be a restore instruction in this delay slot, matching the save at the top of the function, so that main's caller gets its register window back. I don't know why it's not there. Discussion in the comments talks about main possibly not needing to pop the register window, and/or your having declared main as void main() (which is not guaranteed to work with any C implementation, unless its documentation specifically says so, and is always bad style) ... but pushing and not popping the register window is such a troublesome thing to do on a SPARC that I don't find either explanation convincing. I might even call it a compiler bug.
The assembly calls printf, passing the address of your format string in %o0 and the number 20 in %o1 (which is what you asked for in a roundabout way).