Seg Fault in ARM Assembly

So, I am trying to learn ARM assembly and basically what I want to do is turn on the LEDs of my BeagleBone Black using pure assembly. I know how to program in C very well, but I am new to ARM assembly if that makes any difference.
Basically I am just trying to modify a character in a string, but it doesn't seem to be working. Maybe it is because I do not fully understand the memory management instructions.
When I run the code it gives me a segmentation fault.
Here is my code:
.syntax unified
.global main
main:
push {ip, lr}
mov r0, beagle_bone_0
mov r1, #0x65
strb r1, [r0]
ldr r0, =beagle_bone_0
bl printf
pop {ip, pc}
beagle_bone_0:
.asciz "/sys/class/leds/beaglebone:green:usr0/brightness"
objdump -x output:
helloworld: file format elf32-littlearm
helloworld
architecture: arm, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x00008325
Program Header:
0x70000001 off 0x00000444 vaddr 0x00008444 paddr 0x00008444 align 2**2
filesz 0x00000008 memsz 0x00000008 flags r--
PHDR off 0x00000034 vaddr 0x00008034 paddr 0x00008034 align 2**2
filesz 0x00000100 memsz 0x00000100 flags r-x
INTERP off 0x00000134 vaddr 0x00008134 paddr 0x00008134 align 2**0
filesz 0x00000019 memsz 0x00000019 flags r--
LOAD off 0x00000000 vaddr 0x00008000 paddr 0x00008000 align 2**15
filesz 0x00000450 memsz 0x00000450 flags r-x
LOAD off 0x00000450 vaddr 0x00010450 paddr 0x00010450 align 2**15
filesz 0x00000124 memsz 0x00000128 flags rw-
DYNAMIC off 0x0000045c vaddr 0x0001045c paddr 0x0001045c align 2**2
filesz 0x000000f0 memsz 0x000000f0 flags rw-
NOTE off 0x00000150 vaddr 0x00008150 paddr 0x00008150 align 2**2
filesz 0x00000044 memsz 0x00000044 flags r--
STACK off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**2
filesz 0x00000000 memsz 0x00000000 flags rwx
Dynamic Section:
NEEDED libc.so.6
INIT 0x000082d1
FINI 0x00008439
INIT_ARRAY 0x00010450
INIT_ARRAYSZ 0x00000004
FINI_ARRAY 0x00010454
FINI_ARRAYSZ 0x00000004
HASH 0x00008194
GNU_HASH 0x000081bc
STRTAB 0x00008238
SYMTAB 0x000081e8
STRSZ 0x00000043
SYMENT 0x00000010
DEBUG 0x00000000
PLTGOT 0x0001054c
PLTRELSZ 0x00000020
PLTREL 0x00000011
JMPREL 0x000082b0
REL 0x000082a8
RELSZ 0x00000008
RELENT 0x00000008
VERNEED 0x00008288
VERNEEDNUM 0x00000001
VERSYM 0x0000827c
Version References:
required from libc.so.6:
0x0d696914 0x00 02 GLIBC_2.4
private flags = 5000002: [Version5 EABI] [has entry point]
Sections:
Idx Name Size VMA LMA File off Algn
0 .interp 00000019 00008134 00008134 00000134 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
1 .note.ABI-tag 00000020 00008150 00008150 00000150 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .note.gnu.build-id 00000024 00008170 00008170 00000170 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .hash 00000028 00008194 00008194 00000194 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .gnu.hash 0000002c 000081bc 000081bc 000001bc 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .dynsym 00000050 000081e8 000081e8 000001e8 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
6 .dynstr 00000043 00008238 00008238 00000238 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
7 .gnu.version 0000000a 0000827c 0000827c 0000027c 2**1
CONTENTS, ALLOC, LOAD, READONLY, DATA
8 .gnu.version_r 00000020 00008288 00008288 00000288 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
9 .rel.dyn 00000008 000082a8 000082a8 000002a8 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
10 .rel.plt 00000020 000082b0 000082b0 000002b0 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
11 .init 0000000a 000082d0 000082d0 000002d0 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
12 .plt 00000048 000082dc 000082dc 000002dc 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
13 .text 00000114 00008324 00008324 00000324 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
14 .fini 00000006 00008438 00008438 00000438 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
15 .rodata 00000004 00008440 00008440 00000440 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
16 .ARM.exidx 00000008 00008444 00008444 00000444 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
17 .eh_frame 00000004 0000844c 0000844c 0000044c 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
18 .init_array 00000004 00010450 00010450 00000450 2**2
CONTENTS, ALLOC, LOAD, DATA
19 .fini_array 00000004 00010454 00010454 00000454 2**2
CONTENTS, ALLOC, LOAD, DATA
20 .jcr 00000004 00010458 00010458 00000458 2**2
CONTENTS, ALLOC, LOAD, DATA
21 .dynamic 000000f0 0001045c 0001045c 0000045c 2**2
CONTENTS, ALLOC, LOAD, DATA
22 .got 00000020 0001054c 0001054c 0000054c 2**2
CONTENTS, ALLOC, LOAD, DATA
23 .data 00000008 0001056c 0001056c 0000056c 2**2
CONTENTS, ALLOC, LOAD, DATA
24 .bss 00000004 00010574 00010574 00000574 2**0
ALLOC
25 .comment 0000001d 00000000 00000000 00000574 2**0
CONTENTS, READONLY
26 .ARM.attributes 00000031 00000000 00000000 00000591 2**0
CONTENTS, READONLY
SYMBOL TABLE:
00008134 l d .interp 00000000 .interp
00008150 l d .note.ABI-tag 00000000 .note.ABI-tag
00008170 l d .note.gnu.build-id 00000000 .note.gnu.build-id
00008194 l d .hash 00000000 .hash
000081bc l d .gnu.hash 00000000 .gnu.hash
000081e8 l d .dynsym 00000000 .dynsym
00008238 l d .dynstr 00000000 .dynstr
0000827c l d .gnu.version 00000000 .gnu.version
00008288 l d .gnu.version_r 00000000 .gnu.version_r
000082a8 l d .rel.dyn 00000000 .rel.dyn
000082b0 l d .rel.plt 00000000 .rel.plt
000082d0 l d .init 00000000 .init
000082dc l d .plt 00000000 .plt
00008324 l d .text 00000000 .text
00008438 l d .fini 00000000 .fini
00008440 l d .rodata 00000000 .rodata
00008444 l d .ARM.exidx 00000000 .ARM.exidx
0000844c l d .eh_frame 00000000 .eh_frame
00010450 l d .init_array 00000000 .init_array
00010454 l d .fini_array 00000000 .fini_array
00010458 l d .jcr 00000000 .jcr
0001045c l d .dynamic 00000000 .dynamic
0001054c l d .got 00000000 .got
0001056c l d .data 00000000 .data
00010574 l d .bss 00000000 .bss
00000000 l d .comment 00000000 .comment
00000000 l d .ARM.attributes 00000000 .ARM.attributes
0000835c l F .text 00000000 call_gmon_start
00000000 l df *ABS* 00000000 crtstuff.c
00010458 l O .jcr 00000000 __JCR_LIST__
00008374 l F .text 00000000 __do_global_dtors_aux
00010574 l O .bss 00000001 completed.5637
00010454 l O .fini_array 00000000 __do_global_dtors_aux_fini_array_entry
00008384 l F .text 00000000 frame_dummy
00010450 l O .init_array 00000000 __frame_dummy_init_array_entry
000083b8 l .text 00000000 beagle_bone_0
00000000 l df *ABS* 00000000 crtstuff.c
0000844c l O .eh_frame 00000000 __FRAME_END__
00010458 l O .jcr 00000000 __JCR_END__
00010454 l .init_array 00000000 __init_array_end
0001045c l O .dynamic 00000000 _DYNAMIC
00010450 l .init_array 00000000 __init_array_start
0001054c l O .got 00000000 _GLOBAL_OFFSET_TABLE_
00008434 g F .text 00000002 __libc_csu_fini
0001056c w .data 00000000 data_start
000082f0 F *UND* 00000000 printf##GLIBC_2.4
00010574 g *ABS* 00000000 __bss_start__
00010578 g *ABS* 00000000 _bss_end__
00010574 g *ABS* 00000000 _edata
00008438 g F .fini 00000000 _fini
00010578 g *ABS* 00000000 __bss_end__
0001056c g .data 00000000 __data_start
000082fc F *UND* 00000000 __libc_start_main##GLIBC_2.4
00000000 w *UND* 00000000 __gmon_start__
00010570 g O .data 00000000 .hidden __dso_handle
00008440 g O .rodata 00000004 _IO_stdin_used
000083f0 g F .text 00000044 __libc_csu_init
00010578 g *ABS* 00000000 _end
00008324 g F .text 00000000 _start
00010578 g *ABS* 00000000 __end__
00010574 g *ABS* 00000000 __bss_start
0000839c g .text 00000000 main
00000000 w *UND* 00000000 _Jv_RegisterClasses
00008318 F *UND* 00000000 abort##GLIBC_2.4
000082d0 g F .init 00000000 _init

The answer to my question was actually really simple. Since ldr r0, =beagle_bone_0 loads the address of beagle_bone_0 into r0, I can just modify the string through that address. The string also has to live in a writable section (.data) rather than in .text.
Working test code:
.syntax unified
.data
beagle_bone_0: .asciz "Hello, world\n"
.text
.global main
main:
push {ip, lr}
ldr r0, =beagle_bone_0
mov r1, #0x65
strb r1, [r0]
bl printf
pop {ip, pc}

I ran and debugged your code. The line mov r0, beagle_bone_0 didn't even assemble (with my assembler, at least). You want to load the address of beagle_bone_0 into r0. For this, you can use the adr pseudo-instruction, which the assembler translates into a pc-relative add or sub (something like add r0, pc, #8). A plain mov cannot take a label as its operand, so your assembler probably translated it into something different.
So, to fix it, just replace the line mov r0, beagle_bone_0 with adr r0, beagle_bone_0.
Also, the string was in the .text section, which is mapped read-only (r-x in the program headers), so the strb faults. So, I put beagle_bone_0 in the .data section. Note that adr only reaches labels in the same section as the instruction, so once the string is in .data you need ldr r0, =beagle_bone_0 instead.
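Since the asker knows C well, the .text versus .data point has a close C analogue. The following is only a minimal sketch; the names patch_first_byte and demo_writable_copy are hypothetical and not from the question. A char array initialised from a literal is writable storage (much like a label in .data), whereas writing through a pointer to a string literal is undefined behaviour and commonly segfaults for the same reason the original strb did.

```c
#include <stddef.h>

/* Overwrite the first byte of a writable, NUL-terminated buffer.
 * Mirrors `mov r1, #0x65` followed by `strb r1, [r0]`. */
void patch_first_byte(char *buf, char value)
{
    if (buf != NULL && buf[0] != '\0') {
        buf[0] = value;
    }
}

/* Patch a writable copy of the string and return its new first byte. */
char demo_writable_copy(void)
{
    /* An array initialised from a literal is writable (like .data). */
    char text[] = "Hello, world\n";

    patch_first_byte(text, 0x65); /* 0x65 is ASCII 'e' */
    return text[0];

    /* By contrast, `char *p = "Hello, world\n"; p[0] = 0x65;` writes to
     * a string literal: undefined behaviour in C, and typically a
     * segfault because the literal sits in a read-only mapping. */
}
```

Calling demo_writable_copy() exercises only the safe case; the faulting variant is left as a comment on purpose.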

Related

readelf shows wrong section offset

For some sections of an ELF file, '.dynstr' for example, the offset written in the section header table is not what "readelf -a" reports. For example, the offset in the file is "0x0245" but readelf reports "0x0300" as the offset of the section. I can confirm with a hex editor that the offset returned by "readelf" is wrong. For the first few sections the offset readelf reports is correct, but from some point onward all the offsets are wrong. Does anybody know why the reported offsets are different from what is written in the file? Or is this a bug in readelf?
Note: By using "objdump -h" I can also confirm that the offsets reported by readelf are wrong
Note2: Some offsets returned by 'readelf' are even bigger than the file size.
This is an example of the readelf output:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .interp PROGBITS 0000000000400200 00000200
000000000000001c 0000000000000000 A 0 0 1
[ 2] .note.ABI-tag NOTE 000000000040021c 0000021c
0000000000000020 0000000000000000 A 0 0 4
[ 3] .note.gnu.build-i NOTE 000000000040023c 0000023c
0000000000000024 0000000000000000 A 0 0 4
[ 4] .gnu.hash GNU_HASH 0000000000400260 00000260
000000000000001c 0000000000000000 A 5 0 8
[ 5] .dynsym DYNSYM 0000000000400280 00000280
0000000000000120 0000000000000018 A 6 1 8
>> [ 6] .dynstr STRTAB 00000000004003a0 000003a0 <<< 0x3a0 is wrong
0000000000000084 0000000000000000 A 0 0 1
This is 'objdump -h' output :
Sections:
Idx Name Size VMA LMA File off Algn
0 .interp 0000001c 0000000000400200 0000000000400200 00000200 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
1 .note.ABI-tag 00000020 000000000040021c 000000000040021c 0000021c 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .note.gnu.build-id 00000024 000000000040023c 000000000040023c 0000023c 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .gnu.hash 0000001c 0000000000400260 0000000000400260 00000260 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .dynsym 00000048 0000000000400280 0000000000400280 00000280 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .dynstr 00000038 00000000004002c8 00000000004002c8 000002c8 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
6 .gnu.version 00000006 0000000000400300 0000000000400300 00000300 2**1
CONTENTS, ALLOC, LOAD, READONLY, DATA
7 .gnu.version_r 00000020 0000000000400308 0000000000400308 00000308 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
8 .rela.dyn 00000018 0000000000400328 0000000000400328 00000328 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
9 .rela.plt 00000030 0000000000400340 0000000000400340 00000340 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
10 .init 0000001a 0000000000400370 0000000000400370 00000370 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
11 .plt 00000030 0000000000400390 0000000000400390 00000390 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
12 .text 00000182 00000000004003c0 00000000004003c0 000003c0 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
13 .fini 00000009 0000000000400544 0000000000400544 00000544 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
14 .rodata 00000004 0000000000400550 0000000000400550 00000550 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
15 .eh_frame_hdr 00000034 0000000000400554 0000000000400554 00000554 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
16 .eh_frame 000000f4 0000000000400588 0000000000400588 00000588 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
17 .init_array 00000008 0000000000600680 0000000000600680 00000680 2**3
CONTENTS, ALLOC, LOAD, DATA
18 .fini_array 00000008 0000000000600688 0000000000600688 00000688 2**3
CONTENTS, ALLOC, LOAD, DATA
19 .jcr 00000008 0000000000600690 0000000000600690 00000690 2**3
CONTENTS, ALLOC, LOAD, DATA
20 .dynamic 000001d0 0000000000600698 0000000000600698 00000698 2**3
CONTENTS, ALLOC, LOAD, DATA
21 .got 00000008 0000000000600868 0000000000600868 00000868 2**3
CONTENTS, ALLOC, LOAD, DATA
22 .got.plt 00000028 0000000000600870 0000000000600870 00000870 2**3
CONTENTS, ALLOC, LOAD, DATA
23 .data 00000014 0000000000600898 0000000000600898 00000898 2**3
CONTENTS, ALLOC, LOAD, DATA
24 .bss 00000004 00000000006008ac 00000000006008ac 000008ac 2**0
ALLOC
25 .comment 00000039 0000000000000000 0000000000000000 000008ac 2**0
CONTENTS, READONLY
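One way to cross-check what is really stored in the file, independently of readelf and objdump, is to decode the section header table by hand. The following is a minimal sketch under stated assumptions: a 64-bit little-endian ELF image already loaded into memory, with hypothetical helper names read_le and elf64_section_offset. The field positions follow the ELF64 layout (e_shoff at byte 40, e_shentsize at byte 58, e_shnum at byte 60 of the file header; sh_offset at byte 24 of each section header). It does not by itself explain the discrepancy described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned little-endian integer of `len` bytes (len <= 8). */
uint64_t read_le(const unsigned char *p, size_t len)
{
    uint64_t v = 0;
    for (size_t i = len; i > 0; i--) {
        v = (v << 8) | p[i - 1];
    }
    return v;
}

/* Fetch sh_offset of section `index` from an in-memory ELF64 LE image.
 * Returns 0 on success, -1 if the image is malformed or index is out
 * of range. */
int elf64_section_offset(const unsigned char *img, size_t size,
                         unsigned index, uint64_t *out)
{
    if (size < 64 || img[0] != 0x7f || img[1] != 'E' ||
        img[2] != 'L' || img[3] != 'F') {
        return -1;
    }
    if (img[4] != 2 /* ELFCLASS64 */ || img[5] != 1 /* ELFDATA2LSB */) {
        return -1;
    }

    uint64_t shoff     = read_le(img + 40, 8); /* e_shoff */
    uint64_t shentsize = read_le(img + 58, 2); /* e_shentsize */
    uint64_t shnum     = read_le(img + 60, 2); /* e_shnum */

    if (index >= shnum || shentsize < 64) {
        return -1;
    }
    /* The whole header entry must lie inside the image. */
    if (shoff > size || shentsize * ((uint64_t)index + 1) > size - shoff) {
        return -1;
    }

    const unsigned char *shdr = img + shoff + shentsize * index;
    *out = read_le(shdr + 24, 8); /* sh_offset */
    return 0;
}
```

Comparing the value this returns for a given index with the "Offset" column of readelf and the "File off" column of objdump shows which tool, if either, disagrees with the bytes on disk.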

ELF Binary: why symbol value is different from actual symbol address? [duplicate]

readelf output of the object file:
Symbol table '.symtab' contains 15 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FILE LOCAL DEFAULT ABS fp16.c
2: 00000000 0 SECTION LOCAL DEFAULT 1
3: 00000000 0 SECTION LOCAL DEFAULT 3
4: 00000000 0 SECTION LOCAL DEFAULT 4
5: 00000000 0 NOTYPE LOCAL DEFAULT 1 $t
6: 00000001 194 FUNC LOCAL DEFAULT 1 __gnu_f2h_internal
7: 00000010 0 NOTYPE LOCAL DEFAULT 5 $d
8: 00000000 0 SECTION LOCAL DEFAULT 5
9: 00000000 0 SECTION LOCAL DEFAULT 7
10: 000000c5 78 FUNC GLOBAL HIDDEN 1 __gnu_h2f_internal
11: 00000115 4 FUNC GLOBAL HIDDEN 1 __gnu_f2h_ieee
12: 00000119 4 FUNC GLOBAL HIDDEN 1 __gnu_h2f_ieee
13: 0000011d 4 FUNC GLOBAL HIDDEN 1 __gnu_f2h_alternative
14: 00000121 4 FUNC GLOBAL HIDDEN 1 __gnu_h2f_alternative
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00000000 000034 000124 00 AX 0 0 4
[ 2] .rel.text REL 00000000 00058c 000010 08 9 1 4
[ 3] .data PROGBITS 00000000 000158 000000 00 WA 0 0 1
[ 4] .bss NOBITS 00000000 000158 000000 00 WA 0 0 1
[ 5] .debug_frame PROGBITS 00000000 000158 00008c 00 0 0 4
[ 6] .rel.debug_frame REL 00000000 00059c 000060 08 9 5 4
[ 7] .ARM.attributes ARM_ATTRIBUTES 00000000 0001e4 00002f 00 0 0 1
[ 8] .shstrtab STRTAB 00000000 000213 000051 00 0 0 1
[ 9] .symtab SYMTAB 00000000 00041c 0000f0 10 10 10 4
[10] .strtab STRTAB 00000000 00050c 00007e 00 0 0 1
Relocation section '.rel.text' at offset 0x58c contains 2 entries:
Offset Info Type Sym.Value Sym. Name
0000011a 00000a66 R_ARM_THM_JUMP11 000000c5 __gnu_h2f_internal
00000122 00000a66 R_ARM_THM_JUMP11 000000c5 __gnu_h2f_internal
Relocation section '.rel.debug_frame' at offset 0x59c contains 12 entries:
Offset Info Type Sym.Value Sym. Name
00000014 00000802 R_ARM_ABS32 00000000 .debug_frame
00000018 00000202 R_ARM_ABS32 00000000 .text
00000040 00000802 R_ARM_ABS32 00000000 .debug_frame
00000044 00000202 R_ARM_ABS32 00000000 .text
00000050 00000802 R_ARM_ABS32 00000000 .debug_frame
00000054 00000202 R_ARM_ABS32 00000000 .text
00000060 00000802 R_ARM_ABS32 00000000 .debug_frame
00000064 00000202 R_ARM_ABS32 00000000 .text
00000070 00000802 R_ARM_ABS32 00000000 .debug_frame
00000074 00000202 R_ARM_ABS32 00000000 .text
00000080 00000802 R_ARM_ABS32 00000000 .debug_frame
00000084 00000202 R_ARM_ABS32 00000000 .text
.text section structure as I understand it:
.text section has size of 0x124
0x0: unknown byte
0x1-0xC3: __gnu_f2h_internal
0xC3-0xC5: two unknown bytes between those functions (btw what are those?)
0xC5-0x113: __gnu_h2f_internal
0x113-0x115: two unknown bytes between those functions
0x115-0x119: __gnu_f2h_ieee
0x119-0x11D: __gnu_h2f_ieee
0x11D-0x121: __gnu_f2h_alternative
0x121-0x125: __gnu_h2f_alternative // section is only 0x124, what happened to the missing byte?
Notice that the section size is 0x124 and the last function ends at 0x125. What happened to the missing byte?
Thanks.
Technically, your "missing byte" is the one right there at 0x0.
Note that you're looking at the value of the symbol, i.e. the runtime function address (this would be a lot clearer if your .text section VMA wasn't 0). Since they're Thumb functions, the addresses have bit 0 set such that the processor will switch to Thumb mode when calling them; the actual locations of those instructions are still halfword-aligned, i.e. 0x0, 0xc4, 0x114, etc. since they couldn't be executed otherwise (you'd take a fault for a misaligned PC). Strip off bit 0 as per what the ARM ELF spec says about STT_FUNC symbols to get the actual VMA of the instruction corresponding to that symbol, then subtract the start of the section and you should have the same relative offset as within the object file itself.
<offset in section> = (<symbol value> & ~1) - <section VMA>
The extra halfword padding after some functions just ensures each symbol is word-aligned - there are probably various reasons for this, but the first one that comes to mind is that the adr instruction wouldn't work properly if they weren't.
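The formula above is small enough to write out directly. This is a minimal sketch; the function name thumb_symbol_offset is hypothetical, and it assumes 32-bit symbol values as in the readelf listing.

```c
#include <stdint.h>

/* Offset of a Thumb STT_FUNC symbol within its section:
 * clear bit 0 (the Thumb state marker), then subtract the section VMA. */
uint32_t thumb_symbol_offset(uint32_t symbol_value, uint32_t section_vma)
{
    return (symbol_value & ~UINT32_C(1)) - section_vma;
}
```

With the values from the symbol table, __gnu_h2f_alternative (value 0x121, size 4, section VMA 0) starts at offset 0x120 and ends at 0x124, which is exactly the .text section size.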

What should the value of %esp be at this point in the code?

I've been having trouble getting this code to work.
test $0x10000000, %esp
jz .ERROR
ret
If it jumps to .ERROR, the code just exits. Otherwise the output prints as normal.
When I use test $0x0000000, %esp it quits as I would expect.
These are my sections:
Sections:
Idx Name Size VMA LMA File off Algn
0 .interp 00000013 08048114 08048114 00000114 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
1 .note.ABI-tag 00000020 08048128 08048128 00000128 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .hash 00000038 08048148 08048148 00000148 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .dynsym 00000090 08048180 08048180 00000180 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .dynstr 00000064 08048210 08048210 00000210 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .gnu.version 00000012 08048274 08048274 00000274 2**1
CONTENTS, ALLOC, LOAD, READONLY, DATA
6 .gnu.version_r 00000020 08048288 08048288 00000288 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
7 .rel.dyn 00000010 080482a8 080482a8 000002a8 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
8 .rel.plt 00000030 080482b8 080482b8 000002b8 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
9 .init 00000024 080482e8 080482e8 000002e8 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
10 .plt 00000070 08048310 08048310 00000310 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
11 .text 00000188 08048380 08048380 00000380 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
12 .springboard 00000023 08048508 08048508 00000508 2**0
CONTENTS, ALLOC, LOAD, READONLY, CODE
13 .fini 00000015 0804852c 0804852c 0000052c 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
14 .rodata 00000024 08048544 08048544 00000544 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
15 .eh_frame 000000e0 08048568 08048568 00000568 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
16 .dynamic 000000c8 08049648 08049648 00000648 2**2
CONTENTS, ALLOC, LOAD, DATA
17 .got 00000004 08049710 08049710 00000710 2**2
CONTENTS, ALLOC, LOAD, DATA
18 .got.plt 00000024 08049714 08049714 00000714 2**2
CONTENTS, ALLOC, LOAD, DATA
19 .data 00000004 08049738 08049738 00000738 2**2
CONTENTS, ALLOC, LOAD, DATA
20 .bss 00000004 0804973c 0804973c 0000073c 2**2
ALLOC
21 .comment 0000002a 00000000 00000000 0000073c 2**0
CONTENTS, READONLY
Maybe I don't understand this yet, but should %esp be equal to addresses in that range?
I can move the .springboard section to 0x10000000 if I link it with my linker script. The return goes to the springboard section. So my thought was that it shouldn't work here, but if I link it with my script and the springboard section is moved, then it will work. Why is it working in both cases?
I'm guessing the test is returning a non-zero value but I don't understand why.
No, esp is the stack pointer, so it should point to some address inside the stack. Your program doesn't seem to provide any stack section, so I guess the OS allocates the stack.
Well, if you are about to return from a function, dword ptr [esp] (but not esp itself) should indeed contain an address from the sections above, since it holds the return address, i.e. the address of the next instruction to be executed after the function call.
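As for why the test comes out non-zero: test ANDs its two operands, discards the result and sets the zero flag only if that result is 0, so test $0x10000000, %esp just asks whether bit 28 of the stack pointer is set. A minimal sketch in C, where the function name test_sets_zf is hypothetical and the stack address used below (0xbffff000) is an assumed, typical-looking value for a 32-bit Linux process, not one taken from the question:

```c
#include <stdint.h>

/* Models the zero flag after `test $mask, %reg`:
 * ZF is 1 exactly when (reg & mask) == 0, which is when jz is taken. */
int test_sets_zf(uint32_t reg, uint32_t mask)
{
    return (reg & mask) == 0;
}
```

Under that assumption, test_sets_zf(0xbffff000, 0x10000000) is 0, so jz .ERROR is not taken no matter where .springboard is linked, because the check looks at esp rather than at the return address stored at [esp]. With a mask of 0 the AND is always 0, so the jump is always taken, which matches the second experiment. An address inside the sections listed above, such as 0x08048508, has bit 28 clear.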

Converting ARM to C

Given, for example, the following ARM assembly code, are there any straightforward ways to convert it directly to C, using whatever appropriate variable names?
ADD $2 $0 #9
ADD $3 $0 #3
ADD $1 $0 $0
loop: ADD $1 $1 #1
ADD $3 $0 $3, LSL #1
SUB $2 $2 $1
CMP $2 $1
BNE loop
Also, as I'm still learning ARM, how many times will the loop execute say, SUB or ADD? Are there straightforward ways to determine this?
Thanks for the help! Any other insight not particularly aimed at answering the question would also be great.
In short, BNE (Branch if Not Equal) at the bottom of a loop suggests a do{...}while loop, or possibly a while (...){...} or even a for( ...; ...; ...){...} loop; that's about as far as it can go.
As for the additions and subtractions on registers (read: variables, in the context of C), you will have to read through them and come up with a near equivalent.
A decompiler may not help you at this stage. Practice with a few pieces of C code: compile them to assembly using the -S option of the C compiler and see what you get. It is mostly trial and error, I'm afraid, if you're looking for an exact replica of the code in the question above.
unsigned int r0,r1,r2,r3;
r2=r0+9;
r3=r0+3;
r1=r0+r0;
do
{
r1=r1+1;
r3=r0+(r3<<1);
r2=r2-r1;
} while(r2!=r1);
Without knowing what r0 is going in, the loop can run a few times or many times (millions? billions?). r2 is decreasing and r1 is increasing; if they don't land on exactly equal values the first time they pass each other, r2 has to roll all the way around before they can meet. Every pass r1 gets bigger, so r2 gets smaller that much faster. It should be very easy to add a printf and some test values for r0 and see what happens.
Say, for example, r0 is 0 before entering this code. r2 is r0+9 = 9, and r1 is double r0, which is 0.
The first few passes go like this, showing the four variables r0, r1, r2, r3 at the end of each pass:
00000000 00000001 00000008 00000006
00000000 00000002 00000006 0000000C
00000000 00000003 00000003 00000018
r2 and r1 are equal after the third pass, so the loop exits.
But if r0 was 1 going in, then
00000001 00000003 00000007 00000009
00000001 00000004 00000003 00000013
00000001 00000005 FFFFFFFE 00000027
00000001 00000006 FFFFFFF8 0000004F
r2 steps past r1 without ever being equal to it and wraps around to a large unsigned value. The same happens with r0 = 3:
00000003 00000007 00000005 0000000F
00000003 00000008 FFFFFFFD 00000021
Basically it is deterministic with some rules, but if the comparison doesn't match on the first pass down, the loop may run forever, or at least for many, many cycles.
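The C translation is easy to check by running it with a trace. This is a minimal sketch, assuming uint32_t wraparound stands in for the 32-bit ARM registers; the function name run_loop, the iteration cap max_iter and the trace flag are hypothetical additions so that an input which never satisfies the exit condition cannot hang the program.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Run the translated loop for a given r0.
 * Returns the number of passes executed when r2 == r1 ends the loop,
 * or 0 if max_iter passes went by without r2 and r1 becoming equal. */
unsigned long run_loop(uint32_t r0, unsigned long max_iter, int trace)
{
    uint32_t r1 = r0 + r0; /* ADD $1 $0 $0 */
    uint32_t r2 = r0 + 9;  /* ADD $2 $0 #9 */
    uint32_t r3 = r0 + 3;  /* ADD $3 $0 #3 */
    unsigned long passes = 0;

    do {
        r1 = r1 + 1;         /* ADD $1 $1 #1        */
        r3 = r0 + (r3 << 1); /* ADD $3 $0 $3, LSL #1 */
        r2 = r2 - r1;        /* SUB $2 $2 $1        */
        passes++;
        if (trace) {
            printf("%08" PRIX32 " %08" PRIX32 " %08" PRIX32 " %08" PRIX32 "\n",
                   r0, r1, r2, r3);
        }
    } while (r2 != r1 && passes < max_iter); /* CMP $2 $1 / BNE loop */

    return (r2 == r1) ? passes : 0;
}
```

Calling run_loop(r0, 1000, 1) prints one row per pass in the r0 r1 r2 r3 layout; a return value of 0 means the cap was reached before r2 and r1 met.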
