I'm trying to figure out why the binaries generated by GCC are so large.
Consider this empty program:
int main() {
    return 0;
}
Now I build it with GCC 9.2.1 20190827 (Red Hat 9.2.1-1) and glibc 2.29 without any additional parameters:
gcc -o test test.c
The resulting binary is 21984 bytes (~22 KB). Looking at the generated file with xxd, there are long runs of null-bytes in multiple places:
00000370: 006c 6962 632e 736f 2e36 005f 5f6c 6962 .libc.so.6.__lib
00000380: 635f 7374 6172 745f 6d61 696e 0047 4c49 c_start_main.GLI
00000390: 4243 5f32 2e32 2e35 005f 5f67 6d6f 6e5f BC_2.2.5.__gmon_
000003a0: 7374 6172 745f 5f00 0000 0200 0000 0000 start__.........
000003b0: 0100 0100 0100 0000 1000 0000 0000 0000 ................
000003c0: 751a 6909 0000 0200 1d00 0000 0000 0000 u.i.............
000003d0: f03f 4000 0000 0000 0600 0000 0100 0000 .?@.............
000003e0: 0000 0000 0000 0000 f83f 4000 0000 0000 .........?@.....
000003f0: 0600 0000 0200 0000 0000 0000 0000 0000 ................
00000400: 0000 0000 0000 0000 0000 0000 0000 0000 ................
<3040 bytes of zeroes>
00000ff0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00001000: f30f 1efa 4883 ec08 488b 05e9 2f00 0048 ....H...H.../..H
<not zeroes>
00001190: f30f 1efa c300 0000 f30f 1efa 4883 ec08 ............H...
000011a0: 4883 c408 c300 0000 0000 0000 0000 0000 H...............
000011b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
<3632 bytes of zeros>
00001ff0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00002000: 0100 0200 0000 0000 0000 0000 0000 0000 ................
00002010: 011b 033b 3400 0000 0500 0000 10f0 ffff ...;4...........
<not zeroes>
000020e0: 410e 2842 0e20 420e 1842 0e10 420e 0800 A.(B. B..B..B...
000020f0: 1000 0000 ac00 0000 98f0 ffff 0500 0000 ................
00002100: 0000 0000 0000 0000 0000 0000 0000 0000 ................
<3376 bytes of zeroes>
00002e40: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00002e50: 0011 4000 0000 0000 d010 4000 0000 0000 ..@.......@.....
...
So the resulting binary has around 10 KB, or almost half, of nothing in it.
Looking with size -A, the total is more like what one would expect from a program that does nothing but return an exit code:
test :
section size addr
.interp 28 4194984
.note.ABI-tag 32 4195012
.note.gnu.build-id 36 4195044
.gnu.hash 28 4195080
.dynsym 72 4195112
.dynstr 56 4195184
.gnu.version 6 4195240
.gnu.version_r 32 4195248
.rela.dyn 48 4195280
.init 27 4198400
.text 373 4198432
.fini 13 4198808
.rodata 16 4202496
.eh_frame_hdr 52 4202512
.eh_frame 192 4202568
.init_array 8 4210256
.fini_array 8 4210264
.dynamic 400 4210272
.got 16 4210672
.got.plt 24 4210688
.data 4 4210712
.bss 4 4210716
.comment 44 0
.gnu.build.attributes 4472 4218912
Total 5991
When cross-compiling for PowerPC using GCC 9.2.0 and musl 1.1.23 it's even worse. The size of the binary grows to 67872 bytes (~67 KB), and looking with xxd, there is a continuous run of 64074 bytes of nothing but zeroes.
Still, size -A reports even smaller sizes for this version:
test :
section size addr
.interp 26 268435796
.note.gnu.build-id 36 268435824
.hash 36 268435860
.dynsym 64 268435896
.dynstr 39 268435960
.rela.plt 12 268436000
.init 28 268436012
.text 496 268436048
.fini 28 268436544
.eh_frame_hdr 28 268436572
.eh_frame 80 268436600
.init_array 4 268566284
.fini_array 4 268566288
.dynamic 216 268566292
.branch_lt 8 268566508
.got 12 268566516
.plt 4 268566528
.data 4 268566532
.bss 28 268566536
.comment 17 0
Total 1170
I also tried to compile the program with an old version of GCC which I happened to have handy: GCC 4.7.2 with uClibc 1.0.12. With this combination, the resulting binary is only 4769 bytes (~4 KB), and has no apparent runs of null-bytes in it.
Just to make sure that this doesn't only happen on tiny programs that do nothing, I looked at some real programs that I have cross-compiled with GCC 9.2.0 and musl 1.1.23. For example, the tcpdump binary, compiled with -Os and stripped, contains a 32628-byte continuous run of null bytes. So, why are zeroes trying to consume all of my disk space?
Recent binutils defaults to -z separate-code, which adds additional PT_LOAD segments to the program that need further alignment.
You can override the default like this:
gcc -Wl,-z,noseparate-code -o test test.c
Due to alignment requirements, some zeros will still remain with this change.
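If you want to see where the bytes actually go, a quick way is to count the zero padding yourself. Here is a minimal C sketch (my own throwaway helper, not part of binutils) that reports how many bytes of a file are zero and the longest run of zeros:

#include <stdio.h>

/* Count total zero bytes and the longest run of zeros in a file. */
int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "rb");
    if (!f) {
        perror("fopen");
        return 1;
    }
    unsigned long total = 0, zeros = 0, run = 0, longest = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        total++;
        if (c == 0) {
            zeros++;
            if (++run > longest)
                longest = run;
        } else {
            run = 0;
        }
    }
    fclose(f);
    printf("%lu of %lu bytes are zero; longest run is %lu bytes\n",
           zeros, total, longest);
    return 0;
}

Running it on a binary built with and without -Wl,-z,noseparate-code should make the effect of the extra segment alignment visible.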
Florian Weimer's answer pointed me in the right direction. The culprit was not -z separate-code, but -z relro.
By adding -Wl,-z,norelro to the PowerPC GCC options, the file size of an empty program dropped from 67872 bytes to 3772 bytes! On x64 the impact was smaller: from 21984 to 18584 bytes. A small but actually functional program came out around 50 % smaller on PowerPC, and for tcpdump, which I compared before, the saving is almost 32 KB.
The relro linker option apparently creates an extra segment (visible with readelf -l as GNU_RELRO) so that the global offset table and other relocation data can be remapped read-only after startup, which hardens the program against GOT-overwrite attacks. This explanation may still be imprecise; I didn't understand much of what I read while trying to figure it out.
The size difference on PPC is exactly 62 KB. Why such a large area is created, I have no idea.
Although it would be good to keep the setting enabled as a hardening measure, my target board unfortunately has only 11 MB of available flash and I'm trying to fit a Linux-based system onto it. Every byte counts, so I will disable the setting to keep binary sizes down.
So, why are zeroes trying to consume all of my disk space?
Because on most modern systems 22K extra bytes on disk are immaterial.
Some of the costs you observe are due to dynamic linking, some are due to padding, and some are there to help you with debugging (e.g. .comment, .note.gnu.build-id, .eh_frame*).
I can get the binary down to 624 bytes by not using libc, linking statically, and stripping:
cat t.c
void _start()
{
__asm__("movq $60,%rax; xorq %rdi,%rdi; syscall");
}
gcc -O3 t.c -static -nostdlib -Wl,-z,noseparate-code,--build-id=none &&
strip --strip-all a.out &&
./a.out && ls -l a.out
-rwxr-x--- 1 me mygroup 624 Nov 25 19:34 a.out
There are still .comment and .eh_frame sections which could be removed.
Related
I want to use the Note section of an ELF file to propagate some information within my tool set. A similar method is used by Microsoft tools and by NASM in COFF modules, where payload data in the section .drectve contains linker parameters, for instance the text /IMPORT:ExitProcess.
First I used NASM on Linux to create a 32-bit ELF module with a Note section .drectve:
me@vm:~/$ cat NOTE32.asm
BITS 32
SECTION .drectve
DD 7 ; namesz = size of "ABCDEFG".
DD 8 ; descsz = size of payload.
DD 9 ; type = randomly chosen value.
DB "ABCDEFG",0 ; name = owner; aligned size=8.
DB "payload." ; desc = useful contents; aligned size=8.
me@vm:~/$ nasm -f ELF32 NOTE32.asm -o NOTE32.o
As NASM cannot create NOTE-type sections directly, I had to edit the output file with a hex editor
and rewrite the section type from SHT_PROGBITS (1) to SHT_NOTE (7).
readelf then displayed my handcrafted note correctly, although it couldn't interpret my arbitrarily chosen Owner and Type, of course:
me@vm:~/$ readelf -hSn NOTE32.o
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 64 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 40 (bytes)
Number of section headers: 5
Section header string table index: 2
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .drectve NOTE 00000000 000110 00001c 00 A 0 0 1
[ 2] .shstrtab STRTAB 00000000 000130 000024 00 0 0 1
[ 3] .symtab SYMTAB 00000000 000160 000030 10 4 3 4
[ 4] .strtab STRTAB 00000000 000190 00000c 00 0 0 1
Displaying notes found at file offset 0x00000110 with length 0x0000001c:
Owner Data size Description
ABCDEFG 0x00000008 Unknown note type: (0x00000009)
me@vm:~/$
So far, so good. Then I repeated the process with a 64-bit ELF module, where the fields in the NOTE section are 8 bytes wide, according to chapter 9 of the ELF-64 Object File Format, page 13:
Sections of type SHT_NOTE and segments of type PT_NOTE are used by
compilers and other tools to mark an object file with special
information that has special meaning to a particular tool set. These
sections and segments contain any number of note entries, each of
which is an array of 8-byte words in the byte order defined in the ELF
file header. The format of a note entry is shown in Figure 7.
me@vm:~/$ cat NOTE64.asm
BITS 64
SECTION .drectve
DQ 7 ; namesz = size of "ABCDEFG".
DQ 8 ; descsz = size of payload.
DQ 9 ; type = randomly chosen value.
DB "ABCDEFG",0 ; name = owner; aligned size=8.
DB "payload." ; desc = useful contents; aligned size=8.
me@vm:~/$ nasm -f ELF64 NOTE64.asm -o NOTE64.o
Here is the file dump with SHT_PROGBITS at offset 84h manually rewritten to SHT_NOTE:
me@vm:~/$ xxd NOTE64.o
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
00000010: 0100 3e00 0100 0000 0000 0000 0000 0000 ..>.............
00000020: 0000 0000 0000 0000 4000 0000 0000 0000 ........@.......
00000030: 0000 0000 4000 0000 0000 4000 0500 0200 ....@.....@.....
00000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000070: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000080: 0100 0000 0700 0000 0200 0000 0000 0000 ................
00000090: 0000 0000 0000 0000 8001 0000 0000 0000 ................
000000a0: 2800 0000 0000 0000 0000 0000 0000 0000 (...............
000000b0: 0800 0000 0000 0000 0800 0000 0000 0000 ................
000000c0: 0a00 0000 0300 0000 0000 0000 0000 0000 ................
000000d0: 0000 0000 0000 0000 b001 0000 0000 0000 ................
000000e0: 2400 0000 0000 0000 0000 0000 0000 0000 $...............
000000f0: 0100 0000 0000 0000 0000 0000 0000 0000 ................
00000100: 1400 0000 0200 0000 0000 0000 0000 0000 ................
00000110: 0000 0000 0000 0000 e001 0000 0000 0000 ................
00000120: 4800 0000 0000 0000 0400 0000 0300 0000 H...............
00000130: 0400 0000 0000 0000 1800 0000 0000 0000 ................
00000140: 1c00 0000 0300 0000 0000 0000 0000 0000 ................
00000150: 0000 0000 0000 0000 3002 0000 0000 0000 ........0.......
00000160: 0c00 0000 0000 0000 0000 0000 0000 0000 ................
00000170: 0100 0000 0000 0000 0000 0000 0000 0000 ................
00000180: 0700 0000 0000 0000 0800 0000 0000 0000 ................
00000190: 0900 0000 0000 0000 4142 4344 4546 4700 ........ABCDEFG.
000001a0: 7061 796c 6f61 642e 0000 0000 0000 0000 payload.........
000001b0: 002e 6472 6563 7476 6500 2e73 6873 7472 ..drectve..shstr
000001c0: 7461 6200 2e73 796d 7461 6200 2e73 7472 tab..symtab..str
000001d0: 7461 6200 0000 0000 0000 0000 0000 0000 tab.............
000001e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001f0: 0000 0000 0000 0000 0100 0000 0400 f1ff ................
00000200: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000210: 0000 0000 0300 0100 0000 0000 0000 0000 ................
00000220: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000230: 004e 4f54 4536 342e 6173 6d00 0000 0000 .NOTE64.asm.....
me@vm:~/$
GNU readelf (GNU Binutils for Ubuntu) 2.26.1 interprets the 64-bit NOTE section .drectve incorrectly:
me@vm:~/$ readelf -hSnW NOTE64.o
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 64 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 5
Section header string table index: 2
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .drectve NOTE 0000000000000000 000180 000028 08 A 0 0 8
[ 2] .shstrtab STRTAB 0000000000000000 0001b0 000024 00 0 0 1
[ 3] .symtab SYMTAB 0000000000000000 0001e0 000048 18 4 3 4
[ 4] .strtab STRTAB 0000000000000000 000230 00000c 00 0 0 1
Displaying notes found at file offset 0x00000180 with length 0x00000028:
Owner Data size Description
0x00000000 Unknown note type: (0x00000008)
readelf: Warning: note with invalid namesz and/or descsz found at offset 0x14
readelf: Warning: type: 0x474645, namesize: 0x00000000, descsize: 0x44434241
me@vm:~/$
Apparently readelf misinterprets the section contents at file offset 180h as two note entries made of 32-bit DWORDs, the second entry starting at file offset 194h.
I have also tried to change OS/ABI value from UNIX - System V (0) to UNIX - GNU (3) in ELF_header.e_ident.EI_OSABI but with no effect.
Now I am in a dilemma over whether I should
generate DWORD fields in both ELF32 and ELF64 formats produced by my tool (which contradicts the ELF-64 specification), or
keep QWORD fields in ELF64 and face readelf's complaints.
From /usr/include/elf.h:
typedef uint32_t Elf64_Word;
...
typedef struct
{
Elf64_Word n_namesz; /* Length of the note's name. */
Elf64_Word n_descsz; /* Length of the note's descriptor. */
Elf64_Word n_type; /* Type of the note. */
} Elf64_Nhdr;
Clearly the size of the 64-bit n_namesz etc. is 4 bytes, not 8.
The source you cite is:
not authoritative and
wrong
A more authoritative source states:
"For 64–bit objects and 32–bit objects, each entry is an array of 4-byte words in the format of the target processor."
I have spent the past few days experimenting with assembly, and now understand the relationship between assembly and machine code (using x86 via NASM on OSX, reading the Intel docs).
Now I am trying to understand the details of how the linker works, and specifically want to understand the structure of Mach-O object files, starting with the Mach-O headers.
My question is: can you map out how the Mach-O header below maps to the otool command output (which displays the same header, but in a different format)?
Some reasons for this question include:
It will help me see how the documented "structure of Mach-O headers" looks in real-world object files.
It will simplify the path to understanding, so that I and other newcomers don't have to spend hours or days wondering "do they mean this, or that?" Without previous experience, it's hard to mentally translate the general Mach-O documentation into an actual object file in the real world.
Below I show the example and process I went through to try to decode the Mach-O header from a real object file. Throughout the descriptions below, I try to show hints of all the little/subtle questions that arise. Hopefully this will provide a sense of how this can be very confusing to a newcomer.
Example
Starting with a basic C file called example.c:
#include <stdio.h>
int
main() {
printf("hello world");
return 0;
}
Compile it with gcc example.c -o example.out; a hex dump of the resulting binary gives:
cffa edfe 0700 0001 0300 0080 0200 0000
1000 0000 1005 0000 8500 2000 0000 0000
1900 0000 4800 0000 5f5f 5041 4745 5a45
524f 0000 0000 0000 0000 0000 0000 0000
0000 0000 0100 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 1900 0000 2802 0000
5f5f 5445 5854 0000 0000 0000 0000 0000
0000 0000 0100 0000 0010 0000 0000 0000
0000 0000 0000 0000 0010 0000 0000 0000
0700 0000 0500 0000 0600 0000 0000 0000
5f5f 7465 7874 0000 0000 0000 0000 0000
5f5f 5445 5854 0000 0000 0000 0000 0000
400f 0000 0100 0000 2d00 0000 0000 0000
400f 0000 0400 0000 0000 0000 0000 0000
0004 0080 0000 0000 0000 0000 0000 0000
5f5f 7374 7562 7300 0000 0000 0000 0000
5f5f 5445 5854 0000 0000 0000 0000 0000
6e0f 0000 0100 0000 0600 0000 0000 0000
6e0f 0000 0100 0000 0000 0000 0000 0000
0804 0080 0000 0000 0600 0000 0000 0000
5f5f 7374 7562 5f68 656c 7065 7200 0000
... 531 total lines of this
Run otool -h example.out, which prints:
example.out:
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
0xfeedfacf 16777223 3 0x80 2 16 1296 0x00200085
Research
To understand the Mach-O file format, I found these resources helpful:
https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/index.html#//apple_ref/doc/uid/TP40000895
https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/index.html
https://www.mikeash.com/pyblog/friday-qa-2012-11-30-lets-build-a-mach-o-executable.html
http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h
http://www.opensource.apple.com/source/dtrace/dtrace-78/head/arch.h
http://www.opensource.apple.com/source/xnu/xnu-792.13.8/osfmk/mach/machine.h
Those last 3 from opensource.apple.com contain all the constants, such as these:
#define MH_MAGIC_64 0xfeedfacf /* the 64-bit mach magic number */
#define MH_CIGAM_64 0xcffaedfe /* NXSwapInt(MH_MAGIC_64) */
...
#define CPU_TYPE_MC680x0 ((cpu_type_t) 6)
#define CPU_TYPE_X86 ((cpu_type_t) 7)
#define CPU_TYPE_I386 CPU_TYPE_X86 /* compatibility */
#define CPU_TYPE_X86_64 (CPU_TYPE_X86 | CPU_ARCH_ABI64)
The structure of the Mach-O header is shown as:
struct mach_header_64 {
uint32_t magic; /* mach magic number identifier */
cpu_type_t cputype; /* cpu specifier */
cpu_subtype_t cpusubtype; /* machine specifier */
uint32_t filetype; /* type of file */
uint32_t ncmds; /* number of load commands */
uint32_t sizeofcmds; /* the size of all the load commands */
uint32_t flags; /* flags */
uint32_t reserved; /* reserved */
};
Given this information, the goal was to find each of those pieces of the Mach-O header in the example.out object file.
First: Finding the "magic" number
Given that example and research, I was able to identify the first part of the Mach-O header, the "magic number". That was cool.
But it wasn't a straightforward process. Here are the pieces of information that had to be collected to figure that out.
The first column of the otool output shows "magic" to be 0xfeedfacf.
The Apple Mach-O docs say that the header should start with either MH_MAGIC or MH_CIGAM ("magic" in reverse). So I found those through Google in mach-o/loader.h. Since I am using a 64-bit architecture and not 32-bit, I went with MH_MAGIC_64 (0xfeedfacf) and MH_CIGAM_64 (0xcffaedfe).
I looked through the example.out dump and the first 8 hex digits were cffa edfe, which matches MH_CIGAM_64! It's written in a different format, which throws you off a little, but the two spellings are close enough to see the connection. They are also byte-reversed.
Here are the 3 numbers, which were enough to sort of figure out what the magic number is:
0xcffaedfe // value from MH_CIGAM_64
0xfeedfacf // value from otool
cffa edfe // value in example.out
So that's exciting! Still not totally sure if I am coming to the right conclusion about these numbers, but hope so.
Next: Finding the cputype
Now it starts to get confusing. Here are the pieces that needed to be put together to almost make sense of it, but this is where I'm stuck so far:
otool shows 16777223. This apple stackexchange question gave some hints on how to understand this.
I found CPU_TYPE_X86_64 in mach/machine.h, and had to do a small calculation to figure out its value.
Here are the relevant constants needed to calculate the value of CPU_TYPE_X86_64:
#define CPU_ARCH_ABI64 0x01000000 /* 64 bit ABI */
#define CPU_TYPE_X86 ((cpu_type_t) 7)
#define CPU_TYPE_I386 CPU_TYPE_X86 /* compatibility */
#define CPU_TYPE_X86_64 (CPU_TYPE_X86 | CPU_ARCH_ABI64)
So basically:
CPU_TYPE_X86_64 = 7 BITWISEOR 0x01000000 // 16777223
That number 16777223 matches what is shown by otool, nice!
Next, I tried to find that number in example.out, but it doesn't appear there because it is a decimal number. I converted it to hex in JavaScript:
> (16777223).toString(16)
'1000007'
So I'm not sure if this is the correct way to generate a hex number, especially one that will match the hex numbers in a Mach-O object file. 1000007 is only 7 digits too, so I don't know if you are supposed to "pad" it or something.
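For what it's worth, the same calculation can be checked in C, and printing with %08x zero-pads the value to the full 8 hex digits (just a throwaway snippet, not taken from any Apple header):

#include <stdio.h>

int main(void)
{
    unsigned cputype = 7 | 0x01000000;          /* CPU_TYPE_X86 | CPU_ARCH_ABI64 */
    printf("%u = 0x%08x\n", cputype, cputype);  /* prints: 16777223 = 0x01000007 */
    return 0;
}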
Anyway, you can see this number in example.out, right after the magic number:
0700 0001
Hmm, they seem somewhat related:
0700 0001
1000007
It looks like there was a 0 added to the end of 1000007, and that it was reversed.
Question
At this point, having already spent a few hours getting here, I wanted to ask: how does the structure of the Mach-O header map to the actual Mach-O object file? Can you show how each part of the header shows up in the example.out file above, with a brief explanation of why?
Part of what's confusing you is endianness. In this case, the header is stored in the native format for the platform. Intel-compatible platforms are little-endian systems, meaning the least-significant byte of a multi-byte value is first in the byte sequence.
So, the byte sequence 07 00 00 01, when interpreted as a little-endian 32-bit value, corresponds to 0x01000007.
The other thing you need to know to interpret the structure is the size of each field. All of the uint32_t fields are pretty straightforward. They are 32-bit unsigned integers.
Both cpu_type_t and cpu_subtype_t are defined in the machine.h you linked to be equivalent to integer_t, and integer_t is in turn defined to be equivalent to int in /usr/include/mach/i386/vm_types.h. OS X is an LP64 platform, which means that longs and pointers are sensitive to the architecture (32- vs. 64-bit), but int is not; it's always 32-bit.
So, all of the fields are 32 bits or 4 bytes in size. Since there are 8 fields, that's a total of 32 bytes.
From your original hexdump, here's the part which corresponds to the header:
cffa edfe 0700 0001 0300 0080 0200 0000
1000 0000 1005 0000 8500 2000 0000 0000
Broken out by field:
struct mach_header_64 {
uint32_t magic; cf fa ed fe -> 0xfeedfacf
cpu_type_t cputype; 07 00 00 01 -> 0x01000007
cpu_subtype_t cpusubtype; 03 00 00 80 -> 0x80000003
uint32_t filetype; 02 00 00 00 -> 0x00000002
uint32_t ncmds; 10 00 00 00 -> 0x00000010
uint32_t sizeofcmds; 10 05 00 00 -> 0x00000510
uint32_t flags; 85 00 20 00 -> 0x00200085
uint32_t reserved; 00 00 00 00 -> 0x00000000
};
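If you want to see this mapping programmatically, here is a minimal hedged sketch that reads the first 32 bytes of a Mach-O file as eight 32-bit words and prints them. It assumes a little-endian host reading a native-endian (MH_MAGIC_64) thin binary, so no byte swapping is done:

#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s MACH-O-FILE\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }

    uint32_t h[8];                 /* the eight 32-bit header fields */
    if (fread(h, sizeof h[0], 8, f) != 8) {
        fprintf(stderr, "short read\n");
        fclose(f);
        return 1;
    }
    fclose(f);

    const char *names[8] = { "magic", "cputype", "cpusubtype", "filetype",
                             "ncmds", "sizeofcmds", "flags", "reserved" };
    for (int i = 0; i < 8; i++)
        printf("%-10s 0x%08x (%u)\n", names[i], h[i], h[i]);
    return 0;
}

Run against example.out, it should print the same values that otool -h shows (magic 0xfeedfacf, cputype 0x01000007, and so on).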
MAGIC or CIGAM gives you a hint about the byte ordering used in the file. When you read the first four bytes as cffaedfe, it means you should interpret every 4-byte field as little-endian, i.e. the least significant byte comes first. So when you read 07000001, it represents the number 0x01000007, which is exactly what you were expecting (1000007) apart from the leading 0. May I suggest you read about byte ordering?
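If you want to do that interpretation yourself, portably, you can assemble the value from individual bytes instead of relying on the host's byte order; this tiny snippet is just an illustration:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* The four bytes as they appear in the file, in file order. */
    uint8_t b[4] = { 0x07, 0x00, 0x00, 0x01 };

    /* Little-endian: the first byte is the least significant. */
    uint32_t v = (uint32_t)b[0]
               | (uint32_t)b[1] << 8
               | (uint32_t)b[2] << 16
               | (uint32_t)b[3] << 24;

    printf("0x%08x\n", v);   /* prints 0x01000007 */
    return 0;
}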
After asking about the relation between assembly and machine code, I am beginning to read through the Intel 64 instruction set reference.
There is still a lot to learn here, but after looking through the first two chapters (need to study chapter 2 much more), I don't feel any closer to understanding what the machine code means yet. Maybe after reading all 1300+ pages, and the Art of Assembly, and perhaps a CS architecture course, how this applies in practice will start to make sense.
But in the meantime, can you explain why the numbers in a compiled assembly file (or any "binary", I guess, which in my understanding is just machine code) are organized into a grid of 8 columns with 4 hexadecimal digits each? This may be obvious to you, but I have no idea whether it means anything or not.
cffa edfe 0700 0001 0300 0000 0100 0000
0200 0000 0001 0000 0000 0000 0000 0000
1900 0000 e800 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
2e00 0000 0000 0000 2001 0000 0000 0000
2e00 0000 0000 0000 0700 0000 0700 0000
0200 0000 0000 0000 5f5f 7465 7874 0000
0000 0000 0000 0000 5f5f 5445 5854 0000
0000 0000 0000 0000 0000 0000 0000 0000
2000 0000 0000 0000 2001 0000 0000 0000
5001 0000 0100 0000 0005 0080 0000 0000
0000 0000 0000 0000 5f5f 6461 7461 0000
0000 0000 0000 0000 5f5f 4441 5441 0000
0000 0000 0000 0000 2000 0000 0000 0000
0e00 0000 0000 0000 4001 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0200 0000 1800 0000
5801 0000 0400 0000 9801 0000 1c00 0000
e800 0000 00b8 0400 0002 bf01 0000 0048
be00 0000 0000 0000 00ba 0e00 0000 0f05
4865 6c6c 6f2c 2077 6f72 6c64 210a 0000
1100 0000 0100 000e 0700 0000 0e01 0000
0500 0000 0000 0000 0d00 0000 0e02 0000
2000 0000 0000 0000 1500 0000 0200 0000
0e00 0000 0000 0000 0100 0000 0f01 0000
0000 0000 0000 0000 0073 7461 7274 0077
7269 7465 006d 6573 7361 6765 006c 656e
6774 6800
More specifically...
As pointed out in the selected answer in the other question about the relation between assembly and machine code, all the information is at least somewhere in the Intel docs. For example, at the beginning of Chapter 2, they say these things:
LOCK prefix is encoded using F0H.
REPNE/REPNZ prefix is encoded using F2H...
The LOCK prefix (F0H) forces an operation that ensures exclusive use of shared memory in a multiprocessor environment...
Repeat prefixes (F2H, F3H) cause an instruction to be repeated for each element of a string...
I understand that by F0H they really just mean "f0, which is a hexadecimal number, in case that isn't clear". You can then find that digit pair a couple of times in the machine code above. For example, near the bottom, in the 6th column, there is bf01.
Without knowing much more than this, I am trying to put together the very specific (but not very practical) intel docs with some actual machine code, so I can start to really "get" how the intel docs are actually applied.
As a first step in that process of understanding, I am wondering this:
Is the f0 in that bf01 the same thing that the intel docs are describing? That is, is it the LOCK prefix F0H? Or if not, how do you know that?
Why are the numbers in a grid of 8 columns of 4 numbers each?
If the f0 in that bf01 chunk does mean the LOCK prefix, why does it start at an odd position (that is, not at an even position like 0 or 2 within a column)? This is the main reason for this whole question. If it can appear at an odd position, then is breaking the bytes into 8 columns of 4 digits each just arbitrary (i.e. it just makes it look pretty)? Because if all opcodes were at least 2 characters, it would never appear at an odd position.
Why are the numbers in a grid of 8 columns of 4 numbers each?
This is just how you, or the tool you're using, chooses to display them. I personally would display individual bytes rather than two-byte words. I would choose the number of columns depending on how I am going to display or print out the hex dump.
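To underline that the grid is purely a display choice, here is a small hedged sketch of a dumper where the number of bytes per line is just a constant you can change (this is not how xxd or hexl-mode are implemented, just an illustration):

#include <stdio.h>

#define BYTES_PER_LINE 16   /* change this and the "grid" changes with it */

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }

    unsigned char buf[BYTES_PER_LINE];
    size_t n, offset = 0;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0) {
        printf("%08zx:", offset);              /* file offset of this line */
        for (size_t i = 0; i < n; i++)
            printf(" %02x", buf[i]);            /* one byte per column */
        printf("\n");
        offset += n;
    }
    fclose(f);
    return 0;
}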
The best way to study hex dumps of machine code is to use a disassembler. There is an online one here. For example, it disassembles the following hex dump
55 31 D2 89 E5 8B 45 08 56 8B 75 0C 53 8D 58 FF
0F B6 0C 16 88 4C 13 01 83 C2 01 84 C9 75 F1 5B
5E 5D C3
to
.data:0x00000000 55 push ebp
.data:0x00000001 31d2 xor edx,edx
.data:0x00000003 89e5 mov ebp,esp
.data:0x00000005 8b4508 mov eax,DWORD PTR [ebp+0x8]
.data:0x00000008 56 push esi
.data:0x00000009 8b750c mov esi,DWORD PTR [ebp+0xc]
.data:0x0000000c 53 push ebx
.data:0x0000000d 8d58ff lea ebx,[eax-0x1]
.data:0x00000010
.data:0x00000010 loc_00000010:
┏▶ .data:0x00000010 0fb60c16 movzx ecx,BYTE PTR [esi+edx*1]
┃ .data:0x00000014 884c1301 mov BYTE PTR [ebx+edx*1+0x1],cl
┃ .data:0x00000018 83c201 add edx,0x1
┃ .data:0x0000001b 84c9 test cl,cl
┗ .data:0x0000001d 75f1 jne loc_00000010
.data:0x0000001f 5b pop ebx
.data:0x00000020 5e pop esi
.data:0x00000021 5d pop ebp
why the numbers in a compiled assembly file ... is [sic] organized into a grid of 8 columns with 4 hexidecimal [sic] numbers each?
An arbitrary, convenient arrangement. Generically, binary files have no structure other than their ordering (like a queue or stream).
Is the f0 in that bf01 the same thing that the intel docs are describing? That is, is it the LOCK prefix F0H? Or if not, how do you know that?
No. F0 is one byte. bf01 is two bytes--bf and 01.
Why are the numbers in a grid of 8 columns of 4 numbers each?
See above.
More importantly, compiled programs contain more information than simply binary machine code. They also contain loading information, static data, sometimes a table of symbols, external linkage requirements, etc. So an arbitrary byte picked from an executable file may not be machine code at all.
In my spare time, I have been working on implementing a BitTorrent client in C. Currently it communicates with the tracker, connects to the swarm, requests pieces of the torrent file from peers, and receives pieces of the torrent file. However, when it comes to verifying that the received piece is correct (by taking a SHA1 hash and comparing it to the hash provided in the .torrent metadata), it always fails.
To debug this, I downloaded a torrent with a known-working BitTorrent client, and then modified my own BitTorrent implementation to request and download only the very beginning of the torrent (the first piece). I then compared the two files with Emacs' hexl-mode.
Known good:
00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000070: 0000 0000 0000 0000 0000 0000 0000 0000 ................
...
00008000: 0143 4430 3031 0100 4c49 4e55 5820 2020 .CD001..LINUX
00008010: 2020 2020 2020 2020 2020 2020 2020 2020
00008020: 2020 2020 2020 2020 5562 756e 7475 2031 Ubuntu 1
00008030: 312e 3034 2069 3338 3620 2020 2020 2020 1.04 i386
My implementation:
00000000: a616 f132 7f00 0080 5066 0000 0000 0080 ...2....Pf......
00000010: 5066 0000 0000 0060 3b62 0000 0000 0098 Pf.....`;b......
00000020: 3b62 0000 0000 00d0 3b62 0000 0000 0008 ;b......;b......
00000030: 3c62 0000 0000 0040 3c62 0000 0000 0078 <b.....@<b.....x
00000040: 3c62 0000 0000 00b0 3c62 0000 0000 00e8 <b......<b......
00000050: 3c62 0000 0000 0020 3d62 0000 0000 0058 <b..... =b.....X
00000060: 3d62 0000 0000 0090 3d62 0000 0000 00c8 =b......=b......
00000070: 3d62 0000 0000 0000 3e62 0000 0000 0038 =b......>b.....8
...
0000d000: 0243 4430 3031 0100 004c 0049 004e 0055 .CD001...L.I.N.U
0000d010: 0058 0020 0020 0020 0020 0020 0020 0020 .X. . . . . . .
0000d020: 0020 0020 0020 0020 0055 0062 0075 006e . . . . .U.b.u.n
0000d030: 0074 0075 0020 0031 0031 002e 0030 0034 .t.u. .1.1...0.4
0000d040: 0020 0069 0033 0038 0000 0000 0000 0000 . .i.3.8........
I figured, then, that I must be writing the received piece to the incorrect offset, resulting in the correct data occurring at the wrong location in the file. To verify this, I fired up gdb and inspected the very beginning of the first piece after receiving it from a peer, expecting it to contain all zeroes, like the beginning of the known-good file.
(gdb) break network.c:40
Breakpoint 1 at 0x402fe7: file network.c, line 40.
(gdb) run
Starting program: /home/robb/slug/slug
[Thread debugging using libthread_db enabled]
[New Thread 0x7fffcb58d700 (LWP 12936)]
[Thread 0x7fffcb58d700 (LWP 12936) exited]
ANNOUNCE: 50 peers.
CONNECTED: 62.245.41.28
CONNECTED: 89.178.142.45
CONNECTED: 66.65.166.17
...
UNCHOKE: 95.26.0.1
Requested piece 0 from peer 95.26.0.1.
UNCHOKE: 202.231.116.163
PIECE: #0 from 95.26.0.1
Breakpoint 1, handle_piece (p=0x42d7e0) at network.c:41
41 memcpy(p->torrent->mmap + length, &p->message[9], REQUEST_LENGTH);
(gdb) p off
$1 = 0
(gdb) p index
$2 = 0
(gdb) p p->message[9]
$3 = 46 '.'
(gdb) p p->message[10]
$4 = 67 'C'
(gdb) p p->message[11]
$5 = 0 '\000'
(gdb) p p->message[12]
$6 = 0 '\000'
(gdb) p p->message[13]
$7 = 0 '\000'
(gdb) p p->message[14]
$8 = 0 '\000'
(gdb) p p->message[15]
$9 = 0 '\000'
(gdb) p p->message[16]
$10 = 128 '\200'
(gdb) p p->message[17]
$11 = 46 '.'
(gdb) p p->message[18]
$12 = 67 'C'
As you can see, the data I received from the peer doesn't contain all zeroes like the beginning of the known-good file. Why?
The full source of my program is available at https://github.com/robertseaton/slug.
This fails to take into account that bufferevent_read may fail and return a negative amount:
void get_msg (struct bufferevent* bufev, struct Peer* p)
{
    uint64_t amount_read = p->message_length - p->amount_pending;
    int64_t message_length = bufferevent_read(bufev, &p->message[amount_read], p->amount_pending);
Replace with:
void get_msg (struct bufferevent* bufev, struct Peer* p)
{
    uint64_t amount_read = p->message_length - p->amount_pending;
    int64_t message_length = bufferevent_read(bufev, &p->message[amount_read], p->amount_pending);

    /* possible bufferevent_read found nothing */
    if (message_length < 0)
        message_length = 0;
Reading the source I found this in network.c:
memcpy(&index, &p->message[1], sizeof(index));
memcpy(&off, &p->message[5], sizeof(off));
index = ntohl(index);
off = ntohl(off);
length = index * p->torrent->piece_length + off;
#ifdef DEBUG
if (off == 0)
printf("PIECE: #%d from %s\n", index, inet_ntoa(p->addr.sin_addr));
#endif
memcpy(p->torrent->mmap + length, &p->message[9], REQUEST_LENGTH);
p->torrent->pieces[index].amount_downloaded += REQUEST_LENGTH;
I think the last two lines are intended to be:
memcpy(p->torrent->mmap + length, &p->message[9], length);
p->torrent->pieces[index].amount_downloaded += length;
BTW REQUEST_LENGTH = 16K.
More probably this "length-thing" should be p->message_length, or (p->message_length - 9)
The other bug is probably a strlen()+1 type of bug.
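Putting those observations together, the copy logic probably wants to look something like the sketch below. It reuses the field names visible in the quoted network.c (message, message_length, mmap, piece_length), with hypothetical stand-in struct definitions so it compiles on its own; treat it as an illustration of the intended fix, not a drop-in patch:

#include <arpa/inet.h>   /* ntohl */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the structs in slug's source; the real ones
 * have more fields, but these are the ones the copy logic needs. */
struct Piece   { size_t amount_downloaded; };
struct Torrent { unsigned char *mmap; size_t piece_length; struct Piece *pieces; };
struct Peer    { unsigned char *message; size_t message_length; struct Torrent *torrent; };

/* A "piece" message is <id=7><index><begin><block>; the block starts at
 * byte 9 and is (message_length - 9) bytes long, not REQUEST_LENGTH. */
void handle_piece_sketch(struct Peer *p)
{
    uint32_t index, off;
    memcpy(&index, &p->message[1], sizeof index);
    memcpy(&off,   &p->message[5], sizeof off);
    index = ntohl(index);
    off   = ntohl(off);

    size_t block_len = p->message_length - 9;
    size_t file_off  = (size_t)index * p->torrent->piece_length + off;

    memcpy(p->torrent->mmap + file_off, &p->message[9], block_len);
    p->torrent->pieces[index].amount_downloaded += block_len;
}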
gcc 4.6.0
What does binary data look like? Is it all 1s and 0s?
I was just wondering, as I was talking to another programmer about copying strings and binary data.
Normally, I use the strcpy/strncpy functions to copy strings and memcpy/memmove to copy binary data. However, I am just wondering what it looks like.
Many thanks for any suggestions,
It depends on what you're using to view it. Here it's shown in hexadecimal and ASCII:
jcomeau@intrepid:~$ xxd /bin/bash | head -n 10
0000000: 7f45 4c46 0101 0100 0000 0000 0000 0000 .ELF............
0000010: 0200 0300 0100 0000 5021 0608 3400 0000 ........P!..4...
0000020: 345c 0c00 0000 0000 3400 2000 0800 2800 4\......4. ...(.
0000030: 1c00 1b00 0600 0000 3400 0000 3480 0408 ........4...4...
0000040: 3480 0408 0001 0000 0001 0000 0500 0000 4...............
0000050: 0400 0000 0300 0000 3401 0000 3481 0408 ........4...4...
0000060: 3481 0408 1300 0000 1300 0000 0400 0000 4...............
0000070: 0100 0000 0100 0000 0000 0000 0080 0408 ................
0000080: 0080 0408 c013 0c00 c013 0c00 0500 0000 ................
0000090: 0010 0000 0100 0000 c013 0c00 c0a3 1008 ................
Here's another way to view it:
jcomeau@intrepid:~$ convert -size 640x$(($(stat -c %s /bin/bash)/640)) \
-depth 8 gray:/bin/bash /tmp/bash.png
jcomeau@intrepid:~$ firefox /tmp/bash.png
Binary data is just a way of saying that it's data which is not text. In other words, it doesn't actually give you a lot of insight as to what the data is, rather it gives you insight as to what the data isn't.
The odd bit is that, in the most technical sense of the words, text is also binary data.
In this context, "binary data" is typically just data which could contain null bytes (e.g., '\0'). String manipulation functions like strcpy() and strncpy() will stop when they see these characters, whereas byte manipulation functions like memcpy() and memmove() will always continue for the number of bytes you tell them to.
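A quick illustration of that difference, with nothing project-specific in it:

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* 8 bytes of "binary" data with embedded null bytes. */
    const char src[8] = { 'a', 'b', '\0', 'c', 'd', '\0', 'e', 'f' };
    char s[8] = {0}, m[8] = {0};

    strcpy(s, src);              /* stops at the first '\0': copies "ab" only */
    memcpy(m, src, sizeof src);  /* copies all 8 bytes, nulls included */

    printf("strcpy copied %zu bytes before the first null\n", strlen(s));
    for (int i = 0; i < 8; i++)
        printf("m[%d] = 0x%02x\n", i, (unsigned char)m[i]);
    return 0;
}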
Binary data is just that: data encoded in binary form. To better view what the contents of a binary file look like, you would need a hex editor such as Hiew for Windows or hexedit for Linux.
It's all ones and zeros, but the ones and zeros live differently in different parts of your computer: the CPU represents them differently from DRAM, and both differ from how they are encoded on the hard drive.