What do I get after I compile a C file? - c

I used gcc to compile hello.c:
dele-MBP:temp ldl$ ls
a.out hello.c
Now, when I cat a.out:
$ cat a.out
??????? H__PAGEZERO?__TEXT__text__TEXTP1P?__stubs__TEXT??__stub_helper__TEXT???__cstring__TEXT??__unwind_info__TEXT?H??__DATA__nl_symbol_ptr__DATA__la_symbol_ptr__DATH__LINKEDIT ?"? 0 0h ? 8
P?
/usr/lib/dyldס??;K????t22
?*(?P
8??/usr/lib/libSystem.B.dylib&`)h UH??H?? ?E??}?H?u?H?=5??1ɉE??H?? ]Ð?%?L?yAS?%i?h?????Hello
P44?4
It shows garbled output.
I want to know what type of file a.out is. Is it assembly language? If so, why are there so many ??? or %%% characters?

There are several intermediate file formats, depending on the compiler system you use. Most systems use the following steps, shown here with GCC as an example:
1. Preprocessed C source (gcc -E test.c -o test.i). Strictly speaking, this comes before compilation.
2. Assembly source (gcc -S test.c -o test.s)
3. Object file containing machine code; it is not executable because calls to external functions are not yet resolved (gcc -c test.c -o test.o)
4. Executable file containing machine code (gcc test.c -o test)
Only the first two steps generate text files that you can read with cat or in a text editor. This is, by the way, a valuable source of insight. However, you can use objdump to see most of the information contained in the other formats. Please read its documentation.
Each step also performs all the steps before it. So gcc test.c -o test generates the assembly source and the object file as temporary files that are removed automatically. You can watch that process by giving GCC the option -v.
Use gcc --help to see some entry points for further investigations.
There is a lot more to say about this process, but it would fill a book.

Related

Why is the gcc compiler giving the compiled file a new name?

I have reinstalled mingw in my system and downloaded the gcc compiler.
I was shocked after compiling the first file, "subject.c": the compiled file that gcc produced was named "a.exe". It should be "subject.exe", and I do not know why this happened.
Can anyone please explain the reason behind this?
expected:
gcc subject.c
ls
subject.c subject.exe
tried:
gcc subject.c
ls
subject.c a.exe
-o can be used to give the name of the output file.
For example,
gcc -Wall -Wextra -pedantic subject.c -o subject.exe
(Do enable your compiler's warnings!)
In the absence of other instructions, gcc names its output file a.out or a.exe, depending on the system environment. This is its documented default, inherited from historical Unix compilers.
To override this default behavior, you can use the -o flag, which tells gcc that the next argument is the desired name for the output file. For instance:
gcc -o subject.exe subject.c
There is no automatic functionality built into gcc to strip a source file of its file extension and add .exe to the end, but this can be done with Makefiles or similar scripts. For instance, you can write a Makefile with the following contents (the recipe line must start with a tab):
%.exe: %.c
	gcc -o $@ $<
Then a command like make subject.exe would be translated to gcc -o subject.exe subject.c, which may be what you're looking for.
There is functionality built into gcc to strip source files of their extensions during other parts of the compilation process, which may have been what confused you. For instance, gcc -c subject.c can be expected to produce an object file called subject.o, and gcc -S subject.c can be expected to produce an assembly language file called subject.s. This does not apply to executable files, partly for historical reasons and partly because programs can be compiled from multiple source files, so there is not always a clear way to choose a name for the executable output.

How to remove 'bloat' from a compiled shared object?

I have a gcc C application which compiles to a shared object using the -fpic option. The intent is to create an 'executable' which allows running the code anywhere in memory. This is how a sample C program is compiled.
./armeb-eabi-gcc -march=armv5t -mbig-endian -nostdlib -fpic -c main.c
main.c
int main(){
void (*UART)() = 0x594323 | 1;
UART("Hello");
}
The problem is that the compiled executable has 'bloat', whereas I am only looking for machine code and no symbols. I was unable to extract the exact portions with objcopy and objdump, which did absolutely nothing. The file size is around 948 bytes, which is insane for such a simple program.
Here is a snippet of the 'portion' of the file I am looking for.
(The exact highlighted parts could be skewed)
Running
objcopy -I elf32-big -O binary main.o test.bin
gives a 64-byte file which, for some odd reason, moves part of the string to the top of the file. This makes tools like Ghidra and IDA unable to disassemble it properly.
Hopefully it can be seen that the reference to "Hello" is incorrect.

How to generate assembly from a cross-compiled binary?

Compile command used is:
arm-none-linux-gnueabi-gcc test.c -o test
How can I disassemble the binary test?
I have used:
objdump -d test -m arm
But it says:
test: file format elf32-little
objdump: can't use supplied machine arm
Any help?
GCC generates the assembly already, you only need to tell it not to throw the files away when finished:
arm-none-linux-gnueabi-gcc -save-temps test.c -o test
Note that the generated files will only contain the assembly language of your code and not the code that is linked in from the C library, e.g. for printf().
To see the full disassembly including library code, you can use arm-none-linux-gnueabi-objdump -d test.
Side note: "test" is a bad example binary name, as there is already a binary named test in /bin/ or /usr/bin/ on any Unix or Linux system.

gcc generates different results for different filenames

Why does gcc generate different executables for different source filenames?
To test this, I have the same C program saved as test.c and test2.c:
int main(){}
"gcc test.c -o test" and "gcc test2.c -o test2" generate different output files. Using a hex editor I can see that the source filename is still embedded in each one. Stripping the files still gives different results (although the source filename is gone). Why does gcc operate this way? I tested clang and tcc as well. Clang behaves like gcc does, whereas tcc generates the same result for different filenames.
gcc version 4.9.1 (Debian 4.9.1-1)
clang 3.4.2-4
tcc version 0.9.25
Doing a diff on the hexdumps of both binaries shows a small difference at around offset 0x0280. Looking through the sections (via objdump -x), the differences appear in the .note.gnu.build-id section. My guess is that this provides some sort of UUID for distinguishing different builds of otherwise similar code, as well as for validating debug info (referenced here, about a third of the way down).
The -o option of gcc specifies the output file. If you give it different -o targets, it will generate different files.
gcc test.c -o foo
And you have a foo executable.
Also, note that without a -o option, gcc will output an a.out executable.

Generate assembler code from C file in linux

I would like to know how to generate assembler code from a C program on Unix.
I tried gcc: gcc -c file.c
I also tried running cpp first and then as, but I'm getting errors.
I'm trying to build an assembler program from 3 different files:
prog1.c prog2.c prog.h
Is it correct to do gcc -S prog1.c prog2.c prog.h?
It seems that it is not. I don't know whether I have to generate the assembler from each of them and then link them.
Thanks
According to the manual:
`-S'
Stop after the stage of compilation proper; do not assemble. The
output is in the form of an assembler code file for each
non-assembler input file specified.
By default, the assembler file name for a source file is made by
replacing the suffix `.c', `.i', etc., with `.s'.
Input files that don't require compilation are ignored.
so try gcc -S file.c.
From man gcc:
-S Stop after the stage of compilation proper; do not
assemble. The output is an assembler code file for
each non-assembler input file specified.
By default, GCC makes the assembler file name for a
source file by replacing the suffix `.c', `.i',
etc., with `.s'. Use -o to select another name.
GCC ignores any input files that don't require compilation.
If you're using gcc (as it seems) it's gcc -S.
Don't forget to specify the include paths with -I if needed.
gcc -I ../my_includes -S my_file.c
and you'll get my_file.s with the Assembler instructions.
objdump -d also works very nicely, and will give you the assembly listing for the whole binary (exe or shared lib).
This can be a lot clearer than using the compiler generated asm since calls to functions within the same source file can show up not yet resolved to their final locations.
Build your code with -g and you can also add --line and/or --source to the objdump flags.

Resources