How to see the instructions inside a compiled executable file? - c

When I compile a C/C++ source file, the compiler generates an executable file. How can I see the instructions in that file? What is this process called?
gcc hello.c -o hello
./hello
Here, after executing the first line, a file named 'hello' is generated. I need to see the instructions in this 'hello' file.

The executable a.out file is in binary format.
You can open it in any text editor (e.g. vi, vim) or hex editor, but you won't be able to understand the contents.
You can use some commands to get more information about what is contained in the executable file.
Some example commands are: nm, strings, objdump
Example:
$ nm a.out
$ strings a.out
$ objdump -xD --demangle a.out
Read their manual pages to learn more about them.
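To illustrate what "binary format" means here, the following is a minimal sketch that checks whether a file starts with the ELF magic number (the bytes 0x7f 'E' 'L' 'F'), which is how Linux executables begin. It assumes a Linux system that produces ELF files; executables on macOS use Mach-O and would not match. The function names has_elf_magic and file_is_elf are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the first n bytes of buf begin with the ELF magic
   number 0x7f 'E' 'L' 'F', otherwise 0. */
int has_elf_magic(const unsigned char *buf, size_t n)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    assert(buf != NULL || n == 0);
    return n >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Returns 1 if the file at path starts with the ELF magic,
   0 if it does not, and -1 if the file cannot be opened. */
int file_is_elf(const char *path)
{
    unsigned char buf[4];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    return has_elf_magic(buf, n);
}
```

Calling file_is_elf("a.out") on an executable built by gcc on Linux should return 1, while calling it on hello.c should return 0. Tools such as objdump go much further and decode the headers, sections, and machine instructions that follow the magic number.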

Related

What do I get after I compile a C file?

I compiled hello.c with gcc:
dele-MBP:temp ldl$ ls
a.out hello.c
Now, when I cat a.out:
$ cat a.out
??????? H__PAGEZERO?__TEXT__text__TEXTP1P?__stubs__TEXT??__stub_helper__TEXT???__cstring__TEXT??__unwind_info__TEXT?H??__DATA__nl_symbol_ptr__DATA__la_symbol_ptr__DATH__LINKEDIT ?"? 0 0h ? 8
P?
/usr/lib/dyldס??;K????t22
?*(?P
8??/usr/lib/libSystem.B.dylib&`)h UH??H?? ?E??}?H?u?H?=5??1ɉE??H?? ]Ð?%?L?yAS?%i?h?????Hello
P44?4
It shows this garbled output.
I want to know what type of file a.out is. Is it assembly language? If so, why are there so many ??? or %%% characters?
There are several intermediate file formats, depending on the compiler system you use. Most systems use the following steps, here shown with GCC as example:
Preprocessed C source (gcc -E test.c -o test.i), but this is before compilation, strictly speaking
Assembly source (gcc -S test.c -o test.s)
Object file containing machine code, not executable because calls to external functions are not resolved (gcc -c test.c -o test.o)
Executable file containing machine code (gcc test.c -o test)
Only the first two steps generate text files that you can read with cat or in a text editor. This is, by the way, a valuable source of insight. However, you can use objdump to see most of the information contained in the other formats. Please read its documentation.
Each step also performs all the steps before it. So gcc test.c -o test generates the assembly source and the object file as temporary files that are removed automatically. You can watch that process by giving GCC the option -v.
Use gcc --help to find some entry points for further investigation.
There is a lot more to say about this process, but it would fill a book.

What is the difference between executable files?

I have the following C program:
#include<stdio.h>
int main()
{
printf("hhhh");
return 0;
}
Commands to compile, copy and compare:
$ gcc print.c -o a.out
$ objcopy a.out b.out
$ cmp a.out b.out
I have compiled this program and created an executable. Then, I have used the objcopy command to make a copy of the executable. But, when I compare these files, I get this:
files differ: byte 41, line 1
How can I know what contents are missing?
Any help or pointers would be appreciated. Thanks!
How can I know what contents are missing?
What made you believe that any content is missing?
The way objcopy works is:
parse the contents of the input file into internal representation.
copy parts of the original file to the output file as instructed by options
Nowhere does objcopy guarantee that when "copy entire file" is given, the result will be bit-identical.
In particular, non-loadable sections could be reordered or changed in other ways.
The difference is that the EntSize of the .init_array section is 0 bytes in a.out and 8 bytes in b.out.
An EntSize of 0 doesn't make sense for a non-empty section. If you really have such a section in your a.out, it's likely that your linker has a bug.
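As a side note on the cmp output: cmp counts bytes from 1, so "byte 41" is file offset 40. In a 64-bit ELF header that is the first byte of the e_shoff field, which holds the file offset of the section header table. A difference there would be consistent with objcopy laying out the non-loadable parts of the file differently. The sketch below assumes the files are 64-bit little-endian ELF (an assumption, since the question does not say so), and the helper names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of e_shoff within a 64-bit ELF header:
   e_ident(16) + e_type(2) + e_machine(2) + e_version(4)
   + e_entry(8) + e_phoff(8) = 40. */
enum { ELF64_SHOFF_OFFSET = 40 };

/* Decodes the little-endian e_shoff field from a buffer holding
   at least the first 48 bytes of a 64-bit ELF file. */
uint64_t elf64_shoff_le(const unsigned char *hdr)
{
    assert(hdr != NULL);
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--)
        value = (value << 8) | hdr[ELF64_SHOFF_OFFSET + i];
    return value;
}

/* Mimics the position cmp reports: the 1-based index of the first
   differing byte, or 0 if the first n bytes are identical. */
size_t first_difference(const unsigned char *a, const unsigned char *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return i + 1;
    return 0;
}
```

Reading the first 64 bytes of a.out and b.out into two buffers and passing each to elf64_shoff_le would show whether the section header table moved. readelf -h prints the same field as "Start of section headers".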

Why is only a.out created every time I compile different programs in a directory? [duplicate]

I am learning C on Linux Mint, so I made a directory in which I keep my programs. Whenever I compile a program, a.out is overwritten with the newly compiled program.
For example, to compile hello.c I run the command cc hello.c. This creates a.out, but I want it to be hello.out.
Why is that?
How can I compile so that hello.c produces a hello.out file?
Why only a.out is being created everytime when I make different programs in directory?
Because that's the default behavior of your compiler.
How can I compile so that hello.c should create hello.out file?
In general, refer to the documentation for the compiler you're using, which will tell you how to do this.
Assuming you're using gcc or similar, it's the -o option:
gcc hello.c -o hello.out
The default executable name on Unix/Linux is a.out. To choose your own name, compile the program like this:
cc hello.c -o hello.out
./hello.out # run the executable you named
or
cc hello.c -o hello
./hello # run the executable you named
Unix/Linux doesn't care about extensions. The name after -o (here, hello) is simply the name you want the compiler to give the executable.
Why every time a.out is created?
a.out remains the default output file name for executables created by certain compilers/linkers when no output name is specified, even though these executables are no longer in the a.out format.
Please see the wiki page of a.out.
Share your compilation command.
In the compilation command you can specify the name of the output binary, for example "gcc hello.c -o hello". The binary will then be named hello, because the name that follows the -o flag is used for the output file.
If you don't add the flag "-o" with a name, then the default name for the binary file is "a.out".

How to check if a macro exists in an object file in C?

For example, I define a macro:
#ifdef VERSION
//.... do something
#endif
How can I check whether VERSION exists in my object file or not? I tried to disassemble it with objdump, but found no actual value of my macro VERSION. VERSION is defined in the Makefile.
Try compiling with -g3 option in gcc. It stores macro information too in the generated ELF file.
After this, if you've defined a macro MACRO_NAME just grep for it in the output executable or your object file. For example,
$ grep MACRO_NAME a.out # any object file will do instead of a.out
Binary file a.out matches
Or you can even try,
$ strings -a -n 1 a.out | grep MACRO_NAME
-a Do not scan only the initialized and loaded sections of object files;
scan the whole files.
-n min-len Print sequences of characters that are at least min-len characters long,
instead of the default 4.
The following command displays contents of .debug_macro DWARF section:
$ readelf --debug-dump=macro path/to/binary
or
$ objdump --dwarf=macro path/to/binary
You can also use dwarfdump path/to/binary, but it's not easy to leave only .debug_macro section in the output.
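A different approach from the -g3 route above, shown only as a sketch: macros are expanded by the preprocessor and leave no trace in a normal object file, but you can deliberately store the macro's value in a string constant so that strings or grep can find it. The names STRINGIFY, BUILD_TAG, build_tag, and get_build_tag are invented for this example, and the "BUILD_VERSION=" prefix is an arbitrary marker chosen to make the string easy to search for.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification so that VERSION is expanded to its
   value before being turned into a string literal. */
#define STRINGIFY_RAW(x) #x
#define STRINGIFY(x) STRINGIFY_RAW(x)

#ifdef VERSION
#define BUILD_TAG "BUILD_VERSION=" STRINGIFY(VERSION)
#else
#define BUILD_TAG "BUILD_VERSION=undefined"
#endif

/* External linkage, so the compiler keeps the string in the object file. */
const char build_tag[] = BUILD_TAG;

/* Returns the embedded tag; the assert checks the marker prefix is intact. */
const char *get_build_tag(void)
{
    assert(strncmp(build_tag, "BUILD_VERSION=", 14) == 0);
    return build_tag;
}
```

Assuming the file is saved as tag.c, compiling it with something like gcc -DVERSION=1.2 -c tag.c and then running strings tag.o | grep BUILD_VERSION should reveal whether VERSION was defined at build time and with what value, without needing debug information.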

How can I tell if a library was compiled with -g?

I have some compiled libraries on x86 Linux and I want to quickly determine whether they were compiled with debugging symbols.
If you're running on Linux, use objdump --debugging. There should be an entry for each object file in the library. For object files without debugging symbols, you'll see something like:
objdump --debugging libvoidincr.a
In archive libvoidincr.a:
voidincr.o: file format elf64-x86-64
If there are debugging symbols, the output will be much more verbose.
The suggested command
objdump --debugging libinspected.a
objdump --debugging libinspected.so
always gives me the same result, at least on Ubuntu/Linaro 4.5.2:
libinspected.a: file format elf64-x86-64
libinspected.so: file format elf64-x86-64
regardless of whether the archive or shared library was built with or without the -g option.
What really helped me determine whether -g was used is the readelf tool:
readelf --debug-dump=decodedline libinspected.so
or
readelf --debug-dump=line libinspected.so
This prints a set of lines consisting of source filename, line number, and address if such debug info is included in the library; otherwise it prints nothing.
You may pass whatever value you find necessary to the --debug-dump option instead of decodedline.
What helped is:
gdb mylib.so
When debug symbols are not found, it prints:
Reading symbols from mylib.so...(no debugging symbols found)...done.
Or when found:
Reading symbols from mylib.so...done.
None of the earlier answers gave meaningful results for me: for example, libraries without debug symbols still produced lots of output.
nm -a <lib> will print all symbols from library, including debug ones.
So you can compare the outputs of nm <lib> and nm -a <lib> - if they differ, your lib contains some debug symbols.
On OSX you can use dsymutil -s and dwarfdump.
Using dsymutil -s <lib_file> | more you will see source file paths in files that have debug symbols, but only the function names otherwise.
You can use objdump for this.
EDIT: From the man-page:
-W
--dwarf
Displays the contents of the DWARF debug sections in the file, if
any are present.
Answers suggesting the use of objdump --debugging or readelf --debug-dump=... don't work in the case that debug information is stored in a file separate from the binary, i.e. the binary contains a debug link section. Perhaps one could call that a bug in readelf.
The following code should handle this correctly:
# Test whether debug information is available for a given binary,
# either embedded (.debug_info) or in a separate file (.gnu_debuglink)
has_debug_info() {
    readelf -S "$1" | grep -Eq ' \.(debug_info|gnu_debuglink) '
}
See Separate Debug Files in the GDB manual for more information.
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/developer_guide/debugging
The command readelf -wi file is a good verification of debuginfo, compiled within your program.

Resources