How to remove 'bloat' from a compiled shared object? - c

I have a gcc C application which compiles to a shared object using the -fpic option.
The intent is to create an 'executable' that can run from anywhere in memory. This is how a sample C program is compiled:
./armeb-eabi-gcc -march=armv5t -mbig-endian -nostdlib -fpic -c main.c
main.c
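int main(void) {
    /* Fixed address of the UART routine; bit 0 is set, presumably to call it in Thumb mode. */
    void (*UART)(const char *) = (void (*)(const char *))(0x594323 | 1);
    UART("Hello");
}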
The problem is that the compiled output has 'bloat': I am only looking for the machine code, with no symbols. I was unable to extract the exact portion with objcopy and objdump, which did absolutely nothing. The file size is around 948 bytes, which is insane for such a simple program.
Here is a snippet of the 'portion' of the file I am looking for.
(The exact highlighted parts could be skewed)
Running
objcopy -I elf32-big -O binary main.o test.bin
gives a 64-byte file which, for some odd reason, moves part of the string to the top of the file, so tools like Ghidra and IDA cannot disassemble it properly.
Hopefully it can be seen that the reference to "Hello" is incorrect.

Related

Does it matter whether a compiled C program ends in .o?

For a hello world program, hello.c, does it matter if I compile it to a file name ending in .o? Or is it just a convention? E.g. should I do this:
gcc -o hello.o hello.c
Or this:
gcc -o hello hello.c
In a Linux environment
The situation here is a bit confusing because there are two kinds of "object files" — those that are truly intermediate object files (the ones normally ending in .o), and final executables.
You can use a typical command-line C compiler in two ways. You can compile to an intermediate object file, using the -c option, and then "link" to a final executable as a second step:
cc -c -o hello.o hello.c # step 1
cc -o hello hello.o # step 2
Or you can compile and link in one fell swoop:
cc -o hello hello.c # step 3
In the first case, when you compile and link in separate steps, the extension .o for the intermediate object file is the very strong convention by which everybody knows that it is in fact an intermediate object file. Notice the difference between steps 2 and 3. In step 3, the way the compiler knows it has some compiling to do is the extension .c. In step 2, on the other hand, the extension .o tells it the file is already compiled, and merely needs to be linked.
(Footnotes: Actually the compiler might assume in step 2 that any unrecognized filename was an intermediate object file to be linked. Also, we're talking about Unix here. Under Windows, the conventional extension for intermediate object files is .obj.)
Also, as you may know, the extension .o is very much the default when compiling only. In step 1, it would have sufficed to just say cc -c hello.c.
The advantage to "separate compilation" is that it gives you a lot more flexibility. If you have a larger program, made from several source files, you could recompile everything, all at once, every time, like this:
cc -o program file1.c file2.c file3.c
But if you compile separately, like this:
cc -c file1.c
cc -c file2.c
cc -c file3.c
cc -o program file1.o file2.o file3.o
then later, when you make a change to, say, file2.c, you can take a shortcut and only recompile that one file. (This does come at the cost of some disk space, to keep all those intermediate .o files around, and some complexity and extra typing, which for larger programs you usually let a build program like make take care of for you.)
Another thing you can do is to compile the same file multiple ways. For example, I often find myself wanting to test a utility function in a "standalone" way. As an (unrealistically simple) example, suppose that file3.c contains a function to multiply a number by two:
int doubleme(int x)
{
return x * 2;
}
Suppose that, elsewhere in file1.c and file2.c, whenever I want to multiply an integer by 2, I call my doubleme function. (Obviously this is completely silly and unrealistic, but it's just an example.)
But suppose you want a way to test the doubleme function, in a standalone way. I will often do something like this. At the end of file3.c, I will add:
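#ifdef TEST_MAIN
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
    /* Refuse to run without an argument instead of dereferencing a null argv[1]. */
    if (argc < 2) {
        fprintf(stderr, "usage: %s number\n", argv[0]);
        return 1;
    }
    int x = atoi(argv[1]);
    printf("doubleme(%d) = %d\n", x, doubleme(x));
    return 0;
}
#endif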
Now I can compile file3.c in two different ways. If I compile it normally, like this:
cc -c file3.c
then I get file3.o, containing the compiled version of the doubleme function, which I can link in when I build program. Or, I can say
cc -c -DTEST_MAIN -o file3_test.o file3.c
cc -o file3_test file3_test.o
and then I can invoke things like
file3_test 55
to test out the function.
By convention (on Linux at least), the .o extension implies an object file, not an executable. So, yes, you can use this extension, as in gcc -o hello.o hello.c, but it's misleading and a bad idea. Better to do gcc -o hello hello.c.
However, if you are building the object file (i.e. compiling only, not linking), you would use the -c option, as in gcc -c hello.c, which creates the object file hello.o.
(Just summarizing what's already in the comments.)

What do I get after I compile the C file?

I used gcc to compile hello.c:
dele-MBP:temp ldl$ ls
a.out hello.c
Now, when I cat a.out:
$ cat a.out
??????? H__PAGEZERO?__TEXT__text__TEXTP1P?__stubs__TEXT??__stub_helper__TEXT???__cstring__TEXT??__unwind_info__TEXT?H??__DATA__nl_symbol_ptr__DATA__la_symbol_ptr__DATH__LINKEDIT ?"? 0 0h ? 8
P?
/usr/lib/dyldס??;K????t22
?*(?P
8??/usr/lib/libSystem.B.dylib&`)h UH??H?? ?E??}?H?u?H?=5??1ɉE??H?? ]Ð?%?L?yAS?%i?h?????Hello
P44?4
That shows the messy output.
I want to know what type of file a.out is. Is it assembly language? If it is, why are there so many ??? or %%% characters?
There are several intermediate file formats, depending on the compiler system you use. Most systems use the following steps, here shown with GCC as example:
1. Preprocessed C source (gcc -E test.c -o test.i), but this is before compilation, strictly speaking
2. Assembly source (gcc -S test.c -o test.s)
3. Object file containing machine code, not executable because calls to external functions are not resolved (gcc -c test.c -o test.o)
4. Executable file containing machine code (gcc test.c -o test)
Only the first two steps generate text files that you can read with cat or in a text editor. These are, by the way, a valuable source of insight. However, you can use objdump to see most of the information contained in the other formats. Please read its documentation.
Each step also performs all the steps before it. So gcc test.c -o test generates the assembly source and the object file as temporary files that are removed automatically. You can watch that process by giving GCC the -v option.
Use gcc --help to see some entry points for further investigations.
There is a lot more to say about this process, but it would fill a book.

How to compile c program into elf format? [closed]

I am new to Linux. I am trying to compile my C program into an ELF file so I can use readelf to find information about the functions, etc. Whenever I run readelf on the output file (after compiling my C program), it says it is not an ELF file. So how do I compile my C program into an ELF file? Or maybe I am misunderstanding something? I am using gcc to compile.
Here's my command line for compiling:
gcc -Wall main.c a.out
and then readelf -a a
OK, so I compiled it with gcc -o test -Wall main.c and it compiled with no errors. I then ran readelf -a test and it still says it's not an ELF file, and when I run file on it I get: PE32+ executable (console) x86-64, for MS Windows. So what's going on here?
Let's try it:
gcc -Wall test.c a.out
gcc: error: a.out: No such file or directory
That's a strong hint that something went wrong. gcc doesn't produce any a.out file this way (and if an a.out file already exists, passing it like this tells gcc to treat it as an input file; since it's not a valid C source file, you'll get a.out: file not recognized: File truncated, and it will end badly as well).
You need to specify the output name with the -o switch (if a.out is the name you want on Unix/Linux, just leave it out, since that is the default):
gcc -Wall test.c
will create an a.out executable (in ELF format) if there are no compilation errors.
gcc -o myexe -Wall test.c
lets you change the executable name.
EDIT: you're not running Linux but Cygwin (on Windows). That doesn't invalidate the answer above, but Cygwin's gcc creates native Windows executables, not ELF files. You cannot create ELF files with that gcc (unless you get a Windows-to-Linux cross-compiler, if one exists).
The readelf command is present in the Cygwin distribution, but it won't read programs compiled with Cygwin's gcc. It can analyze ELF files from Linux or other systems that use that executable format, but not the Windows PE format.

In C how do I compile and then create an executable file with a header and two c files?

I have three C files in total. One is a header [.h] file, two are source [.c] files.
The .h file is called encryption.h and the corresponding source file is encryption.c. The encryption.c has logic, but no main() function. My second c file is called main.c. There I have the main() function that calls methods from encryption.c.
I am compiling these files in the terminal on Mac OS X. I am confused about how to compile this; I have tried the following:
gcc -c main.c
gcc -c encryption.c
gcc -c encryption.h
gcc main.o encryption.o encryption.h.gch -o encrypt
This doesn't seem to work though, it says I have a precompiled-header already. I tried finding the answer online, I know it has to be simple, but I haven't had much luck. What is the issue here?
Don't compile the header file. Header files are meant to be included in the source files (using the #include directive in C). Just compile the source files and link them together. Something like
gcc -c main.c
gcc -c encryption.c
gcc main.o encryption.o -o encrypt
or, for shorthand,
gcc main.c encryption.c -o encrypt
Note: If you're concerned about whether a header file was actually included during compilation, check the preprocessed output of each source file using the gcc -E option.

Linux Novice Question: GCC Compiler output

I am a complete novice with Linux. I have Mint on a laptop and have recently been playing around with it.
I wrote a simple C program and saved the file.
Then in the command line I typed
gcc -c myfile
and out popped a file called a.out. I naively (after years of Windows usage) expected a nice .exe file to appear. I have no idea what to do with this a.out file.
Name it with -o and skip the -c:
gcc -Wall -o somefile myfile
You should name your source files with a .c extension though.
The typical way of compiling e.g. two source files into an executable:
#Compile (the -c) a file, this produces an object file (file1.o and file2.o)
gcc -Wall -c file1.c
gcc -Wall -c file2.c
#Link the object files, and specify the output name as `myapp` instead of the default `a.out`
gcc -o myapp file1.o file2.o
You can make this into a single step:
gcc -Wall -o myapp file1.c file2.c
Or, for your case with a single source file:
gcc -Wall -o myapp file.c
The -Wall part means "enable (almost) all warnings" - this is a habit you should pick up from the start, it'll save you a lot of headaches debugging weird problems later.
The a.out name is a leftover from older Unixes, where it was an executable format. Linkers still name files a.out by default, even though they tend to produce ELF and not a.out format executables now.
a.out is the executable file.
Run it:
./a.out

Resources