"SO" file conversion to readable format - c

Is there any way to convert a ".so" file into source code or some other readable format?

Source code is probably hard, since the .so doesn't "know" which language it was written in.
But you can browse around in the assembly code by doing something like this:
$ objdump --disassemble my_secret.so | less

Related

what are dump and auxiliary files?

I'm a newcomer to Linux and the gcc commands. I was reading the
gcc documentation, particularly about the -o flag, where it mentions the following:
Though -o names only the primary output, it also affects the naming of
auxiliary and dump outputs. See the examples below. Unless overridden,
both auxiliary outputs and dump outputs are placed in the same
directory as the primary output. In auxiliary outputs, the suffix of
the input file is replaced with that of the ...
They mention it quite a lot following this paragraph but don't explain it. I've skimmed the document and also looked online but haven't found any satisfactory explanation. If someone could provide me some explanation or even link me to some resources where I can learn about these terms it would be greatly appreciated. Thanks!
-o file
Place the output in file. This applies regardless of the type of output produced, whether it is an executable file, an object file, an assembler file or preprocessed C code.
Since only one output file can be specified, it makes no sense to use -o when compiling more than one input file, unless you want to output an executable file.
If -o is not specified, the default behavior is to produce an executable file named a.out, an object file for source.suffix named source.o, its assembler file in source.s, and all C source code preprocessed on standard output.
source: http://www.linuxcertif.com/man/1/gcc/
hope it will be useful

mruby: generating readable c code

I am beginning with mruby, and I need a little help generating readable .c code using mrbc. I was following this article :
Here it is mentioned :
$ mruby/bin/mrbc -Cinit_tester test_program.rb
will produce test_program.c with some content.
but on my machine when I run this command it says :
mrbc: output file should be specified to compile multiple files
Then I tried
$ mruby/bin/mrbc -Binit_tester test_program.rb
which works and generates a C file, but its contents are only bytecode:
#include <stdint.h>
const uint8_t init_tester[] = {0x45,0x54,0x49,0x52,0x30,0x30,0x30,0x33,0x73,0x0d,0x00,0x00,0x00,0x65,0x4d,0x41,0x54,0x5a,0x30,0x30,0x30,0x30,0x49,0x52,0x45,0x50,0x00,0x00,0x00,0x47,0x30,0x30,0x30,0x30,0x00,0x00,0x00,0x3f,0x00,0x01,0x00,0x04,0x00,0x00,0x00,0x00,0x00,0x04,0x06,0x00,0x80,0x00,0x3d,0x00,0x00,0x01,0xa0,0x00,0x80,0x00,0x4a,0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x0b,0x68,0x65,0x6c,0x6c,0x6f,0x20,0x77,0x6f,0x72,0x6c,0x64,0x00,0x00,0x00,0x01,0x00,0x04,0x70,0x75,0x74,0x73,0x00,0x45,0x4e,0x44,0x00,0x00,0x00,0x00,0x08, };
This is basically the bytecode of the mruby program, embedded in C code.
If you look at the blog, under the section Readable C Code (.c), this should have actually generated C code.
Why is mrbc not generating readable C code?
Well, mrbc is a compiler that generates the binary format of Ruby code that RiteVM understands, so there is no way to generate readable C code.
Instead, with the -v option you can see the AST and VM code of your program
(I prefer to pass the -c option too, since mrbc will generate *.mrb files without it).

how to find source file name from executable?

IN LINUX:
Not sure if it is possible. I have 100 source files and 100 respective executable files.
Now, given an executable file, is it possible to determine the respective source file?
I guess you can give this a try.
readelf -s a.out | grep FILE
I think you can add some grep and sed magic to the above command and get the source file name.
No, since your assumption, that a single binary comes from exactly one source file, is very false.
Most real applications consist of hundreds, if not thousands, of individual source files that are all compiled separately, with the results linked together to form the binary.
If you have non-stripped binaries, or (even better) binaries compiled with debugging information present, then there might (or will, for the case of debugging info) be information left in the file to allow you to figure out the names of the source files, but in general you won't have such binaries unless you build them yourself.
If source filenames are present in an executable, you can find them with:
strings executable | grep '\.c'
But filenames may or may not be present in the executable and they may or may not represent the source filenames.
Change .c to whatever extension you assume the program has been written in.
Your question only makes sense if we presume that it is a given fact that every single one of these 100 executables comes from a single source file, and that you have all those source files and are capable of compiling them all.
What you can do is to declare within each source file a string such as "HERE!HERE!>>>" __FILE__ (adjacent string literals are concatenated by the compiler) and then write a utility which searches for "HERE!HERE!>>>" inside the executable and parses the string which follows it. __FILE__ is a predefined preprocessor macro which expands to the name of the source file being compiled, as it was passed to the compiler, so it is a full pathname only if the compiler was invoked with one.
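The marker idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names SRC_MARKER, src_tag and find_source_name are hypothetical, and the search assumes the whole executable has already been read into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical marker; any byte sequence unlikely to occur by chance works. */
#define SRC_MARKER "HERE!HERE!>>>"

/* Adjacent string literals are concatenated at compile time, so this array
   holds the marker followed by the file name as it was given to the compiler.
   It has external linkage so the compiler does not discard it as unused
   (an assumption: aggressive link-time section removal could still drop it). */
const char src_tag[] = SRC_MARKER __FILE__;

/* Scan buf[0..len) for the marker and copy the NUL-terminated text that
   follows it into out (truncated to out_len - 1 characters).
   Returns 1 if the marker was found, 0 otherwise. */
int find_source_name(const unsigned char *buf, size_t len,
                     char *out, size_t out_len)
{
    const size_t mlen = strlen(SRC_MARKER);
    assert(buf != NULL && out != NULL && out_len > 0);
    for (size_t i = 0; i + mlen <= len; i++) {
        if (memcmp(buf + i, SRC_MARKER, mlen) != 0)
            continue;
        size_t j = 0;
        for (size_t k = i + mlen;
             k < len && buf[k] != '\0' && j + 1 < out_len; k++)
            out[j++] = (char)buf[k];
        out[j] = '\0';
        return 1;
    }
    return 0;
}
```

To use it, read the executable with fopen(path, "rb") and fread into a buffer, then pass the buffer and its size to find_source_name. Only the first marker is reported, which matches the one source file per executable assumption of the question.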
This kind of help falls in the 'close the barn door after the horse has run away' kind of thing, but it might help future posters.
This is an old problem. UNIX and Linux support the what command, which was invented by Marc Rochkind (if I remember correctly) for his version of SCCS. It handles exactly this type of problem. It is only 100% reliable for the one source file -> one executable (or object file) case. There are other, more important uses.
char unique_id[] = "@(#)identification information";
The @(#) is called a "what string" and does not occur as a by-product of compiling source into an executable image. Use what from the command line. Inside code, use maybe something like this (it assumes you get only one file name as an answer, so choose your what strings carefully):
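#include <stdio.h>
#include <string.h>
/* Stores in whoami the name of the first .c file containing unique_id. */
char *foo(char *whoami, size_t len_whoami)
{
    char tmp[256] = {0};
    FILE *cmd;
    /* snprintf cannot overflow tmp; unique_id must not contain a single quote */
    snprintf(tmp, sizeof tmp, "/usr/bin/grep -F -l '%s' /path/to/*.c", unique_id);
    cmd = popen(tmp, "r");
    if (cmd == NULL)
        return NULL;
    if (fgets(whoami, (int)len_whoami, cmd) == NULL)
        whoami[0] = '\0';                      /* no matching file */
    else
        whoami[strcspn(whoami, "\n")] = '\0';  /* drop the trailing newline */
    pclose(cmd);
    return whoami;
}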
will return the source code file name with the same what string from which your executable was built. In other words, exactly what you asked, except I'm sure you never heard of what strings, so they do not exist in your current code base.

linking *.o files in Windows

When I'm linking .o files with the LD linker using MinGW on Windows, it gives me the error "file.o: File not recognized: file format not recognized". I've tried to do it with Cygwin instead, but the same thing happens. Any suggestions?
Most likely you have an object file in a format that the linker does not understand. There are lots of different formats out there: COFF, OMF, ELF (the list goes on..)
Fortunately there is a free tool that lets you convert from one format to another. It also lets you look into the internals of an object file and tells you which format it is encoded in.
http://www.agner.org/optimize/#objconv
That little command line utility solved all the object format problems I ever had. It can even disassemble libs, object files, DLLs and executables.

Which program creates a C array given any file?

I remember seeing in the past a program that would take any file and generate a C array representing that file as output; it would prevent distribution of a separate file in some cases. Which Unix/Linux program does that?
xxd -i
For large files, converting to text and then making the compiler parse it all over again is inefficient and unnecessary. Use objcopy instead:
objcopy -I binary -O elf32-i386 stuff stuff.o
(Adjust the output architecture as necessary for non-x86 platforms.) Then once you link it into your program, you can access it like so:
extern char _binary_stuff_start[], _binary_stuff_end[];
#define SIZE_OF_STUFF (_binary_stuff_end - _binary_stuff_start)
...
foo(_binary_stuff_start[i]);
hexdump -v -e '16/1 "0x%x," "\n"'
would generate a C-like array from stdin, but there is no declaration, no braces, and no good formatting.
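If neither tool is at hand, the missing declaration and braces are easy to produce in a few lines of C. The following is a minimal sketch, not a replacement for xxd: the function name write_c_array, the 12-bytes-per-row layout and the name_len variable are choices made for this illustration, and it assumes the data is already in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write data[0..len) to out as a C array definition called `name`,
   12 bytes per row, followed by a variable holding the length.
   Returns 0 on success, -1 on empty input or a write error. */
int write_c_array(FILE *out, const char *name,
                  const unsigned char *data, size_t len)
{
    assert(out != NULL && name != NULL);
    if (len == 0)
        return -1;  /* an empty initializer is not valid before C23 */
    fprintf(out, "const unsigned char %s[] = {", name);
    for (size_t i = 0; i < len; i++) {
        /* start a new indented row every 12 bytes, otherwise one space */
        fputs(i % 12 == 0 ? "\n    " : " ", out);
        fprintf(out, "0x%02x%s", (unsigned)data[i], i + 1 < len ? "," : "");
    }
    fprintf(out, "\n};\nconst unsigned long %s_len = %lu;\n",
            name, (unsigned long)len);
    return ferror(out) ? -1 : 0;
}
```

Calling write_c_array(stdout, "greeting", buf, n) after reading a file into buf prints a definition that can be pasted into a source file or redirected into a header. For large files the objcopy approach is still the cheaper one, since nothing has to be re-parsed by the compiler.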
I know this is Unix/Linux question, but anyone viewing this that wants to do the same in Windows can use Bin2H.
The easiest way is:
xxd -i -a filename

Resources