Checking if an ELF is packed with UPX in Linux - c

I have zero knowledge of how the ELF format works or how to access its headers and data from code; however, I need to check whether an ELF binary has been compressed (packed?) with UPX for Linux.
Checking the binary with strings I saw the string UPX!, so I guess I can use that. Hex-editing the binary shows the string, and from its position in the binary I assume it's part of one of the ELF headers (please correct me if I am wrong). This is a dump of that:
00000000 .ELF........................4...
00000020 ........4. ...(.................
00000040 ........................#...#...
00000060 #.....................[.UPX!....
00000080 ............T............?d..ELF
I don't know if this looks good, sorry.
Does anyone know how to detect UPX on Linux? If not, how to access the headers and get that UPX! string (name of the header?)?
I did look into the UPX source code but everything is C++, I am looking to code this in C, and it's really hard to follow.
Thanks, any help is welcome.
EDIT: About the bounty. The answer must give a solid example that works, since I've tried different approaches and they do not always work, like the sample below.
Thank you

These are the tests (in the magic file format used by the file command) to detect a UPX-compressed file:
>>>>(0x3c.l+0xf8) string UPX0 \b, UPX compressed
>>>>(0x3c.l+0xf8) search/0x140 UPX2
>>>(&0x7c.l+0x26) string UPX \b, UPX compressed
>>>&0x26 string UPX \b, UPX compressed
>>85 string UPX \b, UPX compressed
Use
man 5 magic
to see how the offsets inside the file are specified.
For example, in your program you should:
open the file under test for reading
skip to one of these offsets
check if the expected string is there
repeat until no more offsets
Interestingly enough, on my 64-bit Ubuntu, UPX-compressed files are not detected because this test is missing from /usr/share/misc/magic:
>>180 string UPX! UPX compressed (64-bit)
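The steps above can be sketched in C roughly as follows. This is only a sketch under a few assumptions: the function names magic_at and looks_upx_packed are made up for illustration, and the offsets 85 and 180 are simply the ones from the magic entries quoted in this answer, treated as plain absolute offsets. Other UPX versions or architectures may put the marker elsewhere (in the dump from the question it appears to sit around 0x78), so extend the table to match the files you actually see.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if `magic` occurs in buf at exactly offset `off`, else 0. */
int magic_at(const unsigned char *buf, size_t len, size_t off, const char *magic)
{
    size_t n = strlen(magic);

    if (off > len || n > len - off)     /* marker would run past the data */
        return 0;
    return memcmp(buf + off, magic, n) == 0;
}

/* Reads the first bytes of `path` and tests each known offset in turn.
 * Returns 1 if a marker is found, 0 if not, -1 if the file cannot be opened. */
int looks_upx_packed(const char *path)
{
    /* Offsets and strings taken from the magic entries above (assumption:
     * they are enough for your files; add more pairs as needed). */
    static const struct { size_t off; const char *magic; } tests[] = {
        {  85, "UPX"  },
        { 180, "UPX!" },
    };
    unsigned char buf[512];
    size_t len, i;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    len = fread(buf, 1, sizeof buf, f);
    fclose(f);

    for (i = 0; i < sizeof tests / sizeof tests[0]; i++)
        if (magic_at(buf, len, tests[i].off, tests[i].magic))
            return 1;
    return 0;
}
```

Call looks_upx_packed("yourfile") and treat 1 as "probably UPX". A stripped or deliberately altered UPX header will not be caught by a string test like this.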

In the UPX source code there's a function int PackW32Pe::canUnpack(), which is run first as a test when you do upx -d <file> (unpack executable). It shows which offsets are tested to detect whether a file was packed with UPX. I found the code clear and easy to follow. I recommend an editor with syntax highlighting.
You can download the source code for UPX on the project site.

Related

How to check the given file is binary file or not in C programming?

I'm trying to check the given file is binary or not.
I refer the link given below to find the solution,
How can I check if file is text (ASCII) or binary in C
But the given solutions do not work properly. If I pass a .c file as the argument, they give the wrong output.
The possible files I may pass as argument:
a.out
filename.c
filename.txt
filename.pl
filename.php
So I need to know whether there is any function or way to solve the problem?
Thanks...
Note: [ In case of any query, please ask me before downvoting ]
You need to clearly define what a binary file is for you, and then base your checking on that.
If you just want to filter based on file extensions, then create a list of the ones you consider binary files, and a list of the ones you don't consider binary files and then check based on that.
If you have a list of known formats rather than file extensions, attempt to parse the file in those formats, and if it doesn't parse or the parse result makes no sense, it's a binary file (for your purposes).
Depending on your OS, executable binaries begin with a header that identifies them as executables and contains several pieces of information about them (architecture, size, byte order, etc.). So by trying to parse this header, you can tell whether a file is such a binary or not. If you are on Mac, check the Mach-O file format; if you are on Linux, it should be the ELF format. But beware, it's a lot of documentation.
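For the Linux case, a minimal sketch of "try to parse the header" could look like this. It only looks at the ELF identification bytes: the magic 0x7f 'E' 'L' 'F', then the class byte (1 for 32-bit, 2 for 64-bit). The function name elf_class is made up for illustration, and a file that fails this test may still be "binary" by some other definition (an image, an archive, a Mach-O file).

```c
#include <stddef.h>
#include <string.h>

/* Returns 32 or 64 if buf starts with a valid-looking ELF identification,
 * 0 otherwise. `buf` holds the first `len` bytes of the file. */
int elf_class(const unsigned char *buf, size_t len)
{
    /* "\x7f" and "ELF" are separate literals so the hex escape
     * does not swallow the 'E'. */
    if (len < 6 || memcmp(buf, "\x7f" "ELF", 4) != 0)
        return 0;
    if (buf[4] == 1)        /* ELFCLASS32 */
        return 32;
    if (buf[4] == 2)        /* ELFCLASS64 */
        return 64;
    return 0;               /* unknown class: treat as not ELF */
}
```

Read the first few bytes of the file with fread and pass them in; a .c, .txt, .pl or .php file gives 0, while a typical a.out on Linux gives 32 or 64.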

How to save IAR IDE disassembly window contents to a file?

Using IAR IDE for building ARM executables from C source, I can see the disassembly, including labels, addresses, opcode and instructions in the relevant window.
I am trying to dump the contents of a range of addresses to a text file, but can't find a way to do that. The window text is not selectable so I cannot use copy/paste. There is no menu associated that enables this.
As an alternative, I can generate the list and assembly files, but these seem to be limited to my code, and do not contain the CRT code or any ROM sections, which I am interested in.
Any way to dump a selected address range?
You want to use ielfdumparm located in your Workbench directory under arm/bin. Here's the help for the tool.
Usage: IElfDump input_file [output_file]
Available command line options:
--all                   Dump all sections
--code                  Dump only code sections
--no_header             Do not produce a list header
--no_rel_sections       Do not output associated .rel sections
--no_strtab             Do not include strtab sections
--output file
-o file                 Name of text file to create
--raw                   Use raw text format
--section #|name[,...]
-s #|name[,...]         Dump only section(s) with given numbers/names
--source                Include source in disassembled code in executables
--use_full_std_template_names
                        Don't use short names for standard C++ templates
-a                      All sections, except strtab sections
-f file                 Read command line options from file
To get a similar output to the debug view, I would suggest --code to avoid dumping your data space, and --source to have it embed your original C woven in with the assembly.
You can specify sections, but it doesn't look like you can specify address range. You may be able to pair this with some of the other ELF tools to extract just a specific address range, and then run this tool on that. Alternatively, this dumps in address order so you could dump the entire ELF file and then just look at the address range you want after the fact.
I use Snagit to capture text that is not selectable.
Snagit is a screen snapshot tool (a very good one). Besides taking classic screenshots, it can capture text and save it as ASCII text. It can also automatically scroll windows to capture long texts.
Maybe it is worth a try. There is a 30-day trial version available.

how to find source file name from executable?

IN LINUX:
Not sure if it is possible. I have 100 source files and 100 respective executable files.
Now, given an executable file, is it possible to determine the respective source file?
I guess you can give this a try.
readelf -s a.out | grep FILE
I think you can add some grep and sed magic to the above command and get the source file name.
No, since your assumption, that a single binary comes from exactly one source file, is very false.
Most real applications consist of hundreds, if not thousands, of individual source files that are all compiled separately, with the results linked together to form the binary.
If you have non-stripped binaries, or (even better) binaries compiled with debugging information present, then there might (or will, for the case of debugging info) be information left in the file to allow you to figure out the names of the source files, but in general you won't have such binaries unless you build them yourself.
If source filenames are present in an executable, you can find them with:
strings executable | grep '\.c'
But filenames may or may not be present in the executable and they may or may not represent the source filenames.
Change .c to whatever extension you assume the program has been written in.
Your question only makes sense if we presume that it is a given fact that every single one of these 100 executables comes from a single source file, and that you have all those source files and are capable of compiling them all.
What you can do is declare within each source file a string such as "HERE!HERE!>>>" __FILE__ (adjacent string literals are concatenated by the compiler; C has no + operator for strings), and then write a utility which searches for "HERE!HERE!>>>" inside the executable and parses the string which follows it. __FILE__ is a predefined preprocessor macro which expands to the name of the source file being compiled, as it was passed to the compiler (not necessarily the full pathname).
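A rough sketch of both halves of that idea, assuming the marker text "HERE!HERE!>>>" and with made-up names (source_tag, find_tag). The tag relies on the compiler joining adjacent string literals; the search works on the executable's bytes once you have read them into memory.

```c
#include <stddef.h>
#include <string.h>

/* Put one of these in every source file. Adjacent literals are joined,
 * so the array holds the marker followed by this file's name. */
const char source_tag[] = "HERE!HERE!>>>" __FILE__;

/* Searches `len` bytes of an executable image for `marker`.
 * Returns a pointer to the text right after the marker, or NULL.
 * The caller must make sure the image ends with a NUL byte before
 * treating the result as a C string. */
const char *find_tag(const unsigned char *buf, size_t len, const char *marker)
{
    size_t n = strlen(marker);
    size_t i;

    for (i = 0; n > 0 && i + n <= len; i++)
        if (memcmp(buf + i, marker, n) == 0)
            return (const char *)(buf + i + n);
    return NULL;
}
```

Note that an optimizing toolchain may discard a tag nothing refers to (for example with link-time garbage collection of unused data), so check with strings that it actually ends up in the binary. The search utility itself should not be compiled with the same tag, or it will find its own.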
This kind of help falls in the 'close the barn door after the horse has run away' kind of thing, but it might help future posters.
This is an old problem. UNIX and Linux support the what command, which was invented by Marc Rochkind (if I remember correctly) for his version of SCCS. It handles exactly this type of problem. It is only 100% reliable for the one source file -> one executable (or object file) kind of thing. There are other, more important uses.
char unique_id[] = "@(#)identification information";
The @(#) prefix marks a "what string" and does not occur as a by-product of compiling source into an executable image. Use what from the command line. Inside code use maybe something like this (assumes you get only one file name as an answer, therefore choose your what strings carefully):
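#include <stdio.h>
#include <string.h>
/* Stores in whoami the first source file that contains unique_id. */
char *foo(char *whoami, size_t len_whoami)
{
    char tmp[256] = {0};
    FILE *cmd;
    /* snprintf cannot overflow tmp, unlike sprintf */
    snprintf(tmp, sizeof tmp, "/usr/bin/grep -F -l '%s' /path/to/*.c", unique_id);
    cmd = popen(tmp, "r");
    if (cmd == NULL)
        return NULL;
    if (fgets(whoami, (int)len_whoami, cmd) == NULL)
        whoami[0] = '\0';                   /* no match found */
    pclose(cmd);
    whoami[strcspn(whoami, "\n")] = '\0';   /* drop the trailing newline */
    return whoami;
}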
will return the source code file name with the same what string from which your executable was built. In other words, exactly what you asked, except I'm sure you never heard of what strings, so they do not exist in your current code base.

"SO" file conversion to readable format

Is there any way to convert a ".so" file into source code or some other readable format?
Source code is probably hard, since the .so doesn't "know" which language it was written in.
But you can browse around in the assembly code by doing something like this:
$ objdump --disassemble my_secret.so | less

Detecting UPX programmatically

I'm trying to figure out how to detect whether a binary has been compressed with UPX. I am using a simple CRC to detect whether my app was in any way changed and if the CRC failed on the size due to a packer I would like to detect that as OK.
Right now I am starting with UPX.
So, is there any marker in the binary? Are there any specific JMP or other instructions that I should search for?
This will mainly be tested in Windows, but in the future I might add it to Linux as well.
Any help (and code) is appreciated.
ADDED:
I found that in the 10 binaries I checked the
AddressOfEntryPoint
Import Directory RVA
Resource Directory RVA
either point to UPX or have an offset that is set by UPX. Any information on this?
Thanks
Download the UPX source code from the UPX homepage and open the src/p_w32pe.cpp file; the function you are looking for is:
int PackW32Pe::canUnpack()
This function checks if the file is compressed with win32 upx.
You might try checking the section names of the executable. UPX changes them to UPX0, UPX1, UPX2, I believe.
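A minimal sketch of the section-name check for a Windows PE file, written against the documented PE layout rather than copied from UPX: the 4-byte value at 0x3c gives the offset of the "PE\0\0" signature, the 20-byte COFF header follows it (NumberOfSections at +6, SizeOfOptionalHeader at +20 from the signature), and the section table of 40-byte entries starts after the optional header. The function name pe_has_upx_section is made up, and section names can be renamed after packing, so a 0 here does not prove the file is unpacked.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian readers, so the code works on any host byte order. */
static uint16_t rd16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* `buf` holds the first `len` bytes of the file.
 * Returns 1 if a section name starts with "UPX", 0 if none does,
 * -1 if the data does not look like a PE image. */
int pe_has_upx_section(const unsigned char *buf, size_t len)
{
    size_t pe, nsec, opt, tab, i;

    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;
    pe = rd32le(buf + 0x3c);                    /* e_lfanew */
    if (pe > len || len - pe < 24 || memcmp(buf + pe, "PE\0\0", 4) != 0)
        return -1;

    nsec = rd16le(buf + pe + 6);                /* NumberOfSections */
    opt  = rd16le(buf + pe + 20);               /* SizeOfOptionalHeader */
    tab  = pe + 24 + opt;                       /* first section header */

    for (i = 0; i < nsec; i++) {
        size_t off = tab + i * 40;              /* 40 bytes per entry */
        if (off > len || len - off < 40)
            break;                              /* table runs past the data */
        if (memcmp(buf + off, "UPX", 3) == 0)   /* UPX0, UPX1, UPX2, ... */
            return 1;
    }
    return 0;
}
```

Reading the first few kilobytes of the file is normally enough, since the headers and section table sit at the start.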

Resources