Optimized code on Unix? - c

What is the best and easiest method to debug optimized code on Unix which is written in C?
Sometimes we also don't have the code for building an unoptimized library.

This is a very good question. I had similar difficulties in the past where I had to integrate 3rd party tools inside my application. From my experience, you need to have at least meaningful callstacks in the associated symbol files. This is merely a list of addresses and associated function names. These are usually stripped away and from the binary alone you won't get them... If you have these symbol files you can load them while starting gdb or afterward by adding them. If not, you are stuck at the assembly level...
One weird behavior: even if you have the source code, it'll jump forth and back at places where you would not expect (statements may be re-ordered for better performance) or variables don't exist anymore (optimized away!), setting breakpoints in inlined functions is pointless (they are not there but part of the place where they are inlined). So even with source code, watch out these pitfalls.
I forgot to mention, the symbol files usually have the extension .gdb, but it can be different...

This question is not unlike "what is the best way to fix a passenger car?"
The best way to debug optimized code on UNIX depends on exactly which UNIX you have, what tools you have available, and what kind of problem you are trying to debug.
Debugging a crash in malloc is very different from debugging an unresolved symbol at runtime.
For general debugging techniques, I recommend this book.
Several things will make it easier to debug at the "assembly level":
You should know the calling
convention for your platform, so you
can tell what values are being passed
in and returned, where to find the
this pointer, which registers are "caller saved" and which are "callee saved", etc.
You should know your OS "calling convention" -- what a system call looks like, which register a syscall number goes into, the first parameter, etc.
You should
"master" the debugger: know how to
find threads, how to stop individual
threads, how to set a conditional
breakpoint on individual instruction, single-step, step into or skip over function calls,
etc.
It often helps to debug a working program and a broken program "in parallel". If version 1.1 works and version 1.2 doesn't, where do they diverge with respect to a particular API? Start both programs under debugger, set breakpoints on the same set of functions, run both programs and observe differences in which breakpoints are hit, and what parameters are passed.

Write small code samples by the same interfaces (something in its header), and call your samples instead of that optimized code, say simulation, to narrow down the code scope which you debug. Furthermore you are able to do error enjection in your samples.

Related

How can I know where function ends in memory(get the address)- c/c++

I'm looking for a simple way to find function ending in memory. I'm working on a project that will find problems on run time in other code, such as: code injection, viruses and so fourth. My program will run with the code that is going to be checked on run time, so that I will have access to memory. I don't have access to the source code itself. I would like to examine only specific functions from it. I need to know where functions start and end in stack. I'm working with windows 8.1 64 bit.
In general, you cannot find where the function is ending in memory, because the compiler could have optimized, inlined, cloned or removed that function, split it in different parts, etc. That function could be some system call mostly implemented in the kernel, or some function in an external shared library ("outside" of your program's executable)... For the C11 standard (see n1570) point of view, your question has no sense. That standard defines the semantics of the language, i.e. properties on the behavior of the produced program. See also explanations in this answer.
On some computers (Harvard architecture) the code would stay in a different memory, so there is no point in asking where that function starts or ends.
If you restrict your question to a particular C implementation (that is a specific compiler with particular optimization settings, for a specific operating system and instruction set architecture and ABI) you might (in some cases, not in all of them) be able to find the "end of a function" (but that won't be simple, and won't be failproof). For example, you could post-process the assembler code and/or the object file produced by the compiler, inspect the ELF executable and its symbol table, examine DWARF debug information, etc...
Your question smells a lot like some XY problem, so you should motivate it, whith a lot more explanation and context.
I need to know where functions start and end in stack.
Functions don't sit on the stack, but mostly in the code segment of your executable (or library). What is on the call stack is a sequence of call frames. The organization of the call frames is specific to your ABI. Some compiler options (e.g. -fomit-frame-pointer) would make difficult to explore the call stack (without access to the source code and help from the compiler).
I don't have access to the source code itself. I would like to examine only specific functions from it.
Your problem is still ill-defined, probably undecidable, much more complex than what you believe (since related to the halting problem), and there is considerable literature related to it (read about decompiler, static code analysis, anti-virus & malware analysis). I recommend spending several months or years learning more about compilers (start with the Dragon Book), linkers, instruction set architecture, ABIs. Then look into several proceedings of conferences related to ACM SIGPLAN etc. On a practical side, study the assembler code generated by compilers (e.g. use GCC with gcc -O2 -S -fverbose-asm....); the CppCon 2017 talk: Matt Godbolt “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid” is a nice introduction.
I'm working on a project that will find problems on run time in other code, such as: code injection, viruses and so fourth.
I hope you can dedicate several years of full time work to your ambitious project. It probably is much more difficult than what you thought, because optimizing compilers are much more complex than what you believe (and malware software uses various complex tricks to hide itself from inspection). Malware research is really difficult, but interesting.

What files need to be modified to compile for a custom architecture of an existing cpu with gcc?

I've been looking at examples of C code that is compiled for some lesser known processors (like ZPU) using the gcc cross compiler.
Most of the working examples I see assume a certain arquitecture (Memory map and set of peripherals) and simply give you a recipe to compile for these and they work.
However I can find very little information on what needs to modified if you use the same cpu with a different memory map and set of peripherals.
From what I've read. There are two main files that I need to make sure that are done "right". The linker script that is used and the crt0.o (Which if I need to modify means recompiling the crt0.S which is assembler). On this last one, especially I find very little information on what is actually supposed to do (other that setting up reset there is no clear info, and I'm talking conceptually not for an specific processor. Although something for this would also be useful).
Can any one tell me what is the relationship between a the c files for the code of program (bare metal development), the crt0.S (specially why it is needed) and it's relationship with a working linker script?
PD: Answers of the form "read this book" are welcome and I would love them.
PD: I realize this kind of question is usually vague and closed quickly but I don't know where else to turn, so I ask for a bit of leniency.

Getting type information of C symbols

Let me try to give some background first. I'm working on some project with some micro controller (AVR) which I'm accessing through some interface (UART). I'm doing direct writes to its global variables and I'm also able to directly execute functions (write args, trigger execution, read back return values).
AVR code is in C compiled with GCC toolchain. PC, that is communicating with it, is running python code. As of now I have imported adress & size information into python easily by parsing 'objdump -x' output. Now what would greatly boost my development would be information about types of the symbols (types & sizes of structs elements, enums values, functions arguments & return values, ...).
Somehow this seemed like a common thing that people do daily, and I was naively expecting ready-made python tools at start. Well, not so easy. By now I've spend many hours looking into various ways how to accomplish that.
One approach would be to just parse the C code (using e.g. pycparser). But seems like I would have to at least 'pre-parse' the code to exclude various unsupported constructs and various ordering problems and so on. Also, in theory, the problem would be if compiler would do some optimizations, like struct or enum reordering and so on.
I've been also looking into various gcc, gdb and objdump options to get such information. Have spent some time looking for tools for extracting information from various debugging formats (dwarf, stabs).
The closest I get so far is to dump stabs debugging information with objdump -g option. This outputs C-like information, which I would then parse using pycparser or on my own.
But before I spent my time doing that, I decided to raise a question here, strongly hoping that someone will hit me with possibly totally different approach I just haven't think of.
There's a quite nice tool called c2ph that dumps a parsable descripton of the types and sizes (using debug info as the source)
To answer myself... this is what I found:
http://code.google.com/p/pydevtools/
Actually I knew about it before, but it didn't really work for me at first.
So basically I made it Python 3 compatible and did few other fixes/changes also - here you can get it all:
http://code.google.com/p/pydevtools/source/checkout
Actually there is some more code which actually uses this module, but it is not finished yet. I will probably add it when finished.

How do i compile a c program without all the bloat?

I'm trying to learn x86. I thought this would be quite easy to start with - i'll just compile a very small program basically containing nothing and see what the compiler gives me. The problem is that it gives me a ton of bloat. (This program cannot be run in dos-mode and so on) 25KB file containing an empty main() calling one empty function.
How do I compile my code without all this bloat? (and why is it there in the first place?)
Executable formats contain a bit more than just the raw machine code for the CPU to execute. If you want that then the only option is (I think) a DOS .com file which essentially is just a bunch of code loaded into a page and then jumped into. Some software (e.g. Volkov commander) made clever use of that format to deliver quite much in very little executable code.
Anyway, the PE format which Windows uses contains a few things that are specially laid out:
A DOS stub saying "This program cannot be run in DOS mode" which is what you stumbled over
several sections containing things like program code, global variables, etc. that are each handled differently by the executable loader in the operating system
some other things, like import tables
You may not need some of those, but a compiler usually doesn't know you're trying to create a tiny executable. Usually nowadays the overhead is negligible.
There is an article out there that strives to create the tiniest possible PE file, though.
You might get better result by digging up older compilers. If you want binaries that are very bare to the bone COM files are really that, so if you get hold of an old compiler that has support for generating COM binaries instead of EXE you should be set. There is a long list of free compilers at http://www.thefreecountry.com/compilers/cpp.shtml, I assume that Borland's Turbo C would be a good starting point.
The bloated module could be the loader (operating system required interface) attached by linker. Try adding a module with only something like:
void foo(){}
and see the disassembly (I assume that's the format the compiler 'gives you'). Of course the details vary much from operating systems and compilers. There are so many!

Find unused functions in a C project by static analysis

I am trying to run static analysis on a C project to identify dead code i.e functions or code lines that are never ever called. I can build this project with Visual Studio .Net for Windows or using gcc for Linux. I have been trying to find some reasonable tool that can do this for me but so far I have not succeeded. I have read related questions on Stack Overflow i.e this and this and I have tried to use -Wunreachable-code with gcc but the output in gcc is not very helpful. It is of the following format
/home/adnan/my_socket.c: In function ‘my_sockNtoH32’:
/home/adnan/my_socket.c:666: warning: will never be executed
but when I look at line 666 in my_socket.c, it's actually inside another function that is being called from function my_sockNtoH32() and will not be executed for this specific instance but will be executed when called from some other functions.
What I need is to find the code which will never be executed. Can someone plz help with this?
PS: I can't convince management to buy a tool for this task, so please stick to free/open source tools.
If GCC isn't cutting it for you, try clang (or more accurately, its static analyzer). It (generally, your mileage may vary of course) has a much better static analysis than GCC (and produces much better output). It's used in Apple's Xcode but it's open-source and can be used seperately.
When GCC says "will never be executed", it means it. You may have a bug that, in fact, does make that dead code. For example, something like:
if (a = 42) {
// some code
} else {
// warning: unreachable code
}
Without seeing the code it's not possible to be specific, of course.
Note that if there is a macro at line 666, it's possible GCC refers to a part of that macro as well.
GCC will help you find dead code within a compilation. I'd be surprised if it can find dead code across multiple compilation units. A file-level declaration of a function or variable in a compilation unit means that some other compilation unit might reference it. So anything declared at the top level of a file, GCC can't eliminate, as it arguably only sees one compilation unit at a time.
The problem gets get harder. Imagine that compilation unit A declares function a, and compilation unit B has a function b that calls a. Is a dead? On the face of it, no. But in fact, it depends; if b is dead, and the only reference to a is in b, then a is dead, too. We get the same problem if b merely takes &a and puts it into an array X. Now to decide if a is dead, we need a points-to analysis across the entire system, to see if that pointer to a is used anywhere.
To get this kind of accurate "dead" information, you need a global view of the entire set of compilation units, and need to compute a points-to analysis, followed by the construction of a call-graph based on that points-to analysis. Function a is dead only if the call graph (as a tree,
with main as the root) doesn't reference it somewhere.
(Some caveats are necessary: whatever the analysis is, as a practical matter it must be conservative, so even a full-points to analysis may not identify a function correctly as dead. You also have to worry about uses of a C artifact from outside the set of C functions, e.g., a call to a from some bit of assembler code).
Threading makes this worse; each thread has some root function which is probably at the top of the call DAG. Since how a thread gets started isn't defined by C compilers, it should be clear that to determine if a multithreaded C application has dead code, somehow the analysis has to be told the thread root functions, or be told how to discover them by looking for thread-initialization primitives.
You aren't getting a lot responses on how to get a correct answer. While it isn't open source, our DMS Software Reengineering Toolkit with its C Front End has all the machinery to do this, including C parsers, control- and dataflow- analysis, local and global points-to analysis, and global call graph construction. DMS is easily customized to include extra information such as external calls from assembler, and/or a list of thread roots or specific source-patterns that are thread initialization calls, and we've actually done that (easily) for some large embedded engine controllers with millions of lines of code. DMS has been applied to systems as large as 26 million lines of code (some 18,000 compilation units) for the purpose of building such calls graphs.
[An interesting aside: in processing individual comilation units, DMS for scaling reasons in effect deletes symbols and related code that aren't used in that compilation unit. Remarkably, this gets rid of about 95% of code by volume when you take into account the iceberg usually hiding in the include file nest. It says C software typically has poorly factored include files. I suspect you all know that already.]
Tools like GCC will remove dead code while compiling. That's helpful, but the dead code is still lying around in your compilation unit source code using up developer's attention (they have to figure out if it is dead, too!). DMS in its program transformation mode can be configured, modulo some preprocessor issues, to actually remove that dead code from the source. On very large software systems, you don't really want to do this by hand.

Resources