Read instructions being executed - c

As the title suggests, is there any way to read the machine code instructions as/after they have been executed? For example, if I had an arbitrary block of C code and I wanted to know what instructions were compiled and executed when that block was entered then would there be a way to do that? Thank you in advance for any pointers on the subject.
Edit: Some motivation as to what I'm trying to do: I want to have a program that roughly figures out how it has been compiled or what instructions it is currently running without actually needing to know how the machine code is made. I.e. I want to use the hard work that some compiler previously did in compiling a program so that I can copy and later use the machine code being executed.

Little-known fact: GDB has a curses interface built in.
Start it with gdbtui (or gdb -tui), or press Ctrl+X, Ctrl+A inside a running gdb session; then press Ctrl+X, 2 to show assembly and source together. The current instruction is highlighted, and you can navigate using the arrow keys.

Almost every debugger can do this.
For gdb, a useful trick to remember is: display/i $pc
Do that once, then set a breakpoint on a function and step through the function with stepi and nexti.
The instruction at the PC will be automatically displayed each time.
Ross-Harveys-MacBook-Pro:so ross$ cat > deb.c
int main(void) { return (long)main + 0x123; }
Ross-Harveys-MacBook-Pro:so ross$ cc -O deb.c
Ross-Harveys-MacBook-Pro:so ross$ gdb -q a.out
Reading symbols for shared libraries .. done
(gdb) break main
Breakpoint 1 at 0x100000f30
(gdb) display/i $pc
(gdb) r
Starting program: /Users/ross/so/a.out
Reading symbols for shared libraries +. done
Breakpoint 1, 0x0000000100000f30 in main ()
1: x/i $pc 0x100000f30 <main+4>: lea -0xb(%rip),%rax # 0x100000f2c <main>
(gdb) stepi
0x0000000100000f37 in main ()
1: x/i $pc 0x100000f37 <main+11>: add $0x123,%eax
(gdb) stepi
0x0000000100000f3c in main ()
1: x/i $pc 0x100000f3c <main+16>: leaveq

I can't tell if you're asking about doing this at runtime, or if you want to see a textfile containing the assembly code of your compiled C code.
If the former, just use a debugger (use disassemble in gdb with gcc, or the integrated debugger in Microsoft Visual Studio).
If the latter, you'll have to look up the specific commands for your compiler. With Visual Studio, for example, just use the flag /FAs; this will output the assembly code with your source code. For gcc:
gcc -c -g -Wa,-a,-ad foo.c > foo.lst

Most debuggers have options to view the disassembly of the code you are executing.
Ex: in gdb use the disassemble command.

If you want to know which execution path a particular function took, some processors may have a feature for that, but generally they do not. What you can do is run the program in an emulator and modify the emulator to print the addresses of the instruction fetches, the reads, or whatever else you need.
If this is just a disassembly question and you are using the gcc/binutils tools, run objdump -D filename > out.list; there is no need to execute the program or use a debugger.

Related

GDB doesn't step into a function when use command "s"

I have a problem with GDB. When I use "s" to step into a function called from the main function, GDB jumps to another function without showing me the function that I need.
To be clear, I use step here:
In file main.c:
short c = get(a, b);
Now get is 36 lines of code, and at line 27 it calls another function, "swap", here:
In file get.s:
call _swap;
When I use step (s) in GDB on "get", it skips the whole get function and shows me the _swap function.
These are three different files, main.c, get.s, and swap.c, compiled this way:
gcc -g -m32 main.c swap.c get.s -o IA-main
-m32 because get.s is IA-32 assembly. Why does it skip the "get" function and show me only "_swap"?
I work on Mac OS X v10.12.6 (Sierra), so GDB is a little annoying.
From Continuing and Stepping (emphasis mine)
step
Continue running your program until control reaches a different source line, then stop it and return control to GDB. This command is abbreviated s.
Warning: If you use the step command while control is within a function that was compiled without debugging information, execution proceeds until control reaches a function that does have debugging information. Likewise, it will not step into a function which is compiled without debugging information. To step through functions without debugging information, use the stepi command, described below.
You can use the stepi command instead:
stepi
stepi arg
si
Execute one machine instruction, then stop and return to the debugger.
It is often useful to do ‘display/i $pc’ when stepping by machine instructions. This makes GDB automatically display the next instruction to be executed, each time your program stops. See Automatic Display.
An argument is a repeat count, as in step.

GDB doesn't recognize some C functions

So I'm new to Linux and just got Ubuntu 16.04.2 running on a VM. I've installed gcc/g++ on here in the terminal, but when I run my program in GDB, as soon as I get to a strcmp function, this pops up for many lines.
strcmp_sse2_unaligned () at ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S:24
24 ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S: No such file or directory.
And when I go further down:
strlen () at ../sysdeps/x86_64/strlen.S:66
66 ../sysdeps/x86_64/strlen.S: No such file or directory.
So I'm guessing it just doesn't recognize my C library..
I realize I can step through this after a couple of tries, but this comes up for all my c functions and when I use GDB on my school server, I don't run into this issue. Any help would be appreciated.
"I get to a strcmp function, this pops up for many lines."
When you use s (step) or si (step one instruction), what you see for string and memory functions such as strcmp, memcpy, memcmp, and strlen is correct, and GDB does recognize your C library (Ubuntu 16.04.2 amd64 installed from the ISO in a VM already has the libc6-dbg debugging package preinstalled for your libc, the C library).
strcmp_sse2_unaligned () at ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S:24
24 ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S: No such file or directory.
strlen () at ../sysdeps/x86_64/strlen.S:66
66 ../sysdeps/x86_64/strlen.S: No such file or directory.
What we see here is that GDB was able to find debugging information for both strcmp and strlen and obtain line numbers, but these standard C library functions are not C functions! They are assembler functions (one is optimized with SSE2), as we can see from the .S suffix of their source references. You can try several s or si commands after entering them to see the source line numbers increase.
"it just doesn't recognize"
GDB did all it could: it found the debugging info for your system C library (not trivial, as the debug info is split into a separate file with a different name somewhere under /usr/lib/debug/lib/x86_64-linux-gnu/) and worked out which instruction comes from which source line. What it cannot do is open the source file, because it is part of neither the preinstalled Ubuntu image nor any Ubuntu (Debian) binary package.
What you can do if you want to look inside this system library function:
1) Check the disassembly of the function with the GDB command disassemble (by default it prints the current function). It will be very close to the source of the function, since that was originally written in assembler; what you lose are the comments and the macro structure:
Dump of assembler code for function strlen:
0x000address70 <+0>: pxor %xmm0, %xmm0
=> 0x000address74 <+4>: pxor %xmm1, %xmm1
0x000address78 <+8>: pxor %xmm2, %xmm2
0x000address7c <+12>: pxor %xmm3, %xmm3
...
2) Or you can see instructions as they are executed with the display command, such as display/i $pc or disp/2i $pc (the first prints the one instruction at the current PC, where $pc is the architecture-independent name for EIP or RIP; the second prints two instructions, the current one and the next).
3) Or you can create the path gdb expects and copy the original source into it: mkdir -p ../sysdeps/x86_64/ and save the assembler source for your library version in that directory. The glibc-2.23 version of strlen.S is available here (GitHub mirror of the authors' Git repository): https://github.com/bminor/glibc/blob/glibc-2.23/sysdeps/x86_64/strlen.S#L66
4) Or you can download the Ubuntu source for libc with apt source libc (into some stable path such as ~/src, after mkdir ~/src) and point gdb at it with directory ~/src/glibc-2.23/sysdeps (adding a real subdirectory to account for the ../ relative part of the libc build path in Ubuntu).
"this comes up for all my c functions"
No, for your own C functions you get a different kind of output (not ... something.S: No such file or directory). And you should enable debugging symbols when you build your program by adding the -g argument to gcc (or whichever compiler you use).

combination of objdump and gdb

This is about reverse engineering on Linux: if I have a .c file and compile it myself, everything works fine in gdb. But how can I get the same result starting from an executable file?
I tried objdump -M intel -D file to disassemble it, but then I would like to assemble it again so that I can open it with gdb (if I open the executable directly with gdb, I can't do things like set breakpoints and view registers). I tried nasm and gcc, but both reported syntax errors.
If the symbol table has been stripped, you cannot get it back.
Anyway, you can set breakpoints in GDB on a specific code address with:
break *address
If you have a hex address, you must precede it with 0x, e.g.:
break *0x400506
And to print the current register values, you can use info registers, as also answered in "How to print register values in gdb?":
info registers
NASM and the GNU assembler use different syntax, which is why you cannot easily disassemble with the tools of one and reassemble with the other. NASM uses a variant of the Intel syntax. The GNU assembler prefers AT&T syntax.

lldb not stopping on my breakpoint

I have built the Clang program from sources with full debugging information (the default build type for Clang IIUC). I check that debug information is available in the executable by noting that there are compile units in the module:
$ lldb /opt/bin/clang++
(lldb) script lldb.target.module['/opt/bin/clang++'].GetNumCompileUnits()
1341
I have instrumented a file in the Clang source tree, lib/Sema/SemaExpr.cpp with a printf statement in the Sema::DiagnoseAssignmentResult method (which is at line 10853 in my copy). I know this method gets called on my test file test.cc, but I can't get the debugger to stop on breakpoints for this method! I have tried setting the breakpoints two ways,
$ lldb /opt/bin/clang++
(lldb) breakpoint set -m DiagnoseAssignmentResult
Breakpoint 2: where = clang++`clang::Sema::DiagnoseAssignmentResult(clang::Sema::AssignConvertType, clang::SourceLocation, clang::QualType, clang::QualType, clang::Expr*, clang::Sema::AssignmentAction, bool*) + 87 at SemaExpr.cpp:10858, address = 0x0000000100ab9947
(lldb) process launch -- ./test.cc
<< message from my printf statement >>
... then clang++ runs to completion and exits, no breakpoint hit ...
(lldb)
I note that lldb did find the correct place in the source code, but didn't stop when it passed through the method. I also tried setting the breakpoint by specifying the file and line number,
(lldb) breakpoint set -f SemaExpr.cpp -l 10853
Breakpoint 3: where = clang++`clang::Sema::DiagnoseAssignmentResult(clang::Sema::AssignConvertType, clang::SourceLocation, clang::QualType, clang::QualType, clang::Expr*, clang::Sema::AssignmentAction, bool*) + 87 at SemaExpr.cpp:10858, address = 0x0000000100ab9947
Again it "worked", but does not stop. Am I doing something fundamentally wrong here? How can I get the breakpoint to trigger?
You are debugging the clang driver, which is not what actually does the parsing. Instead, clang spawns off another process to do the compilation, then ld if linking is needed, etc. lldb wasn't stopping at your breakpoints because that code was actually getting run by a child process. The confusing bit here is that clang actually uses the same binary for the driver and the parser, so the breakpoints took, just not in the version of clang that was going to invoke that code.
The way to debug the compilation part of clang is first to run it like this:
$ clang++ -### <all your other arguments>
Note the weird -### argument. That tells clang not to do the compilation but to emit the command line that it will run to do the compilation. It will look something like:
"/usr/bin/clang" "-cc1" ...
So that is the command line that you want to use in lldb to debug clang as a compiler rather than clang as a compiler driver...

How to run command line version of lc3 with an input .asm program and analyze it using gdb?

I am a CS student learning how to program in C.
LC3 is a fake assembly language for teaching purposes.
computer-name> gdb mysim -norun testfde.obj
This yields a problem: the command is not recognized.
mysim is the C executable, testfde.obj is the assembled LC3 program, and -norun makes mysim run in command-line mode.
I want to run mysim -norun with testfde.obj and analyze it using gdb, how would I do this?
gdb --args mysim -norun testfde.obj
(gdb) run
Alternatively:
gdb mysim
(gdb) run -norun testfde.obj
