Following, and saving, the flow of code

I was wondering if there is any way of compiling a program (my own program, or an open source program), with which I can follow the flow of that program when I execute it. Ideally, I would like to output the specific methods which the program goes through when it executes. Each time it calls a specific method, I would like to output that it has done so, which I would like to save to a file for later analysis.
For example, I am trying to better understand the flow within KVM (an open source hypervisor), but there are obviously many lines of code, and it would be impossible for me to know where the code goes unless I dedicated possibly weeks to finding out.
The code I am looking at is written mostly in C, but also uses other languages. Any ideas please?

KVM is a subsystem of the Linux kernel, so you should use ftrace (http://lwn.net/Articles/322666/) for tracing kernel-space code.
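
As a rough illustration of what driving ftrace looks like, the sketch below writes to the tracer's control files from C. It assumes tracefs is mounted at /sys/kernel/tracing (older kernels expose it at /sys/kernel/debug/tracing), that the program runs as root, and that filtering on functions named kvm_* is a useful starting point. The helper names tracefs_write and start_kvm_trace are made up for this example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write `value` to the control file `name` inside the tracefs directory
 * `dir`. Returns 0 on success, -1 on any failure. */
int tracefs_write(const char *dir, const char *name, const char *value)
{
    char path[512];
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    int fd = open(path, O_WRONLY | O_TRUNC);
    if (fd < 0)
        return -1;

    size_t len = strlen(value);
    ssize_t written = write(fd, value, len);
    close(fd);
    return written == (ssize_t)len ? 0 : -1;
}

/* Enable the "function" tracer, limited to functions matching kvm_*
 * (the filter is an assumption; adjust it to the code of interest). */
int start_kvm_trace(const char *dir)
{
    if (tracefs_write(dir, "tracing_on", "0") != 0)
        return -1;
    if (tracefs_write(dir, "set_ftrace_filter", "kvm_*") != 0)
        return -1;
    if (tracefs_write(dir, "current_tracer", "function") != 0)
        return -1;
    return tracefs_write(dir, "tracing_on", "1");
}
```

After calling start_kvm_trace("/sys/kernel/tracing") and exercising KVM, the recorded calls can be read from the trace file in the same directory and saved for later analysis.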

Related

ptrace usage in ARM

I'm trying to analyse how some third-party software is controlling some hardware. The board is i.mx7 based, running i.MX Linux kernel 3.14.52.
The board is a development board and is running some demo software which I do not have the code for. Most of the configuration is done via ioctl calls, and I've been trying to use strace to learn more about what information is being set and read back.
As an example I get the following from strace:
ioctl(4, FBIOPUT_VSCREENINFO, 0x19fcebc)
I would like in some way to dereference the pointer in the third argument to see the data. I know the structure of the data, and from what I've read, if strace doesn't already know your structure, then you have no luck.
I have also read about writing my own code using ptrace to do something similar, but none of the examples I have found are for ARM. The code I've seen uses the ORIG_EAX and EAX registers; I believe ARM uses orig_r0 and r7 instead, but I have no idea how to access the "registers". I'm quite new to programming for Linux.
What's surprising is that I can't easily find anything, even after exhaustive Googling. Some threads allude to it, but I cannot find specifics. I can't be the only person who needs to use ptrace on ARM? Admittedly, I might be the only person trying to use it who doesn't know how!
Failing that, I would be happy just peeking at the memory where strace indicates the structures are, so that I could rebuild them manually. How would I go about this?
Thanks for any hints or pointers, I'm really hitting a brick wall.
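
For the fallback of just peeking at the memory, one option on Linux is to read /proc/<pid>/mem at the address strace printed. The following is a minimal sketch, not ARM-specific, and it assumes the caller is allowed to inspect the target (typically root, or the same user where ptrace restrictions permit). The function name peek_process_memory is invented for this example.

```c
#define _FILE_OFFSET_BITS 64   /* so high addresses fit in off_t on 32-bit targets */
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy `len` bytes starting at address `addr` in process `pid` into `buf`.
 * Returns the number of bytes read, or -1 on error. */
ssize_t peek_process_memory(pid_t pid, unsigned long addr, void *buf, size_t len)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/mem", (long)pid);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset is interpreted as a virtual address in the target. */
    ssize_t n = pread(fd, buf, len, (off_t)addr);
    close(fd);
    return n;
}
```

In the case above the call would look like peek_process_memory(pid, 0x19fcebc, &info, sizeof info), with info declared as struct fb_var_screeninfo from <linux/fb.h>. Note that this reads the memory as it is now, which may differ from what the ioctl saw if the program has since reused the buffer.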

How can I inject or dynamically load a C function into another C program

I want to build an interface in a C program which is running on an embedded system. It should accept some bytecode that represents a C function. This code will then be loaded into memory and executed, which is something like remotely injecting code into a running app. The only difference here is that I can implement or change the running code and provide an interface.
The whole thing should be used to inject test code on a target system.
My current problem is that I do not know how to build such bytecode out of an existing C function. Mapping and executing it would be no problem if I knew the start address of the function.
Currently I am working with Ubuntu for testing purposes, which allows me to try some techniques that are not possible on the embedded system (because of missing operating system libraries).
I built a shared object and used dlopen() and dlsym() to run the function. This works fine; the problem is just that I do not have these functions on the embedded system. I read something about loading a shared object into memory and running it, but I could not find examples of that. (see http://www.nologin.org/Downloads/Papers/remote-library-injection.pdf)
I also took some simple bytecode that just prints hello world to stdout. I stored this code in memory using mmap() and executed it. This also worked fine. Here the problem is that I don't know how to create such bytecode; I just used a hello world example from the internet. (see https://www.daniweb.com/programming/software-development/threads/353077/store-binary-code-in-memory-then-execute-it)
I also found something here: https://stackoverflow.com/a/12139145/2479996 which worked very well. But there I need an additional linker script, even for such a simple program.
Further I looked at this post: https://stackoverflow.com/a/9016439/2479996
According to that answer my problem would be solved with the "X11 project".
But I did not really find much about that, maybe some of you can provide me a link.
Is there another solution to do that? Did I miss something? Or can someone provide me another solution to this?
I hope I did not miss something.
Thanks in advance

I see no easy solution. The closest that I am aware of is GCC's JIT backend (libgccjit). Here is a blog post about it.
As an alternative, you could use a scripting language for the code that needs to be injected, for instance ChaiScript or Lua. In this question, there is a summary of options. As you are on an embedded device, the overhead might be significant, though.
If using an LLVM-based backend instead of GCC is possible, you can have a look at Cling. It is a C++ interpreter based on LLVM and Clang. In my personal experience, it was not always stable, but it is used in production at CERN. I would expect that the dynamic compilation features are more advanced in LLVM than in GCC.

Replicating execve in c (Linux)?

I am doing a project to learn how a program is executed in Linux. Basically, I am trying to replicate the functionality of execve by running a series of system calls in a C program to take an executable binary, load it into memory, and successfully run it.
Are there any relatively easy-to-understand online resources (or tips) I can use to learn how to do this? I don't have much experience with this, and I'm trying to learn. It seems like a fairly complicated task, and I'm completely stuck at the moment.
Thank you.

Your main problem here is that part of the exec system call is overwriting the process descriptor in the kernel. That's something you can't do in userspace.
Even if you close all file descriptors, there are still plenty of other values you can't reach, nor can you free dynamically loaded libraries or release your own program's code pages (since they would be write protected).
The basic approach to loading and running a code file would be to mmap it into memory, then clear the stack, parse the ELF headers and jump to the program start function (with an assembly jmp instruction, mind you). But there's much more to an ELF file, so it might not work without other initialization and dynamic linkage...
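
The "parse the ELF headers" step can be sketched like this. The example only reads the file header and counts the PT_LOAD program headers (the segments a loader would have to mmap); it does not load or run anything. It assumes a Linux system with <elf.h> and <link.h>, and an executable of the same word size as the program itself. The function name elf_summary is made up.

```c
#include <elf.h>
#include <link.h>   /* ElfW(): picks Elf32_* or Elf64_* for this platform */
#include <stdio.h>
#include <string.h>

/* Read the ELF header of `path`, store its entry point in *entry and the
 * number of PT_LOAD segments in *load_segments.
 * Returns 0 on success, -1 on error or if the file is not a native ELF. */
int elf_summary(const char *path, unsigned long *entry, int *load_segments)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    ElfW(Ehdr) eh;
    unsigned char native = sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != native) {
        fclose(f);
        return -1;
    }

    *entry = (unsigned long)eh.e_entry;
    *load_segments = 0;

    /* Walk the program header table; each PT_LOAD entry describes one
     * region (offset, address, size, permissions) to map into memory. */
    for (int i = 0; i < eh.e_phnum; i++) {
        ElfW(Phdr) ph;
        long off = (long)(eh.e_phoff + (ElfW(Off))i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
            fclose(f);
            return -1;
        }
        if (ph.p_type == PT_LOAD)
            (*load_segments)++;
    }

    fclose(f);
    return 0;
}
```

A real loader would go on to map each PT_LOAD segment at its p_vaddr with the permissions from p_flags, and for dynamically linked programs it would also have to start the interpreter named in PT_INTERP.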

File in both KLM and user space

I remember reading about this concept somewhere. I do not remember where, though.
I have a file, say file.c, which I compile together with some other files into a library for use by applications.
Now suppose I compile the same file and build it into a kernel module. The same file object is then in both user space and kernel space, and it allows me to access kernel data structures without invoking a system call. I mean I can have APIs in the library by which applications can access kernel data structures without system calls. I am not sure if I can write anything into the kernel (which I think is impossible in this manner), but reading some data structures from the kernel this way would be fine?
Can anyone give me more details about this approach? I could not find anything on Google regarding this.

I believe this is a conceptually flawed approach, unless I misunderstand what you're talking about.
If I understand you correctly, you want to take the same file and compile it twice: once as a module and once as a userspace program. Then you want to run both of them, so that they can share memory.
So, the obvious problem with that is that even though the programs come from the same source code, they would still exist as separate executables. The module won't be its own process: it would only get invoked when the kernel gets going (i.e. on system calls). So by itself, it doesn't let you escape the system call nonsense.
A better solution depends on what your goal is: (1) do you simply want to access kernel data structures because you need something that you can't normally get at? Or (2) are you concerned about performance and want to access these structures faster than a system call allows?
For (1), you can create a character device or a procfs file. Both of these allow your userspace programs to reach their dirty little fingers into the kernel.
For (2), you are in a tough spot, and the problem gets a lot nastier (and more interesting). How to solve the speed issue depends a lot on exactly what data you're trying to extract.
Does this help?

There are two ways to do this, the most common being what's called a Character Device, and the other being a Block Device (i.e. something "disk-like").
Here's a guide on how to create drivers that register chardevs.
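
From the userspace side, a procfs file or a character device is read with ordinary file I/O. The small sketch below reads the first line of such a file; the path you pass would be whatever your module registers (for example a hypothetical /proc/mymodule), and read_first_line is a name invented here. Note that each read is still a system call underneath.

```c
#include <stdio.h>

/* Read the first line of `path` (at most size-1 bytes) into `buf`.
 * Returns 0 on success, -1 if the file cannot be opened or is empty. */
int read_first_line(const char *path, char *buf, size_t size)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    char *ok = fgets(buf, (int)size, f);
    fclose(f);
    return ok ? 0 : -1;
}
```

As a quick check on an existing kernel-provided file, read_first_line("/proc/version", buf, sizeof buf) fills buf with the kernel version string.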

listing all calls to my library

I'm building a shared library in C, which other programs use. Sometimes, these other programs crash because of some error in my shared library. While reproducing these sorts of bugs, it is very useful for me to know which functions of my library are being called, with what arguments and in what order. Of course I can add printf() calls to all my functions, or add breakpoints to all of them, but I figure there just has to be a better way to determine this.
Edit: since I'm doing this on OSX, dtrace and the related script dapptrace seem promising. However, after digging through some documentation I'm still a bit lost.
Say, my library is /path/to/libmystuff.so and I've got a program test which links to this library. Using dtrace, how would I bring up a list of all the function calls that reside in libmystuff.so?

You could use ltrace for that purpose if you work on a Linux system. The original poster shows, in the comments below, a solution that works on Mac OS X using dtrace.

I am assuming that you are working on Unix.
Use gdb for debugging purposes. If your program has crashed, you can use the generated core file to look at the stack trace, which will give you all the information that you have asked for.
For more information on checking the stack trace using gdb with the core file, see here.

You can also log the function calls to a file with all details, such as function name and arguments.
(Logging is usually helpful in client-server applications, but I am not sure about your application.)
This way you can trace all calls. You can also enable logging in debugging mode only. I hope this reply will be useful to you.
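
A minimal sketch of this kind of call logging, switchable at build time, could look as follows. All names here (trace_set_output, trace_call, the TRACE macro, the MYSTUFF_TRACE build flag and the mystuff_add function) are hypothetical and only serve the illustration.

```c
#include <stdarg.h>
#include <stdio.h>

static FILE *trace_out;   /* NULL means logging is switched off */

/* Choose where call records go, e.g. a file opened with fopen(). */
void trace_set_output(FILE *f)
{
    trace_out = f;
}

/* Write one record of the form "name(args)\n" and flush it immediately,
 * so the log is complete even if the program crashes right afterwards. */
void trace_call(const char *func, const char *fmt, ...)
{
    if (!trace_out)
        return;

    va_list ap;
    fprintf(trace_out, "%s(", func);
    va_start(ap, fmt);
    vfprintf(trace_out, fmt, ap);
    va_end(ap);
    fputs(")\n", trace_out);
    fflush(trace_out);
}

/* Compiled in only when the library is built with -DMYSTUFF_TRACE. */
#ifdef MYSTUFF_TRACE
#define TRACE(...) trace_call(__func__, __VA_ARGS__)
#else
#define TRACE(...) ((void)0)
#endif

/* Example library function: the first statement records the call. */
int mystuff_add(int a, int b)
{
    TRACE("a=%d, b=%d", a, b);
    return a + b;
}
```

With tracing compiled in and an output file set, a call such as mystuff_add(2, 3) appends the line mystuff_add(a=2, b=3) to the log, so the order of calls and their arguments can be read back after a crash.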
