how to read all parameters from a function - ebpf - c

So I have these macros
#define PT_REGS_PARM1(x) ((x)->di)
#define PT_REGS_PARM2(x) ((x)->si)
#define PT_REGS_PARM3(x) ((x)->dx)
#define PT_REGS_PARM4(x) ((x)->cx)
#define PT_REGS_PARM5(x) ((x)->r8)
#define PT_REGS_RET(x) ((x)->sp)
#define PT_REGS_FP(x) ((x)->bp)
#define PT_REGS_RC(x) ((x)->ax)
#define PT_REGS_SP(x) ((x)->sp)
#define PT_REGS_IP(x) ((x)->ip)
But the above does not say how to get a specific parameter from a function, say __sys_write.
Consider sys_write as
long sys_write(unsigned int fd, const char __user *buf,
size_t count);
So I need the buffer. I have been trying different macros, but I am not really sure which one gives me what.
Can anyone please clarify it?
Also, if I read the buffer, I need count too, so that my eBPF program gets loaded and does not fail with an out-of-bounds access error. Can anyone tell me how?

Use the PT_REGS_PARM*(x) macros
PARM in PT_REGS_PARM1(x) stands for “parameter”. These macros give you access to the parameters of the function that your kprobe or tracepoint is attached to. So for example, PT_REGS_PARM1(ctx), where ctx is the struct pt_regs *ctx context passed as an argument to your eBPF program, will give you access to the first parameter, which is the file descriptor fd. Similarly, PT_REGS_PARM3(ctx) will give you the count, as you can confirm by looking at this kernel sample (write_size).
... But use bpf_probe_read_*() to stay safe with kernel memory
Similarly, you can point to the buffer buf with PT_REGS_PARM2(ctx). However, this one is a pointer; if you want to manipulate the data contained in this buffer, you need another step, or the kernel may reject your program as unsafe. To read and copy some or all of the data from this buffer, you should use one of the eBPF helpers bpf_probe_read_*(void *dst, u32 size, const void *unsafe_ptr) (see relevant documentation). In your case, the data contained in that buffer comes from user space, so you want bpf_probe_read_user().
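To make the register-to-argument mapping concrete, here is a minimal user-space sketch. It is not an eBPF program: struct mock_pt_regs, MAX_COPY and capture_write are hypothetical names invented for illustration, and memcpy stands in for bpf_probe_read_user(). It only shows which macro yields which argument of write(), and why the copy length should be clamped to the destination size before reading the buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the x86-64 struct pt_regs (only three fields). */
struct mock_pt_regs {
    uintptr_t di, si, dx;
};

#define PT_REGS_PARM1(x) ((x)->di)   /* fd    */
#define PT_REGS_PARM2(x) ((x)->si)   /* buf   */
#define PT_REGS_PARM3(x) ((x)->dx)   /* count */

#define MAX_COPY 64                  /* assumed size of the local buffer */

/* Copy at most MAX_COPY bytes of the write() buffer; return bytes copied. */
static size_t capture_write(const struct mock_pt_regs *ctx, char dst[MAX_COPY])
{
    const char *buf = (const char *)PT_REGS_PARM2(ctx);
    size_t count = (size_t)PT_REGS_PARM3(ctx);

    /* Clamp so the copy can never run past dst. */
    size_t len = count < MAX_COPY ? count : MAX_COPY;

    /* In a real eBPF program this would be bpf_probe_read_user(dst, len, buf). */
    memcpy(dst, buf, len);
    return len;
}
```

In an actual eBPF program the same clamp is what lets the verifier see a bounded size for the helper call.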
Notes on CO-RE
This does not really apply to your example, because your pointer is just a buffer. But if one of your arguments were a pointer to a struct, you would need similar precautions to dereference it and access its fields.
And in such case you might want to leverage CO-RE, to make sure that you would access the correct offsets when reading the fields. If you have CO-RE support, libbpf also provides bpf_core_read*() wrappers around the eBPF helpers, which make access relocatable. See the BPF CO-RE reference guide for more information.
Also with CO-RE (technically, just BTF this time), certain types for tracing programs, in particular BPF_PROG_TYPE_TRACING, allow you to access struct fields without any helper (See the initial CO-RE article).

Related

Qemu plugin functions - how to access guest memory and registers

Background
Qemu version 4.2.0, released Dec '19, included a new functionality for something called TCG Plugins. They have a few examples in the tests/plugins directory, and the API is more or less defined in qemu-plugin.h.
This file defines two enumerated types, qemu_plugin_cb_flags and qemu_plugin_mem_rw, which are passed into functions that register callbacks. These enums seem to indicate whether the callbacks will read or write CPU registers or memory. However, all of the example plugins use QEMU_PLUGIN_CB_NO_REGS, and only 2 of the plugins use the memory access enum. hotpages.c and mem.c use QEMU_PLUGIN_MEM_RW as the default for registering a memory callback (qemu_plugin_register_vcpu_mem_cb). mem.c has an argument when the plugin is loaded to choose if it's read or write, however, it doesn't seem to make any difference in the callback function.
Question
My question is, how do I access the guest memory and registers from the plugin callback function? The API seems to indicate that it is possible, since the callback registering requires you to say if you will access them, and if it's RW or just read.
Are there any examples of using this part of the API? I realize this is a very new part of Qemu functionality.
Code
When you register a callback on an instruction, like in insn.c, you can get the virtual address of the instruction.
uint64_t insn_vaddr = qemu_plugin_insn_vaddr(insn);
I am running a baremetal ARM program, and this virtual address seems to correlate to the address of the instruction in the ELF file.
Inside memory callback functions, you can call qemu_plugin_get_hwaddr to get the hardware address of the memory access, but I'm not sure exactly what that struct represents.
Related
Write to QEMU guest system registers & memory?
This answer is 7 years old, and suggests using the GDB interface. My question is specifically related to using the TCG plugin functionality.
I just got to that exact same problem. It seems like the qemu team really tried to prevent us from using CPU or memory stuff from a plugin. I tried including the headers I needed by modifying the Makefiles, but those headers aren't supposed to be included from external code like plugins. I couldn't manage to make it compile.
As you said, there are flags that suggest that this is possible. My guess is that the feature wasn't implemented completely, maybe this will be possible soon enough.
In the meantime, as we wait for a proper method to do this, here's how I hacked it:
Getting a register
In my case, the CPU is an ARM. I'll show the code first and then explain.
void *qemu_get_cpu(int index);
static uint32_t get_cpu_register(unsigned int cpu_index, unsigned int reg) {
uint8_t* cpu = qemu_get_cpu(cpu_index);
return *(uint32_t*)(cpu + 33488 + 5424 + reg * 4);
}
I first declare the qemu_get_cpu function, since we can't include its header. That function returns a CPUState*. Since my CPU is an ARM, I know that pointer is actually an ARMCPU*. As inheritance is implemented in qemu, a cast from that CPUState* to ARMCPU* is a no-op, so nothing to do there.
Then, looking in target/arm/cpu.h, we can see that struct:
struct ARMCPU {
/*< private >*/
CPUState parent_obj;
/*< public >*/
CPUNegativeOffsetState neg;
CPUARMState env;
// ...
I used this compiler trick to get the size of CPUState and CPUNegativeOffsetState, which are in my case 33488 and 5424, respectively. This gives us the offset of the CPUARMState, which starts as follows:
typedef struct CPUARMState {
/* Regs for current mode. */
uint32_t regs[16];
// ...
So the registers are just at the beginning, that's why I use reg * 4.
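The offset arithmetic above can be sketched with offsetof instead of hard-coded sizes. The Mock* types below are hypothetical stand-ins with made-up sizes, not the real QEMU structures; in a plugin the real headers are not available, which is why the answer falls back to magic numbers. The sketch only illustrates the layout reasoning (parent object, then neg, then env with regs at its start).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the real structures are far larger. */
typedef struct { char opaque[128]; } MockCPUState;
typedef struct { char opaque[32]; } MockNegState;
typedef struct { uint32_t regs[16]; } MockArmEnv;

typedef struct {
    MockCPUState parent_obj;   /* base object comes first, so the cast is a no-op */
    MockNegState neg;
    MockArmEnv env;
} MockArmCpu;

/* Read register `reg` through an opaque pointer, as the plugin hack does. */
static uint32_t read_reg(const void *cpu, unsigned int reg)
{
    const uint8_t *base = cpu;
    size_t off = offsetof(MockArmCpu, env)
               + offsetof(MockArmEnv, regs)
               + reg * sizeof(uint32_t);
    uint32_t value;

    memcpy(&value, base + off, sizeof value);   /* avoids an aliasing cast */
    return value;
}
```

Note that offsetof accounts for any alignment padding between members, whereas summing sizeof values only works when no padding is inserted.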
Now that we can read our register, the next step is...
Reading from memory
This one is easier, I got it from gdbstub.c in qemu itself:
void cpu_physical_memory_rw(uint64_t addr, uint8_t *buf,
uint64_t len, int is_write);
// and in my function:
char name[9] = {0};
cpu_physical_memory_rw(name_addr, name, 8, 0);
We just declare the function we need and call it. It seems that the function never fails; reading from unmapped memory does nothing.
I just found out, that in user-mode, you can just dereference vaddr to get the value from memory.
char *val = (char*) vaddr;
Additionally, I found the following method in cpu.c which can also be used, when gdbstub is not available.
int cpu_memory_rw_debug(void *cpu, uint64_t addr,
void *ptr, uint64_t len, bool is_write);
Nevertheless, it is a workaround and I am hoping for more features of the TCG plugins.

Calling function from ARM microcontroller's ROM by using pointer to this function?

I am working with the CEC1702 (a Cortex-M4 from Microchip). This device has a cryptographic engine, but it is accessible only through Microchip's API. I don't want to use the official API, because I am trying to use a minimal setup with only gcc; for the API I would need to download the whole IDE, and I need just a couple of functions.
In the document about the device's ROM I found this list of API functions and their symdef table, for example:
api_rng_mode = 0x00007425;
api_rng_get_random_bytes = 0x00007441;
Prototypes of these two functions look like this:
void api_rng_mode(uint8_t tmode_pseudo);
uint32_t api_rng_get_random_bytes ( uint8_t *pbuff8,uint32_t num_bytes);
MY PLAN is to use those functions only by defining pointers to them, but am I actually doing it right?
For those 2 examples I am using this definitions:
#define api_rng_mode (void(*)(uint8_t))0x00007425
#define api_get_random_bytes (uint32_t(*(uint8_t,uint32_t))0x00007441
And in code I call them like this:
(*api_rng_mode)(1);
(*api_get_random_bytes)(p, numberofbytes);
But so far it doesn't seem to work... Any ideas?
EDIT: Thank you all for your help:) It works now and I learned something new:)
The code before jumped somewhere into ROM and got stuck there, but it was solved by resetting the RNG during the process of setting it up.
The code may well be correct, but it is borderline unreadable with the added problem of macros that don't respect scopes. Therefore:
// alias the function types by just adding a `typedef`
typedef void api_rng_mode_function(uint8_t tmode_pseudo);
typedef uint32_t api_rng_get_random_bytes_function( uint8_t *pbuff8,uint32_t num_bytes);
That also makes it easier to e.g. mock the functions for unit tests. Then, define pointers to these functions:
// TODO: Add reference to documentation where these magic numbers came from!
api_rng_mode_function *const api_rng_mode = (api_rng_mode_function *)0x00007425;
api_rng_get_random_bytes_function *const api_rng_get_random_bytes = (api_rng_get_random_bytes_function *)0x00007441;
You can then call the according functions using plain function syntax:
api_rng_mode(42);
uint8_t buffer[17];
uint32_t res = api_rng_get_random_bytes(buffer, sizeof buffer);
Note that the explicit dereferencing (with *) isn't necessary, although it can be done.
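As a rough sketch of the mocking remark above: code that receives the RNG through a pointer of the typedef'd function type can be exercised on a host machine with a fake. The names rng_get_bytes_fn, fake_rng and make_key are hypothetical, and the assumption that the function returns the number of bytes written is mine, not taken from the ROM documentation.

```c
#include <stdint.h>

/* Same shape as api_rng_get_random_bytes; the name is hypothetical. */
typedef uint32_t rng_get_bytes_fn(uint8_t *pbuff8, uint32_t num_bytes);

/* Host-side fake: fills the buffer with 0, 1, 2, ... instead of random data.
 * Assumption: the return value is the number of bytes written. */
static uint32_t fake_rng(uint8_t *pbuff8, uint32_t num_bytes)
{
    for (uint32_t i = 0; i < num_bytes; i++)
        pbuff8[i] = (uint8_t)i;
    return num_bytes;
}

/* Code under test takes the RNG as a parameter rather than using the ROM address. */
static uint32_t make_key(rng_get_bytes_fn *rng, uint8_t *key, uint32_t len)
{
    return rng(key, len);
}
```

On the target, the ROM pointer would be passed instead of fake_rng; the calling code stays the same.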
The only thing that might still cause problems now is "calling conventions", which is the stack protocol, i.e. the way how arguments are passed on the stack. If the calling function pushes them in the wrong order or if the called function expects some arguments in registers instead, you may need to tell the compiler or even write some inline assembly.
If you want to use defines, you need to pass the parameters as well:
#define api_rng_mode(p) ((void(*)(uint8_t))0x00007425)(p)
#define api_get_random_bytes(p,s) ((uint32_t (*)(uint8_t *,uint32_t))0x00007441)(p,s)
But actually I would wrap them into the inline functions like this:
static inline uint32_t api_get_random_bytes(uint8_t *buff, uint32_t size)
{
return ((uint32_t (*)(uint8_t *,uint32_t))0x00007441)(buff,size);
}

trigger function before file write operation

Let's say we have a function:
void persist_result(FILE* to, unsigned char* b, int b_len) {...}
which would save some result in the given FILE* to.
Now I would like to get the data before the data is written to to, do something with it (assume encrypt it, etc..) and then call the actual IO operation, directly or indirectly.
One solution could be setting a buffer, but I don't know how to trigger my method for the encryption operation.
Also I was thinking to get some handle of file in memory, but don't know if there is any ISO way to do that?
Or any better solution?
Consider the following:
Size of the data need to be written by the persist_result is unknown, it could be 1 or more bytes.
I cannot change the source of persist_result.
No C++; it must be a portable C solution.
What you are looking for is the Observer Pattern.
When your function is called, actually you can first capture that call, do whatever you prefer and then continue with what you were doing. You could implement it in C using pointer to functions.
You can get inspiration from the following example
There is no way to capture every operation in standard C without changing the calls. Things like encryption need context (like key) to work; that complicates life in general, but maybe persist_result() handles that automatically. How will you handle things like fseek() or rewind()?
I think you are in for a world of pain unless you write your I/O operations to a non-standard C API that allows you to do what's necessary cleanly. For example, your code might be written to call functions such as pr_fwrite(), pr_putc(), pr_fprintf(), pr_vfprintf(), pr_fseek(), pr_rewind(), etc — you probably wouldn't be applying this to either stdin or stdout — and have those do what's necessary.
If I were going to try this, I'd adopt prefixes (pr_ and PR) and create a header "prstdio.h" to be used in place of, or in addition to, <stdio.h>. It could contain (along with comments and header guards, etc):
#include <stdarg.h>
// No need for #include <stdio.h>
typedef struct PRFILE PRFILE;
extern PRFILE *pr_fopen(const char *name, const char *mode);
extern int pr_fclose(PRFILE *fp);
extern int pr_fputc(char c, PRFILE *fp);
extern size_t pr_fwrite(const void *buffer, size_t size, size_t number, PRFILE *fp);
extern int pr_fprintf(PRFILE *fp, char *fmt, ...);
extern int pr_vfprintf(PRFILE *fp, char *fmt, va_list args);
extern int pr_fseek(PRFILE *fp, long offset, int whence);
extern void pr_rewind(PRFILE *fp);
…
and all the existing I/O calls that need to work with the persist_result() function would be written to use the prstdio.h interface instead. In your implementation file, you actually define the structure struct PRFILE, which would include a FILE * member plus any other information you need. You then write those pr_* functions to do what's necessary, and your code that needs to persist results is changed to call the pr_* functions (and use the PRFILE * type) whenever you currently use the stdio.h functions.
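A minimal sketch of what the implementation file might look like, covering only pr_fopen(), pr_fwrite() and pr_fclose(). The key member and the XOR transform are placeholders I made up to mark where a real encryption step would go; XOR with a fixed byte is not encryption.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct PRFILE PRFILE;

/* Only this file knows the layout: a FILE * plus whatever context is needed. */
struct PRFILE {
    FILE *fp;
    unsigned char key;   /* placeholder for a real encryption context */
};

PRFILE *pr_fopen(const char *name, const char *mode)
{
    PRFILE *pf = malloc(sizeof *pf);
    if (pf == NULL)
        return NULL;
    pf->fp = fopen(name, mode);
    if (pf->fp == NULL) {
        free(pf);
        return NULL;
    }
    pf->key = 0x5A;      /* arbitrary demo value */
    return pf;
}

/* Transform each byte before it reaches the underlying stream. */
size_t pr_fwrite(const void *buffer, size_t size, size_t number, PRFILE *pf)
{
    const unsigned char *src = buffer;
    size_t total = size * number;
    size_t i;

    for (i = 0; i < total; i++) {
        if (fputc(src[i] ^ pf->key, pf->fp) == EOF)
            break;
    }
    return size ? i / size : 0;   /* number of complete items written */
}

int pr_fclose(PRFILE *pf)
{
    int rc = fclose(pf->fp);
    free(pf);
    return rc;
}
```

Callers only ever see the opaque PRFILE * type, so the transform can later be swapped for a real cipher without touching them.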
This has the merit of being simply compliant with the C standard and can be made portable. Further, the changes to existing code that needs to use the 'persistent result' library are very systematic.
In a comment to the main question — originally responding to a now-deleted comment of mine (the contents of which are now in this answer) — the OP asked:
I need to do the encryption operation before the plain data write operation. The encryption context is ready for work. I was thinking using the disk on memory, but is it ISO and can be used in Android NDK and iOS too?
Your discussion so far is in terms of encrypting and writing the data. Don't forget the other half of the I/O equation — reading and decrypting the data. You'd need appropriate input functions in the header to be able to handle that. The pr_ungetc() function could cause some interesting discussions.
The scheme outlined here will be usable on other systems where you can write the C code. It doesn't rely on anything non-standard. This is a reasonable way of achieving data hiding in C. Only the implementation files for the prstdio library need know anything about the internals of the PRFILE structure.
Since 'disk in memory' is not part of standard C, any code using such a concept must be using non-standard C. You'd need to consider carefully what it means for portability, etc. Nevertheless, the external interface for the prstdio library could be much the same as described here, except that you might need one or more control functions to manipulate the placement of the data in memory. Or you might modify pr_fopen() to take extra arguments which control the memory management. That would be your decision. The general I/O interface need not change, though.

Reentrancy or not with this netbsd code

I am studying on "reading code" by reading pieces of NetBSD source code.
(for whoever is interested, it's < Code Reading: The Open Source Perspective > I'm reading)
And I found this function:
/* convert IP address to a string, but not into a single buffer
*/
char *
naddr_ntoa(naddr a)
{
#define NUM_BUFS 4
static int bufno;
static struct {
char str[16]; /* xxx.xxx.xxx.xxx\0 */
} bufs[NUM_BUFS];
char *s;
struct in_addr addr;
addr.s_addr = a;
strlcpy(bufs[bufno].str, inet_ntoa(addr), sizeof(bufs[bufno].str));
s = bufs[bufno].str;
bufno = (bufno+1) % NUM_BUFS;
return s;
#undef NUM_BUFS
}
It introduces 4 different temporary buffers to wrap inet_ntoa function since inet_ntoa is not re-entrant.
But seems to me this naddr_ntoa function is also not re-entrant:
the static bufno variable can be manipulated by other so the temporary buffers do not seem work as expected here.
So is it a potential bug?
Yes, this is a potential bug. If you want a similar function that is most likely reentrant, you could use e.g. inet_ntop (which incidentally handles IPv6 as well).
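A short sketch of the inet_ntop alternative, assuming a POSIX system. The wrapper name addr_to_str is hypothetical; the point is that the caller supplies the buffer, so no static storage is shared between calls.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/* Convert an IPv4 address (network byte order) into a caller-supplied buffer.
 * Returns buf on success, NULL on failure (for example, buffer too small). */
static const char *addr_to_str(uint32_t a, char *buf, socklen_t len)
{
    struct in_addr addr;

    addr.s_addr = a;
    return inet_ntop(AF_INET, &addr, buf, len);
}
```

A buffer of INET_ADDRSTRLEN bytes is large enough for any IPv4 address string.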
That code comes from src/sbin/routed/trace.c and it is not a general library routine, but just a custom hack used only in the routed program. The addrname() function in the same file makes use of the same trick, for the same reason. It's not even NetBSD code per se, but rather it comes from SGI originally, and is maintained by Vernon Schryver (see The Routed Page).
It's just a quick hack to allow use of multiple calls within the same expression, such as where the results are being used in one printf() call: E.g.:
printf("addr1->%s, addr2->%s, addr3->%s, addr4->%s\n",
naddr_ntoa(addr1), naddr_ntoa(addr2), naddr_ntoa(addr3), naddr_ntoa(addr4));
There are several examples of similar uses in the routed source files (if.c, input.c, rdisc.c).
There is no bug in this code. The routed program is not multi-threaded. Reentrancy is not being addressed at all in this hack. This trick has been done by design for a very specific purpose that has nothing to do with reentrancy. The Code Reading author(s) is wrong to associate this trick with reentrancy.
It's simply a way to hide the saving of multiple results in an array of static variables instead of having to individually copy those results from one static variable into separate storage in the calling function when multiple results are required for a single expression.
Remember that static variables have all the properties of global variables except for the limited scope of their identifier. It is of course true that unprotected use of global (or static) variables inside a function make that function non-reentrant, but that's not the only problem global variables cause. Use of a fully-reentrant function would not be appropriate in routed because it would actually make the code more complex than necessary, whereas this hack keeps the calling code clean and simple. It would though have been better for the hack to be properly documented such that future maintainers would more easily spot when NUM_BUFS has to be adjusted.
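The rotating-buffer trick and its NUM_BUFS limit can be sketched in portable C. fmt_id is a hypothetical function that mimics the shape of naddr_ntoa without the networking calls: four results may be live at once, and a fifth call silently reuses the first slot.

```c
#include <stdio.h>

#define NUM_BUFS 4

/* Format an id into one of NUM_BUFS static buffers, used round-robin. */
static const char *fmt_id(unsigned int id)
{
    static char bufs[NUM_BUFS][16];
    static int bufno;
    char *s = bufs[bufno];

    snprintf(s, sizeof bufs[0], "id-%u", id);
    bufno = (bufno + 1) % NUM_BUFS;   /* the next call uses the next slot */
    return s;
}
```

So a single printf() with up to four fmt_id() arguments is fine, but a fifth would overwrite the result of the first, which is the maintenance hazard mentioned above.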

How can I write a generic C function for calling a Win32 function?

To allow access to the Win32 API from a scripting language (written in C), I would like to write a function such as the following:
void Call(LPCSTR DllName, LPCSTR FunctionName,
LPSTR ReturnValue, USHORT ArgumentCount, LPSTR Arguments[])
which will call, generically, any Win32 API function.
(the LPSTR parameters are essentially being used as byte arrays - assume that they have been correctly sized to take the correct data type external to the function. Also I believe that some additional complexity is required to distinguish between pointer and non-pointer arguments but I'm ignoring that for the purposes of this question).
The problem I have is passing the arguments into the Win32 API functions. Because these are stdcall I can't use varargs so the implementation of 'Call' must know about the number of arguments in advance and hence it cannot be generic...
I think I can do this with assembly code (by looping over the arguments, pushing each to the stack) but is this possible in pure C?
Update: I've marked the 'No it is not possible' answer as accepted for now. I will of course change this if a C-based solution comes to light.
Update: ruby/dl looks like it may be implemented using a suitable mechanism. Any details on this would be appreciated.
First things first: You cannot pass a type as a parameter in C. The only option you are left with is macros.
This scheme works with a little modification (array of void * for arguments), provided you are doing a LoadLibrary/GetProcAddress to call Win32 functions. Having a function name string otherwise will be of no use. In C, the only way you call a function is via its name (an identifier) which in most cases decays to a pointer to the function. You also have to take care of casting the return value.
My best bet:
// define a function type to be passed on to the next macro
#define Declare(ret, cc, fn_t, ...) typedef ret (cc *fn_t)(__VA_ARGS__)
// for the time being doesn't work with UNICODE turned on
#define Call(dll, fn, fn_t, ...) do {\
HMODULE lib = LoadLibraryA(dll); \
if (lib) { \
fn_t pfn = (fn_t)GetProcAddress(lib, fn); \
if (pfn) { \
(pfn)(__VA_ARGS__); \
} \
FreeLibrary(lib); \
} \
} while(0)
int main() {
Declare(int, __stdcall, MessageBoxProc, HWND, LPCSTR, LPCSTR, UINT);
Call("user32.dll", "MessageBoxA", MessageBoxProc,
NULL, ((LPCSTR)"?"), ((LPCSTR)"Details"),
(MB_ICONWARNING | MB_CANCELTRYCONTINUE | MB_DEFBUTTON2));
return 0;
}
No, I don't think it's possible to do this without writing some assembly. The reason is you need precise control over what is on the stack before you call the target function, and there's no real way to do that in pure C. It is, of course, simple to do in assembly though.
Also, you're using PCSTR for all of these arguments, which is really just const char *. But since all of these args aren't strings, what you actually want to use for return value and for Arguments[] is void * or LPVOID. This is the type you should use when you don't know the true type of the arguments, rather than casting them to char *.
The other posts are right about the almost certain need for assembly or other non-standard tricks to actually make the call, not to mention all of the details of the actual calling conventions.
Windows DLLs use at least two distinct calling conventions for functions: stdcall and cdecl. You would need to handle both, and might even need to figure out which to use.
One way to deal with this is to use an existing library to encapsulate many of the details. Amazingly, there is one: libffi. An example of its use in a scripting environment is the implementation of Lua Alien, a Lua module that allows interfaces to arbitrary DLLs to be created in pure Lua aside from Alien itself.
A lot of Win32 APIs take pointers to structs with specific layouts. Of these, a large subset follow a common pattern where the first DWORD has to be initialized to have the size of the struct before it is called. Sometimes they require a block of memory to be passed, into which they will write a struct, and the memory block must be of a size that is determined by first calling the same API with a NULL pointer and reading the return value to discover the correct size. Some APIs allocate a struct and return a pointer to it, such that the pointer must be deallocated with a second call.
I wouldn't be that surprised if the set of APIs that can be usefully called in one shot, with individual arguments convertible from a simple string representation, is quite small.
To make this idea generally applicable, we would have to go to quite an extreme:
typedef void DynamicFunction(size_t argumentCount, const wchar_t *arguments[],
size_t maxReturnValueSize, wchar_t *returnValue);
DynamicFunction *GenerateDynamicFunction(const wchar_t *code);
You would pass a simple snippet of code to GenerateDynamicFunction, and it would wrap that code in some standard boilerplate and then invoke a C compiler/linker to make a DLL from it (there are quite a few free options available), containing the function. It would then LoadLibrary that DLL and use GetProcAddress to find the function, and then return it. This would be expensive, but you would do it once and cache the resulting DynamicFunctionPtr for repeated use. You could do this dynamically by keeping pointers in a hashtable, keyed by the code snippets themselves.
The boilerplate might be:
#include <windows.h>
// and anything else that might be handy
void DynamicFunctionWrapper(size_t argumentCount, const wchar_t *arguments[],
size_t maxReturnValueSize, wchar_t *returnValue)
{
// --- insert code snippet here
}
So an example usage of this system would be:
DynamicFunction *getUserName = GenerateDynamicFunction(
L"GetUserNameW(returnValue, (LPDWORD)(&maxReturnValueSize))");
wchar_t userName[100];
getUserName(0, NULL, sizeof(userName) / sizeof(wchar_t), userName);
You could enhance this by making GenerateDynamicFunction accept the argument count, so it could generate a check at the start of the wrapper that the correct number of arguments has been passed. And if you put a hashtable in there to cache the functions for each encountered codesnippet, you could get close to your original example. The Call function would take a code snippet instead of just an API name, but would otherwise be the same. It would look up the code snippet in the hashtable, and if not present, it would call GenerateDynamicFunction and store the result in the hashtable for next time. It would then perform the call on the function. Example usage:
wchar_t userName[100];
Call("GetUserNameW(returnValue, (LPDWORD)(&maxReturnValueSize))",
0, NULL, sizeof(userName) / sizeof(wchar_t), userName);
Of course there wouldn't be much point doing any of this unless the idea was to open up some kind of general security hole. e.g. to expose Call as a webservice. The security implications exist for your original idea, but are less apparent simply because the original approach you suggested wouldn't be that effective. The more generally powerful we make it, the more of a security problem it would be.
Update based on comments:
The .NET framework has a feature called p/invoke, which exists precisely to solve your problem. So if you are doing this as a project to learn about stuff, you could look at p/invoke to get an idea of how complex it is. You could possibly target the .NET framework with your scripting language - instead of interpreting scripts in real time, or compiling them to your own bytecode, you could compile them to IL. Or you could host an existing scripting language from the many already available on .NET.
You could try something like this - it works well for win32 API functions:
int CallFunction(int functionPtr, int* stack, int size)
{
if(!stack && size > 0)
return 0;
for(int i = 0; i < size; i++) {
int v = *stack;
__asm {
push v
}
stack++;
}
int r;
FARPROC fp = (FARPROC) functionPtr;
__asm {
call fp
mov dword ptr[r], eax
}
return r;
}
The parameters in the "stack" argument should be in reverse order (as this is the order they are pushed onto the stack).
Having a function like that sounds like a bad idea, but you can try this:
int Call(LPCSTR DllName, LPCSTR FunctionName,
USHORT ArgumentCount, int args[])
{
int (STDCALL *foobar)() = lookupDLL(...);
switch(ArgumentCount) {
/* Note: If these give some compiler errors, you need to cast
each one to a func ptr type with suitable number of arguments. */
case 0: return foobar();
case 1: return foobar(args[0]);
...
}
}
On a 32-bit system, nearly all values fit into a 32-bit word and shorter values are pushed onto stack as 32-bit words for function call arguments, so you should be able to call virtually all Win32 API functions this way, just cast the arguments to int and the return value from int to the appropriate types.
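The switch-on-argument-count idea can be sketched portably as follows. This is not Win32 code: call_n, the fnN_t typedefs and add2 are hypothetical, only up to two arguments are handled, and no calling convention is specified. On Windows the target would come from GetProcAddress and the pointer types would need the appropriate calling-convention keyword. Each case casts the generic pointer to a type with the matching arity before calling it.

```c
#include <stdint.h>

typedef void (*generic_fn_t)(void);            /* type-erased function pointer */
typedef intptr_t (*fn0_t)(void);
typedef intptr_t (*fn1_t)(intptr_t);
typedef intptr_t (*fn2_t)(intptr_t, intptr_t);

/* Call fn with argc word-sized arguments; returns 1 on success, 0 if the
 * argument count is not supported by this sketch. */
static int call_n(generic_fn_t fn, unsigned int argc,
                  const intptr_t *args, intptr_t *result)
{
    switch (argc) {
    case 0: *result = ((fn0_t)fn)();                 return 1;
    case 1: *result = ((fn1_t)fn)(args[0]);          return 1;
    case 2: *result = ((fn2_t)fn)(args[0], args[1]); return 1;
    default: return 0;
    }
}

/* Example target whose real signature matches fn2_t. */
static intptr_t add2(intptr_t a, intptr_t b)
{
    return a + b;
}
```

The call is only well defined when the cast type matches the target's real signature, so the caller still has to know the arity and argument widths; that is the limitation the other answers point out.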
I'm not sure if it will be of interest to you, but an option would be to shell out to RunDll32.exe and have it execute the function call for you. RunDll32 has some limitations and I don't believe you can access the return value whatsoever but if you form the command line arguments properly it will call the function.
Here's a link
First, you should add the size of each argument as an extra parameter. Otherwise, you need to divine the size of each parameter for each function in order to push it onto the stack. That is possible for WinXX functions, since they have to match their documented parameters, but it is tedious.
Secondly, there isn't a "pure C" way to call a function without knowing the arguments except for a varargs function, and there is no constraint on the calling convention used by a function in a .DLL.
Actually, the second part is more important than the first.
In theory, you could set up a preprocessor macro/#include structure to generate all combinations of parameter types up to, say, 11 parameters, but that implies that you know ahead of time which types will be passed through your Call function. Which is kind of crazy if you ask me.
Although, if you really wanted to do this unsafely, you could pass down the C++ mangled name and use UnDecorateSymbolName to extract the types of the parameters. However, that won't work for functions exported with C linkage.
