trigger function before file write operation - c

Let's say we have a function:
void persist_result(FILE* to, unsigned char* b, int b_len) {...}
which saves some result to the given FILE *to.
Now I would like to intercept the data before it is written to the stream to, do something with it (encrypt it, for example), and then perform the actual I/O operation, directly or indirectly.
One solution could be to set a buffer on the stream, but I don't know how to trigger my encryption routine from it.
I was also thinking of getting some in-memory file handle, but I don't know whether there is an ISO-standard way to do that.
Or is there a better solution?
Consider the following:
The size of the data that persist_result needs to write is unknown; it could be one or more bytes.
I cannot change the source of persist_result.
No C++; it must be a portable C solution.

What you are looking for is the Observer Pattern.
When your function is called, you can first capture that call, do whatever you need, and then continue with the original operation. You could implement this in C using function pointers.
You can get inspiration from the following example
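A minimal sketch of that idea, assuming the code that performs the write can be routed through a wrapper. The names write_hook_fn, set_write_hook, hooked_write and xor_hook are hypothetical, and the XOR step is only a stand-in for real encryption. Note that this only helps when the writing code calls the wrapper; it cannot intercept writes made inside an unmodified persist_result().

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical hook type: called with the data before it is written. */
typedef void (*write_hook_fn)(unsigned char *data, size_t len, void *ctx);

static write_hook_fn g_hook = NULL;  /* the registered observer, if any */
static void *g_hook_ctx = NULL;      /* opaque context passed back to it */

/* Register (or clear, with NULL) the observer. */
void set_write_hook(write_hook_fn hook, void *ctx)
{
    g_hook = hook;
    g_hook_ctx = ctx;
}

/* Let the hook inspect or transform the bytes, then write them to fp. */
size_t hooked_write(FILE *fp, unsigned char *data, size_t len)
{
    if (g_hook != NULL)
        g_hook(data, len, g_hook_ctx);
    return fwrite(data, 1, len, fp);
}

/* Example observer: XOR each byte with a one-byte key.
   This is a placeholder, not real encryption. */
void xor_hook(unsigned char *data, size_t len, void *ctx)
{
    unsigned char key = *(unsigned char *)ctx;
    size_t i;

    for (i = 0; i < len; i++)
        data[i] ^= key;
}
```

A caller would register the observer once with set_write_hook(xor_hook, &key) and then use hooked_write() wherever it previously called fwrite().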

There is no way to capture every operation in standard C without changing the calls. Things like encryption need context (like key) to work; that complicates life in general, but maybe persist_result() handles that automatically. How will you handle things like fseek() or rewind()?
I think you are in for a world of pain unless you write your I/O operations to a non-standard C API that allows you to do what's necessary cleanly. For example, your code might be written to call functions such as pr_fwrite(), pr_putc(), pr_fprintf(), pr_vfprintf(), pr_fseek(), pr_rewind(), etc — you probably wouldn't be applying this to either stdin or stdout — and have those do what's necessary.
If I were going to try this, I'd adopt prefixes (pr_ and PR) and create a header "prstdio.h" to be used in place of, or in addition to, <stdio.h>. It could contain (along with comments and header guards, etc):
#include <stdarg.h>
#include <stddef.h> // for size_t; no need for #include <stdio.h>
typedef struct PRFILE PRFILE;
extern PRFILE *pr_fopen(const char *name, const char *mode);
extern int pr_fclose(PRFILE *fp);
extern int pr_fputc(int c, PRFILE *fp);
extern size_t pr_fwrite(const void *buffer, size_t size, size_t number, PRFILE *fp);
extern int pr_fprintf(PRFILE *fp, const char *fmt, ...);
extern int pr_vfprintf(PRFILE *fp, const char *fmt, va_list args);
extern int pr_fseek(PRFILE *fp, long offset, int whence);
extern void pr_rewind(PRFILE *fp);
…
and all the existing I/O calls that need to work with the persist_result() function would be written to use the prstdio.h interface instead. In your implementation file, you actually define the structure struct PRFILE, which would include a FILE * member plus any other information you need. You then write those pr_* functions to do what's necessary, and your code that needs to persist results is changed to call the pr_* functions (and use the PRFILE * type) whenever you currently use the stdio.h functions.
This has the merit of being simply compliant with the C standard and can be made portable. Further, the changes to existing code that needs to use the 'persistent result' library are very systematic.
In a comment to the main question — originally responding to a now-deleted comment of mine (the contents of which are now in this answer) — the OP asked:
I need to do the encryption operation before the plain data write operation. The encryption context is ready for work. I was thinking using the disk on memory, but is it ISO and can be used in Android NDK and iOS too?
Your discussion so far is in terms of encrypting and writing the data. Don't forget the other half of the I/O equation — reading and decrypting the data. You'd need appropriate input functions in the header to be able to handle that. The pr_ungetc() function could cause some interesting discussions.
The scheme outlined here will be usable on other systems where you can write the C code. It doesn't rely on anything non-standard. This is a reasonable way of achieving data hiding in C. Only the implementation files for the prstdio library need know anything about the internals of the PRFILE structure.
Since 'disk in memory' is not part of standard C, any code using such a concept must be using non-standard C. You'd need to consider carefully what it means for portability, etc. Nevertheless, the external interface for the prstdio library could be much the same as described here, except that you might need one or more control functions to manipulate the placement of the data in memory. Or you might modify pr_fopen() to take extra arguments which control the memory management. That would be your decision. The general I/O interface need not change, though.
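As a rough illustration of the implementation side, here is a minimal sketch of three of the pr_* functions. It assumes the structure holds a FILE * plus a one-byte key, and it uses XOR purely as a placeholder for the real encryption step; the key field and its value are assumptions for this sketch. In a real library the structure definition would live only in the implementation file, with the header keeping the opaque typedef.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

/* Shown here in full; a real prstdio.h would expose only the typedef. */
typedef struct PRFILE {
    FILE *fp;           /* underlying standard stream */
    unsigned char key;  /* placeholder "encryption context" (assumption) */
} PRFILE;

PRFILE *pr_fopen(const char *name, const char *mode)
{
    PRFILE *pf = malloc(sizeof *pf);

    if (pf == NULL)
        return NULL;
    pf->fp = fopen(name, mode);
    if (pf->fp == NULL) {
        free(pf);
        return NULL;
    }
    pf->key = 0x5A;     /* arbitrary demo key */
    return pf;
}

int pr_fclose(PRFILE *pf)
{
    int rc = fclose(pf->fp);

    free(pf);
    return rc;
}

/* Transform each byte (XOR stands in for encryption), then write it.
   Returns the number of complete items written, like fwrite().
   The sketch does not guard against size * number overflowing. */
size_t pr_fwrite(const void *buffer, size_t size, size_t number, PRFILE *pf)
{
    const unsigned char *src = buffer;
    size_t total = size * number;
    size_t i;

    if (size == 0 || number == 0)
        return 0;
    for (i = 0; i < total; i++) {
        if (fputc(src[i] ^ pf->key, pf->fp) == EOF)
            break;
    }
    return i / size;
}
```

A byte-wise transform keeps the sketch short. A block cipher would need buffering inside PRFILE, which is one reason pr_fseek() and pr_rewind() deserve careful thought.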

Related

how to read all parameters from a function - ebpf

So I have these macros
#define PT_REGS_PARM1(x) ((x)->di)
#define PT_REGS_PARM2(x) ((x)->si)
#define PT_REGS_PARM3(x) ((x)->dx)
#define PT_REGS_PARM4(x) ((x)->cx)
#define PT_REGS_PARM5(x) ((x)->r8)
#define PT_REGS_RET(x) ((x)->sp)
#define PT_REGS_FP(x) ((x)->bp)
#define PT_REGS_RC(x) ((x)->ax)
#define PT_REGS_SP(x) ((x)->sp)
#define PT_REGS_IP(x) ((x)->ip)
But the above does not say how to get a specific parameter from a function, say __sys_write.
Consider sys_write as:
long sys_write(unsigned int fd, const char __user *buf,
size_t count);
So I need the buffer. I have been trying different macros, but I am not sure which one gives me what.
Can anyone please clarify?
I also want to read the buffer, and if I read the buffer I need count too, so that my eBPF program loads without an out-of-bounds access error. Can anyone tell me how to get both?
Use the PT_REGS_PARM*(x) macros
PARM in PT_REGS_PARM1(x) stands for “parameter”. These macros give you access to the parameters of the function to which your kprobe or tracepoint is attached. So for example, PT_REGS_PARM1(ctx), where ctx is the struct pt_regs *ctx context passed as an argument to your eBPF program, will give you access to the first parameter, which is the file descriptor fd. Similarly, PT_REGS_PARM3(ctx) will give you the count, as you can confirm by looking at this kernel sample (write_size).
... But use bpf_probe_read_*() to stay safe with kernel memory
Similarly, you can point to the buffer buf with PT_REGS_PARM2(ctx). However, this one is a pointer; if you want to manipulate the data contained in this buffer, you need another step, or the kernel may reject your program as unsafe. To read and copy some or all of the data from this buffer, you should use one of the eBPF helpers bpf_probe_read_*(void *dst, u32 size, const void *unsafe_ptr) (see relevant documentation). In your case, the data contained in that buffer comes from user space, so you want bpf_probe_read_user().
Notes on CO-RE
This does not really apply to your example, because your pointer is just a buffer. But if one of your arguments were a pointer to a struct, you would need similar precautions to dereference it and access its fields.
And in such case you might want to leverage CO-RE, to make sure that you would access the correct offsets when reading the fields. If you have CO-RE support, libbpf also provides bpf_core_read*() wrappers around the eBPF helpers, which make access relocatable. See the BPF CO-RE reference guide for more information.
Also with CO-RE (technically, just BTF this time), certain types for tracing programs, in particular BPF_PROG_TYPE_TRACING, allow you to access struct fields without any helper (See the initial CO-RE article).

Interface for I/O library

When someone wants to build a C library for dealing with I/O (dealing with a specific file format), they pretty much have to provide the following:
/* usual opaque struct setup */
struct my_context;
typedef struct my_context my_context_t;
/* Open context for reading from user specified callbacks */
my_context_t* my_open_callback(void* userdata,
size_t(*read_cb)(void* data, size_t size, size_t count, void* userdata),
int(*close_cb)(void* userdata),
void(*error_cb)(const char* error_msg)
);
And then later provide some common ones:
/* Open directly from file */
my_context_t* my_open_file(const char * filename);
/* Open from an existing memory block */
my_context_t* my_open_memory(const char* buf, size_t len);
As far as I understand, there may be other approaches, but is this one considered to reduce inconsistencies, unsafe practices and inefficiencies in the design, or is something else considered best practice? Is there a name for this convention or best practice?
These are interface design questions. A good interface provides a useful abstraction and hides implementation details. In your example, my_context_t elides some of the implementation details from your user base, provided you don't fully define the type in a public header. This provides you with the freedom to make substantial changes to your implementation without forcing your entire user base to rewrite their code. It is a very good practice, provided the rest of your abstraction is a good fit to the problem space. Sometimes you just have to commit to exposing additional detail at the interface level.
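To make the layering concrete, here is a minimal sketch in which my_open_memory() is built on top of my_open_callback(), so the callback variant is the only real entry point. The struct layout, the mem_source helper, and the my_read() and my_close() functions are hypothetical names introduced for this sketch; they are not part of the interface shown in the question.

```c
#include <stdlib.h>
#include <string.h>
#include <stddef.h>

/* Would be private to the implementation file; users see only the typedef. */
typedef struct my_context {
    void *userdata;
    size_t (*read_cb)(void *data, size_t size, size_t count, void *userdata);
    int (*close_cb)(void *userdata);
    void (*error_cb)(const char *error_msg);
} my_context_t;

my_context_t *my_open_callback(void *userdata,
    size_t (*read_cb)(void *data, size_t size, size_t count, void *userdata),
    int (*close_cb)(void *userdata),
    void (*error_cb)(const char *error_msg))
{
    my_context_t *ctx = malloc(sizeof *ctx);

    if (ctx == NULL)
        return NULL;
    ctx->userdata = userdata;
    ctx->read_cb = read_cb;
    ctx->close_cb = close_cb;
    ctx->error_cb = error_cb;
    return ctx;
}

/* Cursor over a caller-owned memory block (hypothetical helper type). */
struct mem_source { const char *buf; size_t len; size_t pos; };

/* Copies up to count whole items and returns how many were copied. */
static size_t mem_read(void *data, size_t size, size_t count, void *userdata)
{
    struct mem_source *src = userdata;
    size_t avail, items;

    if (size == 0)
        return 0;
    avail = (src->len - src->pos) / size;
    items = count < avail ? count : avail;
    memcpy(data, src->buf + src->pos, items * size);
    src->pos += items * size;
    return items;
}

static int mem_close(void *userdata)
{
    free(userdata);  /* frees the cursor only, never the caller's buffer */
    return 0;
}

my_context_t *my_open_memory(const char *buf, size_t len)
{
    struct mem_source *src = malloc(sizeof *src);
    my_context_t *ctx;

    if (src == NULL)
        return NULL;
    src->buf = buf;
    src->len = len;
    src->pos = 0;
    ctx = my_open_callback(src, mem_read, mem_close, NULL);
    if (ctx == NULL)
        free(src);
    return ctx;
}

/* Hypothetical accessors that the rest of the library would use. */
size_t my_read(my_context_t *ctx, void *data, size_t size, size_t count)
{
    return ctx->read_cb(data, size, count, ctx->userdata);
}

int my_close(my_context_t *ctx)
{
    int rc = ctx->close_cb != NULL ? ctx->close_cb(ctx->userdata) : 0;

    free(ctx);
    return rc;
}
```

A my_open_file() variant could follow the same shape, passing a FILE * as userdata with thin wrappers around fread() and fclose().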

What do FILE struct members mean exactly in C?

typedef struct _iobuf{
char* _ptr;
int _cnt;
char* _base;
int _flag;
int _file;
int _charbuf;
int _bufsiz;
char* _tmpfname;
} FILE;
I try to print them but I don't understand what they mean.
In the question "What exactly is the FILE keyword in C?", an answer says "Some believe that nobody in their right mind should make use of the internals of this structure." but it doesn't explain what the members mean.
Look at the source code for your system's run-time libraries that implement the FILE-based IO calls if you want to know what those fields mean.
If you write code that depends on using those fields, it will be non-portable at best, utterly wrong at worst, and definitely easy to break. For example, on Solaris there are at least three different implementations of the FILE structure in just the normal libc runtime libraries, and one of those implementations (the 64-bit one) is opaque and you can't access any of the fields. Simply changing compiler flags changes which FILE structure your code uses.
And that's just one version of a single OS.
_iobuf::_file can be used to get the internal file number, which is useful for functions that require it, for example _fstat().

C: Public aliases to hide a static function

In C, I have a function that implements both the encryption and decryption routines of a block cipher. In order to both maintain a common naming and use convention, and to leave open the possibility of separating the routines into two different functions later, I've done the following:
void cipher(char *out, const char *in);
#define encrypt cipher
#define decrypt cipher
That works fine, except that I'd really like to hide the actual function (cipher) so people have to use encrypt or decrypt. Right now, cipher is part of the public interface, so if I decide to separate it into two different functions later and delete cipher, strictly speaking, I'm breaking the interface. But if I can hide cipher so only encrypt and decrypt are part of the interface, I'll be fine.
The only option I've come up with so far is to make cipher static and implement actual functions for encrypt and decrypt that call cipher, but I'm not sure that the added overhead is actually worth it (I'm trying to keep the code size as tight as possible, and I have multiple occurrences of this same problem).
Is there something I can do with function pointers? Any other ideas?
You could use function pointers:
static void cipher(char *out, const char *in);
void (*encrypt)(char *out, const char *in) = cipher;
void (*decrypt)(char *out, const char *in) = cipher;
At least in typical use (the user just uses encrypt(whatever);) this wouldn't normally be visible. The only obvious problem would be that as defined above, the pointers remain writable, so you might want to make them const so the user can't accidentally overwrite them with the address of some other function.
Another possibility would be to live with the name cipher being public (or rename it to something like private_cipher_ to avoid accidental name collisions) and then just use a couple of macros:
#define encrypt(x, y) private_cipher_((x), (y))
#define decrypt(x, y) private_cipher_((x), (y))
This should ensure against any overhead.
If you're using GCC, you can use the alias attribute to make two aliases that point to your cipher function.
But keep Steve Jessop's comment in mind and consider just writing two wrapper functions. It shouldn't cause noticeable overhead. The compiler might even emit each wrapper function as a single jump instruction.
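A minimal sketch of the wrapper approach, assuming a toy XOR transform over a fixed block in place of the real cipher. BLOCK_SIZE and the bc_ prefix are assumptions of this sketch; the prefix is there because POSIX already declares an encrypt() function in <unistd.h>, which an unprefixed name could clash with.

```c
#include <stddef.h>

#define BLOCK_SIZE 8  /* assumed block size for this sketch */

/* Internal linkage: not visible outside this translation unit. */
static void cipher(char *out, const char *in)
{
    size_t i;

    /* XOR is its own inverse, so one routine serves both directions. */
    for (i = 0; i < BLOCK_SIZE; i++)
        out[i] = (char)(in[i] ^ 0x5A);
}

/* Public interface: only these two names are exported. */
void bc_encrypt(char *out, const char *in) { cipher(out, in); }
void bc_decrypt(char *out, const char *in) { cipher(out, in); }
```

If the two directions ever need separate implementations, only the bodies of the wrappers change and callers are unaffected.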
In GCC (and compatible) lands, you can also use the attribute fluff for controlling symbol visibility, versioning and aliasing:
static void xxx_encrypt_decrypt(char *y, const char *x) { ... }
void encrypt(char *, const char *) __attribute__((alias("xxx_encrypt_decrypt")));
void decrypt(char *, const char *) __attribute__((alias("xxx_encrypt_decrypt")));

An easy way to replace fread()'s with reading from a byte array?

I have a piece of code that needs to be run from a restricted environment that doesn't allow stdio (Flash's Alchemy compiler). The code uses standard fopen/fread functions and I need to convert it to read from a char* array. Any ideas on how to best approach this? Does a wrapper exist or some library that would help?
Thanks!
EDIT: I should also mention that it's reading in structs. Like this:
fread(&myStruct, 1, sizeof(myStruct), f);
I don't know of any such wrapper, but I don't think it would be too difficult to make your own. That's because C's approach to file I/O hides everything behind the FILE* interface, which actually makes it nicely object-oriented.
Since you're using C rather than C++, I would suggest using preprocessor macros to replace every instance of fopen(), fclose() and fread() with MEM_fopen() etc. which are routines that you will define. You will need to define your own FILE type, for which you could simply use the following:
typedef unsigned char *FILE;
(If you need to manage EOF, you will instead need FILE to be a struct with an additional length field.)
Then your MEM_fread() function will look something like:
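size_t MEM_fread(void *buf, size_t size, size_t n, FILE *f) {
    /* *f is the current read position inside the byte array */
    memcpy(buf, *f, size * n);
    *f += size * n;  /* advance past the bytes just copied */
    return n;        /* no EOF tracking: always reports n items */
}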
The signature for the MEM_fopen() "constructor" may need to change slightly, since the identifier you need is now a memory address instead of a filename.
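For the case where EOF matters, a minimal sketch of the struct-based variant might look like this. The MEM_FILE type, its field names, and the MEM_fopen() and MEM_feof() signatures are assumptions chosen for illustration. Unlike the real fread(), this version never copies a partial trailing item.

```c
#include <string.h>
#include <stddef.h>

/* In-memory "file": a byte array plus a length and a read position. */
typedef struct MEM_FILE {
    const unsigned char *data;
    size_t len;
    size_t pos;
} MEM_FILE;

/* "Constructor": takes a memory address and length instead of a filename. */
void MEM_fopen(MEM_FILE *f, const void *data, size_t len)
{
    f->data = data;
    f->len = len;
    f->pos = 0;
}

/* Copies up to n whole items of the given size; returns the item count. */
size_t MEM_fread(void *buf, size_t size, size_t n, MEM_FILE *f)
{
    size_t avail, items;

    if (size == 0 || n == 0)
        return 0;
    avail = (f->len - f->pos) / size;
    items = n < avail ? n : avail;
    memcpy(buf, f->data + f->pos, items * size);
    f->pos += items * size;
    return items;
}

/* Nonzero once every byte has been consumed (simpler than the real feof()). */
int MEM_feof(const MEM_FILE *f)
{
    return f->pos >= f->len;
}
```

Reading a struct then works as in the question: MEM_fread(&myStruct, 1, sizeof(myStruct), &f).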
glibc has fmemopen, open_memstream, and open_wmemstream, which all return a FILE * that you can use with the stdio file I/O functions directly and also pass to fclose.
man 3 fmemopen
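A minimal sketch of reading a struct out of a byte array with fmemopen(), assuming a POSIX.1-2008 environment that provides it (it is not part of ISO C, so it may not be available in a restricted toolchain). The record type and the read_record_from_memory() helper are hypothetical names for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* request the fmemopen() declaration */
#endif
#include <stdio.h>
#include <stddef.h>

struct record { int id; int value; };

/* Reads one struct record from a byte array through a FILE *.
   Returns 1 on success, 0 if the stream cannot be opened or is too short. */
int read_record_from_memory(const void *bytes, size_t len, struct record *out)
{
    /* fmemopen() takes a non-const pointer; mode "r" only reads from it. */
    FILE *f = fmemopen((void *)bytes, len, "r");
    size_t got;

    if (f == NULL)
        return 0;
    got = fread(out, 1, sizeof *out, f);
    fclose(f);
    return got == sizeof *out;
}
```

The existing fread() calls stay unchanged with this approach; only the fopen() call is swapped for fmemopen().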
Is memcpy insufficient? It should be pretty easy to write a wrapper around it that has a signature similar to fread.
Just write your own version of fread(). Pass the .obj or .lib to the linker before the CRT library and the linker will pick your definition instead of the one from the CRT library.

Resources