How to determine if a pointer is in rodata [duplicate] - c

Can I tell if a pointer is in the rodata section of an executable?
As in, editing that pointer's data would cause a runtime system trap.
Example (using a C character pointer):
void foo(char const * const string) {
if ( in_rodata( string ) ) {
puts("It's in rodata!");
} else {
puts("That ain't in rodata");
}
}
Now I was thinking that, maybe, I could simply compare the pointer to the rodata section.
Something along the lines of:
if ( string > start_of_rodata && string < end_of_rodata ) {
// it's in rodata!
}
Is this a feasible plan/idea?
Does anyone have an idea as to how I could do this?
(Is there any system information that one might need in order to answer this?)
I am executing the program on a Linux platform.

I doubt that it could possibly be portable
If you don't want to mess with linker scripts or platform-specific memory-map query APIs, a proxy approach is fairly portable on platforms with memory protection, provided you're willing to settle for knowing whether the location is writable, read-only, or neither. The general idea is to do a test read and a test write. If the first succeeds but the second fails, it's likely .rodata or a code segment. This doesn't tell you "it's rodata for sure": it may be a code segment, or some other read-only page, such as a read-only file memory mapping that has copy-on-write disabled. Whether that matters depends on what you had in mind for this test, i.e. what the ultimate purpose was.
Another caveat is: For this to be even remotely safe, you must suspend all other threads in the process when you do this test, as there's a chance you may corrupt some state that code executing on another thread may happen to refer to. Doing this from inside a running process may have hard-to-debug corner cases that will stop lurking and show themselves during a customer demo. So, on platforms that support this, it's always preferable to spawn another process that will suspend the first process in its entirety (all threads), probe it, write the result to the process's address space (to some result variable), resume the process and terminate itself. On some platforms, it's not possible to modify a process's address space from outside, and instead you need to suspend the process mostly or completely, inject a probe thread, suspend the remaining other threads, let the probe do its job, write an answer to some agreed-upon variable, terminate, then resume everything else from the safety of an external process.
For simplicity's sake, the below will assume that it's all done from inside the process. Even though "fully capable" self-contained examples that work cross-process would not be very long, writing this stuff is a bit tedious especially if you want it short, elegant and at least mostly correct - I imagine a really full day's worth of work. So, instead, I'll do some rough sketches and let you fill in the blanks (ha).
Windows
Structured exceptions get thrown e.g. due to protection faults or divide by zero. To perform the test, attempt a read from the address in question. If that succeeds, you know it's at least a mapped page (otherwise it'll throw an exception you can catch). Then try writing there - if that fails, then it was read-only. The code is almost boring:
static const int foo = 0;
static int bar;
#ifdef _WIN32
#include <windows.h>
#include <assert.h>
typedef struct ThreadState ThreadState;
ThreadState *suspend_other_threads(void) { ... }
void resume_other_threads(ThreadState *state) { ... }
int check_if_maybe_rodata(void *p) {
    // Test read: an unmapped address raises an access violation.
    __try {
        (void) *(volatile char *)p;
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        return 0; // not even readable, so not rodata
    }
    volatile LONG result = 0;
    ThreadState *state = suspend_other_threads();
    // Test write: store back the value that is already there.
    __try {
        InterlockedExchange(&result, 1);
        LONG saved = *(volatile LONG *)p;
        InterlockedExchange((volatile LONG *)p, saved);
        InterlockedExchange(&result, 0); // we succeeded writing there
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        // the write faulted: result stays 1
    }
    resume_other_threads(state);
    return result;
}
int main(void) {
    assert(check_if_maybe_rodata((void *)&foo));
    assert(!check_if_maybe_rodata(&bar));
}
#endif
Suspending the threads requires traversing the thread list, and suspending each thread that's not the current thread. The list of all suspended threads has to be created and saved, so that later the same list can be traversed to resume all the threads.
There are surely caveats, and WoW64 threads have their own API for suspension and resumption, but it's probably something that would, in controlled circumstances, work OK.
Unix
The idea is to leverage the kernel to check the pointer for us "at arm's length" so that no signal is raised. Recovering from the POSIX signal raised by a memory protection fault is awkward: returning from the handler re-runs the faulting instruction, so you either jump out of the handler with siglongjmp or patch the code that caused the fault, which forces you to modify the protection status of the code's memory. Not so great. Instead, pass the pointer to a syscall that should succeed in all normal circumstances and that has to read from the pointed-to address, e.g. open /dev/zero and write to that file from a buffer pointed to by the pointer. If that fails with EFAULT, it is due to buf [being] outside your accessible address space. If you can't even read from that address, it's not .rodata for sure. One caveat: this only works if the kernel actually copies from the buffer. On Linux, writes to /dev/null and /dev/zero are discarded without touching it, so a pipe is a more dependable target there.
Then do the converse: from an open /dev/zero, attempt a read into the address you are testing. If the read succeeds, then it wasn't read-only data (and the bytes there have just been overwritten with zeros, so save the original contents beforehand and restore them afterwards). If the read fails with EFAULT, that most likely means the area in question is read-only, since reading from it succeeded but writing to it didn't.
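A minimal sketch of that two-step probe for Linux follows. It swaps in a pipe for /dev/zero, which is an assumption of this sketch rather than part of the answer: the kernel has to copy the byte out of the buffer to fill the pipe, and reading the same byte back into the address leaves a writable location unchanged. The name probe_read_only and its return convention are hypothetical, and the thread-safety caveats above still apply.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if *p is readable but not writable,
 * 0 if it is writable, and -1 if it is not readable or the probe failed. */
int probe_read_only(const void *p) {
    int fds[2];
    int result = -1;

    assert(p != NULL);
    if (pipe(fds) != 0)
        return -1;
    /* Test read: the kernel must copy one byte out of *p into the pipe.
     * An unreadable address makes write() fail with EFAULT. */
    if (write(fds[1], p, 1) == 1) {
        /* Test write: the kernel copies that same byte back into *p, so a
         * writable location ends up with its original contents. */
        if (read(fds[0], (void *)p, 1) == 1)
            result = 0;
        else if (errno == EFAULT)
            result = 1;
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Called on a string literal this would typically report 1, and on a local array 0, but as the answer says, a result of 1 only means "some read-only page", not ".rodata for sure".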
In all cases, it would be preferable to use native platform APIs to test the mapping status of the page on which the address resides, or even better, to walk the section list of the mapped executable (ELF on Linux, PE on Windows) and see exactly what went where. Nothing guarantees that on every system with memory protection the .rodata section or its equivalent will be mapped read-only, so the executable's image as mapped into the running process is the ultimate authority. Even that does not guarantee that the section is currently mapped read-only: an mprotect or similar call could have made it, or parts of it, writable, modified the contents, and then perhaps changed it back to read-only. You'd then have to either checksum the section, if the executable's format provides such data, or mmap the same binary somewhere else in memory and compare the sections.
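On Linux, one such native interface is /proc/self/maps, which lists every mapping of the current process with its present permissions. The sketch below is only an illustration under that assumption; the function name mapping_perms is made up, and it reports the protection of the containing mapping, not which ELF section the address came from.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the permission field (e.g. "r--p") of the
 * mapping that contains addr into perms. Returns 1 if found, 0 otherwise. */
int mapping_perms(const void *addr, char perms[5]) {
    unsigned long target = (unsigned long)addr;
    char line[1024];
    int found = 0;
    FILE *maps;

    assert(perms != NULL);
    maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return 0;
    while (!found && fgets(line, sizeof line, maps) != NULL) {
        unsigned long lo, hi;
        char field[5];
        /* Each line starts with "start-end perms", addresses in hex. */
        if (sscanf(line, "%lx-%lx %4s", &lo, &hi, field) == 3 &&
            target >= lo && target < hi) {
            memcpy(perms, field, sizeof field);
            found = 1;
        }
    }
    fclose(maps);
    return found;
}
```

A mapping whose field reads "r--p" is readable and not writable right now, which is the property the question cares about. Because the file reflects the live state, it also picks up any mprotect changes mentioned above.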
But I smell a faint smell of an XY problem: what is it that you're actually trying to do? I mean, surely you don't just want to check if an address is in .rodata out of curiosity's sake. You must have some use for that information, and it is this application that would ultimately decide whether even doing this .rodata check should be on the radar. It may be, it may be not. Based on your question alone, it's a solid "who knows?"

Related

Buffering expectations using `printf`

Say there exists a C program that executes in some Linux process. Upon start, the C program calls setvbuf to disable buffering on stdout. The program then alternates between two "logical" calls ("logical" in this sense to avoid consideration of the compiler possibly reordering instructions) - the first to printf() and the second incrementing a variable.
int main (int argc, char **argv)
{
setvbuf(stdout, NULL, _IONBF, 0);
unsigned int a = 0;
for (;;) {
printf("hello world!");
a++;
}
}
At some point assume the program receives a signal, e.g. via kill, that causes the program to terminate. Will the contents of stdout always be complete after the signal is received, in the sense that they include the result of all previous invocations to printf(), or is this dependent on other levels of buffering/other behavior not controllable via setvbuf (e.g. kernel buffering)?
The broader context of this question is, if using a synchronous logging mechanism in a C application (e.g. all threads log with printf()), can the log be trusted to be "complete" for all calls that have returned from printf() upon receiving some application-terminating signal?
Edit: I've edited the code snippet and question to remove undefined behavior for clarity.
Any sane interpretation of the expression "unbuffered stream" means that the data has left the stream object when printf returns. In the case of file-descriptor backed streams, that means the data has entered kernel-space, and the kernel should continue sending the data to its final destination (assuming no kernel panic, power loss etc).
But a problem with segfaults is that they may not happen when you think they do. Take for instance the following code:
int *p = NULL;
printf("hello world\n");
*p = 1;
A dumb non-optimizing compiler may create code that segfaults at *p=1;. But that is not the only possibility according to the C standard. A compiler may, for instance, if it can prove that printf doesn't depend on the contents of *p, reorganize the code like this:
int *p = NULL;
*p = 1;
printf("hello world\n");
In that case printf would never be called.
Another possibility is that, since p==NULL makes *p=1 invalid, the compiler may scrap that expression altogether.
EDIT: The poster has changed the question from "Segfaulting" to being killed. In that case, it should all depend on if the kernel closes open file descriptors on exit the same way as close does, or not.
Given a construct like:
fprintf(file1, "whatever"); fflush(file1);
file2 = fopen(someExistingFile, "w");
there are some circumstances where it may be essential that fopen doesn't overwrite the existing file unless or until the write to file1 can be guaranteed successful, but there are others where waiting until success of the fflush can be assured before starting the fopen would needlessly degrade performance. The Standard leaves this unspecified, both to let designers of C implementations weigh such considerations however they see fit and to avoid requiring that implementations provide semantic guarantees beyond those offered by the underlying OS (e.g. if an OS reports that the fflush() is complete before data is written to disk, and offers no way of finding out when all pending writes are complete, there would be no way the Standard could usefully require that an implementation which targets that OS must not allow fflush to return at any time when the write could still fail).
So, it appears that there's a basic misunderstanding in your question, and I think it's important to go through the basics of what printf is. If your stdout buffer size is 0, then the answer to "will all data be sent out of the buffer" is always yes, since there isn't a hardware buffer to save data, in theory. That is, somewhere in your computer hardware there's something like a UART chip that has a small buffer for transferring data. Most programs I've seen do not use this hardware buffer, so it's not surprising that your program does the same.
However, the printf function has an upper-layer buffer (in my application ~150 characters), and I'm assuming that this is the buffer you're asking about. Note that this is not the same thing as the stdout buffer; it's just an allocated piece of memory that stores messages before they're sent to wherever you want them to go. Think about it: if there were no printf-specific buffer, you would only be able to send 1 character per function call.
Now it really depends on the implementation of printf on your system, and whether it's nonblocking or blocking. If it's nonblocking, that could mean that data is being transferred by an interrupt or a DMA, probably a combination of both. In that case it depends on whether your system stops these transfer mechanisms in the middle of a transfer or allows them to complete. It's impossible for me to say based on the information you've given.
However, in my experience, printf is usually a blocking function; that is, it holds up the rest of your code while it's transferring things out of the buffer and moves to the next command only once it has completed. In that case, if you have stopped the code from running (again, I'm not certain of the specifics of "kill" in your system), then you have also stopped the transfer.
Your system most likely has blocking printf calls, and considering you say a "kill" signal, it sounds like you're not even really sure what you mean by that. I think it's safe to assume that whatever signal you're talking about is not internally stopping your printf function from completing, so your full message will probably be sent before exiting, even if it arrives mid-printf. If your printf is being called, it most likely is completing and sending the full message, unless this "kill" signal does something odd. That's the best answer I can give you from a "C" standpoint. If you would like a more absolute answer, you would have to give us information that lets us see the implementation of "printf" on your operating system, and/or give us more specifics on how this "kill signal" you mentioned works.

How do I get a function to execute in a different address space? Writing a clone function

I have this code that gives me a segmentation fault. My understanding of the clone function is that the parent process has to allocate space for the child process and clone calls a function that runs in that stack space. Am I misunderstanding something or does my code just not make sense?
char *stack;
char *stackTop;
stack = malloc(STACK_SIZE);
if (stack == NULL)
fprintf(stderr, "malloc");
stackTop = stack + STACK_SIZE;
myClone(childFunc, stackTop, CLONE_FILES, NULL);
int myClone(int (*fn)(void *), void *child_stack,int flags, void *arg){
int* space = memcpy(child_stack, fn, sizeof(fn));
typedef int func(void);
func* f = (func*)&space;
f();
}
There are two main reasons why this wouldn't work.
Memory protection: the relevant memory pages must be executable. Data pages you get from malloc are not, and "normal" memory-management functions such as malloc can't make them so. On the other hand, the existing code pages are not writable, so you can't copy one piece of code over another. This is a fundamental memory-protection mechanism. You have to either go back to DOS or use some advanced "debugging" interface.
Position-independent code: all memory addresses in your code must either be relative or be fixed up manually. It may be too tricky to do this in C.
The clone() function is a system call. It cannot be replicated by C code running within your process.
There's a fundamental misunderstanding there. So you're getting a segmentation fault, that tells me you're trying to run this code in user space (in a process created by the operating system).
An address space is an abstraction available to the operating system. It typically uses hardware support (that of an MMU [memory management unit]) which provides means to use virtual addresses. These are addresses that are, when accessed, automatically translated to the real physical addresses according to some data structures that only the OS can manage.
I don't think it makes much sense to go into great detail here; you have enough keywords to google for. The essence is: there is no way you can create an address space from user-space code. That functionality is reserved to the OS, and to do it, clone() on Linux issues a syscall, invoking the OS.
edit: concerning the stack, providing a stack means to reserve space for it (by mapping an appropriate amount of pages to the address space) and setting the necessary processor registers when context is switched to the process (e.g. esp/ebp on i386). This, too, is something only the operating system can do.
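For comparison, here is a minimal sketch of calling the real clone() wrapper on Linux with glibc. The caller only hands over a block of memory for the child's stack, and the kernel does the rest. The names child_func and run_child_with_clone are hypothetical, and the flags are just the ones from the question plus SIGCHLD so that the child can be reaped with waitpid.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)

/* Runs in the child; its return value becomes the child's exit status. */
static int child_func(void *arg) {
    return *(int *)arg;
}

/* Hypothetical wrapper: clones a child that exits with `code` and returns
 * the exit status seen by the parent, or -1 on failure. */
int run_child_with_clone(int code) {
    char *stack;
    int status = 0;
    int result = -1;
    pid_t pid;

    assert(code >= 0 && code <= 255);
    stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;
    /* The stack grows downward on x86, so pass the top of the block.
     * Without CLONE_VM the child works on its own copy of the memory. */
    pid = clone(child_func, stack + STACK_SIZE, CLONE_FILES | SIGCHLD, &code);
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);
    free(stack);
    return result;
}
```

Note that nothing here copies the function's machine code anywhere: fn is simply the address at which the kernel-created child starts executing.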

C overwrite return address to a function cause kernel panic

I'm doing a bit of reverse engineering practice and I got stuck on this problem. The general idea is to have a process P1 call a function, f1(). At the beginning of f1() I let it sleep so our evil process P2 can kick in. In P2, I overwrite the return address on f1()'s stack with the address of our evil function, fevil(). But when f1() wakes up, it crashes before jumping to fevil().
More detail:
I'm using a kind of OS without any memory protection. Every process can read/write the
whole memory range.
The whole thing runs on x86 architecture.
The way I do it is locate the return address on the call stack of f1(), let's say 0xffeecc, and do *((int*) 0xffeecc) = fevil;
I'm using gcc and all sort of standard C stuff.
The OS is single-threaded, and these two processes are the only two running, in addition to the main process.
So the question is why the whole thing crashes, and if it's the correct way to jump to a function by the address of the function.
I can provide more details upon request. Thank you.
Actually, your compiler might implement some memory protection of its own, like canaries against buffer overflow/stack smashing.
It inserts a magic word before the return address and checks its integrity before jumping to it.
You may have overwritten this marker.

Debugging a clobbered static variable in C (gdb broken?)

I've done a lot of programming but not much in C, and I need advice on debugging. I have a static variable (file scope) that is being clobbered after about 10-100 seconds of execution of a multithreaded program (using pthreads on OS X 10.4). My code looks something like this:
static float some_values[SIZE];
static int * addr;
addr points to a valid memory address for a while, and then gets clobbered with some value (sometimes 0, sometimes nonzero), thereby causing a segfault when dereferenced. Poking around with gdb I have verified that addr is being laid out in memory immediately after some_values as one would expect, so my first guess would be that I have used an out-of-bounds index to write to some_values. However, this is a tiny file, so it is easy to check this is not the problem.
The obvious debugging technique would be to set a watchpoint on the variable addr. But doing so seems to create erratic and inexplicable behavior in gdb. The watchpoint gets triggered at the first assignment to addr; then after I continue execution, I immediately get a nonsensical segfault in another thread...supposedly a segfault on accessing the address of a static variable in a different part of the program! But then gdb lets me read from and write to that memory address interactively.
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x001d5bd0
0x0000678d in receive (arg=0x0) at mainloop.c:39
39 sample_buf_cleared ++;
(gdb) p &sample_buf_cleared
$17 = (int *) 0x1d5bd0
(gdb) p sample_buf_cleared
$18 = 1
(gdb) set sample_buf_cleared = 2
(gdb)
gdb is obviously confused. Does anyone know why? Or does anyone have any suggestions for debugging this bug without using watchpoints?
You could put an array of unsigned ints between some_values and addr and determine whether you are overrunning some_values or whether the corruption affects more addresses than you first thought. I would initialize the padding to 0xDEADBEEF or some other obvious pattern that is easy to distinguish and unlikely to occur in the program. If a value in the padding changes, then cast it to float and see if the number makes sense as a float.
static float some_values[SIZE];
static unsigned int padding[1024];
static int * addr;
Run the program multiple times. In each run, disable a different thread and see when the problem goes away.
Set the program's process affinity to a single core and then try the watchpoint. You may have better luck if you don't have two threads simultaneously modifying the value. NOTE: This solution does not preclude that from happening. It may make it easier to catch in a debugger.
static variables and multi-threading generally do not mix.
Without seeing your code (you should include your threaded code), my guess is that you have two threads concurrently writing to addr variable. It doesn't work.
You either need to:
create separate instances of addr for each thread; or
provide some sort of synchronisation around addr to stop two threads changing the value at the same time.
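The second option could look roughly like the sketch below, which funnels every access to addr through a pthread mutex. The accessor names set_addr and get_addr are hypothetical. This only serializes intentional reads and writes; it does nothing against a stray write from an unrelated buffer overrun.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static int *addr;  /* the shared pointer from the question */
static pthread_mutex_t addr_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical accessor: replaces addr while holding the lock. */
void set_addr(int *new_value) {
    int rc = pthread_mutex_lock(&addr_lock);
    assert(rc == 0);
    addr = new_value;
    pthread_mutex_unlock(&addr_lock);
}

/* Hypothetical accessor: returns a snapshot of addr taken under the lock. */
int *get_addr(void) {
    int *snapshot;
    int rc = pthread_mutex_lock(&addr_lock);
    assert(rc == 0);
    snapshot = addr;
    pthread_mutex_unlock(&addr_lock);
    return snapshot;
}
```

The lock protects the pointer itself, not the memory it points to, so data reached through the snapshot may need its own synchronization.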
Try using valgrind; I haven't tried valgrind on OS X, and I don't understand your problem, but "try valgrind" is the first thing I think of when you say "clobbered".
One thing you could try would be to create a separate thread whose only purpose is to watch the value of addr, and to break when it changes. For example:
static int * volatile addr; // volatile here is important, and must be after the *
void *addr_thread_proc(void *arg)
{
while(1)
{
int *old_value = addr;
while(addr == old_value) /* spin */;
__asm__("int3"); // break the debugger, or raise SIGTRAP if no debugger
}
}
...
pthread_t spin_thread;
pthread_create(&spin_thread, NULL, &addr_thread_proc, NULL);
Then, whenever the value of addr changes, the int3 instruction will run, which will break the debugger, stopping all threads.
gdb often acts weird with multithreaded programs. Another solution (if you can afford it) would be to put printf()s all over the place to try and catch the moment where your value gets clobbered. Not very elegant, but sometimes effective.
I have not done any debugging on OSX, but I have seen the same behavior in GDB on Linux: program crashes, yet GDB can read and write the memory which program just tried to read/write unsuccessfully.
This doesn't necessarily mean GDB is confused; rather the kernel allowed GDB to read/write memory via ptrace() which the inferior process is not allowed to read or write. IOW, it was a (recently fixed) kernel bug.
Still, it sounds like GDB watchpoints aren't working for you for whatever reason.
One technique you could use is to mmap space for some_values rather than statically allocating space for them, arrange for the array to end on a page boundary, and arrange for the next page to be non-accessible (via mprotect).
If any code tries to access past the end of some_values, it will get an exception (effectively you are setting a non-writable "watch point" just past some_values).

Reading Other Process' Memory in OS X?

I've been trying to understand how to read the memory of other processes on Mac OS X, but I'm not having much luck. I've seen many examples online using ptrace with PEEKDATA and such, however it doesn't have that option on BSD [man ptrace].
int pid = fork();
if (pid > 0) {
// mess around with child-process's memory
}
How is it possible to read from and write to the memory of another process on Mac OS X?
Use task_for_pid() or other methods to obtain the target process’s task port. Thereafter, you can directly manipulate the process’s address space using vm_read(), vm_write(), and others.
Matasano Chargen had a good post a while back on porting some debugging code to OS X, which included learning how to read and write memory in another process (among other things).
It has to work, otherwise GDB wouldn't:
It turns out Apple, in their infinite wisdom, had gutted ptrace(). The OS X man page lists the following request codes:
PT_ATTACH — to pick a process to debug
PT_DENY_ATTACH — so processes can stop themselves from being debugged
[...]
No mention of reading or writing memory or registers. Which would have been discouraging if the man page had not also mentioned PT_GETREGS, PT_SETREGS, PT_GETFPREGS, and PT_SETFPREGS in the error codes section. So, I checked ptrace.h. There I found:
PT_READ_I — to read instruction words
PT_READ_D — to read data words
PT_READ_U — to read U area data if you’re old enough to remember what the U area is
[...]
There’s one problem solved. I can read and write memory for breakpoints. But I still can’t get access to registers, and I need to be able to mess with EIP.
I know this thread is 100 years old, but for people coming here from a search engine:
xnumem does exactly what you are looking for, manipulate and read inter-process memory.
// Create new xnu_proc instance
xnu_proc *Process = new xnu_proc();
// Attach to pid (or process name)
Process->Attach(getpid());
// Manipulate memory
int i = 1337, i2 = 0;
i2 = Process->memory().Read<int>((uintptr_t)&i);
// Detach from process
Process->Detach();
If you're looking to be able to share chunks of memory between processes, you should check out shm_open(2) and mmap(2). It's pretty easy to allocate a chunk of memory in one process and pass the path (for shm_open) to another, and both can then go crazy together. This is a lot safer than poking around in another process's address space, as Chris Hanson mentions. Of course, if you don't have control over both processes, this won't do you much good.
(Be aware that the max path length for shm_open appears to be 26 bytes, although this doesn't seem to be documented anywhere.)
// Needs <fcntl.h>, <sys/mman.h> and <unistd.h>
// Create shared memory block
void* sharedMemory = NULL;
size_t shmemSize = 123456;
const char* shmName = "mySharedMemPath";
int shFD = shm_open(shmName, (O_CREAT | O_EXCL | O_RDWR), 0600);
if (shFD >= 0) {
    if (ftruncate(shFD, shmemSize) == 0) {
        sharedMemory = mmap(NULL, shmemSize, (PROT_READ | PROT_WRITE), MAP_SHARED, shFD, 0);
        if (sharedMemory != MAP_FAILED) {
            // Initialize shared memory if needed
            // Send 'shmName' & 'shmemSize' to other process(es)
        } else {
            sharedMemory = NULL; // handle error
        }
    } else {
        // handle error
    }
    close(shFD); // Note: sharedMemory still valid until munmap() called
} else {
    // handle error
}
// ...
// Do stuff with shared memory
// ...
// Tear down shared memory
if (sharedMemory != NULL) munmap(sharedMemory, shmemSize);
if (shFD >= 0) shm_unlink(shmName);
// Get the shared memory block from another process
void* sharedMemory = NULL;
size_t shmemSize = 123456; // Or fetched via some other form of IPC
const char* shmName = "mySharedMemPath"; // Or fetched via some other form of IPC
int shFD = shm_open(shmName, (O_RDONLY), 0600); // Can be R/W if you want
if (shFD >= 0) {
    sharedMemory = mmap(NULL, shmemSize, PROT_READ, MAP_SHARED, shFD, 0);
    if (sharedMemory != MAP_FAILED) {
        // Check shared memory for validity
    } else {
        sharedMemory = NULL; // handle error
    }
    close(shFD); // Note: sharedMemory still valid until munmap() called
} else {
    // handle error
}
// ...
// Do stuff with shared memory
// ...
// Tear down shared memory
if (sharedMemory != NULL) munmap(sharedMemory, shmemSize);
// Only the creator should shm_unlink()
You want to do inter-process communication with the shared memory method. For a summary of other common methods, see here
It didn't take me long to find what you need in this book, which contains all the APIs that are common to all UNIXes today (many more than I thought). You should buy it in the future. This book is a set of (several hundred) printed man pages, which are rarely installed on modern machines.
Each man page details a C function.
It didn't take me long to find shmat() shmctl(); shmdt() and shmget() in it. I didn't search extensively, maybe there's more.
It looks a bit outdated, but yes: the base user-space API of modern UNIX OSes goes back to the 80s.
Update: most functions described in the book are part of the POSIX C headers, you don't need to install anything. There are few exceptions, like with "curses", the original library.
I have definitely found a short implementation of what you need (only one source file (main.c)).
It is specially designed for XNU.
It is in the top ten result of Google search with the following keywords « dump process memory os x »
The source code is here
but from a strict virtual address space point of view, you should be more interested in this question: OS X: Generate core dump without bringing down the process? (see also this)
When you look at the gcore source code, it is quite complex to do this, since you need to deal with threads and their state...
On most Linux distributions, the gcore program is now part of the GDB package. I think the OS X version is installed with Xcode/the development tools.
UPDATE: wxHexEditor is an editor which can edit devices. It can also edit process memory the same way it does regular files. It works on all UNIX machines.
Manipulating a process's memory behind its back is a Bad Thing and is fraught with peril. That's why Mac OS X (like any Unix system) has protected memory, and keeps processes isolated from one another.
Of course it can be done: There are facilities for shared memory between processes that explicitly cooperate. There are also ways to manipulate other processes' address spaces as long as the process doing so has explicit right to do so (as granted by the security framework). But that's there for people who are writing debugging tools to use. It's not something that should be a normal — or even rare — occurrence for the vast majority of development on Mac OS X.
In general, I would recommend that you use regular open() to open a temporary file. Once it's open in both processes, you can unlink() it from the filesystem and you'll be set up much like you would be if you'd used shm_open. The procedure is extremely similar to the one specified by Scott Marcy for shm_open.
The disadvantage to this approach is that if the process that will be doing the unlink() crashes, you end up with an unused file and no process has the responsibility of cleaning it up. This disadvantage is shared with shm_open, because if nothing shm_unlinks a given name, the name remains in the shared memory space, available to be shm_opened by future processes.
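The temporary-file variant can be sketched as follows. To stay self-contained it makes a simplifying assumption: the second process is a fork()ed child that inherits the mapping, so the file can be unlinked right away. Two unrelated processes would each open() the path first and unlink it only afterwards. The function name and the /tmp template are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: lets a child store `value` in a page backed by an
 * unlinked temporary file and returns what the parent reads back,
 * or -1 on failure. */
int share_int_via_tempfile(int value) {
    char path[] = "/tmp/shared-demo-XXXXXX";  /* assumed scratch location */
    int fd, seen = -1;
    int *shared;
    pid_t pid;

    assert(value >= 0);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);  /* the name is gone; the open descriptor keeps the file alive */
    if (ftruncate(fd, sizeof *shared) != 0) {
        close(fd);
        return -1;
    }
    shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);     /* the mapping stays valid until munmap() */
    if (shared == MAP_FAILED)
        return -1;
    *shared = -1;
    pid = fork();
    if (pid == 0) {  /* child: write through the shared mapping */
        *shared = value;
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        seen = *shared;
    munmap(shared, sizeof *shared);
    return seen;
}
```

Unlinking early sidesteps the leftover-file problem described above, at the cost of only working for processes that can inherit or be passed the descriptor.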

Resources