I have one program that forks other programs. When the forked programs receive SIGUSR2, a variable in them is supposed to change. I'm not sure how to do that because that variable isn't in the scope of the function that SIGUSR2 calls.
In C, a function cannot see or manipulate the value of a variable local to another function (ignoring the possibility of a visible pointer to a local variable that is either static or in an active call frame).
Your question's setup is not very clear, but to answer generically (and perhaps a bit pedantically), code does not change variables, code changes memory.
That is... a variable is simply a convenient way to refer to a memory location. "Changing a variable" is really just changing the value at its position in memory. This is relevant because while it's very convenient to execute x = 5;, that's not the only way to change x. Any code that knows where x lives in memory, and has permission to write to that location, may therefore change x.
In your specific case you're starting a second process. Initially this second process has a copy of the first one's memory, letting it read the same data, but after fork() the copies are separate (typically copy-on-write): any change to memory is visible only in the process that made it.
Your wording suggests that you're not only calling fork(), but that you may also then be exec'ing to another program altogether... making even the copy of the parent's memory go away.
In short, what you're trying to do is probably not possible without going through some rather ugly hacks, and it would definitely be worth finding a different solution (such as shared memory).
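If you stay with plain fork() (no exec), one clean option is to put the flag in memory that both processes genuinely share, so the child's signal handler and the parent see the same location. A minimal sketch, assuming a POSIX system; the name shared_flag and the sleep/kill choreography are only illustrative:

#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t *shared_flag;   /* lives in the shared mapping */

static void on_sigusr2(int sig)
{
    (void)sig;
    *shared_flag = 1;      /* a plain store to sig_atomic_t is async-signal-safe */
}

int main(void)
{
    /* MAP_SHARED | MAP_ANONYMOUS memory stays shared across fork(),
       unlike ordinary copy-on-write memory */
    shared_flag = mmap(NULL, sizeof *shared_flag, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared_flag == MAP_FAILED)
        return 1;
    *shared_flag = 0;

    pid_t pid = fork();
    if (pid == 0) {                           /* child */
        signal(SIGUSR2, on_sigusr2);
        while (*shared_flag == 0)             /* illustrative; a robust version   */
            pause();                          /* would block and use sigsuspend() */
        printf("child: flag is now %d\n", (int)*shared_flag);
        return 0;
    }

    sleep(1);                                 /* crude: let the child set up its handler */
    kill(pid, SIGUSR2);
    waitpid(pid, NULL, 0);                    /* the parent could also read *shared_flag */
    return 0;
}

If you do exec another program, an anonymous mapping will not survive; you would need a named object (shm_open() plus mmap()) or some other form of IPC instead.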
Related
I was wondering: if I have an int variable that I want to be synced across all my threads, couldn't I reserve one bit of it to indicate whether the value is currently being updated?
I want to avoid the write being executed in chunks, because then other threads could read a half-written value, or, even worse, overwrite it and corrupt it completely; so I want the threads to be informed first that the variable is being written to. I could simply use an atomic operation to write the new value so that the other threads don't interfere, but this idea doesn't seem that dumb to me and I would like to try the basic tools first.
What if I make just one operation that is small enough to happen in one chunk, like changing a single bit (which will still change the whole byte, but not the whole value, right?), and let that bit indicate whether the variable is being written to? Would that even work, or would the whole int be written?
I mean, even if the whole int were written, this could still work, as long as the bit indicating that the value is changing were written first.
Any thoughts on this?
EDIT: I feel like I did not specify what I am actually planning to do, and why I thought of this in the first place.
I am trying to implement a timeout function, similar to setTimeout in JavaScript. It is pretty straightforward for a timeout that you never want to cancel: you create a new thread, tell it to sleep for a given amount of time, then give it a function to execute, possibly with some data. Piece of cake. I finished writing it in maybe half an hour, while being totally new to C.
The hard part comes when you want to set a timeout which might be canceled later. You do exactly the same as for a timeout without canceling, but when the thread wakes up and the scheduler runs it again, it must check whether a value in the memory it was given at start says 'you should stop executing'. The value could potentially be modified by another thread, but it would only be modified once, at least in the best-case scenario. I will worry about other solutions when it comes to modifying the value from multiple threads at the same time. The base assumption right now is that only the main thread, or one other thread, can modify the value, and that it happens only once. This can be arranged by setting up another variable, which might change multiple times but always to the same value (that is, the initial value is 0 and means not-yet-canceled; when the timeout must be canceled, the value changes to 1, so there is no worry about the value being fragmented into multiple write operations with only a chunk of it updated at the moment another thread reads it).
Given this assumption, I think the text I wrote at the beginning of this post should be clearer. In a nutshell: no need to worry about the value being written multiple times, only once, but by any thread, and the value must be readable by any other thread, or it must be indicated that it cannot be read.
Now that I think of it, since the value itself will only ever be 0 or 1, the trick for knowing whether it has already been canceled should work too, shouldn't it? Since the 0 or 1 will always be written in one operation, there is no need to worry about it being fragmented and read incorrectly. Please correct me if I'm wrong.
On the other hand, what if the value is written from the end rather than the beginning? If that is not possible then there is nothing to worry about and the question is resolved, but I would like to know of every danger that comes with working around atomic operations like this, in this specific context. If the value is written from the end and a thread reads the variable to know whether it should continue executing, it will conclude that it should, while the expected behaviour is to stop. The chance of this should be minimal, but it is still possible, which makes it dangerous, and I want the behaviour to be 100% predictable.
Another edit, to explain the steps I imagine the program taking.
The main thread spawns a new thread, a 'cancelable timeout'. It passes a function to execute along with data, a time to sleep, and a memory address pointing to a value. After the thread wakes up, it must check that value to see whether it should execute the function it has been given: 0 means it should continue, 1 means it should stop and exit. The value (the thread's 'state', canceled or not) can be manipulated by either the main thread or any other thread, a 'timeout' whose job is to cancel the first thread.
Sample code:
struct Timeout {
    void (*function)(void* data);
    void* data;
    int milliseconds;
    int** base;
    int cancelID;
};
DWORD WINAPI CTimeout(const struct Timeout* data) {
    Sleep(data->milliseconds);
    /* pointer arithmetic on int* is already in units of int,
       so the index must not be scaled by sizeof(int) */
    if ((*data->base)[data->cancelID] == 0) {
        data->function(data->data);
    }
    free((void*)data);
    return 0;
}
Where CTimeout is the function provided to the newly spawned thread. Please note that I wrote some of this code on the go and haven't tested it. Ignore any potential errors.
Timeout.base is a pointer to a pointer to an array of ints, since many timeouts can exist at the same time. Timeout.cancelID is the ID of the current thread in the list of timeouts; treated as an index into the base array, it selects this timeout's state. If that value is 0, the thread should execute its function; otherwise it should clean up the data it was given and return nicely. The reason base is a pointer to a pointer is that the array of timeout states can be resized at any time. If the array's location changes, there is no way to keep using its initial location; doing so might even cause a segmentation fault (if not, please correct me) for accessing memory which no longer belongs to us.
Base can be accessed from the main thread or other threads if necessary, and the state of our thread can be changed to cancel its execution.
If any thread wants to change the state (the state of the timeout we spawned at the beginning and want to cancel), it should change the value in the 'base' array. I think this is pretty straightforward so far.
There would be a huge problem if the values for continuing and stopping were bigger than one byte. The write to memory could then take multiple operations, and accessing the memory too early would produce unexpected results, which I am not fond of. Though, as I mentioned earlier, what if the value is very small, just 0 or 1? Would it matter at all when the value is accessed? We only care about 1 byte; 2, 4, or even 8 bytes wouldn't make any difference in this case, would they? In the end there is no worry about receiving an invalid value, since we don't care about the whole 32-bit value, only about 1 bit, no matter how many bytes we are reading.
Maybe it isn't entirely clear what I mean. Write and read operations do not happen bit by bit but in whole byte(s). That is, if our value is no bigger than 255, or 65535, or whatever fits in the number of bytes we are writing/reading, we shouldn't have to worry about reading it in the middle of it being written. What we care about is only one chunk of what is being written, the first or last byte(s); the rest is useless to us, so there is no need for all of it to be in sync at the moment we read the value. The real problem starts when the value is written starting from the end, the part that is useless to us. If we read the value at that moment, we get what we shouldn't: a 'not canceled' state instead of 'canceled'. If the first byte (given little endian) were written first, we would get a valid value even when reading in the middle of a write.
Perhaps I am mangling and mistaking everything. I am not a pro, you know. Perhaps I have been reading trashy articles, whatever. If I am wrong about anything at all, please correct me.
Except for some specialised embedded environments with dedicated hardware, there is no such thing as "one operation, which is small enough to keep it in one chunk, an operation like changing a single bit". Keep in mind that you do not simply want to overwrite the special bit with a 1 (or 0), because even if you could do that, it might coincide with some other thread doing the same. What you actually need to do is check whether it is already 1, write a 1 yourself ONLY if it is not, and KNOW that you did not overwrite an existing 1 (or that writing your 1 failed because a 1 was already there).
This is called a critical section, and the problem can only be solved with help from the OS, which knows about the other parallel threads and can keep them out of the way. This is the reason the OS-supported synchronisation primitives exist.
There is no easy way around this.
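For completeness: the portable way to get that check-and-write as one indivisible step, without inventing bit tricks, is C11 <stdatomic.h> (available in GCC, Clang and recent MSVC). A minimal sketch of the cancel flag described in the question; the names are made up for illustration:

#include <stdatomic.h>
#include <stdbool.h>

/* per-timeout flag: 0 = still pending, 1 = cancelled */
static atomic_int cancelled = 0;

/* Any thread may request cancellation.  atomic_exchange() writes 1 and
   returns the previous value in one indivisible step, so the caller also
   learns whether someone else had already cancelled -- exactly the
   "check whether it is already 1 and only then write" behaviour above. */
bool cancel_timeout(void)
{
    return atomic_exchange(&cancelled, 1) == 0;   /* true = we did the cancel */
}

/* The timeout thread calls this after waking up; an atomic load can never
   observe a torn, half-written value, whatever the size or endianness. */
bool should_run(void)
{
    return atomic_load(&cancelled) == 0;
}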
I have an array of bytes that is used as an emulated system RAM. I want to make a bullet-proof patch for a given cell, that detects when it's being written to, and overwrites it instantly. Using a loop like
for (;;) {
    address = x;
    sleep(y);
}
has the flaw that there is a minimum possible value for sleep, which appears to be nearly identical to the emulated frame length, so it would only patch the address once per frame. If the game writes to it 100 times per frame, such a patch makes little sense.
I have some hooks on writing, but those only catch writes by reading the game's code being executed, while I want to make such patches work for any memory region, not just RAM, hence I can't rely on interpreting the emulated code too much (it simply doesn't match for all regions I want to patch).
So I need some programmatic watchpoint, given a pointer to the array and the byte whose changes I want to watch.
Although C is not an object-oriented language, I would use an object-oriented approach here:
Wrap the emulated memory up in an opaque pointer that can only be read and written to with a specific set of functions (e.g. memory_write_byte and memory_read_byte).
Make the memory object maintain a list of function pointers that point to callback functions for handling write events. Whenever a write happens, make it call all those callbacks.
The part of the code that wants to monitor that spot in memory can register a callback with the memory object, and whenever the callback gets called it can modify the memory if needed.
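A minimal sketch of that design; the type and function names (Memory, memory_write_byte, MAX_WATCHERS) are made up for illustration:

#include <stdint.h>
#include <stddef.h>

#define MAX_WATCHERS 8                      /* arbitrary limit for this sketch */

/* callback invoked on every write: gets the address, the new value, and user data */
typedef void (*write_hook_fn)(uint16_t addr, uint8_t value, void *user);

typedef struct {
    uint8_t      *cells;                    /* the emulated RAM itself */
    size_t        size;
    write_hook_fn hooks[MAX_WATCHERS];      /* registered write callbacks */
    void         *hook_user[MAX_WATCHERS];
    int           hook_count;
} Memory;

void memory_register_write_hook(Memory *m, write_hook_fn fn, void *user)
{
    if (m->hook_count < MAX_WATCHERS) {
        m->hooks[m->hook_count]     = fn;
        m->hook_user[m->hook_count] = user;
        m->hook_count++;
    }
}

/* every write in the emulator must go through this function */
void memory_write_byte(Memory *m, uint16_t addr, uint8_t value)
{
    m->cells[addr] = value;
    for (int i = 0; i < m->hook_count; i++)
        m->hooks[i](addr, value, m->hook_user[i]);   /* notify watchers */
}

uint8_t memory_read_byte(const Memory *m, uint16_t addr)
{
    return m->cells[addr];
}

A watcher registered for the patched cell can compare addr against it and restore the patched value; if it does that by calling memory_write_byte() again, take care that the hooks do not recurse endlessly (for example, write to m->cells directly from inside the hook).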
I'd look into shared memory à la mmap. Using mmap you can have the same page shared by two processes, and one of the processes can map it read-only.
When a write to this memory region occurs, a SIGSEGV is generated, which you can catch and then take some sort of action. This is UNIX terminology, but you can do the same thing on Windows; it is just slightly more involved.
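Roughly what that looks like on the UNIX side, as a sketch only: the ram buffer and page handling here are assumptions, mprotect() from a signal handler is not formally async-signal-safe, and a real emulator would re-protect the page and apply the patch from its main loop rather than inside the handler.

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static unsigned char *ram;          /* page-aligned emulated memory */
static size_t         ram_size;

static void on_write_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx; (void)info;
    /* info->si_addr carries the address the faulting write targeted */
    write(STDERR_FILENO, "write detected\n", 15);
    /* unprotect so the faulting write completes when the handler returns */
    mprotect(ram, ram_size, PROT_READ | PROT_WRITE);
}

int main(void)
{
    ram_size = (size_t)sysconf(_SC_PAGESIZE);
    ram = mmap(NULL, ram_size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_write_fault;
    sigaction(SIGSEGV, &sa, NULL);

    mprotect(ram, ram_size, PROT_READ);   /* writes now fault */
    ram[42] = 7;                          /* triggers the handler, then completes */
    printf("ram[42] = %d\n", ram[42]);
    return 0;
}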
I know that declaring a static variable within a function in C means that this variable retains its state between function invocations. In the context of threads, will this result in the variable retaining its state over multiple threads, or having a separate state between each thread?
Here is a past paper exam question I am struggling to answer:
The following C function is intended to be used to allocate unique identifiers (UIDs) to its callers:
int get_uid()
{
    static int i = 0;
    return i++;
}
Explain in what way get_uid() might work incorrectly in an environment where it is being called by multiple threads. Using a specific example scenario, give specific detail on why and how such incorrect behaviour might occur.
At the moment I am assuming that each thread has a separate state for the variable, but I am not sure if that is correct or if the answer is more to do with the lack of mutual exclusion. If that is the case then how could semaphores be implemented in this example?
Your assumption (that threads have their own copy) is not correct. The main problem with the code is that when multiple threads call get_uid(), there is a possible race condition over which thread increments i and gets which ID, so the IDs may not be unique.
All the threads of a process share the same address space. Since i is a static variable, it has a fixed address. Its "state" is just the content of the memory at that address, which is shared by all the threads.
The postfix ++ operator increments its argument and yields the value of the argument before the increment. The order in which these are done is not defined. One possible implementation is
    copy i to R1
    copy R1 to R2
    increment R2
    copy R2 to i
    return R1
If more than one thread is running, they can both be executing these instructions simultaneously or interspersed. Work out for yourself sequences where various results obtain. (Note that each thread does have its own register state, even for threads running on the same CPU, because registers are saved and restored when threads are switched.)
A situation like this where there are different results depending on the indeterministic ordering of operations in different threads is called a race condition, because there's a "race" among the different threads as to which one does which operation first.
No. If you want a variable whose value depends on the thread in which it is used, you should have a look at Thread Local Storage.
You can think of a static variable much like a completely global variable; it is really much the same thing. It is shared by everything in the process that knows its address.
EDIT: also, as a comment points out, if you keep this implementation as a static variable, race conditions could cause i to be incremented at the same time by several threads, meaning that you have no idea what value the function calls will return. In such cases you should protect access with synchronization objects such as mutexes or critical sections.
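If thread-private state is really what you want, C11's _Thread_local (or GCC's __thread) gives each thread its own copy of the variable. A small sketch with pthreads; note that for get_uid() this would defeat the purpose, because the IDs would then only be unique within each thread:

#include <pthread.h>
#include <stdio.h>

/* each thread gets its own independent copy of this counter */
static _Thread_local int i = 0;

static void *worker(void *arg)
{
    (void)arg;
    printf("this thread's first id: %d\n", i++);   /* prints 0 in every thread */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}

(Compile with -pthread.)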
Since this looks like homework, I'll answer only part of this and that is each thread will share the same copy of i. IOW, threads do not get their own copies. I'll leave the mutual exclusion bit to you.
Each thread will share the same static variable, which is effectively a global variable. The scenario where some threads can get a wrong value is a race condition (the increment isn't done in a single instruction; it is done in 3 assembly instructions: load, increment, store). Read the link below; the diagram there explains it well.
Race Condition
If you are using gcc you can use the atomic builtin functions. I'm not sure what is available for other compilers.
int get_uid()
{
    static int i = 0;
    return __atomic_fetch_add(&i, 1, __ATOMIC_SEQ_CST);
}
This ensures the increment is performed as a single atomic operation, so two threads can never read and increment the same value of i at the same time.
As started here
I need to know how to read the start address and length (virtual memory map) of a process.
I would like to map a process's memory: read values from it and write values to it.
I'm curious about how programs like Cheat-O'matic (cheat-o-matic.softonic.com.br) work. The first thing I thought was that the process would be loaded in one contiguous memory region, but that does not seem to be right.
Call VirtualQueryEx repeatedly, starting with address zero and increasing the address each time by the value of the RegionSize member of the MEMORY_BASIC_INFORMATION structure you passed to it. To obtain a meaningful map, obviously, the process should be paused.
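A sketch of that loop; error handling is omitted, and hProcess is assumed to have been obtained via OpenProcess with at least PROCESS_QUERY_INFORMATION access:

#include <windows.h>
#include <stdio.h>

void dump_memory_map(HANDLE hProcess)
{
    MEMORY_BASIC_INFORMATION mbi;
    unsigned char *addr = NULL;     /* start at address zero */

    while (VirtualQueryEx(hProcess, addr, &mbi, sizeof mbi) == sizeof mbi) {
        printf("base %p  size %10zu  %s\n",
               mbi.BaseAddress, (size_t)mbi.RegionSize,
               mbi.State == MEM_COMMIT  ? "committed" :
               mbi.State == MEM_RESERVE ? "reserved"  : "free");
        addr = (unsigned char *)mbi.BaseAddress + mbi.RegionSize;   /* next region */
    }
}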
Still, even after you got this memory map, I'm not sure what you can do with it: unless you know (by other means) the internals of the process you are accessing all you get to know is locations where you can read or write without triggering an access violation, not the meaning of their content. You should really clarify what you are trying to achieve, Read/WriteProcessMemory usually aren't a solution for "normal" problems.
What is a re-entrant procedure, and can you give an example scenario of when it is used?
Edit: also, can multiple processes access a re-entrant procedure in parallel?
Please explain it differently than Wikipedia does, as I don't totally understand their description; hence my question here.
The idea behind re-entrancy is that the routine may be called while it is in the middle of executing already and it will still work right.
Generally this is achieved by it using only parameters and local variables declared on the stack (in C terms, no static locals). It would also be important that it not lock any global resources during execution.
Now, you may ask, "How would such a weird thing as a routine being run multiple times at once happen?" Well, some ways this could happen are:
The routine is recursive (or mutually-recursive with some other set of routines).
It gets called by another thread.
It gets called by an interrupt.
If any of these happen, and the routine is modifying a global (or C static local), then the new execution could potentially wipe out the changes the first execution made. As an example, if that global was used as a loop control variable, it might cause the first execution, when it finally gets to resume, to loop the wrong number of times.
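As a small sketch of the difference (the function names are made up for illustration):

#include <stdio.h>

/* NOT reentrant: the static buffer is shared state.  If a second call starts
   (from another thread, a signal handler, or recursion) before the first
   caller is done with the returned pointer, its contents get clobbered. */
char *format_id_bad(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "id-%d", id);
    return buf;
}

/* Reentrant: all state lives in parameters and the caller's buffer, so any
   number of simultaneous activations are independent of one another. */
char *format_id_good(int id, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "id-%d", id);
    return buf;
}

This is the same pattern as the standard library's strtok() (not reentrant) versus strtok_r() (reentrant).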
It is a subroutine which can be called while it is already active. For instance, recursive functions are often reentrant. Functions which are called from signal handlers must be reentrant as well. A reentrant function is thread-safe, but not all thread-safe ones are reentrant.
A reentrant procedure is one in which a single copy of the program code can be shared by multiple users during the same period of time. Reentrance has two key aspects: the program code cannot modify itself, and the local data for each user must be stored separately.
In a shared system, reentrancy allows more efficient use of main memory: one copy of the program code is kept in main memory, but more than one application can call the procedure. Thus, a reentrant procedure must have a permanent part (the instructions that make up the procedure) and a temporary part (a pointer back to the calling program as well as memory for local variables used by the program).
Each execution instance, called activation, of a procedure will execute the code in the permanent part but must have its own copy of local variables and parameters. The temporary part associated with a particular activation is referred to as an activation record.
The most convenient way to support reentrant procedures is by means of a stack. When a reentrant procedure is called, the activation record becomes part of the stack frame that is created for the procedure call.
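A tiny illustration of activation records: in the recursive call below, each activation gets its own stack frame, so every n and local is private to that activation (the names are just for the example):

#include <stdio.h>

static int depth(int n)
{
    int local = n * 10;          /* lives in this activation's stack frame */
    if (n > 0)
        depth(n - 1);            /* a new activation begins before this one ends */
    printf("activation n=%d sees local=%d\n", n, local);
    return local;
}

int main(void)
{
    depth(2);    /* prints local=0, local=10, local=20: each frame kept its own copy */
    return 0;
}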