I have 2 programs (processes). One process writes to shared memory while the other reads it. So, my approach was like this:
Initially, the value in shared memory is 0, so process 1 writes only when the value is 0. After process 1 has written some other value to the shm, it "waits" until the value becomes 0 again. Process 2 reads the shm and writes 0 back to it. By "wait", I mean a while(1) loop.
My question is: is this approach fine, or can I do better with some other approach in terms of CPU usage and memory usage?
The problem you mention is known as the process synchronization problem, and the logic you've given is the busy-waiting approach, which is the most primitive solution.
Read up on the Producer-Consumer problem, which is similar to yours.
There are better solutions than busy waiting, such as:
Spinlocks, semaphores, etc.
You can get a basic overview of all of these from here.
Hope it helps!
I think this is fine, but a problem occurs when both processes write to the shared memory block.
In that case you could use a semaphore to synchronize the two processes, allowing one at a time to write to the shared resource/memory block.
You can read more about semaphores [here](https://en.wikipedia.org/wiki/Semaphore_(programming)).
I'm using a simple fork() parent-child example to have the child generate some data and write() it for the parent. The child atomically writes less than 64 KiB (65536 bytes) of data to the pipe.
The parent reads from the pipe, and when it receives EOF (i.e. assuming the remote side has been closed), it carries on with its processing logic and closes at its own convenience; it doesn't care how long the child takes to terminate.
Is the parent guaranteed to be able to read all of the client data that was sent before EOF is encountered, or does any potential OS-level logic kick in to trigger the EOF early before all of the data is read?
I have found a very similar question on SO, but it didn't receive an authoritative/cited answer.
Thank you.
Yes, the parent will be able to read all the data. To put your mind at ease, try the following in a shell:
echo test | (sleep 1; cat)
The echo command finishes immediately; the other side of the pipe waits one second and then reads from it. This just works.
The child can also write more than 64 KiB without problems, as long as the parent keeps reading in a loop, although those writes will no longer be atomic.
I know for pthreads, if they're modifying the same variables or files, you can use pthread_mutex_lock to prevent simultaneous writes.
If I'm using fork() to have multiple processes, which are editing the same file, how can I make sure they're not writing simultaneously to that file?
Ideally I'd like to lock the file for one writer at a time, and each process would only need to write once (no loops necessary). Do I need to do this manually or will UNIX do it for me?
Short answer: you have to do it manually. There are certain guarantees on the atomicity of each write, but you'll still need to synchronize the processes to avoid interleaving writes.

There are a lot of techniques for synchronizing processes. Since all of your writers are descendants of a common process, probably the easiest thing to do is to pass a token on a common pipe. Before you fork, create a pipe and write a single byte into it.

Any time a process wants to write to the file, it does a blocking read on the pipe. If it gets the byte, it proceeds to write to the file; when it is done, it writes the byte back into the pipe. If any other process wants to access the file, it will block on the pipe read until the current writer is done. This is often simpler than using a semaphore, which is another excellent technique.
I am sorry if this sounds like I am repeating this question, but I have a couple additions that I am hoping someone can explain for me.
I am trying to implement a 'packet queueing system' with pipes. I have one thread that has a packet of data it needs to pass to a second thread (let's call the threads A and B respectively). Originally I did this with a queueing structure that I implemented using linked lists: I would lock a mutex, write to the queue, and then unlock the mutex. On the read side I did the same thing: lock, read, unlock. Now I've decided to change my implementation to use pipes (so that I can take advantage of blocking when data is not available). Now for my question:
Do I need to use mutexes to lock the file descriptors of the pipe for read and write operations?
Here is my thinking.
I have a standard message that gets written to the pipe on writes, and it is expected to be read on the read side.
struct pipe_message {
    int stuff;
    short more_stuff;
    char *data;
    int length;
};
// This is where I read from the pipe
num_bytes_read = read(read_descriptor, &buffer, sizeof(struct pipe_message));
if (num_bytes_read != sizeof(struct pipe_message)) // If the message isn't full
{
    fprintf(stderr, "Error: read did not receive a full message\n");
    return NULL;
}
If I do not use mutexes, could I potentially read only half of my message from the pipe?
This could be bad, because I would then not have a pointer to the data and could be left with memory leaks.
But if I use mutexes, I would lock the mutex on the read side, the read() would block, and then, because the mutex is locked, the write side would not be able to access the pipe.
Do I need to use mutexes to lock the file descriptors of the pipe for read and write operations?
It depends on the circumstances. Normally, no.
Normality
If you have a single thread writing into the pipe's write file descriptor, no. Nor does the reader need to use semaphores or mutexes to control reading from the pipe. That's all taken care of by the OS underneath on your behalf. Just go ahead and call write() and read(); nothing else is required.
Less Usual
If you have multiple threads writing into the pipe's write file descriptor, then the answer is maybe.
Under Linux, calling write() on the pipe's write file descriptor is an atomic operation provided the size of the data being written is no more than a certain amount (PIPE_BUF, documented in the pipe(7) man page; on Linux it is 4 KiB). This means that you don't need a mutex or semaphore to control access to the pipe's write file descriptor.
If the size of the data you're writing is larger than that, then the call to write() on the pipe is not atomic. So if you have multiple threads writing to the pipe and the messages are too large, you do need a mutex to control access to the write end of the pipe.
Using a mutex with a blocking pipe is actually dangerous. If the write side takes the mutex, writes to the pipe and blocks because the pipe is full, then the read side can't get the mutex to read the data from the pipe, and you have a deadlock.
To be safe, on the write side you'd probably need to do something like take the mutex, check if the pipe has space for what you want to write, if not then release the mutex, yield and then try again.
The Story
There is a writer thread periodically gathering data from somewhere (in real time, but that doesn't matter much for the question). There are many readers then reading this data. The usual solution is two reader-writer locks and two buffers, like this:
Writer (case 1):
acquire lock 0
loop
    write to current buffer
    acquire other lock
    free this lock
    swap buffers
    wait for next period
Or
Writer (case 2):
acquire lock 0
loop
    acquire other lock
    free this lock
    swap buffers
    write to current buffer
    wait for next period
The Problem
In both methods, if the acquire other lock operation fails, no swap is done and the writer overwrites its previous data (because the writer is real-time, it can't wait for readers). So in this case, all readers lose that frame of data.
This is not such a big deal though, the readers are my own code and they are short, so with double buffer, this problem is solved, and if there was a problem I could make it triple buffer (or more).
The problem is the delay that I want to minimize. Imagine case 1:
writer writes to buffer0 reader is reading buffer1
writer can't acquire lock1 because reader is still reading buffer1
| |
| reader finishes reading,
| (writer waiting for next period) <- **this point**
|
|
writer wakes up, and again writes to buffer0
At **this point**, other readers in theory could have read data of buffer0 if only the writer could do the swap after the reader finishes instead of waiting for its next period. What happened in this case is that just because one reader was a bit late, all readers missed one frame of data, while the problem could have been totally avoided.
Case 2 is similar:
writer writes to buffer0 reader is idle
| |
| reader finishes reading,
| (writer waiting for next period)
|
| reader starts reading buffer1
writer wakes up |
it can't acquire lock1 because reader is still reading buffer1
overwrites buffer0
I tried mixing the solutions, so the writer tries swapping buffers immediately after writing, and if not possible, just after waking up in the next period. So something like this:
Writer (case 3):
acquire lock 0
loop
    if last buffer swap failed
        acquire other lock
        free this lock
        swap buffers
    write to current buffer
    acquire other lock
    free this lock
    swap buffers
    wait for next period
Now the problem with delay still holds:
writer writes to buffer0 reader is reading buffer1
writer can't acquire lock1 because reader is still reading buffer1
| |
| reader finishes reading,
| (writer waiting for next period) <- **this point**
|
|
writer wakes up
swaps buffers
writes to buffer1
Again at **this point**, all the readers could start reading buffer0, which is a short delay after buffer0 has been written, but instead they have to wait until the next period of the writer.
The Question
The question is: how do I handle this? If I want the writer to execute precisely at the desired period, it needs to wait for the period using an RTAI function, and I can't do it like
Writer (case 4):
acquire lock 0
loop
    write to current buffer
    loop a few times or until the buffer has been swapped
        sleep a little
        acquire other lock
        free this lock
        swap buffers
    wait for next period
This introduces jitter, because the "few times" could happen to become longer than the "wait for next period", so the writer might miss the start of its period.
Just to be more clear, here's what I want to happen:
writer writes to buffer0 reader is reading buffer1
| |
| reader finishes reading,
| (writer waiting for next period) As soon as all readers finish reading,
| the buffer is swapped
| readers start reading buffer0
writer wakes up |
writes to buffer1
What I Found Already
I found read-copy-update, which as far as I understand keeps allocating memory for buffers and frees them once the readers are done with them, which is impossible for me for many reasons. First, the threads are shared between kernel and user space. Second, with RTAI you can't allocate memory in a real-time thread (because then your thread would be calling Linux's system calls and hence break the real-time behavior; not to mention that using Linux's own RCU implementation is useless for the same reasons).
I also thought about having an extra thread that at a higher frequency tries swapping buffers, but that doesn't sound like such a good idea. First, it would itself need to synchronize with the writer, and second, well I have many of these writer-readers working in different parts in parallel and one extra thread for each writer just seems too much. One thread for all writers seems very complicated regarding synchronization with each writer.
What API are you using for reader-writer locks? Do you have a timed lock, like pthread_rwlock_timedwrlock? If yes, I think it's a solution to your problem, as in the following code:
void *buf[2];

void
writer ()
{
    int lock = 0, next = 1;
    write_lock (lock);
    while (1)
    {
        abs_time tm = now() + period;
        fill (buf[lock]);
        if (timed_write_lock (next, tm))
        {
            unlock (lock);
            lock = next;
            next = (next + 1) & 1;
        }
        wait_period (tm);
    }
}

void
reader ()
{
    int lock = 0;
    while (1)
    {
        read_lock (lock);
        process (buf[lock]);
        unlock (lock);
        lock = (lock + 1) & 1;
    }
}
What happens here, is that it does not really matter for the writer whether it waits for a lock or for the next period, as long as it is sure to wake up before the next period has come. The absolute timeout ensures this.
Isn't this exactly the problem triple buffering is supposed to solve? You have three buffers; let's call them write1, write2, and read. The writer thread alternates between writing to write1 and write2, ensuring it never blocks and that the last complete frame is always available. Then in the reader threads, at some appropriate point (say, just before or after reading a frame), the read buffer is flipped with the available write buffer.
While this would ensure that writers never block (the buffer flipping can be done very quickly just by flipping two pointers, perhaps even with a CAS atomic instead of a lock), there is still the issue of readers having to wait for other readers to finish with the read buffer before flipping. I suppose this could be solved slightly RCU-esque with a pool of read buffers where an available one can be flipped.
Use a Queue (FIFO linked list)
The real-time writer will always append (enqueue) to the end of the queue
The readers will always remove (dequeue) from the beginning of the queue
The readers will block if the queue is empty
Edit: to avoid dynamic allocation, I would probably use a circular queue...
I would use the built in __sync atomic operations.
http://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/Atomic-Builtins.html#Atomic-Builtins
Circular queue (FIFO 2d array)
ex: byte[][] Array = new byte[MAX_SIZE][BUFFER_SIZE];
Start and End index pointers
Writer overwrites buffer at Array[End][]
Writer can increment Start if it ends up looping all the way around
Reader gets buffer from Array[Start][]
Reader blocks if Start == End
If you don't want the writer to wait, perhaps it shouldn't acquire a lock that anybody else might hold. I would have it perform some sort of synchronisation, though, to make sure that what it writes really is written out - typically, most synchronisation calls will cause a memory flush or barrier instruction to be executed, but the details will depend on the memory model of your cpu and the implementation of your threads package.
I would have a look to see if there is any other synchronisation primitive around that fits things better, but if push comes to shove I would have the writer lock and unlock a lock that nobody else ever uses.
Readers must then be prepared to miss things now and then, and must be able to detect when they have missed stuff. I would associate a validity flag and a long sequence count with each buffer, and have the writer do something like "clear validity flag, increment sequence count, sync, write to buffer, increment sequence count, set validity flag, sync." If the reader reads a sequence count, syncs, sees the validity flag true, reads the data out, syncs, and re-reads the same sequence count, then perhaps there is some hope that it did not get garbled data.
If you are going to do this, I would test it exhaustively. It looks plausible to me, but it might not work with your particular implementation of everything from compiler to memory model.
Another idea, or a way to check this one, is to add a checksum to your buffer and write it last of all.
See also searches on lock free algorithms such as http://www.rossbencina.com/code/lockfree
To go with this, you probably want a way for the writer to signal to sleeping readers. You might be able to use Posix semaphores for this - e.g. have the reader ask the writer to call sem_post() on a particular semaphore when it reaches a given sequence number, or when a buffer becomes valid.
Another option is to stick with locking, but ensure that readers never hang too long holding a lock. Readers can keep the time taken holding a lock short and predictable by doing nothing else while they hold that lock but copying the data from the writer's buffer. The only problem then is that a low priority reader can be interrupted by a higher priority task halfway through a write, and the cure for that is http://en.wikipedia.org/wiki/Priority_ceiling_protocol.
Given this, if the writer thread has a high priority, the worst case work to be done per buffer is for the writer thread to fill the buffer and for each reader thread to copy the data out of that buffer to another buffer. If you can afford that in each cycle, then the writer thread and some amount of reader data copying will always be completed, while readers processing the data they have copied may or may not get their work done. If they do not, they will lag behind and will notice this when they next grab a lock and look round to see which buffer they want to copy.
FWIW, my experience with reading real time code (when required to show that the bugs are there, and not in our code) is that it is incredibly and deliberately simple-minded, very clearly laid out, and not necessarily any more efficient than it needs to be to meet its deadlines, so some apparently pointless data-copying in order to get straightforward locking to work might be a good deal, if you can afford it.