Swapping buffers in single-writer-multiple-reader threads


The Story
There is a writer thread that periodically gathers data from somewhere (in real time, but that doesn't matter much for the question). Many readers then read this data. The usual solution is to use two reader-writer locks and two buffers, like this:
Writer (case 1):
    acquire lock 0
    loop
        write to current buffer
        acquire other lock
        free this lock
        swap buffers
        wait for next period
Or
Writer (case 2):
    acquire lock 0
    loop
        acquire other lock
        free this lock
        swap buffers
        write to current buffer
        wait for next period
The Problem
In both methods, if the "acquire other lock" operation fails, no swap is done and the writer overwrites its previous data (because the writer is real-time, it can't wait for readers). So in this case, all readers lose that frame of data.
This is not such a big deal though, the readers are my own code and they are short, so with double buffer, this problem is solved, and if there was a problem I could make it triple buffer (or more).
The problem is the delay that I want to minimize. Imagine case 1:
writer writes to buffer0                reader is reading buffer1
writer can't acquire lock1 because reader is still reading buffer1
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period) <- **this point**
|
|
writer wakes up, and again writes to buffer0
At **this point**, other readers in theory could have read data of buffer0 if only the writer could do the swap after the reader finishes instead of waiting for its next period. What happened in this case is that just because one reader was a bit late, all readers missed one frame of data, while the problem could have been totally avoided.
Case 2 is similar:
writer writes to buffer0                reader is idle
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period)
|
|                                       reader starts reading buffer1
writer wakes up                         |
it can't acquire lock1 because reader is still reading buffer1
overwrites buffer0
I tried mixing the solutions, so the writer tries swapping buffers immediately after writing, and if not possible, just after waking up in the next period. So something like this:
Writer (case 3):
    acquire lock 0
    loop
        if last buffer swap failed
            acquire other lock
            free this lock
            swap buffers
        write to current buffer
        acquire other lock
        free this lock
        swap buffers
        wait for next period
Now the problem with delay still holds:
writer writes to buffer0                reader is reading buffer1
writer can't acquire lock1 because reader is still reading buffer1
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period) <- **this point**
|
|
writer wakes up
swaps buffers
writes to buffer1
Again at **this point**, all the readers could start reading buffer0, which is a short delay after buffer0 has been written, but instead they have to wait until the next period of the writer.
The Question
The question is, how do I handle this? If I want the writer to execute precisely at desired period, it needs to wait for the period using RTAI function and I can't do it like
Writer (case 4):
    acquire lock 0
    loop
        write to current buffer
        loop a few times or until the buffer has been swapped
            sleep a little
            acquire other lock
            free this lock
            swap buffers
        wait for next period
This introduces jitter, because the "few times" could end up taking longer than the "wait for next period", so the writer might miss the start of its period.
Just to be more clear, here's what I want to happen:
writer writes to buffer0                reader is reading buffer1
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period)      As soon as all readers finish reading,
|                                       the buffer is swapped
|                                       readers start reading buffer0
writer wakes up                         |
writes to buffer1
What I Found Already
I found read-copy-update, which as far as I understand keeps allocating memory for buffers and frees them once the readers are done with them. That is impossible for me for several reasons. First, the threads are shared between kernel and user space. Second, with RTAI you can't allocate memory in a real-time thread, because then your thread would be calling Linux's system calls and hence break its real-time behaviour. (Not to mention that using Linux's own RCU implementation is useless for the same reasons.)
I also thought about having an extra thread that at a higher frequency tries swapping buffers, but that doesn't sound like such a good idea. First, it would itself need to synchronize with the writer, and second, well I have many of these writer-readers working in different parts in parallel and one extra thread for each writer just seems too much. One thread for all writers seems very complicated regarding synchronization with each writer.

What API are you using for reader-writer locks? Do you have a timed lock, like pthread_rwlock_timedwrlock? If yes, I think it's a solution to your problem, like in the following code:
void *buf[2];
void
writer ()
{
    int lock = 0, next = 1;
    write_lock (lock);
    while (1)
    {
        abs_time tm = now() + period;
        fill (buf [lock]);
        if (timed_write_lock (next, tm))
        {
            unlock (lock);
            lock = next;
            next = (next + 1) & 1;
        }
        wait_period (tm);
    }
}
void
reader ()
{
    int lock = 0;
    while (1)
    {
        read_lock (lock);
        process (buf [lock]);
        unlock (lock);
        lock = (lock + 1) & 1;
    }
}
What happens here, is that it does not really matter for the writer whether it waits for a lock or for the next period, as long as it is sure to wake up before the next period has come. The absolute timeout ensures this.

Isn't this exactly the problem triple buffering is supposed to solve? So you have 3 buffers; let's call them write1, write2, and read. The write thread alternates between writing to write1 and write2, ensuring that it never blocks and that the last complete frame is always available. Then in the read threads, at some appropriate point (say, just before or after reading a frame), the read buffer is flipped with the available write buffer.
While this would ensure that writers never block (the buffer flipping can be done very quickly just by flipping two pointers, perhaps even with a CAS atomic instead of a lock), there is still the issue of readers having to wait for other readers to finish with the read buffer before flipping. I suppose this could be solved slightly RCU-esque with a pool of read buffers where an available one can be flipped.
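As a rough illustration of the pointer flip, here is a sketch of a lock-free triple buffer using C11 atomics. It handles one writer and one reader only, so it does not address the multiple-reader issue mentioned above. The struct layout, the DIRTY flag and all function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define FRAME_WORDS 4
#define DIRTY 0x4u   /* set in `middle` when it holds an unread frame */
#define INDEX 0x3u   /* low bits of `middle` hold a slot index        */

struct triple {
    int         frames[3][FRAME_WORDS];
    unsigned    write_idx;   /* slot owned by the writer           */
    unsigned    read_idx;    /* slot owned by the reader           */
    atomic_uint middle;      /* spare slot index, plus DIRTY flag  */
};

static void triple_init(struct triple *t)
{
    memset(t->frames, 0, sizeof t->frames);
    t->write_idx = 0;
    t->read_idx  = 1;
    atomic_init(&t->middle, 2u);
}

/* Writer: fill the private slot, then publish it by swapping with `middle`.
 * Never blocks, whatever the reader is doing. */
static void triple_publish(struct triple *t, const int *frame)
{
    memcpy(t->frames[t->write_idx], frame, sizeof t->frames[0]);
    unsigned old = atomic_exchange(&t->middle, t->write_idx | DIRTY);
    assert((old & INDEX) < 3u);
    t->write_idx = old & INDEX;
}

/* Reader: pick up the newest frame if there is one. Returns 1 if `out` is a
 * frame not seen before, 0 if it is the same frame as last time. */
static int triple_fetch(struct triple *t, int *out)
{
    int fresh = 0;

    if (atomic_load(&t->middle) & DIRTY) {
        unsigned old = atomic_exchange(&t->middle, t->read_idx);
        t->read_idx = old & INDEX;
        fresh = 1;
    }
    memcpy(out, t->frames[t->read_idx], sizeof t->frames[0]);
    return fresh;
}
```

If the writer publishes twice before the reader looks, the older frame is simply overwritten and the reader sees only the latest one.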

Use a Queue (FIFO linked list)
The real-time writer will always append (enqueue) to the end of the queue
The readers will always remove (dequeue) from the beginning of the queue
The readers will block if the queue is empty
edit to avoid dynamic allocation
I would probably use a circular queue...
I would use the built in __sync atomic operations.
http://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/Atomic-Builtins.html#Atomic-Builtins
Circular queue (FIFO 2d array)
ex: byte[][] Array = new byte[MAX_SIZE][BUFFER_SIZE];
Start and End index pointers
Writer overwrites buffer at Array[End][]
Writer can increment Start if it ends up looping all the way around
Reader gets buffer from Array[Start][]
Reader blocks if Start == End

If you don't want the writer to wait, perhaps it shouldn't acquire a lock that anybody else might hold. I would have it perform some sort of synchronisation, though, to make sure that what it writes really is written out - typically, most synchronisation calls will cause a memory flush or barrier instruction to be executed, but the details will depend on the memory model of your cpu and the implementation of your threads package.
I would have a look to see if there is any other synchronisation primitive around that fits things better, but if push comes to shove I would have the writer lock and unlock a lock that nobody else ever uses.
Readers must then be prepared to miss things now and then, and must be able to detect when they have missed stuff. I would associate a validity flag and a long sequence count with each buffer, and have the writer do something like "clear validity flag, increment sequence count, sync, write to buffer, increment sequence count, set validity flag, sync." If the reader reads a sequence count, syncs, sees the validity flag true, reads the data out, syncs, and re-reads the same sequence count, then perhaps there is some hope that it did not get garbled data.
If you are going to do this, I would test it exhaustively. It looks plausible to me, but it might not work with your particular implementation of everything from compiler to memory model.
Another idea, or a way to check this one, is to add a checksum to your buffer and write it last of all.
See also searches on lock free algorithms such as http://www.rossbencina.com/code/lockfree
To go with this, you probably want a way for the writer to signal to sleeping readers. You might be able to use Posix semaphores for this - e.g. have the reader ask the writer to call sem_post() on a particular semaphore when it reaches a given sequence number, or when a buffer becomes valid.
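The sequence-count scheme described above can be sketched with C11 atomics roughly as follows. This version folds the validity flag into the counter (an odd count means a write is in progress), which is a simplification of what the answer describes, and the type and function names are invented for the example. It assumes a single writer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define FRAME_WORDS 4

struct seq_frame {
    atomic_uint seq;                 /* odd while a write is in progress   */
    atomic_int  data[FRAME_WORDS];   /* payload, accessed with relaxed ops */
};

/* Writer side: never blocks and never takes a lock. */
static void seq_write(struct seq_frame *f, const int *src)
{
    unsigned s = atomic_load_explicit(&f->seq, memory_order_relaxed);

    assert((s & 1u) == 0);           /* single writer: no write in flight */
    atomic_store_explicit(&f->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    for (int i = 0; i < FRAME_WORDS; i++)
        atomic_store_explicit(&f->data[i], src[i], memory_order_relaxed);
    atomic_store_explicit(&f->seq, s + 2, memory_order_release);
}

/* Reader side: returns true only if `dst` holds one consistent frame.
 * On false the caller should retry or skip this frame. */
static bool seq_try_read(struct seq_frame *f, int *dst)
{
    unsigned before = atomic_load_explicit(&f->seq, memory_order_acquire);

    if (before & 1u)
        return false;                /* writer is in the middle of an update */
    for (int i = 0; i < FRAME_WORDS; i++)
        dst[i] = atomic_load_explicit(&f->data[i], memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    unsigned after = atomic_load_explicit(&f->seq, memory_order_relaxed);
    return before == after;
}
```

A reader that gets false has lost nothing but time; it can retry immediately or wait to be signalled that a new frame is valid.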

Another option is to stick with locking, but ensure that readers never hang too long holding a lock. Readers can keep the time taken holding a lock short and predictable by doing nothing else while they hold that lock but copying the data from the writer's buffer. The only problem then is that a low priority reader can be interrupted by a higher priority task halfway through a write, and the cure for that is http://en.wikipedia.org/wiki/Priority_ceiling_protocol.
Given this, if the writer thread has a high priority, the worst case work to be done per buffer is for the writer thread to fill the buffer and for each reader thread to copy the data out of that buffer to another buffer. If you can afford that in each cycle, then the writer thread and some amount of reader data copying will always be completed, while readers processing the data they have copied may or may not get their work done. If they do not, they will lag behind and will notice this when they next grab a lock and look round to see which buffer they want to copy.
FWIW, my experience with reading real time code (when required to show that the bugs are there, and not in our code) is that it is incredibly and deliberately simple-minded, very clearly laid out, and not necessarily any more efficient than it needs to be to meet its deadlines, so some apparently pointless data-copying in order to get straightforward locking to work might be a good deal, if you can afford it.

Related

The readers/writers problem using semaphore

I'm a beginner in C and Multithreading programming.
My textbook describes a readers-writers problem that favors readers: it requires that no reader be kept waiting unless a writer has already been granted permission to use the object. In other words, no reader should wait simply because a writer is waiting. Below is the code,
where
void P(sem_t *s); /* Wrapper function for sem_wait */
void V(sem_t *s); /* Wrapper function for sem_post */
and
The w semaphore controls access to the critical sections that access the shared object. The mutex semaphore protects access to the shared readcnt variable, which counts the number of readers currently in the critical section.
I don't quite understand what the textbook means. It seems to me that if there is a reader, then the writer won't be able to update the shared object. But my textbook says 'no reader should wait simply because a writer is waiting', yet when a writer is writing, it just locks w, which doesn't do anything to stop readers except for the first reader?

Let's say there are four threads: T1, T2, T3, T4.
T1 is the writer and T2, T3, T4 are readers.
Now let's assume that T1 gets scheduled first and locks the semaphore w, and once done it releases w.
After this, let's assume T2 gets scheduled. It locks the semaphore mutex, increments the global variable readcnt and, since it is the first reader, locks the semaphore w as well. It then releases the semaphore mutex and enters the critical section.
Now if T3 gets scheduled, it acquires the semaphore mutex, increments the global variable readcnt, releases the semaphore mutex and then enters the critical section.
Now if T1 gets scheduled again, it cannot acquire w since it is held by the reader thread T2. T1 cannot acquire w until the last reader thread exits. If the writer T1 is waiting and at the same time T4 gets scheduled, T4 will execute, as it does not need to lock w.
So the textbook says no reader should wait simply because a writer is waiting.
But as Leonard said in the other answer, if the writer is already writing, you can't interrupt that process. So T2, T3, T4 have to wait for T1 to release w.
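The textbook listing is not reproduced in the question, so the following is only a sketch of the classic readers-preferred scheme that the walkthrough above describes. It calls sem_wait and sem_post directly in place of the P and V wrappers, and the function names are made up for illustration.

```c
#include <assert.h>
#include <semaphore.h>

static int   readcnt;   /* number of readers currently in the critical section */
static sem_t mutex;     /* protects readcnt                                    */
static sem_t w;         /* held by one writer, or by the readers as a group    */

static void rw_init(void)
{
    int rc;

    readcnt = 0;
    rc = sem_init(&mutex, 0, 1);
    assert(rc == 0);
    rc = sem_init(&w, 0, 1);
    assert(rc == 0);
    (void)rc;
}

static void reader_enter(void)
{
    sem_wait(&mutex);
    if (++readcnt == 1)   /* first reader in locks out writers */
        sem_wait(&w);
    sem_post(&mutex);
}

static void reader_exit(void)
{
    sem_wait(&mutex);
    if (--readcnt == 0)   /* last reader out lets writers back in */
        sem_post(&w);
    sem_post(&mutex);
}

static void writer_enter(void) { sem_wait(&w); }
static void writer_exit(void)  { sem_post(&w); }
```

Only the first reader ever touches w, so later readers such as T4 get in while a writer is merely waiting. If a writer already holds w, the first reader blocks on w while holding mutex, and every other reader queues up behind it on mutex.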

As in the first comment made, if the writer is already writing, you can't interrupt that process. The case you need to deal with is if something is happening (reading or writing) and then a writer request arrives, and then a reader request arrives while the original lock is being held, you need to ensure that the reader request gets serviced first. This is a tiny bit complex, but you're trying to learn, so thinking about how to do this is really the assignment. Go for it! Try some code. If you are truly stuck, ask again.

Race condition while trying to use "Readers Writer Lock"

I'm working on a project using pthreads and i made my own implementation of Readers Writer Lock which has the following methods:
Lock for reader (several can read simultaneously)
Lock for writer (only one can write)
Unlock (for both reader/writer)
I've tested it and it works well, my problem is more logical,
in my program i want several threads to do some testing on numbers in a specific range, and if a number was found that answers my criteria i want them to add them to a shared resource which is an array.
If the number is already found by another thread and exists in the array, continue searching.
Here is a pseudo code for my algorithm:
X = lowest number to search, X' = highest number to search,
func = the test for the number, ARR = the array shared between the threads,
RWL_R = Lock for reader, RWL_W Lock for writer, RWL_U = Unlock.
FOR each NUM from X to X' DO:
    IF func(NUM) = true DO:
        FOR each cell of ARR DO:
            RWL_R // lock for reader since i'm reading from ARR
            IF ARR[cell] contains NUM DO:
                RWL_U // unlock since no longer reading from ARR
                skip to the next iteration of the first FOR loop
            ENDIF
        END FOR
        RWL_U // unlock since no longer reading from ARR
        ////// PROBLEM HERE ////////
        RWL_W // lock to write to array since for loop ended without finding the same NUM
        ARR[cell] <- NUM
        RWL_U // unlock since finished writing into array
    END IF
END FOR
As you can see, the logic is fine except for that little line I marked with the ugly caps "PROBLEM HERE". Inside that little gap between the reader unlock and the writer lock, a race condition might (and does) occur.
So i have 2 possible results:
(The Good)
Thread_A finds number N, locks the array for reading.
Thread_B finds the same number N, waiting to check the array but it's currently locked by Thread_A.
Thread_A finishes going through the array and the number N is not there, unlocks the lock and locks it again as writer, adding N to the array, unlocks the lock and finishes its job.
Thread_B can now check the array, number N is there so it skips to number N2 and the rest works how it's supposed to.
(The bad)
Thread_A finds number N, locks the array for reading.
Thread_B finds the same number N, waiting to check the array but it's currently locked by Thread_A.
Thread_A finishes going through the array and the number N is not there, unlocks the lock.
Thread_B takes over the lock, locking it as reader, checks the array and number N is still not there (Thread_A hasn't added it yet).
Thread_B unlocks the lock.
Either Thread_A or Thread_B now locks the lock for writing, adding number N, unlocking the lock and finishes.
The thread that waited now locks the lock, adds the same number N, unlocks and finishes.
So i'm now trying to find the best logical way to fix the issue, i can only think about locking as a writer when checking the array and not unlocking it until finishing writing, or to create a method that switches "atomically" from reader lock to writer lock, but that's kind of "Cheating" and not using the "Readers Writer Lock" mechanism as it's supposed to be used.
What's a better logical way to use it here?

The two options you give are suboptimal in most scenarios, since one prevents multiple readers from checking at the same time (and presumably, after a while, readers are much more common than writers), while the other is impossible; even if you switch "atomically" from reader lock to writer lock, two different readers could both determine a value X is not present, both request an upgrade to write lock, the first one writes X then unlocks, while the second waits its turn then writes X again and you end up violating the uniqueness invariant.
The real answer depends on the common usage patterns:
1. If writing is more likely to happen than reading in general (concurrent reading is uncommon), then you just drop the reader/writer lock and use a simple mutex. Reader/writer locks often have overhead that isn't worth it if you aren't regularly using concurrent readers, and even if the overhead is trivial (sometimes it is; Windows' SRW Locks used solely in "exclusive" (writer) mode where recursive acquisition isn't needed is as fast or faster than critical sections), if the program logic becomes more complicated, or you need to constantly switch from read to write locks and back, the additional logic and acquisition/release overhead can cost more than just a single lock and unlock with exclusive access.
2. If you have other frequently used code which reads but never writes, and readers that might need to write are less common and regularly have to write, then use the solution you suggested; keep the reader/writer lock, but have the "might write" code lock for write from the beginning; it eliminates concurrency in the "might write" code path, but "only read" users can operate concurrently.
3. If reading is much more common than writing (which is the usual use case for something like this if the code provided is the only code that accesses the array; otherwise your array will grow out of control), then perform a double check after "upgrading" to write lock; perform the same scan a second time after upgrading to a write lock to make sure no one grabbed the write lock and added the missing value, then write if the value is still missing. This can be optimized to avoid rescanning if the array only has new values added, with existing values never changing or being deleted. You'd just remember where you left off checking, and scan any new entries added between your original scan and when you acquired the write lock.
The pseudo-code for #3 would look like:
FOR each NUM from X to X' DO:
    IF func(NUM) = true DO:
        RWL_R // lock for reader outside inner loop since repeatedly reading from ARR
        cell = 0
        WHILE cell < ARR.length DO:
            IF ARR[cell] contains NUM DO:
                RWL_U // unlock since no longer reading from ARR
                skip to the next iteration of the first FOR loop
            ENDIF
            cell += 1
        END WHILE
        RWL_U // unlock since no longer reading from ARR
        RWL_W // lock to write to array since for loop ended without finding the same NUM
        // NEW!!! Check newly added cells
        WHILE cell < ARR.length DO:
            IF ARR[cell] contains NUM DO:
                RWL_U // unlock since no longer reading from ARR
                skip to the next iteration of the first FOR loop
            ENDIF
            cell += 1
        END WHILE
        // If we got here, not in newly added cells either, so add to end
        ARR[cell] <- NUM
        RWL_U // unlock since finished writing into array
    END IF
END FOR

> i can only think about locking as a writer when checking the array and not unlocking it until finishing writing
That would certainly be viable, but it would prevent any concurrency of the array scans. If those consume a significant portion of the program's run time, then making the scans concurrent is highly desirable.
> or to create a method that switches "atomically" from reader lock to writer lock, but that's kind of "Cheating" and not using the "Readers Writer Lock" mechanism as it's supposed to be used.
That's not viable, because you cannot promote a read lock to a write lock while more than one thread holds the read lock. A write lock must exclude not just other writers but also all readers. You'll end up with deadlocks in the case you describe, because two or more threads holding the read lock need to promote it to a write lock, but that cannot happen for any of them until all of the others release the lock.
In any event, even if you allowed a write lock to coexist with read locks, that would not reliably prevent two threads considering the same number both scanning the array to its current end, not seeing the number, and, in turn, appending it to the array.
If you want to provide for concurrent array scans but prevent duplicates being added to your array then you need at least a bit more communication between threads.
One relatively simple approach would be to implement a simple transactional system. Your reader / writer lock would support promoting a read lock to a write lock while there are multiple readers, but only for reader locks obtained since the last time a writer lock was released. The thread successfully performing such a promotion could then be confident that the data were not modified while it was reading them, and it could therefore safely update the array. Promotion failure would be analogous to failure to commit a transaction -- the thread experiencing such a failure would need to rescan from the beginning.
That will work reasonably well if there are comparatively many appearances of each number, for then the likelihood of rescans will decline as the cost of rescans increases.
You could also consider more sophisticated mechanisms whereby threads that acquire a write lock somehow inform readers what value they are writing, so that any readers scanning for that value can abort early. That will be trickier to write, however, and although it sounds efficient, it might actually be slower under some circumstances as a result of requiring more synchronization among threads.

Instead of a single readers/writer lock, use two locks: one for reading and one for writing. Each thread would acquire the read-lock and scan the array. If the array should be modified, then the thread would acquire the write-lock and re-scan the array to ensure that the number hasn't already been added.
Here's the pseudo-code:
acquire_read_lock();
if (!is_value_in_array(value)) {
    acquire_write_lock();
    if (!is_value_in_array(value)) {
        add_value_to_array(value);
    }
    release_write_lock();
}
release_read_lock();

Unlocking a mutex after calling trylock()

I have a threaded server that can add/append/read files and relay data to the client.
If a file is being added, no other thread can append/read it. If a file is being appended, no threads can append/read it. If a file is being read, no other thread can append to it. However, if a file is being read, other threads can read it.
Currently I have a mutex system that will do this, except it won't allow multiple reads.
To fix this, in the read method, I will change:
pthread_mutex_lock(&(fm->mutex));//LOCK
//do some things
...
pthread_mutex_unlock(&(fm->mutex));
to
pthread_mutex_trylock(&(fm->mutex));//TRYLOCK [NonBlocking, so the thread can continue the read]
//do some things
...
pthread_mutex_unlock(&(fm->mutex));
Question
How can I unlock the file without allowing the other methods (just append really) to begin writing to the file before all the other read()'s have finished?
Example
For example, if the reading thread that originally locked the file completes and unlocks the file and there are still other threads trying to read the file, then an appending thread gets the chance to lock the file and begin appending while the others are still reading, which is a no-no.
Idea
I want to keep a count of the number of threads currently reading a file. When a thread finishes, reduce the count. If the count is 0, meaning no threads are still reading, unlock the file. But, I'm worried that this would not be thread safe. If this is a viable solution, how could I make it thread safe? Another but, I believe only the original thread can successfully unlock the mutex.

It sounds like you may be looking for a read-write lock, which is provided by pthreads. It allows two modes of locking: a shared/read-lock mode, which can be locked by multiple threads at once, and an exclusive/write-lock mode, where the lock call won't return until all other threads (readers and writers) have given up their hold on the lock.

You could use a semaphore instead of the mutex (see this link about the differences). The semaphore does thread-safe synchronized counting for you.
You can live without an additional mutex to lock the file for writing if you limit the number of simultaneous read accesses to a (sufficiently large) number N and require a writer to acquire the semaphore N times (that is, take all N units) for write access. This way you can only gain write access if the number of readers is zero, and all other readers will be locked out until your writer has finished.

Note that the POSIX documentation for pthread_mutex_lock() says:
If successful, the pthread_mutex_lock(), pthread_mutex_trylock(), and pthread_mutex_unlock() functions shall return zero; otherwise, an error number shall be returned to indicate the error.
Since you don't show your code testing the return values, you don't know whether your lock operations (in particular) succeeded or not.
Separately, since you want a read/write lock, why not use one:
pthread_rwlock_rdlock()
pthread_rwlock_wrlock()
pthread_rwlock_unlock()
pthread_rwlock_init()
pthread_rwlock_destroy()
There are four pthread_rwlockattr_*() functions and a total of 9 pthread_rwlock_*() functions; I only listed the most important functions in the family.

Implementing pipe using shared memory & semaphores

I'm trying to implement a pipe using shared memory and semaphores (it may be that I need signals also, to complete my implementation)
I encountered the algorithmic problem of how to set the semaphores right.
Lets say I already allocated a piece of shared memory for the pipe buffer,
and a piece of shared memory for the pipe's info (such as how many bytes there are in the pipe, etc...)
1. I want to create mutual exclusion (only one reader/writer using the pipe at once)
2. If a reader wants to read from an empty pipe, I should block it until a writer writes something
3. Same as '2', but for a writer who writes to a full pipe
I tried to search for an answer but I didn't find any even though it seems like a common exercise...
I'm aware of a solution called "Bounded buffer problem" or "consumer producer problem"
which is implemented like this:
There are 3 semaphores:
mutex - initialized to 1
full - initialized to 0
empty - initialized to n (whilst n is the number of, lets say "bytes" I have in the pipe)
Consumer's code:
wait(full)
wait(mutex)
remove a byte from the pipe
signal(mutex)
signal(empty)
Producer's code:
wait(empty)
wait(mutex)
add a byte to the pipe
signal(mutex)
signal(full)
The problem with this solution (as a solution to my problem) is that at any given time, only one byte is read from the pipe or written into it.
In my problem (implementing a pipe), I don't know for sure how many bytes a writer will write. If it wants to write 'n' bytes, it will write all of them only if there is room in the pipe, and if not, it will write fewer than 'n' bytes...
That means that a writer must check how much free space there is in the pipe before writing into it. This is a problem, because the writer will touch a critical section (the pipe's information) without mutual exclusion.
So I thought about putting this part inside the critical section, but then - if a writer wants to write and the pipe is full - how can I let only one reader inside, and then letting the writer to write more?
I've got confused...
Any help will be appreciated, Thanks!

There is no need to have so many mutexes or lock them for that amount of time. In single producer/consumer scenario, the producer never needs to worry about the free space reducing (it is the only one that can use up that space), and similarly for the consumer. Therefore your pseudocode should be:
Producer
    while (lock_and_get_free_space() < bytes_to_write)
        wait()
    unlock()
    write(bytes_to_write)
    lock_and_update_free_space()
Consumer
    while (lock_and_get_data() < bytes_to_read)
        wait()
    unlock()
    read(bytes_to_read)
    lock_and_update_free_space()
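One way to get the partial-write behaviour the question asks for is a mutex plus a condition variable, since pthread_cond_wait releases the mutex while it sleeps and so lets the other side in. The sketch below is a variation on the pseudocode above, not a transcription of it: it copies while holding the lock, and it works between threads of one process (for real shared memory the mutex and condition variable would additionally need the PTHREAD_PROCESS_SHARED attribute). All names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define PIPE_CAP 8

struct pipe_buf {
    unsigned char   data[PIPE_CAP];
    size_t          head;      /* index of the next byte to read  */
    size_t          used;      /* number of bytes currently stored */
    pthread_mutex_t lock;
    pthread_cond_t  changed;   /* signalled whenever `used` changes */
};

static struct pipe_buf the_pipe = {
    .lock    = PTHREAD_MUTEX_INITIALIZER,
    .changed = PTHREAD_COND_INITIALIZER,
};

/* Write up to n bytes; blocks only while the pipe is completely full.
 * Returns the number of bytes actually written (possibly fewer than n). */
static size_t pipe_write(struct pipe_buf *p, const unsigned char *src, size_t n)
{
    if (n == 0)
        return 0;
    pthread_mutex_lock(&p->lock);
    while (p->used == PIPE_CAP)
        pthread_cond_wait(&p->changed, &p->lock);   /* unlocks while waiting */
    assert(p->used < PIPE_CAP);
    if (n > PIPE_CAP - p->used)
        n = PIPE_CAP - p->used;
    for (size_t i = 0; i < n; i++)
        p->data[(p->head + p->used + i) % PIPE_CAP] = src[i];
    p->used += n;
    pthread_cond_broadcast(&p->changed);
    pthread_mutex_unlock(&p->lock);
    return n;
}

/* Read up to n bytes; blocks only while the pipe is empty.
 * Returns the number of bytes actually read. */
static size_t pipe_read(struct pipe_buf *p, unsigned char *dst, size_t n)
{
    if (n == 0)
        return 0;
    pthread_mutex_lock(&p->lock);
    while (p->used == 0)
        pthread_cond_wait(&p->changed, &p->lock);
    if (n > p->used)
        n = p->used;
    for (size_t i = 0; i < n; i++)
        dst[i] = p->data[(p->head + i) % PIPE_CAP];
    p->head = (p->head + n) % PIPE_CAP;
    p->used -= n;
    pthread_cond_broadcast(&p->changed);
    pthread_mutex_unlock(&p->lock);
    return n;
}
```

Because the free-space check and the copy happen under the same lock, a writer that finds less room than it wanted simply writes what fits and reports the short count, as a real pipe does.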

about synchronization with using multiple semaphores

Hi there. I'm working on an assignment about using POSIX threads with multiple semaphores. A brief explanation of the assignment: there are 4 kinds of data packets (char/video/audio/image), each of them carried by a different thread, and we also have a shared buffer. The maximum number of threads that can work in the system is supplied by the user as input; for example, if the user enters 10, then at most 10 threads can be created to transmit data packets over the buffer at a given time. Now the confusing part for me is that this buffer can contain only a limited number of packets of each type at any instant (for example, it can contain at most 10 char packets and 20 video packets, etc.), so we have to have a different semaphore for each data type. I know how to control the buffer size with a semaphore, which is very simple, but I can't work out the right way to use the per-packet-type semaphores. Whatever methods I tried, I always ran into deadlocks. Here is my pseudocode, to make my program clearer.
define struct package
define semaphore list
main
    initialize variables and semaphores
    while threadCounter is less than MaxThreadNumber
        switch(random)
            case 0: create a character package
                    create a thread to insert the package in buffer
            case 1: create a video package
                    create a thread to insert the package in buffer
            case 2: create an image package
                    create a thread to insert the package in buffer
            case 3: create an audio package
                    create a thread to insert the package in buffer
        increment threadCounter by one
    end of while
    create only one thread which will make the dequeue operation
end of main
producer function
    for i->0 to size_of_package
        sem_wait(empty_buffer) // decrement empty_buffer semaphore by size of package
    lock_mutex
    insert item into queue
    decrement counter of the buffer by size of package
    unlock_mutex
    for i->0 to size_of_package
        sem_post(full_buffer) // increment full_buffer semaphore by size of package
end of producer function
consumer function
    while TRUE // Loops forever
        lock_mutex
        if queue is not empty
            dequeue
            increment counter of the buffer by size of package
        unlock_mutex
        for i->0 to size_of_package // The reason I do the sem_wait here is that I can't do the dequeue outside the mutex.
            sem_wait(full_buffer)
        for i->0 to size_of_package
            sem_post(empty_buffer)
end of consumer function
With this implementation the program works correctly, but I couldn't properly use the semaphores that belong to the packet types. I'm open to every recommendation and will appreciate any answer.

This is not how semaphores are used. The buffer's control variables/structures should count how many messages are contained in the buffer and of what types. The mutex protects the buffer and its control variables/structures against concurrent access by different threads. A semaphore, if used, just signals the state of the buffer to the consumer and has no connection to the sizes of the packets; it certainly doesn't get incremented by the size of the packet!
You would be better advised to use pthread condition variables instead of semaphores. These are used in connection with the pthread mutex to guarantee race-free signalling between threads. The producer loop does this:
locks the mutex,
modifies the buffer etc to add new packet(s),
signals the condition variable, and
unlocks the mutex.
The consumer loop does this:
locks the mutex,
processes all buffered data,
waits for the condition variable.
Read up on pthread_cond_init, pthread_cond_signal and pthread_cond_wait.
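A minimal sketch of the condition-variable approach with per-type limits might look like this. It tracks only packet counts rather than a real queue, the limits for audio and image are placeholders (the question only gives 10 and 20 as examples), and every identifier is made up for illustration.

```c
#include <assert.h>
#include <pthread.h>

enum pkt_type { PKT_CHAR, PKT_VIDEO, PKT_AUDIO, PKT_IMAGE, PKT_TYPES };

/* Hypothetical limits: 10 char and 20 video as in the question,
 * audio and image values are placeholders. */
static const int limit[PKT_TYPES] = { 10, 20, 5, 5 };

static int count[PKT_TYPES];   /* packets of each type currently buffered */
static int total;              /* sum of all counts                        */
static pthread_mutex_t lock      = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;

/* Producer: blocks while the buffer already holds limit[t] packets of type t. */
static void put_packet(enum pkt_type t)
{
    pthread_mutex_lock(&lock);
    while (count[t] == limit[t])
        pthread_cond_wait(&not_full, &lock);
    count[t]++;
    total++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

/* Consumer: blocks while the buffer is empty, then removes one packet of
 * the first type that has anything buffered. */
static enum pkt_type get_packet(void)
{
    int t = 0;

    pthread_mutex_lock(&lock);
    while (total == 0)
        pthread_cond_wait(&not_empty, &lock);
    while (t < PKT_TYPES && count[t] == 0)
        t++;
    assert(t < PKT_TYPES);     /* total > 0 implies some count is nonzero */
    count[t]--;
    total--;
    pthread_cond_broadcast(&not_full);   /* producers of any type may wait */
    pthread_mutex_unlock(&lock);
    return (enum pkt_type)t;
}
```

Each wait sits in a while loop that re-tests its condition under the mutex, which is what makes the signalling race-free even with spurious wakeups.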

Since it's an assignment, you probably don't need to have real packets data read and write, but just simulate their handling.
In that case, the problem boils down to how to effectively block the producer threads when they reach the limit of packets they can write in the buffer. At the moment, you are using the semaphore to count the individual elements of a packet written in the buffer, as far as I understand.
Imagine that your writes in the buffer are atomic, and that you just want to count the packets, not the packet elements. Each time a producer writes a packet, it must signal it to the consumer, with the appropriate semaphore, and each time the consumer reads a packet, it must signal it to the appropriate producer.
Let me highlight a few other points:
The important property of a semaphore is that it will block when it reaches zero. For instance, if its initial value is 10, after 10 successive sem_wait calls, the 11th will block.
You have 4 types of packets, each with a different threshold on the number that can be written in the buffer.
As I said, the producer must signal that it wrote a packet, but it must also be stopped once it reaches the threshold. To achieve that, you make it acquire the semaphore with sem_wait each time it posts a new packet, and you have the consumer do a sem_post each time it reads a packet, the reverse of what you did with your single-semaphore version. However, since you want the producer to stop at the threshold, you initialize the semaphore with a capacity of N - 1, N being the threshold. Note that you have to signal that a new packet is available after you wrote it in the buffer, otherwise the consumer might block the buffer.
producer<type> function
    write_packet() // put the packet in the buffer
    sem_wait(type) // signal a new packet is available
    // (if there's not enough space for another packet, the producer will block here)
end producer<type> function
consumer function
    while TRUE // Loops forever
        switch packet_available() // look if there's a new packet available
            case video:
                read_packet<video>()
                sem_post(video)
            (...)
            default: // no packet available, just wait a little
                sleep()
        end switch
    end while
You still need to define the read_packet, write_packet, and packet_available functions, probably using a mutex to limit access to the buffer.
