Forcing communication between threads - C

I am a bit of a novice with pthreads, and I was hoping someone could help with a problem I've been having. Say you have a collection of threads, all being passed the same function, which looks something like this:
void *func(void *args) {
    ...
    while (...) {
        ...
        switch (...)
        {
        case one:
            /* do stuff */
            break;
        case two:
            /* do other stuff */
            break;
        case three:
            /* do more stuff */
            break;
        }
        ...
    }
}
In my situation, if "case one" is triggered by ANY of the threads, I need for all of the threads to exit the switch and return to the start of the while loop. That said, none of the threads are ever waiting for a certain condition. If it happens that only "case two" and "case three" are triggered as each thread runs through the while loop, the threads continue to run independently without any interference from each other.
Since the above is so vague, I should probably add some context. I am working on a game server that handles multiple clients via threads. The function above corresponds to the game code, and the cases are various moves a player can make. The game has a global and a local component -- the first case corresponds to the global component. If any of the players choose case (move) one, it affects the game board for all of the players. In between the start of the while loop and the switch is the code that visually updates the game board for a player. In a two player game, if one player chooses move one, the second player will not be able to see this move until he/she makes a move, and this impacts the gameplay. I need the global part of the board to dynamically update.
Anyway, I apologize if this question is trivial, but some preliminary searching on the internet didn't produce anything valuable. It may just be that I need to change the whole structure of the code, but I'm kind of clinging to this because it's so close to working.

You need an atomic variable that acts as a counter for the switch-case.
Atomic variables are guaranteed to perform math operations atomically, which is what multithreaded environments require.
Initialize the atomic variable to 1 in the main/dispatcher thread and pass it via args or make it global. It is declared as:
volatile LONG counter = 1;
On Windows, use InterlockedExchangeAdd; its return value is the previous value. (InterlockedAdd also exists, but it returns the new value, not the previous one.)
Each thread does:
LONG val = InterlockedExchangeAdd(&counter, 1);
switch (val)
...
For GCC, __sync_fetch_and_add likewise returns the previous value:
LONG val = __sync_fetch_and_add(&counter, 1);
switch (val)
...
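If you can use C11, <stdatomic.h> offers a portable alternative to those platform-specific calls. Below is a minimal sketch of how the question's loop could detect the global move: each thread snapshots a generation counter at the top of its while loop, "case one" bumps it, and every thread restarts its loop once its snapshot goes stale. get_move() and the comments are illustrative placeholders; also note this only detects the change at loop boundaries, it does not interrupt a thread that is blocked waiting for input.
#include <stdatomic.h>

int get_move(void);   /* hypothetical: reads this player's next move */

/* bumped whenever any thread plays the global move (case one);
   the name is illustrative */
atomic_int board_generation = 0;

void *func(void *args) {
    while (1 /* game not over */) {
        /* snapshot the generation right before redrawing the board */
        int seen = atomic_load(&board_generation);

        /* ... visually update the game board here ... */

        switch (get_move()) {
        case 1:
            /* global move: bump the generation so every thread
               notices and returns to the start of its while loop */
            atomic_fetch_add(&board_generation, 1);
            continue;
        case 2:
            /* local move */
            break;
        case 3:
            /* local move */
            break;
        }

        /* another thread played the global move in the meantime:
           restart the loop to redraw before accepting the next move */
        if (atomic_load(&board_generation) != seen)
            continue;
    }
    return NULL;
}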

May I also suggest that you inform yourself about standard task synchronization schemes first?
You might get an idea of how to change your program structure to be more flexible.
Task synchronization basics (binary semaphores, mutexes):
http://www.chibios.org/dokuwiki/doku.php?id=chibios:articles:semaphores_mutexes
(If you are interested in InterProcessCommunication (IPC) in more detail, like message passing, queues, ... just ask!)
Furthermore, I recommend reading about the implementation of state machines, which could help make your player code more flexible! A minimal sketch follows the links below.
A bit complex (I know easier resources only in German - maybe a native speaker can help):
http://johnsantic.com/comp/state.html
Is there a typical state machine implementation pattern?
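A minimal sketch of the table-driven pattern those links describe (the states, events, and table contents here are illustrative, not taken from your game):
#include <stdio.h>

/* illustrative states and events */
typedef enum { STATE_IDLE, STATE_RUNNING, NUM_STATES } state_t;
typedef enum { EVENT_START, EVENT_STOP, NUM_EVENTS } event_t;

/* transition[s][e] is the next state when event e arrives in state s */
static const state_t transition[NUM_STATES][NUM_EVENTS] = {
    /*             EVENT_START    EVENT_STOP */
    /* IDLE    */ { STATE_RUNNING, STATE_IDLE },
    /* RUNNING */ { STATE_RUNNING, STATE_IDLE },
};

int main(void) {
    state_t state = STATE_IDLE;
    event_t inputs[] = { EVENT_START, EVENT_STOP };

    for (int i = 0; i < 2; i++) {
        state = transition[state][inputs[i]];
        printf("after event %d -> state %d\n", inputs[i], state);
    }
    return 0;
}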
If you want to stick with what you have, a global variable that can be changed and read by any other task will do.
Regards, Florian

Related

Implement Threading in C

I'm currently working on a mini game project, and I have a problem with this specific mechanic:
void monsterMove() {
    /* monster moves randomly */
}

void playerMove() {
    /* accept input as player movement using W, A, S, D */
}
However, the project requires the monsters to keep moving at all times, even when the player is not moving.
After some research, I figured out that multithreading is needed to implement this mechanic, since both monsterMove() and playerMove() need to run concurrently even when playerMove() hasn't received any input from the user.
Specifically there are 2 questions I want to address:
Which function needs to be made as a thread?
How to build the thread?
Sadly, I found no resource on specifically this type of question, because everything on the Internet just seems to point out how multithreading can provide parallelism, but not how to use it in implementations like this.
I was thinking that monsterMove() would run recursively while playerMove() would be made into a thread, since monsterMove() will need to run every n seconds even when the playerMove() thread is not finished yet (no input yet). Although I might be wrong for the most part.
Thank you in advance!
(P.S. Just to avoid misunderstandings, I am specifically asking about how threads and multithreading work, not about the logic of the mechanic.)
Edit: Program is now working! However, any answer/code related to how this program is done with multithreading is immensely appreciated. :)
You don't need multithreading for this:
The basic structure is depicted in this sketch (kbhit and getch come from <conio.h> on Windows/DOS-style compilers):
while (1)
{
    if (kbhit())              /* check if input is available without blocking */
    {
        int c = getch();      /* read user input */
        /* move player depending on user input */
    }
    /* move monsters */
}
You could use multithreading with the CreateThread function, but your code would just become overly complex.
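That said, since the edit above asks how a multithreaded version would look, here is a minimal sketch using POSIX threads (the Windows equivalent would use CreateThread). The monster gets its own thread so it keeps moving while the main thread blocks on input; getchar() is a stand-in for your real W/A/S/D handling, and the printed messages are placeholders.
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* 1 while the game is running; volatile so the monster thread
   re-reads it on every iteration */
static volatile int gameRunning = 1;

/* monster thread: moves on its own schedule, independent of input */
static void *monsterThread(void *arg) {
    while (gameRunning) {
        /* monster moves randomly here */
        puts("monster moved");
        sleep(1);             /* move every n seconds */
    }
    return NULL;
}

int main(void) {
    pthread_t monster;
    pthread_create(&monster, NULL, monsterThread, NULL);

    /* main thread handles player input; getchar() blocks,
       but the monster keeps moving in its own thread */
    int c;
    while ((c = getchar()) != EOF && c != 'q') {
        /* move player depending on W, A, S, D in c */
    }

    gameRunning = 0;          /* tell the monster thread to stop */
    pthread_join(monster, NULL);
    return 0;
}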

Where does finite-state machine code belong in µC?

I asked this question on the EE forum. You guys on StackOverflow know more about coding than we do on EE, so maybe you can give more detailed information about this :)
When I learned about microcontrollers, teachers taught me to always end the code with while(1); with no code inside that loop.
This was to be sure that the software gets "stuck" so that interrupts keep working. When I asked them if it was possible to put some code in this infinite loop, they told me it was a bad idea. Knowing that, I now try my best to keep this loop empty.
I now need to implement a finite state machine in a microcontroller. At first view, it seems that such code belongs in this loop. That makes coding easier.
Is that a good idea? What are the pros and cons?
This is what I plan to do :
void main(void)
{
    // init phase
    while (1)
    {
        switch (current_State)
        {
        case 1:
            if (...)
            {
                current_State = 2;
            }
            else if (...)
            {
                current_State = 3;
            }
            else
                current_State = 4;
            break;
        case 2:
            if (...)
            {
                current_State = 3;
            }
            else if (...)
            {
                current_State = 1;
            }
            else
                current_State = 5;
            break;
        }
    }
}
Instead of:
void main(void)
{
    // init phase
    while (1);
}
and manage the FSM with interrupts.
It is like saying to return from all functions in one place, or other such habits. There is one type of design where you might want to do this, one that is purely interrupt/event based. There are products that go completely the other way, polled and not event driven. And anything in between.
What matters is doing your system engineering; that's it, end of story. Interrupts add complication and risk; they have a higher price than not using them. Automatically making any design interrupt driven is automatically a bad decision; it simply means there was no effort put into the design, the requirements, the risks, etc.
Ideally you want most of your code in the main loop; you want your interrupts lean and mean in order to keep the latency down for other time-critical tasks. Not all MCUs have a complicated interrupt priority system that would allow you to burn a lot of time or have all of your application in handlers. These considerations feed into your system engineering and may help choose the MCU, but here again you are adding risk.
You have to ask yourself what the tasks are that your MCU has to do; what latency, if any, there is for each task from when an event happens until it has to start responding and until it has to finish; and, per event/task, what portion of it, if any, can be deferred. Can a task be interrupted while it is running? Can there be a gap in time? All the questions you would ask for a hardware design, or a CPLD or FPGA design, except that there you have real parallelism.
What you are likely to end up with in real-world solutions is some portion in interrupt handlers and some portion in the main (infinite) loop, with the main loop polling breadcrumbs left by the interrupts and/or directly polling status registers to know what to do during the loop. If/when you get to where you need to be real time, you can still use the main super loop; your real-time response comes from the possible paths through the loop and the worst-case time for any of those paths.
Most of the time you are not going to need to do this much work. Maybe some interrupts, maybe some polling, and a main loop doing some percentage of the work.
As you should know from the EE world, if a teacher or anyone else says there is one and only one way to do something and everything else is by definition wrong... time to find a new teacher, or pretend to drink the kool-aid, pass the class, and move on with your life. Also note that the classroom experience is not the real world. There are so many things that can go wrong with MCU development that in class you are really in a controlled sandbox, with ideally only a few variables you can play with, so that you don't have to spend years to get through a few-month class. Some percentage of the rules they state in class are there to get you (or the teacher) through the class; it's easier to grade papers if you tell folks a function can't be bigger than X, or no gotos, or whatever. The first thing you should do when the class is over, or add to your lifetime bucket list, is to question all of these rules. Research and try things on your own, fall into the traps, and dig your way out.
When doing embedded programming, one commonly used idiom is to use a "super loop" - an infinite loop that begins after initialization is complete and that dispatches the separate components of your program as they need to run. Under this paradigm, you could run the finite state machine within the super loop as you're suggesting, and continue to run the hardware management functions from the interrupt context, as it sounds like you're already doing.
One of the disadvantages of doing this is that your processor will always be in a high power draw state - since you're always running that loop, the processor can never go to sleep. This would actually also be a problem in any of the code you had written, however - even an empty infinite while loop will keep the processor running. The solution to this is usually to end your while loop with a series of instructions to put the processor into a low power state (completely architecture dependent) that will wake it when an interrupt comes through to be processed.
If there are things happening in the FSM that are not driven by any interrupts, a normally used approach to keep the processor waking up at periodic intervals is to initialize a timer to interrupt on a regular basis so that your main loop continues execution.
One other thing to note, if you were previously executing all of your code from the interrupt context - interrupt service routines (ISRs) really should be as short as possible, because they literally "interrupt" the main execution of the program, which may cause unintended side effects if they take too long. A normal way to handle this is to have handlers in your super loop that are just signalled to by the ISR, so that the bulk of whatever processing that needs to be done is done in the main context when there is time, rather than interrupting a potentially time critical section of your main context.
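To make that concrete, here is a minimal sketch of the flag-plus-low-power-wait pattern described above, assuming a CMSIS-style Cortex-M environment (SysTick_Handler is the conventional CMSIS handler name; __WFI() is the CMSIS sleep intrinsic and is left commented out so the sketch also compiles off-target):
#include <stdint.h>

/* breadcrumb left by the ISR for the main loop; volatile because it
   is written in interrupt context and read in the main context */
static volatile uint8_t tick_pending;

/* keep the ISR lean: just record that the event happened */
void SysTick_Handler(void) {
    tick_pending = 1;
}

int main(void) {
    /* init phase: clocks, peripherals, SysTick period, ... */
    while (1) {
        if (tick_pending) {
            tick_pending = 0;
            /* run one step of the state machine here */
        }
        /* sleep until the next interrupt to cut power draw: */
        /* __WFI(); */
    }
}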
Which approach you implement is your choice, and it affects how easily you can debug your code.
There are times when it is right to use the while(1); statement at the end of the code, if your µC handles everything in interrupts (ISRs). In other applications the µC runs with code inside an infinite loop (called the polling method):
while (1)
{
    // code here;
}
And in yet other applications, you might mix the ISR method with the polling method.
Regarding debugging ease: using only the ISR method (putting the while(1); statement at the end) will give you a hard time debugging your code, since when an interrupt event triggers, the debugger of your choice will not give you step-by-step event and register reading to follow. Also, please note that writing completely ISR-driven code is not recommended, since ISR handlers should do minimal work (such as incrementing a counter or raising/clearing a flag, etc.) and exit swiftly.
It belongs in one thread that executes it in response to input messages from a producer-consumer queue. All the interrupts etc. feed input into the queue, and the thread processes the messages through its FSM serially.
It's the only way I've found to avoid undebuggable messes whilst retaining the low latency and efficient CPU use of interrupt-driven I/O.
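A minimal sketch of such a queue, assuming a single-core MCU where one ISR produces and one FSM thread consumes (on multi-core hardware the volatile indices would need to be replaced with real atomics or memory barriers):
#include <stdint.h>

#define QLEN 16u   /* ring size; one slot is sacrificed to tell full from empty */

static volatile uint8_t q[QLEN];
static volatile unsigned q_head;   /* written only by the ISR (producer) */
static volatile unsigned q_tail;   /* written only by the FSM thread (consumer) */

/* called from interrupt context: enqueue one input message */
static void q_put(uint8_t msg) {
    unsigned next = (q_head + 1u) % QLEN;
    if (next != q_tail) {          /* drop the message if the queue is full */
        q[q_head] = msg;
        q_head = next;
    }
}

/* called from the FSM thread: dequeue one message, or -1 if empty */
static int q_get(void) {
    if (q_tail == q_head)
        return -1;
    uint8_t msg = q[q_tail];
    q_tail = (q_tail + 1u) % QLEN;
    return msg;
}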
'while(1);' UGH!

How to prevent Linux soft lockup/unresponsiveness in C without sleep

How would be the correct way to prevent a soft lockup/unresponsiveness in a long running while loop in a C program?
(dmesg is reporting a soft lockup)
Pseudo code is like this:
while (worktodo) {
    worktodo = doWork();
}
My code is of course way more complex, and also includes a printf statement which gets executed once a second to report progress, but the problem is that the program ceases to respond to Ctrl+C at this point.
Things I've tried which do work (but I want an alternative):
doing printf every loop iteration (don't know why, but the program becomes responsive again that way (???)) - wastes a lot of performance due to unneeded printf calls (each doWork() call does not take very long)
using sleep/usleep/... - also seems like a waste of (processing-)time to me, as the whole program will already be running several hours at full speed
What I'm thinking about is some kind of process_waiting_events() function or the like, and normal signals seem to be working fine, as I can use kill from a different shell to stop the program.
Additional background info: I'm using GWAN and my code is running inside the main.c "maintenance script", which seems to be running in the main thread as far as I can tell.
Thank you very much.
P.S.: Yes I did check all other threads I found regarding soft lockups, but they all seem to ask about why soft lockups occur, while I know the why and want to have a way of preventing them.
P.P.S.: Optimizing the program (making it run shorter) is not really a solution, as I'm processing a 29GB bz2 file which extracts to about 400GB xml, at the speed of about 10-40MB per second on a single thread, so even at max speed I would be bound by I/O and still have it running for several hours.
While the posted answer using threads might possibly be an option, in reality it would just shift the problem to a different thread. My solution after all was using
sleep(0)
I also tested sched_yield / pthread_yield, neither of which really helped. Unfortunately I've been unable to find a good resource which documents sleep(0) on Linux, but for Windows the documentation states that using a value of 0 lets the thread yield its remaining part of the current CPU slice.
It turns out that sleep(0) most probably relies on what is called timer slack in Linux - an article about this can be found here: http://lwn.net/Articles/463357/
Another possibility is using nanosleep(&(struct timespec){0}, NULL), which seems not to depend on timer slack - the Linux man pages for nanosleep state that if the requested interval is below clock granularity, it will be rounded up to clock granularity, which on Linux depends on CLOCK_MONOTONIC according to the man pages. Thus, a value of 0 nanoseconds is perfectly valid and should always work, as clock granularity can never be 0.
Hope this helps someone else as well ;)
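For reference, the loop from the question with that call dropped in (worktodo and doWork are the question's own placeholders):
#include <time.h>

while (worktodo) {
    worktodo = doWork();
    /* zero-length sleep: a cheap scheduling point that lets signal
       delivery and other tasks get a turn without a measurable delay */
    nanosleep(&(struct timespec){0}, NULL);
}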
Your scenario is not really a soft lockup; it is a process that is busy doing something.
How about this structure, sketched with pthreads (doWork is your function):
#include <pthread.h>
#include <signal.h>

static volatile sig_atomic_t threadSignalled = 0;
static int workToDo = 1;

static void *workerThread(void *arg)
{
    while (workToDo)
    {
        if (threadSignalled)
            break;                 /* asked to finish early */
        workToDo = doWork();
    }
    return NULL;
}

static void sighandler(int sig)
{
    threadSignalled = 1;           /* signal worker thread to finish */
}

int main(void)
{
    pthread_t worker;
    signal(SIGINT, sighandler);    /* install signal handler */
    pthread_create(&worker, NULL, workerThread, NULL);
    pthread_join(&worker, NULL);   /* wait for worker thread to finish */
    return 0;
}
Clearly a timing issue. Using a signalling mechanism should remove the problem.
The use of printf helps because printf accesses the console, which is an expensive and time-consuming operation; in your case it gives the system enough time to handle other events, such as your Ctrl+C.

What should a C program do in idle time when running on Linux?

I've written many C programs for microcontrollers, but never one that runs on an OS like Linux. How does Linux decide how much processing time to give my application? Is there something I need to do when I have idle time, to tell the OS to go do something else and come back to me later so that other processes can get time to run as well? Or does the OS just do that automatically?
Edit: Adding More Detail
My C program has a task scheduler. Some tasks run every 100 ms, some every 50 ms, and so on. In my main program loop I call ProcessTasks, which checks if any tasks are ready to run; if none are ready, it calls an idle function. The idle function does nothing, but it's there so that I could toggle a GPIO pin and monitor idle time with an o'scope... or something, if I so desired. So maybe I should call sched_yield() in this idle function???
How does linux decide how much processing time to give my application
Each scheduler makes up its own mind. Some reward you for not using up your share, some roll dice trying to predict what you'll do, etc. In my opinion you can just consider it magic: after we enter the loop, the scheduler magically decides our time is up, and so on.
Is there something I need to do when I have idle time to tell the OS
to go do something else
You might call sched_yield. I've never called it, nor do I know of any reason why one would want to. The manual does say it could improve performance, though.
Or does the OS just do that automatically
It most certainly does. That's why they call it "preemptive" multitasking.
It depends why and how you have "idle time". Any call to a blocking I/O function, waiting on a mutex or sleeping will automatically deschedule your thread and let the OS get on with something else. Only something like a busy loop would be a problem, but that shouldn't appear in your design in any case.
Your program should really only have one central "infinite loop". If there's any chance that the loop body "runs out of work", then it would be best if you could make the loop perform one of the above system functions which would make all the niceness appear automatically. For example, if your central loop is an epoll_wait and all your I/O, timers and signals are handled by epoll, call the function with a timeout of -1 to make it sleep if there's nothing to do. (By contrast, calling it with a timeout of 0 would make it busy-loop – bad!).
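For example, a minimal epoll skeleton on Linux (stdin is registered here just as a stand-in for whatever fds your program actually watches):
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

int main(void) {
    int epfd = epoll_create1(0);
    if (epfd < 0) { perror("epoll_create1"); return 1; }

    /* register stdin for readability; sockets, timerfds and
       signalfds are added the same way */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = STDIN_FILENO };
    epoll_ctl(epfd, EPOLL_CTL_ADD, STDIN_FILENO, &ev);

    struct epoll_event events[8];
    while (1) {
        /* timeout -1: sleep using no CPU until there is work */
        int n = epoll_wait(epfd, events, 8, -1);
        for (int i = 0; i < n; i++) {
            /* handle events[i].data.fd here */
        }
    }
}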
The other answers IMO are going into too much detail. The simple thing to do is:
while (1) {
    if (iHaveWorkToDo()) {
        doWork();
    } else {
        sleep(amountOfTimeToWaitBeforeNextCheck);
    }
}
Note: this is the simple solution, useful in a single-threaded application, or in a case like yours where you don't have anything to do for a specified amount of time; it's enough to get something decent working. The other thing about this is that sleep will invoke whatever yield mechanism the OS prefers, so in that sense it is better than an OS-specific yield call.
If you want to go for high performance, you should be waiting on events.
If you have your own events, it will look something like the following (shown here with pthreads):
pthread_mutex_t l = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cv = PTHREAD_COND_INITIALIZER;

while (1) {
    pthread_mutex_lock(&l);
    if (iHaveWorkToDo()) {
        doWork();
    } else {
        /* atomically releases the mutex while waiting and reacquires
           it before returning; the loop re-checks the condition,
           which also handles spurious wakeups */
        pthread_cond_wait(&cv, &l);
    }
    pthread_mutex_unlock(&l);
}
In a networking type situation it will be more like:
while (1) {
    /* note: select() modifies the fd_set, so currentSocketSet must be
       re-initialized from a master copy on each iteration */
    int result = select(fd_max + 1, &currentSocketSet, NULL, NULL, NULL);
    process_result();
}

Concurrent variable access in C

I have a fairly specific question about concurrent programming in C. I have done a fair bit of research on this but have seen several conflicting answers, so I'm hoping for some clarification. I have a program that's something like the following (sorry for the longish code block):
typedef struct {
    pthread_mutex_t mutex;
    /* some shared data */
    int eventCounter;
} SharedData;

SharedData globalSharedData;

typedef struct {
    /* details unimportant */
} NewData;

void newData(NewData data) {
    int localCopyOfCounter;
    if (/* information contained in new data triggers an
           event */) {
        pthread_mutex_lock(&globalSharedData.mutex);
        localCopyOfCounter = ++globalSharedData.eventCounter;
        pthread_mutex_unlock(&globalSharedData.mutex);
    }
    else {
        return;
    }
    /* Perform long running computation. */
    if (localCopyOfCounter != globalSharedData.eventCounter) {
        /* A new event has happened, old information is stale and
           the current computation can be aborted. */
        return;
    }
    /* Perform another long running computation whose results
       depend on the previous one. */
    if (localCopyOfCounter != globalSharedData.eventCounter) {
        /* Another check for a new event that causes information
           to be stale. */
        return;
    }
    /* Final stage of computation whose results depend on the two
       previous stages. */
}
There is a pool of threads servicing the connection for incoming data, so multiple instances of newData can be running at the same time. In a multi-processor environment there are two problems I'm aware of in getting the counter handling part of this code correct: preventing the compiler from caching the shared counter copy in a register so other threads can't see it, and forcing the CPU to write the store of the counter value to memory in a timely fashion so other threads can see it. I would prefer not to use a synchronization call around the counter checks because a partial read of the counter value is acceptable (it will produce a value different than the local copy, which should be adequate to conclude that an event has occurred). Would it be sufficient to declare the eventCounter field in SharedData to be volatile, or do I need to do something else here? Also is there a better way to handle this?
Unfortunately, the C standard says very little about concurrency. However, most compilers (gcc and msvc, anyway) will not cache a volatile variable in a register: it will be reloaded from memory on every access. (Note that volatile by itself does not guarantee acquire/release ordering across threads; it only prevents the register caching you are worried about.) That is what you want here; your code as it is now may end up comparing values cached in registers. I wouldn't even be surprised if both comparisons were optimized out.
So the answer is yes: make eventCounter volatile. Alternatively, if you don't want to restrict your compiler too much, you can use the following function to perform reads of eventCounter.
int load_acquire(volatile int * counter) { return *counter; }
if (localCopy != load_acquire(&sharedCopy))
// ...
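If a C11 compiler is available, <stdatomic.h> expresses this portably; a minimal sketch, reusing the question's names with the counter made atomic instead of volatile:
#include <stdatomic.h>

atomic_int eventCounter;   /* would live inside SharedData */

/* a genuine load-acquire, portable across compilers */
int load_acquire_c11(atomic_int *counter) {
    return atomic_load_explicit(counter, memory_order_acquire);
}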
preventing the compiler from caching the local counter copy in a register so other threads can't see it
Your local counter copy is "local": created on the execution stack and visible only to the running thread. Every other thread runs on a different stack and has its own local counter variable (no concurrency there).
Your global counter should be declared volatile to avoid register optimization.
You can also use hand-coded assembly or compiler intrinsics, which can guarantee atomic checks against your mutex; they can also atomically ++ and -- your counter.
volatile is useless these days, for the most part; you should look at memory barriers, which are another low-level CPU facility for dealing with multi-core contention.
However, the best advice I can give would be for you to read up on the various managed and native multi-core support libraries. I guess some of the older ones like OpenMP or MPI (message based) are still kicking and people will go on about how cool they are... however, for most developers, something like Intel's TBB or Microsoft's newer APIs are a better fit. I also just dug up a Code Project article whose author is apparently using cmpxchg8b, which is the low-level hardware route I mentioned initially...
Good luck.
