How would be the correct way to prevent a soft lockup/unresponsiveness in a long running while loop in a C program?
(dmesg is reporting a soft lockup)
Pseudo code is like this:
while( worktodo ) {
worktodo = doWork();
}
My code is of course way more complex, and also includes a printf statement which gets executed once a second to report progress, but the problem is, the program ceases to respond to ctrl+c at this point.
Things I've tried which do work (but I want an alternative):
doing printf every loop iteration (don't know why, but the program becomes responsive again that way (???)) - wastes a lot of performance due to unneeded printf calls (each doWork() call does not take very long)
using sleep/usleep/... - also seems like a waste of (processing-)time to me, as the whole program will already be running several hours at full speed
What I'm thinking about is some kind of process_waiting_events() function or the like, and normal signals seem to be working fine as I can use kill on a different shell to stop the program.
Additional background info: I'm using GWAN and my code is running inside the main.c "maintenance script", which seems to be running in the main thread as far as I can tell.
Thank you very much.
P.S.: Yes I did check all other threads I found regarding soft lockups, but they all seem to ask about why soft lockups occur, while I know the why and want to have a way of preventing them.
P.P.S.: Optimizing the program (making it run shorter) is not really a solution, as I'm processing a 29GB bz2 file which extracts to about 400GB xml, at the speed of about 10-40MB per second on a single thread, so even at max speed I would be bound by I/O and still have it running for several hours.
While the posed answer using threads might possibly be an option, it would in reality just shift the problem to a different thread. My solution after all was using
sleep(0)
Also tested sched_yield / pthread_yield, both of which didn't really help. Unfortunately I've been unable to find a good resource which documents sleep(0) in linux, but for windows the documentation states that using a value of 0 lets the thread yield it's remaining part of the current cpu slice.
It turns out that sleep(0) is most probably relying on what is called timer slack in linux - an article about this can be found here: http://lwn.net/Articles/463357/
Another possibility is using nanosleep(&(struct timespec){0}, NULL) which seems to not necessarily rely on timer slack - linux man pages for nanosleep state that if the requested interval is below clock granularity, it will be rounded up to clock granularity, which on linux depends on CLOCK_MONOTONIC according to the man pages. Thus, a value of 0 nanoseconds is perfectly valid and should always work, as clock granularity can never be 0.
Hope this helps someone else as well ;)
Your scenario is not really a soft lock up, it is a process is busy doing something.
How about this pseudo code:
void workerThread()
{
while(workToDo)
{
if(threadSignalled)
break;
workToDo = DoWork()
}
}
void sighandler()
{
signal worker thread to finish
waitForWorkerThreadFinished;
}
void main()
{
InstallSignalHandler;
CreateSemaphore
StartThread;
waitForWorkerThreadFinished;
}
Clearly a timing issue. Using a signalling mechanism should remove the problem.
The use of printf solves the problem because printf accesses the console which is an expensive and time consuming process which in your case gives enough time for the worker to complete its work.
Related
System is an embedded Linux/Busybox core on a small embedded board with a web server (Boa) running.
We are seeing some high latency in responses from the web server - sometimes >500ms for no good reason, so I've been digging...
On liberally scattering debug prints throughout the code it seems to come down to the entire process just... stopping for a bit, in a way which I can only assume must be the process/thread being interrupted by another process.
Using print statements and clock_gettime() to calculate time taken to process a request, I can see the code reach the bottom of a while() loop (parsing input), print something like "Time so far: 5ms" and then the next line at the top of the loop will print "Time so far: 350ms" - and all that the code does between the bottom of the loop and the 1st print back at the top is a basic check along the lines of while(position < end), it has nothing complicated that could hold it up.
There's no IO blocking, the data it's parsing has all arrived already, and it's not making any external calls or wandering off into complex functions.
I then looked into whether the kernel scheduler (CFS in our case) might be holding things up, adding calls to clock() (processor time rather than wall-clock) and again calculating time differences Vs processor time used I can see that the wall-clock time delay may run beyond 300ms from one loop to the next, but the reported processor time taken (which seems to have a ~10ms resolution) is more like 50ms.
So, that suggests the task scheduler is holding the process up for hundreds of milliseconds at a time. I've checked the scheduler granularity and max delay and it's nowhere near 100ms, scheduler latency is set at 6ms for example.
Any advice on what I can do now to try and track down the problem - identifying processes which could hog the CPU for >100ms, measuring/tracking what the scheduler is doing, etc.?
First you should try and run your program using strace to see if there are any system calls holding things up.
If that is ambiguous or does not help I would suggest you try and profile the kernel. You could try OProfile
This will create a call graph that you can analyze and see what is happening.
I'm looking to create a state of uninterruptible sleep for a program I'm writing. Any tips or ideas about how to create this state would be helpful.
So far I've looked into the wait_event() function defined in wait.h, but was having little luck implementing it. When trying to initialize my wait queue the compiler complained
warning: parameter names (without types) in function declaration
static DECLARE_WAIT_QUEUE_HEAD(wq);
Has anyone had any experience with the wait_event() function or creating an uninterruptible sleep?
The functions that you're looking at in include/linux/wait.h are internal to the Linux kernel. They are not available to userspace.
Generally speaking, uninterruptible sleep states are considered undesirable. Under normal circumstances, they cannot be triggered by user applications except by accident (e.g, by attempting to read from a storage device that is not responding correctly, or by causing the system to swap).
You can make sleep 'signal-aware`.
sleep can be interrupted by signal. In which case the pause would be stopped and sleep would return with amount of time still left. The application can choose to handle the signal notified and if needed resume sleep for the time left.
Actually, you should use synchronization objects provided by the operating system you're working on or simply check the return value of sleep function. If it returns to a value bigger than zero, it means your procedure was interrupted. According to this return value, call sleep function again by passing the delta (T-returnVal) as argument (probably in a loop, in case of possible interrupts that might occur again in that time interval)
On the other hand, if you really want a real-uninterruptible custom sleep function, I may suggest something like the following:
void uninterruptible_sleep(long time, long factor)
{
long i, j;
__asm__("cli"); // close interrupts
for(i=0; i<time; ++i)
for(j=0; j<factor; ++j)
; // custom timer loop
__asm__("sti"); // open interrupts
}
cli and sti are x86 assembly instructions which allow us to set IF (interrupt flag) of the cpu. In this way, it is possible to clear (cli) or set (sti) all the interrupts. However, if you're working on a multi-processor system, there needs to be taken another synchronization precautions too, due to the fact that these instructions will only be valid for single microprocessor. Moreover, this type of function as I suggested above, will be very system (cpu) dependant. Because, the inner loop requires a clock-cycle count to measure an exact time interval (execution number of instructions per second) depending on the cpu frequency. Thus, if you really want to get rid of every possible interrupt, you may use a function as I suggested above. But be careful, if your program gets a deadlock situation while it's in cli state, you will need to restart your system.
(The inline assembly syntax I have written is for gcc compiler)
I am writing a Gif animator in C.
I have two threads running in parallel, both . The first allows the user to alter the speed of the animation. The second draws the current frame, and then calls Sleep(Constant * 100 / CurrentSpeed), where CurrentSpeed is a percentage amount, ranging from 1 to 200.
The problem is that if you quickly change the speed from 100%, to 1%, and then back to the first, the second thread will execute the following:
Sleep(Constant * 100)
This will draw frame A, wait many seconds (although the speed was changed by the user), and only then draw B and the following frames in the default speed.
It seems to me that Sleep is a poor choice of mine in this case. What can I do to solve this problem?
EDIT:
The code I currently have (Simplified):
while (1) {
InvalidateRect(Handle, &ImageRect, FALSE);
if (shouldDispose) {
break;
}
if (DelayTime)
Sleep(DelayTime * 100 / CurrentSpeed);
SelectNextImage();
}
Instead of calling Sleep() with the desired frame rate, why don't you call it with a constant interval of 1 ms, for example, and use a variable as a counter?
For example, let C be a global variable (counter) which is loaded with a number of 'ticks' of 1ms. Then, write the loop:
while(1) { //Main loop of the player thread
if (C > 0) C--;
if (C == 0) nextframe(); //if counter reaches 0, load next frame.
Sleep(1);
}
The control thread would load C with a number of 1ms ticks (i.e. frame rate), and the player thread will never be stopped beyond 1 ms. The use of 1ms as the base rate is arbitrary. Use the minimum time that allows you the maximum frame rate, in order to load CPU the less as possible.
EDIT
After some hot comments (arguing is good after all), I'd like to point out that this solution is sub-optimal, i.e., it doesn't use any OS mechanism for signaling threads or any other API for preventing the thread from wasting CPU time. The solution shown here is generic: it may be used in any system (even in embedded systems without any running OS. But above all, it is based on the original code posted by the user that asked the question: using Sleep(), how can I achieve my purpose. I give him my humble answer. Anyway, I encourage other people to write sample code using the appropriate API for achieving the same goal. With no hard feelings, special thanks to Martin James.
Find a synchro API on your OS that allows a wait with a timeout, eg. WaitForSingleObject() on Windows. If you want to change the delay, change the timeout and signal the event upon which the WFSO is waiting to make it return 'early' and restart the wait with the new timeout.
Polling with Sleep(1) loops is rarely justifiable.
Create a waitable timer. When you set the timer, you can specify a callback function that will run in the setting thread's context. This means you can do it with two threads, but it actually works just fine with only a single thread as well.
The main advantage of a waitable timer is, however, that it is more accurate and more reliable than Sleep. A timer is conceptually much different from Sleep insofar as Sleep only gives up control and the scheduler marks the thread as ready to run when the time is up and when the scheduler runs anyway. It doesn't do anything beyond that. Which means that the thread will eventually be scheduled to run again, like any other thread that is ready.
A thread that is waiting on a timer (or other waitable object) causes the scheduler to run when the timer is up and has its priority temporarily boosted. It therefore runs not only more reliably and more closely to the desired time, but also earlier than all other threads with the same base priority. Which does not give a realtime guarantee but at least gives a sort of "soft guarantee".
If you still want to use Sleep, use SleepEx instead which you can alert, either by queueing an APC, or by calling the undocumented NtAlertThread function.
In any case, Sleep is troublesome not only because of being unreliable, but also because it bases on the granularity of the system-wide timer. Which you can, of course, set to as low as 1ms (or less on some systems), but that will cause a lot of unnecessary interrupts.
I've written many C programs for microcontrollers but never one that runs on an OS like linux. How does linux decide how much processing time to give my application? Is there something I need to do when I have idle time to tell the OS to go do something else and come back to me later so that other processes can get time to run as well? Or does the OS just do that automatically?
Edit: Adding More Detail
My c program has a task scheduler. Some tasks run every 100ms, some run every 50 ms and so on. In my main program loop i call ProcessTasks which checks if any tasks are ready to run, if none are ready it calls an idle function. The idle function does nothing but it's there so that I could toggle a GPIO pin and monitor idle time with an O'scope... or something if I so desired. So maybe I should call sched_yield() in this idle function???
How does linux decide how much processing time to give my application
Each scheduler makes up its own mind. Some reward you for not using up your share, some roll dices trying to predict what you'll do etc. In my opinion you can just consider it magic. After we enter the loop, the scheduler magically decides our time is up etc.
Is there something I need to do when I have idle time to tell the OS
to go do something else
You might call sched_yield. I've never called it, nor do I know of any reasons why one would want to. The manual does say it could improve performance though.
Or does the OS just do that automatically
It most certainly does. That's why they call it "preemptive" multitasking.
It depends why and how you have "idle time". Any call to a blocking I/O function, waiting on a mutex or sleeping will automatically deschedule your thread and let the OS get on with something else. Only something like a busy loop would be a problem, but that shouldn't appear in your design in any case.
Your program should really only have one central "infinite loop". If there's any chance that the loop body "runs out of work", then it would be best if you could make the loop perform one of the above system functions which would make all the niceness appear automatically. For example, if your central loop is an epoll_wait and all your I/O, timers and signals are handled by epoll, call the function with a timeout of -1 to make it sleep if there's nothing to do. (By contrast, calling it with a timeout of 0 would make it busy-loop – bad!).
The other answers IMO are going into too much detail. The simple thing to do is:
while (1){
if (iHaveWorkToDo()){
doWork();
} else {
sleep(amountOfTimeToWaitBeforeNextCheck);
}
}
Note: this is the simple solution which is useful in a single-threaded application or like your case where you dont have anything to do for a specified amount of time; just to get something decent working. The other thing about this is that sleep will call whatever yield function the os prefers, so in that sense it is better than an os specific yield call.
If you want to go for high performance, you should be waiting on events.
If you have your own events it will be something like follows:
Lock *l;
ConditionVariable *cv;
while (1){
l->acquire();
if (iHaveWorkToDo()){
doWork();
} else {
cv->wait(lock);
}
l->release();
}
In a networking type situation it will be more like:
while (1){
int result = select(fd_max+1, ¤tSocketSet, NULL, NULL, NULL);
process_result();
}
I have a C code with some functions.I need to find out the execution time of each function, I have tried using gettimeofday and rdtsc,but i guess that because its multi core system the output time provided involves the switching time between the processors. I wanted it to be serialized. So if somebody can give me an idea that how should i calulate the time or at least let me know about the syntax of rdstcp.
P.S. please reply as soon as possible
Thanks :)
It's a little impractical to expect nanosecond resolution.
You can't add code just to output the execution times of functions without increasing execution time. When you take the code out, the timing changes.
In practice, this kind of measurement is made by observing the CPU timing signals on an oscilloscope (or logic analyser).
If you have multiple cores, then the CPU timer won't be stable between them. So set the thread affinity to keep it on the one core. You also might want to use a real time timer to measure the time for the process or thread using clock_gettime(CLOCK_PROCESS_CPUTIMER_ID). Read the note for SMP systems in the usage for that function.
Both of these will effect the timing of the program, so perform multiple iterations of whatever you are benchmarking, and don't call the timing functions too often to try and mitigate this.
There should be some way to set processor affinity to tell the operating system to only run that thread on a particuar core.
In windows there is a SetThreadAffinity system call, I imagine there is a similar function in linux, although I don't know what it is called.
You could boot your dual core system to use one core only using the following kernel parameter:
maxcpus=1
But the measured time will still comprise process contest switching and thus depend on the activity on the other processes on the system. Are you interested in the execution time, or the CPU time needed to execute your task ?
Mate, I'm not sure about this, but even if you're dual core,unless the program is threaded, it will only run in 1 thread (meaning 1 core), so it should not involve the time of switching between processors, I believe there is no such thing...
Pavium is correct, the only way to get decent timing at this resolution in with an oscilloscope and toggling GPIO pins.
Bear in mind that this is all a bit academic anyway: I suppose you are running with an operating system, etc, so there is no way to get a straight run at the hardware.
You really need to look at the reason you want this measurement. Is it a performance benchmark for some code? You could try running the code many thousands of times and get some statistics. For this kind of approach I would recommend you read Zed Shaws diatribe to make sure the numbers aren't fooling you.
Precise Performance Measuring was impossible until Linux kernel 2.6.31. In this kernel a new library for accessing the performacne counters of the CPU and IMHO correcting times in the scheduler was added.
Unfortunately i don't have more details but maybe it is a starting point for more information search. I'm just adding this because nobody mentioned it before
Use the struct timespec structure & clock_gettime function as follows to obtain the time of execution of the code in nanoseconds precision
struct timespec start, end;
clock_gettime(CLOCK_REALTIME,&start);
/* Do something */
clock_gettime(CLOCK_REALTIME,&end);
It returns a value as ((((unsigned64)start.tv_sec) * ((unsigned64)(1000000000L))) + ((unsigned64)(start.tv_nsec))))
Moreover this I've used for multithreaded concepts too..
Hope this answer will be more helpful for you to get your desired execution time in nanoseconds.