I'm writing a Linux daemon in C which gets values from an ADC over an SPI interface (ioctl). The SPI interface (spidev, userland) seems to be a bit unstable and freezes the daemon at random times.
I need better control over the calls to the functions that fetch the values. I was thinking of running them in a thread that I could wait on: if it finishes, I get the return value; if it times out, I assume it froze and kill it, without this new thread taking down the daemon itself. Then I could apply measures like resetting the ADC before restarting. Is this possible?
Pseudo example of what I want to achieve:
(function int get_adc_value(int adc_channel, float *value) )
pid = thread( get_adc_value(1,&value) ); //makes thread calling the function
wait_until_finish(pid, timeout); //waits until function finishes/times out
if(timeout) kill pid, start over //if thread does not return in given time, kill it (it is frozen)
else if return value sane, continue //if successful, handle return variable value and continue
Thanks for any input on the matter, examples highly appreciated!
I would try the pthreads library. I have used it for some of my C projects with good success, and it gives you pretty good control over what is running and when.
A pretty good tutorial can be found here:
http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html
GLib also offers a way to check on threads, using GCond (look for it in the GLib help).
In short, the child thread should periodically signal a GCond, and the main thread should check it with g_cond_timed_wait. The approach is the same with GLib or pthreads.
Here is an example with pthreads:
http://koders.com/c/fidA03D565734AE2AD9F5B42AFC740B9C17D75A33E3.aspx?s=%22pthread_cond_timedwait%22#L46
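In case the link goes stale, here is a minimal sketch of the same idea with pthread_cond_timedwait. The names struct job, worker and run_with_timeout are made up for illustration, and the nanosleep call is only a stand-in for the real get_adc_value() call. Note that this only detects the timeout; it does not kill a stuck thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical state shared between the waiting thread and the worker. */
struct job {
    pthread_mutex_t lock;
    pthread_cond_t  finished;
    int             done;     /* set to 1 by the worker when it completes */
    long            work_ms;  /* stand-in for how long the ADC read takes */
};

static void *worker(void *arg)
{
    struct job *j = arg;
    struct timespec ts = { .tv_sec  = j->work_ms / 1000,
                           .tv_nsec = (j->work_ms % 1000) * 1000000L };

    nanosleep(&ts, NULL);               /* placeholder for get_adc_value() */

    pthread_mutex_lock(&j->lock);
    j->done = 1;
    pthread_cond_signal(&j->finished);
    pthread_mutex_unlock(&j->lock);
    return NULL;
}

/* Returns 0 if the worker signalled in time, ETIMEDOUT if it did not. */
static int run_with_timeout(long work_ms, long timeout_ms)
{
    struct job j;
    struct timespec deadline;
    pthread_t tid;
    int rc = 0;

    pthread_mutex_init(&j.lock, NULL);
    pthread_cond_init(&j.finished, NULL);
    j.done = 0;
    j.work_ms = work_ms;

    /* pthread_cond_timedwait takes an absolute time, not a duration. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    if (pthread_create(&tid, NULL, worker, &j) != 0)
        return -1;

    pthread_mutex_lock(&j.lock);
    while (!j.done && rc == 0)          /* loop guards against spurious wakeups */
        rc = pthread_cond_timedwait(&j.finished, &j.lock, &deadline);
    if (j.done)
        rc = 0;
    pthread_mutex_unlock(&j.lock);

    /* The demo worker always returns, so joining is safe here. A worker that
       is stuck inside the kernel would never return, and this join would
       block as well; real code needs a different recovery path. */
    pthread_join(tid, NULL);
    pthread_cond_destroy(&j.finished);
    pthread_mutex_destroy(&j.lock);
    return rc;
}
```

For example, run_with_timeout(10, 500) should report success, while run_with_timeout(300, 50) should report ETIMEDOUT.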
I'd recommend a different approach.
Write a program that takes samples and writes them to standard output. It simply needs to call alarm(TIMEOUT); before every sample collection, and should it hang, the program will be terminated automatically.
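A rough sketch of what alarm() buys you, assuming the sample is taken in a child process so the effect can be observed from outside. sample_ok, sample_hang and sample_with_alarm are hypothetical names; a real sampler would do its SPI read where take_sample() is called.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical sample routines: one returns promptly, one hangs forever. */
static int sample_ok(void)   { return 0; }
static int sample_hang(void) { for (;;) pause(); }

/*
 * Runs take_sample() in a child process guarded by alarm(timeout_s).
 * Returns 1 if the child was killed by SIGALRM (it hung), 0 if it exited
 * normally, and -1 on any other outcome.
 */
static int sample_with_alarm(int (*take_sample)(void), unsigned timeout_s)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGALRM, SIG_DFL);   /* default action: terminate the process */
        alarm(timeout_s);           /* arm the timer before sampling */
        int rc = take_sample();
        alarm(0);                   /* sample finished: disarm the timer */
        _exit(rc);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGALRM)
        return 1;
    return WIFEXITED(status) ? 0 : -1;
}
```

Keep in mind that SIGALRM cannot end a process that is stuck in uninterruptible sleep inside the kernel; the sketch only covers hangs that a signal can break.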
Write another program that runs that first program. If it exits, it runs it again. It looks something like this:
main(){for(;;){system("sampler");sleep(1);}}
Then in your other program, use FILE*fp=popen("supervise_sampler","r"); and read the samples from fp. Better still: Have the program simply read the samples from stdin and insist users start your program like this:
(while true;do sampler;sleep 1; done)|program
Splitting up the task like this makes it easier to develop and easier to test. For example, you can collect samples, save them to a file, and then run your program on that file:
sampler > data
program < data
Then, as you make changes to program, you can simply run it again on the same data over and over again.
It's also trivial to enable data logging, so should you find a serious issue you can run all your data through your program again to find the bugs.
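The consuming side can stay very small. This sketch assumes the sampler prints one floating-point value per line, which is my assumption and not something fixed above; consume_samples is a made-up name.

```c
#include <stdio.h>

/*
 * Reads one floating-point sample per line from `in`, skipping lines that
 * do not parse. Stores the sum in *sum and returns the number of valid
 * samples. Works the same on stdin, a popen() stream, or a saved data file.
 */
static size_t consume_samples(FILE *in, double *sum)
{
    char line[128];
    size_t count = 0;
    double value;

    *sum = 0.0;
    while (fgets(line, sizeof line, in) != NULL) {
        if (sscanf(line, "%lf", &value) == 1) {
            *sum += value;
            count++;
        }
    }
    return count;
}
```

Calling consume_samples(stdin, &sum) covers both the pipe and the "program < data" case, because the function never needs to know where the stream comes from.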
Something very interesting can happen to a thread when it executes an ioctl(): if the driver blocks inside the kernel, the thread goes into a very special kind of sleep known as disk sleep (uninterruptible sleep, shown as state D), where it cannot be interrupted or killed until the call returns. This is by design and prevents the kernel from rotting from the inside out.
If your daemon is getting stuck in ioctl(), it's conceivable that it may stay that way forever (at least until the ADC is reset).
I'd advise dropping something, like a file with a timestamp prior to calling ioctl() on a known buggy interface. If your thread does not unlink that file in xx amount of seconds, something else needs to re-start the ADC.
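A sketch of that marker-file idea, under the assumption that the timestamp is stored as text inside the file. MARKER_PATH and the three function names are hypothetical, and the current time is passed in so the watchdog logic is easy to exercise.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical marker path; any location both parties can reach would do. */
#define MARKER_PATH "/tmp/adc_call.marker"

/* Called just before the risky ioctl(): record when the call started. */
static int marker_set(time_t now)
{
    FILE *f = fopen(MARKER_PATH, "w");
    if (f == NULL)
        return -1;
    fprintf(f, "%lld\n", (long long)now);
    return fclose(f);
}

/* Called right after the ioctl() has returned. */
static void marker_clear(void)
{
    unlink(MARKER_PATH);
}

/* Watchdog side: 1 if a marker exists and is older than max_age_s, else 0. */
static int call_is_stuck(time_t now, long max_age_s)
{
    long long started;
    int parsed;
    FILE *f = fopen(MARKER_PATH, "r");

    if (f == NULL)
        return 0;                       /* no marker: no call in progress */
    parsed = fscanf(f, "%lld", &started) == 1;
    fclose(f);
    return parsed && (long long)now - started > max_age_s;
}
```

The watchdog (a second thread or, more robustly, a separate process) would call call_is_stuck(time(NULL), limit) periodically and reset the ADC when it returns 1.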
I also agree with the use of pthreads. If you need example code, just update your question.
I wrote a CPU intensive program in C to run on Windows. In the main loop I check for a keyboard press to allow you to interrupt execution in order to pause the program. The idea is to release the thread to other processes if the program is slowing down the computer too much. After a keyboard press I wait for more keyboard input using fgets(), which allows you to restart the program later. This does reduce the CPU usage shown in task manager quite well. But I was wondering if there is perhaps a more explicit way to tell the operating system that this process doesn't need any attention for a while in order to reduce the overhead while idle to the absolute minimum.
My understanding is that the operating system periodically lets a process run and then stops running it after a certain amount of time. It then checks the rest of the processes in the same way until it comes back to this one again. If it has enough to do the process will run for the maximum allowed time. Otherwise, it will stop early and return control to the operating system. So a function like fgets must immediately return control if there is no keyboard input, which is why the process runs at near 0% CPU. So I guess another way of asking my question is how do I deliberately return control to the operating system in my own code.
my question is how do I deliberately return control to the operating system in my own code
You can use either Sleep(0) or SwitchToThread(). Both pass control back to the OS and might cause the calling thread to give up the remaining time slice but the devil is in the detail.
Sleep(0)
If no other thread with a matching priority is ready to run, the call returns immediately. Otherwise, the thread gives up its remaining time slice.
You can work around the priority issue by using SwitchToThread or Sleep(1). The disadvantage of the latter is that the thread gives up its time slice unconditionally, whether or not other threads are ready to run.
SwitchToThread()
If no other thread, irrespective of its priority, is ready to run on the thread's current processor, the call returns immediately. Otherwise, the thread gives up its remaining time slice for at most one time slice.
Alternatively, you could change the priority of the process (SetPriorityClass() with PROCESS_MODE_BACKGROUND_BEGIN) or thread (SetThreadPriority() with THREAD_MODE_BACKGROUND_BEGIN) so that the OS can take care of prioritizing more important processes/threads for you. In your scenario, doing so would be a better fit. The scheduler will respond to sudden CPU demand without any additional work on your end.
You can do it in pretty much two ways. Either read the input using a blocking function, like fgets, or read the input using a non-blocking function. In the second situation you would need to incorporate a timeout of some sort. Some functions do this for you, like select. Otherwise you need to regularly sleep your process or thread.
Effectively the system is using interrupts to determine which processes care about a specific event.
I'm trying to get some values displayed on an eInk display (via SPI). I already wrote the software to initialize the display and show the values passed as command-line arguments. The problem is that, because of the eInk technology, it takes a few seconds for the display to be fully updated, so the display program also runs for that long.
The other ("Master"-) program collects the values and does other stuff. It has a main loop, which has to be cycled through at least 10x/second.
So I want to start the displaying program from within the main loop and immediately continue with the loop.
When using system() or execl(), the Master-program either waits till the display program is finished or exits into the new process.
Is there a way to just start other programs out of other ones without any further connection between them? It should run on Linux.
Might fork() be a solution?
quick and dirty way: use system with a background suffix (&)
char cmd[200];
snprintf(cmd, sizeof cmd, "%.190s &", "your_command"); /* truncate the command to 190 chars, then append & */
system(cmd);
Note that it's not portable because it depends on the underlying shell. For Windows you would do:
snprintf(cmd, sizeof cmd, "start %.190s", "your_command");
The main drawback of the quick & dirty solution is that it's "fire & forget". If the program fails to execute properly, you'll still have a 0 return code as long as the shell could launch the process.
A more robust method (which also lets you handle the return code of the process) is slightly more complex: run the system() call from a thread or from a forked process. The quick & dirty solution does a fork + exec of a shell command behind the scenes.
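On Linux, the fork + exec route could look roughly like this. spawn_background and poll_child are names I made up; the main loop would call poll_child once per cycle so it never blocks and no zombie is left behind.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts argv[0] in a child process and returns at once with its pid (-1 on error). */
static pid_t spawn_background(char *const argv[])
{
    pid_t pid = fork();

    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }
    return pid;
}

/*
 * Non-blocking check on a child started above.
 * Returns 1 and stores the exit code once the child has finished,
 * 0 while it is still running, and -1 on error.
 */
static int poll_child(pid_t pid, int *exit_code)
{
    int status;
    pid_t r = waitpid(pid, &status, WNOHANG);

    if (r == 0)
        return 0;
    if (r != pid)
        return -1;
    *exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return 1;
}
```

Unlike the "&" trick, the exit code of the display program is available here, at the cost of having to reap the child yourself.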
I'm working on a filesystem project based on FUSE, and I want to add some sort of read-ahead to it. I created a thread to process such tasks, but it made the program much slower than I expected. (Even adding an idle thread makes my program much slower than without it. That did not happen when I added the same thread to my server program, which does not use FUSE.)
I did not simply use the fuse_main function. Instead, I read the sshfs code and tried to initialize it myself with the following functions:
fuse_parse_cmdline
fuse_mount
fcntl
fuse_new
fuse_daemonize
fuse_set_signal_handlers
fuse_loop_mt
Without the thread it runs pretty well, but after I add this thread:
pthread_create(&tid, NULL, test, NULL); // function test is just a while(1){}
it gets slower (reading a 100M file takes 40s without this thread, and nearly 100s with it).
Does this have something to do with schedparam, or is it something else?
I hope you can give me some advice on what I need to check.
Thanks again.
Your thread is busy waiting, which means it will use as much CPU power as it can. You might want to add a little delay in your thread to let other threads and processes run too:
while (1)
{
    usleep(1000); /* Sleep for one millisecond */
}
Is there a way to create a timer (say, to 10 seconds) on a different thread?
I mean, I know how to use CreateThread() and I know how to create/use timers. The problem I have is that the new thread cannot receive a callback function.
For those who will inevitably ask "why do you want to do this?", the answer is that I have to do it this way. It is part of a bigger program that can't use callback functions at this specific part of the code. That's all.
Is there any way to achieve this?
code is appreciated.
Thanks!
EDIT:
A better explanation of the problem:
My application consists of two separate programs: the main program (visible, the interface for the user) and another doing the hard work in the background (sort of like a daemon).
The background process needs to finish writing to the DB and close a lot of little files before exiting.
The main application sends a "we're done" message to that background process. Upon receiving this, the background process returns the current status and exits.
Now, I need to add the following: upon receiving the message, it returns a status and triggers a timer that waits X amount of time on another thread; in the meantime the background process closes all the DB connections and files. If the timer reaches 0 and the background process is still alive, the timer thread terminates it. If the background process has closed all the DB connections and files, then the thread (and timer) will die before reaching 0, as the application terminates normally.
Is this better?
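So, you need a watchdog inside the DB process (I misread again, didn't I). A ThreadProc like this will probably suffice, since all threads terminate when the main thread terminates:
DWORD WINAPI TerminateAfter10s(LPVOID param) {
    (void)param;      /* unused */
    Sleep(10000);     /* wait 10 seconds */
    ExitProcess(0);   /* force the process down if it is still alive */
    return 0;         /* not reached; keeps the compiler quiet */
}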
If you use the multimedia timer function timeSetEvent, it can be configured to pulse an event rather than use the normal callback. Does that satisfy the requirement?
I'm more interested in knowing why you have this requirement to avoid the use of a callback. Callbacks would seem to be entirely appropriate to use in a worker thread.
Possible Duplicate:
Linux API to list running processes?
How can I detect hung processes in Linux using C?
Under Linux, the way to do this is by examining the contents of /proc/[PID]/*. A good one-stop location would be /proc/*/status. Its first two lines are:
Name: [program name]
State: R (running)
Of course, detecting hung processes is an entirely separate issue.
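Reading that State line from C takes only a few lines. This is a sketch; proc_state is a hypothetical helper name, and it returns just the one-letter code (R, S, D, Z, ...).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns the one-letter state code from /proc/[PID]/status,
 * or 0 if the file cannot be read or has no State line.
 */
static char proc_state(pid_t pid)
{
    char path[64];
    char line[256];
    char state = 0;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return 0;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Matches e.g. "State:\tR (running)" and keeps only the letter. */
        if (sscanf(line, "State: %c", &state) == 1)
            break;
    }
    fclose(f);
    return state;
}
```

A process asking about itself, as in proc_state(getpid()), sees R, because it is running while it reads the file. A D that persists across many checks is the uninterruptible-sleep case.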
/proc/[PID]/stat is a more machine-readable format of the same info as /proc/[PID]/status, and is, in fact, what the ps(1) command reads to produce its output.
Monitoring and/or killing a process is just a matter of system calls. I'd think the toughest part of your question would really be reliably determining that a process is "hung", rather than merely very busy (or waiting for a temporary condition).
In the general case, I'd think this would be rather difficult. Even Windows asks for a decision from the user when it thinks a program might be "hung" (on my system it is often wrong about that, too).
However, if you have a specific program that likes to hang in a specific way, I'd think you ought to be able to reliably detect that.
Seeing as the question has changed:
http://procps.sourceforge.net/
This is the source of ps and other process tools. They do indeed use /proc (which indicates it is probably the conventional and best way to read process information). Their source is quite readable. The file
/procps-3.2.8/proc/readproc.c
You can also link your program to libproc, which should be available in your repo (or already installed, I would say), but you will need the "-dev" variant for the headers and whatnot. Using this API you can read process information and status.
You can use the psState() function through libproc to check for things like
#define PS_RUN 1 /* process is running */
#define PS_STOP 2 /* process is stopped */
#define PS_LOST 3 /* process is lost to control (EAGAIN) */
#define PS_UNDEAD 4 /* process is terminated (zombie) */
#define PS_DEAD 5 /* process is terminated (core file) */
#define PS_IDLE 6 /* process has not been run */
In response to comment
IIRC, unless your program is on the CPU and you can prod it from within the kernel with signals ... you can't really tell how responsive it is. Even then, after the trap a signal handler is called which may run fine in the state.
Best bet is to schedule another process on another core that can poke the process in some way while it is running (or in a loop, or non-responsive). But I could be wrong here, and it would be tricky.
Good Luck
You may be able to use whatever mechanism strace uses (ptrace) to determine what system calls the process is making. Then you could determine which system calls you end up in for things like pthread_mutex deadlocks, or whatever. You could then use a heuristic approach and just decide that if a process is hung on a lock system call for more than 30 seconds, it's deadlocked.
You can run 'strace -p [PID]' on a process to determine what (if any) system calls it is making. If a process is not making any system calls but is using CPU time, then it is either hung or running in a tight calculation loop inside userspace. You'd really need to know the expected behaviour of the individual program to know for sure. If it is not making system calls and is not using CPU, it could also just be idle or deadlocked.
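The "is it using CPU time" half of that check can be done from C by sampling /proc/[PID]/stat twice and comparing. A sketch, with proc_cpu_ticks as a made-up helper name; the field positions (utime is field 14, stime is field 15) are taken from the proc(5) man page.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns utime + stime (in clock ticks) from /proc/[PID]/stat,
 * or -1 if the file cannot be read or parsed.
 */
static long proc_cpu_ticks(pid_t pid)
{
    char path[64];
    char buf[1024];
    char state;
    unsigned long utime, stime;
    char *p;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    p = fgets(buf, sizeof buf, f);
    fclose(f);
    if (p == NULL)
        return -1;

    /* Field 2 is "(comm)" and may itself contain spaces or ')',
       so start parsing after the last ')' on the line. */
    p = strrchr(buf, ')');
    if (p == NULL)
        return -1;

    /* Fields 3..15: state, five ints, five unsigned values, utime, stime. */
    if (sscanf(p + 1, " %c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
               &state, &utime, &stime) != 3)
        return -1;
    return (long)(utime + stime);
}
```

If the value grows between two samples taken a few seconds apart, the process is burning CPU; if it stays flat, the process is idle, blocked, or deadlocked. Neither result alone proves a hang.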
The only bulletproof way to do this is to modify the program being monitored to either send a 'ping' every so often to a 'watchdog' process, or to respond to a ping request when asked, e.g. a socket connection where you can ask it "Are you Alive?" and get back "Yes". The program can be coded in such a way that it is unlikely to do the ping if it has gone off into the weeds somewhere and is not executing properly. I'm pretty sure this is how Windows knows a process is hung, because every Windows program has some sort of event queue where it processes a known set of APIs from the operating system.
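A bare-bones version of the ping idea, assuming the two sides share a pipe (a socket would work the same way). send_ping and wait_for_ping are hypothetical names; the watchdog treats a missed deadline as "hung".

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <unistd.h>

/* Monitored program: call this from the main loop while things are healthy. */
static int send_ping(int fd)
{
    return write(fd, "!", 1) == 1 ? 0 : -1;
}

/*
 * Watchdog: waits up to timeout_ms for one ping byte on fd.
 * Returns 1 if a ping arrived, 0 on timeout, error, or closed pipe.
 */
static int wait_for_ping(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    char byte;

    if (poll(&pfd, 1, timeout_ms) <= 0)
        return 0;
    return read(fd, &byte, 1) == 1;
}
```

Where send_ping is called matters: put it at a point the program only reaches when its real work is progressing, not in a separate timer thread that would keep pinging while the main loop is stuck.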
Not necessarily a programmatic way, but one way to tell if a program is 'hung' is to break into it with gdb and pull a backtrace and see if it is stuck somewhere.