I have an application written in C which runs as a daemon and needs to send something over RS232 when the system is shutting down or rebooting, and it needs to distinguish between the two.
So my idea is:
In the "stop" case of my init script /etc/init.d/my_app, I will run the /sbin/runlevel command to get the current runlevel:
0 - shutdown state
6 - reboot state
Then I will execute some command to tell my daemon which state it is; the daemon will perform the communication over RS232 and then exit.
I think it should work, however it may not be the best solution, especially because my app is already running as a daemon, maybe I can receive some signal directly from system/kernel/library or through unix socket or something.
Best regards
Marek
I am not sure which signal is sent to an application on system shutdown. My best guess is SIGTERM, followed by SIGKILL if the application does not shut down. Did you try catching SIGTERM and shutting your program down properly? There are a lot of examples on the net showing how to do that.
For more sophisticated process handling you can send SIGUSR1 or SIGUSR2 to your application.
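A minimal sketch of that idea, assuming the init script tells the daemon the reason before stopping it. The mapping of SIGUSR1 to shutdown and SIGUSR2 to reboot is my own assumption, and the names install_stop_handlers and get_stop_reason are made up for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Why the daemon was asked to stop; written only from the signal handler. */
enum stop_reason { STOP_NONE = 0, STOP_TERM, STOP_SHUTDOWN, STOP_REBOOT };

static volatile sig_atomic_t g_stop_reason = STOP_NONE;

/* Async-signal-safe handler: it only records which signal arrived. */
static void on_stop_signal(int signo)
{
    if (signo == SIGUSR1)
        g_stop_reason = STOP_SHUTDOWN;   /* assumed mapping: runlevel 0 */
    else if (signo == SIGUSR2)
        g_stop_reason = STOP_REBOOT;     /* assumed mapping: runlevel 6 */
    else if (g_stop_reason == STOP_NONE)
        g_stop_reason = STOP_TERM;       /* plain SIGTERM, reason unknown */
}

/* Install the handler for SIGTERM, SIGUSR1 and SIGUSR2. Returns 0 on success. */
int install_stop_handlers(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_stop_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTERM, &sa, NULL) != 0) return -1;
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;
    if (sigaction(SIGUSR2, &sa, NULL) != 0) return -1;
    return 0;
}

/* Polled from the main loop; STOP_NONE means "keep running". */
enum stop_reason get_stop_reason(void)
{
    return (enum stop_reason)g_stop_reason;
}
```

The daemon's main loop would poll get_stop_reason() and, once it is no longer STOP_NONE, send the matching message over the serial port and exit. The init script could then use something like kill -USR1 <pid> or kill -USR2 <pid> depending on the runlevel it sees.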
We are using the Apache Zookeeper Client C bindings in our application. Client library version is 3.5.1. When the Zookeeper connection gets disconnected, the application is configured to exit with error code 116.
Systemd is being used to automate starting/stopping the application. The unit file does not override the default setting for KillMode, which is to send SIGTERM to the application.
When the process is stopped using the systemctl stop directive, the Zookeeper client threads seem to be attempting to reconnect to Zookeeper:
2016-04-12 22:34:45,799:4506(0xf14f7b40):ZOO_ERROR#handle_socket_error_msg#2363: Socket [128.0.0.4:61758] zk retcode=-4, errno=112(Host is down): failed while receiving a server response
2016-04-12 22:34:45,799:4506(0xf14f7b40):ZOO_INFO#check_events#2345: initiated connection to server [128.0.0.4:61758]
Apr 12 22:34:45 main thread: zookeeperWatcher: event type ZOO_SESSION_EVENT state ZOO_CONNECTING_STATE path
2016-04-12 22:34:45,801:4506(0xf14f7b40):ZOO_INFO#check_events#2397: session establishment complete on server [128.0.0.4:61758], sessionId=0x40000015b8d0077, negotiated timeout=20000
2016-04-12 22:34:46,476:4506(0xf14f7b40):ZOO_WARN#zookeeper_interest#2191: Delaying connection after exhaustively trying all servers [128.0.0.4:61758]
2016-04-12 22:34:46,810:4506(0xf14f7b40):ZOO_INFO#check_events#2345: initiated connection to server [128.0.0.4:61758]
2016-04-12 22:34:46,811:4506(0xf14f7b40):ZOO_ERROR#handle_socket_error_msg#2382: Socket [128.0.0.4:61758] zk retcode=-112, errno=116(Stale file handle): sessionId=0x40000015b8d0077 h
Due to this, the process is exiting with an error code. Systemd sees failure code upon exit and does not attempt to restart the application. Does anyone know why the client is getting disconnected?
I am aware that I can work around this by setting SuccessExitStatus=116 in the unit file, but I don't want to mask out genuine errors. I have tried registering a signal handler for SIGTERM and closing the Zookeeper client in the handler. But the handler code never seems to get hit when I issue systemctl stop.
EDIT: The handler wasn't getting called because I had made it asynchronous - it didn't execute immediately upon receiving signal. OTOH the process exits immediately upon Zookeeper disconnect.
What happens when you install the handler for SIGTERM and issue systemctl stop?
If nothing occurs then you may have a mask blocking the signal (I guess not).
If the application keeps exiting with the same error code then I would suggest you make sure that the signal handler is being loaded correctly.
This is working as expected. It is the application writer's responsibility to specify how the service shuts down gracefully. If you don't want the default, which sends SIGTERM, you can use ExecStop in the unit file to supply your own stop command:
ExecStart=/usr/bin/app
ExecStop=/usr/bin/app -stop
For details see docs at
https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStop=
The issue turned out to be unrelated: someone was running a script that was killing the connection. Thank you all for your help!
This question is an extension to this previously asked question:
I implemented the solution given by jxh with the following parameters:
SO_KEEPALIVE = Enabled
TCP_KEEPIDLE = 120 secs
TCP_KEEPINTVL = 75 secs
TCP_KEEPCNT = 1
So why does the server still wait forever for the client to respond?
I also found out on the internet that
kill <pid> actually sends SIGTERM to the given process.
So I used ps -o pid,cmd,state command after 'killing' the telnet application.
I saw that the telnet process was still there but with process state = T, i.e. it was in STOPPED state
P.S.: I do not have much knowledge of Linux Signals, please consider that.
Because the client hasn't exited yet, being still in STOPPED state, and therefore hasn't closed its connections either.
Since the client processes are still alive, the TCP stack in the kernel answers each keep-alive packet it receives with an acknowledgement back to the sender. So, even though the connection is indeed idle, it will never be closed, since the kernel is happily processing the packets.
On a real network, given your parameters, the connection would be closed if the ACK from the client machine ever got lost. On your setup, since the client and server are on the same machine, your network will be essentially lossless.
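For reference, this is roughly how the parameters from the question are applied to a socket on Linux. It is a sketch only; enable_keepalive is a hypothetical helper name, and the TCP_KEEP* options are Linux-specific:

```c
#define _GNU_SOURCE
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable TCP keep-alive on fd. idle and interval are in seconds,
 * count is the number of unanswered probes tolerated.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int enable_keepalive(int fd, int idle, int interval, int count)
{
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof idle) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof interval) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof count) != 0)
        return -1;
    return 0;
}
```

With enable_keepalive(fd, 120, 75, 1), the first probe goes out after 120 seconds of idle time, and the connection is dropped only if that probe stays unanswered. A stopped process does not prevent its kernel from answering, which is why nothing times out here.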
It is unclear to me how you got your telnet sessions into this state. SIGTERM will not put a process in the stopped state. A process goes into the stopped state when it receives SIGSTOP (and usually SIGTSTP, but it seems telnet ignores that one). I suggest that perhaps you sent that signal by mistake, or you suspended the session (with ^]z). When that happened, the window with your telnet session should have shown output like:
[1]+ Stopped telnet ...
This is printed by the shell. While the telnet process is stopped, it won't process the SIGTERM; that happens only once it is resumed, for example by bringing it back to the foreground.
A SIGKILL (done with kill -9 <pid>) will be processed immediately.
I have a multi-threaded application which can be run as a daemon process or one time with input parameters.
I want to ensure that if the application is already running as a daemon process, the user is not allowed to run it again.
EDIT: After you all suggested going for flocks, I tried it and put it on the server. I now have a weird problem: when the servers are bounced, they delete all the files, including the lock file :(. What now?
The easiest way is to bind to a port (it could be a unix domain socket in a "private" directory). Only one process can bind to a port, so if the port is bound, the process is running. If the process exits, the kernel automatically closes the file descriptor. It does cost your process an (unused?) file descriptor, but normally a daemon process would need some listen socket anyway.
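A sketch of the bind approach, assuming Linux. Instead of a socket file in a private directory it uses the abstract socket namespace (my substitution), so nothing is left on disk. The function name claim_single_instance and the socket name in the usage note are hypothetical:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Try to become the single instance by binding an abstract Unix socket.
 * Returns the socket fd (keep it open for the lifetime of the process),
 * or -1 if the name is already taken or an error occurred.
 */
int claim_single_instance(const char *name)
{
    struct sockaddr_un addr;
    size_t len = strlen(name);
    int fd;

    if (len == 0 || len > sizeof addr.sun_path - 1)
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    /* A leading NUL byte in sun_path selects the abstract namespace. */
    memcpy(addr.sun_path + 1, name, len);

    if (bind(fd, (struct sockaddr *)&addr,
             (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len)) != 0) {
        close(fd);          /* typically EADDRINUSE: another instance is running */
        return -1;
    }
    return fd;
}
```

At startup the daemon would call something like claim_single_instance("my_app.instance") and refuse to run if it gets -1. Because an abstract name is not a file, a cleanup job that deletes files cannot remove it, and the kernel releases the name as soon as the process exits.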
You can try using file locks. Upon starting the process, you can open a file, lock it, and check for a value (e.g. size of file). If it's not desired value, the process can exit. If desired value, change the file to an undesired value.
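A slightly simpler variant relies on the lock alone rather than on a value stored in the file. This is a sketch under that assumption; acquire_instance_lock is a hypothetical name and the lock path is up to you:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Open (creating if needed) the lock file and take an exclusive,
 * non-blocking lock on it. Returns the fd holding the lock, or -1 if
 * another holder exists or an error occurred. The kernel drops the
 * lock when the holding process exits, even after a crash.
 */
int acquire_instance_lock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);          /* EWOULDBLOCK: someone else holds the lock */
        return -1;
    }
    return fd;
}
```

Since the file is recreated on open, it should not matter if it is deleted while the machine is down: no process is holding the lock at that point anyway. Deleting the file while the daemon is still running is a different story, because a new instance would then create and lock a fresh file.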
I implemented a similar thing by using shell scripts to start and stop the daemon.
In the start script, before launching the executable, check whether it is already running. If it is, the new process is not started.
I have a system running embedded Linux, and it is critical that it runs continuously. Basically it is a process for communicating with sensors and relaying that data to a database and a web client.
If a crash occurs, how do I restart the application automatically?
Also, there are several threads doing polling (e.g. sockets and UART communications). How do I ensure none of the threads get hung up or exit unexpectedly? Is there an easy-to-use watchdog that is threading friendly?
You can seamlessly restart your process as it dies with fork and waitpid as described in this answer. It does not cost any significant resources, since the OS will share the memory pages.
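Roughly, the fork and waitpid loop could look like the sketch below. The names run_supervised, worker_ok and worker_fail are hypothetical; the two worker functions are stand-ins for the real application entry point, and the restart limit is my own addition to keep a broken program from respawning forever:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for the real application entry point. */
int worker_ok(void)   { return 0; }
int worker_fail(void) { return 1; }

/*
 * Run worker() in a child process and restart it whenever it exits
 * with a non-zero status or is killed by a signal, up to max_restarts
 * times. Returns the number of restarts performed, or -1 on error.
 */
int run_supervised(int (*worker)(void), int max_restarts)
{
    int restarts = 0;

    for (;;) {
        pid_t pid = fork();
        int status;

        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(worker());            /* child: run the real work */

        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return restarts;            /* clean exit: stop supervising */
        if (restarts >= max_restarts)
            return restarts;            /* give up after too many failures */
        restarts++;
    }
}
```

The parent does nothing but wait, so it is very unlikely to crash itself, and a segfault in the child simply shows up as a signal status in waitpid.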
Which leaves only the problem of detecting a hung process. You can use any of the solutions pointed out by Michael Aaron Safyan for this, but a yet easier solution would be to use the alarm syscall repeatedly, having the signal terminate the process (use sigaction accordingly). As long as you keep calling alarm (i.e. as long as your program is running) it will keep running. Once you don't, the signal will fire.
That way, no extra programs needed, and only portable POSIX stuff used.
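A sketch of the alarm idea, with hypothetical helper names watchdog_start and watchdog_kick. It relies on the default action of SIGALRM, which is to terminate the process:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/*
 * Make sure SIGALRM terminates the process (its default action), then
 * start the countdown. Returns 0 on success, -1 on error.
 */
int watchdog_start(unsigned int timeout_sec)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_DFL;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;
    alarm(timeout_sec);
    return 0;
}

/* Call regularly from the main loop: each call restarts the countdown. */
void watchdog_kick(unsigned int timeout_sec)
{
    alarm(timeout_sec);
}
```

Note that there is only one alarm per process. With several polling threads, each thread would have to report progress to one place (say, a per-thread timestamp), and only the code that sees all of them as fresh should call watchdog_kick.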
The gist of it is:
You need to detect if the program is still running and not hung.
You need to (re)start the program if the program is not running or is hung.
There are a number of different ways to do #1, but two that come to mind are:
Listening on a UNIX domain socket, to handle status requests. An external application can then inquire as to whether the application is still ok. If it gets no response within some timeout period, then it can be assumed that the application being queried has deadlocked or is dead.
Periodically touching a file with a preselected path. An external application can look at the timestamp of the file, and if it is stale, it can assume that the application is dead or deadlocked.
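The second option is the simpler of the two to sketch. The function names heartbeat_touch and heartbeat_is_stale are hypothetical, and the path and maximum age are whatever suits your application:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Application side: create the file if needed and set its mtime to now.
 * Returns 0 on success, -1 on error. */
int heartbeat_touch(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    int rc;
    if (fd < 0)
        return -1;
    rc = futimens(fd, NULL);    /* NULL means "use the current time" */
    close(fd);
    return rc;
}

/* Monitor side: returns 1 if the file is missing or older than
 * max_age_sec, otherwise 0. */
int heartbeat_is_stale(const char *path, time_t max_age_sec)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return 1;
    return (time(NULL) - st.st_mtime) > max_age_sec;
}
```

The application calls heartbeat_touch from the loop it wants monitored; the external checker calls heartbeat_is_stale with a maximum age a few times larger than the touch period, to avoid false alarms.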
With respect to #2, killing the previous PID and using fork+exec to launch a new process is typical. You might also consider making your application that runs "continuously", into an application that runs once, but then use "cron" or some other application to continuously rerun that single-run application.
Unfortunately, watchdog timers and getting out of deadlock are non-trivial issues. I don't know of any generic way to do it, and the few that I've seen are pretty ugly and not 100% bug-free. However, tsan (ThreadSanitizer) can help detect potential deadlock scenarios and other threading issues by instrumenting the program at run time.
You could create a CRON job to check if the process is running with start-stop-daemon from time to time.
Use this script to run your application:
#!/bin/bash
# Keep running the program until it exits successfully (status 0).
while ! /path/to/program
do
    echo "restarting"   # Non-zero exit status: start it again.
done
You can also put this script in /etc/init.d/ in order to start it as a daemon.
Any idea on how to capture closing the terminal window that my program is running in?
While I'm at it, any way to capture when the computer is shutting down but the program is still running, or if the user logs off?
If on Unix/Linux: Did you have a look at SIGTERM? This is at least the one sent to you during shutdown.
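You could try the atexit() function, but note that handlers registered with it only run on a normal exit (returning from main or calling exit()), not when the process is killed by a signal. (See comments.)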
Or look at this post here: Signals received by bash when terminal is closed
Try catching SIGTERM. Note that you cannot catch SIGKILL, which may be what is sent during shutdown after a certain amount of time. I found this really nice post that explains some differences too.
[update] Longshot here but what about testing if std-in/out is still open and good? When the terminal dies those file descriptors should be scrapped. Disclaimer, this is a guess at best.
From my tests... the signal that my program receives when the terminal is closed is signal 1, i.e. SIGHUP.
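Putting the answers together, a sketch that lets the program tell a closed terminal (SIGHUP) from a shutdown or kill (SIGTERM). The function names watch_terminal_and_shutdown and last_signal are made up for the example:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t g_last_signal = 0;

/* Async-signal-safe handler: just remember which signal arrived. */
static void remember_signal(int signo)
{
    g_last_signal = signo;
}

/* Catch SIGHUP (terminal closed) and SIGTERM (shutdown or kill).
 * Returns 0 on success, -1 on error. */
int watch_terminal_and_shutdown(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = remember_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGHUP, &sa, NULL) != 0) return -1;
    if (sigaction(SIGTERM, &sa, NULL) != 0) return -1;
    return 0;
}

/* 0 while nothing has arrived, otherwise SIGHUP or SIGTERM. */
int last_signal(void)
{
    return g_last_signal;
}
```

The main loop checks last_signal() on each iteration and cleans up when it becomes non-zero. Keep the cleanup short, since the system may follow up with SIGKILL, which cannot be caught.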