So I have this old, nasty piece of C code that I inherited on this project from a software engineer that has moved on to greener pastures. The good news is... IT RUNS! Even better news is that it appears to be bug free.
The problem is that it was designed to run on a server with a set of start up parameters input on the command line. Now, there is a NEW requirement that this server is reconfigurable (didn't see that one coming...). Basically, if the server receives a command over UDP, it either starts this program, stops it, or restarts it with new start up parameters passed in via the UDP port.
Basically the code that I'm considering using to run the obfuscated program is something like this (sorry I don't have the actual source in front of me, it's 12:48AM and I can't sleep, so I hope the pseudo-code below will suffice):
// my "bad_process_manager"
int manage_process_of_doom() {
    while (true) {
        if (socket_has_received_data) {
            int return_val = ParsePacket(packet_buffer);
            // if statement ordering is just for demonstration, the real one isn't as ugly...
            if (packet indicates shutdown) {
                system("killall bad_process"); // process name is totally unique so I'm good?
            } else if (packet indicates restart) {
                system("killall bad_process"); // stop old configuration
                // start with new parameters that were from UDP packet...
                system("./my_bad_process -a new_param1 -b new_param2 &");
            } else { // just start
                system("./my_bad_process -a new_param1 -b new_param2 &");
            }
        }
    }
}
So as a result of the system() calls that I have to make, I'm wondering if there's a neater way of doing so without all the system() calls. I want to make sure that I've exhausted all possible options without having to crack open the C file. I'm afraid that actually manipulating all these values on the fly would result in having to rewrite the whole file I've inherited since it was never designed to be configurable while the program is running.
Also, in terms of starting the process, am I correct to assume that throwing the "&" in the system() call will return immediately, just like I would get control of the terminal back if I ran that line from the command line? Finally, is there a way to ensure that stderr (and maybe even stdout) gets printed to the same terminal screen that the "manager" is running on?
Thanks in advance for your help.
What you need from the server:
Ideally your server process that you're controlling should be creating some sort of PID file. Also ideally, this server process should hold an exclusive lock on the PID file as long as it is still running. This allows us to know if the PID file is still valid or the server has died.
Receive shutdown message:
Try to get a lock on the PID file. If that succeeds, the server has already died and you have nothing to kill; just remove the old PID file. (If you went ahead with the kill anyway, you might kill an unrelated process that has reused the PID.)
If the lock fails, the server is still running: read the PID file, do a kill() on the PID, and remove the old PID file.
Receive start message:
You'll need to fork() a new process, then choose your flavor of exec() to start the new server process. The server itself should of course recreate its PID file and take a lock on it.
Receive restart message:
Same as Shutdown followed by Start.
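The shutdown step above can be sketched roughly as follows. This is only a sketch under assumptions: the server is assumed to lock its PID file with flock() (the manager has to use the same kind of lock as the server, whichever that is), the file is assumed to contain the PID as decimal text, and stop_server is a made-up name.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a running server was signalled,
 * 0 if there was nothing to kill (no PID file, or a stale one),
 * and -1 on error. Assumes the server holds a flock() lock on its PID file. */
int stop_server(const char *pidfile)
{
    int fd = open(pidfile, O_RDWR);
    if (fd < 0)
        return errno == ENOENT ? 0 : -1;

    if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
        /* We got the lock, so no server holds it: the PID file is stale. */
        unlink(pidfile);
        close(fd);
        return 0;
    }
    if (errno != EWOULDBLOCK) {
        close(fd);
        return -1;
    }

    /* The lock is held, so the server is alive: read its PID and signal it. */
    char buf[32];
    ssize_t n = read(fd, buf, sizeof buf - 1);
    close(fd);
    if (n <= 0)
        return -1;
    buf[n] = '\0';

    long pid = strtol(buf, NULL, 10);
    if (pid <= 1)
        return -1;
    if (kill((pid_t)pid, SIGTERM) != 0)
        return -1;

    unlink(pidfile);
    return 1;
}
```

A restart would then be stop_server() followed by the fork()/exec() start step.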
Good morning;
Right now, I'm writing a program which makes a Montecarlo simulation of a physical process and then pipes the data generated to gnuplot to plot a graphical representation. The simulation and plotting work just fine; but I'm interested in printing an error message which informs the user that gnuplot is not installed. In order to manage this, I've tried the following code:
#include <stdio.h>
#include <stdlib.h>
FILE *pipe_gnuplot;
int main()
{
pipe_gnuplot = _popen("gnuplot -persist", "w");
if (pipe_gnuplot==NULL)
{
printf("ERROR. INSTALL gnuplot FIRST!\n");
exit (1);
}
return 0;
}
But, instead of printing my error message, "gnuplot is not recognized as an internal or external command, operable program or batch file" appears (the program runs on Windows). I don't understand what I'm doing wrong. According to the _popen documentation, NULL should be returned if opening the pipe fails. Can you help me manage this issue? Thanks in advance, and sorry if the question is very basic.
Error handling of popen (or _popen) is difficult.
popen creates a pipe and a process. If this fails, you will get a NULL result, but this occurs only in rare cases. (no more system resources to create a pipe or process or wrong second argument)
popen passes your command line to a shell (UNIX) or to the command processor (Windows). I'm not sure if you would get a NULL result if the system cannot execute the shell or command processor respectively.
The command line will be parsed by the shell or command processor and errors are handled as if you entered the command manually, e.g. resulting in an error message and/or a non-zero exit code.
A successful popen means nothing more than that the system could successfully start the shell or command processor. There is no direct way to check for errors executing the command or to get the exit code of the command.
Generally I would avoid using popen if possible.
If you want to program specifically for Windows, check if you can get better error handling from Windows API functions like CreateProcess.
Otherwise you could wrap your command in a script that checks the result and prints specific messages you can read and parse to distinguish between success and error. (I don't recommend this approach.)
Just to piggy-back on @Bodo's answer, on a POSIX-compatible system you can use wait() to wait for a single child process to return, and obtain its exit status (which would typically be 127 if the command was not found).
Since you are on Windows you have _cwait(), but this does not appear to be compatible with how _popen is implemented, as it requires a handle to the child process, which _popen does not return or give any obvious access to.
Therefore, it seems the best thing to do is to essentially re-implement popen() yourself by creating a pipe manually and spawning the process with one of the spawn[lv][p][e] functions. In fact the docs for _pipe() give an example of how one might do this (although in your case you want to redirect the child process's stdin to the write end of your pipe).
I have not tried writing an example though.
I have a problem. I've written a program in C for Armbian.
I am using RTKLIB software for GPS data conversion from ubx to RTCM3.
I get some data from serial port and start str2str(rtklib software).
It creates this command to run
str2str -in tcpsvr://:2101#ubx -out serial://ttyS2:115200#rtcm3
and calls the system() function to run it. That works, but when I send a new command, I want it to stop the running str2str first.
I've tried the exit(0) and it stops my software. I don't want to stop my software. I want to stop str2str and create a new command and run it again.
How can I do it? I am not good with the linux environment.
Thanks
I suggest you find out how to look up the str2str process you want to kill and get its PID (a Stack Overflow search will show how), then use that PID to kill the process, unless RTKLIB provides a way to do this directly.
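An alternative that avoids searching for the PID at all is to start the program with fork() and exec() instead of system(), since fork() hands the child's PID back to you. A minimal sketch, with made-up helper names start_child and stop_child:

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts argv[0] (looked up in PATH) as a child
 * process and returns its PID, or -1 if fork() fails. */
pid_t start_child(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);  /* only returns if the exec failed */
        _exit(127);
    }
    return pid;
}

/* Hypothetical helper: asks the child to terminate with SIGTERM and
 * reaps it. Returns 0 on success, -1 on error. */
int stop_child(pid_t pid)
{
    if (pid <= 0)
        return -1;
    if (kill(pid, SIGTERM) != 0)
        return -1;
    return waitpid(pid, NULL, 0) == pid ? 0 : -1;
}
```

For the command in the question, argv would be something like {"str2str", "-in", "tcpsvr://:2101#ubx", "-out", "serial://ttyS2:115200#rtcm3", NULL}. Keep the returned PID, call stop_child() on it when a new command arrives, and then call start_child() with the new arguments.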
I'm trying to recode the UNIX command script (as it is on OSX). This is part of an exercise for school to help students learn UNIX APIs. We are only allowed to use system calls, more specifically, only those available on MAN(2) pages on Mac OSX (since that's our OS at school).
I have a 'first version' that kind of works. Running a program such as ls prints the right output to the screen and in an output file.
The problem scenario
I run bash from within the script-clone. First issue is I get the following error:
bash: no job control in this shell
I have tried forcing the bash process into the foreground with setpgrp and setpgid, but that didn't change anything, so I concluded that was not the problem.
I also tried to understand why the real script command uses cfmakeraw (at least on Linux), as seen here, but I don't get it. The MAN page is not very helpful.
The real script also dup2s STDIN on the slave, as seen here, but when I do that, it seems like input isn't read anymore.
However, the bash still runs, and I can execute commands inside of it.
But if I run vim inside it, and then hit Ctrl-Z to put vim to the background, the terminal is messed up (which does not happen when I'm in my regular terminal).
So I guess I must have done something wrong. I'd appreciate any advice/help.
Here's the source code:
https://github.com/conradkleinespel/unix-command-script/tree/2587b07e7a36dc74bf6dff0e82c9fdd33cb40411
You can compile by doing: make (it builds on OSX 10.9, hopefully on Linux as well)
And run by doing: ./ft_script
I don't know if it makes more sense to have all the source code on Stack Overflow, as it would crowd the page. If needed, I can replace the Git link with the source.
I don't use OS X, so I can't directly test your code, but I'm currently writing a toy terminal emulator and had similar troubles.
about "bash: no job control in this shell"
In order to perform job control, a shell needs to be a session leader and the controlling process of its terminal. By default, your program inherits the controlling terminal of your own shell which runs your script program and which is also a session leader. Here is how to make your new slave process a session leader after fork:
/* we don't need the inherited master fd */
close(master);
/* make a new session; this also drops the previous controlling tty */
setsid();
/* replace existing stdin/out/err with the slave pts */
dup2(slave, 0);
dup2(slave, 1);
dup2(slave, 2);
/* discard the extra file descriptor for the slave pts */
close(slave);
/* make the pts our controlling terminal (only a session leader may do this) */
ioctl(0, TIOCSCTTY, 0);
At this point, the forked process has stdin/out/err bound to the new pts, the pts became its controlling terminal, and the process is a session leader. The job control should now work.
about raw tty
When you run a program inside a normal terminal, it looks like this:
(term emulator, master side) <=> /dev/pts/42 <=> (program, slave side)
If you press ^Z, the terminal emulator will write the ASCII character 0x1A to the pts. It is a control character, so it won't be sent to the program; instead the kernel will issue SIGTSTP to the program and suspend it. The process of transforming characters into something else is called "line cooking" and has various settings that can be adjusted for each tty.
Now let's look at the situation with script:
term emulator <=> /dev/pts/42 <=> script <=> /dev/pts/43 <=> program
With normal line settings, what happens when you press ^Z? It will be transformed into SIGTSTP by /dev/pts/42 and script will be suspended. But that's not what we want. Instead, we'd like the 0x1A character produced by our ^Z to go as-is through /dev/pts/42, then be passed by script to /dev/pts/43, and only then be transformed into SIGTSTP to suspend the program.
This is the reason why the pts between your terminal and script must be configured as "raw", so that all control characters reach the pts between script and the program, as if you were directly working with it.
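Switching the outer terminal to raw mode and restoring it afterwards could look roughly like this. It is only a sketch: the helper names are made up, and cfmakeraw() is a library function rather than a system call, so under the exercise's "MAN(2) only" rule you would have to clear the same flags (ISIG, ICANON, ECHO and so on) by hand and apply them with the terminal ioctls.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes cfmakeraw() on glibc */
#endif
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: copies the current settings and turns the copy
 * into "raw" settings (no signal characters, no line editing, no echo). */
void raw_settings(const struct termios *current, struct termios *raw)
{
    *raw = *current;
    cfmakeraw(raw);
}

/* Hypothetical helper: saves the settings of fd in *saved and switches
 * fd to raw mode. Returns 0 on success, -1 on error. */
int enter_raw_mode(int fd, struct termios *saved)
{
    struct termios raw;
    if (tcgetattr(fd, saved) != 0)
        return -1;
    raw_settings(saved, &raw);
    return tcsetattr(fd, TCSANOW, &raw);
}

/* Hypothetical helper: restores the settings saved by enter_raw_mode(). */
int leave_raw_mode(int fd, const struct termios *saved)
{
    return tcsetattr(fd, TCSANOW, saved);
}
```

The script clone would call enter_raw_mode(0, &saved) before it starts copying bytes between the two terminals, and leave_raw_mode(0, &saved) on exit so the user's shell gets a cooked terminal back.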
I am working on a Linux daemon and having some issues with stdin/stdout. Normally, because of the nature of a daemon, you do not have any stdin or stdout. However, I do have a function in my daemon that is called when the daemon runs for the first time to specify different parameters that are required for the daemon to run successfully. When this function is called, the terminal becomes so sluggish that I have to launch a separate shell and kill the daemon with top to get a responsive prompt back. Now I suspect that this has something to do with the forking process closing stdin/stdout, but I am not quite sure how I could work around this. If you guys could shed some light on the situation that would be most appreciated. Thanks.
Edit:
int main(int argc, char *argv[]) {
/* setup signal handling */
/* check command line arguments */
pid_t pid, sid;
pid = fork();
if (pid < 0) {
exit(EXIT_FAILURE);
}
if(pid > 0){
exit(EXIT_SUCCESS);
}
sid = setsid();
if(sid < 0) {
exit(EXIT_FAILURE);
}
umask(027);
/* set syslogging */
/* do some logic to determine whether we are running the daemon for the first time and, if we are, call the one-time function which uses fgets() to receive some input */
while(1) {
/* do required work */
}
/* do some clean up procedures and exit */
return 0;
}
You guys mention using a config file. This is exactly what I do to store the parameters received via input. However, I still initially need to get these from the user via stdin. The logic for determining whether we are running for the first time is based on the existence of the config file.
Normally, the standard input of a daemon should be connected to /dev/null, so that if anything is read from standard input, you get an EOF immediately. Normally, standard output should be connected to a file - either a log file or /dev/null. The latter means all writes will succeed, but no information will be stored. Similarly, standard error should be connected to /dev/null or to a log file.
All programs, including daemons, are entitled to assume that stdin, stdout and stderr are appropriately opened file streams.
It is usually appropriate for a daemon to control where its input comes from and outputs go to. There is seldom occasion for input to come from other than /dev/null. If the code was written to survive without standard output or standard error (for example, it opens a standard log channel, or perhaps uses syslog(3)) then it may be appropriate to close stdout and stderr. Otherwise, it is probably appropriate to redirect them to /dev/null, while still logging messages to a log file. Alternatively, you can redirect both stdout and stderr to a log file - beware continuously growing log files.
Your sluggish-to-impossible response time might be because your program is not paying attention to EOF in a read loop somewhere. It might be prompting for user input on /dev/null, and reading a response from /dev/null, and not getting a 'y' or 'n' back, it tries again, which chews up your system horribly. Of course, the code is flawed in not handling EOF, and counting the number of times it gets an invalid response and stopping being silly after a reasonable number of attempts (16, 32, 64). The program should shut up shop sanely and safely if it expects a meaningful input and continues not to get it.
"You guys mention using a config file. This is exactly what I do to store the parameters received via input. However, I still initially need to get these from the user via stdin. The logic for determining whether we are running for the first time is based on the existence of the config file."
Instead of reading stdin, have the user write the config file themselves; check for its existence before forking, and exit with an error if it doesn't. Include a sample config file with the daemon, and document its format in your daemon's manpage. You do have a manpage, yes? Your config file is textual, yes?
Also, your daemonization logic is missing a key step. After forking, but before calling setsid, you need to close fds 0, 1, and 2 and reopen them to /dev/null (do not attempt to do this with fclose and fopen). That should fix your sluggish terminal problem.
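The redirection step might look roughly like this, using open() and dup2() on the raw descriptors rather than fclose()/fopen(). The function name is made up; in the question's code it would be called in the child after fork().

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: points fds 0, 1 and 2 at /dev/null.
 * Returns 0 on success, -1 on error. */
int redirect_stdio_to_devnull(void)
{
    int fd = open("/dev/null", O_RDWR);
    if (fd < 0)
        return -1;

    for (int target = 0; target <= 2; target++) {
        /* dup2() closes the old descriptor and replaces it atomically. */
        if (dup2(fd, target) < 0) {
            if (fd > 2)
                close(fd);
            return -1;
        }
    }
    if (fd > 2)
        close(fd); /* the extra descriptor is no longer needed */
    return 0;
}
```

After this, any stray read from stdin returns EOF immediately and writes to stdout/stderr are silently discarded, so the daemon no longer competes with the shell for the terminal.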
Your design is wrong. Daemon processes should not take input via stdin or deliver output to stdout/stderr. You'll close those descriptors as part of the daemonizing phase. Daemons should take configuration parameters from the command line, a config file, or both. If runtime-input is required you'll have to read a file, open a socket, etc., but the point of a daemon is that it should be able to run and do its thing without a user being present at the console.
If you want to run your program detached, use the shell: (setsid <command> &). Do not fork() inside your program, which will cause sysadmin nightmare.
Don't use syslog() nor redirect stdout or stderr.
Better yet, use a daemon manager such as daemontools, runit, OpenRC or systemd to daemonize your program for you.
Use a config file. Do not use STDIN or STDOUT with a daemon. Daemons are meant to run in the background with no user interaction.
If you insist on using stdin/keyboard input to fire up the daemon (e.g. to get some magic passphrase you wouldn't want to store in a file) then handle all I/O before the fork().
I am working on an application where I need to detect a system shutdown.
However, I have not found any reliable way get a notification on this event.
I know that on shutdown, my app will receive a SIGTERM signal followed by a SIGKILL. I want to know if there is any way to query if a SIGTERM is part of a shutdown sequence?
Does any one know if there is a way to query that programmatically (C API)?
As far as I know, the system does not provide any other method to query for an impending shutdown. If it does, that would solve my problem as well. I have been trying out runlevels as well, but change in runlevels seem to be instantaneous and without any prior warnings.
Maybe a little bit late. Yes, you can determine if a SIGTERM is in a shutting down process by invoking the runlevel command. Example:
#!/bin/bash
trap "runlevel >$HOME/run-level; exit 1" term
read line
echo "Input: $line"
Save it as, say, term.sh and run it. After executing killall term.sh, you should be able to inspect the run-level file in your home directory. Then execute any of the following:
sudo reboot
sudo halt -p
sudo shutdown -P
and compare the difference in the file. That should give you an idea of how to do it.
There is no way to determine if a SIGTERM is a part of a shutdown sequence. To detect a shutdown sequence you can either use rc.d scripts like ereOn and Eric Sepanson suggested or use mechanisms like DBus.
However, from a design point of view it makes no sense to ignore SIGTERM even if it is not part of a shutdown. SIGTERM's primary purpose is to politely ask apps to exit cleanly and it is not likely that someone with enough privileges will issue a SIGTERM if he/she does not want the app to exit.
From man shutdown:
If the time argument is used, 5 minutes before the system goes down
the /etc/nologin file is created to ensure that further logins shall
not be allowed.
So you can test existence of /etc/nologin. It is not optimal, but probably best you can get.
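From C, that test is a single access() call. A small sketch with made-up helper names; note that /etc/nologin can also exist for reasons other than a pending shutdown, so treat a positive result as a hint only.

```c
#include <unistd.h>

/* Hypothetical helper: returns 1 if path exists, 0 otherwise. */
int file_exists(const char *path)
{
    return access(path, F_OK) == 0;
}

/* Hypothetical helper: returns 1 if /etc/nologin is present. */
int shutdown_probably_pending(void)
{
    return file_exists("/etc/nologin");
}
```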
It's a bit of a hack, but if the server is running systemd you can run
/bin/systemctl list-jobs shutdown.target
... it will report ...
JOB UNIT TYPE STATE
755 shutdown.target start waiting <---- existence means shutting down
1 jobs listed.
... if the server is shutting down or rebooting ( hint: there's a reboot.target if you want to look specifically for that )
You will get "No jobs running." if it's not being shut down.
You have to parse the output, which is a bit messy, as systemctl doesn't return a different exit code for the two results. But it does seem reasonably reliable. You will need to watch out for a format change in the messages if you update the system, however.
Making your application respond differently to some SIGTERM signals than to others seems opaque and potentially confusing. It's arguable that you should always respond the same way to a given signal. Adding unusual conditions makes it harder to understand and test application behavior.
Adding an rc script that handles shutdown (by sending a special signal) is a completely standard way to handle such a problem; if this script is installed as part of a standard package (make install or rpm/deb packaging) there should be no worries about control of user machines.
I think I got it.
Source =
https://github.com/mozilla-b2g/busybox/blob/master/miscutils/runlevel.c
I copy part of the code here, just in case the reference disappears.
#include "libbb.h"
...
struct utmp *ut;
char prev;
if (argv[1]) utmpname(argv[1]);
setutent();
while ((ut = getutent()) != NULL) {
if (ut->ut_type == RUN_LVL) {
prev = ut->ut_pid / 256;
if (prev == 0) prev = 'N';
printf("Runlevel: prev=%c current=%c\n", prev, ut->ut_pid % 256);
endutent();
return 0;
}
}
puts("unknown");
see man systemctl, you can determine if the system is shutting down like this:
if [ "`systemctl is-system-running`" = "stopping" ]; then
# Do what you need
fi
this is in bash, but you can do it with 'system' in C
The practical answer to what you originally wanted is to check for the shutdown process (e.g. ps aux | grep "shutdown -h") and then, if you want to be sure, check its command line arguments and the time it was started (e.g. "shutdown -h +240" started at 14:51 will shut down at 18:51).
In the general case, from the point of view of the entire system, there is no way to do this. There are many different ways a "shutdown" can happen. For example, someone can decide to pull the plug in order to hard-stop a program that they know has bad/dangerous behaviour at shutdown time, or a UPS could first send a SIGHUP and then simply fail. Since such a shutdown can happen suddenly and with no warning anywhere in a system, there is no way to be sure that it's okay to keep running after a SIGHUP.
If a process receives SIGHUP you should basically assume that something nastier will follow soon. If you want to do something special and partially ignore SIGHUP then a) you need to coordinate that with whatever program will do the shutdown and b) you need to be ready that if some other system does the shutdown and kills you dead soon after a SIGHUP your software and data will survive. Write out any data you have and only continue writing to append-only files with safe atomic updates.
For your case I'm almost sure your current solution (treat all SIGHUPs as a shutdown) is the correct way to go. If you want to improve things, you should probably add a feature to the shutdown program which does a notify via DBUS or something similar.
When the system shuts down, the rc.d scripts are called.
Maybe you can add a script there that sends some special signal to your program.
However, I doubt you can stop the system shutdown that way.