Program restart self on update - c

I checked everywhere so I am hopefully not repeating a question.
I want to add a portable update feature to some C code I am writing. The program may not be in any specific location, and I would prefer to keep it to a single binary (no dynamic library loading).
Then, after the update is complete, I want the program to be able to restart itself (not a loop; actually reload from the HDD).
Is there any way to do this in C on Linux?

If you know where the program is saved on disk, then you can exec() the program:
char *args[] = { "/opt/somewhere/bin/program", NULL };  /* argument list must end with NULL */
execv(args[0], args);
/* execv() only returns if it failed */
fprintf(stderr, "Failed to reexecute %s\n", args[0]);
exit(1);
If you don't know where the program is on disk, either use execvp() to search for it on $PATH, or find out. On Linux, use the /proc file system — and /proc/self/exe specifically; it is a symlink to the executable, so you would need to use readlink() to get the value. Beware: readlink() does not null terminate the string it reads.
If you want, you can arrange to pass an argument which indicates to the new process that it is being restarted after update; the bare minimum argument list I provided can be as complex as you need (a list of the files currently open for edit, perhaps, or any other appropriate information and options).
Also, don't forget to clean up before reexecuting — cleanly close any open files, for example. Remember, open file descriptors are inherited by the executed process (unless you mark them for closure on exec with FD_CLOEXEC or O_CLOEXEC), but the new process won't know what they're for unless you tell it (in the argument list) so it won't be able to use them. They'll just be cluttering up the process without helping in the least.

Yes, you need to call the proper exec() function. There can be complications: argv[0] may be a relative path, so finding the absolute path name later can be troublesome. You need to:
Store the current directory in main().
Store the argc and (all) argv[] values from main().
Since calling exec() replaces the current process, that should be all you need to do in order to restart yourself. You might also need to take care to close any opened files, since they might otherwise be "inherited" back to yourself, which is seldom what you want.

Related

Why should I close all file descriptors after calling fork() and prior to calling exec...()? And how would I do it?

I've seen a lot of C code that tries to close all file descriptors between calling fork() and calling exec...(). Why is this commonly done and what is the best way to do it in my own code, as I've seen so many different implementations already?
When calling fork(), your operating system creates a new process by simply cloning your existing process. The new process will be pretty much identical to the process it was cloned from, except for its process ID and any properties that are documented to be replaced or reset by the fork() call.
When calling any form of exec...(), the process image of the calling process is replaced by a new process image but other than that the process state is preserved. One consequence is that open file descriptors in the process file descriptor table prior to calling exec...() are still present in that table after calling it, so the new process code inherits access to them. I guess this has probably been done so that STDIN, STDOUT, and STDERR are automatically inherited by child processes.
However, keep in mind that in POSIX C file descriptors are not only used to access actual files; they are also used for all kinds of system and network sockets, pipes, shared memory identifiers, and so on. If you don't close these prior to calling exec...(), your new child process will get access to all of them, even to resources it could not access on its own because it lacks the required access rights. Think about a root process creating a non-root child process: this child would have access to all open file descriptors of the root parent process, including open files that should only be writable by root or protected server sockets below port 1024.
So unless you want a child process to inherit access to currently open file descriptors, as may explicitly be desired e.g. to capture STDOUT of a process or feed data via STDIN to that process, you are required to close them prior to calling exec...(). Not only because of security (which sometimes may play no role at all) but also because otherwise the child process will have fewer free file descriptors available (and think of a long chain of processes, each opening files and then spawning a sub-process... there will be fewer and fewer free file descriptors available).
One way to do that is to always open files using the flag O_CLOEXEC, which ensures that this file descriptor is automatically closed when exec...() is ever called. One problem with that solution is that you cannot control how external libraries may open files, so you cannot rely that all code will always set this flag.
Another problem is that this solution only works for file descriptors created with open(). You cannot pass that flag when creating sockets, pipes, etc. This is a known problem and some systems work around it by offering the non-standard accept4(), pipe2(), dup3(), and the SOCK_CLOEXEC flag for sockets; however, these are not yet POSIX standard and it's unknown if they will become standard (this is planned, but until a new standard has been released we cannot know for sure, and it will take years until all systems have adopted them).
What you can do is set the flag FD_CLOEXEC later using fcntl() on the file descriptor; note, however, that this isn't safe in a multithreaded environment. Just consider the following code:
int so = socket(...);
fcntl(so, F_SETFD, FD_CLOEXEC);
If another thread calls fork() between the first and the second line, which is of course possible, the flag has not been set yet and thus this file descriptor won't get closed when the forked child calls exec...().
So the only way that is really safe is to explicitly close them and this is not as easy as it may seem!
I've seen a lot of code that does stupid things like this:
for (int i = STDERR_FILENO + 1; i < 256; i++) close(i);
But just because some POSIX systems have a default limit of 256 doesn't mean that this limit cannot be raised. Also, on some systems the default limit is higher to begin with.
Using FD_SETSIZE instead of 256 is equally wrong: just because the select() API has a hard limit by default on most systems doesn't mean that a process cannot have more open file descriptors than this limit (after all, you don't have to use select() with them; you can use the poll() API as a replacement, and poll() has no upper limit on file descriptor numbers).
Using OPEN_MAX instead of 256 is always correct, as that is really the absolute maximum number of file descriptors a process can have. The downside is that OPEN_MAX can theoretically be huge and doesn't reflect the real current runtime limit of a process.
To avoid having to close too many non-existing file descriptors, you can use this code instead:
int fdlimit = (int)sysconf(_SC_OPEN_MAX);
for (int i = STDERR_FILENO + 1; i < fdlimit; i++) close(i);
sysconf(_SC_OPEN_MAX) is documented to update correctly if the open file limit (RLIMIT_NOFILE) has been raised using setrlimit(). The resource limits (rlimits) are the effective limits for a running process and for files they will always have to be between _POSIX_OPEN_MAX (documented as the minimum number of file descriptors a process is always allowed to open, must be at least 20) and OPEN_MAX (must be at least _POSIX_OPEN_MAX and sets the upper limit).
While closing all possible descriptors in a loop is technically correct and will work as desired, it may try to close several thousand file descriptors, most of which often do not exist. Even if the close() call for a non-existing file descriptor is fast (which is not guaranteed by any standard), it may take a while on weaker systems (think of embedded devices, think of small single-board computers), which may be a problem.
So several systems have developed more efficient ways to solve this issue. Famous examples are closefrom() and fdwalk(), which BSD and Solaris systems support. Unfortunately, The Open Group voted against adding closefrom() to the standard (quote): "it is not possible to standardize an interface that closes arbitrary file descriptors above a certain value while still guaranteeing a conforming environment." This is of course nonsense, as they make the rules themselves and if they define that certain file descriptors can always be silently omitted from closing if the environment or system requires or the code itself requests that, then this would break no existing implementation of that function and still offer the desired functionality for the rest of us. Without these functions people will use a loop and do exactly what The Open Group tries to avoid here, so not adding it only makes the situation even worse.
On some platforms you are basically out of luck, e.g. macOS, which is fully POSIX conformant. If you don't want to close all file descriptors in a loop on macOS, your only option is to not use fork()/exec...() but instead posix_spawn(). posix_spawn() is a newer API for platforms that don't support process forking; it can be implemented purely in user space on top of fork()/exec...() on platforms that do support forking and can otherwise use some other API a platform offers for starting child processes. On macOS there exists a non-standard flag POSIX_SPAWN_CLOEXEC_DEFAULT, which will treat all file descriptors as if the CLOEXEC flag has been set on them, except for those for which you explicitly specified file actions.
On Linux you can get a list of file descriptors by looking at the path /proc/{PID}/fd/ with {PID} being the process ID of your process (getpid()), that is, if the proc file system has been mounted at all and it has been mounted to /proc (but a lot of Linux tools rely on that, not doing so would break many other things as well). Basically you can limit yourself to closing all descriptors listed under this path.
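A sketch of the Linux approach just described, assuming /proc is mounted. It reads /proc/self/fd (the calling process's own entry, which avoids building the path from getpid()) and closes every listed descriptor at or above a chosen number. The function name close_fds_from() is made up here; it is not the BSD closefrom().

```c
#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: close every open descriptor >= lowfd.
 * Returns 0 on success, -1 if /proc/self/fd could not be read. */
int close_fds_from(int lowfd)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;

    int own_fd = dirfd(dir);   /* the listing itself uses a descriptor */
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(ent->d_name, &end, 10);
        if (end == ent->d_name || *end != '\0')
            continue;          /* skips "." and ".." */
        if (fd >= lowfd && fd != own_fd)
            close((int)fd);
    }
    closedir(dir);             /* releases own_fd as well */
    return 0;
}
```

A typical call would be close_fds_from(STDERR_FILENO + 1) in the child between fork() and exec...(). In a multithreaded program opendir() may allocate memory, which is not guaranteed to be safe after fork(), so treat this as an illustration and not as a drop-in solution.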
True story: Once upon a time I wrote a simple little C program that opened a file, and I noticed that the file descriptor returned by open was 4. "That's funny," I thought. "Standard input, output, and error are always file descriptors 0, 1, and 2, so the first file descriptor you open is usually 3."
So I wrote another little C program that started reading from file descriptor 3 (without opening it, that is, but rather, assuming that 3 was a pre-opened fd, just like 0, 1, and 2). It quickly became apparent that, on the Unix system I was using, file descriptor 3 was pre-opened on the system password file. This was evidently a bug in the login program, which was exec'ing my login shell with fd 3 still open on the password file, and the stray fd was in turn being inherited by programs I ran from my shell.
Naturally the next thing I tried was a simple little C program to write to the pre-opened file descriptor 3, to see if I could modify the password file and give myself root access. This, however, didn't work; the stray fd 3 was opened on the password file in read-only mode.
But at any rate, this helps to explain why you shouldn't leave file descriptors open when you exec a child process.
[Footnote: I said "true story", and it mostly is, but for the sake of the narrative I did change one detail. In fact, the buggy version of /bin/login was leaving fd 3 opened on the groups file, /etc/group, not the password file.]

Accessing and running /usr/bin programs in C

What is the most efficient method for running programs from the /usr/bin directory in a C program? I am tasked with getting user input, and if the user input matches up with a program in bin, running the respective program.
I had the idea of throwing the names of all the bin's programs in a text file, and using a loop to iterate through each word in the file while comparing the word to the input. However, I figured that might be reinventing the wheel a bit. Are there any streamlined ways to do this?
The "classical" way of invoking a program from an existing process on any of the UNIX-like OS'es is to use one of the exec() functions. When you read about exec() most tutorials will start by explaining another function: fork(). These functions are very commonly used together, but don't get too caught up on that, because they are both useful in their own right.
To answer your question, a fairly efficient way of doing what you're after:
1. Take the user-generated input from whatever your source happens to be.
2. (Optionally, call fork().)
3. Call the execvp() function.
4. What you do here will depend on whether you called fork() in step (2), and what you intend to do (if anything) after performing the task you described.
execvp() will do the legwork for you, by automatically searching on your environment PATH for a filename matching the first argument. If the current environment does not have a PATH set, it will default to /bin:/usr/bin. Since the only way that a call to exec() can yield a return value is when that call failed, you might want to check the value of errno as part of step (4). In the event that the user input didn't match any executable in the environment PATH, errno will be set to ENOENT. Exactly how you do this and what additional steps might be worth taking will depend on whether or not you forked, along with any additional requirements for your program.
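The steps above, with the optional fork() included, might look roughly like this. run_program() is a name invented for the sketch, and returning 127 when the program cannot be executed simply borrows the shell's "command not found" status.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] via a $PATH search and wait for it.
 * Returns the program's exit status, 127 if it could not be executed,
 * or -1 if fork/wait failed or the program was killed by a signal. */
int run_program(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        perror(argv[0]);   /* only reached if exec failed; errno says why */
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The argument vector would be built from the user's input, with the program name first and a NULL pointer last, e.g. { "ls", "-l", NULL }.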
I would suggest checking whether the name you get from the user matches a file in the /usr/bin directory and, if it does, using the system() function to run the program.
https://linux.die.net/man/3/system
#include <stdlib.h>
int system(const char *command);
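For example:
char path[PATH_MAX];  /* needs <limits.h>, <stdio.h>, <stdlib.h>, <string.h>, <unistd.h> */
int n = snprintf(path, sizeof path, "/usr/bin/%s", userinput);
/* Reject names containing '/' so the input cannot point outside /usr/bin */
if (n > 0 && (size_t)n < sizeof path && strchr(userinput, '/') == NULL
        && access(path, X_OK) == 0) {
    system(path);  /* the string is handed to a shell, so validate untrusted input first */
}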

Closing open directories on exit()

I would like to know if there is a way to access a list of all open directories from the current process? I have a function that opens many directories recursively but exits the program as soon as something is wrong. Of course, I would like to close all directories before calling exit() without having to keep track of everything I open. Is this even possible?
Thanks!
I have a function that opens many directories recursively but exits the program as soon as something is wrong.
Of course, I would like to close all directories before calling exit() without having to keep track of everything I open.
I think your very approach is wrong. What is the point of opening the directories if you don't keep a handle on them?
You should keep a reference to the opened directory as long as you need it and discard it as soon as you can.
Keep in mind that normally the number of open file descriptors is limited, e.g. to 1024.
You do not need to do this as exit() will (eventually) exit the process, which will close all open file descriptors whether for directories or real files.
However, you absolutely do need to worry about valgrind and friends reporting this, as this means fds are leaking in your program. But the solution is not to hunt around for open directories, but rather to simply ensure each opendir is matched by a closedir. That's what valgrind is prompting you to do.
When you exit(), file handles are close()d. This is good for one-time tools, but not good practice in the long run.
You should instead walk back up the recursion, close()ing as you go. For example, replace:
exit(1);
with:
close(current_fd);
return NULL;
Change your recursive call to:
if (thisfunc(...) == NULL) {
close(current_fd);
return NULL;
}

Set an environment variable from a process that will be visible by all processes

How can I set an environment variable from a process so that it is visible to all processes?
I'm using C with Glib.
I have 10 processes that use the same library. The problem is that in that library a checking procedure (which is CPU hungry) is performed. I want to avoid that library checking procedure to be executed for every process. For the first process that is using the library it will be enough.
This is simply not possible.
Setting an environment variable (or changing your current environment) is only visible to the child (and descendant) processes of your current process.
Other processes, in particular the parent process (usually the shell running in the terminal where you start your program) are not affected.
You might play dirty tricks like e.g. adding lines into $HOME/.bashrc etc. But you should not.
You just need to document what environment variables are relevant. It is the user's responsibility to set environment variables (perhaps by manually editing his $HOME/.bashrc etc etc). Leave that freedom to your user. Explain to him how to do that and why.
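To make the inheritance rule concrete, here is a small sketch in which a process sets a variable and a forked child checks whether it can see it. The function name child_sees_variable() is invented for the example; the parent shell that started the program is unaffected either way.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: set name=value, fork, and let the child look it up.
 * Returns 1 if the child saw the value, 0 if not, -1 on error. */
int child_sees_variable(const char *name, const char *value)
{
    if (setenv(name, value, 1) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* The child starts with a copy of the parent's environment. */
        const char *seen = getenv(name);
        _exit(seen != NULL && strcmp(seen, value) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```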
You edited your question to explain that
I have 10 processes that use the same library. The problem is that in that library a checking procedure ( which is CPU hungry ) is performed. I want to avoid that library checking procedure to be executed for every process.
But you definitely should not need to change environment variable for that.
You could:
Decide and document that the checking is not performed unless some particular environment variable (or some program argument) is given.
Decide that the checking is given a particular file name, use a locked write to write that file, and use locked reads to read it again.
Have the checking write its result to some file whose name is known in advance, and read that file before deciding if you want to make the costly checks.
Have one process start all the others and inform them about the check (perhaps indeed by setting some environment variable or some program argument), or use some inter-process communication trick to communicate with the others (you could use sockets, locked files, shared memory, etc.).
Do many other tricks.
That's not possible. You can set the environment for child processes only.
flock() sounds like it may be your friend.
http://beej.us/guide/bgipc/html/multi/flocking.html
You may also want to look at Semaphores or SHM (Shared Memory).
https://beej.us/guide/bgipc/html/multi/semaphores.html
http://beej.us/guide/bgipc/html/multi/shm.html
It all depends on the level of coordination you want. File locks will be good enough for one process to say stay out while I'm working. Semaphores and shared memory would allow you to coordinate access.

What can cause exec to fail? What happens next?

What are the reasons that an exec (execl, execlp, etc.) can fail? If you make a call to exec and it returns, are there any best practices other than just panicking and calling exit?
The problem with handling exec failure is that usually exec is performed in a child process, and you want to do the error handling in the parent process. But you can't just exit(errno) because (1) you don't know if error codes fit in an exit code, and (2), you can't distinguish between failure to exec and failure exit codes from the new program you exec.
The best solution I know is using pipes to communicate the success or failure of exec:
Before forking, open a pipe in the parent process.
After forking, the parent closes the writing end of the pipe and reads from the reading end.
The child closes the reading end and sets the close-on-exec flag for the writing end.
The child calls exec.
If exec fails, the child writes the error code back to the parent using the pipe, then exits.
The parent reads EOF (a zero-length read) if the child successfully performed exec, since close-on-exec made successful exec close the writing end of the pipe. Or, if exec failed, the parent reads the error code and can proceed accordingly. Either way, the parent blocks until the child calls exec.
The parent closes the reading end of the pipe.
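The pipe protocol described in these steps could be sketched as follows. spawn_checked() is a name made up for the example, and execvp() is used as one possible member of the exec family.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork and exec argv[0].
 * Returns 0 and stores the child's pid if exec succeeded, the child's
 * errno value if exec failed, or -1 if the setup itself failed. */
int spawn_checked(char *const argv[], pid_t *child)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        fcntl(fds[1], F_SETFD, FD_CLOEXEC);  /* a successful exec closes it */
        execvp(argv[0], argv);
        int err = errno;                     /* exec failed: report why */
        if (write(fds[1], &err, sizeof err) != (ssize_t)sizeof err) {
            /* nothing more the child can do about it */
        }
        _exit(127);
    }

    close(fds[1]);
    int err = 0;
    ssize_t n;
    do {
        n = read(fds[0], &err, sizeof err);  /* blocks until exec or failure */
    } while (n < 0 && errno == EINTR);
    close(fds[0]);

    if (n == 0) {                            /* EOF: exec succeeded */
        *child = pid;
        return 0;
    }
    waitpid(pid, NULL, 0);                   /* reap the failed child */
    return n == (ssize_t)sizeof err ? err : -1;
}
```

On success the caller still owns the child and is expected to waitpid() for it at some point; the exit status it collects then belongs to the new program alone.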
From the exec(3) man page:
The execl(), execle(), execlp(), execvp(), and execvP() functions may fail and set errno for any of the errors specified for the library functions execve(2) and malloc(3).
The execv() function may fail and set errno for any of the errors specified for the library function execve(2).
And then from the execve(2) man page:
ERRORS
Execve() will fail and return to the calling process if:
[E2BIG] - The number of bytes in the new process's argument list is larger than the system-imposed limit. This limit is specified by the sysctl(3) MIB variable KERN_ARGMAX.
[EACCES] - Search permission is denied for a component of the path prefix.
[EACCES] - The new process file is not an ordinary file.
[EACCES] - The new process file mode denies execute permission.
[EACCES] - The new process file is on a filesystem mounted with execution disabled (MNT_NOEXEC in <sys/mount.h>).
[EFAULT] - The new process file is not as long as indicated by the size values in its header.
[EFAULT] - Path, argv, or envp point to an illegal address.
[EIO] - An I/O error occurred while reading from the file system.
[ELOOP] - Too many symbolic links were encountered in translating the pathname. This is taken to be indicative of a looping symbolic link.
[ENAMETOOLONG] - A component of a pathname exceeded {NAME_MAX} characters, or an entire path name exceeded {PATH_MAX} characters.
[ENOENT] - The new process file does not exist.
[ENOEXEC] - The new process file has the appropriate access permission, but has an unrecognized format (e.g., an invalid magic number in its header).
[ENOMEM] - The new process requires more virtual memory than is allowed by the imposed maximum (getrlimit(2)).
[ENOTDIR] - A component of the path prefix is not a directory.
[ETXTBSY] - The new process file is a pure procedure (shared text) file that is currently open for writing or reading by some process.
malloc() is a lot less complicated, and uses only ENOMEM. From the malloc(3) man page:
If successful, calloc(), malloc(), realloc(), reallocf(), and valloc() functions return a pointer to allocated memory. If there is an error, they return a NULL pointer and set errno to ENOMEM.
What you do after the exec() call returns depends on the context - what the program is supposed to do, what the error is, and what you might be able to do to work around the problem.
One source of trouble could be that you specified a simple program name instead of a pathname; maybe you could retry with execvp(), or convert the command into an invocation of sh -c 'what you originally specified'. Whether any of these is reasonable depends on the application. If there are major security issues involved, probably you don't try again.
If you specified a pathname and there is a problem with that (ENOTDIR, ENOENT, EPERM), then you may not have any sensible fallback, but you can report the error meaningfully.
In the old days (10+ years ago), some systems did not support the '#!' shebang notation, and if you were not sure whether you were executing an executable or a shell script, you tried it as an executable and then retried it as a shell script. That might or might not work if you were running a Perl script, but in those days, you wrote your Perl scripts to detect that they were being run by a shell and to re-exec themselves with Perl. Fortunately, those days are mostly over.
To the extent possible, it is important to ensure that the process reports the problem so that it can be traced - writing its message to a log file or just to stderr (or maybe even syslog()), so that those who have to work out what went wrong have more information to help them other than the hapless end user's report "I tried X and it didn't work". It is crucial that if nothing works, then the exit status is not 0 as that indicates success. Even that might be ignored - but you did what you could.
Other than just panicking, you could take a decision based on errno's value.
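As one small example of deciding on errno's value, the sketch below maps it to an exit status following the shell convention (127 when the command was not found, 126 when it was found but could not be executed). The function name is invented for the example, and which errors deserve special treatment is a judgment call for your program.

```c
#include <errno.h>

/* Hypothetical helper: choose an exit status after exec failed with err. */
int exit_status_for_exec_error(int err)
{
    switch (err) {
    case ENOENT:
        return 127;   /* shell convention: command not found */
    default:
        return 126;   /* shell convention: found but could not be executed */
    }
}
```

After a failed exec call, the child would report the error and then call _exit(exit_status_for_exec_error(errno)), having saved errno first if anything in between might change it.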
Exec should always succeed (except for shells, e.g. if the user entered a bogus command).
If exec does fail, it indicates:
a "fault" with the program (missing or bad component, wrong pathname, bad memory, ...), or
a serious system error (out of memory, too many processes, disk fault, ...)
For any serious error, the normal approach is to write the error message on stderr, then exit with a failure code. Almost all of the standard tools do this. For exec:
execl("bork", "bork", NULL);
perror("failed: exec");
exit(127);
The shell does that, too (more or less).
Normally if a child process fails, the parent has failed too and should exit. It does not matter whether the child failed in exec, or while running the program. If exec failed, it does not matter why exec failed. If the child process failed for any reason, the calling process is in trouble and needs to stop.
Don't waste lots of time trying to anticipate all possible error conditions. Don't write code that tries to handle each error code in the best possible way. You'll just bloat the code, and introduce many new bugs. If your program is broken, or it's being abused, it should simply fail. If you force it to continue, worse trouble will come of that.
For example, if the system is out of memory and thrashing swap, we don't want to cycle over and over trying to run a process; it would just make the situation worse. If we get a filesystem error, we don't want to continue running on that filesystem; it might make the corruption worse. If the program was installed wrongly, or has a bug, or has memory corruption, we want to stop as soon as possible, before that broken program does some real damage (such as sending a corrupted report to a client, trashing a database, ...).
One possible alternative: a failing process might call for help, pause itself (SIGSTOP), then retry the operation if told to continue. This could help when the system is out of memory, or disks are full, or perhaps even if there is a fault in the program. Few operations are so expensive and important that this would be worthwhile.
If you're making an interactive GUI program, try to do it as a thin wrapper over reusable command-line tools (which exit if something goes wrong). Every function in your program should be accessible through the GUI, through the command line, and as a function call. Write your functions. Write a few tools to make command-line and GUI wrappers for any function. Use sub-processes too.
If you are making a truly critical system, such as a controller for a nuclear power station, or a program to predict tsunamis, then what are you doing reading my dumb advice? Critical systems should not depend entirely on computers or software. There needs to be a 'manual override', with someone to drive it. Especially, do not attempt to build a critical system on MS Windows; that is like building sandcastles underwater.

Resources