I'm developing a simple shell for an assignment. I read a command entered by the user, tokenize it, fork(), then in the child process use execvp() to execute the command in the background.
The problem is that I need to implement a history feature that records valid commands. The only way I know to check whether the string the user entered is a valid command is to check whether execvp() returns -1. That is a fine way to detect an invalid command, but the call to execvp() happens in the child process, and the data structure I use for my history is copied to the child on fork() rather than shared. Since the child's history structure is a copy, any changes it makes won't be reflected in the parent's copy, so I can't update the history based on the result of execvp().
Is there any way I can check whether execvp() would return -1 without actually calling it (i.e. before or after the fork)? If I could figure out a way to do that, I'd be able to verify in the parent process whether or not execvp() will succeed, and use that information to update my history data structure properly.
What you are asking for is a system call that would let you implement the classic check-before-do (time-of-check to time-of-use) race condition.
In that antipattern, a program checks whether an action is possible and then performs it, leaving open the possibility that some external event just after the check will make the action illegal.
So then the action fails, even though the program checked that it was possible. This often results in chaos.
You should avoid this antipattern, and the system API should help by not tempting you with system calls you would only use to get yourself into trouble. In this case, the system does the right thing; there is no such API.
The parent process must eventually retrieve the exit status of the child. That is the moment you need to update (or not) the history. If a failed execvp causes the child to exit() with a failure status code, the parent will notice the failure and can react by not adding the command line to the history.
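Here is a minimal sketch of that strategy for the synchronous case, assuming a hypothetical history_add() helper in place of your real data structure. The child signals exec failure through a reserved exit status, and the parent records the command only when the exec itself did not fail:

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

#define EXEC_FAILED 127  /* conventional status for "could not run the command" */

/* Hypothetical stand-in for the real history structure. */
static void history_add(const char *line) {
    printf("history += %s\n", line);
}

static void run_and_record(char *argv[], const char *line) {
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(EXEC_FAILED);            /* execvp returned, so it failed */
    }
    int status;
    waitpid(pid, &status, 0);          /* reap the child; no zombies */
    /* Record the command only if the exec itself succeeded. */
    if (!(WIFEXITED(status) && WEXITSTATUS(status) == EXEC_FAILED))
        history_add(line);
}

int main(void) {
    char *good[] = { "echo", "hi", NULL };
    char *bad[]  = { "no-such-cmd", NULL };
    run_and_record(good, "echo hi");      /* recorded */
    run_and_record(bad,  "no-such-cmd");  /* not recorded */
    return 0;
}

One caveat: a program that legitimately exits with status 127 would be misclassified, which is why real shells reserve 126 and 127 for exactly this purpose.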
Some notes added after a bit of reflection:
To retrieve the status code of the child process, the parent will call wait or waitpid. For synchronous execution, the parent will likely do so immediately; for asynchronous execution, the parent will do so when it receives a SIGCHLD signal. But it is imperative that the parent does this, to avoid zombie processes.
In the case of asynchronous execution, it is not possible to use this strategy to avoid putting invalid commands into the history, because asynchronous commands must be recorded in the history when they are started. For a similar reason, POSIX shells also count asynchronous execution of a command as successful, even if the command is invalid.
While this exercise undoubtedly has pedagogic value (as I hope this answer demonstrates), it is actually a terrible way of doing shell history. While shell users occasionally use history to retrieve and re-execute successful commands, the history feature is much more useful to retrieve and edit an unsuccessful command. It's intensely annoying to not be able to make corrections from a history feature. (Many Android applications exhibit precisely this annoying flaw with search history: after a search which gives you undesired results, you can retrieve the incorrect search and rerun it, but not modify it. I'm glad to say that things have improved since my first Android.)
Related
I have recently been wondering why we need to call exit in the child after performing execvp. An in-depth explanation is welcome.
The child does not strictly need to call exit() after execvp(), but it is usually wise to ensure that it does so; so much so that it is worth giving to novices as a rule.
When a call to execvp() or one of the other exec functions succeeds, it has the effect of replacing the currently-running process image with a new one. As a result, these functions do not return on success, and nothing that follows matters in that case. The issue, then, is entirely about what to do in the event that the exec* call fails.
If you do not exit the child in the event that an exec fails, then it will continue running the code of the parent. That is rarely intended or wanted. It may do work it was never meant to do, delay the parent (which is often wait()ing for it) by not exiting promptly, and ultimately mislead the parent with an exit status that does not reflect what actually happened.
All of that, and perhaps more, is addressed by ensuring that the control flow following an exec* call pretty quickly reaches program termination with a failure status (typically via _Exit() or _exit(); less typically via exit() or abort()). Depending on the circumstances, it may or may not be appropriate to emit a diagnostic message before exiting, or to perform some other kind of cleanup.
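For illustration, a minimal sketch of the rule: the code path after execvp() is only reachable on failure, so it reports the error and terminates immediately rather than falling through into the parent's logic:

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        char *argv[] = { "ls", "-l", NULL };
        execvp(argv[0], argv);
        /* Reached only if the exec failed: report and terminate at once,
           using _exit() so inherited atexit() handlers don't run and
           inherited stdio buffers aren't flushed a second time. */
        perror("execvp");
        _exit(127);
    }
    int status;
    waitpid(pid, &status, 0);  /* the parent reaps the child */
    return 0;
}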
int uv_process_kill(uv_process_t* handle, int signum)
Sends the specified signal to the given process handle. Check the documentation on uv_signal_t — Signal handle for signal support, specially on Windows.
int uv_kill(int pid, int signum)
Sends the specified signal to the given PID. Check the documentation on uv_signal_t — Signal handle for signal support, specially on Windows.
Are these two ways of doing the exact same thing, or is the mechanism inside the library somehow different? I need to handle the error condition where my UV loop may have failed to run (for whatever reason), but I have already called uv_spawn for all the processes I wish to spawn.
My goal is to clean up the resources allocated to the child processes, without needing to know if the uv loop is running, stopped or in an error state.
uv_process_kill and uv_kill perform the same action, but they differ in their interfaces: the former accepts a uv_process_t handle, while the latter requires a pid explicitly (both take a signal number as their second argument).
It's worth noting that the struct uv_process_t (that you can use with uv_process_kill) has a field named pid (that you can use with uv_kill), thus one could argue that the two functions are redundant.
That said, the pid of the process to be killed could come from an external source (for example, a user could provide it on the command line; think of how the kill tool works on Linux). There is therefore no guarantee that you have an instance of uv_process_t whenever you have a pid, and it goes without saying that the two functions serve slightly different purposes.
Of course, you can still use uv_kill when you have an instance of uv_process_t as:
uv_kill(proc.pid, signum);
However, this is not the way libuv is meant to be used: you should always prefer the functions that accept uv_* data structures when you have them, for they know how to tear everything down correctly.
To sum up, you can think of uv_process_kill as the more libuv-oriented function, to be used when you are in charge of the whole lifecycle of the process (you spawn it and you kill it if needed). On the other hand, uv_kill is a more general-purpose function, to be used when you want to deal with processes whose pid you know but for which you don't have a properly initialized uv_process_t.
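For illustration, a minimal sketch (not from the libuv docs) of the handle-based variant, using a simple sleep child; the raw-pid call is shown in a comment for comparison:

#include <stdio.h>
#include <string.h>
#include <signal.h>
#include <uv.h>

static uv_process_t child;

static void on_exit_cb(uv_process_t *req, int64_t exit_status, int term_signal) {
    fprintf(stderr, "child exited: status %lld, signal %d\n",
            (long long)exit_status, term_signal);
    uv_close((uv_handle_t *)req, NULL);
}

int main(void) {
    uv_loop_t *loop = uv_default_loop();

    char *args[] = { "sleep", "100", NULL };
    uv_process_options_t options;
    memset(&options, 0, sizeof(options));
    options.file = "sleep";
    options.args = args;
    options.exit_cb = on_exit_cb;

    if (uv_spawn(loop, &child, &options) != 0) {
        fprintf(stderr, "uv_spawn failed\n");
        return 1;
    }

    /* We hold the uv_process_t, so prefer the handle-based call. */
    uv_process_kill(&child, SIGTERM);
    /* Equivalent effect via the raw pid: uv_kill(child.pid, SIGTERM); */

    return uv_run(loop, UV_RUN_DEFAULT);
}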
Look at the source (here and here). uv_process_kill and uv_kill do the same thing.
I'm writing a program that spawns child processes. For security reasons, I want to limit what these processes can do. I know of security measures from outside the program such as chroot or ulimit, but I want to do something more than that. I want to limit the system calls done by the child process (for example preventing calls to open(), fork() and such things). Is there any way to do that? Optimally, the blocked system calls should return with an error but if that's not possible, then killing the process is also good.
I guess it can be done with ptrace(), but from the man page I don't really understand how to use it for this purpose.
It sounds like SECCOMP_FILTER, added in kernel version 3.5, is what you're after. The libseccomp library provides an easy-to-use API for this functionality.
By the way, chroot() and setrlimit() are both system calls that can be called within your program - you'd probably want to use one or both of these in addition to seccomp filtering.
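For illustration, here is a minimal sketch using libseccomp (link with -lseccomp); it allows everything by default and makes the blocked calls fail with EPERM rather than killing the process. Note that modern libcs route open() through the openat syscall, so both are blocked here:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <seccomp.h>   /* libseccomp */

int main(void) {
    /* Allow every syscall by default, then deny a chosen few. */
    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);
    if (ctx == NULL)
        return 1;

    /* Blocked calls fail with EPERM instead of killing the process. */
    seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(open), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(openat), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(fork), 0);

    if (seccomp_load(ctx) != 0) {   /* install the filter in the kernel */
        seccomp_release(ctx);
        return 1;
    }
    seccomp_release(ctx);           /* the kernel keeps its own copy */

    if (open("/etc/passwd", O_RDONLY) < 0)
        perror("open");             /* prints: Operation not permitted */
    return 0;
}

In your scenario you would install the filter in the child, between fork and exec, so only the spawned process is restricted.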
If you want to do it the ptrace way, you have some options (and some are really simple). First of all, I recommend following the tutorial explained here. With it you can learn how to find out which system calls are being made, along with the basic ptrace knowledge (don't worry, it's a very short tutorial). The options I know of are the following:
The easiest would be to kill the child; that is exactly the code here.
Second, you could make the system call fail by changing the registers with PTRACE_SETREGS and putting invalid values in them; you can also change the system call's return value if you want (again, with PTRACE_SETREGS).
Finally, you could skip the system call altogether. For that you need to know the address of the instruction after the system call, point the instruction register there, and set it (again, with PTRACE_SETREGS). A sketch of the first (kill) option follows.
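This is a rough sketch of the kill option, assuming x86-64 Linux, where the system call number is found in the orig_rax register; here the tracer kills the child the moment it attempts openat(2):

#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);  /* let the parent trace us */
        execlp("ls", "ls", (char *)NULL);       /* ls will call openat() */
        _exit(127);
    }

    int status;
    waitpid(child, &status, 0);                 /* first stop: at the exec */
    while (WIFSTOPPED(status)) {
        struct user_regs_struct regs;
        ptrace(PTRACE_GETREGS, child, NULL, &regs);
        if (regs.orig_rax == SYS_openat) {      /* forbidden syscall seen */
            fprintf(stderr, "killing child: openat attempted\n");
            kill(child, SIGKILL);
            break;
        }
        ptrace(PTRACE_SYSCALL, child, NULL, NULL);  /* run to next syscall stop */
        waitpid(child, &status, 0);
    }
    return 0;
}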
What are the reasons that an exec (execl,execlp, etc.) can fail? If you make a call to exec and it returns, are there any best practices other than just panicking and calling exit?
The problem with handling exec failure is that usually exec is performed in a child process while the error handling must happen in the parent process. But you can't just exit(errno), because (1) you don't know whether error codes fit in an exit code, and (2) you can't distinguish between a failure to exec and a failure exit code from the new program you exec.
The best solution I know is using pipes to communicate the success or failure of exec:
Before forking, open a pipe in the parent process.
After forking, the parent closes the writing end of the pipe and reads from the reading end.
The child closes the reading end and sets the close-on-exec flag for the writing end.
The child calls exec.
If exec fails, the child writes the error code back to the parent using the pipe, then exits.
The parent reads eof (a zero-length read) if the child successfully performed exec, since the close-on-exec flag caused the successful exec to close the writing end of the pipe. If exec failed instead, the parent reads the error code and can proceed accordingly. Either way, the parent blocks until the exec has either succeeded or failed.
The parent closes the reading end of the pipe.
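Here is a minimal sketch of the whole pattern (names are illustrative, not canonical); the deliberately bogus command makes the failure path fire:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int pipefd[2];
    if (pipe(pipefd) < 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                              /* child */
        close(pipefd[0]);                        /* step 3: close read end */
        fcntl(pipefd[1], F_SETFD, FD_CLOEXEC);   /* step 3: close-on-exec */
        execlp("no-such-program", "no-such-program", (char *)NULL);
        int err = errno;                         /* step 5: exec failed */
        write(pipefd[1], &err, sizeof err);
        _exit(127);
    }

    close(pipefd[1]);                            /* step 2: parent keeps read end */
    int err;
    ssize_t n = read(pipefd[0], &err, sizeof err);  /* blocks until exec resolves */
    close(pipefd[0]);
    if (n == 0)
        puts("exec succeeded");                  /* eof: pipe closed by exec */
    else
        printf("exec failed: %s\n", strerror(err));
    waitpid(pid, NULL, 0);                       /* reap the child either way */
    return 0;
}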
From the exec(3) man page:
The execl(), execle(), execlp(), execvp(), and execvP() functions may fail and set errno for any of the errors specified for the library functions execve(2) and malloc(3).
The execv() function may fail and set errno for any of the errors specified for the library function execve(2).
And then from the execve(2) man page:
ERRORS
Execve() will fail and return to the calling process if:
[E2BIG] - The number of bytes in the new process's argument list is larger than the system-imposed limit. This limit is specified by the sysctl(3) MIB variable KERN_ARGMAX.
[EACCES] - Search permission is denied for a component of the path prefix.
[EACCES] - The new process file is not an ordinary file.
[EACCES] - The new process file mode denies execute permission.
[EACCES] - The new process file is on a filesystem mounted with execution disabled (MNT_NOEXEC in <sys/mount.h>).
[EFAULT] - The new process file is not as long as indicated by the size values in its header.
[EFAULT] - Path, argv, or envp point to an illegal address.
[EIO] - An I/O error occurred while reading from the file system.
[ELOOP] - Too many symbolic links were encountered in translating the pathname. This is taken to be indicative of a looping symbolic link.
[ENAMETOOLONG] - A component of a pathname exceeded {NAME_MAX} characters, or an entire path name exceeded {PATH_MAX} characters.
[ENOENT] - The new process file does not exist.
[ENOEXEC] - The new process file has the appropriate access permission, but has an unrecognized format (e.g., an invalid magic number in its header).
[ENOMEM] - The new process requires more virtual memory than is allowed by the imposed maximum (getrlimit(2)).
[ENOTDIR] - A component of the path prefix is not a directory.
[ETXTBSY] - The new process file is a pure procedure (shared text) file that is currently open for writing or reading by some process.
malloc() is a lot less complicated, and uses only ENOMEM. From the malloc(3) man page:
If successful, calloc(), malloc(), realloc(), reallocf(), and valloc() functions return a pointer to allocated memory. If there is an error, they return a NULL pointer and set errno to ENOMEM.
What you do after the exec() call returns depends on the context - what the program is supposed to do, what the error is, and what you might be able to do to work around the problem.
One source of trouble could be that you specified a simple program name instead of a pathname; maybe you could retry with execvp(), or convert the command into an invocation of sh -c 'what you originally specified'. Whether any of these is reasonable depends on the application. If there are major security issues involved, probably you don't try again.
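For illustration, a rough sketch of that sh -c fallback (exec_with_shell_fallback is a made-up helper, and whether this is safe depends on where the command string comes from):

#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: try the string as a program name first,
   then hand the whole thing to the shell. Each line after an exec
   only runs if that exec failed, since a successful exec never returns. */
static void exec_with_shell_fallback(const char *cmd) {
    execlp(cmd, cmd, (char *)NULL);                   /* direct attempt */
    execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);  /* shell fallback */
    perror("exec");
    _exit(127);
}

int main(void) {
    exec_with_shell_fallback("echo hello from the fallback");
    return 0;  /* not reached */
}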
If you specified a pathname and there is a problem with that (ENOTDIR, ENOENT, EPERM), then you may not have any sensible fallback, but you can report the error meaningfully.
In the old days (10+ years ago), some systems did not support the '#!' shebang notation, and if you were not sure whether you were executing an executable or a shell script, you tried it as an executable and then retried it as a shell script. That might or might not work if you were running a Perl script, but in those days, you wrote your Perl scripts to detect that they were being run by a shell and to re-exec themselves with Perl. Fortunately, those days are mostly over.
To the extent possible, it is important to ensure that the process reports the problem so that it can be traced: write its message to a log file, to stderr, or maybe even to syslog(), so that those who have to work out what went wrong have more to go on than the hapless end user's report of "I tried X and it didn't work". It is crucial that, if nothing works, the exit status is not 0, since 0 indicates success. Even that might be ignored, but you did what you could.
Other than just panicking, you could take a decision based on errno's value.
Exec should always succeed
(except for shells, e.g. if the user entered a bogus command).
If exec does fail, it indicates:
a "fault" with the program (missing or bad component, wrong pathname, bad memory, ...), or
a serious system error (out of memory, too many processes, disk fault, ...)
For any serious error, the normal approach is to write the error message on stderr, then exit with a failure code. Almost all of the standard tools do this. For exec:
execl("bork", "bork", NULL);
perror("failed: exec");
exit(127);
The shell does that, too (more or less).
Normally if a child process fails, the parent has failed too and should exit. It does not matter whether the child failed in exec, or while running the program. If exec failed, it does not matter why exec failed. If the child process failed for any reason, the calling process is in trouble and needs to stop.
Don't waste lots of time trying to anticipate all possible error conditions. Don't write code that tries to handle each error code in the best possible way. You'll just bloat the code, and introduce many new bugs. If your program is broken, or it's being abused, it should simply fail. If you force it to continue, worse trouble will come of that.
For example, if the system is out of memory and thrashing swap, we don't want to cycle over and over trying to run a process; it would just make the situation worse. If we get a filesystem error, we don't want to continue running on that filesystem; it might make the corruption worse. If the program was installed wrongly, or has a bug, or has memory corruption, we want to stop as soon as possible, before that broken program does some real damage (such as sending a corrupted report to a client, trashing a database, ...).
One possible alternative: a failing process might call for help, pause itself (SIGSTOP), then retry the operation if told to continue. This could help when the system is out of memory, or disks are full, or perhaps even if there is a fault in the program. Few operations are so expensive and important that this would be worthwhile.
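A toy sketch of that alternative (the "operation" here is a made-up stand-in): the process stops itself with SIGSTOP and retries whenever an operator sends SIGCONT:

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Made-up stand-in for the failing operation: succeeds once
   someone creates /tmp/ready. */
static int try_operation(void) {
    return access("/tmp/ready", F_OK);
}

int main(void) {
    while (try_operation() != 0) {
        fprintf(stderr, "operation failed; stopping pid %d, send SIGCONT to retry\n",
                (int)getpid());
        raise(SIGSTOP);   /* suspend until an operator sends SIGCONT */
    }
    puts("operation succeeded");
    return 0;
}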
If you're making an interactive GUI program, try to build it as a thin wrapper over reusable command-line tools (which exit if something goes wrong). Every function in your program should be accessible through the GUI, through the command line, and as a function call. Write your functions. Write a few tools to make command-line and GUI wrappers for any function. Use sub-processes too.
If you are making a truly critical system, such as a controller for a nuclear power station, or a program to predict tsunamis, then what are you doing reading my dumb advice? Critical systems should not depend entirely on computers or software. There needs to be a 'manual override', with someone to drive it. Especially, do not attempt to build a critical system on MS Windows; that is like building sandcastles underwater.
Current scenario: I launch a process that forks, and after a while the original process calls abort().
The thing is that both the forked child and the original process print to the shell, but after the original one dies, the shell "returns" to the prompt.
I'd like to prevent the shell from returning to the prompt and carry on as if the process hadn't died, with the child handling the situation from there.
I'm trying to figure out how to do this but haven't found anything yet; my first guess involves tty handling, but I'm not sure how that works.
I forgot to mention: the child could take over the shell at fork time, if that makes it easier, via fd duplication or some redirection.
I think you'll probably have to go with a third process that handles user interaction, communicating with the "parent" and "child" through pipes.
You can even make it a fairly lightweight wrapper, just passing data back and forth to the parent and terminal until the parent dies, and then switching to passing to/from the child.
To add a little more: I think the fundamental problem you're going to run into is that the shell's execution of a command just doesn't work that way. The shell is doing the equivalent of calling system(): it waits for the process it just spawned to die, and once that happens, it presents the user with a prompt again. It's not really a tty issue; it's how the shell works.
bash (and I believe other shells) have the wait command:
wait: wait [n]
Wait for the specified process and report its termination status. If
N is not given, all currently active child processes are waited for,
and the return code is zero. N may be a process ID or a job
specification; if a job spec is given, all processes in the job's
pipeline are waited for.
Have you considered inverting the parent child relationship?
If the order in which the new processes will die is predictable, run the code that will abort in the "child" and the code that will continue in the parent.
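A rough sketch of that inversion (assuming the aborting code can indeed be moved): the crash-prone part runs in the child, so the parent keeps the terminal and the shell never sees its foreground process die:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        /* The code that used to run in the original process: it may abort. */
        fprintf(stderr, "risky part running as pid %d\n", (int)getpid());
        abort();
    }
    /* The code that used to run in the child: it keeps the terminal,
       so the shell does not get its prompt back early. */
    waitpid(pid, NULL, 0);
    printf("risky part died; continuing to handle the situation\n");
    return 0;
}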