I have a C function to do a fork and exec that will be called twice.
The first call executes a shell script (call it setenv.sh) which can be any kind of shell (bash/korn/c/perl etc) that will set environment variables. The envp array will be NULL for this call but the intent was that it will return a populated array based on environ from the child process after setenv.sh has run.
The second call will be a C or java program that needs a certain environment to run so for this call, the envp array will be the populated one returned from the first call.
int execute(char **args, int argc, char **envp)
{
char *function = "execute";
int status, i;
pid_t p, pid;
extern int errno;
sigset_t mask, savemask;
struct sigaction ignore, saveint, savequit;
int fd[2];
pipe(fd);
sigemptyset(&ignore.sa_mask);
ignore.sa_handler = SIG_IGN;
ignore.sa_flags=0;
sigaction(SIGINT, &ignore, &saveint);
sigaction(SIGQUIT, &ignore, &savequit);
sigemptyset(&mask);
sigaddset(&mask, SIGCHLD);
sigprocmask(SIG_BLOCK, &mask, &savemask);
if ((pid=fork()) < 0) status = -1;
if (pid ==0) {
/* Child */
close(fd[0]);
sigaction(SIGINT, &saveint, (struct sigaction *) 0);
sigaction(SIGQUIT, &savequit, (struct sigaction *) 0);
sigprocmask(SIG_SETMASK, &savemask, (sigset_t *) 0);
printf("Command Line Parameters\n");
printf("-----------------------\n");
for (i = 0; i < argc; i++) {
printf("[%d]: %s\n", (i+1), args[i]);
}
if (execve(*args, args, envp) < 0)
{
sprintf(err_data,"Failed to execute %s", args[0]);
perror(err_data);
return(FAILED);
}
write(fd[1], &environ, sizeof(environ));
close(fd[1]);
}
while (waitpid(pid, &status, 0) < 0) {
if (errno != EINTR) {
status = -1;
break;
}
}
if (status==0) {
read(fd[0], &envp, sizeof(envp));
}
close(fd[0]);
sigaction(SIGINT, &saveint, (struct sigaction *) 0);
sigaction(SIGQUIT, &savequit, (struct sigaction *) 0);
sigprocmask(SIG_SETMASK, &savemask, (sigset_t *) 0);
return(status);
}
Without the pipe code, this function works fine for executing a real program, and I can also pass it a set of environment variables in an envp array and the program runs in that environment.
However, in testing with the pipe included, I find that after the exec of setenv.sh the child process never writes environ to the pipe, and the parent then just blocks on the read from the pipe.
I understand why it doesn't work: the exec of the shell script overwrites the original C code in the child. The question is, is there a way to run a shell script with exec and capture the resulting ENVIRONMENT back in the parent (not the same as capturing stdin/stdout/stderr)? Assume you cannot change the contents of setenv.sh because it may be provided by a third party.
No need to nitpick over error handling etc.; this is a work in progress, so I'm just after some input on how to achieve the aim.
An alternative I considered was parsing the setenv.sh script in the parent to collect the variables into an array, which could then be passed to the real program. The problem is that setenv.sh might contain if blocks and includes of other shell scripts, so I really wanted to capture the environment at the end of the run of setenv.sh (by exec'ing it) and pass this back to the parent.
Any suggestions appreciated.
You basically can't solve this in general without using the debugging facilities of your operating system to dig into the memory of your child process, which basically requires you to write half of a debugger.
The closest you can get with a third-party script is something like this. Let's say that the script is for /bin/bash. You write your own wrapper script like this:
#!/bin/bash
. setenv.sh
env >&3
Where 3 is the file descriptor number of your pipe. You can write equivalent scripts for other shells. The only reason this works is that the "setenv.sh" script is sourced inside your wrapper script without creating a child process. Environment variables are only passed from a process to its children, never back to the parent.
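A minimal sketch of the parent side of this idea, assuming the third-party script is written for a POSIX shell and is given as a path containing a slash. The helper name capture_env is hypothetical, and instead of descriptor 3 it makes the pipe the wrapper's stdout while sending the script's own output to stderr. It will not capture anything if the script calls exit, and it does not handle values that contain newlines.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the question): source `script` in /bin/sh,
 * run env, and return its "KEY=VALUE" lines as a NULL-terminated array.
 * Returns NULL on failure. Values containing newlines are not handled. */
char **capture_env(const char *script)
{
    int fd[2];
    if (pipe(fd) < 0)
        return NULL;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return NULL;
    }
    if (pid == 0) {
        /* Child: stdout becomes the pipe; the script's own output goes to stderr. */
        close(fd[0]);
        dup2(fd[1], STDOUT_FILENO);
        close(fd[1]);
        execl("/bin/sh", "sh", "-c", ". \"$1\" >&2; env", "sh", script, (char *)NULL);
        _exit(127);
    }

    /* Parent: read to EOF before waiting, so a full pipe cannot deadlock. */
    close(fd[1]);
    FILE *in = fdopen(fd[0], "r");
    char **envp = NULL, *line = NULL;
    size_t n = 0, cap = 0;
    ssize_t len;
    while (in && (len = getline(&line, &cap, in)) > 0) {
        if (line[len - 1] == '\n')
            line[len - 1] = '\0';
        char **tmp = realloc(envp, (n + 2) * sizeof *envp);
        if (!tmp)
            break;
        envp = tmp;
        if (!(envp[n] = strdup(line)))
            break;
        envp[++n] = NULL;
    }
    free(line);
    if (in)
        fclose(in);
    else
        close(fd[0]);

    int status = -1;
    waitpid(pid, &status, 0);
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        for (size_t i = 0; i < n; i++)
            free(envp[i]);
        free(envp);
        return NULL;
    }
    return envp;
}
```

The returned array has the same shape as envp, so it could be handed to the second execve() call as its third argument. Only exported variables appear, because env prints the environment rather than the shell's private variables.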
In a system I use at work we have environment variables that need to be unified between many different programs that come from various scripts, many of which we don't have any control over. The way we resolved that mess is that instead of environment variables we require those scripts to output "KEY=VALUE\n" lines and then import them into scripts, makefiles, etc. through simple scripts (if required). That's probably the best you can do.
You can use :
extern char **environ;
environ is defined as a global variable in the Glibc source file posix/environ.c.
Here is a C program which operates finding specific properties like CPU bus info by consecutive calls of lshw (to access total hardware list with respective properties) and grep (to select just a relevant point among lshw results):
char *strCombine(char *str1, char *str2, int n)
{
int i = strlen(str2);
int j = 0;
if((str2 = (char *) realloc(str2, (i + n + 1))) == NULL)
perror(0);
while(j < n && str1[j])
{
str2[i] = str1[j];
i++;
j++;
}
str2[i] = 0;
return (str2);
}
int main()
{
pid_t parent;
char buf[1000] = {0};
char *str;
char *argv[6] = {"/usr/bin/lshw", "-C", "CPU", "|", "grep", "bus info"};
int fd[2];
int ret;
if(pipe(fd) == -1)
{
perror(NULL);
return -1;
}
parent = fork();
if(parent == 0)
{
close(fd[1]);
while((ret = read(fd[0], buf, 1000)))
str = strCombine(buf, str, ret);
close(fd[0]);
}
else
{
close(fd[0]);
execv(argv[0], argv);
close(fd[1]);
wait(0);
}
wait(0);
printf("%s", str);
return 0;
}
In this code grep is expected to run after lshw, since both are supposed to be launched by the execv call. However, the pipeline doesn't work: lshw prints its usage message to the terminal (running on Ubuntu 18.04 LTS) instead of the bus info I originally wanted. Why does the program fail to show just the info that matters, and how should I set up the pipeline?
The vertical bar is shell syntax, not a parameter you can use to separate commands: the execve(2) system call loads one program into the virtual space of one process only, so "|", "grep" and "bus info" are simply passed to lshw as arguments. You need to create two processes, one per command you want to execute, and connect them so that the output of one goes to the input of the other. I think you'll also be interested in the output of the last command, so you need two redirections (one from the first command to the second, and one from the output of the second command to a pipe descriptor), two forks, and two execs in order to do this.
First the good news, you can do all this stuff with a simple call to popen(3) without the nitty gritties of making forks and execs while redirecting i/o from individual commands. Just use this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char *cmd = "/usr/bin/lshw -C CPU | grep 'bus info'";
int n = 0;
char line[1000];
/* f will be associated to the output of the pipeline, so you can read from it.
* this is stated by the "r" of the second parameter */
FILE *f = popen(cmd, "r");
if (!f) {
perror(cmd);
exit(EXIT_FAILURE);
}
/* I read, line by line, and process the output,
* printing each line with some format string, but
* you are free here. */
while (fgets(line, sizeof line, f)) {
char *l = strtok(line, "\n");
if (!l) continue;
printf("line %d: [%s]\n", ++n, l);
}
/* once finished, you need to pclose(3) it. This
* makes program to wait(2) for child to finish and
* closing descriptor */
pclose(f);
}
If you need to build such a pipeline by hand, you'll end up having to do two
redirections (from the first command to the second, and from the second to the
parent process) and fork/exec both processes yourself.
With popen(3), a subshell does the piping and redirection work for you,
and you just get a FILE * to read from.
(if I find some time, I'll show you a full example of a chain of N commands with redirections to pipe them, but I cannot promise, as I have to write the code)
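For comparison, here is a minimal sketch of the manual route for exactly two commands: two pipes, two forks, two execs. The helper name pipeline2 is hypothetical, error handling is kept to a minimum, and the caller is assumed to read the returned descriptor to EOF, close it, and then wait() for both children.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start `left | right` and return a descriptor from
 * which the parent can read right's stdout, or -1 on failure. */
int pipeline2(char *const left[], char *const right[])
{
    int mid[2], out[2];

    if (pipe(mid) < 0)
        return -1;
    if (pipe(out) < 0) {
        close(mid[0]);
        close(mid[1]);
        return -1;
    }

    pid_t p1 = fork();
    if (p1 == 0) {
        /* First child: stdout goes into the pipe between the two commands. */
        dup2(mid[1], STDOUT_FILENO);
        close(mid[0]); close(mid[1]); close(out[0]); close(out[1]);
        execvp(left[0], left);
        _exit(127);
    }

    pid_t p2 = fork();
    if (p2 == 0) {
        /* Second child: stdin comes from the first, stdout goes to the parent. */
        dup2(mid[0], STDIN_FILENO);
        dup2(out[1], STDOUT_FILENO);
        close(mid[0]); close(mid[1]); close(out[0]); close(out[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* Parent: keep only the read end of the output pipe. Closing the other
     * ends matters, otherwise the readers never see EOF. */
    close(mid[0]);
    close(mid[1]);
    close(out[1]);
    if (p1 < 0 || p2 < 0) {
        close(out[0]);
        return -1;
    }
    return out[0];
}
```

With the commands from the question this would be called with left = {"/usr/bin/lshw", "-C", "CPU", NULL} and right = {"grep", "bus info", NULL}; note the terminating NULL in each array, which execv-style functions require.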
NOTE
fork() returns the pid of the child process to the parent, and 0 to the child process itself. I don't understand why you store the value returned by fork() in a variable named parent: if it is nonzero (and non-negative) it is the pid of a child process. You need two such pids, as you need two processes. The example I posted creates three processes (you ask a subshell to build the pipeline for you, and the subshell creates two more processes to execute your commands). If you had to build all this machinery yourself, you'd also have to wait(2) for the children to finish (pclose(3) does this for you).
I have a little program that spawns a process (only one) repeatedly, overprinting its output in the same place. I use it as a kind of htop when I want to watch e.g. the output of ls -l (showing a file growing as it is being filled) or the output of the df command. It starts the program, makes one fork, redirects the child's output to a pipe, and reads that output to count the number of lines printed (so it can emit an escape sequence to put the cursor back at the top of the listing, and a clear-to-end-of-line after each output line so shorter lines don't get blurred by longer ones). It shows how to deal with the fork and exec system calls, and you can use it as an example of how to do things the brave way. But I think popen(3) is the solution to your problem. If you want to have a look at my cont program, just find it here.
Like many others, I'm trying to simulate a shell. I've managed to use execvp properly on a string coming from the user. The string is parsed and an array of strings is generated (split on the space character, one element per word), including a NULL at the very end.
When I find that the last word entered by the user is &, I set a flag up to notify my shell that the command is to be executed in the background while letting the user input another command right away. The "background-executed" command sees its & replaced by a NULL character within the array of strings passed to execvp.
As it is, I've been trying to use a pthread to run the process in the background, but it's acting somewhat weird: the command passed to execvp through the thread's function requires me to press two times ENTER after sending the command.
Here is my simplified main function that is to simulate a shell:
int main (void) {
fprintf (stdout, "%% ");
bool running = true;
while(running) {
/* Ask for an instruction and parses it. */
char** args = query_and_split_input();
/* Executing the commands. */
if (args == NULL) { // error while reading input
running = false;
} else {
printf("shell processing new command\n");
int count = count_words(args);
split_line* line = form_split_line(args, count);
Expression* ast = parse_line(line, 0, line->size - 1);
if(line->thread_flag) {
pthread_t cmd_thr;
/* Setting up the content of the thread. */
thread_data_t thr_data;
thr_data.ast = *ast;
thr_data.line = *line;
/* Executing the thread. */
int thr_err;
if ((thr_err = pthread_create(&cmd_thr, NULL, thr_func, &thr_data))) {
fprintf(stderr, "error: pthread_create, rc: %d\n", thr_err);
return EXIT_FAILURE;
}
printf("thread has been created.\n");
} else {
run_shell(args);
}
free(line);
printf("done running shell on one command\n");
}
}
/* We're all done here. See you! */
printf("Bye!\n");
exit (0);
}
Here is my thread's function:
void *thr_func(void *arg) {
thread_data_t *data = (thread_data_t *)arg;
data->line.content[data->line.size-1] = NULL; // to replace the trailing '&' from the command
run_shell(data->line.content);
printf("thread should have ran the command\n");
pthread_exit(NULL);
}
And the actual line that runs a command:
void run_shell(char** args) {
/* Forking. */
int status;
pid_t pid; /* Right here, the created THREAD somehow awaits a second 'ENTER' before going on and executing the next instruction that forks the process. This is the subject of my first question. */
pid = fork();
if (pid < 0) {
fprintf(stderr, "fork failed");
} else if (pid == 0) { // child
printf("Child executing the command.\n");
/* Executing the commands. */
execvp(args[0], args);
/* Child process failed. */
printf("execvp didn't finish properly: running exit on child process\n");
exit(-1);
} else { // back in parent
waitpid(-1, &status, 0); // wait for child to finish
if (WIFEXITED(status)) { printf("OK: Child exited with exit status %d.\n", WEXITSTATUS(status)); }
else { printf("ERROR: Child has not terminated correctly. Status is: %d\n", status); }
free(args);
printf("Terminating parent of the child.\n");
}
}
So basically, as an example, what run_shell(args) receives is either ["echo","bob","is","great",NULL] (in the case of a sequential execution) or ["echo","bob","is","great",NULL,NULL] (in the case of a command to be executed in the background).
I've left the printf traces since it might help you understand the execution flow.
If I input echo bob is great, the output (printf traces) is:
shell processing new command
Child executing the command.
bob is great
OK: Child exited with exit status 0.
Terminating parent of the child.
done running shell on one command
However, if I input echo bob is great &, the output is:
shell processing new command
thread has been created.
done running shell on one command
And then I actually need to press ENTER again to obtain the following output:
Child executing the command.
bob is great
OK: Child exited with exit status 0.
Terminating parent of the child.
thread should have ran the command
(On that last execution, I also get traces of my function that queries and parses the input of the user, but that seemed irrelevant so I abstracted this whole part.)
So my questions are:
How come the created thread waits for a second ENTER before running execvp? (thr_func stops executing run_shell and waits for the second ENTER right before the pid = fork(); instruction)
Do I have the right approach to solve the problem at hand? (Trying to execute a shell command in the background.)
You cannot use a thread to simulate a process. Well, strictly you can, but there's no point in doing so. The problem is that all the threads belonging to a process share the same virtual address space. There's no reason to create a thread, because you still need to fork() to create a new process (for the reasons explained below), so why create two threads of execution if one of them will spend all its time stopped, waiting for the subprocess to finish? Such a scheme serves no purpose.
The need of a fork() system call comes historically to make a simple call to create a new process (with different virtual memory map) to allow for a new program to be able to be executed. You need to create a new, complete process before calling exec(2) system call, because the process address space will be overwritten by the text and data segments of the new program. If you do this in a thread, you'll be overwriting the whole process address space (this is the shell) and killing all the threads you can have running on behalf of that process. The schema to follow is (pseudocode):
/* create pipes for redirection here, before fork()ing, so they are available
* in the parent process and the child process */
int fds[2];
if (pipe(fds) < 0) { /* error */
... /* do error treatment */
}
pid_t child_pid = fork();
switch(child_pid) {
case -1: /* fork failed for some reason, no subprocess created */
...
break;
case 0: /* this code is executed in the child process, do redirections
* here on pipes acquired ***before*** the fork() call */
if (dup2(fds[0 /* or 1 */], 0 /* or 1, or 2... */) < 0) { /* error: note the order, dup2(oldfd, newfd) */
... /* do error management, considering you are in a different process now */
}
execvpe(argv[0], argv, envp); /* first argument is the program to run, not argc */
... /* do error management, as execvpe failed (exec* is non-returning if ok) */
break; /* or exit(2) or whatever */
default: /* we are the parent, use the return value to track the child */
save_child_pid(child_pid);
... /* close the unused file descriptors */
close(fds[1 /* or 0 */]);
... /* more bookkeeping */
/* next depends on if you have to wait for the child or not */
wait*(...); /* wait has several flavours */
} /* switch */
The exec and fork system calls are separate for two reasons:
you need to be able to do housekeeping between both calls to execute the actual redirections in the child before exec().
there was a time when unix was not multitasking or protected, and the exec call just replaced all the memory in the system with the new program to execute (including kernel code, to cope with the fact that an unprotected system could be corrupted by the executing program) This was common in old operating systems and I've seen it on systems like CP/M or TRS-DOS. The implementation in unix conserved almost all the semantics of exec() call and added with fork() the unavailable functionality only. This was good, as it allowed both, parent and child processes to do the necessary bookkeeping when the time for pipes came.
Only if you need a separate thread to communicate with each child does it make sense to use threads for this task. But remember that a thread shares the whole virtual address space with its parent (if we can speak of a parent/child relationship between threads), and if you do an exec call that address space is overwritten for the whole process (all of its threads).
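As a rough sketch of how a shell could run a command in the background without any thread: fork as usual, and simply skip the wait when the background flag is set. The names run_command and reap_background are hypothetical and are not part of the code in the question.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run args[0] in a child process. Waits for it unless
 * `background` is nonzero. Returns the child's pid, or -1 if fork failed. */
pid_t run_command(char *const args[], int background)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execvp(args[0], args);
        perror(args[0]);    /* reached only if execvp failed */
        _exit(127);
    }
    if (!background) {
        int status;
        /* Wait for this specific child, not -1, so that a finished
         * background job is not mistaken for the foreground command. */
        while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
            ;
    }
    return pid;
}

/* Call before printing each prompt: collect finished background children
 * without blocking, so they do not linger as zombies. */
void reap_background(void)
{
    int status;
    pid_t pid;
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
        printf("[%d] done\n", (int)pid);
}
```

The main loop would then call run_command(args, line->thread_flag) after replacing the trailing "&" with NULL, and reap_background() once per iteration.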
I have a multi-threaded application that can be reached over telnet or ssh. In my application, I restart one of the init scripts using the custom system() call below. It seems that the child process stays active: if I log out of the telnet session, the session hangs, i.e. it cannot log out. This happens only when I restart the script using this system call. Is there something wrong with my system() function?
int system(const char *command)
{
int wait_val, pid;
struct sigaction sa, save_quit, save_int;
sigset_t save_mask;
syslog(LOG_ERR,"SJ.. calling this system function\r\n");
if (command == 0)
return 1;
memset(&sa, 0, sizeof(sa));
sa.sa_handler = SIG_IGN;
/* __sigemptyset(&sa.sa_mask); - done by memset() */
/* sa.sa_flags = 0; - done by memset() */
sigaction(SIGQUIT, &sa, &save_quit);
sigaction(SIGINT, &sa, &save_int);
__sigaddset(&sa.sa_mask, SIGCHLD);
sigprocmask(SIG_BLOCK, &sa.sa_mask, &save_mask);
if ((pid = vfork()) < 0) {
perror("vfork fails: ");
wait_val = -1;
goto out;
}
if (pid == 0) {
sigaction(SIGQUIT, &save_quit, NULL);
sigaction(SIGINT, &save_int, NULL);
sigprocmask(SIG_SETMASK, &save_mask, NULL);
struct sched_param param;
param.sched_priority = 0;
sched_setscheduler(0, SCHED_OTHER, &param);
setpriority(PRIO_PROCESS, 0, 5);
execl("/bin/sh", "sh", "-c", command, (char *) 0);
_exit(127);
}
#if 0
__printf("Waiting for child %d\n", pid);
#endif
if (wait4(pid, &wait_val, 0, 0) == -1)
wait_val = -1;
out:
sigaction(SIGQUIT, &save_quit, NULL);
sigaction(SIGINT, &save_int, NULL);
sigprocmask(SIG_SETMASK, &save_mask, NULL);
return wait_val;
}
Any ideas on how to debug whether this system call is hanging or not?
I realized this happens because file descriptors are inherited across fork().
My custom system() is nothing but fork() and exec(), and there are plenty of sockets in my application, so these socket file descriptors are inherited by the child process.
My assumption here is that "Child process can't exit because it is waiting for parent process to close the file descriptors or those file descriptors are in a state where it can be closed". Not sure what those states are though.
So, here is the interesting link I found -
Call system() inside forked (child) process, when parent process has many threads, sockets and IPC
Solution -
linux fork: prevent file descriptors inheritance
I'm not sure I can do this in a big application where sockets are opened in thousands of places. So, here is what I did.
My Solution -
I created a separate process/daemon that listens for the command from the parent application. This communication is based on socket. Since, it is a separate application/daemon it doesn't affect the main application which is running multiple threads and has a lot of opened sockets. This worked for me.
I believe that this problem will be fixed once I do -
fcntl(fd, F_SETFD, FD_CLOEXEC);
Any comments are welcome here.
Is this a fundamental problem in Linux, C i.e.
all file descriptors are inherited by default?
Why linux/kernel allow this? What advantage do we get out of it?
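A small sketch of the fcntl() approach mentioned above, wrapped in a hypothetical helper named set_cloexec. It reads the current descriptor flags first so that any other flag bits are preserved. On Linux the flag can also be requested at creation time (O_CLOEXEC for open(), SOCK_CLOEXEC for socket()), which avoids a window between creating the descriptor and marking it in a multi-threaded program.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: mark fd close-on-exec while preserving any other
 * descriptor flags. Returns 0 on success, -1 on error (errno is set). */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

A descriptor marked this way is still inherited by the child at fork(), but it is closed automatically when the child calls an exec function, so the program started by the custom system() never sees the application's sockets.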
To implement a shell command interpreter, I am trying to execute pipelines.
To do it, I use a recursive function in which I call the pipe function and do some redirections with dup2.
Here is my code :
void test_recurs(pid_t pid, char **ae)
{
char *const arg[2] = {"/bin/ls", NULL};
char *const arg2[3] = {"/bin/wc", NULL};
static int limit = 0;
int check;
int fd[2];
if (limit > 5)
return ;
if (pipe(fd) == -1)
{
printf("pipe failed\n");
return ;
}
pid = fork();
if(pid != 0)
{
printf("père %d\n",getpid());
close(fd[0]);
dup2(fd[1], 1);
close(fd[1]);
if ((execve("/bin/ls", arg, ae)) == -1)
exit(125);
dprintf(2, "execution ls\n");
wait(&check);
}
else
{
printf("fils %d\n", getpid());
close(fd[1]);
dup2(fd[0], 0);
close(fd[0]);
if ((execve("/bin/wc", arg2, ae)) == -1)
printf("echec execve\n");;
dprintf(2, "limit[%d]\n", limit);
limit++;
test_recurs(pid, ae);
}
}
The problem is it only execute "ls | wc" one time and then wait on the standard input. I know that the problem may come from the pipes (and the redirections).
It's a bit unclear how you are trying to use the function you present, but here are some notable points about it:
It's poor form to rely on a static variable to limit recursion depth because it's not thread-safe and because you need to do extra work to manage it (for example, to ensure that any changes are backed out when the function returns). Use a function parameter instead.
As has been observed in comments, the exec-family functions return only on failure. Although you acknowledge that, I'm not sure you appreciate the consequences, for both branches of your fork contain code that will never be executed as a result. The recursive call in particular is dead and will never be executed.
Moreover, the process in which the function is called performs an execve() call itself. The reason that function does not return is that it replaces the process image with that of the new process. That means that function test_recurs() also does not return.
Just as shell ordinarily must fork / exec to launch a single external command, it ordinarily must fork / exec for each command in a pipeline. If it fails to do so then afterward it is no longer running -- whatever it exec'ed without forking runs instead.
The problem is it only execute "ls | wc" one time and then wait on the standard input.
Certainly it does not recurse, because the recursive call is in a section of dead code. I suspect you are mistaken in your claim that it afterward waits on standard input, because the process that calls that function execs /bin/ls, which does not read from standard input. When the ls exits, however, leaving you with neither shell nor ls, what you then see might seem to be a wait on stdin.
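To illustrate the fork-per-command structure described above, here is a minimal sketch that uses a loop instead of recursion and never execs in the calling process. The helper name run_pipeline is hypothetical. Error paths are simplified (an early return may leak descriptors), and the final wait() loop assumes the caller has no other children.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run cmds[0] | cmds[1] | ... | cmds[n-1], each in its
 * own child process, and wait for all of them. Returns the exit status of
 * the last command, or -1 on error. */
int run_pipeline(char *const *cmds[], int n)
{
    int in = STDIN_FILENO;      /* read end feeding the next command */
    pid_t last = -1;

    for (int i = 0; i < n; i++) {
        int fd[2] = {-1, -1};
        if (i < n - 1 && pipe(fd) < 0)
            return -1;

        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            /* Child: wire up stdin/stdout, then become the command. */
            if (in != STDIN_FILENO) {
                dup2(in, STDIN_FILENO);
                close(in);
            }
            if (i < n - 1) {
                close(fd[0]);
                dup2(fd[1], STDOUT_FILENO);
                close(fd[1]);
            }
            execvp(cmds[i][0], cmds[i]);
            _exit(127);         /* reached only if execvp failed */
        }

        /* Parent: drop the ends it no longer needs and move on. */
        if (in != STDIN_FILENO)
            close(in);
        if (i < n - 1) {
            close(fd[1]);
            in = fd[0];
        }
        last = pid;
    }

    int status, result = -1;
    pid_t w;
    while ((w = wait(&status)) > 0)
        if (w == last && WIFEXITED(status))
            result = WEXITSTATUS(status);
    return result;
}
```

For "ls | wc" the caller would build char *const ls[] = {"ls", NULL}, char *const wc[] = {"wc", NULL} and char *const *cmds[] = {ls, wc}, then call run_pipeline(cmds, 2). Because every command runs in a child, the calling process survives and can run the pipeline again.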
I have this code;
pid_t process;
process = fork();
if (process < 0){
//fork error
perror("fork");
exit(EXIT_FAILURE);
}
if (process == 0){
//i try here the execl
execl ("process.c", "process" , n, NULL);
}
else {
wait(NULL);
}
I don't know if this use of fork() and exec() combined is correct. When I try to run the program from the bash I do not receive any result, so I thought it could be a problem in this part of code.
Thanks.
One problem is that
if (process = 0){
should read
if (process == 0){
Otherwise you're assigning zero to process and only calling execl if result is non-zero (i.e. never).
Also, you're trying to exec something called process.c. There's no doubt that one could have an executable called process.c. However, conventionally names ending in .c are given to C source code files. If process.c is indeed a C file, you need to compile and link it first.
Once you've built the executable, you need to either place it somewhere on $PATH or specify its full path to execl(). In many Unix environments placing it in the current directory won't be enough.
Finally, it's unclear what n is in the execl() call, but the name hints at a numeric variable. You need to make sure that it's a string and not, for example, an integer.
Well as per the answers and comments above your code should look somewhat like this
pid_t process;
char N[MAX_DIGITS]; /* text form of n, built before vfork(); assumes n is an int */
snprintf(N, sizeof N, "%d", n); /* needs <stdio.h>; MAX_DIGITS must fit sign, digits and '\0' */
process = vfork(); /* if the child's sole purpose is to call an exec-family function, vfork() can be used */
if (process < 0)
{
/* vfork error */
perror("vfork");
exit(EXIT_FAILURE);
}
if (process == 0)
{
/* child: after vfork() only an exec function or _exit() may be called */
execl("process", "process", N, (char *)NULL); /* "process" is the executable, N is your original argument */
_exit(127); /* reached only if execl failed */
}
else
{
wait(NULL);
}
Notice the use of vfork() instead of fork(): it can be more efficient when the child immediately calls an exec function. The reason can be found here