I have a shell script that does something like the following:
File: example.sh
#!/bin/sh
#some other code
echo "someconfig">config_file
I expect config_file to contain only someconfig, but something strange happens: the file has a single 'c' on its first line. I found no printf("c") in the parent process that executes example.sh.
My process calls the following C function on Linux to execute the script:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/wait.h>
#include <sys/types.h>
int execute_shell(const char *shell)
{
pid_t pid;
int iRet = 0;
if((pid = fork()) < 0)
{
return -1;
}
else if (pid == 0)
{
execlp("sh", "sh", "-c", shell, (char*)0);
exit(127);
}
else
{
while (waitpid(pid, &iRet, 0) < 0) {
if (errno != EINTR) {
iRet = -1;
break;
}
}
}
if(WIFEXITED(iRet) == 0)
{
return -1;
}
if(WEXITSTATUS(iRet) != 0)
{
return -1;
}
return 0;
}
int main()
{
char shell_cmd[1024]="./example.sh";
if( execute_shell(shell_cmd) == -1 )
{
// handle error
}
/* other code below; it may write to stdout */
return 0;
}
Sometimes the config file looks strange and does not contain what the shell script echoes.
I used strace to analyze what happens:
strace -f ./fork
[pid 12235] open("config_file", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
[pid 12235] fcntl(1, F_GETFD) = 0
[pid 12235] fcntl(1, F_DUPFD, 10) = 10
[pid 12235] fcntl(1, F_GETFD) = 0
[pid 12235] fcntl(10, F_SETFD, FD_CLOEXEC) = 0
[pid 12235] dup2(3, 1) = 1
[pid 12235] close(3) = 0
[pid 12235] fstat(1, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
[pid 12235] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbabb409000
[pid 12235] write(1, "someconfig\n", 11) = 11
[pid 12235] dup2(10, 1) = 1
[pid 12235] fcntl(10, F_GETFD) = 0x1 (flags FD_CLOEXEC)
[pid 12235] close(10) = 0
[pid 12235] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid 12235] read(255, "", 52) = 0
[pid 12235] exit_group(0) = ?
[pid 12235] +++ exited with 0 +++
<... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 12235
I do not understand what the notation function = num means in this output.
I would appreciate it if someone could explain the strace output.
I suspect that the parent and the child both write to stdout, which leads to the strange content in the config file.
In my project we just use this C code on Linux to execute a shell script that echoes someconfig to config_file. It ran normally for 6 months, but one day the config file looked strange (on two machines, with the same error: the first line contained a 'c' instead of what was echoed).
I just want to discuss whether there is any way this can happen, so that I have a direction for fixing the problem.
After analyzing the strace output, I see that the child process performs some fd operations that make sure the child and the parent write to different fds. So I think there is no way for this to corrupt the config file.
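For reading the trace: in strace output the value after the = sign is the return value of the system call, so open(...) = 3 means the file was opened as descriptor 3, and write(1, ..., 11) = 11 means all 11 bytes were written. The descriptor sequence the shell performs for the redirection can be sketched in C as below. This is only an illustration of the traced calls; the helper name write_via_redirect is made up for this sketch and is not part of any shell.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write text to path by temporarily pointing fd 1 at
 * the file, mirroring the open / F_DUPFD / dup2 / write / dup2 / close
 * sequence seen in the trace. Returns 0 on success, -1 on error. */
int write_via_redirect(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1;

    /* Save the current stdout on a descriptor >= 10 and mark the copy
     * close-on-exec, as the shell does. */
    int saved = fcntl(STDOUT_FILENO, F_DUPFD, 10);
    if (saved < 0) {
        close(fd);
        return -1;
    }
    fcntl(saved, F_SETFD, FD_CLOEXEC);

    int rc = 0;
    if (dup2(fd, STDOUT_FILENO) < 0)   /* fd 1 now refers to the file */
        rc = -1;
    close(fd);

    if (rc == 0) {
        size_t len = strlen(text);
        if (write(STDOUT_FILENO, text, len) != (ssize_t)len)
            rc = -1;
    }

    /* Restore the original stdout and drop the saved copy. */
    if (dup2(saved, STDOUT_FILENO) < 0)
        rc = -1;
    close(saved);
    return rc;
}
```

Calling write_via_redirect("config_file", "someconfig\n") leaves the caller's stdout untouched afterwards, which is why the traced write(1, ...) lands in the file and not on the terminal.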
the following proposed code:
cleanly compiles
performs the desired functionality
properly handles any failures seen in the sub function
properly calls the function: execlp()
Here is what I fixed:
the call (and results) of execlp()
the handling of the call to fork() when it fails
the call to waitpid() when it returns an error.
added include statements for: <sys/types.h> and <sys/wait.h> for the function: waitpid()
replaced any calls to _exit() with exit() since most beginning programmers have no idea about the result of using _exit() rather than exit()
added place to handle the returned value from the call to: execute_shell()
and now, the proposed code:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
int execute_shell(const char *shell)
{
pid_t pid;
pid = fork();
if( pid < 0 )
{
perror( "fork failed" );
return -1;
}
else if (pid == 0)
{ // child process
execlp("sh", "sh", "-c", shell, (char*)0);
perror( "execlp failed" );
exit( 127 ); // execlp() failed: terminate the child instead of returning into the caller's code
}
else
{ // parent process
int iRet;
while( waitpid(pid, &iRet, 0) == -1 )
{
if (errno != EINTR)
{ // a real failure, not just an interrupted call
perror( "waitpid failed" );
return -3;
}
}
if(WIFEXITED(iRet) == 0)
{
return -5;
}
if(WEXITSTATUS(iRet) != 0)
{
return -6;
}
}
return 0;
}
int main( void )
{
char shell_cmd[1024]="./example.sh";
if( execute_shell(shell_cmd) )
{
// handle error
}
/* other code below; it may write to stdout */
return 0;
}
I created a file: example.sh that contained:
#!/bin/sh
#some other code
echo "someconfig" > config_file
then ran the command:
chmod 777 example.sh
to make it executable, then ran the program I posted.
The result is nothing displayed on the terminal, however, now there is a new file: config_file that contains:
someconfig
I ran the code I posted several times and the contents of config_file did not change, i.e. no stray character 'c'.
What have you not told us?
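One place where exit() and _exit() differ, and which touches the question's suspicion about parent and child both writing, is stdio buffering: exit() flushes buffers the child inherited from the parent, while _exit() does not. The sketch below is an illustration only, not a diagnosis of the reported 'c'. The function name bytes_after_fork is made up, and a scratch file from tmpfile() stands in for the parent's buffered output.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the parent buffers one character, forks, and the child
 * ends with exit() (child_uses_exit != 0) or _exit(). Returns the number of
 * bytes that end up in the scratch file, or -1 on error. */
long bytes_after_fork(int child_uses_exit)
{
    FILE *fp = tmpfile();            /* fully buffered stream */
    if (fp == NULL)
        return -1;

    fputc('c', fp);                  /* stays in the stdio buffer for now */

    pid_t pid = fork();
    if (pid < 0) {
        fclose(fp);
        return -1;
    }
    if (pid == 0) {
        if (child_uses_exit)
            exit(0);                 /* flushes the inherited buffer too */
        _exit(0);                    /* ends without touching stdio */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            fclose(fp);
            return -1;
        }
    }

    fflush(fp);                      /* the parent writes its own copy */

    struct stat st;
    long size = (fstat(fileno(fp), &st) == 0) ? (long)st.st_size : -1;
    fclose(fp);
    return size;
}
```

On a typical glibc system the character is written twice when the child calls exit() and once when it calls _exit(), so unflushed output in the parent at fork() time is one thing worth checking.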
// This is my code for reading a file from command line arguments and storing it in another file.//
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h> //for system calls such as dup(),pipe(),etc...
#include <sys/wait.h>
#define COUNT_PROGRAM "b"
#define CONVERT_PROGRAM "c"
int main(int argc, char *argv[])
{
if (argc != 3)
{
fprintf(stderr,"%s","ERROR : argument count not satisfied!\n");
exit(1);
}
/* It is important to check all system calls (open, creat, dup, etc.) for a return value < 0,
particularly -1 because such a return value means an error has occurred. */
int fd_in = open(argv[1], O_RDONLY);
if (fd_in < 0)
{
fprintf(stderr,"%s", "ERROR : file to be read does not exist!\n");
exit(1);
}
int fd_out = creat(argv[2], 0644); /* mode = permissions, here rw-r--r-- */
if (fd_out < 0)
{
fprintf(stderr,"%s", "ERROR : file could not be created!\n");
exit(1);
}
if(dup(fd_in) < 0)//dup fd_in to 3
fprintf(stderr , "ERROR assigning STREAM 3 to fd_in");
if(dup(fd_out) < 0)//dup fd_in to 4
fprintf(stderr , "ERROR assigning STREAM 4 to fd_out");
//dup fd_in to 0
close(0);
dup(fd_in);
close(3);
//dup fd_out to 1
close(1);
dup(fd_out);
close(4);
int fd[2];
pipe(fd);
pid_t pid_child_1, pid_child_2;
int status;
if ((pid_child_1 = fork()) != 0)
{
if ((pid_child_2 = fork()) != 0)
{
close(fd[0]);
close(fd[1]);
wait(&status);
wait(&status);
// fprintf(stderr , "\nstatus of child_1 = %d",wait(&status));
// fprintf(stderr , "\nstatus of child_2 = %d",wait(&status));
}
else
{
// close(fd[1]);
// dup(1);
dup2(fd[1],1);
close(fd[0]);
execl( CONVERT_PROGRAM, CONVERT_PROGRAM, (char*) NULL);
}
}
else
{
// close(fd[0]);
// dup(0);
dup2(fd[0],0);
close(fd[1]);
execl( COUNT_PROGRAM , COUNT_PROGRAM ,(char*) NULL);
}
}
After compiling and running it, the text file that should contain the output is empty.
THOSE PROGRAMS ARE WORKING WELL ALONE.
Here is the strace output from a run. It prints 7 on the screen and writes nothing to the text file.
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\35\2\0\0\0\0\0"..., 832) = 832
strace: Process 6018 attached
strace: Process 6019 attached
[pid 6019] read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\35\2\0\0\0\0\0"..., 832) = 832
[pid 6018] read(4, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\35\2\0\0\0\0\0"..., 832) = 832
[pid 6019] read(0, "hello my name is himanshu KAUSHI"..., 4096) = 35
[pid 6019] read(0, "", 4096) = 0
[pid 6019] write(1, "HELLO MY NAME IS HIMANSHU kaushi"..., 35) = 35
[pid 6019] +++ exited with 0 +++
[pid 6018] read(0, "HELLO MY NAME IS HIMANSHU kaushi"..., 4096) = 35
[pid 6017] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=6019, si_uid=1062, si_status=0, si_utime=0, si_stime=0} ---
[pid 6018] read(0, "", 4096) = 0
[pid 6018] write(2, "\n7", 2
7) = 2
[pid 6018] +++ exited with 0 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=6018, si_uid=1062, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++
I changed the standard descriptor in the first dup2 call, dup2(f_dsc[1],3), and then I got the output in the text file, but my first program stopped running.
#include<stdio.h>
#include<ctype.h>
int main()
{
int c; /* getchar() returns int, so EOF can be told apart from any byte */
int count=0;
while(1)
{
c=getchar();
if(c==EOF)
{
break;
}
if(!isalpha(c))
{
count++;
}
}
fprintf(stderr,"\n%d",count);
}
this is my simple b program.
#include<stdio.h>
#include<ctype.h>
int main()
{
int c; /* getchar() returns int, so EOF can be told apart from any byte */
int count=0;
while(1)
{
c=getchar();
if(c==EOF)
{
break;
}
if(islower(c))
{
c=toupper(c);
}
else
{
c=tolower(c);
}
putchar(c);
}
return 0;
}
and this is my simple c program.
When I try your program, it correctly executes equivalently to
c <a.txt | b >b.txt
So, let's consider what may be different in your setup. Although you write
/* It is important to check all system calls (open, creat, dup, etc.) for a return value < 0,
particularly -1, because such a return value means an error has occurred. */
you don't check the return value of the execl calls (or just put perror("…"); after them to see if they fail). Perhaps b or c is a script without a first line like
#!/bin/sh
You can get away without such a line when calling the script from a shell (I guess you mean that when you yell THOSE PROGRAMS ARE WORKING WELL ALONE), but not when using execl.
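A minimal sketch of that check, assuming a helper named run_checked (a made-up name) that runs one program without arguments: the child reports why execl() failed and exits with 127, so the parent can tell a failed exec apart from a program that ran.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at path with no arguments.
 * Returns its exit status, 127 if execl() failed in the child,
 * or -1 if fork/wait failed or the child did not exit normally. */
int run_checked(const char *path)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execl(path, path, (char *)NULL);
        /* Reached only when execl() fails, e.g. ENOENT for a missing file
         * or ENOEXEC for a script that lacks a #! line. */
        perror(path);
        _exit(127);
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the message from perror() on stderr, a script without a #! line shows up as "Exec format error" instead of failing silently.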
I wrote the following code. The goal is to make a thread terminate itself with a syscall (without calling pthread_exit()).
I create two threads, pick one based on its thread id, and have it send SIGKILL to itself with tkill().
The problem is that the whole process terminates, not only the selected thread.
Why is that? How can I make a thread terminate itself using only a syscall?
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/syscall.h>
void *myThreadFun(void *vargp)
{
int tid = syscall(SYS_gettid);
if(tid %2 ==0)
{
printf("kill thread id ! = %d \n",tid);
syscall(SYS_tkill,tid,9);
}
while(1)
{
printf("Thread ID: %d\n", tid);
sleep(2);
}
}
int main()
{
int i;
pthread_t tid;
printf("main thread id = %ld \n",syscall(SYS_gettid));
for (i = 0; i < 2; i++)
pthread_create(&tid, NULL, myThreadFun, NULL);
char tmp;
scanf("%c",&tmp);
return 0;
}
output:
main thread id = 11911
kill thread id ! = 11912
Killed
There is no "kill this thread only" signal disposition; there are only "ignore", "terminate the entire process", "terminate the entire process and dump a core file", "stop (pause) the entire process", and "continue the entire process if it is stopped (paused)". You cannot even choose the disposition per se; only between the default disposition for that particular signal, "ignore", or a userspace signal handler function.
There is really only one option: the SYS_exit syscall.
In C, you can do this via
#include <unistd.h>
#include <sys/syscall.h>
static inline void exit_thread(void) __attribute__((noreturn));
static inline void exit_thread(void)
{
syscall(SYS_exit, 0);
}
A real-world example always beats a snippet, I think. Consider the following program:
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <pthread.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
void *thread_function(void *payload)
{
#ifdef DO_EXIT
const int exit_code = (int)(intptr_t)payload;
fprintf(stderr, "Running thread_function(%p); calling syscall(SYS_exit, %d).\n", payload, exit_code);
fflush(stderr);
syscall(SYS_exit, exit_code);
#else
fprintf(stderr, "Running thread_function(%p); calling pthread_exit(%p).\n", payload, payload);
fflush(stderr);
pthread_exit(payload);
#endif
return NULL; /* Never reached */
}
int main(void)
{
void *const payload = thread_function; /* Just some random pointer value */
pthread_t thread_id;
void *thread_status;
int err;
printf("Calling pthread_create(&thread_id, NULL, thread_function, %p): ", payload);
fflush(stdout);
err = pthread_create(&thread_id, NULL, thread_function, payload);
if (err) {
printf("Failed: %s.\n", strerror(err)); /* pthread functions return the error number; they do not set errno */
return EXIT_FAILURE;
}
printf("Success.\n");
printf("Calling pthread_join(thread_id, &thread_status): ");
fflush(stdout);
err = pthread_join(thread_id, &thread_status);
if (err) {
printf("Failed: %s.\n", strerror(err)); /* pthread functions return the error number; they do not set errno */
return EXIT_FAILURE;
}
printf("Success; thread_status == %p.\n", thread_status);
return EXIT_SUCCESS;
}
Save it as example.c, then compile one version, ex1, that uses pthread_exit():
gcc -Wall -Wextra -O2 example.c -pthread -o ex1
and another, ex2, that uses syscall(SYS_exit,):
gcc -Wall -Wextra -DDO_EXIT -O2 example.c -pthread -o ex2
To see exactly what is happening, run each example under strace, logging every clone, write, futex, exit, and exit_group syscall the program makes:
strace -f -e clone,write,futex,exit,exit_group -o ex1.log ./ex1
strace -f -e clone,write,futex,exit,exit_group -o ex2.log ./ex2
On my machine, ex1.log looks like
28703 write(1, "Calling pthread_create(&thread_i"..., 75) = 75
28703 clone(child_stack=0x7f11f9c14fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f11f9c159d0, tls=0x7f11f9c15700, child_tidptr=0x7f11f9c159d0) = 28704
28703 write(1, "Success.\n", 9) = 9
28703 write(1, "Calling pthread_join(thread_id, "..., 49) = 49
28703 futex(0x7f11f9c159d0, FUTEX_WAIT, 28704, NULL <unfinished ...>
28704 write(2, "Running thread_function(0x563f0c"..., 79) = 79
28704 futex(0x7f11f94141a0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
28704 exit(0) = ?
28703 <... futex resumed> ) = 0
28704 +++ exited with 0 +++
28703 write(1, "Success; thread_status == 0x563f"..., 42) = 42
28703 exit_group(0) = ?
28703 +++ exited with 0 +++
and ex2.log looks like
28707 write(1, "Calling pthread_create(&thread_i"..., 75) = 75
28707 clone(child_stack=0x7f42c7db8fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f42c7db99d0, tls=0x7f42c7db9700, child_tidptr=0x7f42c7db99d0) = 28708
28707 write(1, "Success.\n", 9) = 9
28708 write(2, "Running thread_function(0x556b57"..., 80 <unfinished ...>
28707 write(1, "Calling pthread_join(thread_id, "..., 49 <unfinished ...>
28708 <... write resumed> ) = 80
28707 <... write resumed> ) = 49
28708 exit(1472596720 <unfinished ...>
28707 futex(0x7f42c7db99d0, FUTEX_WAIT, 28708, NULL <unfinished ...>
28708 <... exit resumed>) = ?
28707 <... futex resumed> ) = 0
28708 +++ exited with 240 +++
28707 write(1, "Success; thread_status == (nil)."..., 33) = 33
28707 exit_group(0) = ?
28707 +++ exited with 0 +++
From their differences, and the fact that this is GNU C library version 2.27 running on x86-64 architecture, we can make a couple of important observations:
The GNU pthreads implementation uses a futex to pass the thread function return value, when the thread exits and is reaped (using pthread_join()) by another thread.
Aside from the futex used to pass the return value, pthread_exit() calls the exit syscall with exit status 0.
(In fact, the man 3 pthread_exit man page says so explicitly.)
When the thread exits (using the exit syscall), the exit status code is irrelevant.
If our thread function uses the syscall directly, cleanup functions registered by atexit() and pthread_cleanup_push() will not be called, and if pthread_join() is called on this thread, the return value will essentially be (void *)0 (== NULL in Linux, printed as (nil) by the printf()/fprintf() %p format specifier).
select() on fds higher than 255 does not check whether the fd is open. Here is my example code:
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <sys/select.h>
int main()
{
fd_set set;
for(int i = 5;i<FD_SETSIZE;i++)
{
printf("--> i is %d\n", i);
FD_ZERO(&set);
FD_SET(i, &set);
close(i);
int retval = select(FD_SETSIZE, &set, NULL, NULL, NULL);
if(-1 == retval)
{
perror("select");
}
}
}
This results in:
--> i is 5
select: Bad file descriptor
...
--> i is 255
select: Bad file descriptor
--> i is 256
Then the application blocks.
Why does this not produce EBADF for fds from 256 up to FD_SETSIZE?
Requested Information from comments:
The result of prlimit is:
NOFILE max number of open files 1024 1048576
This is the result of strace ./test_select:
select(1024, [127], NULL, NULL, NULL) = -1 EBADF (Bad file descriptor)
dup(2) = 3
fcntl(3, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
write(3, "select: Bad file descriptor\n", 28select: Bad file descriptor
) = 28
close(3) = 0
write(1, "--> i is 128\n", 13--> i is 128
) = 13
close(128) = -1 EBADF (Bad file descriptor)
select(1024, [128], NULL, NULL, NULL
Debunking thoughts from the comments:
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <sys/select.h>
#include <fcntl.h>
int main()
{
char filename[80];
int fd;
for(int i = 5;i<500;i++)
{
snprintf(filename, 80, "/tmp/file%d", i);
fd = open(filename, O_RDWR | O_APPEND | O_CREAT, 0644); /* O_CREAT requires a mode argument */
}
printf("--> fd is %d, FD_SETSIZE is %d\n", fd, FD_SETSIZE);
fd_set set;
FD_ZERO(&set);
FD_SET(fd, &set);
int retval = select(FD_SETSIZE, NULL, &set, NULL, NULL);
if(-1 == retval)
{
perror("select");
}
}
Results in:
$ ./test_select
--> fd is 523, FD_SETSIZE is 1024
Process exits normally, no blocking.
Something very strange is going on here. You may have found a bug in the Linux kernel.
I modified your test program to make it more precise and also to not get stuck when it hits the problem:
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/time.h>
int main(void)
{
fd_set set;
struct timeval tv;
int i;
for(i = 5; i < FD_SETSIZE; i++)
{
FD_ZERO(&set);
FD_SET(i, &set);
tv.tv_sec = 0;
tv.tv_usec = 1000;
close(i);
int retval = select(FD_SETSIZE, &set, 0, 0, &tv);
if (retval == -1 && errno == EBADF)
;
else
{
if (retval > 0)
printf("fd %d: select returned success (%d)\n", i, retval);
else if (retval == 0)
printf("fd %d: select timed out\n", i);
else
printf("fd %d: select failed (%d; %s)\n", i, retval, strerror(errno));
return 1;
}
}
return 0;
}
My understanding of POSIX says that, whatever FD_SETSIZE is, this program should produce no output and exit successfully. And that is what it does on FreeBSD 11.1 and NetBSD 7.1 (both running on x86 processors of some description). But on Linux (x86-64, kernel 4.13), it prints
fd 256: select timed out
and exits unsuccessfully. Even stranger, if I run the same binary under strace, that changes the output:
$ strace -o /dev/null ./a.out
fd 64: select timed out
The same thing happens if I run it under gdb, even if I don't tell gdb to do anything other than just run the program.
Reading symbols from ./a.out...done.
(gdb) r
Starting program: /tmp/a.out
fd 64: select timed out
[Inferior 1 (process 8209) exited with code 01]
So something is changing just because the process is subject to ptrace monitoring. That can only be caused by the kernel.
I have filed a bug report on the Linux kernel and will report what they say about it.
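Until the kernel behaviour is clarified, a program that cannot rely on select() reporting EBADF could validate descriptors itself before the call. The sketch below is a possible workaround, not something taken from the bug report; the helper names fd_is_open and first_closed_fd are made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd is an open descriptor, 0 otherwise.
 * fcntl(F_GETFD) fails with EBADF exactly when fd is not open. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Hypothetical helper: returns the lowest descriptor below nfds that is
 * marked in set but not open, or -1 if every marked descriptor is open. */
int first_closed_fd(fd_set *set, int nfds)
{
    for (int fd = 0; fd < nfds && fd < FD_SETSIZE; fd++) {
        if (FD_ISSET(fd, set) && !fd_is_open(fd))
            return fd;
    }
    return -1;
}
```

Calling first_closed_fd(&set, FD_SETSIZE) before select() costs one fcntl() per marked descriptor, so it is better suited to debugging than to a hot loop.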
The following code can act as expected if executed by a shell.
But if I set this program as a user's shell and ssh into the host to execute this program as a shell, the read(0, &buf123, 1); will return an EIO(Input/output error):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <regex.h>
#include <curl/curl.h>
#include <readline/readline.h>
int main() {
char *shell = "/bin/bash";
pid_t child;
if ((child = fork()) < 0) {
perror("fork");
return 1;
}
if (child == 0) {
execl(shell, shell + 5, "-c", "exec /bin/bash --login", (char *)NULL);
perror("execl");
return 1;
}
wait(NULL);
char buf123[1024];
read(0, &buf123, 1);
printf("::%s::\n", buf123);
}
But if I change the execl() of bash into a non-interactive bash (bash -c "id"), or into some program other than bash, the read(0, &buf123, 1); succeeds.
So to reproduce this error, two conditions need to be met:
1. execl() an interactive bash (system() can also reproduce this error)
2. run as a user's shell over ssh
Could anyone help me figure out why and how to avoid this?
The following is the strace result:
wait4(-1, NULL, 0, NULL) = 2
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=2, si_status=0, si_utime=0, si_stime=0} ---
read(0, 0x7fff7a4c8cb0, 1) = -1 EIO (Input/output error)
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcd281f3000
write(1, "::Xv\37(\315\177::\n", 11) = 11
exit_group(11) = ?
+++ exited with 11 +++
Thanks in advance!
It happens because your sub-shell is an interactive login shell, so it takes control of the terminal (sets it as its session's controlling terminal). Your process is then disconnected from the terminal and cannot read from it anymore.
Of course, a non-interactive shell does not need control of the terminal, so it leaves the terminal as is for your process.
Read about POSIX terminals, sessions, and process groups.
I'm currently writing my own shell as a project for a class, and have virtually everything working. My problem is with my pipes: sometimes they work, and sometimes they just hang until I interrupt them. I've done research on this, and it seems that the process that is getting its stdin written to isn't receiving an EOF from the first process. As I've learned, the usual cause is that the pipe isn't being closed, but this isn't the case (to my knowledge) with my code.
All redirection works and any variation thereof:
ls -l > file1
wc < file1 > file2
The following piped commands work:
w | head -n 4
w | head -n 4 > file1
This doesn't work: ls | grep file1. It shows the correct output but never ends unless the user sends it an interrupt signal. ls | grep file1 > file2 does not work either: it hangs without showing output and creates file2, but never writes to it.
Anyway, I hope there's something I'm missing that someone else can notice; I've been at this for a while. Let me know if there's any more code I can provide. The code I've posted below is the main file, nothing removed.
/*
* This code implements a simple shell program
* At this time it supports just simple commands with
* any number of args.
*/
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <errno.h>
#include <signal.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <string.h>
#include "input.h"
#include "myShell.h"
#include "BackgroundStack.h"
/*
* The main shell function
*/
main() {
char *buff[20];
char *inputString;
BackgroundStack *bgStack = malloc(sizeof(BackgroundStack));
initBgStack(bgStack);
struct sigaction new_act;
new_act.sa_handler = sigIntHandler;
sigemptyset ( &new_act.sa_mask );
new_act.sa_flags = SA_RESTART;
sigaction(SIGINT, &new_act, NULL);
// Loop forever
while(1) {
const char *chPath;
doneBgProcesses(bgStack);
// Print out the prompt and get the input
printPrompt();
inputString = get_my_args(buff);
if (buff[0] == NULL) continue;
if (buff[0][0] == '#') continue;
switch (getBuiltInCommand(buff[0])) {
case EXIT:
exit(0);
break;
case CD:
chPath = (buff[1]==NULL) ? getenv("HOME") : buff[1];
if (chdir(chPath) < 0) {
perror(": cd");
}
break;
default:
do_command(buff, bgStack);
}
//free up the malloced memory
free(inputString);
}// end of while(1)
}
static void sigIntHandler (int signum) {}
/*
* Do the command
*/
int do_command(char **args, BackgroundStack *bgStack) {
int status, statusb;
pid_t child_id, childb_id;
char **argsb;
int pipes[2];
int isBgd = isBackgrounded(args);
int hasPipe = hasAPipe(args);
if (isBgd) removeBackgroundCommand(args);
if (hasPipe) {
int cmdBi = getSecondCommandIndex(args);
args[cmdBi-1] = NULL;
argsb = &args[cmdBi];
pipe(pipes);
}
// Fork the child and check for errors in fork()
if((child_id = fork()) == -1) {
switch(errno) {
case EAGAIN:
perror("Error EAGAIN: ");
return;
case ENOMEM:
perror("Error ENOMEM: ");
return;
}
}
if (hasPipe && child_id != 0) {
childb_id = fork();
if(childb_id == -1) {
switch(errno) {
case EAGAIN:
perror("Error EAGAIN: ");
return;
case ENOMEM:
perror("Error ENOMEM: ");
return;
}
}
}
if(child_id == 0 || (childb_id == 0 && hasPipe)) {
if (child_id != 0 && hasPipe) args = argsb;
if (child_id == 0 && isBgd) {
struct sigaction new_act;
new_act.sa_handler = SIG_IGN;
sigaction(SIGINT, &new_act, 0);
}
if (child_id == 0 && hasPipe) {
if (dup2(pipes[1], 1) != 1) fatalPerror(": Pipe Redirection Output Error");
close(pipes[0]);
close(pipes[1]);
}
if (child_id != 0 && hasPipe) {
if (dup2(pipes[0], 0) != 0) fatalPerror(": Pipe Redirection Input Error");
close(pipes[0]);
close(pipes[1]);
waitpid(child_id, NULL, 0);
}
if ((child_id != 0 && hasPipe) || !hasPipe) {
if (hasAReOut(args)) {
char outFile[100];
getOutFile(args, outFile);
int reOutFile = open(outFile, O_RDWR|O_CREAT|O_TRUNC, S_IREAD|S_IWRITE);
if (reOutFile<0) fatalPerror(": Redirection Output Error");
if (dup2(reOutFile,1) != 1) fatalPerror(": Redirection Output Error");
close(reOutFile);
}
}
if ( (child_id == 0 && hasPipe) || !hasPipe) {
if (hasAReIn(args)) {
char inFle[100];
getInFile(args, inFle);
int reInFile = open(inFle, O_RDWR);
if (reInFile<0) fatalPerror(": Redirection Input Error");
if (dup2(reInFile,0) != 0) fatalPerror(": Redirection Input Error");
close(reInFile);
} else if (isBgd && !hasPipe) {
int bgReInFile = open("/dev/null", O_RDONLY);
if (bgReInFile<0) fatalPerror(": /dev/null Redirection Input Error");
if (dup2(bgReInFile,0) != 0) fatalPerror(": /dev/null Redirection Input Error");
close(bgReInFile);
}
}
// Execute the command
execvp(args[0], args);
perror(args[0]);
exit(-1);
}
// Wait for the child process to complete, if necessary
if (!isBgd) waitpid(child_id, &status, 0);
else if (!hasPipe) {
printf("Child %ld started\n", (long)child_id);
BackgroundProcess *bgPrs = malloc(sizeof(BackgroundProcess));
bgPrs->pid = child_id;
bgPrs->exitStatus = -1;
addProcessToBgStack(bgStack, bgPrs);
}
if (hasPipe) waitpid(childb_id, &statusb, 0);
if ( WIFSIGNALED(status) && !isBgd ) printf("Child %ld terminated due to signal %d\n", (long)child_id, WTERMSIG(status) );
if ( hasPipe && WIFSIGNALED(statusb) ) printf("Child %ld terminated due to signal %d\n", (long)childb_id, WTERMSIG(statusb) );
} // end of do_command
The second child should not wait for the first child to exit - it should just start running straight away (it will block until some output is produced on the pipe by the first child), so remove that waitpid() executed by childb. Instead, the parent process should wait for both child processes (or perhaps just the second one). (Indeed, as noted by JeremyP, this waitpid() call is failing anyway, since childb is not the parent of child).
Your problem, though, is that the parent process is maintaining open file descriptors to the pipe. Right before the comment // Wait for the child process to complete, if necessary, the parent process should close its pipe file descriptors:
close(pipes[0]);
close(pipes[1]);
The open file descriptor in the parent means that the child grep process never sees EOF, so it doesn't exit.
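Stripped of redirection and background handling, the structure described above can be sketched as follows. This is a minimal illustration and not a drop-in replacement for do_command(); the names run_pipeline and wait_for are made up for the sketch.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for pid, retrying on EINTR; returns its exit status or -1. */
static int wait_for(pid_t pid)
{
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical helper: run "left | right" and return the exit status of
 * right, or -1 on error. Both argument vectors must end with NULL. */
int run_pipeline(char *const left[], char *const right[])
{
    int pipes[2];
    if (pipe(pipes) < 0)
        return -1;

    pid_t a = fork();
    if (a < 0) {
        close(pipes[0]);
        close(pipes[1]);
        return -1;
    }
    if (a == 0) {                    /* first child: writes into the pipe */
        dup2(pipes[1], STDOUT_FILENO);
        close(pipes[0]);
        close(pipes[1]);
        execvp(left[0], left);
        _exit(127);
    }

    pid_t b = fork();
    if (b < 0) {
        close(pipes[0]);
        close(pipes[1]);
        wait_for(a);
        return -1;
    }
    if (b == 0) {                    /* second child: reads from the pipe */
        dup2(pipes[0], STDIN_FILENO);
        close(pipes[0]);
        close(pipes[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The key step: while the parent holds the write end open, the reader
     * never sees EOF. */
    close(pipes[0]);
    close(pipes[1]);

    wait_for(a);                     /* the parent waits for both children */
    return wait_for(b);
}
```

Note that the second child starts immediately and never waits for the first; only the parent waits, and only after it has closed both ends of the pipe.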
I don't know the answer but I have spotted one issue.
You'll agree that the condition for
if(child_id == 0 || (childb_id == 0 && hasPipe))
is true only for the two child processes, but inside the if statement block you have this:
if (child_id != 0 && hasPipe) {
if (dup2(pipes[0], 0) != 0) fatalPerror(": Pipe Redirection Input Error");
close(pipes[0]);
close(pipes[1]);
waitpid(child_id, NULL, 0);
}
The waitpid() call is incorrect because it is called from the second child to wait for the first child. It's probably failing with ECHILD because the first child is not a child of the second child.
As for your real problem, I suspect it has to do with the fact that the grep command will not terminate until its input is closed. There might be some deadlock condition going on that stops that from happening. You need to run this in a debugger or put some logging in to see where the parent process is hanging.
Edit
caf's answer tells us everything.
I was assuming that the input to grep was being closed because ls will close its output when it terminates, but of course, the parent process also has grep's input file descriptor open. The version using head works properly because head -n 4 terminates after four lines regardless of whether its input file descriptor is closed or not.