I have two problems: a program I wrote is glitching and creating nearly unkillable processes. If either problem is solved, I believe the other will be easy to resolve. I am running an early 2008 MacBook on OS X 10.6.8.
/Problem #1, coding:/
I've been playing around with the iRobot Create using termios.h I/O. I compile without warnings or errors and can run the program without a hitch. I connect to the robot over USB, which explains the argument "/dev/tty.usbserial".
gcc simple.c -o simple
./simple /dev/tty.usbserial
The program starts by checking the arguments given, then tries to connect to the given device (char *device is /dev/tty.usbserial) with the biscConnect() function. It fails on my Mac.
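//in file simple.c:
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <termios.h>
#include <unistd.h>
static int fd; // serial port descriptor, set by biscConnect()
void biscConnect(char *device);
int main(int argc, char *argv[]){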
if(argc!=2){
printf("Put port... like /dev/tty.usbserial or something\n");
exit(EXIT_FAILURE);
}
printf("Starting...\n");
biscConnect(argv[1]);
}
void biscConnect(char *device){
struct termios tty;
// Try to open the device from the input
if((fd = open(device, O_RDWR | O_NOCTTY | O_NONBLOCK))==-1){
fprintf(stderr, "Serial port at %s could not be opened: %s\n", device, strerror(errno));
exit(EXIT_FAILURE);
}
tcflush (fd, TCIOFLUSH);
tcgetattr(fd, &tty);
tty.c_iflag = IGNBRK | IGNPAR;
tty.c_lflag = 0;
tty.c_oflag = 0;
tty.c_cflag = CREAD | CS8 | CLOCAL;
cfsetispeed(&tty, B57600);
cfsetospeed(&tty, B57600);
tcsetattr(fd, TCSANOW, &tty);
//The code fails prior to this point
}
I would then send bytes to the robot to make it move if it didn't get stuck before then.
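The listing above ignores the return value of every termios call, so when the program stalls there is no indication of which step is responsible. A minimal diagnostic sketch, assuming a hypothetical helper serial_open() (not part of the original program), checks each call and prints a progress line after each step. It keeps the same flags and speed as biscConnect() and is only meant to locate the failing call, not to cure the hang.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper (an assumption, not from the original program):
 * open a serial device with the same settings as biscConnect(), checking
 * every call. Returns the descriptor, or -1 after reporting the error. */
int serial_open(const char *device, speed_t speed)
{
    struct termios tty;
    int fd = open(device, O_RDWR | O_NOCTTY | O_NONBLOCK);

    if (fd == -1) {
        fprintf(stderr, "open(%s): %s\n", device, strerror(errno));
        return -1;
    }
    fprintf(stderr, "step 1: opened %s\n", device);

    if (tcgetattr(fd, &tty) == -1) {
        fprintf(stderr, "tcgetattr: %s\n", strerror(errno));
        close(fd);
        return -1;
    }
    fprintf(stderr, "step 2: read current attributes\n");

    /* Same raw 8-bit configuration as the original listing. */
    tty.c_iflag = IGNBRK | IGNPAR;
    tty.c_lflag = 0;
    tty.c_oflag = 0;
    tty.c_cflag = CREAD | CS8 | CLOCAL;
    if (cfsetispeed(&tty, speed) == -1 || cfsetospeed(&tty, speed) == -1) {
        fprintf(stderr, "cfsetspeed: %s\n", strerror(errno));
        close(fd);
        return -1;
    }

    if (tcsetattr(fd, TCSANOW, &tty) == -1) {
        fprintf(stderr, "tcsetattr: %s\n", strerror(errno));
        close(fd);
        return -1;
    }
    fprintf(stderr, "step 3: attributes applied\n");
    return fd;
}
```

Calling serial_open(argv[1], B57600) from main() and noting the last "step" line printed before the terminal locks up shows which call never returns.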
/Problem #2, unkillable processes:/
When I run the program, the terminal goes into a weird mode where the prompt is gone and I can type anything I want (usually a sign that a process is still running). I cannot exit with Control-C. The only way out seems to be closing the terminal window, and that does not kill the running process.
I can easily look up the PID, but Force Quit in Activity Monitor fails to kill the process, as do kill -9 [pid], killall [program name], and so on, even though they all acknowledge that the process exists. The only way to terminate the process seems to be to physically cut power to the computer and reboot it (shutting down doesn't work because it tries, and fails, to terminate the processes). I am wasting a terrible amount of time if I need to power-cycle my laptop on every run just to debug the program! I can keep creating more processes but am unable to get rid of them.
I think if I knew the parent process I might be able to kill these "zombie" processes but I don't know what the parent is.
Any ideas on how to get rid of these processes without power-cycling would be a tremendous help, thanks!
On Mac OS X 10.8.4, I created a program zombie from zombie.c:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void)
{
pid_t pid;
if ((pid = fork()) >= 0)
{
if (pid == 0)
{
printf("%d: committing suicide (parent %d)\n", (int)getpid(), (int)getppid());
exit(0);
}
else
{
printf("%d: going to sleep for a while - child %d might die while I snooze\n",
(int)getpid(), (int)pid);
sleep(30);
printf("%d: awake\n", (int)getpid());
}
}
return 0;
}
When I ran it in one terminal window, the output was:
$ ./zombie
2443: going to sleep for a while - child 2444 might die while I snooze
2444: committing suicide (parent 2443)
2443: awake
$
In another terminal window, running ps -f produced:
UID PID PPID C STIME TTY TIME CMD
503 260 249 0 12:42PM ttys004 0:00.08 -bash
503 2443 260 0 5:11PM ttys004 0:00.00 ./zombie
503 2444 2443 0 5:11PM ttys004 0:00.00 (zombie)
The parenthesized (zombie) entry is the defunct process; the name in parentheses is the original name of the program. When I copied the program to living-dead, the corresponding output was:
UID PID PPID C STIME TTY TIME CMD
503 260 249 0 12:42PM ttys004 0:00.09 -bash
503 2454 260 0 5:13PM ttys004 0:00.00 ./living-dead
503 2455 2454 0 5:13PM ttys004 0:00.00 (living-dead)
(On most systems, the defunct process is marked as <defunct> or something similar.)
Clearly, the value in the PPID column identifies the parent process of the zombie, and the various process IDs match the output from the programs themselves.
You can find out who birthed that undead process by running ps ef or using htop.
PID TTY STAT TIME COMMAND
21138 tty1 T 0:00 sudo /usr/bin/aura -Akax open_watcom openwatcom-extras-hg HOME=/home/hav3lock USER=hav3lock SHELL=/bin/zsh TERM=linux PATH=/usr/local/sbin:/u
21139 tty1 T 0:00 \_ /usr/bin/aura -Akax open_watcom openwatcom-extras-hg TERM=linux PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/
22111 tty1 Z 0:00 \_ [su] <defunct>
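In the output above I have a zombie process whose parent is /usr/bin/aura, a process spawned by sudo /usr/bin/aura.
Running sudo kill -9 21138 kills the top of that tree. Once the zombie's parent is gone, the zombie is re-parented to init, which reaps it; a zombie cannot itself be killed because it has already exited.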
PID TTY STAT TIME COMMAND
23858 pts/9 S+ 0:00 sudo sudo ps ef LANG=en_US.utf8 DISPLAY=:0 SHLVL=3 LOGNAME=hav3lock XDG_VTNR=1 PWD=/home/hav3lock/sy.l/repos/pub_rel/slen/linux/src HG=/usr/b
23860 pts/9 S+ 0:00 \_ sudo ps ef LANG=en_US.utf8 DISPLAY=:0 LS_COLORS=no=0:fi=0:di=34:ln=00;96:or=91;4:mi=00;31;9:mh=01;37:pi=33;7:so=01;35:do=35:bd=01;33:cd=9
23861 pts/9 R+ 0:00 \_ ps ef LANG=en_US.utf8 DISPLAY=:0 LS_COLORS=no=0:fi=0:di=34:ln=00;96:or=91;4:mi=00;31;9:mh=01;37:pi=33;7:so=01;35:do=35:bd=01;33:cd=93
13405 tty2 S<sl+ 3:49 X :0 HOME=/home/hav3lock USER=hav3lock SHELL=/bin/zsh TERM=linux PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/ve
Related
So I am writing my own shell in a C program (basically same as bash) and I am working on the sleep command. I want to be able to run the sleep command in the background, thus if I type in "sleep 10 &", it will be nonblocking and allow me to continue using the shell.
BUT when the background sleep command finishes (after 10 seconds), instead of terminating, it turns into a zombie process, why? I want it to just be completely gone.
Also, how do I print a message AFTER the background process has finished running?
Below is my sleep function that just calls the execl function:
void sleepFunc(char *secondsInput)
{
char *bin_path = "/bin/sleep";
char *args[] = {bin_path, secondsInput, NULL};
execl("/bin/sleep", "sleep", secondsInput, NULL);
perror("execl");
}
Below is my code that forks a child process and uses child process to call the sleep function:
int childStatus;
pid_t childPid = fork();
int childProcess_pid;
if (childPid == -1)
{
perror("fork() failed!");
exit(1);
break;
}
else if (childPid == 0)
{
// Child process executes this branch
childProcess_pid = getpid();
sleepFunc(secondsInput);
break;
}
else
{
// The parent process executes this branch
printf("background pid is %d\n", childPid);
// WNOHANG specified. If the child hasn't terminated, waitpid will immediately return with value 0
childPid = waitpid(childPid, &childStatus, WNOHANG);
printf("Background process %d has completed!\n", (int)childPid); // I want this print statement to print only AFTER the background process completes (so after x amount of seconds)
}
}
My output is shown below. First "ps" command is executed immediately after typing "sleep 10 &", and the second "ps" command is executed after 10 seconds. Child process becomes a zombie, and nothing gets printed to the terminal letting me know it has completed. Any suggestions?
: sleep 10 &
background pid is 52002
In the parent process waitpid returned value 0
: ps
PID TTY TIME CMD
45433 ttys001 0:00.50 -zsh
25223 ttys003 0:08.71 /bin/zsh -l
51996 ttys003 0:00.00 ./a.out
52002 ttys003 0:00.00 sleep 10
: ps
PID TTY TIME CMD
45433 ttys001 0:00.50 -zsh
25223 ttys003 0:08.71 /bin/zsh -l
51996 ttys003 0:00.00 ./a.out
52002 ttys003 0:00.00 (sleep)
:
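The child stays a zombie because waitpid() with WNOHANG is called once, right after fork(), when the child is still running; it returns 0 and nothing ever asks again. One common approach is to poll for finished children each time the shell is about to print its prompt. A minimal sketch, assuming a hypothetical helper reap_background() that the shell's main loop would call before reading the next command:

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper (an assumption, not from the original shell):
 * collect every background child that has already finished, without
 * blocking. Returns the number of children reaped. */
int reap_background(void)
{
    int status;
    int reaped = 0;
    pid_t pid;

    /* -1 means "any child". With WNOHANG, waitpid() returns 0 if children
     * exist but none has finished, and -1 if there are no children left;
     * either value ends the loop. */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        if (WIFEXITED(status))
            printf("background pid %d is done: exit value %d\n",
                   (int)pid, WEXITSTATUS(status));
        else if (WIFSIGNALED(status))
            printf("background pid %d is done: terminated by signal %d\n",
                   (int)pid, WTERMSIG(status));
        reaped++;
    }
    return reaped;
}
```

With this in place, the completion message appears at the first prompt after the 10 seconds have elapsed, and the (sleep) entry disappears from ps. Installing a SIGCHLD handler is the usual alternative when the message has to appear immediately.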
In my application I need to start BusyBox udhcpd (a DHCP server); the code is below. udhcpd does start and run, but I get two entries in the process list. udhcpd itself works correctly, i.e. it assigns IP addresses to devices.
pid_t forked_pid = vfork();
if ( forked_pid == 0 )
{
// Child process, execute udhcpd.
execl( "/usr/bin/udhcpd",
"udhcpd",
"/var/run/udhcpd.conf", // the location of the udhcpd config file
NULL );
}
else if ( forked_pid > 0 )
{
// Parent process, record the childs pid
m_udhcpd_pid = forked_pid;
log( Log_Info, "UDHCPD started with PID: %d (PID=%d)", forked_pid, getpid());
}
else
{
log( Log_Warning, "Failed to start UDHCPD" );
}
Log Output
UDHCPD started with PID: 647 (PID=528)
PS output
528 root 0:03 ./MyApp
647 root 0:00 [udhcpd]
648 root 0:00 udhcpd /var/run/udhcpd.conf
Now if I look at /var/run/udhcpd.pid it has the pid of 648. In another part of our code we start dhcpcd (dhcp client) using the same code as above and it only has one entry in the process list. Can anyone explain what the difference is and if I am doing things incorrectly what I should be doing?
The reason for asking is that I need to stop udhcpd later, and it seems I will need both the child's PID (647) and the PID read from /var/run/udhcpd.pid (648).
I believe the answer is that udhcpd forks again to put itself in the background, and the process I started (647) exits and is left as a zombie because my application never waits for it. I reverted to starting udhcpd with a system() call and killing it using the PID in the PID file.
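If udhcpd does background itself as suspected, the bracketed [udhcpd] entry is the short-lived launcher waiting to be reaped, and the PID file names the real daemon. A minimal sketch under that assumption, using two hypothetical helpers (launch_and_reap() and read_pid_file() are illustrative names, not BusyBox or application APIs):

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a program that puts itself in the background
 * and reap the launcher process. Returns the launcher's exit status, or -1.
 * Assumption: the launcher exits soon after starting the daemon; a program
 * that stays in the foreground would make the waitpid() below block. */
int launch_and_reap(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(path, argv);
        _exit(127);   /* exec failed: leave without running parent code */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical helper: read the daemon's PID from its PID file.
 * Returns -1 if the file is missing or does not start with a number. */
pid_t read_pid_file(const char *pid_file)
{
    long value = -1;
    FILE *fp = fopen(pid_file, "r");

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &value) != 1)
        value = -1;
    fclose(fp);
    return (pid_t)value;
}
```

Stopping the server later then needs only the PID from read_pid_file("/var/run/udhcpd.pid"); the launcher's PID is no longer relevant once it has been reaped.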
I am trying to simulate the shell's | (pipe) operator on Linux. I've already parsed the arguments properly, but control is not returning to my main process, and I'm left with one uninterruptible process and a zombie. I don't understand why: I figured that processes I spawn should terminate on their own, since they are just regular Linux processes. The ps output is below; I'm just running ps aux | grep notepad. I previously posted "How do I create a grep process with fork that will accept data from a pipe in Linux C programming", but this issue is different: I get the correct output, I just don't want the processes to hang.
1000 4074 0.0 0.0 4392 824 pts/0 S+ 21:38 0:00 grep notepad
1000 4075 0.0 0.0 0 0 pts/0 Z+ 21:38 0:00 [ps] <defunct>
1000 4076 0.0 0.0 4944 1172 pts/0 R+ 21:38 0:00 ps aux
int ppid = fork ();
if(ppid == 0)
{
pid = fork();
//Parent assume execution control
if(ppid == 0 && pid != 0)
{
//Close the parents in and redirect to pipe
close(0);
dup(pfds[0]);
close(pfds[1]);
execvp(secondargs[0], secondargs);
perror("exect failed to");
exit(-1);
}
//C1 execute first line of command line
else if(pid == 0)
{
close(1); //close stdout
dup(pfds[1]); // make stdout pfds[1]
close(pfds[0]);
//execute the args
execvp(args[0], args);
perror("exec failed to");
exit(-1);
}
}
Processes must be wait()ed for. The zombie entry exists to retain the process return code until the wait() picks it up, since the RC might carry important information about how the child process exited.
If you don't want to keep appropriate wait()s going until the children exit, one standard trick is the "double fork": spawn an intermediate process that launches the desired child and then exits immediately. The child is thereby orphaned and adopted by the system's init process, whose default wait() handling absorbs the zombie and discards the return code. (The intermediate process still has to be waited for, but it exits at once, so that wait() returns almost immediately.)
Websearch for "unix fork zombie wait", or some combination of those terms, will find examples and more extensive discussion of the issue.
I'm fairly sure that the problem is that you create the pipe before you fork any children (since there is no pipe() call in the code fragment).
Unfortunately, you don't close the pipe in the original parent process, so it can still write to the write end of the pipe. Therefore, even though the ps has exited, the system knows that the parent could still write on the pipe (even though it won't) so grep never gets told EOF.
In other words, grep is waiting for the original parent process to close the pipe, and the original parent process is waiting for grep to finish (or, at least, it might be). If the original parent exits, then grep will get EOF and will exit, and the system will clean up both corpses. If the original parent is waiting, then it is going to wait for a very long time.
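A minimal sketch of the fix described above, assuming a hypothetical helper run_pipeline() in place of the fragment from the question: the pipe is created before either fork, each child closes both pipe descriptors after dup2(), and the original parent closes both ends before it waits.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (illustrative name): run "left | right".
 * Returns the exit status of the right-hand command, or -1 on error. */
int run_pipeline(char *const left[], char *const right[])
{
    int pfds[2];
    int status;
    pid_t lpid, rpid;

    if (pipe(pfds) == -1)
        return -1;

    lpid = fork();
    if (lpid == 0) {
        dup2(pfds[1], STDOUT_FILENO);   /* stdout now feeds the pipe */
        close(pfds[0]);
        close(pfds[1]);
        execvp(left[0], left);
        _exit(127);
    }

    rpid = (lpid > 0) ? fork() : -1;
    if (rpid == 0) {
        dup2(pfds[0], STDIN_FILENO);    /* stdin now reads the pipe */
        close(pfds[0]);
        close(pfds[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The crucial step: the parent must not keep the write end open,
     * otherwise the reader never sees EOF. */
    close(pfds[0]);
    close(pfds[1]);

    /* Reap both children so neither is left as a zombie. */
    if (lpid > 0)
        waitpid(lpid, NULL, 0);
    if (rpid > 0 && waitpid(rpid, &status, 0) == rpid && WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

For the command in the question, left would be { "ps", "aux", NULL } and right would be { "grep", "notepad", NULL }.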
I have written a C program that uses fork(2) and execl(3) to run ssh for port forwarding purposes.
The ssh instances are run in the background with the -f option.
When the C program exits, I want it to send SIGTERM to the ssh instances it spawned.
I have tried
// creating the ssh process
ssh_pid = fork();
if (ssh_pid == 0)
execl("/usr/bin/ssh", "/usr/bin/ssh", "-f", other options, NULL);
// ...
printf("Killing %d\n", ssh_pid); // <--- output the PID
kill(ssh_pid, 15);
sleep(1); // give it a chance to exit properly
if (kill(ssh_pid, 0) == 0)
kill(ssh_pid, 9); // the "shotgun" approach
However, this doesn't work (even with the SIGKILL).
If I run ps before the program exits
ps aux | grep ssh | grep -v sshd | grep -v grep
I see something like this:
user 27825 0.2 0.0 0 0 pts/0 Z+ 18:23 0:00 [ssh] <defunct>
user 27834 0.0 0.0 41452 1176 ? Ss 18:23 0:00 /usr/bin/ssh -f [other options]
When the program prints the PID it is killing, I see this:
Killing 27825
Subsequently repeating the ps gives me:
user 27834 0.0 0.0 41452 1176 ? Ss 18:23 0:00 /usr/bin/ssh -f [other options]
It seems that the original ssh has forked itself in order to become a background process.
So I changed my call to kill(2) to attempt to kill all processes spawned by the original ssh:
kill(-ssh_pid, 15);
But this appears to have no effect. I suspect it is because the original ssh is no longer the parent of the backgrounded ssh.
So, how do I safely kill the backgrounded ssh? Is it even possible?
The solution I have just found is not to use the -f option at all, and background the ssh myself.
ssh_pid = fork();
if (ssh_pid == 0)
{
freopen("/dev/null", "r", stdin);
freopen("/dev/null", "w", stdout);
freopen("/dev/null", "w", stderr);
execl("/usr/bin/ssh", "/usr/bin/ssh", other options, NULL);
}
This works because, when ssh gets the -f option, it forks a child to go into the background; the PID I recorded belongs to the original process, and signals sent to it are not passed on to that child.
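Once ssh is started without -f, the PID returned by fork() is the ssh process itself, so it can be signalled directly. A minimal sketch of the shutdown step, assuming a hypothetical helper terminate_child(): it sends SIGTERM, polls with waitpid() for a grace period, and escalates to SIGKILL only if the child is still running. It polls with waitpid() rather than kill(pid, 0) because the latter also succeeds for a zombie that has not been reaped yet.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (illustrative name): stop a direct child politely,
 * then forcefully. Returns the number of the signal that ended the child,
 * 0 if it exited on its own, or -1 on error. */
int terminate_child(pid_t pid, int grace_ms)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int status;
    int waited_ms = 0;
    pid_t r;

    if (kill(pid, SIGTERM) == -1)
        return -1;

    /* waitpid() with WNOHANG returns 0 while the child is still running. */
    while ((r = waitpid(pid, &status, WNOHANG)) == 0 && waited_ms < grace_ms) {
        nanosleep(&step, NULL);
        waited_ms += 10;
    }

    if (r == 0) {
        /* Grace period expired: force termination, then reap. */
        kill(pid, SIGKILL);
        r = waitpid(pid, &status, 0);
    }
    if (r != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Because the helper always ends with a successful waitpid(), the terminated ssh is reaped and does not linger as a <defunct> entry.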
My program is as follows:
//... init fd[2] as pipe ...
if (child==0){
close(fd[1]);
dup2(fd[0], 0);
execlp("/bin/sh","sh",NULL);
} else {
close(fd[0]);
char *line; int nbytes=100; int bytes=0;
line=(char*) malloc(nbytes+1);
while ( (bytes = getline((char **)&line,&nbytes,stdin))!= -1 ){
write(fd[1],line, bytes);
}
}
This runs OK; however, when I replace execlp("/bin/sh","sh",NULL) with execlp("/bin/sh","sh","-i",NULL) to force an interactive shell, my program stops after executing the first command.
I'm new to pipes, so please help me understand the reason and make the interactive shell work. I also feel that my code to read a line and pass it to the child through the pipe is a bit odd. Is there a better way to achieve the same behavior?
You should close(fd[0]); after the dup2() in the child. If you supply an absolute or relative path like "/bin/sh", there's no point in using execlp(); it will only do a PATH-based search for a bare filename (program name). The cast in the call to getline() should be unnecessary; avoid such casts whenever possible. You should include at least exit(1); after execlp() just in case it fails; a diagnostic message would be a good idea too. You should close(fd[1]); after the loop in the parent to indicate EOF to the child. (Just for once, it doesn't matter if you don't detect the error return from malloc(); it is legitimate to pass the address of a pointer where the pointer holds NULL to the getline() function, and it will then try to allocate memory itself. Of course, if the main program fails to allocate memory, it is highly likely that getline() will also fail to allocate memory.)
Those changes lead to:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int main(void)
{
int fd[2];
pid_t child;
if (pipe(fd) != 0)
perror("pipe");
else if ((child = fork()) < 0)
perror("fork");
else if (child == 0)
{
close(fd[1]);
dup2(fd[0], 0);
close(fd[0]);
execl("/bin/sh", "sh", NULL);
perror("oops");
exit(1);
}
else
{
close(fd[0]);
size_t nbytes = 100;
int bytes = 0;
char *line = (char*)malloc(nbytes+1);
while ((bytes = getline(&line, &nbytes, stdin)) != -1)
{
write(fd[1], line, bytes);
}
close(fd[1]);
}
return(0);
}
This compiles without complaint under stringent compilation flags:
gcc -O3 -g -std=c99 -Wall -Wextra xf.c -o xf
When run (on Mac OS X 10.7.3) with the code above (invoking sh without the -i option), things behave reasonably sanely. You can type commands and the shell executes them. You can type 'exit' and the shell exits, but the program you wrote (which I called xf) doesn't exit until I type a new command. It then exits because of a SIGPIPE signal as it writes to the now readerless pipe. There is no prompt from this shell because its standard input is not a terminal (it is a pipe).
When the sub-shell is run with the -i option, then there seems to be a fight between the job control shells about which shell is in charge of the terminal. When I run it, I get:
$ ps -f
UID PID PPID C STIME TTY TIME CMD
503 381 372 0 Wed08PM ttys001 0:00.07 -sh
503 21908 381 0 9:32PM ttys001 0:00.01 sh
$ ./xf
sh-3.2$
[1]+ Stopped(SIGTTIN) ./xf
$
$ ps -f
UID PID PPID C STIME TTY TIME CMD
503 381 372 0 Wed08PM ttys001 0:00.07 -sh
503 21908 381 0 9:32PM ttys001 0:00.01 sh
503 22000 21908 0 9:36PM ttys001 0:00.00 ./xf
503 22001 22000 0 9:36PM ttys001 0:00.00 sh -i
$ ls
awk.data osfile-keep.c pthread-2.c send.c xf
const-stuff.c perl.data pthread-3.c so.8854855.sql xf.c
fifocircle.c piped-merge-sort.c quine.c strandsort.c xf.dSYM
madump.c powa.c recv.c unwrap.c xxx.sql
makefile pthread-1.c regress.c vap.c yyy.sql
$ jobs
[1]+ Stopped(SIGTTIN) ./xf
$ fg %1
./xf
exit
$
(The initial -sh is the login shell for my terminal window. In that, I've run sh to get a sub-shell, and I've set the prompt PS1='$ ' to make the prompt distinctive.)
AFAICT, the sh-3.2$ prompt comes from the sh -i shell. The parent shell seems to be reading input, and has dumped the xf program into the background, which is not very civilized of it. The ps -f output doesn't show the ps command, which is a nuisance. I did manage to get the ls command to show up in a ps listing in one run, and it was the child of the original shell, not the sh -i run by xf. When I bring xf into the foreground, it immediately exits (presumably it reads 0 bytes from standard input, which indicates EOF, and so getline() returns -1, and everything shuts up shop). The exit is from the sh -i; it echoes it. It never got any input because the sh shell took command instead of letting xf have control of the terminal. That's pretty excruciatingly messy. I'm not sure why it happens like that, but it feels to me like it shouldn't happen like that.