execve("/bin/sh", 0, 0); in a pipe - c

I have the following example program:
#include <stdio.h>
#include <unistd.h>
int
main(int argc, char ** argv){
char buf[100];
printf("Please enter your name: ");
fflush(stdout);
gets(buf);
printf("Hello \"%s\"\n", buf);
execve("/bin/sh", 0, 0);
}
When I run it without a pipe, it works as it should and gives me an sh prompt:
bash$ ./a.out
Please enter your name: warning: this program uses gets() which is unsafe.
testName
Hello "testName"
$ exit
bash$
But this does not work in a pipe. I think I know why, but I cannot figure out a solution. Example run below:
bash$ echo -e "testName\npwd" | ./a.out
Please enter your name: warning: this program uses gets() which is unsafe.
Hello "testName"
bash$
I figure this has something to do with the fact that gets empties stdin in such a way that /bin/sh receives an EOF and promptly quits without an error message.
But how do I get around this (without modifying the program if possible, or at least without removing gets) so that I get a prompt even though I supply input through a pipe?
P.S. I am running this on a FreeBSD (4.8) machine D.S.

You can run your program without any modifications like this:
(echo -e 'testName\n'; cat ) | ./a.out
This way you ensure that your program's standard input doesn't end after what echo outputs. Instead, cat continues to supply input to your program. The source of that subsequent input is your terminal since this is where cat reads from.
Here's an example session:
bash-3.2$ cc stdin_shell.c
bash-3.2$ (echo -e 'testName\n'; cat ) | ./a.out
Please enter your name: warning: this program uses gets(), which is unsafe.
Hello "testName"
pwd
/home/user/stackoverflow/stdin_shell_question
ls -l
total 32
-rwxr-xr-x 1 user group 9024 Dec 14 18:53 a.out
-rw-r--r-- 1 user group 216 Dec 14 18:52 stdin_shell.c
ps -p $$
PID TTY TIME CMD
93759 ttys000 0:00.01 (sh)
exit
bash-3.2$
Note that because the shell's standard input is not connected to a terminal, sh concludes that it is not running interactively and hence does not display a prompt. You can type your commands normally, though.

Using execve("/bin/sh", 0, 0); is cruel and unusual punishment for the shell. It gives it no arguments or environment at all - not even its own program name, nor even such mandatory environment variables as PATH or HOME.

Not 100% sure of this (the precise shell and OS might change the answer; as far as I know, FreeBSD's /bin/sh is an ash-derived shell rather than GNU bash), but
sh may be detecting that its input is not a tty.
or
Your version of sh might also go into non-interactive mode when called as sh, expecting login to prepend a - to argv[0] for it. Calling execve("/bin/sh", (char *[]){ "-sh", NULL }, NULL) might convince it that it's being run as a login shell.

Related

C program is not reading redirected shell command standard input

I have compiled the below c code which should send any input it receives in standard input to standard output. It works as expected in the following scenarios:
[user@host ~]$ ./my_program
test...
test...
^C
[user@host ~]$ echo "Hello" | ./my_program
Hello
[user@host ~]$ ./my_program < test.txt
Contents of test.txt ...
However, if I redirect the output of a shell command into my program like so:
[user@host ~]$ ./my_program <(echo "Hello")
It does not output anything and waits for input as if I started the program with just ./my_program
I expected an output of Hello and then the program to end. When I run the command cat <(echo "Hello") I get this expected result. What is causing the difference in behaviour between cat and my_program?
/* my_program.c */
#include <stdio.h>
int main()
{
int c = getchar();
while (c != EOF) {
putchar(c);
c = getchar();
}
return 0;
}
Posting Community Wiki because the question is caused by a typo and thus off-topic.
<(echo "Hello") expands to a filename (such as /dev/fd/63) connected to the output of echo "Hello". You're passing that filename as a command-line argument to your program, not attaching it to stdin.
To attach it to stdin you need an extra < on the command line:
./my_program < <(echo "Hello")
It works with cat the other way because when cat is passed command-line arguments, it treats them as files to read from.

Bash reopen tty on simple program

#include <stdio.h>
#include <stdlib.h>
int main()
{
char buf[512];
fgets(buf, 512, stdin);
system("/bin/sh");
}
Compile with cc main.c
I would like a one-line command that makes this program run ls without it waiting for user input.
# This does not work - it prints nothing
(echo ; echo ls) | ./a.out
# This does work if you type ls manually
(echo ; cat) | ./a.out
I'm wondering:
Why doesn't the first example work?
What command would make the program run ls, without changing the source?
My question is shell and OS-agnostic but I would like it to work at least on bash 4.
Edit:
While testing out the answers, I found out that this works.
(python -c "print ''" ; echo ls) | ./a.out
Using strace:
$ (python -c "print ''" ; echo ls) | strace ./a.out
...
read(0, "\n", 4096)
...
This also works:
(echo ; sleep 0.1; echo ls) | ./a.out
It seems like the buffering is ignored. Is this due to the race condition?
strace shows what's going on:
$ ( echo; echo ls; ) | strace ./foo
[...]
read(0, "\nls\n", 4096) = 4
[...]
clone(child_stack=NULL, flags=CLONE_PARENT_SETTID|SIGCHLD, parent_tidptr=0x7ffdefc88b9c) = 9680
In other words, your program reads a whole 4096 byte buffer that includes both lines before it runs your shell. It's fine interactively, because by the time the read happens, there's only one line in the pipe buffer, so over-reading is not possible.
You instead need to stop reading after the first \n, and the only way to do that is to read byte by byte until you hit it. I don't know libc well enough to know if this kind of functionality is supported, but here it is with read directly:
#include <unistd.h>
#include <stdlib.h>
int main()
{
char buf[1];
while((read(0, buf, 1)) == 1 && buf[0] != '\n');
system("/bin/sh");
}

Why does using pipes with `who` cause mom not to like me?

In a program I'm writing, I fork() and execl() to determine who mom likes. I noticed that if I set up pipes to write to who's stdin, it produces no output. If I don't set up pipes to write to stdin, then who produces output as normal. (Yes, I know, writing to who's stdin is pointless; it was residual code from executing other processes that made me discover this.)
Investigating this, I wrote this simple program (edit: for a simpler example, just run: true | who mom likes):
$ cat t.c
#include <unistd.h>
#include <assert.h>
int main()
{
int stdin_pipe[2];
assert( pipe(stdin_pipe) == 0);
assert( dup2(stdin_pipe[0], STDIN_FILENO) != -1);
assert( close(stdin_pipe[0]) == 0);
assert( close(stdin_pipe[1]) == 0);
execl("/usr/bin/who", "/usr/bin/who", "mom", "likes", (char*)NULL);
return 0;
}
Compiling and running results in no output, which is what surprised me initially:
$ cc t.c
$ ./a.out
$
However, if I compile with -DNDEBUG (to remove the piping work in the assert()s) and run, it works:
$ cc -DNDEBUG t.c
$ ./a.out
batman pts/0 2014-08-15 12:57 (:0)
$
As soon as I call dup2(stdin_pipe[0], STDIN_FILENO), who stops producing output. The only explanation I could come up with is that dup2 affects the tty, and who uses the tty to determine who I am (given that the -m flag prints "only hostname and user associated with stdin"). My main question is:
Why can't who mom likes/who am i/who -m determine who I am when I give it a pipe for stdin? What mechanism is it using to determine its information, and why does using a pipe ruin this mechanism? I know it's using stdin somehow, but I don't understand exactly how or exactly why stdin being a pipe matters.
Let's look at the source code for GNU coreutils who:
if (my_line_only)
{
ttyname_b = ttyname (STDIN_FILENO);
if (!ttyname_b)
return;
if (STRNCMP_LIT (ttyname_b, DEV_DIR_WITH_TRAILING_SLASH) == 0)
ttyname_b += DEV_DIR_LEN; /* Discard /dev/ prefix. */
}
When -m (my_line_only) is used, who finds the tty device connected to stdin, and then proceeds to find the entry for that tty in utmp.
When stdin is not a terminal, there is no name to look up in utmp, so it exits without printing anything.

Redirecting output of a C program as input of another program in Linux command shell

I wrote a program p1.c which takes input from the Linux command shell (using char n = argv[1][0]). I want the character output of p1.c to be taken as the input of program p2.c. How can I do this? I used the command
./p2.out < ./p1.out T > output.txt. It doesn't seem to work, as 'T' is taken as input for p2.out and its output is written to output.txt.
Use a pipeline: ./p1.out T | ./p2.out
The < operator only opens the file ./p1.out for reading; it does not run it. The | operator runs p1.out and connects its standard output to the standard input of p2.out.

Why first arg to execve() must be path to executable

I understand that execve() and family require the first argument of its argument array to be the same as the executable that is also pointed to by its first argument. That is, in this:
execve(prog, args, env);
args[0] will usually be the same as prog. But I can't seem to find information as to why this is.
I also understand that executables (er, at least shell scripts) always have their calling path as the first argument when running, but I would think that the shell would do the work to put it there, and execve() would just call the executable using the path given in its first argument ("prog" from above), then passing the argument array ("args" from above) as one would on the command line.... i.e., I don't call scripts on the command line with a duplicate executable path in the args list....
/bin/ls /bin/ls /home/john
Can someone explain?
There is no requirement that the first of the arguments bear any relation to the name of the executable:
#include <unistd.h>
int main(void)
{
char *args[3] = { "rip van winkle", "30", 0 };
execv("/bin/sleep", args);
return 1;
}
Try it - on a Mac (after three tests):
make x; ./x & sleep 1; ps
The output on the third run was:
MiniMac JL: make x; ./x & sleep 1; ps
make: `x' is up to date.
[3] 5557
PID TTY TIME CMD
5532 ttys000 0:00.04 -bash
5549 ttys000 0:00.00 rip van winkle 30
5553 ttys000 0:00.00 rip van winkle 30
5557 ttys000 0:00.00 rip van winkle 30
MiniMac JL:
EBM comments:
Yeah, and this makes it even more weird. In my test bash script (the target of the execve), I don't see the value of what execve has in arg[0] anywhere -- not in the environment, and not as $0.
Revising the experiment - a script called 'bash.script':
#!/bin/bash
echo "bash script at sleep (0: $0; *: $*)"
sleep 30
And a revised program:
#include <unistd.h>
int main(void)
{
char *args[3] = { "rip van winkle", "30", 0 };
execv("./bash.script", args);
return 1;
}
This yields the ps output:
bash script at sleep (0: ./bash.script; *: 30)
PID TTY TIME CMD
7804 ttys000 0:00.11 -bash
7829 ttys000 0:00.00 /bin/bash ./bash.script 30
7832 ttys000 0:00.00 sleep 30
There are two possibilities as I see it:
The kernel juggles the command line when executing the script via the shebang ('#!/bin/bash') line, or
Bash itself dinks with its argument list.
How to establish the difference? I suppose copying the shell to an alternative name, and then using that alternative name in the shebang would tell us something:
$ cp /bin/bash jiminy.cricket
$ sed "s%/bin/bash%$PWD/jiminy.cricket%" bash.script > tmp
$ mv tmp bash.script
$ chmod +x bash.script
$ ./x & sleep 1; ps
[1] 7851
bash script at sleep (0: ./bash.script; *: 30)
PID TTY TIME CMD
7804 ttys000 0:00.12 -bash
7851 ttys000 0:00.01 /Users/jleffler/tmp/soq/jiminy.cricket ./bash.script 30
7854 ttys000 0:00.00 sleep 30
$
This, I think, indicates that the kernel rewrites argv[0] when the shebang mechanism is used.
Addressing the comment by nategoose:
MiniMac JL: pwd
/Users/jleffler/tmp/soq
MiniMac JL: cat al.c
#include <stdio.h>
int main(int argc, char **argv)
{
while (*argv)
puts(*argv++);
return 0;
}
MiniMac JL: make al
cc al.c -o al
MiniMac JL: ./al a b 'c d' e
./al
a
b
c d
e
MiniMac JL: cat bash.script
#!/Users/jleffler/tmp/soq/al
echo "bash script at sleep (0: $0; *: $*)"
sleep 30
MiniMac JL: ./x
/Users/jleffler/tmp/soq/al
./bash.script
30
MiniMac JL:
That shows that it is the shebang '#!/path/to/program' mechanism, rather than any program such as Bash, that adjusts the value of argv[0]. So, when a binary is executed, the value of argv[0] is not adjusted. When a script is executed via the shebang, the kernel adjusts the argument list: argv[0] is the binary listed on the shebang line; if there is an argument after it on the shebang line, that becomes argv[1]; the next argument is the name of the script file, followed by any remaining arguments from the execv() or equivalent call. The argv[0] originally passed to execv() is dropped.
MiniMac JL: cat bash.script
#!/Users/jleffler/tmp/soq/al -arg0
#!/bin/bash
#!/Users/jleffler/tmp/soq/jiminy.cricket
echo "bash script at sleep (0: $0; *: $*)"
sleep 30
MiniMac JL: ./x
/Users/jleffler/tmp/soq/al
-arg0
./bash.script
30
MiniMac JL:
According to this, the first argument being the program name is a custom.
by custom, the first element should be
the name of the executed program (for
example, the last component of path)
That said, these values could be different. If, for example, the program was launched through a symbolic link, the name of the link used to launch it might differ from the name of the program itself.
And, you are right. The shell would normally do the work of setting up the first argument. In this case however, the use of execve circumvents the shell altogether - which is why you need to set it up yourself.
It allows you to specify the exact path to the executable to be loaded, but also allows for a "beautified" name to be presented in tools such as ps or top.
execl("/bin/ls", "ls", "/home/john", (char *)0);
That allows a program to have many names and work slightly differently depending on which name it was called by.
Imagine a trivial program, e.g. print0.c compiled into print0:
#include <stdio.h>
int main(int argc, char **argv)
{
printf("%s\n",argv[0]);
return 0;
}
Running it as ./print0 would print ./print0. Make a symbolic link to it, e.g. print1, and run it as ./print1 - it would print ./print1.
That was with a symlink. But with the exec*() functions, you can tell a program its name explicitly.
Artifact from *NIX, but nice to have nevertheless.
