We know that each time a user runs a program by typing the name of an executable object file into the shell, the shell creates a new process (using fork) and then loads and runs the executable object file in the context of this new process (using execve).
Below is my understanding of how a shell works internally; please correct me if I am wrong:
Commands such as ls and cat are executable object files (compiled from C source) located in the /bin/ directory. For example, when a user types ls in the bash shell to list files and directories, bash interprets the ls command and forks a child process to run ls.
Q1-Is my understanding correct?
Q2-If my understanding is correct, then consider a shell running a .sh script file such as:
#!/bin/sh
echo "what is your name?"
read name
so the shell forks two child processes for echo and read; then how do these two processes communicate with each other? I mean how does the return output of echo process get passed to read process?
Is my understanding correct?
Generally, yes. But the executable file is not necessarily in /bin/. The shell searches for a file named ls in the directories listed in the PATH environment variable and uses the first match. The file can be in /usr/bin, /usr/sbin, /usr/local/bin, etc.
And ls may be a builtin. Or a function. Or an alias.
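The lookup order described above can be tried out with a small sketch. The command name hello_tool and the temporary directory are made up for the illustration; command -v and command are standard shell facilities.

```shell
# Throwaway directory holding a hypothetical command named hello_tool.
dir=$(mktemp -d)
printf '#!/bin/sh\necho "from file"\n' > "$dir/hello_tool"
chmod +x "$dir/hello_tool"

# Put the directory first in PATH; the bare name now resolves to our file.
PATH="$dir:$PATH"
resolved=$(command -v hello_tool)
echo "resolved to: $resolved"
from_path=$(hello_tool)

# A function with the same name is found before PATH is searched.
hello_tool() { echo "from function"; }
from_function=$(hello_tool)

# 'command' skips functions and goes back to the PATH lookup.
from_command=$(command hello_tool)

echo "$from_path / $from_function / $from_command"
rm -r "$dir"
```

The same name therefore means different things depending on what the shell finds first, which is why the file in /bin/ is only one possible answer.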
so the shell forks two child processes for echo and read
And this is where a "built-in" comes in. A built-in is a command implemented inside the shell itself. There is no fork; the shell just runs some internal code, and that is how a built-in can modify the shell's variables. echo does not have to be a builtin, since it only outputs data. But read has to be handled specially and is most probably a builtin, so that it can modify the name variable (there is no requirement for read to be a builtin, and it may not be, but shell writers usually solve this problem by just making read a builtin).
In bash you can check what kind of command a name refers to with type. For example, type echo prints echo is a shell builtin.
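A minimal sketch of why read needs to run inside the shell process. The names Alice and Bob and the variable other are only for illustration, and the exact wording printed by type differs between shells.

```shell
# Ask the shell how it would resolve each name.
type echo
type read
type cat

# read runs inside the shell process, so it can set a variable there.
# The here-document stands in for a user typing their name.
read name <<EOF
Alice
EOF
echo "hello, $name"

# The same read inside a child process (an explicit subshell): the
# assignment happens in the child and is gone once the child exits.
other=""
(
    read other <<EOF
Bob
EOF
    echo "in the child:  other='$other'"
)
echo "in the parent: other='$other'"
```

If read were an external program, it would always be in the position of the second case.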
I mean how does the return output of echo process get passed to read process?
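It doesn't. echo writes to the script's standard output, and read reads a line from its standard input (the terminal, when the script is run interactively); the two commands do not exchange data.
You may want to read the POSIX section Command Search and Execution.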
I want to execute a shell command with execlp. I tried the following instruction:
execlp("sh", "sh", "-c", p_command, (char*)NULL);
p_command is a pointer to a const char representing a shell command line.
My minimal test tells me the program succeeded as expected. I first chose to use "/bin/sh" instead of "sh", but I've learned that the p (for path) in execlp lets us avoid writing the full path, as exec will search PATH for us; so I removed "/bin/".
My concern is that I have never seen code using execlp with only "sh", the way we can, for example, directly use "ls" instead of "/bin/ls".
As a beginner, I am wondering what "/bin/sh" stands for, what the difference is between "sh" and "/bin/sh" in this situation, and why we have to write the full path for execlp to execute a shell?
When the path passed to execlp is sh, execlp searches for it in the directories listed in the PATH environment variable. If an attacker is able to modify the PATH variable in the environment that runs your program, they can set it to list a directory of their choosing, and they can place their own program named sh in that directory. Then your program will execute their program instead of executing the system sh program. In some cases (depending on a bit in the file’s mode bits), programs are executed with the permissions of their owners rather than the permissions of the user executing the program. Such programs must be written carefully to avoid situations like this, where an attacker would be able to exploit the program.
When the path passed to execlp is /bin/sh, the name contains a slash, so execlp does not search PATH; it uses the file /bin/sh, resolved from the root of the file system, called /. This will always use the sh program that the system administrator has put in the /bin directory (usually done as part of system installation).
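The shell performs the same kind of PATH search for a command name without a slash, so the difference between the two spellings can be sketched without writing any C. The directory and the stand-in sh script below are made up for the demonstration, and PATH is only changed inside the $(...) subshells.

```shell
# Throwaway directory with a stand-in program named sh. In a real
# attack this file would be the attacker's program.
fake_dir=$(mktemp -d)
printf '#!/bin/sh\necho "impostor sh was run"\n' > "$fake_dir/sh"
chmod +x "$fake_dir/sh"

# Bare name: looked up in PATH, so the first match wins.
by_name=$(PATH="$fake_dir:$PATH"; sh -c 'echo "system sh was run"')

# Name containing a slash: PATH is not consulted at all.
by_path=$(PATH="$fake_dir:$PATH"; /bin/sh -c 'echo "system sh was run"')

echo "sh      -> $by_name"
echo "/bin/sh -> $by_path"
rm -r "$fake_dir"
```

Only the first call is affected by the modified PATH, which is the situation the answer warns about.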
For each command in a shell script passed to exec(), is it forked and run in a child process?
Say I have a shell script called test.sh with the following contents:
#!/bin/bash
echo Hello
echo There
What I want to know is how the series of commands is treated if I were to call execvp() on test.sh in a C program.
For each command within the script, is there a fork followed by another exec call on that command, before a return to the parent and a repeat for the next command?
So far I have used strace on this exact example. My findings are that if I put two echos into the script, there are no calls to clone() (which I believe equates to forking?), but if I put two separate cats as such:
#!/bin/bash
cat file1
cat file2
Then I find two calls to clone in the strace. At the same time, stracing a singular cat call on its own, without running it from an execvp call on a shell script, does not yield any clones in the strace.
I would really appreciate a clarification of the way in which exec calls handle shell scripts.
The bash shell has a number of built-in commands that are executed in the same process as the shell. The echo command is one of those built in commands.
cat, on the other hand, is an external program, so the shell must fork and exec to create a process and run the program.
Some commands are built into the shell, so they don't need to fork. All the commands that implement control flow (e.g. if, while, case) have to be built in. So do the commands that change the shell process's state, such as cd and ulimit.
In addition, a number of simple commands are implemented as shell built-ins; these include echo and printf. So you won't see any forks in the first script with two echo commands.
The type command (itself a built-in) can be used to show which commands are built-ins. For example, type echo reports echo is a shell builtin, but type cat reports cat is /bin/cat.
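Finally, the exit status of a script is the same as the exit status of its last command. So, as an optimization, a shell can skip the fork for the last command and simply exec it; bash does this in some cases, for example for bash -c with a single simple command. Your traces are consistent with the general picture: each cat in the two-cat script shows a clone, while a cat started directly is simply exec'd with no shell involved.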
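What "execs it" means for process creation can be observed with the exec builtin, as a rough sketch. This does not show bash's internal optimization itself, only the difference between running a program with exec and running it as a forked child.

```shell
# The outer sh prints its PID, then uses exec to replace itself with a
# second sh. No new process is created, so the PID is unchanged.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
echo "with exec:    $pids"

# Without exec the second sh is a forked child with a different PID.
# The trailing ':' keeps the inner sh from being the last command, so
# no shell can apply the exec optimization on its own.
pids_forked=$(sh -c 'echo $$; sh -c "echo \$\$"; :')
echo "without exec: $pids_forked"
```

The first pair of numbers is identical and the second pair differs; the second case corresponds to the clone() calls seen in strace.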
I'm using macOS and I noticed (via a separate article) that the cat command is written in C. But I'm sure I've read elsewhere that some shell commands (builtins?) are written in Bash.
How can you tell the difference?
UPDATE: seems I was misinformed and that no builtin commands are written in bash. What I must have read was something related to an external executable.
Use the 'file' command to determine the type of file.
Built-ins are not written in bash. They are intrinsically part of the command interpreter (which is often bash). Example: 'cd'. The 'file' command will not be able to find a built-in and will give an error.
The difference between a bash builtin and an executable is that, when called from a bash process, a builtin is just a function call inside the shell, whereas an external command makes bash fork a new process (and, unless the command runs in the background, wait for it to terminate).
Note the overhead of creating a new process for every call:
for((i=0;i<1000;i++)); do /bin/echo -n ; done
to know if a command is a builtin or an executable you can use type
type cat
type -a echo
to explicitly call echo builtin
builtin echo
to explicitly call echo command
command echo
Note that commands that change the process environment, like cd, can't be external executables, because a child process can't change its caller's environment.
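The point about cd can be checked with a short sketch, using an explicit subshell as the child process. The variable name start is only for the example.

```shell
start=$(pwd)

# cd is a builtin: it changes the working directory of this shell process.
cd /
echo "after builtin cd: $(pwd)"
cd "$start"

# A child process (here an explicit subshell) can change only its own
# working directory; the parent is unaffected once the child exits.
( cd / && echo "inside child:     $(pwd)" )
echo "back in parent:   $(pwd)"
```

An external cd program would behave like the subshell case: it would change its own directory and then exit.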
I have a situation where a C program invokes a shell script, which in turn copies some files from the CD mount location to an installation directory.
My question is: is there a straightforward way to get the absolute path of this C program from inside the shell script?
I tried a couple of approaches, including using "$(ps -o comm= $PPID)" from within the script, but nothing has worked so far. I know that I can create a temporary file from the C program containing its own name (argv[0]) and then have the shell script read that file, but I don't want to follow that approach here.
Of course, it can be passed as an argument to the script, but I was thinking why the bash built-in macros or something cannot be used here
On Linux, /proc/self/exe is a symbolic link to the absolute path of the currently running executable. So you can set an environment variable containing that path before spawning the shell. Something like:
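/* needs <limits.h>, <stdlib.h>, <unistd.h> */
char buf[PATH_MAX];
ssize_t n = readlink("/proc/self/exe", buf, sizeof buf - 1); /* no NUL is appended */
if (n != -1) { buf[n] = '\0'; setenv("MYEXE", buf, 1); }
system("thescript");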
and accessing the variable in the script:
echo "$MYEXE"
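The script side of this can be sketched from a shell, with the shell standing in for the C program. The path /opt/example/installer and the variable LOCAL_ONLY are made up; the point is only that a child shell sees variables placed in its environment and nothing else.

```shell
# Stand-in for the C program: put MYEXE in the environment of the child
# shell only. The path is a made-up example.
seen=$(MYEXE=/opt/example/installer sh -c 'printf "%s" "$MYEXE"')
echo "the script sees MYEXE=$seen"

# A plain shell variable that is not exported does not reach the child.
LOCAL_ONLY=hidden
not_seen=$(sh -c 'printf "%s" "${LOCAL_ONLY:-<unset>}"')
echo "the script sees LOCAL_ONLY=$not_seen"
```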
Before running a foo command you could use which like
fooprog=$(which foo)
to get the full path of the program (scanning your $PATH). For example which ls could give /bin/ls ....
On Linux specifically you could use proc(5).
In your shell process (running bash or some POSIX-compliant shell) started by your C program, $PPID gives the parent process ID, hopefully the PID of the process running your C program.
Then the executable is /proc/$PPID/exe which is a symbolic link. Try for example the ls -l /proc/$PPID/exe command in some terminal.
(notice that you don't run C source files or stricto sensu C programs, you often run some ELF executable which was built by compiling C code)
You might have weird cases (you'll often ignore them, but you might decide to handle them). Someone might move, replace or remove your executable while it is running. Or the parent process (your executable) might die prematurely, so the shell process becomes an orphan. Or the executable might remove itself.
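Inside the script, the /proc/$PPID/exe lookup could look roughly like this. It is a Linux-only sketch; the variable name parent_exe is an assumption, and the fallback branch covers some of the weird cases listed above (no /proc, no permission, parent already gone).

```shell
# /proc/<pid>/exe is a symbolic link to the executable of process <pid>.
# readlink fails if /proc is missing or the link cannot be read.
parent_exe=$(readlink "/proc/$PPID/exe" 2>/dev/null) || parent_exe=""

if [ -n "$parent_exe" ]; then
    echo "parent executable: $parent_exe"
else
    echo "could not resolve /proc/$PPID/exe" >&2
fi
```

If the executable was removed while running, Linux typically reports the old path with a " (deleted)" suffix, so a careful script would check that the result still exists.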
I executed the following commands in a shell:
sw0:root> pwd
/root
sw0:root> echo $(history 1)
2 echo $(history 1)
sw0:root>
Now I call system() from a C file as shown below:
system (" echo \"___history1 = $(history 1)____\"");
Output:
___history1 = ____
What I am trying to do is read the last history command of a shell from C using system().
Please clarify the following doubts:
Why am I unable to read the last history command executed in the shell from the C file?
Is it because system() starts a new shell when I call it?
If so, how do I achieve this, that is, read the command output of one shell from another?
When your program calls system(), the command runs in a new, non-interactive shell (started as sh -c ...), which does not inherit the history of the calling shell.
You can compare this to running bash -c history, you get no result.
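The comparison can be reproduced in a couple of lines, assuming bash is installed. The shell that system() starts is sh, which on some systems is not bash and may not have a history command at all, so this only illustrates the bash case.

```shell
# A non-interactive bash starts with an empty history list, so the
# history builtin has nothing to print.
out=$(bash -c 'history 1' 2>/dev/null)
echo "history seen by the child shell: '$out'"
```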
Open a first shell (shell 1) and run some commands.
Now close that shell.
After that, open a new shell (shell 2): the commands executed in shell 1 are now in the history file and are available there.
Until you close the current shell, its history is not flushed to the history file.
And yes, system() runs the command in its own shell.
OP: I executed this command in the main shell, which has been running since the system booted, so is there any way I could flush it manually?
For that you can run history -a by hand, or add this line to your .bashrc file so that it happens after every command:
export PROMPT_COMMAND='history -a'
see: http://www.aloop.org/2012/01/19/flush-commands-to-bash-history-immediately/
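Once the history is being flushed, another process can read the most recent entry from the history file instead of asking the history builtin. A minimal sketch, assuming the default file ~/.bash_history, one command per line, and no timestamp lines; the helper name last_history_line is made up.

```shell
# Print the most recent entry of a bash history file.
# $1: path to the file (defaults to ~/.bash_history, an assumption).
last_history_line() {
    tail -n 1 -- "${1:-$HOME/.bash_history}"
}

# Demonstration on a throwaway file instead of the real history.
demo=$(mktemp)
printf 'pwd\necho two\n' > "$demo"
last=$(last_history_line "$demo")
echo "last entry: $last"
rm -f "$demo"
```

From C, the same idea would be a system() or popen() call that runs tail on the history file.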