What happens internally when we call a shell command? - c

Can anyone please help me understand the internal flow of code/steps when we run a shell command? For example, suppose I run the following in a Bourne shell:
ls -l | grep -r "string"
What are the function calls happening internally?
As far as I know, it calls some exec-family function (such as execv) internally. But can anyone tell me what other function calls it makes, and in what sequence?

You can see for yourself what happens by using the strace utility. Run it with:
strace -f sh -c 'ls -l | grep -r "string"'
This runs a shell that in turn runs your command, and strace prints the system calls made behind the scenes as they happen. The -f option makes strace follow the child processes the shell forks, so the calls made by ls and grep are shown too.

In short:
1. parsing and lexical analysis
2. expansion
    brace expansion
    tilde expansion
    variable expansion
    arithmetic and other substitutions
    word splitting
    filename generation/expansion
3. execution
    bash creates the pipes between commands
    bash forks itself (once for every command)
    each child restores the SIGINT handler to the default
    each child dups the pipe ends onto its stdin/stdout
    each child closes the original pipe descriptors
    each child execs its command
    the parent bash waits...
Maybe others will add more precise "steps"...
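The execution steps above can be sketched in C roughly as follows. This is only an illustration of the usual pipe/fork/dup2/exec/wait sequence, not what any particular shell does line for line; run_pipeline is a hypothetical helper name, and real shells handle many more cases (job control, redirections, longer pipelines).

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs "left | right" and returns the exit status of
 * the right-hand command, or -1 if the pipeline could not be set up. */
int run_pipeline(char *const left[], char *const right[])
{
    int fds[2];
    if (pipe(fds) == -1)                 /* the pipe must exist before forking */
        return -1;

    pid_t writer = fork();
    if (writer == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (writer == 0) {
        signal(SIGINT, SIG_DFL);         /* child gets the default SIGINT action */
        dup2(fds[1], STDOUT_FILENO);     /* stdout now feeds the pipe */
        close(fds[0]);
        close(fds[1]);
        execvp(left[0], left);
        perror(left[0]);                 /* reached only if execvp failed */
        _exit(127);
    }

    pid_t reader = fork();
    if (reader == -1) {
        close(fds[0]);
        close(fds[1]);
        waitpid(writer, NULL, 0);
        return -1;
    }
    if (reader == 0) {
        signal(SIGINT, SIG_DFL);
        dup2(fds[0], STDIN_FILENO);      /* stdin now drains the pipe */
        close(fds[0]);
        close(fds[1]);
        execvp(right[0], right);
        perror(right[0]);
        _exit(127);
    }

    /* The parent must close both ends, or the reader never sees end-of-file. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    waitpid(writer, NULL, 0);
    if (waitpid(reader, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with the argument vectors {"ls", "-l", NULL} and {"grep", "string", NULL}, this behaves like ls -l | grep string. Running a program that uses it under strace -f should show the pipe, the fork (clone), dup2, execve and wait4 calls listed above.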

Related

Do exec() calls fork for each command if passed a shell script?

For each command in a shell script passed to exec(), is it forked and ran in a child process?
Say I have a shell script called test.sh with the following contents;
#!/bin/bash
echo Hello
echo There
What I want to know is how the series of commands is treated if I were to call execvp() on test.sh in a C program.
For each command within the script, is there a fork followed by another exec call on that command, before a return to the parent and a repeat for the next command?
So far I have used strace on this exact example. My findings are that if I put two echos into the script, there are no calls to clone() (which I believe equates to forking?), but if I put two separate cats as such:
#!/bin/bash
cat file1
cat file2
Then I find two calls to clone in the strace. At the same time, stracing a singular cat call on its own, without running it from an execvp call on a shell script, does not yield any clones in the strace.
I would really appreciate a clarification on the way in which exec calls handle shell scripts.
The bash shell has a number of built-in commands that are executed in the same process as the shell. The echo command is one of those built in commands.
cat, on the other hand, is an external program, so the shell must fork and exec to create a process and run the program.
Some commands are built into the shell, so they don't need to fork. All the commands that implement control flow (e.g. if, while, case) have to be built in. So do the commands that change the shell process's state, such as cd and ulimit.
In addition, a number of simple commands are implemented as shell built-ins; these include echo and printf. So you won't see any forks in the first script with two echo commands.
The type command (itself a built-in) can be used to show which commands are built-ins. For example, type echo reports echo is a shell builtin, but type cat reports cat is /bin/cat.
Finally, the exit status of a script is the same as the exit status of the last command. So as an optimization, it doesn't fork for the last command, it simply execs it. That's why you see forks when the script has two cat commands, but not when there's only one.

How to tell if a shell command is written in Bash or C?

I'm using macOS and I noticed (via a separate article) that the cat command is written in C. But I'm sure I've read elsewhere that some shell commands (builtins?) are written in Bash.
How can you tell the difference?
UPDATE: seems I was misinformed and that no builtin commands are written in bash. What I must have read was something related to an external executable.
Use the 'file' command to determine the type of file.
Built-ins are not written in bash. They are intrinsically part of the command interpreter (which is often bash). Example: 'cd'. The 'file' command will not be able to find a built-in and will give an error.
The difference between a bash builtin and an executable is that, when called from a bash process, a builtin is just a function call inside the shell, whereas an external command makes the shell fork a new process (and, unless the command runs in the background, wait for it to terminate).
Note the overhead of starting a new process for every call:
for((i=0;i<1000;i++)); do /bin/echo -n ; done
To find out whether a command is a builtin or an executable, use type:
type cat
type -a echo
To explicitly call the echo builtin:
builtin echo
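To call the external echo executable, use its full path or env (command echo only suppresses shell function lookup, so it still runs the builtin):
env echo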
Note that commands that change the shell's own process state, like cd, can't be external executables, because a child process can't change its caller's environment.
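The last point can be checked from C. The sketch below assumes a POSIX system; child_chdir_changes_parent is a hypothetical name chosen for illustration. It forks a child that calls chdir() and then compares the parent's working directory before and after.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the caller's working directory changed
 * after a child process called chdir(dir), 0 if it did not, -1 on error. */
int child_chdir_changes_parent(const char *dir)
{
    char before[4096], after[4096];

    if (getcwd(before, sizeof before) == NULL)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* This changes the working directory of the child only. */
        _exit(chdir(dir) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (getcwd(after, sizeof after) == NULL)
        return -1;

    return strcmp(before, after) != 0;
}
```

The function is expected to return 0, which is why a shell has to implement cd inside its own process.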

Bad Substitution Error with System command in C

I have written a C program with some system command in it. I use a software called Gromacs. Here is the snippet of C code :-
#include <stdio.h>
#include <stdlib.h>
/* I have removed unnecessary code, which works fine for me. */
int main() {
    float LAMBDA = 0.37;
    for (LAMBDA = 0.37; LAMBDA < 0.55; LAMBDA += 0.02) {
        system("g_bar -f md*.xvg -o -oi -oh");
        system("mapfile -t a < <(g_bar -f md*.xvg -o -oi -oh | sed '/lambda/s/.*DG *//')");
        printf("Free Energy:\t ");
        system("echo ${a[120]}");
    }
    return 0;
}
I receive an error
sh: 1: Bad substitution
I have checked previous answers on Bad substitution. It seems dash doesn't support arrays, so how can I enable Bash for system commands? If somebody can help me troubleshoot this, I will be grateful.
The sh vs. dash vs. bash distinction is not the root problem here.
You create an 'a' (whatever that is) in your second call to system().
Then you try to use this 'a' in the third system() call.
But this is another shell, and 'a' does not exist there.
Each time you call system(), a new shell environment is created, and it disappears on return.
What you need to do is somehow save your 'a' to a file that a subsequent call can work on.
In other words, each call to system() acts as if you opened a new terminal, did your stuff and then closed it. The variables created in one terminal (shell session) do not exist in the following one.
EDIT:
And to convince you that sh/dash/bash is not your root problem here: once you've checked that your commands run OK when typed in the same shell session (terminal), you can always explicitly use bash in your system() calls with:
system("bash -c 'do_my_stuff from_this and_that etc'");
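A small experiment makes the "new shell per call" point concrete. This is a sketch assuming a POSIX sh behind system(); DEMO_VALUE and the function names are made up for the illustration.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Returns 1 if the shell command exited with status 0, otherwise 0. */
static int shell_succeeds(const char *cmd)
{
    int status = system(cmd);
    return status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* The variable is set in one shell and tested in another one. */
int visible_across_calls(void)
{
    shell_succeeds("DEMO_VALUE=hello");
    return shell_succeeds("test \"$DEMO_VALUE\" = hello");
}

/* The variable is set and tested inside the same shell. */
int visible_within_one_call(void)
{
    return shell_succeeds("DEMO_VALUE=hello; test \"$DEMO_VALUE\" = hello");
}
```

visible_across_calls() should return 0 and visible_within_one_call() should return 1, provided DEMO_VALUE is not already set in the environment of the C program.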
First, mapfile is a bash 4 builtin command. system runs sh, not bash.
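Second, you are using process substitution here:
<(g_bar -f md*.xvg -o -oi -oh | sed '/lambda/s/.*DG *//')
sh does not support process substitution (dash typically rejects it with a syntax error), and the Bad substitution message itself most likely comes from the array reference ${a[120]}, which sh does not support either. system runs sh, not bash.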
You have several calls to system. Your last call (as shown) looks at a variable a that was created in a previous shell process, it won't exist anymore!
I suggest you write a bash script, complete with #!/bin/bash, and call that from C. You could always write out the script from C, using fopen and fprintf.
If that isn't practical, then use bash -c as suggested by @jbm. But you can't expect any persistence across calls to system except via the C program.
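One way to keep everything in a single bash process and still get the value back into the C program is to run bash -c yourself and read its standard output through a pipe. The following is a minimal sketch under the assumption that bash is on the PATH; run_bash is a hypothetical helper, not part of any library.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs script with "bash -c" and stores up to size - 1
 * bytes of its standard output in out (always NUL-terminated).
 * Returns the exit status of bash, or -1 on error. If the output does not
 * fit, bash may be killed by SIGPIPE and -1 is returned. */
int run_bash(const char *script, char *out, size_t size)
{
    int fds[2];
    if (size == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        dup2(fds[1], STDOUT_FILENO);     /* bash writes into the pipe */
        close(fds[0]);
        close(fds[1]);
        execlp("bash", "bash", "-c", script, (char *)NULL);
        _exit(127);                      /* reached only if execlp failed */
    }

    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < size - 1 &&
           (n = read(fds[0], out + used, size - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the whole script runs in one bash process, an array filled at the start of the script (with mapfile, for instance) is still there when a later command in the same script echoes one of its elements, and the C program receives the echoed text in out.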

Can't get control back after execvp and wait()

I'm coding a small shell that must execute the commands I parse.
f is a char** like this: [ls][-la]
p is the same, used like this: [wc]
So I tried to pipe ls -la into wc.
My problem is that when I execute "ls -la | wc && date", the pipe works, but my minishell exits and never executes "date". I used the wait function to wait for the children to finish, but it doesn't do anything. It looks like the shell gets stuck and exits right after the 2nd execvp.
My arrays are properly terminated by NULL.
ls -la | wc executes correctly, but I am returned to bash afterwards.
I've tried execlp and execl as well, but I don't think that is the problem, since I need to pass options to my first command (ls + -la).
Could you help me please?
Thanks in advance :)
All forms of exec never return; they replace the currently running image with the indicated executable. The key word here is "replace".
The only circumstance in which the statement following a call to exec* gets executed is if the exec fails (for example, if it cannot find the executable).
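The usual way for a shell to keep control is therefore to call exec only in a forked child and to wait for it in the parent. A minimal sketch follows; run_and_wait is a hypothetical name, and the asker's actual code is not shown, so this is not a patch for it.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with arguments argv in a child process
 * and returns its exit status, 127 if it could not be executed, or -1 on
 * error. The calling process is never replaced. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        /* Only reached when execvp failed, e.g. command not found. */
        perror(argv[0]);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a helper like this, cmd1 && cmd2 becomes: run cmd1, and call run_and_wait for cmd2 only if the first call returned 0. The minishell itself keeps running because only the children are replaced by exec.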

can't execute bash command via c standard system function

the code snippet I wrote is like this:
#include <stdlib.h>
int main()
{
system("/bin/bash ls");
}
when I compile and execute the binary, I got the result:
/bin/ls: /bin/ls: cannot execute binary file
so what's the thing missing here?
ls is an actual system binary; it's not a built-in shell command. All you need is system("ls"). Right now you're asking bash to read the ls binary file as if it were a script.
Do not use system() from a program with set-user-ID or set-group-ID privileges, because strange values for some environment variables might be used to subvert system integrity. Use the exec(3) family of functions instead, but not execlp(3) or execvp(3). system() will not, in fact, work properly from programs with set-user-ID or set-group-ID privileges on systems on which /bin/sh is bash version 2, since bash 2 drops privileges on startup. (Debian uses a modified bash which does not do this when invoked as sh.)
In your case, ls is not a shell built-in but an external binary, so /bin/bash ls makes bash try to read /bin/ls as a script, which fails.
You can use the type <cmd_name> command to check whether cmd_name is a built-in or not.
For more, see man system.
If no options are specified, the argument to /bin/bash is the name of a file containing shell commands to execute.
To execute commands specified on the command line, use the -c option: /bin/bash -c ls.
As others have noted, there are security considerations when doing this, so you should seek alternatives.

Resources