How shell commands execute - c

I am a newbie and looking for some info.
Thanks in advance.
What is the difference between echo "Hello World!" and a C program which prints "Hello World!" using printf?
How do shell commands get executed? For example, if I type ls, it lists all the files in the directory. Is there an executable binary which is run when we enter ls in the shell?
Please let me know if you have any links or sources that would make this clear.

There are two main types of "commands" that the shell can execute. Built-in commands are executed by the shell itself - no new program is started. Simply typing echo in a shell prompt is an example of such a built-in command.
On the other hand, other commands execute external programs (also called binaries) - and ls is an example of this kind of command.
So, if you run echo in a shell, it's executed by the shell itself, but if you write a C program that performs the same action, it will be run as an external program. As a matter of fact, most Linux systems come with such a binary, located at /bin/echo.
Why does it sometimes make sense to have both a built-in command and a program to accomplish the same task? Built-in commands are faster to execute, as there is some cost involved in starting an external program. But built-ins have some drawbacks, too: they can't be too complex, as this would make the shell big and slow; they cannot be upgraded separately from the shell and from each other; finally, there are situations where a program that is not your shell needs to run a command: it can run external programs, but it can't execute shell built-ins directly since it's not the shell. So sometimes it makes sense to have it both ways. Apart from echo, time is another example of this double approach.

The shell is just a user level way of interacting with the operating system, or the kernel. That's one of the reasons it's called a shell. The shell itself (sh, csh, tcsh, ksh, zsh, bash, etc...) is essentially just a binary the operating system executes to allow you to execute other binaries.
It generally gives a lot of other functionality though like built in functions (echo, fg, jobs, etc...), an interpreted language (for x in ..., if then, etc...), command history, and so on...
So, the shell binary (or process) interprets any text entered into it (like echo) and runs the corresponding functions in its code. Built-in functions (like echo) don't need a new process, but if the text is interpreted as a request to execute a binary (vim, emacs, gcc, etc...), the shell will create a new process for it (unless you prefix it with exec, in which case the binary replaces the shell process) and execute it.
So, echo "Hello World!" just runs code in the shell (process). A printf("Hello World!") would be in a separate binary that the shell would create a new process for (fork) and have the operating system execute (exec).
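The fork-then-exec sequence described above can be sketched in C roughly as follows. This is a minimal illustration, not how any particular shell is implemented; run_external is a hypothetical helper name, and a real shell also deals with job control, signals and redirections.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run an external program roughly the way a shell does: fork a child,
 * replace the child's image with execvp, and wait for it in the parent.
 * Returns the child's exit status, or -1 on failure.
 * (run_external is a hypothetical name used only for this sketch.) */
int run_external(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;                 /* fork failed */

    if (pid == 0) {                /* child process */
        execvp(argv[0], argv);     /* searches $PATH, like the shell does */
        _exit(127);                /* only reached if exec failed */
    }

    int status;                    /* parent: wait for the child to finish */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argv of {"echo", "Hello World!", NULL} would run the external echo binary found on $PATH, not the shell's built-in echo, because no shell is involved at all.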

Related

Is it possible to implement an internal shell command (shell builtin) by only using the exec() functions?

I want to implement some Linux commands by only using the exec() family of functions. When I use external commands such as "ls", "whoami", it runs well. However, I can't run internal commands (shell builtins) such as "export". Is there any other way to run those commands? And also, why can system() implement those internal commands even though it uses execl() function?
The Linux implementation of system does indeed use execl, but it uses it to run a shell, not to directly invoke the utility. man system is quite explicit:
The system() library function uses fork(2) to create a child process that executes the shell command specified in command using execl(3) as follows:
execl("/bin/sh", "sh", "-c", command, (char *) 0);
So system runs an instance of the standard shell (/bin/sh), passing it two command-line parameters: -c and the argument to system. Shells are expected to interpret the command-line flag -c as requesting that the shell execute the following argument as a shell command. The shell will certainly be able to execute its own builtins from a -c argument in the same way that it executes them if you type them interactively.
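A stripped-down equivalent of that behaviour might look like the sketch below. It only shows the fork plus execl("/bin/sh", "sh", "-c", ...) core quoted from the man page; the real system() also adjusts signal handling, which is left out here. The name run_shell_command is made up for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Simplified stand-in for system(): hand the whole string to /bin/sh -c.
 * Builtins such as export work because a real shell interprets the string.
 * (run_shell_command is a hypothetical name; signal handling is omitted.) */
int run_shell_command(const char *command)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", command, (char *)0);
        _exit(127);                /* the shell could not be executed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the shell does the parsing, a string such as "export FOO=1; echo $FOO" is accepted here, whereas passing "export" directly to an exec() function fails since no such executable is normally installed.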
However, although you can execute a cd or export command with system, you will find that it is utterly pointless because it works on the execution environment of the shell invoked by system. That shell will terminate as soon as it finishes executing the command it has been asked to execute, so any changes to its execution environment will immediately vanish.
So that's not very useful if you are trying to write a shell. You will want cd to actually change working directories and export to modify your shell's environment variables. So, just like the system shell, you'll have to implement your own builtins. As the word "builtin" implies, these commands are interpreted directly by the shell itself, rather than being passed to some external utility to implement.
For these particular commands, you'll probably want to investigate the standard library functions setenv(3) (to modify environment variables) and chdir(2) (to change the current working directory).
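As a rough sketch of what such builtins could look like in a home-made shell: the dispatcher below handles cd with chdir(2) and a very restricted export (only the NAME=VALUE form) with setenv(3), and reports whether the command was a builtin at all. The function name try_builtin and its return convention are assumptions made for this example.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical builtin dispatcher for a toy shell.
 * Returns 1 if argv[0] names a builtin (its exit status is stored in
 * *status), or 0 if the caller should fork/exec an external program. */
int try_builtin(char *const argv[], int *status)
{
    if (argv[0] == NULL)
        return 0;

    if (strcmp(argv[0], "cd") == 0) {
        /* With no argument, fall back to $HOME like most shells do. */
        const char *dir = argv[1] ? argv[1] : getenv("HOME");
        *status = (dir != NULL && chdir(dir) == 0) ? 0 : 1;
        return 1;
    }

    if (strcmp(argv[0], "export") == 0) {
        /* Only the NAME=VALUE form is handled in this sketch. */
        *status = 1;
        if (argv[1] != NULL) {
            const char *eq = strchr(argv[1], '=');
            char name[256];
            if (eq != NULL && eq != argv[1] &&
                (size_t)(eq - argv[1]) < sizeof name) {
                size_t len = (size_t)(eq - argv[1]);
                memcpy(name, argv[1], len);
                name[len] = '\0';
                *status = (setenv(name, eq + 1, 1) == 0) ? 0 : 1;
            }
        }
        return 1;
    }

    return 0;                      /* not a builtin */
}
```

The important point is that both calls run inside the shell's own process, so the new working directory and environment persist for every command the shell launches afterwards.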

Need to get C program name inside shell script

I have an occasion where a C program invokes a shell script, which in-turn does some copying stuff from the CD mount location to an installation directory.
Now my question is: is there a straightforward way to get the absolute path of this C program inside the shell script?
I tried a couple of approaches, including using "$(ps -o comm= $PPID)" from within the script, but nothing has worked so far. I know that I can create a temporary file from the C program which contains its own name (argv[0]) and then make the shell script read that file, but I don't want to follow that approach here.
Of course, it can be passed as an argument to the script, but I was wondering why the bash built-in macros or something similar cannot be used here.
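On Linux there is a /proc/self/exe symbolic link that points to the absolute path of the currently executing file. So you can set an environment variable that contains the path before spawning the shell. Something like (with <limits.h>, <stdlib.h> and <unistd.h> included):
char buf[PATH_MAX];
ssize_t n = readlink("/proc/self/exe", buf, sizeof buf - 1);
if (n != -1) { buf[n] = '\0'; setenv("MYEXE", buf, 1); }  /* readlink does not NUL-terminate */
system("thescript");
and accessing the variable in the script:
echo "$MYEXE"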
Before running a foo command you could use which like
fooprog=$(which foo)
to get the full path of the program (scanning your $PATH). For example which ls could give /bin/ls ....
On Linux specifically you could use proc(5).
In your shell process (running bash or some POSIX-compliant shell) started by your C program, $PPID gives the parent process ID, hopefully the PID of the process running your C program.
Then the executable is /proc/$PPID/exe which is a symbolic link. Try for example the ls -l /proc/$PPID/exe command in some terminal.
(notice that you don't run C source files or stricto sensu C programs, you often run some ELF executable which was built by compiling C code)
You might have weird cases (you'll often ignore them, but you might decide to handle them). Someone might move, replace or remove your executable while it is running. Or the parent process (your executable) might have died prematurely, so the shell process becomes an orphan. Or the executable might have removed itself.
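The same /proc lookup can also be done from C, for instance if a helper program rather than the script needs the path. The sketch below is Linux-specific and assumes /proc is mounted; exe_path_of is a hypothetical helper name.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolve the /proc/<pid>/exe symbolic link into buf (Linux only).
 * Returns the path length, or -1 if the link cannot be read, e.g. because
 * the process is gone or belongs to another user.
 * (exe_path_of is a hypothetical name used only for this sketch.) */
ssize_t exe_path_of(pid_t pid, char *buf, size_t cap)
{
    char link[64];

    if (cap == 0)
        return -1;
    snprintf(link, sizeof link, "/proc/%ld/exe", (long)pid);

    ssize_t n = readlink(link, buf, cap - 1);
    if (n == -1)
        return -1;
    buf[n] = '\0';                 /* readlink does not NUL-terminate */
    return n;
}
```

Calling exe_path_of(getppid(), buf, sizeof buf) mirrors the $PPID approach. If the executable was removed while running, Linux appends " (deleted)" to the link target, which is one way to notice the weird cases mentioned above.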

Enable LD_PRELOAD just for BASh after system startup

Is there a way to inject/enable LD_PRELOAD just for new sessions (ie: BASh)?
I have a syntax highlighting library that I want to have automatically enabled (ie: highlight warnings for certain users), and just need it loaded for BASh rather than all processes. If I put it in /etc/ld.so.preload, it's disruptive and causes issues for all the system services and other programs that don't need it running, wrapping system calls (printf and exec mainly).
Is there a simple way to accomplish this?
The easiest solution is probably to replace bash with a shell script that performs the LD_PRELOAD logic, then calls the actual (renamed) bash binary.
That is, move /bin/bash to /bin/bash.original, then create a script /bin/bash with the following contents:
#!/bin/sh
LD_PRELOAD=/path/to/my/library.so
export LD_PRELOAD
exec /bin/bash.original "$@"
You could include logic here (e.g., "is stdout a tty") if you want to only perform the LD_PRELOAD when connected to an interactive session. Trying to perform any sort of terminal manipulation when bash isn't connected to a tty will probably yield weird results.

How do I copy everything from my terminal to a file including the stdout and the prompt using C?

I know how to get the stdout into a file using dup/dup2 system calls, but how do I get the entire output that would be normally shown on my terminal(including the prompt that says my username along with the $ symbol and the current working directory) to a file?
Yes you can, but the details may be difficult (depending on your level of expertise). For the shell to behave normally (that is, exactly as in a terminal), it needs to interact with a terminal (a special system object). So you need to create a program that behaves like a terminal; this is what pseudo-terminal devices (under /dev) are intended for. Read the documentation about them to implement it, but roughly: your application plays the role of the terminal, so it holds the master side of the pseudo-terminal, and the shell is connected to the slave side. Then you can easily log the real input typed by the user and capture the output produced by the shell.
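A minimal sketch of the pseudo-terminal idea, assuming a Linux system where forkpty() is available from <pty.h> (older glibc versions need linking with -lutil). It runs a single non-interactive command rather than a full interactive session, and capture_via_pty is a hypothetical name; a real logger would also forward the user's keystrokes to the master side.

```c
#include <pty.h>          /* forkpty: glibc/musl; BSD and macOS use <util.h> */
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `command` under /bin/sh attached to a fresh pseudo-terminal and copy
 * whatever it writes to its terminal into `out`.
 * Returns the number of bytes captured, or -1 on error.
 * (capture_via_pty is a hypothetical name used only for this sketch.) */
ssize_t capture_via_pty(const char *command, char *out, size_t cap)
{
    int master;

    if (cap == 0)
        return -1;

    pid_t pid = forkpty(&master, NULL, NULL, NULL);
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: stdin, stdout and stderr are the slave side of the pty. */
        execl("/bin/sh", "sh", "-c", command, (char *)0);
        _exit(127);
    }

    /* Parent: holds the master side and sees the terminal output. */
    size_t used = 0;
    char chunk[256];
    ssize_t n;
    while ((n = read(master, chunk, sizeof chunk)) > 0) {
        for (ssize_t i = 0; i < n && used + 1 < cap; i++)
            out[used++] = chunk[i];
    }
    /* On Linux the loop usually ends with an error (EIO) once the child
     * has exited and closed the slave side. */
    out[used] = '\0';
    close(master);
    waitpid(pid, NULL, 0);
    return (ssize_t)used;
}
```

The captured bytes are what a terminal would receive, so line endings typically arrive as "\r\n", and a program that checks isatty() on its output sees a terminal. Writing the buffer to a file instead of keeping it in memory gives the log the question asks for.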
I would say there is no way to do that from within C code. Instead, you could use bash, for example, to redirect everything to a file and leave the C code as it is.
In this way you have all the info you want to save: prompt, current directory, call to the program (including flags), and of course the output of the program.
Well, you can do:
- For the bash prompt PS1: Echo expanded PS1 (in case you want it expanded; if not, you can simply echo PS1)
- For the executed command: https://unix.stackexchange.com/questions/169259/how-to-capture-command-line-input-into-logfile-and-execute-it-at-the-same-time
- Standard output and error output: Redirect stderr and stdout in a Bash script
And that's all you want to capture, I think.
Look up the script command in Unix systems. If you want to capture all keyboard and std in/out for a command, use the script executable. If you want to see how it's done, look up the source.

Execute any command-line shell like into execve

In case this is helpful, here's my environment: debian 8, gcc (with std = gnu99).
I am facing the following situation:
In my C program, I get a string (char* via a socket).
This string represents a bash command to execute (like 'ls ls').
This command can be any bash, as it may be complex (pipelines, lists, compound commands, coprocesses, shell function definitions ...).
I cannot use system or popen to execute this command, so I currently use execve.
My concern is that I have to "filter" certain commands.
For example, the rm command may only be applied to the "/home/test/" directory. All other destinations are prohibited.
So I have to prevent the command "rm -r /" but also "ls ls && rm -r /".
So I have to parse the command line that is given to me, find all the commands, and apply filters to them.
And that's when I'm begin to be really lost.
The command can be of any complexity, so if I want to build pipelines (execve executes one command at a time) or find all the commands in order to apply my filters, I'll have to develop a parser identical to that of sh.
I do not like reinventing the wheel, especially if I make it square.
So I wonder if there is a feature in the C library (or the GNU one) for that.
I have heard of wordexp, but I do not see how to manage pipelines, redirections or the rest with it (in fact, it does not seem to be made for this), and I do not see how I can retrieve all the commands inside the command line.
I read the man page of sh(1) to see if I can use it to "parse" but not execute a command, but so far I have found nothing.
Do I need to code a parser from scratch?
Thanks for reading, and I apologize for my bad English: it's not my mother tongue (thanks, Google Translate ...).
Your problem:
I am facing the following situation: In my C program, I get a string
(char* via a socket). This string represents a bash command to execute
(like 'ls ls'). This command can be any bash, as it may be complex
(pipelines, lists, compound commands, coprocesses, shell function
definitions ...).
How do you plan on authenticating who is at the other end of the socket connection?
You need to implement a command parser, with security considerations? Apparently to run commands remotely, as implied by "I get a string (char* via a socket)"?
The real solution:
How to set up SSH without passwords
Your aim
You want to use Linux and OpenSSH to automate your tasks. Therefore
you need an automatic login from host A / user a to Host B / user b.
You don't want to enter any passwords, because you want to call ssh
from a within a shell script.
Seriously.
That's how you solve this problem:
I receive on a socket a string that is a shell command and I have to
execute it. But before execute it, i have to ensure that there is not
a command in conflict with all the rules (like 'rm only inside this
directory, etc etc). For executing the command, I can't use system or
popen (I use execve). The rest is up to me.
Given
And that's when I'm begin to be really lost.
Because what you're being asked to do is implement security features and command parsing. Go look at the amount of code in SSH and bash.
Your OS comes with security features. SSH does authentication.
Don't try to reinvent those. You won't do it well - no one can. Look how long it's taken for bash and SSH to get where they are security-wise. (Hint: it's decades because there's literally decades of history and knowledge that were built into bash and SSH when they were first coded...)
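On the wordexp idea raised in the question: a small experiment suggests why it does not help with filtering. wordexp performs shell-style word expansion on a single simple command and rejects unquoted operators such as |, & and ; with WRDE_BADCHAR, so it never yields the individual commands of a pipeline or list. The sketch below assumes the glibc behaviour; split_words is a hypothetical helper name.

```c
#include <stddef.h>
#include <wordexp.h>

/* Expand `s` into words with wordexp and report how many there are.
 * Returns wordexp's own result code: 0 on success, WRDE_BADCHAR for
 * unquoted shell operators, WRDE_CMDSUB for command substitution, etc.
 * (split_words is a hypothetical name used only for this sketch.) */
int split_words(const char *s, size_t *count)
{
    wordexp_t we;

    /* WRDE_NOCMD: refuse $(...) and backticks, so nothing gets executed
     * merely by expanding an untrusted string. */
    int rc = wordexp(s, &we, WRDE_NOCMD);
    if (rc != 0) {
        if (rc == WRDE_NOSPACE)
            wordfree(&we);         /* partial results may have been allocated */
        return rc;
    }

    *count = we.we_wordc;
    wordfree(&we);
    return 0;
}
```

A plain command such as "rm -r /home/test/old" splits into three words, but "ls ls && rm -r /" is rejected outright rather than parsed into two commands, which is consistent with the advice above not to build the filter on home-grown parsing.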

Resources