Linux shell pipe syntax - c

I am implementing a program that simulates the Linux shell and I need to implement expressions with multiple pipes - but I am not sure what's considered legal or how to handle a few things, for example:
Is a pipe as the last character in the command legal? When I try it in the Linux shell, it behaves strangely: after I press Enter, it shows a new line beginning with >. I am not sure what this means for the legality of the command.
How should I handle several consecutive pipes? For example, ls -l ||||| grep 7
It seems the shell just works as usual and ignores the redundant pipes, but I am not sure. I would like some help.

There is no single Linux shell; there are several. The most common one is GNU bash, but you can use others such as zsh (which I use interactively) or fish, or even scsh or es, which have quite different syntax. They do not all share exactly the same syntax, and they do not report the same errors.
There is however a standard, POSIX, which defines the POSIX shell specification (as a technical document in English):
The format for a pipeline is:
[!] command1 [ | command2 ...]
The standard output of command1 shall be connected to the standard input of command2.
As you can see, a complete pipeline cannot end with a |: a command must follow it.
Your interactive bash shell shows a different prompt (the secondary prompt, > by default) when an incomplete line has been entered, and then waits for the rest of the command. It uses the GNU readline library for interactive editable input (and completion).
All the shells I know on Linux are free software, so you could study their source code. sash is a quite simple shell whose code is quite readable (but a bit buggy); it lacks most of the interactive facilities (notably auto-completion) of more sophisticated shells.
You'll need to understand most of Advanced Linux Programming before coding your own shell...
For homework, you can probably afford to report only the first error encountered.

Related

Print text to shell without advancing buffer

I would like to know if there is a way to print text into shell's current buffer/cursor so it can be edited. I am building a program that will store some text values in memory and need a simple way to edit them in the shell without rewriting the whole value. So somehow referencing the current edit buffer in shell and printing to it would be quite nice.
However, I am only using common sense here. Maybe it is more complicated. Looking forward to possible solutions.
Every shell handles user input differently. If there is an "edit buffer", it is most likely to be implemented in the shell itself. (Linux terminals do have a primitive line-editing function but as far as I know, there's no way to inject data into the line. However, only very primitive shells rely on native line-editing.)
So the question must be asked with respect to some specific shell.
Just in case, here's the answer for bash.
Bash relies on the readline library to perform command input from a terminal(-like) standard input. (Command-line history is provided using a related history library.) These libraries have a lot of features and bash does not provide access to all of them; if necessary, you could write a program in C or some scripting language with a readline binding.
But bash does give you the tools necessary to preload input with a line of text. The standard way of collecting input from a shell script is through the read command, which is (almost) always a shell "built-in". Basic operation of read is defined by the POSIX standard, but the bash version provides a lot of useful extension options, including:
-e: use the readline library (with tab-completion, history and line editing enabled).
-i text: if readline is being used, preload readline's edit buffer with the specified text.
The bash manual has lots more information about the read command's options.

How can we check if a given input is a valid system command like "ls" or "cd" in c program?

I am writing a C program that takes system commands such as "ls" or "cd" as input. However, the user can type anything, and some of it will not be a valid command. How can I find out which commands are valid and which are not? I am writing the code on Ubuntu.
Off the top of my head there are two ways to check if the input is a valid system command, without actually attempting to run it:
1. A long list of if()s and else if()s which strcmp() the input string with a hard-coded, predetermined list of valid commands. This may be relatively slow, both to write and to run, but with conditional compilation using #ifdef it can be nearly perfectly portable (i.e., it can be made to work with Windows, Linux, BSD et al. from one codebase with enough hard work).
2. If you don't mind being restricted to UNIX-like platforms, parse the $PATH variable, search the directories it lists for an executable with the same filename as the input string, and handle the error if no match is found.
You may wish to implement a hybrid of 2. and 1. by hard-coding some exceptions which may not be found in $PATH.
IMHO, however, I fail to see why you would want to do this; it seems puzzling to me.

Why does popen() invoke a shell to execute a process?

I'm currently reading up on and experimenting with the different possibilities of running programs from within C code on Linux. My use cases cover all possible scenarios, from simply running and forgetting about a process, reading from or writing to the process, to reading from and writing to it.
For the first two, popen() is very easy to use and works well. I understand that it uses some version of fork() and exec() internally, then invokes a shell to actually run the command.
For the third scenario, popen() is not an option, as it is unidirectional. Available options are:
Manually fork() and exec(), plus pipe() and dup2() for input/output
posix_spawn(), which internally uses the above as need be
What I noticed is that these can achieve the same that popen() does, but we can completely avoid the invoking of an additional sh. This sounds desirable, as it seems less complex.
However, I noticed that even examples on posix_spawn() that I found on the Internet do invoke a shell, so it would seem there must be a benefit to it. If it is about parsing command line arguments, wordexp() seems to do an equally good job.
What is the benefit of invoking a shell to run the desired process instead of running it directly?
Edit: I realized that my wording of the question didn't precisely reflect my actual interest - I was more curious about the benefits of going through sh rather than the (historical) reason, though both are obviously connected, so answers for both variations are equally relevant.
Invoking a shell allows you to do all the things that you can do in a shell.
For example,
FILE *fp = popen("ls *", "r");
is possible with popen() (expands all files in the current directory).
Compare it with:
execvp("/bin/ls", (char *[]){"/bin/ls", "*", NULL});
You can't get the same effect by exec'ing ls with * as an argument, because the exec(2) family passes * to ls literally; wildcard expansion is done by the shell.
Similarly, pipes (|), redirection (>, <, ...), etc., are possible with popen.
Otherwise, there's no reason to use popen() if you don't need a shell. You end up with an extra shell process, and everything that can go wrong in a shell can go wrong in your program (e.g., the command you pass could be interpreted by the shell in a way you did not intend, which is a common security issue). popen() is designed that way. A fork() + exec() solution is cleaner and avoids the issues associated with a shell.
The glib answer is that the POSIX standard ( http://pubs.opengroup.org/onlinepubs/9699919799/functions/popen.html ) says so. Or rather, it says that it should behave as if the command argument is passed to /bin/sh for interpretation.
So I suppose a conforming implementation could, in principle, also have some internal library function that would interpret shell commands without having to fork and exec a separate shell process. I'm not actually aware of any such implementation, and I suspect getting all the corner cases correct would be pretty tricky.
The 2004 version of the POSIX system() documentation has a rationale that is likely applicable to popen() as well. Note the stated restrictions on system(), especially the one stating "that the process ID is different":
RATIONALE
...
There are three levels of specification for the system() function. The
ISO C standard gives the most basic. It requires that the function
exists, and defines a way for an application to query whether a
command language interpreter exists. It says nothing about the command
language or the environment in which the command is interpreted.
IEEE Std 1003.1-2001 places additional restrictions on system(). It
requires that if there is a command language interpreter, the
environment must be as specified by fork() and exec. This ensures, for
example, that close-on-exec works, that file locks are not inherited,
and that the process ID is different. It also specifies the return
value from system() when the command line can be run, thus giving the
application some information about the command's completion status.
Finally, IEEE Std 1003.1-2001 requires the command to be interpreted
as in the shell command language defined in the Shell and Utilities
volume of IEEE Std 1003.1-2001.
Note the multiple references to the "ISO C Standard". The latest version of the C standard requires that the command string be processed by the system's "command processor":
7.22.4.8 The system function
Synopsis
#include <stdlib.h>
int system(const char *string);
Description
If string is a null pointer, the system function determines
whether the host environment has a command processor. If string
is not a null pointer, the system function passes the string
pointed to by string to that command processor to be executed
in a manner which the implementation shall document; this might then
cause the program calling system to behave in a non-conforming
manner or to terminate.
Returns
If the argument is a null pointer, the system function
returns nonzero only if a command processor is available. If
the argument is not a null pointer, and the system function
does return, it returns an implementation-defined value.
Since the C standard requires that the system's "command processor" be used for the system() call, I suspect that:
Somewhere there's a requirement in POSIX that ties popen() to the system() implementation.
It's much easier to just reuse the "command processor" entirely since there's also a requirement to run as a separate process.
So this is the glib answer twice-removed.

how to write unix "time" like utility

I am new to Unix and am learning to write C programs that can be compiled with gcc on Ubuntu. Question: I need to write something similar to "time ls", where time is replaced by my program. I know how to write a C program for this; however, I cannot understand how Unix will figure out what to execute if I replace time with my own utility, say "mytime". Some background on this would really help.
Read some good Linux programming book, perhaps ALP - a bit old, but freely downloadable.
Read also intro(2) & syscalls(2).
For time related stuff, start with time(7). It explains that there are several notions of time. Then consider time(2), gettimeofday(2), getrusage(2), clock_gettime(2), times(2), localtime(3), strftime(3) etc...
Notice also that time(1) is either a builtin command of your shell, or an external one in /usr/bin/time. So it is some free software, whose source code you could download and study.
I cannot understand how unix will figure out what to execute
Be aware of the PATH variable (see also environ(7)), used by shells and by execvp(3). You could set your PATH to suit your needs. You might also be interested in strace(1) to understand which system calls a command or a process makes. Notice that shells are ordinary programs, and you can write your own (which is a very useful exercise). Most shells are free software whose source code you can study. sash is a very simple shell...

How to display a custom prompt during the execution of C program?

I'm trying to emulate a terminal using a C program in Linux and need my program to display a custom prompt while the program executes. Is there a way to display it using my C program? (I can always try to printf "My-prompt" every line manually, but I'm looking for a better way). Also I can't use any additional libraries other than the basic ones so GNU Readline library and editline library wouldn't work (as suggested in another thread).
for example:
user#mypc:~$ ./a.out
my_custom_prompt>3+5
my_custom_prompt>8
my_custom_prompt>exit
user#mypc:~$
I believe what the OP wants is to simply have the "prompt" printed along with any program output, without having to add this manually every time. There is a way to do this, if you write a wrapper function on top of printf to do this, and call that instead of printf directly.
Probably this will help: http://www.ozzu.com/cpp-tutorials/tutorial-writing-custom-printf-wrapper-function-t89166.html
In your example, you already have got a terminal. You want to write a command-line interface with a prompt, not a terminal.
I can always try to printf "My-prompt" every line manually, but I'm looking for a better way
There’s nothing wrong with this approach. You have a loop which prints the prompt and waits for input afterwards. As Kunerd said in the comment, one line of code.
Normally, a prompt is printed to stderr rather than stdout. This has the advantage that the prompt appears even though no newline has been written yet, because stderr is unbuffered (and in combination with piping and redirection, it seems reasonable to me that the prompt doesn't go to the same stream as the actual output).
Also I can't use any additional libraries other than the basic ones so GNU Readline library and Editline library wouldn't work
Doing this in a way that strictly conforms to the C standard, using no library but the standard one, makes things like line editing (other than backspace) or a command history (close to) impossible. If that's OK for you, look at fgets() etc., and keep in mind that stdin is usually line-buffered.
POSIX specifies some additional properties of terminals, see e.g. http://pubs.opengroup.org/onlinepubs/9699919799/. Maybe curses is also of interest for you.
Perhaps you're looking for fgets() documentation?
