I'm making a custom shell in C and I wonder on which fd I should write my prompts.
mycoolshell $
Looking into other classic shells, I found that dash uses STDERR for its prompts. csh and tcsh use STDOUT. For bash, zsh and BSD sh I wasn't able to find anything. I used
% dash 2>file
echo qwe
echo qwe
% cat file
(dashprompt$)
(dashprompt$)
to check dash's prompt fd. I did the same for csh with csh 1>file, but I had no luck with the other shells.
Is there a standard or POSIX fd for this? Is it ok to use STDIN?
If you wish to be Posix compatible, you'll need to write the prompt to stderr. (See the specification of the PS1 environment variable, below.)
Regardless of strict Posix compatibility, stdin is definitely not correct, since it may not allow write operations. stdout is also not a good idea, since it is usually line-buffered, so a prompt that does not end in a newline would not appear until the buffer is flushed. Some shells (including zsh, I believe) write the prompt to a file descriptor connected to the current terminal (such as one opened on /dev/tty). If stderr has not been redirected, it probably refers to that same terminal, although it is not necessarily the same file descriptor. But using /dev/tty or equivalent is non-standard.
The prompt is only printed if the shell is interactive. A shell is interactive, according to Posix, if it is invoked in one of two ways:
If the -i option is present, or if there are no operands and the shell's standard input and standard error are attached to a terminal, the shell is considered to be interactive. (sh utility, Options)
Clearly, you wouldn't want the shell to spew out prompts if you are using it to execute a script. So you need some mechanism to tell if the shell is being used interactively or as a script processor; Posix's requirement seems reasonably accurate. (See the isatty() library function to see one way to do this test.)
That also shows why your test failed to capture the prompt when stderr was redirected to a file. Redirecting stderr causes the shell to be non-interactive so there will not be a prompt. To do the test properly, you need to force the shell to be interactive using the -i option.
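As a rough illustration of the interactivity test and the stderr prompt described above, here is a minimal sketch. The helper names shell_is_interactive() and write_prompt() are made up for this example, and the caller is assumed to have already parsed -i and the operands.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: apply the POSIX rule quoted above.
 * force_i      - true if the -i option was given
 * has_operands - true if a script (or other operand) was passed */
bool shell_is_interactive(bool force_i, bool has_operands)
{
    if (force_i)
        return true;
    return !has_operands && isatty(STDIN_FILENO) && isatty(STDERR_FILENO);
}

/* Hypothetical helper: write the prompt with write(2), so no stdio
 * buffering is involved. Returns 0 on success, -1 on error. */
int write_prompt(int fd, const char *prompt)
{
    size_t left = strlen(prompt);

    while (left > 0) {
        ssize_t n = write(fd, prompt, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        prompt += n;
        left -= (size_t)n;
    }
    return 0;
}
```

In the shell's read loop you would then call write_prompt(STDERR_FILENO, "mycoolshell $ ") only when shell_is_interactive() returned true.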
Posix requires that the prompt be modifiable by changing the value of the PS1 environment variable. Here's what Posix has to say, including the requirement that the prompt be printed to stderr: (emphasis added)
PS1
Each time an interactive shell is ready to read a command, the value of this variable shall be subjected to parameter expansion and written to standard error. The default value shall be "$ ". For users who have specific additional implementation-defined privileges, the default may be another, implementation-defined value. The shell shall replace each instance of the character '!' in PS1 with the history file number of the next command to be typed. Escaping the '!' with another '!' (that is, "!!" ) shall place the literal character '!' in the prompt. (Shell command language, Shell Variables)
Most shells allow a much richer set of substitutions in PS1. But the fact that the value is subject to parameter expansion allows extensive customisation. That means that (unlike usual variable expansion) parameter and command references appearing in the value of the PS1 variable are expanded every time the prompt is printed.
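To make the '!' rule from the quoted PS1 text concrete, here is a small sketch that handles only that rule. The function name expand_prompt_bang() is invented for this example, and parameter expansion is deliberately left out.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy ps1 into out, replacing each '!' with the
 * history number and each "!!" with a literal '!'.
 * Returns 0 on success, -1 if out is too small. */
int expand_prompt_bang(const char *ps1, unsigned histno, char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;

    for (const char *p = ps1; *p != '\0'; p++) {
        if (*p == '!' && p[1] == '!') {
            if (used + 1 >= outsz)
                return -1;
            out[used++] = '!';      /* "!!" -> literal '!' */
            p++;                    /* skip the second '!' */
        } else if (*p == '!') {
            int n = snprintf(out + used, outsz - used, "%u", histno);
            if (n < 0 || (size_t)n >= outsz - used)
                return -1;
            used += (size_t)n;      /* '!' -> history number */
        } else {
            if (used + 1 >= outsz)
                return -1;
            out[used++] = *p;       /* ordinary character */
        }
    }
    out[used] = '\0';
    return 0;
}
```

For example, with histno 42 the value "[!] $ " becomes "[42] $ ", and "hi!! " becomes "hi! ".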
I am trying to, in C:
Read data from a file
Manipulate the data
Write manipulated data to another file
In the assignment requirements, it says to compile and run the program with the following commands:
gcc -o name name.c
./name inputFileName.ext > outputFileName.ext
I am unfamiliar with the " > " command. I have a couple of questions:
Online, it says that " > " redirects command output to a file, and I'm not sure exactly what "command output" means. I'm redirecting the output from my name.c file to the outputFileName.ext file. Does command output mean stdout? If so, which C keyword would I use to write information to the outputFileName.ext file from name.c as stdout?
When I open and read my input file, I need to access the file that was passed in from the command line. Does the " > " character count as another command line argument? Can I still access inputFileName.ext from main() with the statement " argv[1] " ?
Online, it says that > redirects command output to a file, and I'm not sure exactly what "command output" means.
"command output" refers to the stdout (Standard Output) stream of the program.
Do note that some shell commands are not separate programs but are actually shell builtins, though they'll still support output redirection. On Windows, most shell commands (like dir and del) are built-ins whereas on Linux/BSD/etc most shell commands are separate programs (like ls and mkdir)
If your program calls puts( "foobar" ); then running ./name from Bash will display "foobar" in your terminal emulator. But if you run ./name > file.txt then the "foobar" text will be written to file.txt and it will not be displayed in your terminal emulator.
Try it with the ls command, for example: ls -al > files.txt. This works on Windows too (dir /s > files.txt).
I'm redirecting the output from my name.c file to the outputFileName.ext file. Does command output mean stdout?
Yes.
If so, which C keyword would I use to write information to the outputFileName.ext file from name.c as stdout?
You don't. This is a shell/OS feature and is not part of C.
Let's clarify a few things:
>, < and a few other symbols (that are not relevant to your question) are control operators for your command line interpreter (a.k.a. the shell). When the shell sees any of those, it handles the redirection itself and does not pass the operator or the filename after it to your program. So in your case, your program will have argc=2 and argv = ["./name", "inputFileName.ext"].
The "redirection" thing means that whatever your program would normally write to the screen via stdout (which is used by default when calling printf(), putchar(), puts()) will be written to the file named after >. Your program is completely unaware of this fact. In your code, you should just assume you are printing to the screen. It is the responsibility of whoever executes the command to perform the redirection. (Also: "outputFileName.ext" does not need to exist; it will be created if it doesn't. But the redirection will overwrite anything previously written in that file, so take extra care not to redirect to a .c file by accident, or to the results of a previous execution if you need both.)
< (not in your question, but closely related) works the opposite way around as you would imagine, with the program reading input from that file rather than from the keyboard. (obviously the file needs to exist now)
For the second part of your question, you can (and should) still access the name of the input file via the contents of argv[1]. You will open the file with fopen() and read from it via some of the C functions that take a FILE * stream as an argument (like fscanf(), fgets(), getline()).
Finally, are you sure the command given to you is
./name inputFileName.ext > outputFileName.ext
and not
./name < inputFileName.ext > outputFileName.ext
?
The latter uses redirection both for input and for output, and you should not do anything different when reading, just read normally from stdin.
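Putting the pieces together for the first form of the command (input file named in argv[1], output on stdout), a minimal sketch could look like this. The function name transform_file() is made up, and upper-casing stands in for whatever manipulation the assignment actually asks for.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: read in_path, "manipulate" the data (here: just
 * upper-case it, as a placeholder) and write the result to out.
 * Returns 0 on success, -1 on error. */
int transform_file(const char *in_path, FILE *out)
{
    FILE *in = fopen(in_path, "r");
    if (in == NULL) {
        perror(in_path);
        return -1;
    }

    int c;
    while ((c = fgetc(in)) != EOF)
        fputc(toupper(c), out);

    int failed = ferror(in) || ferror(out);
    fclose(in);
    return failed ? -1 : 0;
}
```

In main() you would check that argc is 2 and call transform_file(argv[1], stdout). The program never mentions outputFileName.ext; the shell connects stdout to that file before the program starts.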
How do I redirect stderr (or stdout+stderr) to a file if I don't know which shell (bash, csh, dash) is interpreting my command?
My C code running on Linux/FreeBSD/OSX needs to call an external program via the system() function, which will use /bin/sh to interpret the supplied command line. I would like to capture the messages printed by that external program to stderr and save them to a file. The problem is that on different systems /bin/sh points to different shells that have different syntax for redirecting the stderr stream to a file.
The closest thing I found is that bash actually understands the csh-style syntax for redirecting stderr+stdout to a file:
some_program >& output.txt
but dash, which is the default shell on Ubuntu (i.e. very common), does not understand this syntax.
Is there a syntax for stderr redirection that would be correctly interpreted by all common shells? Alternatively, is there a way to tell system() (or some other similar C function?) to use /usr/bin/env bash instead of /bin/sh to interpret the supplied command line?
You have a mistaken assumption, that /bin/sh can be an "alternate" shell like csh that's incompatible with the standard shell syntax. If you had a system setup like that, it would be unusably broken; no shell scripts would work. Pretty much all modern systems attempt to conform, at least superficially, to the POSIX standard, where the sh command processes the Shell Command Language specified in POSIX, which is roughly equivalent to the historical Bourne shell and which bash, dash, ash, etc. (shells which are commonly installed as /bin/sh) are all 99.9% compatible with.
You can completely ignore csh and similar. They're never installed as sh, and only folks who actually want to use them, or who get stuck using them as their interactive shell because some evil sysadmin set up the login shell defaults that way, ever have to care about them.
On any POSIX-like system, you can use
system("some_program > output.txt 2>&1");
This is because POSIX system is equivalent to calling sh, and POSIX sh supports this kind of redirection. This works independently of whether or not a user opening a terminal on the system will see a Csh prompt.
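For completeness, a small sketch of wrapping that call so the log file name is a parameter. The name run_logged() is invented here, and the sketch assumes cmd is a simple command and log_path contains no spaces or shell metacharacters, since it is pasted into the command line unquoted.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: run cmd through system() with stdout and stderr
 * both sent to log_path. Returns whatever system() returns, or -1 if
 * the command line does not fit in the local buffer.
 * Assumption: log_path needs no shell quoting. */
int run_logged(const char *cmd, const char *log_path)
{
    char line[1024];
    int n = snprintf(line, sizeof line, "%s > %s 2>&1", cmd, log_path);

    if (n < 0 || (size_t)n >= sizeof line)
        return -1;
    return system(line);    /* interpreted by sh, so 2>&1 is understood */
}
```

To capture only stderr, the format string would be "%s 2> %s" instead.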
How do I redirect stderr (or stdout+stderr) to a file if I don't know which shell (bash, csh, dash) is interpreting my command?
You don't. Bourne-family shells and csh-family shells have different, incompatible syntax for redirecting stderr. In fact, csh and tcsh do not have a syntax to redirect only stderr at all -- they can redirect it only together with stdout.
If you really could be in any shell at all, then you're pretty much hosed with respect to doing much of anything. One could imagine an obscure, esoteric shell with completely incompatible syntax. For that matter, even an unusual configuration of a standard shell could trip you up -- for example if the IFS variable is set to an unusual value in a Bourne-family shell, then you'll have trouble executing any commands that don't take that into account.
If you can count on executing at least simple commands, then you could execute a known shell within the unknown one to process your command, but that oughtn't to be necessary for the case that seems to interest you.
Alternatively, is there a way to tell system() (or some other similar
C function?) to use /usr/bin/env bash instead of /bin/sh to interpret
the supplied command line?
Not on a POSIX-conforming system. POSIX specifies explicitly that the system() function executes the command by use of /bin/sh -c [the_command]. But this shouldn't be something to worry about, as /bin/sh should be a conforming POSIX shell, or at least pretty close to one. Definitely it should be a Bourne-family shell, which both bash and dash are, but tcsh most definitely is not.
The way to redirect the standard error stream in a POSIX shell is to use the 2> redirection operator (which is a special case of a more general redirection feature applicable to any file descriptor). Whatever shell /bin/sh actually is should recognize that syntax, and in particular bash and dash both do:
some_program 2> output.txt
I think there is another possibility worth mentioning: you could open the file you want stderr redirected to in your C code before calling system(). You can dup() the original stderr first, and then restore it afterwards.
fflush(stderr); // Flush pending output
int saved_stderr = dup(fileno(stderr));
int fd = open("output.txt", O_RDWR|O_CREAT|O_TRUNC, 0600);
dup2(fd, fileno(stderr));
close(fd);
system("some_program");
dup2(saved_stderr, fileno(stderr));
close(saved_stderr);
This should perform the output redirection as you need it.
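The same idea with the error checks filled in might look like the sketch below. The function name system_with_stderr_to() is hypothetical, and the code assumes no other thread writes to stderr while the redirection is in place.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: run cmd via system() with fd 2 temporarily
 * pointing at path, then restore the original stderr.
 * Returns system()'s status, or -1 if the redirection could not be set up. */
int system_with_stderr_to(const char *cmd, const char *path)
{
    fflush(stderr);                             /* flush pending output */

    int saved = dup(STDERR_FILENO);             /* remember the old stderr */
    if (saved < 0)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0 || dup2(fd, STDERR_FILENO) < 0) {
        if (fd >= 0)
            close(fd);
        close(saved);
        return -1;
    }
    close(fd);                                  /* fd 2 now refers to the file */

    int status = system(cmd);                   /* child inherits fd 2 */

    fflush(stderr);
    dup2(saved, STDERR_FILENO);                 /* put the old stderr back */
    close(saved);
    return status;
}
```

Because the child inherits the already-redirected descriptor, no shell redirection syntax is needed in cmd at all.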
If you don't know which shell you are in, then of course you don't know how to redirect from it, although you can look at the value of $SHELL and act accordingly:
char *shell = getenv("SHELL");
if (shell == NULL) { /* no SHELL variable defined */
/* ... */
} else if (!strcmp(shell, "/bin/sh")) { /* bourne shell */
/* ... */
} /* ... more shells */
Despite what you say in your question, it is quite unusual to make /bin/sh point to another shell, as shell scripts use syntax that depends on it. The only case I know of is bash(1), and I have seen this only on Linux (and, remarkably, recent versions of Solaris), but the syntax of bash(1) is a superset of the syntax of sh(1), making it possible to run shell scripts written for sh(1) with it. Making /bin/sh point to perl, for example, would probably make your system completely unusable, as many system tools depend on /bin/sh being a Bourne-compatible shell.
By the way, the system(3) library function always calls sh(1) as the command interpreter, so there should be no problem using it. However, system(3) gives the calling process no way to capture the output and process it (indeed, the command's parent process is the sh(1) that system(3) fork(2)s).
Another thing you can do is to popen(3) a process. This call gives you a FILE pointer to a pipe connected to the process. You get its input if you popen(3) it for writing, and its output if you popen(3) it for reading. Look at the manual for details, but as far as I know it redirects only the standard output, not the standard error (for reasons discussed below), and only if you popen(3) it with the "r" flag.
FILE *f_in = popen("ps aux", "r");
/* read standard output of 'ps aux' command. */
pclose(f_in); /* closes the descriptor and waits for the child to finish */
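If you also want standard error through popen(3), one option is to let the shell that popen() starts merge it into standard output with 2>&1 inside the command string. A sketch, with a made-up helper name capture_output():

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: run cmd with popen() and store up to bufsz-1
 * bytes of what it writes to stdout in buf (NUL-terminated).
 * Returns pclose()'s status, or -1 on error. */
int capture_output(const char *cmd, char *buf, size_t bufsz)
{
    if (bufsz == 0)
        return -1;

    FILE *p = popen(cmd, "r");
    if (p == NULL)
        return -1;

    size_t n = fread(buf, 1, bufsz - 1, p);
    buf[n] = '\0';

    /* Drain anything that did not fit, so the child is not killed by
     * SIGPIPE when we close our end. */
    char scratch[256];
    while (fread(scratch, 1, sizeof scratch, p) > 0)
        ;

    return pclose(p);       /* closes the stream and waits for the child */
}
```

For example, capture_output("{ echo hello; echo oops >&2; } 2>&1", buf, sizeof buf) leaves both lines in buf, because the redirection is applied by the sh that popen() runs.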
Another thing you can do is to redirect yourself after fork(2)ing the child, and before the exec(2) call (this way you can decide if you want only stdout or if you want also stderr redirected back to you):
int fd[2];
int res = pipe(fd);
if (res < 0) {
perror("pipe");
exit(EXIT_FAILURE);
}
if ((res = fork()) < 0) {
perror("fork");
exit(EXIT_FAILURE);
} else if (res == 0) { /* child process */
dup2(fd[1], 1); /* redirect stdout to the pipe */
dup2(fd[1], 2); /* redirect stderr to the pipe as well */
close(fd[1]); close(fd[0]); /* we don't need these */
execvp(program, argv);
perror("execvp");
exit(EXIT_FAILURE);
} else { /* parent process */
close(fd[1]); /* we are not going to write in the pipe */
FILE *f_in = fdopen(fd[0], "r"); /* fdopen() requires a mode argument */
/* read standard output and standard error from program from f_in FILE descriptor */
fclose(f_in);
wait(NULL); /* wait for child to finish */
}
You can see a complete example of this here (it does not read standard error, but that is easy to add: you only have to add the second dup2() call from above). The program repeatedly executes a command you pass to it on the command line. It needs access to the output of the subprocess to count the lines, because between invocations it moves the cursor up as many lines as the command printed, so that the next invocation overwrites the output of the previous one. You can try it and play with it, making modifications as you like.
NOTE
In your sample redirection, when you use >& in standard sh syntax, you need to add a number after the ampersand to indicate which descriptor you are dup()ing. While the number before the > is optional, the one after the & is mandatory (the >& file form is a csh-style extension that bash accepts but dash does not). So, if you have not used it, be prepared to receive an error (which you probably won't see if you are redirecting stderr). The idea of having two separate output descriptors is to allow you to redirect stdout while keeping a channel for error messages.
I know how to get the stdout into a file using dup/dup2 system calls, but how do I get the entire output that would be normally shown on my terminal(including the prompt that says my username along with the $ symbol and the current working directory) to a file?
Yes, you can, but many of the details may be difficult (depending on your level of expertise). For the shell to behave normally (that is, exactly as it does in a terminal), it needs to interact with a terminal (a special system object). So you need to create a program that behaves like a terminal; this is what pseudo-terminal devices (under /dev) are intended for. Read the documentation to implement it, but roughly: your application plays the role of the terminal and holds the master side of the pseudo-terminal, while the shell is connected to the slave side. Then you can easily log the real input typed by the user and capture the output produced by the shell.
I would say there is no way to do that inside a code in C. Instead, you could use bash for example to redirect everything to a file, and leave the code in C as it is.
In this way you have all the info you want to save: prompt, current directory, call to the program (including flags), and of course the output of the program.
Well, you can do:
- For bash prompt PS1: Echo expanded PS1 (in case you want it expanded; if not, you can simply echo PS1)
- For executed command: https://unix.stackexchange.com/questions/169259/how-to-capture-command-line-input-into-logfile-and-execute-it-at-the-same-time
- Standard output and error output: Redirect stderr and stdout in a Bash script
And that's all you want to capture, I think.
Look up the script command in Unix systems. If you want to capture all keyboard and std in/out for a command, use the script executable. If you want to see how it's done, look up the source.
I have one C program and one shell script and I'd like to "source" shell script using my C.
I tried using the system() function. With it I can run the script, but my colors don't work.
For example instead of CYAN - I defined it as:
CYAN='\e[96m'
it shows only \e[96m and some functions just failed with message:
./myscript.sh: 27: [: y: unexpected operator
Is there some solution?
A program that is not itself the shell cannot "source" a file of shell commands as the shell itself can do. A program can run such a file as a script, either directly or by invoking a shell to run it, but the script then gets its own environment, and any changes it applies to that environment do not propagate to the parent process's environment.
Programs receive their environment as a function of program startup. If you want a variable to be set in a program's environment then by far the easiest thing to do is arrange for it to be set when the program is invoked, either by exporting it from the parent process's environment or by wrapping program launch in a script that arranges for the same. There are additional alternatives on the process startup side, as well.
If a C program wants to alter its environment after startup, then it can use the setenv() and unsetenv() functions. Those are defined by POSIX, not C itself, but if we're talking about sourcing shell commands then it seems reasonable to assume a POSIX context.
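As a small illustration of the setenv() route, here is a sketch that puts the color code into the C program's own environment, so that anything started later with system() inherits it. The function name export_cyan() is invented for this example. Note that standard C has no \e escape, so the ESC character is written as \033.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/* Hypothetical helper: define CYAN in this process's environment as the
 * real escape sequence (ESC [ 9 6 m). Child processes started
 * afterwards, for example via system(), inherit it.
 * Returns 0 on success, -1 on error (as setenv() does). */
int export_cyan(void)
{
    return setenv("CYAN", "\033[96m", 1);   /* 1 = overwrite if already set */
}
```

A script run afterwards with system("./myscript.sh") would then see $CYAN already set, without having to define it itself.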
Additionally, if you are trying to define CYAN as a shell variable whose contents are an ANSI escape sequence, then your syntax is wrong. No escape sequences at all are recognized within ordinary single quotes (even a closing single quote cannot be escaped). Within double quotes the backslash does function as an escape character, but only in a strict sense: C-style character codes are not supported there. If, again, you're processing that in the shell, as opposed to in C, then you appear to want
CYAN=$'\e[96m'
(Note the $, which is essential for \e to be recognized as representing the "escape" character, and which causes the shell to recognize a few other C-style escape sequences as well.)
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
char buf[] = "standard err, output.\n";
printf("standard output.\n");
if (write(STDERR_FILENO,buf, 22) != 22)
printf("write err!\n");
exit(0);
}
Compile using:
gcc -Wall text.c
Then running in the shell:
./a.out > outfile 2 >& 1
Result: outfile's contents are:
standard err, output.
standard output.
./a.out 2 >& 1 >outfile
Result:
This first prints to the terminal: standard err, output.
and the content of outfile are: standard output.
Questions:
I want to ask the difference between 2 >& fd and 2 > file.
Are they all equal to the function dup()?
Another question: why are the contents of outfile:
standard err, output.
standard output.
I expected the content of outfile to be:
standard output.
standard err, output
Actually, in bash, >& is quite similar to dup2. That is, the file descriptor to which it is applied will refer to the same file as the descriptor to the right. So:
$ ./a.out > outfile 2>& 1
It will redirect stdout(1) to the file outfile and, after that, will dup2 stderr(2) to refer to the same file as stdout(1). That is, both stdout and stderr are being redirected to the file.
$ ./a.out 2>& 1 >outfile
It will redirect stderr(2) to refer to the same file as stdout(1), that is, the console, and after that, will redirect stdout(1) to refer to the file outfile. That is, stderr will output to the console and stdout to the file.
And that's exactly what you are getting.
Paradigm Mixing
While there are reasons to do all of these things deliberately, as a learning experience it is probably going to be confusing to mix operations over what I might call "domain boundaries".
Buffered vs non-buffered I/O
The printf() is buffered, the write() is a direct system call. The write happens immediately no matter what, the printf will be (usually) buffered line-by-line when the output is a terminal and block-by-block when the output is a real file. In the file-output case (redirection) your actual printf output will happen only when you return from main() or in some other fashion call exit(3), unless you printf a whole bunch of stuff.
Historic csh redirection vs bash redirection
The now-forgotten (but typically still in a default install) csh that Bill Joy wrote at UCB while a grad student had a few nice features that have been imported into kitchen-sink shells that OR-together every shell feature ever thought of. Yes, I'm talking about bash here. So, in csh, the way to redirect both standard output and standard error was simply to say cmd >& file, which was really more civilized than the bag-of-tools approach that the "official" Bourne shell provided. But the Bourne syntax had its good points elsewhere and in any case survived as the dominant paradigm.
But the bash "native" redirection features are somewhat complex and I wouldn't try to summarize them in a SO answer, although others seem to have made a good start. In any case you are using real bash redirection in one test and the legacy-csh syntax that bash also supports in another, and with a program that itself mixes paradigms. The main issue from the shell's point of view is that the order of redirection is quite important in the bash-style syntax while the csh-style syntax simply specifies the end result.
There are several loosely related issues here.
Style comment: I recommend using 2>&1 without spaces. I wasn't even aware that the spaced-out version works (I suspect it didn't in Bourne shell in the mid-80s) and the compressed version is the orthodox way of writing it.
The file-descriptor I/O redirection notations are not all available in the C shell and derivatives; they are available in Bourne shell and its derivatives (Korn shell, POSIX shell, Bash, ...).
The difference between >file or 2>file and 2>&1 is what the shell has to do. The first two arrange for output written to a file descriptor (1 in the first case, aka standard output; 2 in the second case, aka standard error) to go to the named file. This means that anything written by the program to standard output goes to file instead. The third notation arranges for 2 (standard error) to go to the same file descriptor as 1 (standard output); anything written to standard error goes to the same file as standard output. It is trivially implemented using dup2(). However, the standard error stream in the program will have its own buffer and the standard output stream in the program will have its own buffer, so the interleaving of the output is not completely determinate if the output goes to a file.
You run the command two different ways, and (not surprisingly) get two different results.
./a.out > outfile 2>&1
I/O redirections are processed left to right. The first one sends standard output to outfile. The second sends standard error to the same place as standard output, so it goes to outfile too.
./a.out 2>&1 >outfile
The first redirection sends standard error to the place where standard output is going, which is currently the terminal. The second redirection then sends standard output to the file (but leaves standard error going to the terminal).
The program uses the printf() function and the write() system call. When the printf() function is used, it buffers its output. If the output is going to a terminal, then it is normally 'line buffered', so output appears when a newline is added to the buffer. However, when the output is going to a file, it is 'fully buffered' and output does not appear until the file stream is flushed or closed or the buffer fills. Note that stderr is not fully buffered, so output written to it appears immediately.
If you run your program without any I/O redirection, you will see:
standard output.
standard err, output
By contrast, the write() system call immediately transfers data to the output file descriptor. In the example, you write to standard error, and what you write will appear immediately. The same would have happened if you had used fprintf(stderr, ...). However, suppose you modified the program to write to STDOUT_FILENO; then when the output is to a file, the output would appear in the order:
standard err, output
standard output.
because the write() is unbuffered while the printf() is buffered.
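The reordering is easy to reproduce without touching stdout itself. The sketch below uses an ordinary file stream, which is fully buffered by default just like stdout when it is redirected to a file; the function name demo_buffering() is made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical demo: write one line through stdio and one through
 * write(2) to the same file. Returns 0 on success, -1 on error. */
int demo_buffering(const char *path)
{
    FILE *f = fopen(path, "w");     /* regular file: fully buffered */
    if (f == NULL)
        return -1;

    fprintf(f, "buffered\n");       /* sits in the stdio buffer for now */

    /* write() bypasses the stdio buffer and reaches the file at once. */
    if (write(fileno(f), "direct\n", 7) != 7) {
        fclose(f);
        return -1;
    }

    /* fclose() flushes the buffer, so "buffered" lands after "direct". */
    return fclose(f) == 0 ? 0 : -1;
}
```

After the call the file holds "direct" on the first line and "buffered" on the second. Calling fflush(f) right after the fprintf() would give the order in which the calls appear in the source.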
The 2>&1 part makes the shell do something like this:
dup2(1, 2);
This makes fd 2 a "copy" of fd 1.
The 2> file is interpreted as
fd = open(file, ...);
dup2(fd, 2);
which opens a file and puts the filedescriptor into slot 2.
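To see why the order matters, here is a sketch that performs the same dup2()/open() steps in a child process, in either order. The function name run_with_redirections() is hypothetical, and a second file (term_path) stands in for the terminal so that the result can be inspected afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo. In a child process, fds 1 and 2 first point at
 * term_path (a stand-in for the terminal). Then:
 *   file_first != 0 : behave like  "> out_path 2>&1"
 *   file_first == 0 : behave like  "2>&1 > out_path"
 * Finally "out" is written to fd 1 and "err" to fd 2.
 * Returns the child's exit status (0 on success), or -1 on error. */
int run_with_redirections(int file_first, const char *term_path, const char *out_path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                                 /* child */
        int term = open(term_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (term < 0 || dup2(term, 1) < 0 || dup2(term, 2) < 0)
            _exit(1);
        if (term > 2)
            close(term);

        if (!file_first && dup2(1, 2) < 0)          /* 2>&1 comes first */
            _exit(1);

        int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd < 0 || dup2(fd, 1) < 0)              /* > out_path */
            _exit(1);
        close(fd);

        if (file_first && dup2(1, 2) < 0)           /* 2>&1 comes last */
            _exit(1);

        if (write(1, "out\n", 4) != 4 || write(2, "err\n", 4) != 4)
            _exit(1);
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With file_first set, both lines end up in out_path. With it cleared, out_path gets only "out" and "err" goes to the stand-in terminal file, because fd 2 was copied from fd 1 before fd 1 was pointed at the file.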