Another Linux command output (Piped) as input to my C program

I'm working on a small C program in Linux. Let me explain what I want to do, using the sample Linux command below:
ls | grep hello
The above command is executed in the following fashion (let me know if I've got this wrong):
The ls command is executed first.
Its output is given to the grep command, which in turn generates output by matching "hello".
Now I would like to write a C program that takes the piped output of one command as its input, in the same fashion as the grep program gets its input from the ls command in my example above.
A similar question has been asked by another user here, but for some reason that thread was marked as "Not a valid question".
I initially thought this input would arrive as a command-line argument to the C program, but that is not the case.

If you pipe the output from one command into another, that output will be available on the receiving process's standard input (stdin).
You can access it using the usual scanf or fread functions. scanf and the like operate on stdin by default (in the same way that printf operates on stdout by default; in the absence of a pipe, stdin is attached to the terminal), and the C standard library provides a FILE *stdin for functions like fread that read from a FILE stream.
POSIX also provides a STDIN_FILENO macro in unistd.h, for functions that operate on file descriptors instead. This will essentially always be 0, but it's bad form to rely on that being the case.

In fact, ls and grep start at the same time.
ls | grep hello means: use ls's standard output as grep's standard input. ls writes its results to standard output; grep waits for input on standard input and processes it as soon as it arrives.
Still have doubts? Try an experiment. Run:
find / | grep usr
find / lists every file on the computer, so it should take a long time.
If find ran to completion first and only then the OS handed its output to grep, we would stare at a blank screen for a long time until find finished and grep started. Instead, results appear right away, which shows that both commands run at the same time.

Related

How can I access the file in C when the user used the '<' command on the shell?

I am trying to make a program that can process sentences in C in the POSIX environment. Assume that my program's name is "test". If the user entered just "./test", then my program will ask the user to enter some sentences. This one so far is easy.
However, if the user entered "./test < file.txt", the program should get the characters from that txt file. I do not know how I can get the characters of the file in C. I tried something like file = open(argv[2]);, but it did not work.
I will really appreciate it if you give me the answer to this question.

TL;DR: If you start your program like
./test
and you have to type in the input, then exactly the same program will read from file.txt if you start it as
./test < file.txt
Longer explanation starts here. (The following explanation is not 100% precise, but it should help you understand what is going on in principle.)
In a C program you can open files with fopen. As a return value, fopen gives you a FILE pointer. However, when you start a program under Unix, three FILE pointers are already available. These default FILE pointers are stored in variables named stdin, stdout and stderr.
Of these, stdin can be read from, while stdout and stderr can be written to. stdin is also the default for several C library calls, such as gets or scanf (note that gets was removed from the C standard in C11; fgets is the safe replacement). Similarly, stdout is used by default for calls like printf.
Now, although they are called FILE pointers, they can in fact represent other things than just files. stdin could be a file, but it can also be a console where you can type in stuff.
This latter scenario is what you observe when you start your test program from the shell with the command
./test
In this case, the test process will be started with stdin just using the console from the shell from which you started the test program. Therefore, if in your test program you call, say, gets(), then your program will implicitly read from stdin, which represents the console input that was inherited from the shell. Consequently, in this case the user has to provide input by typing it in.
Now let's look at what happens if you start your process from the shell in the following way:
./test < file.txt
Here, the shell does a bit of extra work before it actually creates your test process. This is because the < file.txt part of your command line is interpreted by the shell; it is not passed as arguments to your program. Instead, the shell opens file.txt and, when the test process is started, hands the opened file over to the process so that in your test process stdin is connected to file.txt.
Then, the call to gets() in your program will again read from stdin, but this time stdin is not the console. This time stdin really corresponds to a file, that is, file.txt.

Checking if a file via stdin exists (C)

I'm having difficulty writing a function in C that checks whether a user-supplied file (via stdin) exists. For instance, if the program is run as ./a.out <myfile.txt, I want it to return false if this file does not exist. I can do this with fopen() by passing the file as an argument (i.e. ./a.out myfile.txt), but I'm not sure how to do this using stdin (i.e. ./a.out <myfile.txt).
Ok to clarify:
The larger program is supposed to take the contents of a text file and perform actions on it. The program must be run in the command line as ./a.out arg1 arg2 <myfile.txt. If user ran the program as ./a.out arg1 arg2 or ./a.out (i.e not specifying the file to perform actions on), I want to prompt the user to include a file (using stdin <, not passed as an argument).

Stdin might not be coming from a file at all. Even if it is, when the user types "< myfile.txt" at the command line, the shell swallows that part of the command, and never passes it to the program. As far as the program is concerned, it's an anonymous stream of bytes that might be from a file, a device, a terminal, a pipe, or something else. It is possible to query which of these you have, but even if you know it's a file you won't get the name of the file given on the command line, only an inode.

Since the shell is responsible for opening the file for redirection, it will refuse to execute the command if the file doesn't open.

Input redirection is something done by the shell, not your program. It simply attaches the file to standard input.
Hence, if you try to redirect input from a non-existent file, the shell should complain bitterly and not even run your program, as shown in the following transcript:
pax> echo hello >qq.in
pax> cat <qq.in
hello
pax> cat <nosuchfile.txt
bash: nosuchfile.txt: No such file or directory
In any case, your program generally doesn't know where the input is coming from, since you can do something like:
echo hello | cat
in which no file is involved.
If you want your program to detect the existence of a file, it will have to open the file itself, meaning you should probably give the filename as an argument rather than using standard input.
Or, you could detect the file existence before running your program, with something like the following bash segment:
fspec=/tmp/infile
if [[ -f ${fspec} ]] ; then
my_prog <${fspec}
else
echo What the ...
fi

Your program is never started, because a valid stdin cannot be provided if myfile.txt does not exist. Since your program does not run, there is no way it can signal that the file is missing; that diagnostic is produced before your program is ever invoked.

If user ran the program as ./a.out arg1 arg2 or ./a.out (i.e not specifying the file to perform actions on), I want to prompt the user to include a file (using stdin <, not passed as an argument).
You could use OS-specific functions to check whether stdin is a terminal. Checking whether it's a file is a very bad idea, because it's very useful to pipe into stdin ... in fact, that's a major reason that there is such a thing as stdin in the first place. If you only want to read from a file, not a terminal or pipe, then you should take the file name as a required argument and not read from the original stdin (you can still read from stdin by using freopen). If you insist that you don't want to do it that way, then I will insist that you want to do it wrong.

In Unix/Linux shell programming: the difference between > and >&

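#include <stdio.h>   /* printf */
#include <stdlib.h>  /* exit */
#include <unistd.h>  /* write, STDERR_FILENO */

int main(void)
{
    char buf[] = "standard err, output.\n";
    printf("standard output.\n");            /* buffered by stdio */
    if (write(STDERR_FILENO, buf, 22) != 22) /* unbuffered system call */
        printf("write err!\n");
    exit(0);
}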
Compile using:
gcc -Wall text.c
Then running in the shell:
./a.out > outfile 2 >& 1
Result: the contents of outfile are:
standard err, output.
standard output.
./a.out 2 >& 1 >outfile
Result:
This first prints to the terminal: standard err, output.
and the contents of outfile are: standard output.
Questions:
I want to ask the difference between 2 >& fd and 2 > file.
Are they all equal to the function dup()?
Another question: why are the contents of outfile:
standard err, output.
standard output.
I expected the content of outfile to be:
standard output.
standard err, output

Actually, in bash, >& is quite similar to dup2. That is, the file descriptor to which it is applied will refer to the same file as the descriptor to the right. So:
$ ./a.out > outfile 2>& 1
It will redirect stdout(1) to the file outfile and, after that, will dup2 stderr(2) to refer to the same file as stdout(1). That is, both stdout and stderr are being redirected to the file.
$ ./a.out 2>& 1 >outfile
It will redirect stderr(2) to refer to the same file as stdout(1), that is, the console, and after that, will redirect stdout(1) to refer to the file outfile. That is, stderr will output to the console and stdout to the file.
And that's exactly what you are getting.

Paradigm Mixing
While there are reasons to do all of these things deliberately, as a learning experience it is probably going to be confusing to mix operations over what I might call "domain boundaries".
Buffered vs non-buffered I/O
The printf() is buffered, the write() is a direct system call. The write happens immediately no matter what, the printf will be (usually) buffered line-by-line when the output is a terminal and block-by-block when the output is a real file. In the file-output case (redirection) your actual printf output will happen only when you return from main() or in some other fashion call exit(3), unless you printf a whole bunch of stuff.
Historic csh redirection vs bash redirection
The now-forgotten (but typically still in a default install) csh that Bill Joy wrote at UCB while a grad student had a few nice features that have been imported into kitchen-sink shells that OR-together every shell feature ever thought of. Yes, I'm talking about bash here. So, in csh, the way to redirect both standard output and standard error was simply to say cmd >& file which was really more civilized than the bag-of-tools approach that the "official" Bourne shell provided. But the Bourne syntax had its good points elsewhere and in any case survived as the dominant paradigm.
But the bash "native" redirection features are somewhat complex and I wouldn't try to summarize them in a SO answer, although others seem to have made a good start. In any case you are using real bash redirection in one test and the legacy-csh syntax that bash also supports in another, and with a program that itself mixes paradigms. The main issue from the shell's point of view is that the order of redirection is quite important in the bash-style syntax while the csh-style syntax simply specifies the end result.

There are several loosely related issues here.
Style comment: I recommend using 2>&1 without spaces. I wasn't even aware that the spaced-out version works (I suspect it didn't in Bourne shell in the mid-80s) and the compressed version is the orthodox way of writing it.
The file-descriptor I/O redirection notations are not all available in the C shell and derivatives; they are available in Bourne shell and its derivatives (Korn shell, POSIX shell, Bash, ...).
The difference between >file or 2>file and 2>&1 is what the shell has to do. The first two arrange for output written to a file descriptor (1 in the first case, aka standard output; 2 in the second case, aka standard error) to go to the named file. This means that anything written by the program to standard output goes to file instead. The third notation arranges for 2 (standard error) to go to the same file descriptor as 1 (standard output); anything written to standard error goes to the same file as standard output. It is trivially implemented using dup2(). However, the standard error stream in the program will have its own buffer and the standard output stream in the program will have its own buffer, so the interleaving of the output is not completely determinate if the output goes to a file.
You run the command two different ways, and (not surprisingly) get two different results.
./a.out > outfile 2>&1
I/O redirections are processed left to right. The first one sends standard output to outfile. The second sends standard error to the same place as standard output, so it goes to outfile too.
./a.out 2>&1 >outfile
The first redirection sends standard error to the place where standard output is going, which is currently the terminal. The second redirection then sends standard output to the file (but leaves standard error going to the terminal).
The program uses the printf() function and the write() system call. When the printf() function is used, it buffers its output. If the output is going to a terminal, then it is normally 'line buffered', so output appears when a newline is added to the buffer. However, when the output is going to a file, it is 'fully buffered' and output does not appear until the file stream is flushed or closed or the buffer fills. Note that stderr is not fully buffered, so output written to it appears immediately.
If you run your program without any I/O redirection, you will see:
standard output.
standard err, output
By contrast, the write() system call immediately transfers data to the output file descriptor. In the example, you write to standard error, and what you write will appear immediately. The same would have happened if you had used fprintf(stderr, ...). However, suppose you modified the program to write to STDOUT_FILENO; then when the output is to a file, the output would appear in the order:
standard err, output
standard output.
because the write() is unbuffered while the printf() is buffered.

The 2>&1 part makes the shell do something like this:
dup2(1, 2);
This makes fd 2 a "copy" of fd 1.
The 2> file is interpreted as
fd = open(file, ...);
dup2(fd, 2);
which opens a file and puts the filedescriptor into slot 2.

Use output of unfinished process in a C program

I am using tcpstat in a Linux environment. I want to capture its output in a C program even though it has not finished. I tried using the popen() function, but it can only process the output after the program has finished. I want to process the output of tcpstat on the fly, as and when it prints to standard output. How do I do so?
For example,
$ tcpstat -i wlan0 1
Time:1297790227 n=2 avg=102.50 stddev=42.50 bps=1640.00
Time:1297790228 n=11 avg=86.36 stddev=19.05 bps=7600.00
Time:1297790229 n=32 avg=607.97 stddev=635.89 bps=155640.00
Time:1297790230 n=13 avg=582.92 stddev=585.55 bps=60624.00
The above output keeps going indefinitely, so I want to process it in a C program as and when tcpstat writes something to stdout.
Thanks and Regards,
Hrishikesh Murali

Run tcpstat -i wlan0 -a 1 | your_program and read from the standard input in your program. This way the shell will take care of the piping.
The popen library function and the pipe system call can be used to achieve the same result at a lower level. You may want to take a look at named pipes too - they appear like files in userspace and can be manipulated in the same way.

Run tcpstat with the -F option, this will cause it to flush its output on every interval. (instead of using the default block buffering for stdout)
In addition, you may want to explicitly disable the buffering on your popen FILE handle using setbuf, eg.
setbuf(popen_fd, NULL);
Alternately, you can set it to be line buffered, using setlinebuf
setlinebuf(popen_fd);

Transferring output of a program to a file in C

I have written a C program to get all the possible combinations of a string. For example, for abc, it will print abc, bca, acb etc. I want to get this output in a separate file. What function should I use? I don't have any knowledge of file handling in C. If somebody could explain it to me with a small piece of code, I would be very thankful.

Use the function fopen, and then fprintf(f,"…",…); instead of printf("…",…);, where f is the FILE* obtained from fopen. That should give you the result you want. You should fclose() your file when you are finished, but if you don't, it will be closed automatically when the program exits.

If you're running it from the command line, you can just redirect stdout to a file.
On Bash (Mac / Linux etc):
./myProgram > myFile.txt
or on Windows
myProgram.exe > myFile.txt

Been a while since I did this, but IIRC there is a freopen that lets you open a file at given handle. If you open myfile.txt at 1, everything you write to stdout will go there.

You can use the tee command (standard on *nix; cmd.exe has no built-in tee, though PowerShell offers the similar Tee-Object) - this allows output to be sent to both the standard output and a named file.
./myProgram | tee myFile.txt