Infinite read(2) with piped command - c

I'd like some clarification about read(2) behaviour on Linux systems. I'm writing a shell and I ran into a problem while reading the input.
I do something like:
read(0, BUF, 4096);
With bash (and most shells) you can pipe a command like this:
echo ls | bash
Bash executes ls only once, but when I do the same with my shell, read always returns the same buffer, "ls", so it goes into an infinite loop.
I wanted to know why read(2) never returns 0 for me. Thank you.

You need to examine read's return value. If it returns 0, then it reached the end of the input stream. That is, instead of this:
read(0, BUF, 4096);
you need to write:
int bytes_read = read(0, BUF, 4096);
and then check if bytes_read is zero.
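A minimal sketch of that loop, keeping the question's own call (assuming BUF is a 4096-byte array; the processing step is a placeholder):

#include <unistd.h>

char BUF[4096];
ssize_t bytes_read;

/* read() returns the number of bytes read, 0 at end of input, -1 on error. */
while ((bytes_read = read(0, BUF, sizeof(BUF))) > 0) {
    /* handle exactly bytes_read bytes of BUF here, e.g. parse the command */
}

When the pipe from echo is exhausted, read() returns 0 and the loop ends instead of spinning.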

You're probably not clearing the buffer after each read. If there's nothing left to read, read() will return 0 and write nothing to the buffer. If the buffer isn't cleared, whatever was in it beforehand will still be there, which is why you keep getting your infinite ls: there's still only the original "ls" in there, but you keep treating it as new input.
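One simple way to avoid acting on stale buffer contents, sketched under the assumption that BUF is treated as a string: terminate it with the byte count that read() returned before using it.

ssize_t bytes_read = read(0, BUF, sizeof(BUF) - 1);
if (bytes_read > 0) {
    BUF[bytes_read] = '\0';   /* only the bytes just read are valid */
    /* parse and execute the command in BUF */
}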

Related

how to echo std output exactly in C

I'm making essentially a terminal wrapper. I want my program to read from stdin, and for any inputs that come in, echo it EXACTLY the way it is, even if it's wrong.
Right now, I'm doing:
FILE *output = popen(buffer, "r");
memset(buffer, '\0', BUFF_SIZE * 2);
while (fgets(buffer, sizeof(buffer), output) != 0)
{
    printf("%s", buffer);
    memset(buffer, '\0', BUFF_SIZE * 2);
}
pclose(output);
where I'm calling shell commands using popen. In certain situations, for example when the command is not found, the output returned through popen isn't exactly the same as what the terminal prints without the wrapper. For example, if I input $ asd, the Linux terminal will return:
No command 'asd' found, but there are 24 similar ones
asd: command not found
whereas popen will return:
sh: 1: asd not found
I would like to have the default terminal response rather than what popen returns, would this be possible? If so, how?
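For context, popen(3) runs its command through /bin/sh -c, which is why the message comes from sh. A hedged sketch of one way to route the command through bash and also capture its diagnostics (the cmd buffer is illustrative; the quoting is naive and breaks on input containing single quotes, and the "similar ones" suggestion comes from an interactive Ubuntu hook, so it may still not appear):

char cmd[BUFF_SIZE * 2 + 32];   /* assumes BUFF_SIZE from the snippet above */

/* Hand the user's input to bash and merge stderr into the captured stream. */
snprintf(cmd, sizeof(cmd), "bash -c '%s' 2>&1", buffer);
FILE *output = popen(cmd, "r");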

Buffering while popen working

I'm using the following command to get a constant output of a two digit value:
pipe = popen("hcidump -a | egrep --line-buffered 'RSSI|bdaddr' | grep -A1 --line-buffered --no-group-separator 'bdaddr 78:A5:04:17:9F:66' | grep -Po --line-buffered 'RSSI:\\s+\\K.*'", "r");
I want to buffer that, so I can turn it into an integer value and work with it, but I'm not sure how to achieve that. As long as popen is running, my C program will not go on. I checked it with strace: it keeps reading the value inside popen and never terminates.
The rest of the code looks like this:
if (pipe)
{
    printf("entered pipe-if");
    while (!feof(pipe))
    {
        if (fgets(buffer, 128, pipe) != NULL) {}
    }
    pclose(pipe);
    printf("pclose");
    buffer[strlen(buffer)-1] = '\0';
}
The idea behind it is that I want to use the data to calculate a distance in "realtime". A possibility would be to tell popen to end the process after x seconds, then use the buffered data, and then start the process again from the beginning.
Thanks for help and advice.
After using popen to open the pipe you should work with the file descriptor rather than the FILE pointer. You can then turn the file to non-blocking and process the data as it comes in.
int fd = fileno(pipe);
fcntl(fd, F_SETFL, O_NONBLOCK);
Then you can read data from the pipe using
bytes = read(fd, buf, bufsize);
If bytes is greater than 0, then you have some more data to process. If bytes is -1 and errno is EAGAIN, then there's nothing in the pipe right now. Anything else and you're done. You'll have to deal with the data however it comes in (i.e. you don't get fgets() nicely handing you one line at a time).
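A minimal sketch of that approach, assuming pipe is the FILE pointer returned by popen() above (the buffer size and the parsing step are illustrative):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int fd = fileno(pipe);                                /* descriptor behind the popen stream */
fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);  /* switch it to non-blocking */

char buf[128];
ssize_t bytes;

for (;;) {
    bytes = read(fd, buf, sizeof(buf));
    if (bytes > 0) {
        /* process bytes of raw data here, e.g. scan for the RSSI number */
    } else if (bytes == -1 && errno == EAGAIN) {
        /* nothing in the pipe right now; do other work, then try again */
    } else {
        break;                                        /* 0 means EOF, anything else is an error */
    }
}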

C - popen not showing right output

anyone know how I can fix this?
char bash_cmd[256] = "curl";
char buffer[1000];
FILE *pipe;
int len;
pipe = popen(bash_cmd, "r");
if (NULL == pipe) {
    perror("pipe");
    exit(1);
}
fgets(buffer, sizeof(buffer), pipe);
printf("OUTPUT: %s", buffer);
pclose(pipe);
The above code snippet is returning the following:
OUTPUT: (�3B
instead of what it should be returning which is:
curl: try 'curl --help' or 'curl --manual' for more information
Something is wrong and I can't figure out what. When I replace "curl" with, say, "ls -la", it works fine, but for whatever reason, only when I use curl does it fail to save the output into buffer properly. What could I do to fix this? Thanks in advance.
Also, replacing "curl" with the full path to curl, (/usr/bin/curl) doesn't work either. ;(
When I run your code, I find that the output is indeed approximately what you describe, but that the output you expect is also printed immediately before it. It seems highly likely, therefore, that curl is printing the usage message to its stderr rather than to its stdout, as indeed it should do.
You do not check the return value of fgets(); I suspect you would find that it is NULL, indicating that the end of the stream occurred before any data was read. In that case, I do not think fgets() modifies the provided buffer.
If you want to capture curl's stderr in addition to its stdout, then you can apply I/O redirection to the problem:
char bash_cmd[256] = "curl 2>&1";
That would not work (directly) with the execve()-family functions, but popen() runs the given command via a shell, which should handle the redirection operator just fine.
For general purposes, however, combining curl's output and error streams may not be what you want. If both real output and real diagnostics were emitted then they would be intermingled.
The output you expect from curl is going to stderr not stdout. In fact nothing is written to stdout. The output you are printing is the uninitialized contents of the buffer.
Your code should check the return value of fgets, which will be null if no characters were read (or if an error occurred).
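A hedged sketch combining both suggestions, i.e. redirecting stderr and checking the fgets() return value (the redirection works because popen() runs the command via the shell):

#include <stdio.h>
#include <stdlib.h>

char bash_cmd[256] = "curl 2>&1";   /* merge curl's stderr into the pipe */
char buffer[1000];
FILE *pipe;

pipe = popen(bash_cmd, "r");
if (NULL == pipe) {
    perror("popen");
    exit(1);
}
if (fgets(buffer, sizeof(buffer), pipe) != NULL) {
    printf("OUTPUT: %s", buffer);   /* only print data that was actually read */
}
pclose(pipe);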

stdout stream changes order after redirection?

I've been learning from APUE recently, and the result of a typical example confused me. Here is the sample code, sample.c:
#include "apue.h"
#include <stdio.h>
#define BUFF_SZ 4096
int main()
{
int n = 0;
char buff[BUFF_SZ] = {'\0'};
while ((n = read(STDIN_FILENO, buff, BUFF_SZ)) > 0) {
printf("read %d bytes\n", n);
if (write(STDOUT_FILENO, buff, n) != n) {
err_sys("write error");
}
}
if (n < 0) {
err_sys("read error");
}
return 0;
}
After compiling with gcc sample.c, you can run echo Hello | ./a.out and get the following output on the terminal:
read 6 bytesHello
However, if you redirect the output to a file with echo Hello | ./a.out > outfile and then use cat outfile to see the content:
Helloread 6 bytes
The output changes order after redirection! I wonder if someone could tell me the reason?
For the standard I/O function printf, when you output to a terminal, the standard output is by default line buffered.
printf("read %d bytes\n", n);
The \n here causes the output to be flushed.
However, when you output to a file, it's by default fully buffered. The output won't flush unless the buffer is full, or you explicitly flush it.
The low level system call write, on the other hand, is unbuffered.
In general, intermixing standard I/O calls with system calls is not advised.
printf(), by default, buffers its output, while write() does not, and there is no synchronisation between them.
So, in your code, it is possible that printf() stores its data in a buffer and returns, then write() is called, and, as main() returns, printf()'s buffer is flushed so that the buffered output appears. From your description, that is what happens when output is redirected.
It is also possible that printf() writes its data immediately, before write() is called. From your description, that is what happens when output is not redirected.
Typically, one part of redirection of a stream is changing the buffer - and therefore the behaviour when buffering - for streams like stdout and stdin. The precise change depends on what type of redirection is happening (e.g. to a file, to a pipe, to a different display device, etc).
Imagine that printf() writes data to a buffer and, when flushing that buffer, uses write() to produce output. That means all overt calls of write() will have their output produced immediately, but data that is buffered may be printed out of order.
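To see the buffering point in the question's own code, one option is to flush stdout explicitly before each raw write(); a sketch of that change (it demonstrates the explanation above rather than being the only possible fix):

printf("read %d bytes\n", n);
fflush(stdout);                       /* push the stdio buffer out before the raw write */
if (write(STDOUT_FILENO, buff, n) != n) {
    err_sys("write error");
}

Alternatively, calling setvbuf(stdout, NULL, _IONBF, 0) at the start of main() turns off stdout buffering entirely.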
The problem is that the writes are handled directly by the write(2) call, so you effectively lose control over when the output actually appears.
If we look at the documentation for write(2), we can see that a successful write() is not a guarantee that the data has actually been written out. More specifically:
A successful return from write() does not make any guarantee that data has
been committed to disk. In fact, on some buggy implementations, it does not even
guarantee that space has successfully been reserved for the data. The only way to
be sure is to call fsync(2) after you are done writing all your data.
This means that depending on the implementation and buffering of the write(2) (which may differ even between redirects and printing to screen), you can get different results.

read() from stdin

Consider the following line of code:
while((n = read(STDIN_FILENO, buff, BUFSIZ)) > 0)
As per my understanding, the read/write functions are part of unbuffered I/O. So does that mean the read() function will read only one character per call from stdin? In other words, will the value of n be
-1 in case of error
n = 0 in case of EOF
1 otherwise
If that is not the case, when will the above read() call return, and why?
Note: I was also thinking that read() would wait until it successfully reads BUFSIZ characters from stdin. But what happens when fewer than BUFSIZ characters are available to read? Will read wait forever, or until EOF arrives (Ctrl+D on Unix or Ctrl+Z on Windows)?
Also, let's say BUFSIZ = 100 and stdin is A followed by Ctrl+D (i.e. EOF immediately after a single character). How many times will the while loop iterate?
The way read() behaves depends on what is being read. For regular files, if you ask for N characters, you get N characters if they are available, less than N if end of file intervenes.
If read() is reading from a terminal in canonical/cooked mode, the tty driver provides data a line at a time. So if you tell read() to get 3 characters or 300, read will hang until the tty driver has seen a newline or the terminal's defined EOF key, and then read() will return with either the number of characters in the line or the number of characters you requested, whichever is smaller.
If read() is reading from a terminal in non-canonical/raw mode, read will have access to keypresses immediately. If you ask read() to get 3 characters it might return with anywhere from 0 to 3 characters depending on input timing and how the terminal was configured.
read() will behave differently in the face of signals, returning with less than the requested number of characters, or -1 with errno set to EINTR if a signal interrupted the read before any characters arrived.
read() will behave differently if the descriptor has been configured for non-blocking I/O. read() will return -1 with errno set to EAGAIN or EWOULDBLOCK if no input was immediately available. This applies to sockets.
So as you can see, you should be ready for surprises when you call read(). You won't always get the number of characters you requested, and you might get non-fatal errors like EINTR, which means you should retry the read().
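As an illustration of that last point, a hedged sketch of a small wrapper (not from the question) that retries when a signal interrupts the call before any data arrives:

#include <errno.h>
#include <unistd.h>

/* Behaves like read(2), but transparently restarts on EINTR. */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;
    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);
    return n;
}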
Your code reads:
while((n = read(0, buff, BUFSIZ) != 0))
This is flawed - the parentheses mean it is interpreted as:
while ((n = (read(0, buff, BUFSIZ) != 0)) != 0)
where the boolean condition is evaluated before the assignment, so n will only obtain the values 0 (the condition is not true) and 1 (the condition is true).
You should write:
while ((n = read(0, buff, BUFSIZ)) > 0)
This stops on EOF or a read error, and n lets you know which condition you encountered.
Apparently, the code above was a typo in the question.
Unbuffered I/O will read up to the number of characters you asked for (but not more). It may read less on account of EOF or an error. It may also read less because less is available at the time of the call. Consider a terminal; typically, that will only read up to the end of line because there isn't any more available than that. Consider a pipe; if the feeding process has generated 128 unread bytes, then if BUFSIZ is 4096, you'll only get 128 bytes from the read. A non-blocking file descriptor may return because nothing is available; a socket may return fewer bytes because there isn't more information available yet; a disk read may return fewer bytes because there are fewer than the requested number of bytes left in the file when the read is performed.
In general, though, read() won't return just one byte if you request many bytes.
As the read() manpage states:
Return Value
On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. On error, -1 is returned, and errno is set appropriately. In this case it is left unspecified whether the file position (if any) changes.
So, each read() will read up to the number of specified bytes; but it may read less. "Non-buffered" means that if you specify read(fd, bar, 1), read will only read one byte. Buffered IO attempts to read in quanta of BUFSIZ, even if you only want one character. This may sound wasteful, but it avoids the overhead of making system calls, which makes it fast.
read attempts to get all of the characters requested.
If EOF happens before all of the requested characters can be returned, it returns what it got.
After that, the next read returns 0, to let you know you have reached the end of the file.
What happens when it tries to read and there is nothing there yet involves something called blocking. You can open a file for blocking or non-blocking reads; "blocking" means the call waits until there is something to return.
This is what you see in a shell waiting for input. It sits there until you hit return.
Non-blocking means that read returns immediately if there is no data: it returns -1 and sets errno to something like EAGAIN or EWOULDBLOCK, which lets you know why no bytes came back. That is not necessarily a fatal error.
Your code can test for a return of 0 to detect end of file and a negative return to detect errors.
When we say read is unbuffered, it means no buffering takes place at the level of your process after the data is pulled off the underlying open file description, which is a potentially-shared resource. If stdin is a terminal, there are likely at least 2 additional buffers in play, however:
The terminal buffer, which can probably hold 1-4k of data off the line until your program reads it.
The kernel's cooked/canonical mode buffer for line entry/editing on a terminal, which lets the user perform primitive editing (backspace, backword, erase line, etc.) on the line until it's submitted (to the buffer described above) by pressing enter.
read will pull whatever has already been submitted, up to the max read length you passed to it, but it cannot pull anything from the line editing buffer. If you want to disable this extra layer of buffering, you need to look up how to disable cooked/canonical mode for a terminal using tcsetattr, etc.
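A hedged sketch of what that looks like with tcsetattr() (error handling omitted; it only illustrates the flags involved):

#include <termios.h>
#include <unistd.h>

struct termios t;

tcgetattr(STDIN_FILENO, &t);          /* start from the current settings */
t.c_lflag &= ~(ICANON | ECHO);        /* no canonical line editing, no echo */
t.c_cc[VMIN] = 1;                     /* read() returns once at least 1 byte is available */
t.c_cc[VTIME] = 0;                    /* no inter-byte timeout */
tcsetattr(STDIN_FILENO, TCSANOW, &t);

After this, read() sees keypresses as they are typed instead of waiting for a newline.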
