How do the statements inside this IF statement work? - c

I just recently started my programming education within inter-process communication, and this piece of code was written in the parent process's code section. From what I have read about write(), it returns -1 if it failed, 0 if nothing was written to the pipe, and a positive integer if successful. How exactly does sizeof(value) help us identify this? Isn't if(write(request[WRITE], &value, sizeof(value)) < 1) a much more reader-friendly alternative to the sizeof(value) comparison?
if (sizeof(value) != write(request[WRITE], &value, sizeof(value)))
{
    perror("Cannot write thru pipe.\n");
    return 1;
}
Code clarification: The variable value holds a digit entered in the parent process, which the parent then sends to the child process through a pipe so the child can do some arithmetic operation on it.
Any help or clarification on the subject is very much appreciated.
Edit: How do I highlight my system functions here when asking questions?

This also catches a successful but partial write, which the application wants to treat as a failure.
It's slightly easier to read without the pointless parentheses:
if(write(request[WRITE], &value, sizeof value) != sizeof value)
So, for instance, if value is an int, it might occupy 4 bytes, but if write() writes only 2 of them, it will return 2, which is caught by this test.
At least in my opinion. Remember that sizeof is not a function.

That's not a read, that's a write. The principle is almost the same, but there's a bit of a twist.
As a general rule you are correct: write() could return a "short count", indicating a partial write. For instance, you might ask to write 2000 bytes to some file descriptor, and write might return a value like 1024 instead, indicating that 976 (2000 - 1024) bytes were not written but no actual error occurred. (This occurs, for instance, when receiving a signal while writing on a "slow" device like a tty or pty. Of course, the application must decide what to do about the partial write: should it consider this an error? Should it retry the remaining bytes? It's pretty common to wrap the write in a loop that retries the remaining bytes in case of short counts; the stdio fwrite code does this, for instance.)
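For illustration, here is one rough sketch of such a retry loop, handling both short counts and interruption by a signal (the name write_all is mine, not from the original code):
#include <errno.h>
#include <unistd.h>

/* Sketch: keep calling write() until all len bytes are out or a real
 * error occurs. Retries on EINTR and on short counts. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted before writing anything: retry */
            return -1;           /* real error; errno is set */
        }
        p += n;                  /* skip past the bytes that were written */
        left -= (size_t)n;
    }
    return (ssize_t)len;         /* everything was written */
}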
With pipes, however, there's a special case: writes of sufficiently small size (less than or equal to PIPE_BUF) are atomic. So assuming sizeof(value) <= PIPE_BUF, and that this really is writing on a pipe, this code is correct: write will return either sizeof(value) or -1.
(If sizeof(value) is 1, the code is correct—albeit misleading—for any descriptor: write never returns zero. The only possible return values are -1 and some positive value between 1 and the number of bytes requested-to-write, inclusive. This is where read and write are not symmetric with respect to return values: read can, and does, return zero.)

Related

Is there a way to test for end of file in less than 3 syscalls?

I want to test whether the given file's position, referenced by fd, is at the end of file, e.g. current position == file size. Is there a way to do this in fewer than 3 syscalls? The 3 calls being:
Get the current position with lseek
lseek to the end of file and store that position (i.e. the file size)
Compare the two, and if they're different, lseek back to the original position (this sequence is sketched in code below).
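For reference, the three-call sequence above might be sketched like this (the function name is mine):
#include <sys/types.h>
#include <unistd.h>

/* Sketch of the three-call sequence: returns 1 at EOF, 0 otherwise,
 * -1 on error. */
int at_eof_3calls(int fd)
{
    off_t cur = lseek(fd, 0, SEEK_CUR);   /* 1: remember the current position */
    off_t end = lseek(fd, 0, SEEK_END);   /* 2: position at end == file size */

    if (cur == (off_t)-1 || end == (off_t)-1)
        return -1;

    if (cur != end)
        lseek(fd, cur, SEEK_SET);         /* 3: seek back to where we were */

    return cur == end;
}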
You can test for end-of-file in just one syscall: a single read! If it returns 0, you're at end-of-file. If it doesn't, you weren't.
...and, of course, if it returns greater than 0, you're not where you were any more, so this might not be a good solution. But if your primary task was reading the file, then the data you've just read with your one read call is quite likely to be data you wanted anyway.
In a comment you said that code that merely calls read can be "convoluted and produce code that is harder to work with", and I kind of know what you mean. I can vaguely remember, once or twice in my career, wishing I could know whether the next read was going to succeed, before I had to do it. But that was just once or twice. The vast, vast majority of the time, for me at least, code that just reads reads reads until one read call returns 0 ends up being perfectly natural and straightforward.
Addendum:
There's some pseudocode from K&R that always sticks with me, for the basic version of grep that they introduce as an example in a fairly early chapter:
while (there's another line) {
    if (line contains pattern) {
        print it;
    }
}
That's for line-based input, but the more-general pattern
while (there's some input)
    process it;
has equal merit, and the fleshing-out to an actual read call doesn't involve that big a change:
while ((n = read(fd, buf, bufsize)) > 0) {
    process n bytes from buf;
}
At first the embedded read-and-test (that is, the assignment to n and the test against 0, buried in the single control expression of the while loop) used to really bug me; it seemed unnecessarily cryptic. But it really, really does encapsulate the "while there's input / process it" idiom rather perfectly, or at least it does given a C/Unix-style read call that can only indicate EOF after you call it.
(This is by contrast to Pascal-style I/O, which does indicate EOF before you call it, and is, or used to be, a prime motivator for all the questions that led to Why is while( !feof(file) ) always wrong? being a canonical SO question. Brian Kernighan has a description, probably in Why Pascal Is Not My Favorite Programming Language, of how frustratingly difficult and unnatural it is to implement a Pascal-style input methodology that can explicitly indicate EOF before it happens.)
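In case it helps, here is the loop above fleshed out into a complete, cat-style sketch that just copies standard input to standard output (the exact buffer size is arbitrary):
#include <unistd.h>

/* "while there's some input, process it" -- here the processing is just
 * copying the bytes back out to standard output. */
int main(void)
{
    char buf[4096];
    ssize_t n;

    while ((n = read(STDIN_FILENO, buf, sizeof buf)) > 0) {
        if (write(STDOUT_FILENO, buf, n) != n)
            return 1;           /* treat a short write as failure, as above */
    }
    return n < 0 ? 1 : 0;       /* n == 0 means we hit end-of-file cleanly */
}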
If you have a file descriptor, you can use fstat() to get the size of the file:
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

struct stat sb;

/* Upon successful completion, 0 shall be returned.
 * Otherwise, -1 shall be returned and
 * errno set to indicate the error.
 */
if (fstat(fd, &sb) == -1) {
    perror("fstat()");
    /* Handle error here */
}

off_t size = sb.st_size;
Then call lseek() to get the current location.
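Putting the two together, a rough two-syscall sketch (reusing the sb filled in by fstat() above):
off_t pos = lseek(fd, 0, SEEK_CUR);   /* current position, without moving it */
if (pos == (off_t)-1) {
    perror("lseek()");
    /* Handle error here */
}
int at_eof = (pos == sb.st_size);     /* nonzero if we're at the end of file */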
But as noted in the comments:
"This is essentially impossible with any number of system calls because, whether a test tells you the position is or is not at the end of file at one moment, another process could truncate or extend the file at the next moment. No result will be reliable the moment after it is obtained" — Eric Postpischil

The magic of STREAMS in Linux. When to finish?

Today at 5am I read an article about the read system call, and things became significantly clearer for me.
ssize_t read(int fd, void *buf, size_t count);
The design of *nix-like operating systems is amazing in its simplicity: a file interface for any entity. Just ask for some data from this fd to be placed into some memory via the buf pointer. It's all the same for networks, files, and streams.
But some question appears.
How do I distinguish between two cases?
1) The stream is empty and I need to wait for new data. 2) The stream is closed and the program needs to finish.
Here is a scenario:
Reading data from STDIN in a loop, where STDIN is redirected from a pipe.
Some text_data appears.
Do I just read byte by byte until... what? An EOF marker in memory, or 0 as the result of the read call?
How will the program know whether to wait for new input or to exit?
This is unclear in the case of endless or continuous streams.
UPD: After speaking with @Bailey Kocin and reading some docs I have this understanding. Correct me if I'm wrong.
read holds the program's execution and waits for count bytes.
When count bytes appear, read writes them into buf and execution continues.
When the stream is closed, read returns 0, and that is a signal that the program may finish.
Question: Does EOF appear in buf?
UPD2: EOF is a constant that can appear in the output of the getc function:
int ch = getc(fp);
while (ch != EOF) {
    /* display contents of file on screen */
    putchar(ch);
    ch = getc(fp);
}
But in the case of read, the EOF value does not appear in buf. The read system call signals the end of the file by returning 0, instead of writing the EOF constant into the data area as getc does.
EOF is a constant that may vary between systems, and it is used with getc.
Let's deal first with your original question. Note that man 7 pipe should give some useful information on this.
Say we have the standard input redirected to the input side of a descriptor created by a pipe call, as in:
pipe(p);
// ... fork a child to write to the output side of the pipe ...
dup2(p[0], 0); // redirect standard input to input side
and we call:
bytes = read(0, buf, 100);
First, note that this behaves no differently than simply reading directly from p[0], so we could have just done:
pipe(p);
// fork child
bytes = read(p[0], buf, 100);
Then, there are essentially three cases:
1) If there are bytes in the pipe (i.e., at least one byte has been written but not yet read), then the read call will return immediately, and it will return all bytes available up to a maximum of 100 bytes. The return value will be the number of bytes read, and it will always be a positive number between 1 and 100.
2) If the pipe is empty (no bytes) and the output side has been closed, the buffer won't be touched, and the call will return immediately with a return value of 0.
3) Otherwise, the read call will block until something is written to the pipe or the output side is closed, and then the read call will return immediately using the rules in cases 1 and 2.
So, if a read() call returns 0, that means the end-of-file was reached, and no more bytes are expected. Waiting for additional data happens automatically, and after the wait, you'll either get data (positive return value) or an end-of-file signal (zero return value). In the special case that another process writes some bytes and then immediately closes (the output side of) the pipe, the next read() call will return a positive value up to the specified count. Subsequent read() calls will continue to return positive values as long as there's more data to read. When the data are exhausted, the read() call will return 0 (since the pipe is closed).
On Linux, the above is always true for pipes and any positive count. There can be differences for things other than pipes. Also, if the count is 0, the read() call will always return immediately with return value 0. Note that, if you are trying to write code that runs on platforms other than Linux, you may have to be more careful. An implementation is allowed to return a non-zero number of bytes less than the number requested, even if more bytes are available in the pipe -- this might mean that there's an implementation-defined limit (so you never get more than 4096 bytes, no matter how many you request, for example) or that this implementation-defined limit changes from call to call (so if you request bytes over a page boundary in a kernel buffer, you only get the end of the page or something). On Linux, there's no limit -- the read call will always return everything available up to count, no matter how big count is.
Anyway, the idea is that something like the following code should reliably read all bytes from a pipe until the output side is closed, even on platforms other than Linux:
#define _GNU_SOURCE 1
#include <errno.h>
#include <unistd.h>
/* ... */
while ((count = TEMP_FAILURE_RETRY(read(fd, buffer, sizeof(buffer)))) > 0) {
    // process "count" bytes in "buffer"
}
if (count == -1) {
    // handle error
}
// otherwise, end of data reached
If the pipe is never closed ("endless" or "continuous" stream), the while loop will run forever because read will block until it can return a non-zero byte count.
Note that the pipe can also be put into a non-blocking mode which changes the behavior substantially, but the above is the default blocking mode behavior.
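For reference, here is a rough sketch of that non-blocking mode; in it, a read() on an empty pipe returns -1 with errno set to EAGAIN (or EWOULDBLOCK) instead of blocking. The fd and buffer names are assumed from the surrounding code:
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Sketch: switch fd to non-blocking mode, then try one read(). */
int flags = fcntl(fd, F_GETFL, 0);
if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
    /* handle fcntl error */
}

ssize_t n = read(fd, buffer, sizeof(buffer));
if (n > 0) {
    /* got n bytes of data */
} else if (n == 0) {
    /* write side closed: end of data */
} else if (errno == EAGAIN || errno == EWOULDBLOCK) {
    /* pipe is empty right now; come back later (or use poll/select) */
} else {
    /* real error */
}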
With respect to your UPD questions:
Yes, read holds the program's execution until data is available, but no, it doesn't necessarily wait for count bytes. It will wait for at least one non-empty write to the pipe, and that will wake the process; when the process gets a chance to run, it will return whatever is available, up to but not necessarily equal to count bytes. Usually this means that if another process writes 5 bytes, a blocked read(fd, buffer, 100) call will return 5 and execution will continue.
Yes, if read returns 0, it's a signal that there's no more data to be read and that the write side of the pipe has been closed (so no more data will ever be available).
No, an EOF value does not appear in the buffer. Only the bytes read will appear there, and the buffer won't be touched when read() returns 0, so it will contain whatever was there before the read() call.
With respect to your UPD2 comment:
Yes, on Linux, EOF is a constant equal to the integer -1. (Technically, according to the C99 standard, it is an integer constant equal to a negative value; maybe someone knows of a platform where it's something other than -1.) This constant is not used by the read() interface, and it is certainly not written into the buffer. While read() returns -1 in case of error, it would be considered bad practice to compare the return value from read() with EOF instead of -1. As you note, the EOF value is really only used for C library functions like getc() and getchar() to distinguish the end of file from a successfully read character.

Handling large size of Read operation

I am interposing a read operation with my own implementation of read that prints some log and calls the libc read. I am wondering what the right way is to handle a read with a huge nbyte parameter. Since nbyte is size_t, what is the right way to handle an out-of-range read request? From the read manpage:
If the value of nbyte is greater than {SSIZE_MAX}, the result is implementation-defined
What does this mean, and if I have to handle a large read request, what should I do?
Don't change the behavior of the read() call - just wrap the OS-provided call and allow it to do what it does.
ssize_t read(int fd, void *buf, size_t bytes)
{
    ssize_t result;

    /* ... log whatever you want about the request here ... */

    result = real_read(fd, buf, bytes);   /* real_read: the underlying libc read */

    /* ... log the result here ... */

    return result;
}
What could you possibly do if you're implementing a 64-bit library and a caller passes you a size_t value that's greater than SSIZE_MAX? You can't split that up into anything reasonable anyway.
And if you're implementing a 32-bit library, how would you pass the proper result back if you did split up the read?
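For what it's worth, if the wrapper is being injected with LD_PRELOAD, one common way to reach the underlying libc read() is dlsym(RTLD_NEXT, ...). A rough sketch of the whole wrapper follows; the logging format is made up, and you may need -ldl when linking, depending on your glibc:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <unistd.h>

/* Sketch of an LD_PRELOAD interposer: log the request, then forward it
 * unchanged to the real read() resolved on first use. */
ssize_t read(int fd, void *buf, size_t bytes)
{
    static ssize_t (*real_read)(int, void *, size_t);

    if (real_read == NULL)
        real_read = (ssize_t (*)(int, void *, size_t))dlsym(RTLD_NEXT, "read");

    fprintf(stderr, "read(fd=%d, nbytes=%zu)\n", fd, bytes);
    return real_read(fd, buf, bytes);
}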
You could break up the one large request into several smaller ones.
Besides, SSIZE_MAX is positively huge. Are you really sure you need to read >2GB of data, in one go?
You could simply use strace(1) to get some logs of your read syscalls.
In practice the read count is the size of some buffer (in memory), so it is very unusual for it to be bigger than a dozen megabytes. It is often some kilobytes.
So I believe you should not care about the SSIZE_MAX limit in real life.
The last parameter of read is the buffer size. It's not the number of bytes to read.
So:
If the buffer size you received is less than or equal to SSIZE_MAX, call the read syscall with that buffer size.
If the buffer size you received is greater than SSIZE_MAX, read SSIZE_MAX bytes.
If the read syscall returns -1, return -1 too.
If the read syscall returns 0 or less than SSIZE_MAX, return the sum of the bytes read.
If the read call returns exactly SSIZE_MAX, decrement the received buffer size by SSIZE_MAX
and loop (go back to "So").
Do not forget to adjust the buffer pointer and to count the total number of bytes read; a sketch of this loop follows below.
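A sketch of that loop (the name read_large and the early exit on a short read are my choices, not from the answer; note also that a total above SSIZE_MAX could not be represented in the return type anyway):
#include <limits.h>
#include <unistd.h>

/* Sketch: split an oversized request into chunks of at most SSIZE_MAX
 * and accumulate the number of bytes actually read. */
ssize_t read_large(int fd, void *buf, size_t bufsize)
{
    char  *p     = buf;
    size_t total = 0;

    while (bufsize > 0) {
        size_t  chunk = bufsize > (size_t)SSIZE_MAX ? (size_t)SSIZE_MAX : bufsize;
        ssize_t n     = read(fd, p, chunk);

        if (n == -1)
            return total > 0 ? (ssize_t)total : -1;  /* error */
        if (n == 0)
            break;                                   /* end of file */

        total   += (size_t)n;
        p       += n;
        bufsize -= (size_t)n;

        if ((size_t)n < chunk)
            break;      /* short read: stop and report what we have so far */
    }
    return (ssize_t)total;
}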
Being implementation-defined means that there is no single correct answer, and callers should never do this (because they can't be certain how it will be handled). Given that you are interposing the syscall, I suggest you just assert(3) that the value is in range. If you end up failing that assert somewhere, fix the calling code to be compliant.

Why does fwrite not return 0 when the hard disk is full in Linux?

When the hard disk is 100% full, fwrite (of, say, 1000 bytes) returns 0 [fail, as expected].
But when the hard disk has a little empty space, say 600 bytes, then fwrite(1000 bytes) does not
return 0 [fail] but returns, say, 300 bytes. Calling fwrite again still returns 300 bytes; fwrite never fails, even if we call it 100 times.
errno is set properly to 28. My question is: why is this the behaviour of fwrite? Is this right? If fwrite returns fewer bytes than we wanted to write, does this mean the disk is full?
Any suggestion to handle this situation?
From the fwrite man page:
On success, fread() and fwrite() return the number of items read or written. This number equals the number of bytes transferred only when size is 1. If an error occurs, or the end of the file is reached, the return value is a short item count (or zero).
So, a short write is an indication of an error, and errno should be consulted to see why the write did not complete. As Mat's answer explains (and the C standard says basically the same thing, except that it uses the term "indeterminate"), the file position is unspecified after an error, so trying to continue to use the stream after a failure is not well defined.
POSIX says this about fwrite:
If an error occurs, the resulting value of the file-position indicator for the stream is unspecified.
This means that repeating that same fwrite on the same stream is an error in your code - you don't know where the stream is positioned, so attempting to write to it is not a good idea at all.
You handle this like it seems you're doing already: check errno after a short fwrite and do whatever is appropriate for your application.
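A rough sketch of that check (fp is assumed to be an open stream and data a char array; ENOSPC is the errno value 28 mentioned in the question):
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Sketch: detect a short fwrite() and look at errno to see why. */
size_t want = sizeof data;
size_t done = fwrite(data, 1, want, fp);

if (done < want && ferror(fp)) {
    if (errno == ENOSPC)
        fprintf(stderr, "disk full after %zu of %zu bytes\n", done, want);
    else
        fprintf(stderr, "write failed: %s\n", strerror(errno));
    /* The stream position is now unspecified, so don't simply retry the
     * same fwrite() on this stream. */
}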

C Unix socket programming, ensuring read/write byte counts?

I'm writing client and server programs and I'm looking for a way to ensure all bytes are read and all bytes are sent when using read() or write() to/from the sockets opened by the client/server.
I'm assuming I'll have to use a loop to check the number of bytes returned from the read or write functions.
Something like this probably:
#define BUFFER 20
char buffer[BUFFER];
while (I haven't read all bytes from the buffer) {
    int bytesRead = read(theSocket, myWord, BUFFER);
}
And how would I ensure that all the bytes I am trying to transmit using write() have been transmitted?
Thanks for any help!
Yes, exactly like that. Typical read logic goes like this:
1) Call read.
2) Did we get EOF or error? If so, return.
3) Did we receive all the bytes? If so, return.
4) Go to step 1.
Note that when you call read, you'll need to pass it a pointer to the buffer after the data that was already read, and you'll need to try to read an appropriate amount of bytes that won't overflow the buffer. Also, how you tell if you received all the bytes depends on the protocol.
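For a fixed-length message, a rough sketch of that read loop might look like this (the name read_full and the expected parameter are mine; how you know expected depends on your protocol):
#include <unistd.h>

/* Sketch: keep reading until 'expected' bytes have arrived, the peer
 * closes the connection (EOF), or an error occurs. */
ssize_t read_full(int sock, char *buffer, size_t expected)
{
    size_t have = 0;

    while (have < expected) {
        ssize_t n = read(sock, buffer + have, expected - have);
        if (n == 0)
            break;              /* EOF: the peer closed the connection */
        if (n < 0)
            return -1;          /* error; check errno */
        have += (size_t)n;
    }
    return (ssize_t)have;
}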
To write:
1) Call write, passing it a pointer to the first unwritten byte and the number of unwritten bytes.
2) Did we get zero or error? If so, return.
3) Did we write all the bytes? If so, return.
4) Go to step 1.
Note that you have to adjust appropriately for blocking or non-blocking sockets. For example, for non-blocking sockets, you have to handle EWOULDBLOCK.
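And a matching sketch of the write side (a blocking socket is assumed; a non-blocking one would also need EWOULDBLOCK handling as noted above):
#include <unistd.h>

/* Sketch: keep writing until all 'len' bytes have been sent or an
 * error occurs, advancing the pointer past what was already written. */
ssize_t write_full(int sock, const char *buffer, size_t len)
{
    size_t sent = 0;

    while (sent < len) {
        ssize_t n = write(sock, buffer + sent, len - sent);
        if (n <= 0)
            return -1;          /* error (or zero); check errno */
        sent += (size_t)n;
    }
    return (ssize_t)sent;
}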

Resources