EAGAIN on a blocking read system call on regular file


This is a strange case that I see occasionally and cannot explain.
We have a C program that reads from a regular file, while other processes write to the same file. The application relies on writes of up to 4096 bytes being atomic on Linux.
The file is NOT opened with the non-blocking flag, so I assume that reads block.
But sometimes during startup we see the "Resource temporarily unavailable" error set in errno, and the value returned by read() is not -1 but a partial read size.
An error message looks something like this:
2018-08-07T06:40:52.991141Z, Invalid message size, log_s.bin, fd 670, Resource temporarily unavailable, read size 285, expected size 525
My questions are:
Why are we getting EAGAIN on blocking file read?
Why is the return value not -1?
This happens only during the initial period after startup. It works fine thereafter. What edge cases can get us into such a situation?

Why are we getting EAGAIN on blocking file read ?
You aren't (see below).
Why is the return value not -1 ?
Because the operation did not fail.
The value of errno only carries a sane value if the call to read() failed. A call to read() failed if and only if -1 is returned.
From the Linux man-page for read():
RETURN VALUE
On success, the number of bytes read is returned (zero indicates end
of file), and the file position is advanced by this number. It is
not an error if this number is smaller than the number of bytes
requested;
[...]
On error, -1 is returned, and errno is set appropriately.
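To make the point concrete, here is a minimal sketch (not the asker's actual code) of a single read() call whose result is classified before errno is ever looked at. The names read_once and enum read_status are made up for this illustration; the only assumption is a POSIX system.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of any library. */
enum read_status { READ_FULL, READ_SHORT, READ_EOF, READ_FAILED };

/*
 * Calls read() once and classifies the outcome.
 * *nread receives the byte count (0 on EOF or failure).
 * *err receives errno only when read() returned -1; otherwise it is 0,
 * because errno carries no meaningful value after a successful call.
 */
static enum read_status read_once(int fd, void *buf, size_t want,
                                  size_t *nread, int *err)
{
    assert(buf != NULL && nread != NULL && err != NULL);

    *nread = 0;
    *err = 0;

    ssize_t result = read(fd, buf, want);
    if (result == -1) {
        *err = errno;           /* the only case where errno is valid */
        return READ_FAILED;
    }
    if (result == 0)
        return READ_EOF;

    *nread = (size_t)result;
    return (*nread == want) ? READ_FULL : READ_SHORT;
}
```

With this shape, a log line like the one in the question would report a short read of 285 out of 525 bytes without attaching a stale "Resource temporarily unavailable" to it.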
A common pattern for calling read() would be
char buffer[BUFFER_MAX];
char * p = buffer;
size_t to_read = ...; /* not larger than BUFFER_MAX! */
while (to_read > 0)
{
    ssize_t result = read(..., p, to_read);
    if (-1 == result)
    {
        /* Only here does errno carry a meaningful value. */
        if (EAGAIN == errno || EWOULDBLOCK == errno)
        {
            continue;
        }
        if (EINTR == errno)
        {
            continue; /* or break depending on application design. */
        }
        perror("read() failed");
        exit(EXIT_FAILURE);
    }
    else if (0 < result)
    {
        /* Short or full read: advance past the bytes received so far. */
        to_read -= (size_t) result;
        p += (size_t) result;
    }
    else if (0 == result) /* end of file / connection shut down for reading */
    {
        break;
    }
    else
    {
        fprintf(stderr, "read() returned the unexpected value of %zd. You probably hit a (kernel) bug ... :-/\n", result);
        exit(EXIT_FAILURE);
    }
}
if (0 < to_read)
{
    fprintf(stderr, "Encountered early end of stream. %zu bytes not read.\n", to_read);
}

Related

read from a pipe: How to handle the return values in c?

I'm writing a program to read from a pipe and I want to know what is the correct way of handling the return values. According to the read man page,
On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal.
I'm worried about the case where it may read only half of the data. Also, what is the correct way to handle the case when the return value is zero?
Here is my sample code.
struct day
{
int date;
int month;
};
while(1)
{
ret = select(maxfd+1, &read_fd, NULL, &exc_fd,NULL);
if(ret < 0)
{
perror("select");
continue;
}
if(FD_ISSET(pipefd[0], &read_fd))
{
struct day new_data;
if((ret = read(pipefd[0], &new_data, sizeof(struct day)))!= sizeof(struct day))
{
if(ret < 0)
{
perror("read from pipe");
continue;
}
else if(ret == 0)
{
/*how to handle?*/
}
else
/* truncated read. How to handle?*/
}
}
...
}
I believe read() cannot read more data than the size specified. Please correct me if I'm wrong.
Please help me with handling the return value of read.

When you read, you request a given amount of data, but nothing guarantees that as much data is available as you requested. For example, you may hit the end of the file, or the writer may not have written that much data to your pipe. So read returns what was actually read: the number of bytes read is returned (zero indicates end of file).
If read returns a strictly positive number, that many bytes were stored in your buffer.
If read returns 0, that means end of file. For a regular file, you are currently at the end of the file. For a pipe, you have already read all the data and there is no writer left on the other end (so no more bytes will ever be written), so you can close the now useless pipe.
If read returns -1, an error happened, and you must consult the errno variable to determine the cause.
So, a general schema could be something like:
n = read(descriptor,buffer,size);
if (n==0) { // EOF
close(descriptor);
} else if (n==-1) { // error
switch(errno) { // consult documentations for possible errors
case EAGAIN: // blahblah
break;
}
} else { // available data
// exploit data from buffer[0] to buffer[n-1] (included)
}

If read returns 0, then your process has read all of the data that will come from that file descriptor. Take it out of read_fd and if it was the maxfd reset maxfd to the newest max. Depending on what your process is doing, you might have other cleanup to do as well. If you get a short read, then either process the data you've received or discard it or store it until you get all the data and can process it.
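For the fixed-size struct day records in the question, the "store it until you get all the data" option could be sketched as below. This assumes a blocking descriptor and that it is acceptable to wait inside the helper until the rest of the record arrives; read_full is a name invented for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Record layout taken from the question. */
struct day
{
    int date;
    int month;
};

/*
 * Reads until `want` bytes have arrived, EOF is reached, or read() fails.
 * Returns the number of bytes stored in buf (fewer than `want` means EOF
 * came first), or -1 with errno set if read() failed.
 */
static ssize_t read_full(int fd, void *buf, size_t want)
{
    assert(buf != NULL);
    char *p = buf;
    size_t got = 0;

    while (got < want) {
        ssize_t n = read(fd, p + got, want - got);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;          /* real error, errno is set */
        }
        if (n == 0)
            break;              /* EOF: every writer closed the pipe */
        got += (size_t)n;       /* truncated read: keep what we have */
    }
    return (ssize_t)got;
}
```

In the select() loop, a return value of sizeof(struct day) means a complete record, 0 means the writer is gone and the descriptor can be removed from read_fd, and anything in between is a record cut off by EOF.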
It's hard to give more specific answers to a very general question.

linux c socket error: Input/output error

I get an "Input/output error" when I try to send data to a TCP server. What does this mean in terms of sockets? It's basically the same code I have always used, and it worked fine. I was hoping someone could tell me what causes an input/output error when sending over a socket and how I can check for and fix it. Any help is appreciated.
struct SOCKETSTUS {
int sendSockFd;
int recvSockFd;
short status;
long heartBeatSendTime;
long heartBeatRecTime;
long loginPackSendTime;
};
struct SOCKETSTUS sockArr[128];
if (tv.tv_sec - sockArr[i].heartBeatSendTime >= beatTim)
{
if (send(sockArr[i].sendSockFd, szBuffer, packetSize, 0) != packetSize)
{
fprintf(stderr, "Heartbeat package send failed：[%d][%s]\n", errno, strerror(errno));
if (errno == EBADF || errno == ECONNRESET || errno == ENOTCONN || errno == EPIPE)
{
Debug("link lose connection\n"); Reconn(i); continue;
}
}
else
{
sockArr[i].heartBeatSendTime = tv.tv_sec;
if (sockArr[i].status == SOCK_IN_FLY)
sockArr[i].heartBeatRecTime = tv.tv_sec;
}
}
The error occurred in the send() calls.

Your error check is incorrect. send() returns the number of bytes sent, or -1 on error. You check only whether the return value equals packetSize, not whether it indicates an error. Sometimes send() on a stream socket will return fewer bytes than requested.
In that case send() did not fail, so it did not set errno, and the value you print is stale: some previous syscall (perhaps a harmlessly failed tty manipulation? a dodgy signal handler?) set errno to EIO.
Change your code to treat -1 differently from a "short" send.
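A minimal sketch of that separation, assuming a blocking stream socket: send_all is a hypothetical helper (it is not in the asker's code) that keeps sending after a short count and only reports failure when send() itself returns -1.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sends exactly len bytes on a stream socket.
 * Returns 0 on success, or -1 if send() failed; in that case errno was
 * set by the failing send() call and is safe to inspect.
 */
static int send_all(int sockfd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(sockfd, p, len, 0);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted: retry */
            return -1;          /* genuine failure */
        }
        /* A short send is not an error: advance and send the rest. */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The heartbeat code would then call strerror(errno) and test for ECONNRESET, EPIPE and so on only when send_all() returns -1.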

Is there a cleaner way to use the write() function reliably?

I read the man pages, and my understanding is that if write() fails and sets the errno to EAGAIN or EINTR, I may perform the write() again, so I came up with the following code:
ret = 0;
while(ret != count) {
write_count = write(connFD, (char *)buf + ret, count);
while (write_count < 0) {
switch(errno) {
case EINTR:
case EAGAIN:
write_count = write(connFD, (char *)buf + ret, count -ret);
break;
default:
printf("\n The value of ret is : %d\n", ret);
printf("\n The error number is : %d\n", errno);
ASSERT(0);
}
}
ret += write_count;
}
I am performing read() and write() on sockets and handling the read() similarly as above. I am using Linux, with gcc compiler.

You have a bit of a "don't repeat yourself" problem there - there's no need for two separate calls to write, nor for two nested loops.
My normal loop would look something like this:
for (int n = 0; n < count; ) {
int ret = write(fd, (char *)buf + n, count - n);
if (ret < 0) {
if (errno == EINTR || errno == EAGAIN) continue; // try again
perror("write");
break;
} else {
n += ret;
}
}
// if (n < count) here some error occurred

EINTR and EAGAIN handling should often be slightly different. EAGAIN is always some kind of transient error representing the state of the socket buffer (or perhaps, more precisely, that your operation may block).
Once you've hit an EAGAIN you'd likely want to sleep a bit or return control to an event loop (assuming you're using one).
With EINTR the situation is a bit different. If your application is receiving signals non-stop, then it may be an issue in your application or environment, and for that reason I tend to have some kind of internal eintr_max counter so I am not stuck in the theoretical situation where I just continue infinitely looping on EINTR.
Alnitak's answer (sufficient for most cases) should also be saving errno somewhere, as it may be clobbered by perror() (although it may have been omitted for brevity).
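Roughly what I mean, as a sketch rather than a drop-in replacement: EINTR_MAX and write_bounded are names I made up, the cap of 16 is arbitrary, and the descriptor is assumed to be blocking, so EAGAIN is treated as an error here instead of being handed back to an event loop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical cap on consecutive EINTR retries; tune for the application. */
#define EINTR_MAX 16

/*
 * Writes count bytes to a blocking descriptor.
 * Returns the number of bytes written; a value below count means a failure,
 * and errno then holds the error from the failing write() call.
 */
static size_t write_bounded(int fd, const void *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    const char *p = buf;
    size_t done = 0;
    int eintr_seen = 0;

    while (done < count) {
        ssize_t ret = write(fd, p + done, count - done);
        if (ret < 0) {
            int saved = errno;          /* perror() may clobber errno */
            if (saved == EINTR && ++eintr_seen <= EINTR_MAX)
                continue;               /* bounded retry */
            perror("write");
            errno = saved;              /* restore for the caller */
            break;
        }
        eintr_seen = 0;                 /* progress made: reset the counter */
        done += (size_t)ret;
    }
    return done;
}
```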

I would prefer to poll the descriptor in case of EAGAIN instead of just busy looping and burning up CPU for no good reason. This is kind of a "blocking wrapper" for a non-blocking write I use:
ssize_t written = 0;
while (written < to_write) {
ssize_t result;
if ((result = write(fd, buffer, to_write - written)) < 0) {
if (errno == EAGAIN) {
struct pollfd pfd = { .fd = fd, .events = POLLOUT };
if (poll(&pfd, 1, -1) <= 0 && errno != EAGAIN) {
break;
}
continue;
}
return written ? written : result;
}
written += result;
buffer += result;
}
return written;
Note that I'm not actually checking the results of poll other than the return value; I figure the following write will fail if there is a permanent error on the descriptor.
You may wish to include EINTR as a retryable error as well by simply adding it to the conditions with EAGAIN, but I prefer it to actually interrupt I/O.

Yes, there are cleaner ways to use write(): the class of write functions taking a FILE* as an argument. That is, most importantly, fprintf() and fwrite(). Internally, these library functions use the write() syscall to do their job, and they handle stuff like EAGAIN and EINTR.
If you only have a file descriptor, you can always wrap it into a FILE* by means of fdopen(), so you can use it with the functions above.
However, there is one pitfall: FILE* streams are usually buffered. This can be a problem if you are communicating with some other program and are waiting for its response. This may deadlock both programs even though there is no logical error, simply because fprintf() decided to defer the corresponding write() a bit. You can switch the buffering off, or fflush() output streams whenever you actually need the write() calls to be performed.
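As a small sketch of the fflush() variant: send_line is a hypothetical helper, and the stream is assumed to come from fdopen(fd, "w") on a descriptor you already own.

```c
#define _POSIX_C_SOURCE 200809L /* for fdopen() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Sends one line through a stdio stream and flushes it immediately.
 * `out` is assumed to come from fdopen(fd, "w").
 * Returns 0 on success, -1 if formatting or flushing failed.
 */
static int send_line(FILE *out, const char *line)
{
    assert(out != NULL && line != NULL);

    if (fprintf(out, "%s\n", line) < 0)
        return -1;
    /* Without this, the data may sit in the stdio buffer indefinitely. */
    if (fflush(out) == EOF)
        return -1;
    return 0;
}
```

Alternatively, calling setvbuf(out, NULL, _IONBF, 0) right after fdopen() switches buffering off for that stream altogether.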

write() and send() solving errors => difference?

Is there any difference in error handling between these two functions?
This question led me to another one: is the number of characters always the same as the number of bytes?
For more info: I use them in C on Linux for TCP socket communication (sys/socket.h).
Thanks for your responses.
send()
write()
Return:
write():
On success, the number of bytes written are returned (zero indicates nothing was written). On error, -1 is returned, and errno is set appropriately. If count is zero and the file descriptor refers to a regular file, 0 will be returned without causing any other effect. For a special file, the results are not portable.
send():
The calls return the number of characters sent, or -1 if an error occurred.
A Stack Overflow question says that these two functions should behave the same when the flags argument is zero.
int client_sockfd;
char* msg;
int length = strlen(msg);
//first option
if(send(client_sockfd, msg, length, 0) != length) return 1;
else return 0;
//second option
if(write(client_sockfd, msg, length) != length) return 1;
else return 0;

They will both return the same number of written bytes (== characters in this case), EXCEPT note this:
If the message is too long to pass atomically through the underlying protocol, the
error EMSGSIZE is returned, and the message is not transmitted.
In other words, depending on the size of the data being written, write() may succeed where send() may fail.

Number of bytes == number of characters, since the C standard requires that char be a 1-byte integer.
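One caveat worth illustrating: "character" here means a C char. sizeof(char) is 1 by definition, so strlen() gives exactly the byte count that send() and write() expect, but a user-visible character in a multibyte encoding such as UTF-8 can span several chars. The helper below is a made-up illustration that assumes valid UTF-8 input; it is not a validator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Counts UTF-8 code points in a NUL-terminated string by skipping
 * continuation bytes (those of the form 10xxxxxx).
 * Assumes the input is valid UTF-8.
 */
static size_t utf8_code_points(const char *s)
{
    assert(s != NULL);
    size_t count = 0;

    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the two-byte sequence "\xc3\xa9" (U+00E9), strlen() reports 2 while utf8_code_points() reports 1. The length passed to send() or write() must be the strlen() value.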
write():
Yes, it returns the number of bytes written. But: it's not always an error if it doesn't return as many bytes as it should have written. Especially not for TCP communication. A socket may be nonblocking or simply busy, in which case you'll need to write the not-yet-written bytes again. This behavior can be achieved like this:
char *buf = (however you acquire your byte buffer);
ssize_t len = (total number of bytes to be written out);
while (len > 0)
{
ssize_t written = write(sockfd, buf, len);
if (written < 0)
{
/* now THAT is an error */
break;
}
len -= written;
buf += written; /* tricky pointer arithmetic */
}
read():
Same applies here, with the only difference that EOF is indicated by returning 0, and it's not an error. Again, you have to retry reading if you want to receive all the available data from a socket.
int readbytes = 0;
char buf[512];
do {
readbytes = read(sockfd, buf, 512);
if (readbytes < 0)
{
/* error */
break;
}
if (readbytes > 0)
{
/* process your freshly read data chunk */
}
} while (readbytes > 0); /* until EOF */
You can see my implementation of a simple TCP helper class using this technique at https://github.com/H2CO3/TCPHelper/blob/master/TCPHelper.m

Is a return value of 0 from write(2) in C an error?

In the man page for the system call write(2) -
ssize_t write(int fd, const void *buf, size_t count);
it says the following:
Return Value
On success, the number of bytes
written are returned (zero indicates
nothing was written). On error, -1 is
returned, and errno is set
appropriately. If count is zero and
the file descriptor refers to a
regular file, 0 may be returned, or an
error could be detected. For a special
file, the results are not portable.
I would interpret this to mean that returning 0 simply means that nothing was written, for whatever arbitrary reason.
However, Stevens in UNP treats a return value of 0 as a fatal error when dealing with a file descriptor that is a TCP socket ( this is wrapped by another function which calls exit(1) on a short count ):
ssize_t /* Write "n" bytes to a descriptor. */
writen(int fd, const void *vptr, size_t n)
{
size_t nleft;
ssize_t nwritten;
const char *ptr;
ptr = vptr;
nleft = n;
while (nleft > 0) {
if ( (nwritten = write(fd, ptr, nleft)) <= 0) {
if (nwritten < 0 && errno == EINTR)
nwritten = 0; /* and call write() again */
else
return(-1); /* error */
}
nleft -= nwritten;
ptr += nwritten;
}
return(n);
}
He only treats 0 as a legit return value if the errno indicates that the call to write was interrupted by the process receiving a signal.
Why?

Stevens probably does this to catch old implementations of
write() that behaved differently. For instance, the Single Unix Spec
says (http://www.opengroup.org/onlinepubs/000095399/functions/write.html)
Where this volume of IEEE Std
1003.1-2001 requires -1 to be returned and errno set to [EAGAIN], most
historical implementations return zero

This will ensure that the code does not spin indefinitely, even if the file descriptor is not a TCP socket or unexpected non-blocking flags are in effect. On some systems, certain legacy non-blocking modes (e.g. O_NDELAY) cause write() to return 0 (without setting errno) if no data can be written without blocking, at least for certain types of file descriptors. (The POSIX standard O_NONBLOCK uses an error return for this case.) And some of the non-blocking modes on some systems apply to the underlying object (e.g. socket, fifo) rather than the file descriptor, and so could even have been enabled by another process having an open file descriptor for the same object. The code protects itself from spinning in such a situation by simply treating it as an error, since it is not intended for use with non-blocking modes.
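To see the protection in action without a legacy system at hand, here is a sketch of the same loop with the write call made injectable. writen_with, write_fn and always_zero are hypothetical names for this illustration; always_zero merely imitates a write() that reports "would block" by returning 0 without setting errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Signature matching write(2), so a fake can be substituted. */
typedef ssize_t (*write_fn)(int fd, const void *buf, size_t count);

/* Writes n bytes via do_write; returns n on success, -1 on failure. */
static ssize_t writen_with(write_fn do_write, int fd, const void *vptr, size_t n)
{
    assert(do_write != NULL);
    const char *ptr = vptr;
    size_t nleft = n;

    while (nleft > 0) {
        ssize_t nwritten = do_write(fd, ptr, nleft);
        if (nwritten < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: call again */
            return -1;
        }
        if (nwritten == 0)
            return -1;          /* no progress: fail rather than spin */
        nleft -= (size_t)nwritten;
        ptr += nwritten;
    }
    return (ssize_t)n;
}

/* Stand-in for a legacy write() that returns 0 when it would block. */
static ssize_t always_zero(int fd, const void *buf, size_t count)
{
    (void)fd;
    (void)buf;
    (void)count;
    return 0;
}
```

If the zero case were treated as "try again", writen_with(always_zero, ...) would never return; treating it as an error makes the call come back with -1 immediately.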

Also, and just to be somewhat pedantic here, if you are not writing to a socket, I would check to make sure that the buffer length ("count" in the first example) is actually being calculated correctly. In the Stevens example, you wouldn't even execute the write() call if the buffer length was 0.

As your man page says, the return value of 0 is "not portable" for special files. Sockets are special files, so the result could mean something different for them.
Usually for sockets, a return value of 0 from read() is an indication that the peer has closed the connection; after receiving 0, a subsequent write() on that socket will typically return -1 with an error code such as EPIPE.
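The read() side of this can be sketched as follows, assuming a blocking stream socket. drain_until_eof is a name invented for this example; it discards the data and only counts it.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Reads and discards data until read() returns 0, which on a stream
 * socket means the peer has closed its end.
 * Returns the number of bytes seen before that, or -1 if read() failed
 * (errno is set).
 */
static ssize_t drain_until_eof(int sockfd)
{
    char buf[512];
    ssize_t total = 0;

    for (;;) {
        ssize_t n = read(sockfd, buf, sizeof buf);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted: retry */
            return -1;
        }
        if (n == 0)
            return total;       /* orderly shutdown by the peer */
        total += n;
    }
}
```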