open() file O_NONBLOCK gets lost in kernel module - C

I am opening a file in my C program:
pcm->dfd = open(fname, O_RDONLY|O_NONBLOCK);
and later call select() and read() on it.
But my problem is that the O_NONBLOCK flag gets lost somewhere:
ssize_t my_read(struct file *filp, char __user *user_buffer, size_t bytes_requested, loff_t *capture_ptr)
{
    if (filp->f_flags & O_NONBLOCK) {
        LOGI("mode: O_NONBLOCK");
    } else {
        LOGI("mode: BLOCKING"); // <-- this is printed
    }
    ..
}
I also tried
pcm->dfd=open(fname, O_RDONLY|O_NONBLOCK);
// O_NONBLOCK does not work :/
int flags = fcntl(pcm->dfd, F_GETFL, 0);
fcntl(pcm->dfd, F_SETFL, flags | O_NONBLOCK);
It's not a logging problem; the driver also behaves as if in blocking mode.
Does anyone have an idea?
EDIT:
The code which reads from the opened file is absolutely simple:
size=read(pcm->dfd,inBuffer,inBufferBytes);
I also checked whether there is another fcntl() somewhere else in the program, but there is none.
EDIT 2:
Could it be that O_NONBLOCK has a different value in my user program (Android NDK) than in the kernel? I searched for O_NONBLOCK in the kernel headers and there are already two different definitions.
I also checked the open implementation in my kernel module, and already there filp->f_flags does not contain O_NONBLOCK.

According to the open(2) man page, passing O_NONBLOCK only makes the open call itself non-blocking (which you probably don't want). It does not imply that the opened file descriptor will also be in non-blocking mode -- you have to set that with fcntl() after opening.
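As a hedged sketch of that suggestion (pcm->dfd is the descriptor from the question; the verification printf is an illustrative addition, and <stdio.h> plus <fcntl.h> are assumed to be included), setting the flag after open() and re-reading it from user space could look like this:

/* Sketch: set O_NONBLOCK with fcntl() after open() and verify that it stuck. */
int flags = fcntl(pcm->dfd, F_GETFL, 0);
if (flags == -1 || fcntl(pcm->dfd, F_SETFL, flags | O_NONBLOCK) == -1)
    perror("fcntl");

flags = fcntl(pcm->dfd, F_GETFL, 0);
printf("O_NONBLOCK is %s (userspace O_NONBLOCK = 0x%x)\n",
       (flags != -1 && (flags & O_NONBLOCK)) ? "set" : "NOT set", O_NONBLOCK);

Printing the userspace value of O_NONBLOCK also addresses the concern in EDIT 2: it can be compared directly against the value the kernel module tests in filp->f_flags.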

Related

Linux named fifo non-blocking read select returns bogus read_fds

Similar to the problem asked a while ago on kernel 3.x, but I'm seeing it on 4.9.37.
The named FIFO is created with mkfifo -m 0666. On the read side it is opened with:
int fd = open(FIFO_NAME, O_RDONLY | O_NONBLOCK);
The resulting fd is passed into a call to select(). Everything works OK until I run echo >> <fifo-name>.
Now the fd appears in the read_fds after select() returns. A read() on the fd will return one byte of data. So far so good.
The next time select() is called and returns, the fd still appears in the read_fds, but read() will always return zero, meaning no data. Effectively the read side consumes 100% of the processor capacity. This is exactly the same problem as observed in the referenced question.
Has anybody seen the same issue? And how can it be resolved or worked-around properly?
I've figured out that if I close the read end of the FIFO and re-open it, it works properly. That is probably OK because we are not sending a lot of data, though it is not a nice or general work-around.
This is expected behaviour, because the end-of-input case causes a read() to not block; it returns 0 immediately.
If you look at man 2 select, it says clearly that a descriptor in readfds is set if a read() on that descriptor would not block (at the time of the select() call).
If you used poll(), it too would immediately return with POLLHUP in revents.
As OP notes, the correct workaround is to reopen the FIFO.
Because the Linux kernel maintains exactly one internal pipe object to represent each open FIFO (see man 7 fifo and man 7 pipe), the robust approach in Linux is to open another descriptor to the FIFO whenever an end of input is encountered (read() returning 0), and close the original. During the time when both descriptors are open, they refer to the same kernel pipe object, so there is no race window or risk of data loss.
In pseudo-C:
fifoflags = O_RDONLY | O_NONBLOCK;

fifofd = open(fifoname, fifoflags);
if (fifofd == -1) {
    /* Error checking */
}

/* ... */

/* select() readfds contains fifofd, or
   poll() returns POLLIN for fifofd: */
n = read(fifofd, buffer, sizeof buffer);
if (!n) {
    /* A writer has closed the FIFO: reopen it, then close the old descriptor. */
    int tempfd = open(fifoname, fifoflags);
    if (tempfd == -1) {
        const int cause = errno;
        close(fifofd);
        /* Error handling */
    }
    close(fifofd);
    fifofd = tempfd;
} else {
    /* Handling for the other read() result cases */
}
The file descriptor allocation policy in Linux is such that tempfd will be the lowest-numbered free descriptor.
On my system (Core i5-7200U laptop), reopening a FIFO in this way takes less than 1.5 µs. That is, it can be done about 680,000 times a second. I do not think this reopening is a bottleneck for any sensible scenario, even on low-powered embedded Linux machines.

How to see the error of open()

I am working with pipes and one pipe won't open, even though mkfifo() was successful.
I have this:
/* create the FIFO (named pipe) */
int ret_mk = mkfifo(out_myfifo, 0666);
if (ret_mk < 0) {
    perror(out_myfifo);
    unlink(out_myfifo);
    return -1;
}
printf("ret_mk = %d\n", ret_mk);

/* write to the FIFO */
out_fd = open(out_myfifo, O_WRONLY);
printf("out_fd = %d\n", out_fd);
but nothing gets printed after open(); even a print of random text won't show up.
From here we have:
The open() function returns an integer value, which is used to refer to the file. If unsuccessful, it returns -1, and sets the global variable errno to indicate the error type.
What can I do to see why it won't open?
Read fifo(7). For FIFOs, an open call may block. To make open(2) non-blocking, use O_NONBLOCK in the flag argument:
out_fd = open(out_myfifo, O_WRONLY | O_NONBLOCK);
if (out_fd < 0) perror(out_myfifo);
printf("%d\n", out_fd);
But usually you want a blocking open for write on a FIFO, because some other process should open the same FIFO for reading (and you want your writing process to wait for that to happen).
Notice that there is no way to poll(2) for the event that someone else has opened the other end of your FIFO (because poll wants an already-opened file descriptor). See also inotify(7); you might also want to use unix(7) sockets.
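As a hedged sketch of that inotify(7) suggestion (the helper name wait_for_fifo_open and the single blocking read are illustrative, not part of the original answer), a writer could wait for the FIFO path to be opened before doing its blocking open for write:

#include <unistd.h>
#include <sys/inotify.h>

/* Sketch: block until some process open()s the FIFO at fifopath.
 * Note that IN_OPEN fires for any opener, including ourselves later. */
int wait_for_fifo_open(const char *fifopath)
{
    char buf[4096];
    int infd = inotify_init();
    if (infd == -1)
        return -1;
    if (inotify_add_watch(infd, fifopath, IN_OPEN) == -1) {
        close(infd);
        return -1;
    }
    /* Blocks until an IN_OPEN event arrives for the FIFO. */
    ssize_t len = read(infd, buf, sizeof buf);
    close(infd);
    return (len > 0) ? 0 : -1;
}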
BTW, you could also use strace(1) for debugging purposes.
See also intro(2) and Advanced Linux Programming.

What is the purpose of calling fcntl() with the file descriptor as -1 and cmd as F_GETFL?

I am trying to understand what this line of code means:
flags = fcntl(-1,F_GETFL,0);
The usual reason for calling fcntl() with the F_GETFL command is to get the current flags so that you can modify them and set them again with fcntl() and F_SETFL; the alternative reason for calling fcntl() with F_GETFL is to find out the characteristics of the file descriptor. You can find out which flags can be manipulated by reading (rather carefully) the description of <fcntl.h>. The flags include:
O_APPEND — Set append mode.
O_DSYNC — Write according to synchronized I/O data integrity completion.
O_NONBLOCK — Non-blocking mode.
O_RSYNC — Synchronized read I/O operations.
O_SYNC — Write according to synchronized I/O file integrity completion.
Plus (POSIX 2008) O_ACCMODE which can then be used to distinguish O_RDONLY, O_RDWR, and O_WRONLY, if I'm reading the referenced pages correctly.
However, it makes no sense whatsoever to call fcntl() with a definitively invalid file descriptor such as -1. All that happens is that the function returns -1 indicating failure and sets errno to EBADF (bad file descriptor).
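By contrast, with a valid descriptor the call is useful. A minimal sketch (fd stands for any already-open descriptor; <stdio.h> and <fcntl.h> are assumed to be included):

/* Sketch: inspect a valid descriptor's status flags with F_GETFL. */
int flags = fcntl(fd, F_GETFL, 0);
if (flags == -1) {
    perror("fcntl(F_GETFL)");
} else {
    switch (flags & O_ACCMODE) {
    case O_RDONLY: puts("read-only");  break;
    case O_WRONLY: puts("write-only"); break;
    case O_RDWR:   puts("read-write"); break;
    }
    if (flags & O_APPEND)   puts("append mode");
    if (flags & O_NONBLOCK) puts("non-blocking");
}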
Assuming we are talking about the function described by man 2 fcntl:
flags = fcntl(-1,F_GETFL,0);
tries to perform some action on an invalid file descriptor (-1) and therefore will never do anything other than return -1 and set errno to EBADF.
I'd say you can safely replace this line with:
flags = -1; errno = EBADF;
The fcntl() function performs various actions on open descriptors. Its syntax is:
int fcntl(int descriptor,
          int command, ...);
Read about the return value: if fcntl() returns -1, it was not successful, and the errno global variable is set to indicate the error.
This code:
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

int main(void) {
    int flags;
    if ((flags = fcntl(-1, F_GETFL, 0)) < 0) {
        perror("fcntl: ");
    }
    printf("\n %d\n", flags);
    return 0;
}
output is:
~$ gcc xx.c
~$ ./a.out
fcntl: : Bad file descriptor
-1
Notice that the printed flags value is -1, which indicates an unsuccessful call of fcntl(-1, F_GETFL, 0), because -1 is not a valid file descriptor; valid file descriptors start from 0. That is why perror() prints the error message Bad file descriptor (EBADF).
Note: I ran this code on a Linux system.
Edit:
F_GETFL is the "get file status flags" command in fcntl().

Open new device descriptor with same options as reference descriptor

I have an open device descriptor where I don't know the device name or the options passed to open(...).
I want to open a new device descriptor with the same options passed to open.
int newFd = copy(referenceFd);
where copy would do the job. dup() is certainly the wrong choice, as a further ioctl() on newFd would also affect referenceFd; therefore I want to open a genuinely new descriptor.
Is there a system call which provides such functionality?
I have not been able to find anything yet.
You can probably do it with a series of fcntl calls:
F_GETFD - Get the file descriptor flags, defined in <fcntl.h>, that are associated with the file descriptor fildes.
F_GETFL - Get the file status flags and file access modes, defined in <fcntl.h>, for the file description associated with fildes.
I linked the SUSv4 page above; you might also be interested in the Linux version.
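As a hedged illustration (fd stands for the existing descriptor; <stdio.h> and <fcntl.h> are assumed to be included), querying both flag sets might look like the snippet below. Note that this only recovers the flags; to open a truly independent descriptor you still need a path, for example via /proc/self/fd as shown in the next answer.

/* Sketch: query descriptor flags (F_GETFD) and status flags / access mode (F_GETFL). */
int fdflags = fcntl(fd, F_GETFD);   /* e.g. FD_CLOEXEC */
int stflags = fcntl(fd, F_GETFL);   /* e.g. O_NONBLOCK, O_APPEND, O_ACCMODE bits */
if (fdflags == -1 || stflags == -1)
    perror("fcntl");
else
    printf("descriptor flags: %#x, status flags: %#x\n", fdflags, stflags);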
First, get the file status flags of file descriptor fd using
int flags = fcntl(fd, F_GETFL);
Then, reopen the descriptor using the Linux-specific /proc/self/fd/fd entry:
int newfd;
char buffer[32];

if (snprintf(buffer, sizeof buffer, "/proc/self/fd/%d", fd) < sizeof buffer) {
    do {
        newfd = open(buffer, flags & ~(O_TRUNC | O_EXCL), 0666);
    } while (newfd == -1 && errno == EINTR);
    if (newfd == -1) {
        /* Error: Cannot reopen file. Error in errno. */
    }
} else {
    /* Error: the path does not fit in the buffer. */
}
The reopened file descriptor is then in newfd.
Note that you most likely want to make sure O_TRUNC (truncate the file) and O_EXCL (fail if exists) are not in the flags, to avoid the cases where reopening using the exact original flags would cause unwanted results.
You do not want to use lstat(), because that opens up a race condition with respect to rename -- a user renaming the target file between the lstat() and the open() in the above code. The above avoids that completely.
If you don't want to be Linux-specific, add code that tries the same with /dev/fd/fd if the above fails. It is supported in some Unix systems (and most Linux distros too).
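A hedged sketch of that fallback (the helper name reopen_fd is hypothetical; error handling is reduced to returning -1): try /proc/self/fd first and /dev/fd second.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Sketch: reopen an existing descriptor via /proc/self/fd, falling back to /dev/fd. */
int reopen_fd(int fd, int flags)
{
    static const char *const fmts[] = { "/proc/self/fd/%d", "/dev/fd/%d" };
    char path[32];
    int newfd = -1;

    for (size_t i = 0; newfd == -1 && i < sizeof fmts / sizeof fmts[0]; i++) {
        snprintf(path, sizeof path, fmts[i], fd);
        do {
            newfd = open(path, flags & ~(O_TRUNC | O_EXCL), 0666);
        } while (newfd == -1 && errno == EINTR);
    }
    return newfd;  /* -1 with errno set if both attempts failed */
}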

Pseudo-terminal (pty) reporting Resource Temporarily Unavailable

I have a pseudo-terminal slave that's giving me a read/write error of Resource Temporarily Unavailable (11). I have been unable to solve this problem, but up until a week ago I didn't know anything about ptys. So I might be missing something obvious.
From what I have read, this can be caused by calling read() on a non-blocking pty. However, when I check the F_GETFL after I open() the slave pty, the value shows that it is a blocking file descriptor.
The output of F_GETFL shows that the O_NONBLOCK flag is disabled, and the O_RDWR flag is enabled:
printf("F_GETFL: %x\n", fcntl( slavefd, F_GETFL)); // outputs F_GETFL: 2
I have even tried treating slavefd as a non-blocking file by using select() to determine when it's ready, but it just times out every time.
So, why does read() set errno to Resource Temporarily Unavailable if slavefd is set to blocking? Do the F_GETFL flags look correct? What else can I try to narrow down the cause of this problem?
Update: (more info)
I'm not sure yet, but I think the pty slave device node is being locked somehow by pppd. I have been told that you can echo into a pty slave, which seems to be true except when pppd is using it.
Update 2: (code added)
if (argc != 2)
    return;

printf("opening %s\n", argv[1]);
slavefd = open(argv[1], O_RDWR);
if (slavefd < 0)
    return;
This update shows how I'm opening the slave device. Since I am using this application for debugging, I'm just directly using argv[1].
PROBLEM SOLVED:
The slave node that I was attempting to read/write to was being modified by pppd. When pppd takes control of a tty/pty device, it changes the line discipline from N_TTY to N_PPP. This means that when you open() and then read() or write() to the slave node, the PPP intermediate driver is used instead of the TTY driver. So a read() or write() boils down to a totally different function. Looking at the N_PPP driver I found the following, which answers my question as to why EAGAIN was being returned.
/*
 * Read does nothing - no data is ever available this way.
 * Pppd reads and writes packets via /dev/ppp instead.
 */
static ssize_t
ppp_asynctty_read(struct tty_struct *tty, struct file *file,
                  unsigned char __user *buf, size_t count)
{
    return -EAGAIN;
}

/*
 * Write on the tty does nothing, the packets all come in
 * from the ppp generic stuff.
 */
static ssize_t
ppp_asynctty_write(struct tty_struct *tty, struct file *file,
                   const unsigned char *buf, size_t count)
{
    return -EAGAIN;
}
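As a hedged diagnostic sketch (slavefd is the descriptor from the question; N_TTY and N_PPP are assumed to be available from <linux/tty.h>), one could query which line discipline is currently attached with the TIOCGETD ioctl before reading:

#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/tty.h>   /* N_TTY, N_PPP */

/* Sketch: report whether pppd has switched the line discipline away from N_TTY. */
static void report_ldisc(int slavefd)
{
    int ldisc;
    if (ioctl(slavefd, TIOCGETD, &ldisc) == -1) {
        perror("ioctl(TIOCGETD)");
        return;
    }
    if (ldisc == N_PPP)
        printf("line discipline is N_PPP: reads/writes go through the PPP driver\n");
    else if (ldisc == N_TTY)
        printf("line discipline is N_TTY (normal tty)\n");
    else
        printf("line discipline: %d\n", ldisc);
}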
