I use timerfd with zmq.
How can I use timerfd_create and timerfd_settime to wait one second for the timer (https://man7.org/linux/man-pages/man2/timerfd_create.2.html)?
I have read the man page, but I still do not understand how to initialize a timer that ticks once per second with create and settime. This is exactly my task:
We start a timer with timerfd_create() that ticks once per second. When a timer is set with timer_set_(..), a
counter is simply incremented, and it is decremented on every tick. When the counter reaches 0, the timer
has expired.
In this project we have a function timer_set_(), in which the timer is set up with timerfd_create() and timerfd_settime(). I hope you can help me.
This is my progress (part of my code):
struct itimerspec timerValue;
g_items[n].socket = nullptr;
g_items[n].events = ZMQ_POLLIN;
g_items[n].fd = timerfd_create(CLOCK_REALTIME, 0);
if(g_items[n].fd == -1 ){
printf("timerfd_create() failed: errno=%d\n", errno);
return -1;
}
timerValue.it_value.tv_sec = 1;
timerValue.it_value.tv_nsec = 0;
timerValue.it_interval.tv_sec = 1;
timerValue.it_interval.tv_nsec = 0;
timerfd_settime(g_items[n].fd, 0, &timerValue, NULL);
The question is about how to set the timer's timeouts correctly.
With the settings
timerValue.it_value.tv_sec = 1;
timerValue.it_value.tv_nsec = 0;
timerValue.it_interval.tv_sec = 1;
timerValue.it_interval.tv_nsec = 0;
You are correctly setting the initial timeout to 1s (field timerValue.it_value). But you are also setting a periodic interval of 1s (field timerValue.it_interval), and you did not say that you want that.
About the timeouts
This behavior is described by the following passage of the manual:
int timerfd_settime(int fd, int flags, const struct itimerspec *new_value, struct itimerspec *old_value);
new_value.it_value specifies the initial expiration of the timer, in seconds and nanoseconds. Setting either field of new_value.it_value to a nonzero value arms the timer. Setting both fields of new_value.it_value to zero disarms the timer.
Setting one or both fields of new_value.it_interval to nonzero values specifies the period, in seconds and nanoseconds, for repeated timer expirations after the initial expiration. If both fields of new_value.it_interval are zero, the timer expires just once, at the time specified by new_value.it_value.
The emphasis on the last paragraph is mine, as it shows what to do in order to have a single-shot timer.
The benefits of timerfd. How to detect timer expiration?
The main advantage provided by timerfd is that the timer is associated with a file descriptor, and this means that it
may be monitored by select(2), poll(2), and epoll(7).
The information about read() in the other answer also applies: even when you wait with a function such as select(), a read() is still required to consume the data on the file descriptor.
A complete example
In the following demonstrative program, a timeout of 4 seconds is set; after that a periodic interval of 5 seconds is set.
The good old select() is used in order to wait for timer expiration, and read() is used to consume data (that is the number of expired timeouts; we will ignore it).
#include <stdio.h>
#include <sys/select.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>   /* read(), close() */

int main(void)
{
    int tfd = timerfd_create(CLOCK_REALTIME, 0);
    if (tfd == -1)    /* 0 is a valid descriptor, so test for -1 */
    {
        perror("timerfd_create");
        return 1;
    }
    printf("Starting at (%d)...\n", (int)time(NULL));

    char dummybuf[8];
    struct itimerspec spec =
    {
        { 5, 0 }, // it_interval: set to {0, 0} if you need a one-shot timer
        { 4, 0 }  // it_value: initial expiration
    };
    timerfd_settime(tfd, 0, &spec, NULL);

    fd_set rfds;
    int retval;

    /* Wait for the initial expiration, then for two periodic expirations */
    for (int i = 0; i < 3; i++)
    {
        /* select() modifies the set, so rebuild it before every call */
        FD_ZERO(&rfds);
        FD_SET(tfd, &rfds);
        retval = select(tfd + 1, &rfds, NULL, NULL, NULL); /* Last parameter = NULL --> wait forever */
        printf("Expired at %d! (%d) (%d)\n", (int)time(NULL), retval, (int)read(tfd, dummybuf, 8));
    }
    close(tfd);
    return 0;
}
And here is the output. Every row also contains the timestamp, so that the actual elapsed time can be checked:
Starting at (1596547762)...
Expired at 1596547766! (1) (8)
Expired at 1596547771! (1) (8)
Expired at 1596547776! (1) (8)
Please note:
We just performed 3 reads, for test
The intervals are 4s + 5s + 5s (initial timeout + two interval timeouts)
8 bytes are returned by read(). We ignored them, but they contained the number of the expired timeouts
With timerfds, the idea is that a read on the fd will return the number of times the timer has expired.
From the timerfd_settime(2) man page:
Operating on a timer file descriptor
The file descriptor returned by timerfd_create() supports the following operations:
read(2)
If the timer has already expired one or more times since its settings were last modified using
timerfd_settime(), or since the last successful read(2), then the buffer given to read(2) returns
an unsigned 8-byte integer (uint64_t) containing the number of expirations that have occurred.
If no timer expirations have occurred at the time of the read(2), then the call either blocks
until the next timer expiration, or fails with the error EAGAIN if the file descriptor has been
made nonblocking (via the use of the fcntl(2) F_SETFL operation to set the O_NONBLOCK flag).
So, basically, you create an unsigned 8 byte integer (uint64_t on Linux), and pass that to your read call.
uint64_t buf;   /* receives the number of expirations */
ssize_t nread = read(g_items[n].fd, &buf, sizeof(buf));
if (nread < 0) perror("read");
Something like that, if you want to block until you get an expiry.
Related
I have a list of clients and their descriptors.
First, I would like to start a timer when each client connects to my server.
And my problem is that I want to disconnect clients that are inactive for x seconds (for example 120 seconds).
I just would like to have an idea of how to proceed (or with a code sample)
A method that works independently of the mechanism chosen to serve the client (fork, pthread, select) is to use poll with a timeout. This example works with stdin as the file descriptor; to adapt it to your environment, basically change:
struct pollfd pfd = {.fd = STDIN_FILENO, .events = POLLIN};
to
struct pollfd pfd = {.fd = fd, .events = POLLIN};
#include <stdio.h>
#include <stdlib.h>
#include <sys/poll.h>
#include <unistd.h>
int main(void)
{
struct pollfd pfd = {.fd = STDIN_FILENO, .events = POLLIN};
/**
* poll()
* Waits for one of a set of file descriptors to become ready to perform I/O.
* --------------------------------------------------------------------------
* Arguments:
* 1) Pointer to pollfd
* 2) Number of pollfds
* 3) Timeout in milliseconds
* --------------------------------------------------------------------------
* Returns:
* -1 on error
* 0 on timeout
* Another value if "ready"
*/
int ready = poll(&pfd, 1, 120000); // 120 seconds
if (ready == -1)
{
perror("poll");
exit(EXIT_FAILURE);
}
if (ready == 0)
{
// close(fd);
puts("Timeout");
// return from your pthread handler or exit your forked process here
}
if (pfd.revents & POLLIN)
{
// Handle client here
// ssize_t size = recv(...);
}
return 0;
}
In each client structure you need to keep track of the disconnect time.
In your main loop (I assume you are using poll or select or similar) you need to check the earliest disconnect time, calculate how far that is from now, and use that as the timeout. If the earliest disconnect time is 5 seconds after now, then the timeout should be 5 seconds.
If you get a timeout, then disconnect that client.
Optionally you may put the clients in a sorted list based on their timeout so it's easy to find the next one; optionally you may check if more than one client times out at the same time, etc; that's out of scope.
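The timeout calculation described above can be sketched roughly as follows. The struct client layout and both function names are hypothetical, and time_t seconds are used only to keep the example short; a real server would probably take its timestamps from a monotonic clock.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical per-client record; only the deadline matters here. */
struct client {
    int fd;
    time_t disconnect_at;   /* absolute time at which the client is dropped */
};

/* Returns the poll() timeout in milliseconds until the earliest deadline:
 * 0 if a deadline has already passed, -1 (wait forever) if there are no clients. */
int next_timeout_ms(const struct client *clients, size_t n, time_t now)
{
    if (n == 0)
        return -1;
    time_t earliest = clients[0].disconnect_at;
    for (size_t i = 1; i < n; i++)
        if (clients[i].disconnect_at < earliest)
            earliest = clients[i].disconnect_at;
    if (earliest <= now)
        return 0;
    return (int)(earliest - now) * 1000;
}

/* Call this whenever a client sends data, to push its deadline back. */
void client_touch(struct client *c, time_t now, int idle_limit_sec)
{
    c->disconnect_at = now + idle_limit_sec;
}
```

The value returned by next_timeout_ms() is what would be passed as the third argument of poll(); when poll() returns 0, every client whose disconnect_at is not later than the current time gets closed.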
Registering a level-triggered eventfd with epoll_ctl only fires once when the eventfd counter is not decremented. To summarize the problem, I have observed that the epoll flags (EPOLLET, EPOLLONESHOT, or none for level-triggered behaviour) all behave the same. In other words, they have no effect.
Could you confirm this bug?
I have an application with multiple threads. Each thread waits for new events with epoll_wait with the same epollfd. If you want to terminate the application gracefully, all threads have to be woken up. My thought was that you use the eventfd counter (EFD_SEMAPHORE|EFD_NONBLOCK) for this (with level triggered epoll behavior) to wake up all together. (Regardless of the thundering herd problem for a small number of filedescriptors.)
E.g. for 4 threads you write 4 to the eventfd. I was expecting epoll_wait returns immediately and again and again until the counter is decremented (read) 4 times. epoll_wait only returns once for every write.
Yep, I read all related manuals carefully ;)
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>
#include <pthread.h>
static int event_fd = -1;
static int epoll_fd = -1;
void *thread(void *arg)
{
(void) arg;
for(;;) {
struct epoll_event event;
epoll_wait(epoll_fd, &event, 1, -1);
/* handle events */
if(event.data.fd == event_fd && event.events & EPOLLIN) {
uint64_t val = 0;
eventfd_read(event_fd, &val);
break;
}
}
return NULL;
}
int main(void)
{
epoll_fd = epoll_create1(0);
event_fd = eventfd(0, EFD_SEMAPHORE| EFD_NONBLOCK);
struct epoll_event event;
event.events = EPOLLIN;
event.data.fd = event_fd;
epoll_ctl(epoll_fd, EPOLL_CTL_ADD, event_fd, &event);
enum { THREADS = 4 };
pthread_t thrd[THREADS];
for (int i = 0; i < THREADS; i++)
pthread_create(&thrd[i], NULL, &thread, NULL);
/* let threads park internally (kernel does readiness check before sleeping) */
usleep(100000);
eventfd_write(event_fd, THREADS);
for (int i = 0; i < THREADS; i++)
pthread_join(thrd[i], NULL);
}
When you write to an eventfd, a function eventfd_signal is called. It contains the following line which does the wake up:
wake_up_locked_poll(&ctx->wqh, EPOLLIN);
With wake_up_locked_poll being a macro:
#define wake_up_locked_poll(x, m) \
__wake_up_locked_key((x), TASK_NORMAL, poll_to_key(m))
With __wake_up_locked_key being defined as:
void __wake_up_locked_key(struct wait_queue_head *wq_head, unsigned int mode, void *key)
{
__wake_up_common(wq_head, mode, 1, 0, key, NULL);
}
And finally, __wake_up_common is being declared as:
/*
* The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just
* wake everything up. If it's an exclusive wakeup (nr_exclusive == small +ve
* number) then we wake all the non-exclusive tasks and one exclusive task.
*
* There are circumstances in which we can try to wake a task which has already
* started to run but is not in state TASK_RUNNING. try_to_wake_up() returns
* zero in this (rare) case, and we handle it by continuing to scan the queue.
*/
static int __wake_up_common(struct wait_queue_head *wq_head, unsigned int mode,
int nr_exclusive, int wake_flags, void *key,
wait_queue_entry_t *bookmark)
Note the nr_exclusive argument and you will see that writing to an eventfd wakes only one exclusive waiter.
What does exclusive mean? Reading epoll_ctl man page gives us some insight:
EPOLLEXCLUSIVE (since Linux 4.5):
Sets an exclusive wakeup mode for the epoll file descriptor that is being attached to the target file descriptor, fd. When a wakeup event occurs and multiple epoll file descriptors are attached to the same target file using EPOLLEXCLUSIVE, one or more of the epoll file descriptors will receive an event with epoll_wait(2).
You do not use EPOLLEXCLUSIVE when adding your event, but to wait with epoll_wait every thread has to put itself to a wait queue. Function do_epoll_wait performs the wait by calling ep_poll. By following the code you can see that it adds the current thread to a wait queue at line #1903:
__add_wait_queue_exclusive(&ep->wq, &wait);
Which is the explanation for what is going on - epoll waiters are exclusive, so only a single thread is woken up. This behavior has been introduced in v2.6.22-rc1 and the relevant change has been discussed here.
To me this looks like a bug in the eventfd_signal function: in semaphore mode it should perform a wake-up with nr_exclusive equal to the value written.
So your options are:
Create a separate epoll descriptor for each thread (might not work with your design - scaling problems)
Put a mutex around it (scaling problems)
Use poll, probably on both eventfd and epoll
Wake each thread separately by writing 1 with eventfd_write 4 times (probably the best you can do).
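The last option might look roughly like this. The names wake_all and take_wakeup are made up for illustration, and the descriptor is assumed to have been created with EFD_SEMAPHORE | EFD_NONBLOCK as in the question.

```c
#include <stdint.h>
#include <sys/eventfd.h>

/* Post one wakeup per waiting thread instead of a single write of nthreads.
 * Returns 0 on success, -1 as soon as a write fails. */
int wake_all(int efd, int nthreads)
{
    for (int i = 0; i < nthreads; i++)
        if (eventfd_write(efd, 1) == -1)
            return -1;
    return 0;
}

/* What each woken thread does: take exactly one unit from the semaphore.
 * Returns 1 if a unit was taken, 0 if the counter was already zero. */
int take_wakeup(int efd)
{
    eventfd_t val = 0;
    return eventfd_read(efd, &val) == 0 && val == 1;
}
```

In the question's program, main() would call wake_all(event_fd, THREADS) in place of the single eventfd_write(event_fd, THREADS), and each thread would call take_wakeup(event_fd) before leaving its loop.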
I have a requirement to set more than one interval timer (alarms of the same type: ITIMER_REAL) in the same process, so I used the setitimer() system call to create 3 alarms, each timer having a separate structure to hold its interval values. When any timer expires it delivers SIGALRM to the calling process, but I couldn't find out which of the three timers raised the signal, and I don't even know whether all the timers are running. Is there any way to find out which timer raised the signal?
Thank you.
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
void timer_handler (int signum)
{
static int count = 0;
printf ("timer1 expired %d times\n", ++count);
}
int main ()
{
int m = 0;
struct sigaction sa;
struct itimerval timer1, timer2, timer3;
memset (&sa, 0, sizeof (sa));
sa.sa_handler = &timer_handler;
sigaction (SIGALRM/*SIGVTALRM*/, &sa, NULL);
timer1.it_value.tv_sec = 1;
timer1.it_value.tv_usec = 0;
timer1.it_interval.tv_sec = 5;
timer1.it_interval.tv_usec = 0;
timer2.it_value.tv_sec = 2;
timer2.it_value.tv_usec = 0/* 900000*/;
timer2.it_interval.tv_sec = 5;
timer2.it_interval.tv_usec = 0/*900000*/;
timer3.it_value.tv_sec = 3;
timer3.it_value.tv_usec = 0/* 900000*/;
timer3.it_interval.tv_sec = 5;
timer3.it_interval.tv_usec = 0/*900000*/;
setitimer (ITIMER_REAL/*ITIMER_VIRTUAL*/, &timer1, NULL);
setitimer (ITIMER_REAL/*ITIMER_VIRTUAL*/, &timer2, NULL);
setitimer (ITIMER_REAL/*ITIMER_VIRTUAL*/, &timer3, NULL);
while (1)
{
//printf("\nin main %d",m++);
//sleep(1);
}
}
No, you only have one ITIMER_REAL timer per process. Using setitimer multiple times overwrites the previous value, see man setitimer
A process has only one of each of the three types of timers.
You can also see this when you modify the intervals in your example code
timer1.it_interval.tv_sec = 1;
timer2.it_interval.tv_sec = 2;
and use nanosleep instead of sleep, because sleep might interfere with SIGALRM.
Now running the code, you will see only 5 second intervals.
You can also retrieve the previous set value by providing a second struct itimerval
struct itimerval old1, old2, old3;
setitimer(ITIMER_REAL, &timer1, &old1);
setitimer(ITIMER_REAL, &timer2, &old2);
setitimer(ITIMER_REAL, &timer3, &old3);
old1 will contain zero values, because it is the first time you use setitimer. old2 contains it_interval = 1 sec, and old3 contains it_interval = 2 sec. The it_values will be different, depending on how much time elapsed between the calls to setitimer.
So, if you need multiple timers, you need to do some bookkeeping. Each time a timer expires, you must calculate which timer is next and call setitimer accordingly.
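One way to picture that bookkeeping is the sketch below. It is only an illustration: struct soft_timer and both function names are hypothetical, times are kept in milliseconds for simplicity, and the actual setitimer() call is left to the caller.

```c
#include <stddef.h>

/* Hypothetical software timer multiplexed onto the single ITIMER_REAL. */
struct soft_timer {
    long remaining_ms;  /* time until this timer next fires */
    long interval_ms;   /* reload value; 0 means one-shot */
    int  active;
};

/* Index of the active timer that fires next, or -1 if none is active. */
int next_timer(const struct soft_timer *t, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++)
        if (t[i].active && (best < 0 || t[i].remaining_ms < t[best].remaining_ms))
            best = (int)i;
    return best;
}

/* Account for elapsed_ms having passed. Timers that reach zero are reloaded
 * (periodic) or deactivated (one-shot). Returns a bitmask of fired timers
 * (bit i set means timer i expired), so n must not exceed the bits in unsigned. */
unsigned advance_timers(struct soft_timer *t, size_t n, long elapsed_ms)
{
    unsigned fired = 0;
    for (size_t i = 0; i < n; i++) {
        if (!t[i].active)
            continue;
        t[i].remaining_ms -= elapsed_ms;
        if (t[i].remaining_ms <= 0) {
            fired |= 1u << i;
            if (t[i].interval_ms > 0)
                t[i].remaining_ms = t[i].interval_ms;
            else
                t[i].active = 0;
        }
    }
    return fired;
}
```

The idea is to program the single ITIMER_REAL as a one-shot timer with the remaining_ms of the entry returned by next_timer(). When SIGALRM arrives, advance_timers() is called with that same duration, the returned bitmask says which timers expired, and the process repeats.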
As an alternative, you may look into POSIX timers. These allow you to create multiple timers
A program may create multiple interval timers using timer_create().
and also to pass an id to the handler via sigevent, although the example at the end of the man page looks a bit more involved.
If I understand your question correctly, you want to know the status of the different timers.
The reference describes a getitimer() function:
The function getitimer() fills the structure pointed to by curr_value
with the current setting for the timer specified by which (one of
ITIMER_REAL, ITIMER_VIRTUAL, or ITIMER_PROF). The element it_value is
set to the amount of time remaining on the timer, or zero if the timer
is disabled. Similarly, it_interval is set to the reset value.
You can find the full reference in the getitimer(2) man page.
Hope that helps
I am writing a data acquisition program that needs to
wait for serial with select()
read serial data (RS232 at 115200 baud),
timestamp it (clock_gettime()),
read an ADC on SPI,
interpret it,
send new data over another tty device
loop and repeat
The ADC is irrelevant for now.
At the end of the loop I use select() again with a 0 timeout to poll and see whether data is already available. If it is, I have overrun, i.e. I expect the loop to end before more data arrives, and I expect the select() at the start of the loop to block and return as soon as the data arrives.
The data should arrive every 5ms, my first select() timeout is calculated as (5.5ms - loop time) - which should be about 4ms.
I get no timeouts but many overruns.
Examining the timestamps reveals that select() blocks for longer than the timeout (but still returns>0).
It looks like select() returns late after getting data before timeout.
This might happen 20 times in 1000 repeats.
What might be the cause? How do I fix it?
EDIT:
Here is cut down version of the code (I do much more error checking than this!)
#include <bcm2835.h> /* for bcm2835_init(), bcm2835_close() */
int main(int argc, char **argv){
int err = 0;
/* Set real time priority SCHED_FIFO */
struct sched_param sp;
sp.sched_priority = 30;
if ( pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp) ){
perror("pthread_setschedparam():");
err = 1;
}
/* 5ms between samples on /dev/ttyUSB0 */
int interval = 5;
/* Setup tty devices with termios, both totally uncooked, 8 bit, odd parity, 1 stop bit, 115200baud */
int fd_wc=setup_serial("/dev/ttyAMA0");
int fd_sc=setup_serial("/dev/ttyUSB0");
/* Setup GPIO for SPI, SPI mode, clock is ~1MHz which equates to more than 50ksps */
bcm2835_init();
setup_mcp3201spi();
int collecting = 1;
struct timespec starttime;
struct timespec time;
struct timespec ftime;
ftime.tv_nsec = 0;
fd_set readfds;
int countfd;
struct timeval interval_timeout;
struct timeval notime;
uint16_t p1;
float w1;
uint8_t *datap = malloc(8);
int data_size;
char output[25];
clock_gettime(CLOCK_MONOTONIC, &starttime);
while ( !err && collecting ){
/* Set timeout to (5*1.2)ms - (looptime)ms, or 0 if looptime was longer than (5*1.2)ms */
interval_timeout.tv_sec = 0;
interval_timeout.tv_usec = interval * 1200 - ftime.tv_nsec / 1000;
interval_timeout.tv_usec = (interval_timeout.tv_usec < 0)? 0 : interval_timeout.tv_usec;
FD_ZERO(&readfds);
FD_SET(fd_wc, &readfds);
FD_SET(0, &readfds); /* so that we can quit, code not included */
if ( (countfd=select(fd_wc+1, &readfds, NULL, NULL, &interval_timeout))<0 ){
perror("select()");
err = 1;
} else if (countfd == 0){
printf("Timeout on select()\n");
fflush(stdout);
err = 1;
} else if (FD_ISSET(fd_wc, &readfds)){
/* timestamp for when data is just available */
clock_gettime(CLOCK_MONOTONIC, &time);
if (starttime.tv_nsec > time.tv_nsec){
time.tv_nsec = 1000000000 + time.tv_nsec - starttime.tv_nsec;
time.tv_sec = time.tv_sec - starttime.tv_sec - 1;
} else {
time.tv_nsec = time.tv_nsec - starttime.tv_nsec;
time.tv_sec = time.tv_sec - starttime.tv_sec;
}
/* get ADC value, which is sampled fast so corresponds to timestamp */
p1 = getADCvalue();
/* receive_frame, receiving is slower so do it after getting ADC value. It is timestamped anyway */
/* This function consists of a loop that gets data from serial 1 byte at a time until a 'frame' is collected. */
/* it uses select() with a very short timeout (enough for 1 byte at baudrate) just to check comms are still going */
/* It never times out and behaves well */
/* The interval_timeout is passed because it is used as a timeout for responding an ACK to the device */
/* That select also never times out */
ireceive_frame(&datap, fd_wc, &data_size, interval_timeout.tv_sec, interval_timeout.tv_usec);
/* do stuff with it */
/* This takes most of the time in the loop, about 1.3ms at 115200 baud */
snprintf(output, 24, "%d.%04d,%d,%.2f\n", time.tv_sec, time.tv_nsec/100000, pressure, w1);
write(fd_sc, output, strnlen(output, 23));
/* Check how long the loop took (minus the polling select() that follows */
clock_gettime(CLOCK_MONOTONIC, &ftime);
if ((time.tv_nsec+starttime.tv_nsec) > ftime.tv_nsec){
ftime.tv_nsec = 1000000000 + ftime.tv_nsec - time.tv_nsec - starttime.tv_nsec;
ftime.tv_sec = ftime.tv_sec - time.tv_sec - starttime.tv_sec - 1;
} else {
ftime.tv_nsec = ftime.tv_nsec - time.tv_nsec - starttime.tv_nsec;
ftime.tv_sec = ftime.tv_sec - time.tv_sec - starttime.tv_sec;
}
/* Poll with 0 timeout to check that data hasn't arrived before we're ready yet */
FD_ZERO(&readfds);
FD_SET(fd_wc, &readfds);
notime.tv_sec = 0;
notime.tv_usec = 0;
if ( !err && ( (countfd=select(fd_wc+1, &readfds, NULL, NULL, &notime)) < 0 )){
perror("select()");
err = 1;
} else if (countfd > 0){
printf("OVERRUN!\n");
snprintf(output, 25, ",,,%d.%04d\n\n", ftime.tv_sec, ftime.tv_nsec/100000);
write(fd_sc, output, strnlen(output, 24));
}
}
}
return 0;
}
The timestamps I see on the serial stream that I output is fairly regular (a deviation is caught up by the next loop usually). A snippet of output:
6.1810,0,225.25
6.1867,0,225.25
6.1922,0,225.25
6.2063,0,225.25
,,,0.0010
Here, up to 6.1922s everything is OK. The next sample is at 6.2063, 14.1ms after the last, but it didn't time out, nor did the previous loop from 6.1922 to 6.2063 catch the overrun with the polling select(). My conclusion is that the last loop was within the sampling time and select() took ~10ms too long to return, without timing out.
The ,,,0.0010 indicates the loop time (ftime) of the loop after - I should really be checking what the loop time was when it went wrong. I'll try that tomorrow.
The timeout passed to select is a rough lower bound — select is allowed to delay your process for slightly more than that. In particular, your process will be delayed if it is preempted by a different process (a context switch), or by interrupt handling in the kernel.
Here is what the Linux manual page has to say on the subject:
Note that the timeout interval will be rounded up to the system clock
granularity, and kernel scheduling delays mean that the blocking
interval may overrun by a small amount.
And here's the POSIX standard:
Implementations may
also place limitations on the granularity of timeout intervals. If the
requested timeout interval requires a finer granularity than the
implementation supports, the actual timeout interval shall be
rounded up to the next supported value.
Avoiding that is difficult on a general purpose system. You will get reasonable results, especially on a multi-core system, by locking your process in memory (mlockall) and setting your process to a real-time priority (use sched_setscheduler with SCHED_FIFO, and remember to sleep often enough to give other processes a chance to run).
A much more difficult approach is to use a real-time microcontroller that is dedicated to running the real-time code. Some people claim to reliably sample at 20MHz on fairly cheap hardware using that technique.
If values for struct timeval are set to zero, then select will not block, but if timeout argument is a NULL pointer, it will...
If the timeout argument is not a NULL pointer, it points to an object of type struct timeval that specifies a maximum interval to
wait for the selection to complete. If the timeout argument points to
an object of type struct timeval whose members are 0, select() does
not block. If the timeout argument is a NULL pointer, select()
blocks until an event causes one of the masks to be returned with
a valid (non-zero) value or until a signal occurs that needs to be
delivered. If the time limit expires before any event occurs that
would cause one of the masks to be set to a non-zero value, select()
completes successfully and returns 0.
Read more here
EDIT to address comments, and add new information:
A couple of noteworthy points.
First - in the comments, there is a suggestion to add sleep() to your worker loop. This is a good suggestion. The reasons stated here, although dealing with thread entry points, still apply, since you are instantiating a continuous loop.
Second - Linux select() is a system call with an interesting implementation history, and as such its behaviour varies from implementation to implementation, in ways that may contribute to the unexpected behaviour you are seeing. I am not sure which major Linux lineage Arch Linux descends from, but the man7.org page for select() includes the following two segments, which, given your description, appear to describe conditions that could contribute to the delays you are experiencing.
Bad checksum:
Under Linux, select() may report a socket file descriptor as "ready
for reading", while nevertheless a subsequent read blocks. This could
for example happen when data has arrived but upon examination has wrong
checksum and is discarded.
Race condition: (introduces and discusses pselect())
...Suppose the signal handler sets a global flag and returns. Then a test
of this global flag followed by a call of select() could hang indefinitely
if the signal arrived just after the test but just before the call...
Given the description of your observations, and depending on how your version of Linux is implemented, either one of these implementation features may be a possible contributor.
I'm working on an embedded processor running Yocto. I have a modified uio_pdrv_genirq.c UIO driver.
I am writing a library to control the DMA. There is one function which writes to the device file and initiates the DMA. A second function is intended to wait for the DMA to complete by calling select(). Whilst DMA is in progress the device file blocks. On completion the DMA controller issues an interrupt which releases the block on the device file.
I have the system working as expected using read() but I want to switch to select() so that I can include a time out. However, when I use select(), it doesn't seem to be recognising the block and always returns immediately (before the DMA has completed). I have included a simple version of the code:
int gannet_dma_interrupt_wait(dma_device_t *dma_device,
dma_direction dma_transfer_direction) {
fd_set rfds;
struct timeval timeout;
int select_res;
/* Initialize the file descriptor set and add the device file */
FD_ZERO(&rfds);
FD_SET(dma_device->fd, &rfds);
/* Set the timeout period. */
timeout.tv_sec = 5;
timeout.tv_usec = 0;
/* The device file will block until the DMA transfer has completed. */
select_res = select(FD_SETSIZE, &rfds, NULL, NULL, &timeout);
/* Reset the channel */
gannet_dma_reset(dma_device, dma_transfer_direction);
if (select_res == -1) {
/* Select has encountered an error */
perror("ERROR <Interrupt Select Failed>\n");
exit(0);
}
else if (select_res == 1) {
/* The device file descriptor block released */
return 0;
}
else {
/* The device file descriptor block exceeded timeout */
return EINTR;
}
}
Is there anything obviously wrong with my code? Or can anyone suggest an alternative to select?
It turns out that the UIO driver contains two counters. One records the
number of events (event_count), the other records how many events the
calling function is aware of (listener->event_count).
When you do a read() on a UIO driver it returns the number of events and
makes listener->event_count equal to event_count. ie. the listener is
now up to date with all the events that have occurred.
When you use poll() or select() on a UIO driver, it checks if these two
numbers are different and returns if they are (if they are the same it
waits until they differ and then returns). It does NOT update the
listener->event_count.
Clearly if you do not do a read() between calls to select() then
the listener->event_count will not match the event_count and the second
select() will return immediately. Therefore it is necessary to call
read() in between calls to select().
With hindsight it seems clear that select() should work in this way but it wasn't obvious to me at the time.
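The same "readable until you read it" behaviour can be reproduced without UIO hardware. The sketch below uses an eventfd purely as a stand-in for the device file, which is an assumption made for illustration; the function names are hypothetical, and a real UIO descriptor is read with its own counter size rather than the 8 bytes used here.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/select.h>
#include <unistd.h>

/* Stand-in for opening the device: a non-blocking eventfd with count 0. */
int make_event_source(void)
{
    return eventfd(0, EFD_NONBLOCK);
}

/* Poll fd for readability without blocking. Returns select()'s result:
 * 1 if readable, 0 if not, -1 on error. */
int is_readable_now(int fd)
{
    fd_set rfds;
    struct timeval zero = {0, 0};
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &zero);
}

/* Consume the pending events so that the next select() blocks again.
 * Returns the number of events consumed, or -1 if nothing could be read. */
long long consume_event(int fd)
{
    uint64_t count = 0;
    if (read(fd, &count, sizeof count) != (ssize_t)sizeof count)
        return -1;
    return (long long)count;
}
```

After one event is posted, is_readable_now() keeps returning 1 no matter how often it is called, and only drops back to 0 once consume_event() has run, which mirrors the select()/read() pairing described for the UIO driver.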
This answer assumes that it is possible to use select() as intended for the specified device file (I use select() for socket descriptors only). As an alternative to select(), you may want to check out the poll() family of functions. What follows will hopefully at least offer hints as to what can be done to resolve your problem with calling select().
The first parameter to the select() function has to be the highest descriptor number plus 1. Since you have only one descriptor, you can pass it directly to select() as its first parameter, plus 1. Also consider that the file descriptor in dma_device could be invalid. Returning EINTR on a timeout may actually be what you intend to do, but in case it is not, and to test for an invalid descriptor, here is a different version for you to consider. The select() call could be interrupted by a signal, in which case the return value is -1 and errno will be set to EINTR. This could be handled internally by your function as in:
// restart select() if it's interrupted by a signal
do {
    // select() may modify both the set and the timeout, so set them up on
    // every attempt (note that a retry then waits the full 5 seconds again)
    FD_ZERO(&rfds);
    FD_SET(dma_device->fd, &rfds);
    timeout.tv_sec = 5;
    timeout.tv_usec = 0;
    select_res = select(dma_device->fd + 1, &rfds, NULL, NULL, &timeout);
}
while (select_res < 0 && errno == EINTR);
if (select_res > 0) {
    // the file descriptor is readable
}
else if (select_res == 0) {
    // select() timed out
}
else {
    // an error other than a signal interruption occurred
    if (errno == EBADF) {
        // your file descriptor is invalid
    }
}