Time remaining on a select() call - c

I'm using select() on a Linux/ARM platform to see if a UDP socket has received a packet. I'd like to know how much time was remaining in the select() call if it returns before the timeout (having detected a packet).
Something along the lines of:
int wait_fd(int fd, int msec)
{
    struct timeval tv;
    fd_set rws;
    tv.tv_sec = msec / 1000ul;
    tv.tv_usec = (msec % 1000ul) * 1000ul;
    FD_ZERO(&rws);
    FD_SET(fd, &rws);
    (void)select(fd + 1, &rws, NULL, NULL, &tv);
    if (FD_ISSET(fd, &rws)) { /* There is data */
        msec = (tv.tv_sec * 1000) + (tv.tv_usec / 1000);
        return(msec?msec:1);
    } else { /* There is no data */
        return(0);
    }
}

The safest thing is to ignore the ambiguous definition of select() and time it yourself.
Just get the time before and after the select and subtract that from the interval you wanted.
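A minimal sketch of that approach, assuming clock_gettime() with CLOCK_MONOTONIC is available. The names wait_fd_timed and elapsed_ms are made up for illustration. The remaining time comes from the function's own measurement, so tv is never read after select() returns:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/select.h>
#include <time.h>

/* Milliseconds elapsed since *start on the monotonic clock. */
static long elapsed_ms(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - start->tv_sec) * 1000L
         + (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Wait up to msec milliseconds for fd to become readable.
 * Returns the milliseconds left (at least 1) if data arrived,
 * 0 on timeout, -1 on error.
 */
int wait_fd_timed(int fd, int msec)
{
    struct timespec start;
    struct timeval tv;
    fd_set rfds;
    long left;
    int n;

    tv.tv_sec = msec / 1000;
    tv.tv_usec = (msec % 1000) * 1000;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    clock_gettime(CLOCK_MONOTONIC, &start);
    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;

    /* Portable: computed from our own clock readings, not from tv. */
    left = msec - elapsed_ms(&start);
    return left > 0 ? (int)left : 1;
}
```

The monotonic clock is used rather than gettimeofday() so that a change to the system time during the wait does not distort the result.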

If I recall correctly, the select() function treats the timeout as an in/out parameter, and when select() returns, the time remaining is returned in the timeout variable.
Otherwise, you will have to record the current time before calling, and again after and obtain the difference between the two.

From "man select" on OSX:
Timeout is not changed by select(), and may be reused on subsequent calls, however it
is good style to re-initialize it before each invocation of select().
You'll need to call gettimeofday before calling select, and then gettimeofday on exit.
[Edit] It seems that linux is slightly different:
(ii) The select function may update the timeout parameter to indicate
how much time was left. The pselect function does not change
this parameter.
On Linux, the function select modifies timeout to reflect the amount of
time not slept; most other implementations do not do this. This causes
problems both when Linux code which reads timeout is ported to other
operating systems, and when code is ported to Linux that reuses a
struct timeval for multiple selects in a loop without reinitializing
it. Consider timeout to be undefined after select returns.

Linux select() updates the timeout argument to reflect the time remaining (the amount of time not slept).
Note that this is not portable across other systems (hence the warning in the OS X manual quoted above) but does work with Linux.
Gilad

Do not use select(). Try your code with an fd of 1024 (FD_SETSIZE on Linux) or larger and see what you get.
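One way around that limit is poll(), which takes the descriptor number directly instead of a fixed-size bit set. The sketch below is an illustration only, and wait_fd_poll is a made-up name. Note that poll() does not report the time remaining, so that still has to be measured separately:

```c
#include <poll.h>

/*
 * Wait up to msec milliseconds for fd to become readable.
 * Returns 1 if a read() would not block (data, hang-up or error
 * pending), 0 on timeout, -1 on failure or invalid descriptor.
 */
int wait_fd_poll(int fd, int msec)
{
    struct pollfd pfd;
    int n;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    n = poll(&pfd, 1, msec);   /* array, number of entries, timeout in ms */
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    return (pfd.revents & (POLLIN | POLLHUP | POLLERR)) ? 1 : -1;
}
```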

Related

select() not working as it should, where is the bug?

Given the following code, the expectation is for there to be a one-second sleep each time select() is called. However, the sleep only occurs on the first call and all subsequent calls result in no delay:
#include <stdio.h>
#include <stdlib.h>
int main()
{
    struct timeval tv;
    tv.tv_sec = 1;
    tv.tv_usec = 0;
    for (;;)
    {
        /* Sleep for one second */
        int result=select(0, NULL, NULL, NULL, &tv);
        printf("select returned: %d\n",result);
    }
}
Why do all calls to select() except the first return immediately?
Compiler: gcc 4.9.2
OS: Centos 7 (Linux)
Kernel info: 3.10.0-327.36.3.el7.x86_64
From the man page:
On Linux, select() modifies timeout to reflect the amount of time not
slept
So, set tv [in the loop] before calling select
As stated in the manpage
On Linux, select() modifies timeout to reflect the amount of time not
slept; most other implementations do not do this. (POSIX.1 permits
either behavior.) This causes problems both when Linux code which
reads timeout is ported to other operating systems, and when code is
ported to Linux that reuses a struct timeval for multiple select()s
in a loop without reinitializing it. Consider timeout to be undefined
after select() returns.
Because the first call ran until the timeout expired, tv was left at 0 seconds, so every later call returns immediately. Solution: reinitialize tv before every call.

C - select() seems to block for longer than timeout

I am writing a data acquisition program that needs to
wait for serial with select()
read serial data (RS232 at 115200 baud),
timestamp it (clock_gettime()),
read an ADC on SPI,
interpret it,
send new data over another tty device
loop and repeat
The ADC is irrelevant for now.
At the end of the loop I use select() again with a 0 timeout to poll and see whether data is already available. If it is, I have overrun, i.e. I expect the loop to finish before more data arrives, and the select() at the start of the loop to block and pick the data up as soon as it arrives.
The data should arrive every 5ms, my first select() timeout is calculated as (5.5ms - loop time) - which should be about 4ms.
I get no timeouts but many overruns.
Examining the timestamps reveals that select() blocks for longer than the timeout (but still returns>0).
It looks like select() returns late after getting data before timeout.
This might happen 20 times in 1000 repeats.
What might be the cause? How do I fix it?
EDIT:
Here is a cut-down version of the code (I do much more error checking than this!)
#include <bcm2835.h> /* for bcm2835_init(), bcm2835_close() */
int main(int argc, char **argv){
    int err = 0;
    /* Set real time priority SCHED_FIFO */
    struct sched_param sp;
    sp.sched_priority = 30;
    if ( pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp) ){
        perror("pthread_setschedparam():");
        err = 1;
    }
    /* 5ms between samples on /dev/ttyUSB0 */
    int interval = 5;
    /* Setup tty devices with termios, both totally uncooked, 8 bit, odd parity, 1 stop bit, 115200baud */
    int fd_wc=setup_serial("/dev/ttyAMA0");
    int fd_sc=setup_serial("/dev/ttyUSB0");
    /* Setup GPIO for SPI, SPI mode, clock is ~1MHz which equates to more than 50ksps */
    bcm2835_init();
    setup_mcp3201spi();
    int collecting = 1;
    struct timespec starttime;
    struct timespec time;
    struct timespec ftime;
    ftime.tv_nsec = 0;
    fd_set readfds;
    int countfd;
    struct timeval interval_timeout;
    struct timeval notime;
    uint16_t p1;
    float w1;
    uint8_t *datap = malloc(8);
    int data_size;
    char output[25];
    clock_gettime(CLOCK_MONOTONIC, &starttime);
    while ( !err && collecting ){
        /* Set timeout to (5*1.2)ms - (looptime)ms, or 0 if looptime was longer than (5*1.2)ms */
        interval_timeout.tv_sec = 0;
        interval_timeout.tv_usec = interval * 1200 - ftime.tv_nsec / 1000;
        interval_timeout.tv_usec = (interval_timeout.tv_usec < 0)? 0 : interval_timeout.tv_usec;
        FD_ZERO(&readfds);
        FD_SET(fd_wc, &readfds);
        FD_SET(0, &readfds); /* so that we can quit, code not included */
        if ( (countfd=select(fd_wc+1, &readfds, NULL, NULL, &interval_timeout))<0 ){
            perror("select()");
            err = 1;
        } else if (countfd == 0){
            printf("Timeout on select()\n");
            fflush(stdout);
            err = 1;
        } else if (FD_ISSET(fd_wc, &readfds)){
            /* timestamp for when data is just available */
            clock_gettime(CLOCK_MONOTONIC, &time);
            if (starttime.tv_nsec > time.tv_nsec){
                time.tv_nsec = 1000000000 + time.tv_nsec - starttime.tv_nsec;
                time.tv_sec = time.tv_sec - starttime.tv_sec - 1;
            } else {
                time.tv_nsec = time.tv_nsec - starttime.tv_nsec;
                time.tv_sec = time.tv_sec - starttime.tv_sec;
            }
            /* get ADC value, which is sampled fast so corresponds to timestamp */
            p1 = getADCvalue();
            /* receive_frame, receiving is slower so do it after getting ADC value. It is timestamped anyway */
            /* This function consists of a loop that gets data from serial 1 byte at a time until a 'frame' is collected. */
            /* it uses select() with a very short timeout (enough for 1 byte at baudrate) just to check comms are still going */
            /* It never times out and behaves well */
            /* The interval_timeout is passed because it is used as a timeout for responding an ACK to the device */
            /* That select also never times out */
            ireceive_frame(&datap, fd_wc, &data_size, interval_timeout.tv_sec, interval_timeout.tv_usec);
            /* do stuff with it */
            /* This takes most of the time in the loop, about 1.3ms at 115200 baud */
            snprintf(output, 24, "%d.%04d,%d,%.2f\n", time.tv_sec, time.tv_nsec/100000, pressure, w1);
            write(fd_sc, output, strnlen(output, 23));
            /* Check how long the loop took (minus the polling select() that follows) */
            clock_gettime(CLOCK_MONOTONIC, &ftime);
            if ((time.tv_nsec+starttime.tv_nsec) > ftime.tv_nsec){
                ftime.tv_nsec = 1000000000 + ftime.tv_nsec - time.tv_nsec - starttime.tv_nsec;
                ftime.tv_sec = ftime.tv_sec - time.tv_sec - starttime.tv_sec - 1;
            } else {
                ftime.tv_nsec = ftime.tv_nsec - time.tv_nsec - starttime.tv_nsec;
                ftime.tv_sec = ftime.tv_sec - time.tv_sec - starttime.tv_sec;
            }
            /* Poll with 0 timeout to check that data hasn't arrived before we're ready yet */
            FD_ZERO(&readfds);
            FD_SET(fd_wc, &readfds);
            notime.tv_sec = 0;
            notime.tv_usec = 0;
            if ( !err && ( (countfd=select(fd_wc+1, &readfds, NULL, NULL, &notime)) < 0 )){
                perror("select()");
                err = 1;
            } else if (countfd > 0){
                printf("OVERRUN!\n");
                snprintf(output, 25, ",,,%d.%04d\n\n", ftime.tv_sec, ftime.tv_nsec/100000);
                write(fd_sc, output, strnlen(output, 24));
            }
        }
    }
    return 0;
}
The timestamps I see on the serial stream that I output are fairly regular (a deviation is usually caught up by the next loop). A snippet of output:
6.1810,0,225.25
6.1867,0,225.25
6.1922,0,225.25
6.2063,0,225.25
,,,0.0010
Here, up to 6.1922s everything is OK. The next sample is at 6.2063, 14.1ms after the last, but it didn't time out, nor did the previous loop (6.1922 to 6.2063) catch the overrun with the polling select(). My conclusion is that the last loop was within the sampling time and select() took about 10ms too long to return, without timing out.
The ,,,0.0010 indicates the loop time (ftime) of the loop after - I should really be checking what the loop time was when it went wrong. I'll try that tomorrow.
The timeout passed to select is a rough lower bound — select is allowed to delay your process for slightly more than that. In particular, your process will be delayed if it is preempted by a different process (a context switch), or by interrupt handling in the kernel.
Here is what the Linux manual page has to say on the subject:
Note that the timeout interval will be rounded up to the system clock
granularity, and kernel scheduling delays mean that the blocking
interval may overrun by a small amount.
And here's the POSIX standard:
Implementations may
also place limitations on the granularity of timeout intervals. If the
requested timeout interval requires a finer granularity than the
implementation supports, the actual timeout interval shall be
rounded up to the next supported value.
Avoiding that is difficult on a general purpose system. You will get reasonable results, especially on a multi-core system, by locking your process in memory (mlockall) and setting your process to a real-time priority (use sched_setscheduler with SCHED_FIFO, and remember to sleep often enough to give other processes a chance to run).
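A rough sketch of those two steps, assuming Linux and sufficient privileges (root, or CAP_SYS_NICE and CAP_IPC_LOCK). The helper names make_realtime and clamp_fifo_priority are invented for this example. The question's code already sets SCHED_FIFO with pthread_setschedparam(); the memory locking is the part it lacks:

```c
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>

/* Clamp a requested priority into the valid SCHED_FIFO range. */
int clamp_fifo_priority(int priority)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (priority < lo)
        return lo;
    if (priority > hi)
        return hi;
    return priority;
}

/*
 * Lock current and future pages in RAM, then switch the calling
 * process to SCHED_FIFO. Returns 0 if both steps worked, -1 if
 * either failed (commonly for lack of privileges).
 */
int make_realtime(int priority)
{
    struct sched_param sp;
    int rc = 0;

    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        rc = -1;
    }

    sp.sched_priority = clamp_fifo_priority(priority);
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        perror("sched_setscheduler");
        rc = -1;
    }
    return rc;
}
```

Even with both in place, the timeout remains a lower bound. This narrows the scheduling jitter but does not remove it.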
A much more difficult approach is to use a real-time microcontroller that is dedicated to running the real-time code. Some people claim to reliably sample at 20MHz on fairly cheap hardware using that technique.
If the members of the struct timeval are set to zero, select() will not block, but if the timeout argument is a NULL pointer, it will:
If the timeout argument is not a NULL pointer, it points to an object of type struct timeval that specifies a maximum interval to
wait for the selection to complete. If the timeout argument points to
an object of type struct timeval whose members are 0, select() does
not block. If the timeout argument is a NULL pointer, select()
blocks until an event causes one of the masks to be returned with
a valid (non-zero) value or until a signal occurs that needs to be
delivered. If the time limit expires before any event occurs that
would cause one of the masks to be set to a non-zero value, select()
completes successfully and returns 0.
EDIT to address comments, and add new information:
A couple of noteworthy points.
First - in the comments, there is a suggestion to add sleep() to your worker loop. This is a good suggestion. The reasons stated here, although dealing with thread entry points, still apply, since you are instantiating a continuous loop.
Second - Linux select() is a system call with an interesting implementation history, and its behaviour varies from implementation to implementation in ways that may contribute to what you are seeing. I am not sure which major Linux lineage Arch Linux descends from, but the man7.org page for select() includes the following two segments, which appear to describe conditions that could contribute to the delays you are experiencing.
Bad checksum:
Under Linux, select() may report a socket file descriptor as "ready
for reading", while nevertheless a subsequent read blocks. This could
for example happen when data has arrived but upon examination has wrong
checksum and is discarded.
Race condition: (introduces and discusses pselect())
...Suppose the signal handler sets a global flag and returns. Then a test
of this global flag followed by a call of select() could hang indefinitely
if the signal arrived just after the test but just before the call...
Given the description of your observations, and depending on how your version of Linux is implemented, either one of these implementation features may be a possible contributor.

Linux timer CLOCK_PROCESS_CPU_ID is not working

I need to use some timers in my code. I've got something like this:
struct sigevent ec;
ec.sigev_notify = SIGEV_THREAD;
ec.sigev_value.sival_ptr = &c_timer;
ec.sigev_notify_function = c_thread;
ec.sigev_notify_attributes = NULL;
secs = floor(c);
nsecs = (long long) SECOND * (c - secs);
printf("%ds\t%lldns\n\n",secs,nsecs);
it1c.it_value.tv_sec = secs;
it1c.it_value.tv_nsec = nsecs;
it1c.it_interval.tv_sec = 0;
it1c.it_interval.tv_nsec = 0;
timer_create(CLOCK_PROCESS_CPUTIME_ID, &ec, &c_timer);
timer_settime(c_timer, 0, &it1c, NULL);
Where c_thread is some simple function which is setting new timer, SECOND is:
#define SECOND 1000000000
c is something like 2.25
And my problem is that this timer doesn't call c_thread when it should. When I change CLOCK_PROCESS_CPUTIME_ID to CLOCK_REALTIME everything is OK and it is called, but with the first one nothing happens. I am also checking CLOCK_PROCESS_CPUTIME_ID with clock_gettime() from another CLOCK_REALTIME timer, and the clock's value does reach my it_value.
Any ideas what could be wrong?
And my second question: Is there any way to pass some arguments to function called as thread using timers?
@annamataris's problem was not related to the spinlock and nanosleep stuff. There is a reason why only CLOCK_REALTIME works:
The POSIX timers system calls first appeared in Linux 2.6. Prior to
this, glibc provided an incomplete user-space implementation
(CLOCK_REALTIME timers only) using POSIX threads, and in glibc
versions before 2.17, the implementation falls back to this technique
on systems running pre-2.6 Linux kernels.
Read man timer_create for more info.

Is select() with NULL timeout lighter than select() with timeout?

I would like to know if that code:
select(fd,..., NULL);
is less CPU consuming than that one:
struct timeval tv;
tv.tv_sec = X;
tv.tv_usec = Y;
select(fd,..., &tv);
and why. Thank you.
EDIT: I'm asking about one single call. It's a system call, so it's system dependent, and it's up to the system to unblock the select()ing program. So, for the system, is it more CPU-consuming to service a select() with a timeout or without one?
Neither is "lighter". select is a system call and will instruct the OS to wake up your task when either an event occurs on one of the watched file descriptors or (if supplied) a timeout occurs. Selecting with a NULL timeout will block indefinitely until an event occurs on a watched file descriptor or the process is interrupted in another way.
Clearly:
while (select(..., NULL) == 0) { /* ... */ }
is lighter than:
while (select(..., tv) == 0) { /* ... */ }
where the time in tv is small, otherwise the difference is likely too small to be noticed.

ways of implementing timer in worker thread in C

I have a worker thread that gets work from pipe. Something like this
void *worker(void *param) {
    while (!work_done) {
        read(g_workfds[0], work, sizeof(work));
        do_work(work);
    }
}
I need to implement a 1 second timer in the same thread to do some bookkeeping about the work. Following is what I have in mind:
void *worker(void *param) {
    prev_uptime = get_uptime();
    while (!work_done) {
        // set g_workfds[0] as non-block
        now_uptime = get_uptime();
        if (now_uptime - prev_uptime > 1) {
            do_book_keeping();
            prev_uptime = now_uptime;
        }
        n = poll(g_workfds[0], 1000); // Wait for 1 second else timeout
        if (n == 0) // timed out
            continue;
        read(g_workfds[0], work, sizeof(work));
        do_work(work); // This can take more than 1 second also
    }
}
I am using system uptime instead of system time because system time can be changed while this thread is running. I was wondering if there is a better way to do this. I don't want to consider using another thread. Using alarm() is not an option as it is already used by another thread in the same process. This is being implemented in a Linux environment.
I agree with most of what webbi wrote in his answer. But there is one issue with his suggestion of using time instead of uptime. If the system time is updated "forward" it will work as intended. But if the system time is set back by say 30 seconds, then there will be no book keeping done for 30 seconds as (now_time - prev_time) will be negative (unless an unsigned type is used, in which case it will work anyway).
An alternative would be to use clock_gettime() with CLOCK_MONOTONIC as clockid ( http://linux.die.net/man/2/clock_gettime ). A bit messy if you don't need smaller time units than seconds.
Also, adding code to detect a backwards clock jump isn't hard either.
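A small sketch of the CLOCK_MONOTONIC variant at one-second resolution. monotonic_seconds and interval_elapsed are made-up helper names, not part of the question's code:

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Whole seconds on the monotonic clock, which is not set back
 * or forward when the system time is changed. */
static time_t monotonic_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec;
}

/*
 * Returns 1 and moves *prev to now if at least interval seconds
 * have passed since *prev; returns 0 otherwise.
 */
int interval_elapsed(time_t *prev, time_t interval)
{
    time_t now = monotonic_seconds();

    if (now - *prev >= interval) {
        *prev = now;
        return 1;
    }
    return 0;
}
```

In the worker loop this would replace the get_uptime() comparison: initialise prev with monotonic_seconds() and call do_book_keeping() whenever interval_elapsed(&prev, 1) returns 1.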
I have found a better way, but it is Linux specific: the timerfd_create() system call. It takes care of system time changes. Following is possible pseudocode:
void *worker(void *param) {
    int timerfd = timerfd_create(CLOCK_MONOTONIC, 0); // Monotonic doesn't get affected by system time change
    // set timerfd to non-block
    timerfd_settime(timerfd, 1 second timer); // timer starts
    while (!work_done) {
        // set g_workfds[0] as non-block
        n = poll(g_workfds[0] and timerfd, -1); // poll on both pipe and timerfd and wait indefinitely (-1; a timeout of 0 would return immediately)
        if (timerfd is readable)
            do_book_keeping(); // read() the timerfd as well, otherwise it stays readable
        if (g_workfds[0] is readable) {
            read(g_workfds[0], work, sizeof(work));
            do_work(work); // This can take more than 1 second also
        }
    }
}
It seems cleaner, and read() on the timerfd returns the number of expirations since the last read, which is quite useful when do_work() takes a long time, as do_book_keeping() expects to be called every second.
I found some things weird in your code...
poll() has 3 args, you are passing 2, the second arg is the number of structs that you are passing in the struct array of first param, the third param is the timeout.
Reference: http://linux.die.net/man/2/poll
Besides that, the workaround looks fine to me. It's not the best, of course, but it works without involving another thread, alarm(), etc.
If you use time rather than uptime, a change to the system date could cause one error, but after that it will keep working: the stored value is updated and the loop goes on waiting 1 second, whatever the time is.

Resources