I have a worker thread that gets work from a pipe. Something like this:
void *worker(void *param) {
    while (!work_done) {
        read(g_workfds[0], work, sizeof(work));
        do_work(work);
    }
}
I need to implement a 1 second timer in the same thread to do some book-keeping about the work. Following is what I have in mind:
void *worker(void *param) {
    prev_uptime = get_uptime();
    while (!work_done) {
        // set g_workfds[0] as non-block
        now_uptime = get_uptime();
        if (now_uptime - prev_uptime > 1) {
            do_book_keeping();
            prev_uptime = now_uptime;
        }
        n = poll(g_workfds[0], 1000); // Wait for 1 second else timeout
        if (n == 0) // timed out
            continue;
        read(g_workfds[0], work, sizeof(work));
        do_work(work); // This can take more than 1 second also
    }
}
I am using system uptime instead of system time because system time can get changed while this thread is running. I was wondering if there is any other better way to do this. I don't want to consider using another thread. Using alarm() is not an option as it is already used by another thread in the same process. This is being implemented in a Linux environment.
I agree with most of what webbi wrote in his answer. But there is one issue with his suggestion of using time instead of uptime. If the system time is updated "forward" it will work as intended. But if the system time is set back by say 30 seconds, then there will be no book keeping done for 30 seconds as (now_time - prev_time) will be negative (unless an unsigned type is used, in which case it will work anyway).
An alternative would be to use clock_gettime() with CLOCK_MONOTONIC as clockid ( http://linux.die.net/man/2/clock_gettime ). A bit messy if you don't need smaller time units than seconds.
Also, adding code to detect a backwards clock jump isn't hard either.
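For example, a minimal sketch of the loop using clock_gettime() with CLOCK_MONOTONIC (names taken from the question's pseudocode, error handling omitted):

#include <time.h>

struct timespec prev, now;
clock_gettime(CLOCK_MONOTONIC, &prev);

while (!work_done) {
    clock_gettime(CLOCK_MONOTONIC, &now);
    /* CLOCK_MONOTONIC is never stepped backwards, so this difference is safe */
    if (now.tv_sec - prev.tv_sec >= 1) {
        do_book_keeping();
        prev = now;
    }
    /* ... poll()/read() on the pipe as before ... */
}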
I have found a better way, but it is Linux specific, using the timerfd_create() system call. It takes care of system time changes. Following is possible pseudocode:
void *worker(void *param) {
    int timerfd = timerfd_create(CLOCK_MONOTONIC, 0); // Monotonic doesn't get affected by system time change
    // set timerfd to non-block
    timerfd_settime(timerfd, 1 second timer); // timer starts
    while (!work_done) {
        // set g_workfds[0] as non-block
        n = poll(g_workfds[0] and timerfd, -1); // poll on both pipe and timerfd and wait indefinitely
        if (timerfd is readable)
            do_book_keeping();
        if (g_workfds[0] is readable) {
            read(g_workfds[0], work, sizeof(work));
            do_work(work); // This can take more than 1 second also
        }
    }
}
It seems cleaner, and read() on the timerfd returns the number of timer expirations since the last read, so if do_work() takes longer than a second you can tell how many 1-second intervals elapsed. That is quite useful, as do_book_keeping() expects to get called every second.
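A slightly fleshed-out sketch of that loop (error handling omitted; work, g_workfds, work_done and the two helper functions are the names from the pseudocode above):

#include <sys/timerfd.h>
#include <poll.h>
#include <unistd.h>
#include <stdint.h>

void *worker(void *param)
{
    struct itimerspec its = {
        .it_interval = { .tv_sec = 1, .tv_nsec = 0 },   /* fire every second   */
        .it_value    = { .tv_sec = 1, .tv_nsec = 0 },   /* first expiry in 1 s */
    };
    int timerfd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK);
    timerfd_settime(timerfd, 0, &its, NULL);

    struct pollfd fds[2] = {
        { .fd = g_workfds[0], .events = POLLIN },
        { .fd = timerfd,      .events = POLLIN },
    };

    while (!work_done) {
        poll(fds, 2, -1);                               /* wait indefinitely */
        if (fds[1].revents & POLLIN) {
            uint64_t expirations;
            /* read() tells us how many 1-second intervals have elapsed */
            read(timerfd, &expirations, sizeof(expirations));
            do_book_keeping();
        }
        if (fds[0].revents & POLLIN) {
            read(g_workfds[0], work, sizeof(work));
            do_work(work);
        }
    }
    close(timerfd);
    return NULL;
}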
I found some things weird in your code...
poll() takes 3 args and you are passing 2: the first is an array of struct pollfd, the second is the number of structs in that array, and the third is the timeout in milliseconds.
Reference: http://linux.die.net/man/2/poll
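For reference, the corrected call for your pipe descriptor would look something like this:

struct pollfd pfd = { .fd = g_workfds[0], .events = POLLIN };

int n = poll(&pfd, 1, 1000); // one descriptor, 1000 ms timeout
if (n == 0) {
    // timed out
} else if (n > 0 && (pfd.revents & POLLIN)) {
    // pipe is readable
}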
Besides that, that workaround is fine for me; it's not the best, of course, but it's fine without involving another thread or alarm(), etc.
You could use time instead of uptime. It could cause an error if the system date gets changed, but then it will keep working, as the value gets updated and the loop continues waiting for 1 second, no matter what the time is.
My question is about exactly when the I/O thread in an asynchronous I/O call returns when a call back function is involved. Specifically, given this very general code for reading a file ...
#include <stdio.h>
#include <aio.h>
...
// callback function:
void finish_aio(sigval_t sigval) {
    /* do stuff ... maybe close the file */
}

int main() {
    struct aiocb my_aiocb;
    int aio_return;
    ...
    //Open file, take care of any other prelims, then
    //Fill in file-specific info for my_aiocb, then
    //Fill in callback information for my_aiocb:
    my_aiocb.aio_sigevent.sigev_notify = SIGEV_THREAD;
    my_aiocb.aio_sigevent.sigev_notify_function = finish_aio;
    my_aiocb.aio_sigevent.sigev_notify_attributes = NULL;
    my_aiocb.aio_sigevent.sigev_value.sival_ptr = &info_on_file;
    // then read the file:
    aio_return = aio_read(&my_aiocb);
    // do stuff that doesn't need data that is being read ...
    // then block execution until read is complete:
    while(aio_error(&my_aiocb) == EINPROGRESS) {}
    // etc.
}
I understand that the callback function is called as soon as the read of the file is completed. But what exactly happens then? Does the I/O thread start running the callback finish_aio()? Or does it spawn a new thread to handle that callback, while it returns to the main thread? Another way to put this would be: When does aio_error(&my_aiocb) stop returning EINPROGRESS? Is it just before the call to the callback, or when the callback is completed?
I understand that the callback function is called as soon as the read of the file is completed. But what exactly happens then?
What happens is that when the IO finishes it "behaves as if" it started a new thread (similar to calling pthread_create(&ignored, NULL, finish_aio, &info_on_file)).
When does aio_error(&my_aiocb) stop returning EINPROGRESS?
I'd expect that aio_error(&my_aiocb) stops returning EINPROGRESS as soon as the IO finishes, then the system (probably the standard library) either begins creating a new thread to call finish_aio() or "unblocks" a "previously created without you knowing" thread. However, I don't think the exact order is documented anywhere ("implementation defined") because it doesn't make much sense to call aio_error(&my_aiocb) from anywhere other than the finish_aio() anyway.
More specifically; if you're using polling (my_aiocb.aio_sigevent.sigev_notify = SIGEV_NONE) then you'd repeatedly check aio_error(&my_aiocb) yourself and you can't care if you're notified before or after this happens because you're not notified at all; and if you aren't using polling you'd wait until you are notified (via a new thread or a signal) that there's a reason to check aio_error(&my_aiocb).
In other words, your finish_aio() would look more like this:
void finish_aio(sigval_t sigval) {
    struct aiocb *my_aiocb = (struct aiocb *)sigval.sival_ptr;
    int status;

    status = aio_error(my_aiocb);
    /* Figure out what to do (handle the error or handle the file's data) */
}
... and for main(), the while(aio_error(&my_aiocb) == EINPROGRESS) loop (which may waste a huge amount of CPU time for nothing) would be deleted and/or possibly replaced with something else (e.g. a pthread_cond_wait() to wait until the code in finish_aio() does a pthread_cond_signal() to tell the main thread it can continue).
To understand this, let's take a look at what pure polling would look like:
int main() {
    struct aiocb my_aiocb;
    int aio_return;
    ...
    //Open file, take care of any other prelims, then
    //Fill in file-specific info for my_aiocb, then
    my_aiocb.aio_sigevent.sigev_notify = SIGEV_NONE; /* CHANGED! */
    // my_aiocb.aio_sigevent.sigev_notify_function = finish_aio;
    // my_aiocb.aio_sigevent.sigev_notify_attributes = NULL;
    // my_aiocb.aio_sigevent.sigev_value.sival_ptr = &info_on_file;
    // then read the file:
    aio_return = aio_read(&my_aiocb);
    // do stuff that doesn't need data that is being read ...
    // then block execution until read is complete:
    while(aio_error(&my_aiocb) == EINPROGRESS) {}
    finish_aio(my_aiocb.aio_sigevent.sigev_value); /* ADDED! (call it yourself; set sigev_value first if the callback uses it) */
}
In this case it behaves almost the same as your original code, except that there's no extra thread (and you can't care if the "thread that doesn't exist" is started before or after aio_error(&my_aiocb) returns a value other than EINPROGRESS).
The problem with pure polling is that the while(aio_error(&my_aiocb) == EINPROGRESS) could waste a huge amount of CPU time constantly checking when nothing has happened yet.
The main purpose of using my_aiocb.aio_sigevent.sigev_notify = SIGEV_THREAD is to avoid wasting a possibly huge amount of CPU time polling when nothing changed (not forgetting that in some cases wasting CPU time polling like this can prevent other threads, including the finish_aio() thread, from getting CPU time). In other words, you want to delete the while(aio_error(&my_aiocb) == EINPROGRESS) loop, so you used SIGEV_THREAD so that you can delete that polling loop.
The new problem is that (if the main thread has to wait until the data is ready) you need some other way for the main thread to wait until the data is ready. However, typically it's not "the aio_read() completed" that you actually care about, it's something else. For example, maybe the raw file data is a bunch of values in a text file (like "12, 34, 56, 78") and you want to parse that data and create an array of integers, and want to notify the main thread that the array of integers is ready (and don't want to notify the main thread if you're starting to parse the file's data). It might be like:
int parsed_file_result = 0;

void finish_aio(sigval_t sigval) {
    struct aiocb *my_aiocb = (struct aiocb *)sigval.sival_ptr;
    int status;

    status = aio_error(my_aiocb);
    close(my_aiocb->aio_fildes);
    if(status == 0) {
        /* Read was successful */
        parsed_file_result = parse_file_data(); /* Create the array of integers */
    } else {
        /* Read failed, handle the error somehow */
        parsed_file_result = -1; /* Tell main thread we failed to create the array of integers */
    }
    /* Tell the main thread it can continue somehow */
}
One of the best ways to tell the main thread it can continue (at the end of finish_aio()) is to use pthread condition variables (e.g. pthread_cond_signal() called at the end of finish_aio(), with pthread_cond_wait() in the main thread). In this case the main thread will simply block (the kernel/scheduler will not give it any CPU time until pthread_cond_signal() is called), so it wastes no CPU time polling.
Sadly, pthread condition variables aren't trivial (they require a mutex, initialization, etc.), and teaching/showing their use in full here is a little too far from the original topic. Fortunately, you shouldn't have much trouble finding a good tutorial elsewhere.
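For orientation only, the signal/wait pairing might be condensed to something like this (illustrative names, static initializers, no error handling):

#include <pthread.h>

pthread_mutex_t result_lock  = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  result_ready = PTHREAD_COND_INITIALIZER;
int result_available = 0;                 /* protected by result_lock */

/* at the end of finish_aio(): */
pthread_mutex_lock(&result_lock);
result_available = 1;
pthread_cond_signal(&result_ready);
pthread_mutex_unlock(&result_lock);

/* in main(), instead of the EINPROGRESS polling loop: */
pthread_mutex_lock(&result_lock);
while (!result_available)
    pthread_cond_wait(&result_ready, &result_lock);
pthread_mutex_unlock(&result_lock);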
The important part is that if you used SIGEV_THREAD (so that you can delete that awful while(aio_error(&my_aiocb) == EINPROGRESS) polling loop) you're left with no reason to call aio_error(&my_aiocb) until after the finish_aio() has already been started; and no reason to care if aio_error(&my_aiocb) would've been changed (or not) before finish_aio() is started.
I'm trying to 'roughly' calculate the time of a thread context switch in a Linux system. I've written a program that uses pipes and multi-threading to achieve this. When running the program the calculated time is clearly wrong (see output below). I am unsure if this is due to me using the wrong clock_id for this procedure, or perhaps my implementation is flawed.
I have used sched_setaffinity() so as to only have the program run on core 0. I've tried to leave as much fluff out of the code so as to only measure the time of a context switch, so the thread only writes a single character to the pipe and the parent does a 0 byte read.
I have a parent thread that creates one child thread, with a one-way pipe between them to pass data; the child thread runs a simple function to write to the pipe:
void* thread_1_function()
{
    write(fd2[1], "", sizeof(""));
}
while the parent thread creates the child thread, starts the time counter, and then calls a read on the pipe that the child thread writes to:
int main(int argc, char *argv[])
{
    //time struct declaration
    struct timespec start, end;

    //sets program to only use core 0
    cpu_set_t cpu_set;
    CPU_ZERO(&cpu_set);
    CPU_SET(0, &cpu_set);
    if((sched_setaffinity(0, sizeof(cpu_set_t), &cpu_set) < 1))
    {
        int nproc = sysconf(_SC_NPROCESSORS_ONLN);
        int k;
        printf("Processor used: ");
        for(k = 0; k < nproc; ++k)
        {
            printf("%d ", CPU_ISSET(k, &cpu_set));
        }
        printf("\n");

        if(pipe(fd1) == -1)
        {
            printf("fd1 pipe error");
            return 1;
        }
        //fail on file descriptor 2 fail
        if(pipe(fd2) == -1)
        {
            printf("fd2 pipe error");
            return 1;
        }

        pthread_t thread_1;
        pthread_create(&thread_1, NULL, &thread_1_function, NULL);
        pthread_join(thread_1, NULL);

        int i;
        uint64_t sum = 0;
        for(i = 0; i < iterations; ++i)
        {
            //initialize clock start
            clock_gettime(CLOCK_MONOTONIC, &start);
            //wait for child thread to write to pipe
            read(fd2[0], input, 0);
            //record clock end
            clock_gettime(CLOCK_MONOTONIC, &end);
            write(fd1[1], "", sizeof(""));

            uint64_t diff;
            diff = billion * (end.tv_sec - start.tv_sec) + end.tv_nsec - start.tv_nsec;
            sum += diff;
        }
The results I get while running this are typically of this order:
3000
3000
4000
2000
12000
3000
5000
and so forth. When I inspect the times returned in the start and end timespec structs, I see that tv_nsec seems to be a 'rounded' number as well:
start.tv_nsec: 714885000, end.tv_nsec: 714888000
Would this be caused by CLOCK_MONOTONIC not being precise enough for what I'm attempting to measure, or some other problem that I'm overlooking?
I see that tv_nsec seems to be a 'rounded' number as well:

2626, 714885000, 2626, 714888000

Would this be caused by CLOCK_MONOTONIC not being precise enough for what I'm attempting to measure, or some other problem that I'm overlooking?
Yes, that's a possibility. Every clock supported by the system has a fixed resolution. struct timespec is capable of supporting clocks with nanosecond resolution, but that does not mean that you can expect every clock to actually have such resolution. It looks like your CLOCK_MONOTONIC might have a resolution of 1 microsecond (1000 nanoseconds), but you can check that via the clock_getres() function.
If it is available to you, then you might try CLOCK_PROCESS_CPUTIME_ID. It is possible that that would have higher resolution than CLOCK_MONOTONIC for you, but do note that single-microsecond resolution is pretty precise -- that's on the order of one tick per 3000 CPU cycles on a modern machine.
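For instance, a quick check of what both clocks report (printing only the nanosecond field for brevity; on older glibc you may need to link with -lrt):

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec res;

    clock_getres(CLOCK_MONOTONIC, &res);
    printf("CLOCK_MONOTONIC resolution: %ld ns\n", res.tv_nsec);

    clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res);
    printf("CLOCK_PROCESS_CPUTIME_ID resolution: %ld ns\n", res.tv_nsec);

    return 0;
}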
Even so, I see several possible problems with your approach:
Although you set your process to have affinity for a single CPU, that does not prevent the system from scheduling other processes on that CPU, too. Thus, unless you've taken additional measures, you can't be certain -- it's not even likely -- that every context switch away from one of your program's threads is to the other thread.
You start your second thread and then immediately join it. There is no more context switching between your threads after that, because your second thread no longer exists after being successfully joined.
read() with a count of 0 may or may not check for errors, and it certainly does not transfer any data. It is totally unclear to me why you identify the time for that call with the time for a context switch.
If a context switch does occur in the space you're timing, then at least two need to occur there -- away from your program and back to it. Also, you're measuring the time consumed by whatever else runs in the other context as well, not just the switch time. The 1000-nanosecond steps may thus reflect time slices, rather than switching time.
Your main thread is writing null characters to the write end of a pipe, but there does not appear to be anything reading them. If indeed there isn't then this will eventually fill up the pipe's buffer and block. The purpose is lost on me.
I have a function that is called at specific intervals. I need to check the time it was previously called and the current time. If the difference between the calls is 10 milliseconds, then execute some piece of code. Sleep should not be used, since some other things are executing in parallel. I have written the following code, and the function is called every 10 milliseconds, but the difference I am calculating is sometimes 1 or 2 milliseconds less. What is the best way to calculate the difference?
fxn()
{
    int logCurTime;
    static int logPrevTime = 0, logDiffTime = 0;

    getCurrentTimeInMilliSec(&logCurTime);
    if (logPrevTime > 0)
        logDiffTime += logCurTime - logPrevTime;
    if (logCurTime <= logPrevTime)
        return;
    if (logDiffTime >= 10)
    {
        ...
        ...
        logDiffTime = 0;
    }
    logPrevTime = logCurTime;
}
For example:
fxn is called 10 times with an interval of 10 milliseconds. In some instances logDiffTime is just 8 or 9, and the next instance accounts for the remaining time, i.e., 11 or 12.
Using sleep() to get code executed in specific time intervals is indeed a bad idea. Register your function as the handler for a timer interrupt. Then it will be called very precisely on time.
If you're doing heavy lifting in your function, then you should do it in another thread, because you will run into trouble if your function takes too long (it will just be called from the beginning again).
In POSIX (Linux) you could do it like this:
#include <sys/time.h>
#include <stdio.h>
#include <signal.h>

/* Note: a signal handler receives the signal number, so fxn needs an int parameter. */
void fxn(int signum)
{
    /* ... your periodic work ... */
}

int main(void)
{
    if (signal(SIGALRM, fxn) == SIG_ERR)
        perror("Setting your function as timer handler failed");

    unsigned seconds = 42; // your time
    struct itimerval old, new_time;
    new_time.it_interval.tv_usec = 0; // also set it_interval if the timer should repeat
    new_time.it_interval.tv_sec = 0;
    new_time.it_value.tv_usec = 0;
    new_time.it_value.tv_sec = (long int) seconds;
    if (setitimer(ITIMER_REAL, &new_time, &old) != 0)
        perror("Setting the timer failed");
}
or in Windows:
#include <Windows.h>

void CALLBACK Fxn_Timer_Proc_Wrapper(HWND hwnd, UINT msg, UINT_PTR id, DWORD time)
{
    fxn();
}

unsigned seconds = 42; // your time
UINT_PTR timer_id;
if ((timer_id = SetTimer(NULL, 0, seconds * 1000, (TIMERPROC) Fxn_Timer_Proc_Wrapper)) == 0) {
    // failed to create a timer
}
It may not be exactly what you are looking for, however I feel it should be clarified:
The sleep call only suspends the calling thread, not all threads of the process. Thus, you can still run parallel threads while one of them sleeps.
See this question for more:
Do sleep functions sleep all threads or just the one who call it?
For a solution to your problem you should register your function with a timer interrupt. See the other answer on how to do that.
10 ms is at the edge of what is achievable; see the Stack Overflow question "1ms timer". However, several suggestions on how to get 10 ms did come out.
timerfd_create allows your program to wait using select.
timer_settime allows your program to request the 10ms interval.
The caveats on Linux are:
May not be scheduled - the OS could be busy doing something else.
May not be accurate - as 10ms appears to be the shortest interval that works, it may be +/- 1 or 2 ms.
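As an illustration of the timerfd route, a minimal sketch of a 10 ms periodic timer waited on with select() (error handling omitted; fxn() is the function from the question):

#include <sys/timerfd.h>
#include <sys/select.h>
#include <stdint.h>
#include <unistd.h>

void run_every_10ms(void)
{
    struct itimerspec its = {
        .it_interval = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 }, /* 10 ms period */
        .it_value    = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 }, /* first expiry */
    };
    int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
    timerfd_settime(tfd, 0, &its, NULL);

    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(tfd, &rfds);
        select(tfd + 1, &rfds, NULL, NULL, NULL);     /* block until the timer fires */

        uint64_t expirations;
        read(tfd, &expirations, sizeof(expirations)); /* how many 10 ms ticks elapsed */
        fxn();                                        /* the periodic work */
    }
}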
I am just starting to look into multi-threaded programming and thread safety. I am familiar with busy-waiting and after a bit of research I am now familiar with the theory behind spin locks, so I thought I would have a look at OSSpinLock's implementation on the Mac. It boils down to the following function (defined in objc-os.h):
static inline void ARRSpinLockLock(ARRSpinLock *l)
{
again:
    /* ... Busy-waiting ... */
    thread_switch(THREAD_NULL, SWITCH_OPTION_DEPRESS, 1);
    goto again;
}
(Full implementation here)
After doing a bit of digging, I now have an approximate idea of what thread_switch's parameters do (this site is where I found it). My interpretation of what I have read is that this particular call to thread_switch will switch to the next available thread, and decrease the current thread's priority to an absolute minimum for 1 cycle. 'Eventually' (in CPU time) this thread will become active again and immediately execute the goto again; instruction which starts the busy-waiting all over again.
My question though, is why is this call actually necessary? I found another implementation of a spin-lock (for Windows this time) here and it doesn't include a (Windows-equivalent) thread switching call at all.
You can implement a spin lock in many different ways. If you find another SpinLock implementation for Windows you'll see another algorithm for that (it may involve SetThreadPriority, Sleep or SwitchToThread).
The default implementation of ARRSpinLockLock is clever enough: after a first spinning cycle it "depresses" the thread priority for a while, which has the following advantages:
it gives more opportunities to the thread that owns the lock to release it;
it wastes less CPU time (and power!) performing NOP or PAUSE.
The Windows implementation doesn't do it because the Windows API doesn't offer that opportunity (there is no equivalent thread_switch() function, and multiple calls to SetThreadPriority could be less efficient).
I actually don't think they're that different. In the first case:
static inline void ARRSpinLockLock(ARRSpinLock *l)
{
    unsigned y;
again:
    if (__builtin_expect(__sync_lock_test_and_set(l, 1), 0) == 0) {
        return;
    }
    for (y = 1000; y; y--) {
#if defined(__i386__) || defined(__x86_64__)
        asm("pause");
#endif
        if (*l == 0) goto again;
    }
    thread_switch(THREAD_NULL, SWITCH_OPTION_DEPRESS, 1);
    goto again;
}
We try to acquire the lock. If that fails, we spin in the for loop, and if the lock has become available in the meantime we immediately try to reacquire it; if not, we relinquish the CPU.
In the other case:
inline void Enter(void)
{
    int prev_s;
    do
    {
        prev_s = TestAndSet(&m_s, 0);
        if (m_s == 0 && prev_s == 1)
        {
            break;
        }
        // relinquish current timeslice (can only
        // be used when OS available and
        // we do NOT want to 'spin')
        // HWSleep(0);
    }
    while (true);
}
Note the comment below the if, which actually says that we could either spin or relinquish the CPU if the OS gives us that option. In fact the second example seems to just leave that part up to the programmer [insert your preferred way of continuing the code here], so in a sense it's not a complete implementation like the first one.
My take on the whole thing, and I'm commenting on the first snippet, is that they're trying to achieve a balance between being able to get the lock fast (within 1000 iterations) and not hogging the CPU too much (hence we eventually switch if the lock does not become available).
I'm using select() on a Linux/ARM platform to see if a udp socket has received a packet. I'd like to know how much time was remaining in the select call if it returns before the timeout (having detected a packet).
Something along the lines of:
int wait_fd(int fd, int msec)
{
    struct timeval tv;
    fd_set rws;

    tv.tv_sec = msec / 1000ul;
    tv.tv_usec = (msec % 1000ul) * 1000ul;

    FD_ZERO(&rws);
    FD_SET(fd, &rws);

    (void)select(fd + 1, &rws, NULL, NULL, &tv);

    if (FD_ISSET(fd, &rws)) { /* There is data */
        msec = (tv.tv_sec * 1000) + (tv.tv_usec / 1000);
        return (msec ? msec : 1);
    } else { /* There is no data */
        return 0;
    }
}
The safest thing is to ignore the ambiguous definition of select() and time it yourself.
Just get the time before and after the select and subtract that from the interval you wanted.
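Along those lines, a sketch of wait_fd() that ignores whatever select() leaves in the timeval and measures the elapsed time itself with clock_gettime() (keeping the original contract of returning the remaining milliseconds, or 0 on timeout):

#include <sys/select.h>
#include <time.h>

int wait_fd(int fd, int msec)
{
    struct timeval tv = { .tv_sec = msec / 1000, .tv_usec = (msec % 1000) * 1000 };
    struct timespec before, after;
    fd_set rws;

    FD_ZERO(&rws);
    FD_SET(fd, &rws);

    clock_gettime(CLOCK_MONOTONIC, &before);
    int n = select(fd + 1, &rws, NULL, NULL, &tv);
    clock_gettime(CLOCK_MONOTONIC, &after);

    if (n <= 0)
        return 0;                              /* timeout or error: no data */

    long elapsed_ms = (after.tv_sec - before.tv_sec) * 1000
                    + (after.tv_nsec - before.tv_nsec) / 1000000;
    int remaining = msec - (int)elapsed_ms;
    return remaining > 0 ? remaining : 1;      /* data arrived; report at least 1 ms left */
}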
If I recall correctly, the select() function treats the timeout as an in/out parameter, and when select returns, the time remaining is returned in the timeout variable.
Otherwise, you will have to record the current time before calling, and again after, and obtain the difference between the two.
From "man select" on OSX:
Timeout is not changed by select(), and may be reused on subsequent calls; however, it is good style to re-initialize it before each invocation of select().
You'll need to call gettimeofday before calling select, and then gettimeofday on exit.
[Edit] It seems that Linux is slightly different:
(ii) The select function may update the timeout parameter to indicate how much time was left. The pselect function does not change this parameter.
On Linux, the function select modifies timeout to reflect the amount of time not slept; most other implementations do not do this. This causes problems both when Linux code which reads timeout is ported to other operating systems, and when code is ported to Linux that reuses a struct timeval for multiple selects in a loop without reinitializing it. Consider timeout to be undefined after select returns.
Linux select() updates the timeout argument to reflect the time that has passed.
Note that this is not portable across other systems (hence the warning in the OS X manual quoted above) but does work with Linux.
Do not use select(); try your code with an fd larger than 1024 and see what you get.
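For reference, a poll()-based equivalent is not limited by FD_SETSIZE. It only reports data-vs-timeout here, since poll() does not modify its timeout; to recover the time remaining you would still time the call yourself, as suggested above:

#include <poll.h>

int wait_fd(int fd, int msec)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n = poll(&pfd, 1, msec);               /* timeout is given directly in milliseconds */
    return (n > 0 && (pfd.revents & POLLIN));  /* 1 = data arrived, 0 = timeout or error */
}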