This might be a dumb question, I'm very sorry if that's the case, but I'm struggling to take advantage of the multiple cores in my Quad-Core MacBook to perform multiple computations at the same time. This is not for any particular project, just a general question, since I want to learn for when I eventually do need to do this kind of thing.
I am aware of threads, but they seem to run on the same core, so I don't seem to gain any performance using them for compute-bound operations (they are very useful for socket-based stuff though!).
I'm also aware of processes that can be created with fork, but I'm not sure whether they are guaranteed to use more CPUs, or whether they, like threads, just help with IO-bound operations.
Finally, I'm aware of CUDA, which allows parallelism on the GPU (and I think OpenCL and compute shaders also allow my code to run on the CPU in parallel), but I'm currently looking for something that will let me take advantage of the multiple CPU cores my computer has.
In Python, I'm aware of the multiprocessing module, which provides an API very similar to threads, but there I do seem to gain an edge by running multiple functions performing computations in parallel. I'm looking into how I could get this same advantage in C, but I don't seem to be able to.
Any help pointing me in the right direction would be very much appreciated.
Note: I'm trying to achieve true parallelism, not concurrency.
Note 2: I'm only aware of threads and of using multiple processes in C. With threads I don't seem to be able to get the performance boost I want, and I'm not very familiar with processes, so I'm still not sure whether running multiple processes is guaranteed to give me the advantage I'm looking for.
A simple program to heat up your CPU (100% utilization of all available cores).
Hint: the thread start function does not return; exit the program via [CTRL + C].
#include <pthread.h>
void* func(void *arg)
{
while (1);
}
int main()
{
#define NUM_THREADS 4 //use the number of cores (if known)
pthread_t threads[NUM_THREADS];
for (int i=0; i < NUM_THREADS; ++i)
pthread_create(&threads[i], NULL, func, NULL);
for (int i=0; i < NUM_THREADS; ++i)
pthread_join(threads[i], NULL);
return 0;
}
Compilation:
gcc -pthread -o thread_test thread_test.c
If I start ./thread_test, all cores are at 100%.
A word about fork and pthread_create:
fork creates a new process (the current process image is copied and executed in parallel), while pthread_create creates a new thread, sometimes called a lightweight process.
Both processes and threads run in 'parallel' to the parent process.
When to use a child process over a thread depends on the situation: e.g. a child is able to replace its process image (via the exec family) and has its own address space, while threads share the address space of the current parent process.
There are of course a lot more differences; for those I recommend studying the following pages (a small fork + exec sketch follows below):
man fork
man pthreads
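For illustration only (not taken from the man pages; the choice of ls as the program to run is arbitrary), here is a minimal sketch of the exec point above: a forked child replaces its process image with another program, something a thread cannot do, while the parent waits for it:
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (pid == 0) {
        /* child: replace the process image (a thread could not do this) */
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("execlp");           /* only reached if exec fails */
        _exit(EXIT_FAILURE);
    }
    /* parent: wait for the child to terminate */
    waitpid(pid, NULL, 0);
    puts("child finished");
    return 0;
}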
I am aware of threads, but they seem to run on the same core, so I don't seem to gain any performance using them for compute-bound operations (they are very useful for socket-based stuff though!).
No, they don't. Unless they block: if your threads don't block, you'll see all of them running. Just try this (beware that it consumes all your CPU time); it starts 16 threads, each counting in a busy loop for 60 s. You will see all of them running and bringing your cores to their knees (it only runs for a minute this way, then everything ends):
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* for strerror() */
#include <time.h>
#define N 16 /* had 16 cores, so I used this. Put here as many
* threads as cores you have. */
struct thread_data {
pthread_t thread_id; /* the thread id */
struct timespec end_time; /* time to get out of the tunnel */
int id; /* the array position of the thread */
unsigned long result; /* number of times looped */
};
void *thread_body(void *data)
{
struct thread_data *p = data;
p->result = 0UL;
clock_gettime(CLOCK_REALTIME, &p->end_time);
p->end_time.tv_sec += 60; /* 60 s. */
struct timespec now;
do {
/* just get the time */
clock_gettime(CLOCK_REALTIME, &now);
p->result++;
/* if you call printf() you will see them slowing, as there's a
* common buffer that forces all thread to serialize their outputs
*/
/* check if we are over */
} while (now.tv_sec < p->end_time.tv_sec
|| (now.tv_sec == p->end_time.tv_sec
&& now.tv_nsec < p->end_time.tv_nsec));
return p;
} /* thread_body */
int main()
{
struct thread_data thrd_info[N];
for (int i = 0; i < N; i++) {
struct thread_data *d = &thrd_info[i];
d->id = i;
d->result = 0;
printf("Starting thread %d\n", d->id);
int res = pthread_create(&d->thread_id,
NULL, thread_body, d);
if (res != 0) { /* pthread_create returns an error number, not -1 */
fprintf(stderr, "pthread_create: %s\n", strerror(res));
exit(EXIT_FAILURE);
}
printf("Thread %d started\n", d->id);
}
printf("All threads created, waiting for all to finish\n");
for (int i = 0; i < N; i++) {
struct thread_data *joined;
int res = pthread_join(thrd_info[i].thread_id,
(void **)&joined);
if (res != 0) { /* pthread_join also returns an error number */
fprintf(stderr, "pthread_join: %s\n", strerror(res));
exit(EXIT_FAILURE);
}
printf("PTHREAD %d ended, with value %lu\n",
joined->id, joined->result);
}
} /* main */
Linux, like all multithreaded systems, works the same way: it creates a new execution unit (if two units don't share a virtual address space, they are separate processes --not exactly so, but this captures the main difference between a process and a thread--) and the available processors are given to each thread as necessary. Threads are normally encapsulated inside processes (they share the process id --historically not on Linux, if that has not changed recently-- and the virtual memory). Processes each run in a separate virtual address space, so they can only share things through system resources (files, shared memory, communication sockets/pipes, etc.).
The problem with your test case (you don't show it, so I have to guess) is that you probably have all threads in a loop in which they try to print something. If you do that, most of the time each thread is blocked trying to do I/O (to printf() something).
Stdio FILEs have the problem that they share a buffer between all threads that print to the same FILE, and the kernel serializes all write(2) system calls to the same file descriptor. So if most of the time in the loop is spent blocked in a write, the kernel (and stdio) end up serializing all the print calls, making it appear that only one thread is working at a time (all the threads become blocked behind the one that is doing the I/O). The busy loop above makes all the threads run in parallel and shows you how the CPU gets saturated.
Parallelism in C can also be achieved with the fork() function. fork() creates a new process that runs concurrently with, and independently of, the parent; each process gets its own copy of the address space, so the processes can genuinely run on different cores at the same time.
Because the address spaces are separate, data is not shared automatically. The parent uses wait() (or waitpid()) to block until a child process has terminated, which is the usual way to synchronize before collecting results the child has written to a shared resource such as a pipe, a file, or shared memory.
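As a rough sketch of that approach (my own example, not the poster's code; the work function and the child count are made up for illustration), each forked child does compute-bound work on its own core and the parent uses wait() to collect them:
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_CHILDREN 4              /* e.g. one child per core */

/* made-up compute-bound work */
static unsigned long busy_work(int id)
{
    unsigned long sum = 0;
    for (unsigned long i = 0; i < 100000000UL; i++)
        sum += i % (id + 2);
    return sum;
}

int main(void)
{
    for (int i = 0; i < NUM_CHILDREN; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            exit(EXIT_FAILURE);
        }
        if (pid == 0) {             /* child: do the work, then exit */
            unsigned long r = busy_work(i);
            printf("child %d done (%lu)\n", i, r);
            _exit(0);
        }
        /* parent: keep forking */
    }
    while (wait(NULL) > 0)          /* parent waits for every child */
        ;
    puts("all children finished");
    return 0;
}
Because each child has its own address space, results normally come back through a pipe, a file, or shared memory rather than through ordinary variables.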
I am using pthreads with gcc. The simple code example takes the number of threads "N" as a user-supplied input. It splits up a long array into N roughly equally sized subblocks. Each subblock is written into by individual threads.
The dummy processing for this example really involves sleeping for a fixed amount of time for each array index and then writing a number into that array location.
Here's the code:
/******************************************************************************
* FILE: threaded_subblocks_processing
* DESCRIPTION:
* We have a bunch of parallel processing to do and store the results in a
* large array. Let's try to use threads to speed it up.
******************************************************************************/
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <math.h>
#define BIG_ARR_LEN 10000
typedef struct thread_data{
int start_idx;
int end_idx;
int id;
} thread_data_t;
int big_result_array[BIG_ARR_LEN] = {0};
void* process_sub_block(void *td)
{
struct thread_data *current_thread_data = (struct thread_data*)td;
printf("[%d] Hello World! It's me, thread #%d!\n", current_thread_data->id, current_thread_data->id);
printf("[%d] I'm supposed to work on indexes %d through %d.\n", current_thread_data->id,
current_thread_data->start_idx,
current_thread_data->end_idx-1);
for(int i=current_thread_data->start_idx; i<current_thread_data->end_idx; i++)
{
int retval = usleep(1000.0*1000.0*10.0/BIG_ARR_LEN);
if(retval)
{
printf("sleep failed");
}
big_result_array[i] = i;
}
printf("[%d] Thread #%d done, over and out!\n", current_thread_data->id, current_thread_data->id);
pthread_exit(NULL);
}
int main(int argc, char *argv[])
{
if (argc!=2)
{
printf("usage: ./a.out number_of_threads\n");
return(1);
}
int NUM_THREADS = atoi(argv[1]);
if (NUM_THREADS<1)
{
printf("usage: ./a.out number_of_threads (where number_of_threads is at least 1)\n");
return(1);
}
pthread_t *threads = malloc(sizeof(pthread_t)*NUM_THREADS);
thread_data_t *thread_data_array = malloc(sizeof(thread_data_t)*NUM_THREADS);
int block_size = BIG_ARR_LEN/NUM_THREADS;
for(int i=0; i<NUM_THREADS-1; i++)
{
thread_data_array[i].start_idx = i*block_size;
thread_data_array[i].end_idx = (i+1)*block_size;
thread_data_array[i].id = i;
}
thread_data_array[NUM_THREADS-1].start_idx = (NUM_THREADS-1)*block_size;
thread_data_array[NUM_THREADS-1].end_idx = BIG_ARR_LEN;
thread_data_array[NUM_THREADS-1].id = NUM_THREADS-1;
int ret_code;
long t;
for(t=0;t<NUM_THREADS;t++){
printf("[main] Creating thread %ld\n", t);
ret_code = pthread_create(&threads[t], NULL, process_sub_block, (void *)&thread_data_array[t]);
if (ret_code){
printf("[main] ERROR; return code from pthread_create() is %d\n", ret_code);
exit(-1);
}
}
printf("[main] Joining threads to wait for them.\n");
void* status;
for(int i=0; i<NUM_THREADS; i++)
{
pthread_join(threads[i], &status);
}
pthread_exit(NULL);
}
and I compile it with
gcc -pthread threaded_subblock_processing.c
and then I call it from command line like so:
$ time ./a.out 4
I see a speed up when I increase the number of threads. With 1 thread the process takes just a little over 10 seconds. This makes sense because I sleep for 1000 usec per array element, and there are 10,000 array elements. Next when I go to 2 threads, it goes down to a little over 5 seconds, and so on.
What I don't understand is that I get a speed-up even after my number of threads exceeds the number of cores on my computer! I have 4 cores, so I was expecting no speed-up for >4 threads. But, surprisingly, when I run
$ time ./a.out 100
I get a 100x speedup and the processing completes in ~0.1 seconds! How is this possible?
Some general background
A program's progress can be slowed by many things, but, in general, you can divide slow spots (otherwise known as hot spots) into two categories:
CPU Bound: In this case, the processor is doing some heavy number crunching (like trigonometric functions). If all the CPU's cores are engaged in such tasks, other processes must wait.
Memory bound: In this case, the processor is waiting for information to be retrieved from the hard disk or RAM. Since these are typically orders of magnitude slower than the processor, from the CPU's perspective this takes forever.
But you can also imagine other situations in which a process must wait, such as for a network response.
In many of these memory-/network-bound situations, it is possible to put a thread "on hold" while the memory crawls towards the CPU and do other useful work in the meantime. If this is done well then a multi-threaded program can well out-perform its single-threaded equivalent. Node.js makes use of such asynchronous programming techniques to achieve good performance.
Your question
Now, getting back to your question: you have multiple threads going, but they are performing neither CPU-intensive nor memory-intensive work: there's not much there to take up time. In fact, the sleep function is essentially telling the operating system that no work is being done. In this case, the OS can do work in other threads while your threads sleep. So, naturally, the apparent performance increases significantly.
Note that for low-latency applications, such as MPI, busy waiting is sometimes used instead of a sleep function. In this case, the program goes into a tight loop and repeatedly checks a condition. Externally, the effect looks similar, but sleep uses no CPU while the busy wait uses ~100% of the CPU.
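To make the contrast concrete, here is a small sketch of my own (not from the question) showing both kinds of one-second wait; the sleeping version yields the CPU, while the busy wait keeps a core at roughly 100%:
#include <time.h>
#include <unistd.h>

/* sleeping wait: the OS can schedule other threads in the meantime */
static void wait_sleeping(unsigned int secs)
{
    sleep(secs);
}

/* busy wait: the core stays busy until the deadline passes */
static void wait_busy(unsigned int secs)
{
    time_t deadline = time(NULL) + (time_t)secs;
    while (time(NULL) < deadline)
        ;                           /* spin, repeatedly checking the condition */
}

int main(void)
{
    wait_sleeping(1);
    wait_busy(1);
    return 0;
}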
I'm beginning to learn Linux C programming, but I've run into a problem that really confuses me.
I use the times() function, but the returned values are all 0.
OK, I made a mistake and have changed the code.
But this has little to do with printf: clock_t is defined as long on Linux, so I cast clock_t to long.
This is my code:
#include <sys/times.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> /* for sleep() */
int main()
{
long clock_times;
struct tms begintime;
sleep(5);
if((clock_times=times(&begintime))==-1)
perror("get times error");
else
{
printf("%ld\n",(long)begintime.tms_utime);
printf("%ld\n",(long)begintime.tms_stime);
printf("%ld\n",(long)begintime.tms_cutime);
printf("%ld\n",(long)begintime.tms_cstime);
}
return 0;
}
the output:
0
0
0
0
They all print 0.
I also used gdb to debug, and the fields of begintime are likewise zero.
It has nothing to do with the printf function.
Please help.
This is not unusual; the process simply hasn't used enough CPU time to measure. The time the process spends in sleep() doesn't count against the program's CPU time, since times() measures "the CPU time charged for the execution of user instructions" (among other related times), i.e. the time the process has spent executing user/kernel code.
Change your program to the following which uses more CPU and can therefore be measured:
#include <sys/times.h>
#include <sys/time.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h> /* for time() */
int main()
{
long clock_times;
struct tms begintime;
unsigned i;
for (i = 0; i < 1000000; i++)
time(NULL); // An arbitrary library call
if((clock_times=times(&begintime))==-1)
perror("get times error");
else
{
printf("%ld %ld %ld %ld\n",
(long)begintime.tms_utime,
(long)begintime.tms_stime,
(long)begintime.tms_cutime,
(long)begintime.tms_cstime);
}
return 0;
}
Your code uses close to no CPU time, so the results are correct. Sleep suspends your program's execution; everything that happens during that time is not your execution time, so it isn't counted.
Add an empty loop and you'll see the difference (of course, disable compiler optimisations, or the empty loop will be removed).
Take a look at the output of the 'time' program (time ./a.out): it prints the 'real' time (estimated by gettimeofday(), I suppose), the user time (time wasted by your userspace code) and the system time (time wasted within system calls, e.g. writing to a file, opening a network connection, etc.).
(Sure, by 'wasted' I mean 'used', but whatever.)
I'm debugging a multi-threaded problem with C, pthreads and Linux. On my MacOS 10.5.8 (Core 2 Duo) it runs fine; on my Linux computers (2-4 cores) it produces undesired output.
I'm not experienced, therefore I attached my code. It's rather simple: each new thread creates two more threads until a maximum is reached. So no big deal... as I thought until a couple of days ago.
Can I force single-core execution to prevent my bugs from occurring?
I profiled the program execution, instrumenting with Valgrind:
valgrind --tool=drd --read-var-info=yes --trace-mutex=no ./threads
I get a couple of conflicts in the BSS segment, which are caused by my global structs and thread counter variables. However, I could mitigate these conflicts with forced single-core execution, because I think the concurrent scheduling on my 2-4 core test systems is responsible for my errors.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define MAX_THR 12
#define NEW_THR 2
int wait_time = 0; // log global wait time
int num_threads = 0; // how many threads there are
pthread_t threads[MAX_THR]; // global array to collect threads
pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER; // sync
struct thread_data
{
int nr; // nr of thread, serves as id
int time; // wait time from rand()
};
struct thread_data thread_data_array[MAX_THR+1];
void
*PrintHello(void *threadarg)
{
if(num_threads < MAX_THR){
// using the argument
pthread_mutex_lock(&mut);
struct thread_data *my_data;
my_data = (struct thread_data *) threadarg;
// updates
my_data->nr = num_threads;
my_data->time= rand() % 10 + 1;
printf("Hello World! It's me, thread #%d and sleep time is %d!\n",
my_data->nr,
my_data->time);
pthread_mutex_unlock(&mut);
// counter
long t = 0;
for(t = 0; t < NEW_THR; t++){
pthread_mutex_lock(&mut);
num_threads++;
wait_time += my_data->time;
pthread_mutex_unlock(&mut);
pthread_create(&threads[num_threads], NULL, PrintHello, &thread_data_array[num_threads]);
sleep(1);
}
printf("Bye from %d thread\n", my_data->nr);
pthread_exit(NULL);
}
return 0;
}
int
main (int argc, char *argv[])
{
long t = 0;
// srand(time(NULL));
if(num_threads < MAX_THR){
for(t = 0; t < NEW_THR; t++){
// -> 2 threads entry point
pthread_mutex_lock(&mut);
// rand time
thread_data_array[num_threads].time = rand() % 10 + 1;
// update global wait time variable
wait_time += thread_data_array[num_threads].time;
num_threads++;
pthread_mutex_unlock(&mut);
pthread_create(&threads[num_threads], NULL, PrintHello, &thread_data_array[num_threads]);
pthread_mutex_lock(&mut);
printf("In main: creating initial thread #%ld\n", t);
pthread_mutex_unlock(&mut);
}
}
for(t = 0; t < MAX_THR; t++){
pthread_join(threads[t], NULL);
}
printf("Bye from program, wait was %d\n", wait_time);
pthread_exit(NULL);
}
I hope that code isn't too bad. I haven't done much C in a rather long time. :) The problem is:
printf("Bye from %d thread\n", my_data->nr);
my_data->nr sometimes resolves to large integer values:
In main: creating initial thread #0
Hello World! It's me, thread #2 and sleep time is 8!
In main: creating initial thread #1
[...]
Hello World! It's me, thread #11 and sleep time is 8!
Bye from 9 thread
Bye from 5 thread
Bye from -1376900240 thread
[...]
I don't know any more ways to profile and debug this.
If I debug it, it works - sometimes. Sometimes it doesn't :(
Thanks for reading this long question. :) I hope I didn't share too much of my currently unresolvable confusion.
Since this program seems to be just an exercise in using threads, with no actual goal, it is difficult to suggest how to treat your problem rather than treat the symptom. I believe you can actually pin a process or thread to a processor in Linux, but doing so for all threads removes most of the benefit of using threads, and I don't actually remember how to do it. Instead I'm going to talk about some things wrong with your program.
C compilers often make a lot of assumptions when they are doing optimizations. One of those assumptions is that unless the code currently being examined looks like it might change some variable, that variable does not change (this is a very rough approximation, and a more accurate explanation would take a very long time).
In this program you have variables which are shared and changed by different threads. If a variable is only read by threads (either const, or effectively const after the threads that look at it are created) then you don't have much to worry about (and in "read by threads" I'm including the main original thread): since the variable doesn't change, whether the compiler generates code that reads it once (remembering it in a local temporary variable) or reads it over and over, the value is always the same, so calculations based on it always come out the same.
To force the compiler not to do this you can use the volatile keyword. It is affixed to variable declarations just like the const keyword, and tells the compiler that the value of that variable can change at any instant, so it must be reread every time its value is needed and rewritten every time a new value is assigned.
NOTE that for pthread_mutex_t (and similar) variables you do not need volatile. If it were needed on the type(s) that make up pthread_mutex_t on your system, volatile would have been used within the definition of pthread_mutex_t. Additionally, the functions that access this type take its address and are specially written to do the right thing.
I'm sure now you are thinking that you know how to fix your program, but it is not that simple. You are doing math on a shared variable. Doing math on a variable using code like:
x = x + 1;
requires that you know the old value to generate the new value. If x is global then you have to conceptually load x into a register, add 1 to that register, and then store that value back into x. On a RISC processor you actually have to do all 3 of those instructions, and being 3 instructions I'm sure you can see how another thread accessing the same variable at nearly the same time could end up storing a new value in x just after we have read our value -- making our value old, so our calculation and the value we store will be wrong.
If you know any x86 assembly then you probably know that it has instructions that can do math on values in RAM (both getting from and storing the result in the same location in RAM all in one instruction). You might think that this instruction could be used for this operation on x86 systems, and you would almost be right. The problem is that this instruction is still executed in the steps that the RISC instruction would be executed in, and there are several opportunities for another processor to change this variable at the same time as we are doing our math on it. To get around this on x86 there is a lock prefix that may be applied to some x86 instructions, and I believe that glibc header files include atomic macro functions to do this on architectures that can support it, but this can't be done on all architectures.
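If your compiler supports C11, the <stdatomic.h> header exposes this kind of atomic read-modify-write portably. A minimal sketch (my own, not part of the program under discussion):
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int counter = 0;              /* shared between threads */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        atomic_fetch_add(&counter, 1);      /* one indivisible load-add-store */
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    printf("counter = %d\n", atomic_load(&counter));   /* always 400000 */
    return 0;
}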
To work right on all architectures you are going to need to:
int local_thread_count;
int create_a_thread;
pthread_mutex_lock(&count_lock);
local_thread_count = num_threads;
if (local_thread_count < MAX_THR) {
num_threads = local_thread_count + 1;
pthread_mutex_unlock(&count_lock);
thread_data_array[local_thread_count].nr = local_thread_count;
/* moved this into the creator
* since getting it in the
* child will likely get the
* wrong value. */
pthread_create(&threads[local_thread_count], NULL, PrintHello,
&thread_data_array[local_thread_count]);
} else {
pthread_mutex_unlock(&count_lock);
}
Now, with num_threads marked volatile and the test-and-increment done while holding the mutex, the thread count is updated safely in all threads. At the end of this, local_thread_count is usable as an index into the array of threads. Note that I created only 1 thread in this code, while yours was supposed to create several; I did this to make the example clearer, but it should not be too difficult to change it to add NEW_THR to num_threads. Just note that if NEW_THR is 2 and MAX_THR - num_threads is 1 (somehow), then you have to handle that case correctly.
Now, all of that being said, there may be another way to accomplish similar things by using semaphores. Semaphores are like mutexes, but they have a count associated with them. You would not get a value to use as the index into the array of threads (the function to read a semaphore count won't really give you this), but I thought that it deserved to be mentioned since it is very similar.
man 3 semaphore.h
will tell you a little bit about it.
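As a rough illustration of that idea (my own sketch using POSIX unnamed semaphores, e.g. on Linux; it is not the answerer's code), a counting semaphore can cap how many worker threads exist at once:
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

#define MAX_THR 4                   /* at most this many workers at a time */

static sem_t slots;                 /* counts free worker slots */

static void *worker(void *arg)
{
    (void)arg;
    sleep(1);                       /* pretend to work */
    sem_post(&slots);               /* give the slot back */
    return NULL;
}

int main(void)
{
    pthread_t t[8];
    sem_init(&slots, 0, MAX_THR);
    for (int i = 0; i < 8; i++) {
        sem_wait(&slots);           /* blocks while MAX_THR workers exist */
        pthread_create(&t[i], NULL, worker, NULL);
    }
    for (int i = 0; i < 8; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&slots);
    return 0;
}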
num_threads should at least be marked volatile, and preferably marked atomic too (although I believe that an int is practically fine), so that at least there is a higher chance that the different threads are seeing the same values. You might want to view the assembler output to see when the writes of num_threads to memory are actually supposed to take place.
https://computing.llnl.gov/tutorials/pthreads/#PassingArguments
That seems to be the problem: you need to malloc the thread_data struct (see the sketch below).
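A minimal sketch of that fix (my own wording of the pattern described in the linked tutorial): allocate one thread_data struct per thread on the heap, so each thread gets its own stable copy, and free it inside the thread:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct thread_data { int nr; };

static void *PrintHello(void *arg)
{
    struct thread_data *d = arg;
    printf("Hello from thread #%d\n", d->nr);
    free(d);                                    /* each thread owns its own copy */
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++) {
        struct thread_data *d = malloc(sizeof *d);  /* per-thread allocation */
        d->nr = i;
        pthread_create(&t[i], NULL, PrintHello, d);
    }
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return 0;
}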
I don't know exactly how to word a search for this.. so I haven't had any luck finding anything.. :S
I need to implement a time delay in C.
for example I want to do some stuff, then wait say 1 minute, then continue on doing stuff.
Did that make sense? Can anyone help me out?
In standard C (C99), you can use time() to do this, something like:
#include <time.h>
:
void waitFor (unsigned int secs) {
unsigned int retTime = time(0) + secs; // Get finishing time.
while (time(0) < retTime); // Loop until it arrives.
}
By the way, this assumes time() returns a 1-second resolution value. I don't think that's mandated by the standard so you may have to adjust for it.
In order to clarify, this is the only way I'm aware of to do this with ISO C99 (and the question is tagged with nothing more than "C" which usually means portable solutions are desirable although, of course, vendor-specific solutions may still be given).
By all means, if you're on a platform that provides a more efficient way, use it. As several comments have indicated, there may be specific problems with a tight loop like this, with regard to CPU usage and battery life.
Any decent time-slicing OS would be able to drop the dynamic priority of a task that continuously uses its full time slice but the battery power may be more problematic.
However C specifies nothing about the OS details in a hosted environment, and this answer is for ISO C and ISO C alone (so no use of sleep, select, Win32 API calls or anything like that).
And keep in mind that POSIX sleep can be interrupted by signals. If you are going to go down that path, you need to do something like:
int finishing = 0; // set finishing in signal handler
// if you want to really stop.
void sleepWrapper (unsigned int secs) {
unsigned int left = secs;
while ((left > 0) && (!finishing)) // Don't continue if signal has
left = sleep (left); // indicated exit needed.
}
Here is how you can do it on most desktop systems:
#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif
void wait( int seconds )
{ // Pretty crossplatform, both ALL POSIX compliant systems AND Windows
#ifdef _WIN32
Sleep( 1000 * seconds );
#else
sleep( seconds );
#endif
}
int
main( int argc, char **argv)
{
int running = 3;
while( running )
{ // do something
--running;
wait( 3 );
}
return 0; // OK
}
Here is how you can do it on a microcomputer / processor w/o timer:
int wait_loop0 = 10000;
int wait_loop1 = 6000;
// for microprocessor without timer, if it has a timer refer to vendor documentation and use it instead.
void
wait( int seconds )
{ // this function needs to be finetuned for the specific microprocessor
int i, j, k;
for(i = 0; i < seconds; i++)
{
for(j = 0; j < wait_loop0; j++)
{
for(k = 0; k < wait_loop1; k++)
{ // waste function, volatile makes sure it is not being optimized out by compiler
int volatile t = 120 * j * i + k;
t = t + 5;
}
}
}
}
int
main( int argc, char **argv)
{
int running = 3;
while( running )
{ // do something
--running;
wait( 3 );
}
return 0; // OK
}
The wait_loop variables must be fine-tuned; those values worked pretty closely for my computer, but frequency scaling makes this very imprecise on a modern desktop system, so don't use it there unless you're on bare metal and not doing that kind of thing.
Check sleep(3) man page or MSDN for Sleep
Although many implementations have the time function return the current time in seconds, there is no guarantee that every implementation will do so (e.g. some may return milliseconds rather than seconds). As such, a more portable solution is to use the difftime function.
difftime is guaranteed by the C standard to return the difference in time in seconds between two time_t values. As such we can write a portable time delay function which will run on all compliant implementations of the C standard.
#include <time.h>
void delay(double dly){
/* save start time */
const time_t start = time(NULL);
time_t current;
do{
/* get current time */
time(&current);
/* break loop when the requested number of seconds have elapsed */
}while(difftime(current, start) < dly);
}
One caveat with the time and difftime functions is that the C standard never specifies a granularity. Most implementations have a granularity of one second. While this is all right for delays lasting several seconds, our delay function may wait too long for delays lasting under one second.
There is a portable standard C alternative: the clock function.
The clock function returns the implementation’s best approximation to the processor time used by the program since the beginning of an implementation-defined era related only to the program invocation. To determine the time in seconds, the value returned by the clock function should be divided by the value of the macro CLOCKS_PER_SEC.
The clock function solution is quite similar to our time function solution:
#include <time.h>
void delay(double dly){
/* save start clock tick */
const clock_t start = clock();
clock_t current;
do{
/* get current clock tick */
current = clock();
/* break loop when the requested number of seconds have elapsed */
}while((double)(current-start)/CLOCKS_PER_SEC < dly);
}
There is a caveat in this case similar to that of time and difftime: the granularity of the clock function is left to the implementation. For example, machines with 32-bit values for clock_t with a resolution in microseconds may end up wrapping the value returned by clock after 2147 seconds (about 36 minutes).
As such, consider using the time and difftime implementation of the delay function for delays lasting at least one second, and the clock implementation for delays lasting under one second.
A final word of caution: clock returns processor time rather than calendar time; clock may not correspond with the actual elapsed time (e.g. if the process sleeps).
For delays as large as one minute, sleep() is a nice choice.
If someday, you want to pause on delays smaller than one second, you may want to consider poll() with a timeout.
Both are POSIX.
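A sketch of the poll() variant (the delay_ms wrapper name is my own): polling zero file descriptors simply blocks for the given timeout in milliseconds.
#include <poll.h>

/* Sleep for the given number of milliseconds using poll() with no fds. */
static void delay_ms(int ms)
{
    poll(NULL, 0, ms);              /* timeout is in milliseconds */
}

int main(void)
{
    delay_ms(250);                  /* pause for a quarter of a second */
    return 0;
}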
There are no sleep() functions in the pre-C11 C Standard Library, but POSIX does provide a few options.
The POSIX function sleep() (unistd.h) takes an unsigned int argument for the number of seconds desired to sleep. Although this is not a Standard Library function, it is widely available, and glibc appears to support it even when compiling with stricter settings like --std=c11.
The POSIX function nanosleep() (time.h) takes two pointers to timespec structures as arguments, and provides finer control over the sleep duration. The first argument specifies the delay duration. If the second argument is not a null pointer, it holds the time remaining if the call is interrupted by a signal handler.
Programs that use the nanosleep() function may need to include a feature test macro in order to compile. The following code sample will not compile on my linux system without a feature test macro when I use a typical compiler invocation of gcc -std=c11 -Wall -Wextra -Wpedantic.
POSIX once had a usleep() function (unistd.h) that took a useconds_t argument to specify sleep duration in microseconds. This function also required a feature test macro when used with strict compiler settings. Alas, usleep() was made obsolete with POSIX.1-2001 and should no longer be used. It is recommended that nanosleep() be used now instead of usleep().
#define _POSIX_C_SOURCE 199309L // feature test macro for nanosleep()
#include <stdio.h>
#include <unistd.h> // for sleep()
#include <time.h> // for nanosleep()
int main(void)
{
// use unsigned sleep(unsigned seconds)
puts("Wait 5 sec...");
sleep(5);
// use int nanosleep(const struct timespec *req, struct timespec *rem);
puts("Wait 2.5 sec...");
struct timespec ts = { .tv_sec = 2, // seconds to wait
.tv_nsec = 5e8 }; // additional nanoseconds
nanosleep(&ts, NULL);
puts("Bye");
return 0;
}
Addendum:
C11 does have the header threads.h providing thrd_sleep(), which works identically to nanosleep(). GCC did not support threads.h until 2018, with the release of glibc 2.28. It has been difficult in general to find implementations with support for threads.h (Clang did not support it for a long time, but I'm not sure about the current state of affairs there). You will have to use this option with care.
Try sleep(int number_of_seconds)
sleep(int) works as a good delay. For a minute:
//Doing some stuff...
sleep(60); //Freeze for A minute
//Continue doing stuff...
Is it a timer you need?
For Win32, try http://msdn.microsoft.com/en-us/library/ms687012%28VS.85%29.aspx
On some platforms (e.g. Turbo C's dos.h) you can simply call the delay() function; it takes milliseconds, so if you want to delay the process by 3 seconds, call delay(3000)...
If you are certain you want to wait and never get interrupted then use sleep in POSIX or Sleep in Windows. In POSIX sleep takes time in seconds so if you want the time to be shorter there are varieties like usleep() which uses microseconds. Sleep in Windows takes milliseconds, it is rare you need finer granularity than that.
It may be that you wish to wait a period of time but want to allow interrupts, maybe in the case of an emergency. sleep can be interrupted by signals but there is a better way of doing it in this case.
What I have actually found in practice is that you wait for an event or a condition variable with a timeout.
In Windows your call is WaitForSingleObject. In POSIX it is pthread_cond_timedwait.
In Windows you can also use WaitForSingleObjectEx and then you can actually "interrupt" your thread with any queued task by calling QueueUserAPC. WaitForSingleObject(Ex) will return a code determining why it exited, so you will know when it returns a "TIMEDOUT" status that it did indeed timeout. You set the Event it is waiting for when you want it to terminate.
With pthread_cond_timedwait you can signal or broadcast the condition variable. (If several threads are waiting on the same one, you will need to broadcast to wake them all up.) Each time the wait loops it should check the condition. Your thread can get the current time and see whether the deadline has passed, or it can look to see whether some condition has been met, to determine what to do. If you have some kind of queue, you can check it. (Your thread automatically holds the mutex it used to wait on the condition variable, so when it checks the condition it has sole access to it.)
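A rough sketch of that POSIX pattern (names such as stop_requested and stopper are my own): the waiter blocks until either the deadline passes or another thread signals the condition variable.
#define _POSIX_C_SOURCE 200809L     /* for clock_gettime() */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int stop_requested = 0;      /* the condition being waited on */

/* Wait up to 'secs' seconds, but wake early if stop_requested is set. */
static void interruptible_wait(unsigned int secs)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += secs;

    pthread_mutex_lock(&lock);
    while (!stop_requested) {
        if (pthread_cond_timedwait(&cond, &lock, &deadline) == ETIMEDOUT)
            break;                  /* the deadline passed */
    }
    pthread_mutex_unlock(&lock);
}

/* Another thread calls this to cut the wait short. */
static void *stopper(void *arg)
{
    (void)arg;
    sleep(1);
    pthread_mutex_lock(&lock);
    stop_requested = 1;
    pthread_cond_broadcast(&cond);  /* wake every waiter */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, stopper, NULL);
    interruptible_wait(10);         /* returns after about 1 s, not 10 */
    puts("woken early");
    pthread_join(t, NULL);
    return 0;
}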
// Provides ANSI C method of delaying x milliseconds
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
void delayMillis(unsigned long ms) {
clock_t start_ticks = clock();
unsigned long millis_ticks = CLOCKS_PER_SEC/1000;
while (clock()-start_ticks < ms*millis_ticks) {
}
}
/*
* Example output:
*
* CLOCKS_PER_SEC:[1000000]
*
* Test Delay of 800 ms....
*
* start[2054], end[802058],
* elapsedSec:[0.802058]
*/
void testDelayMillis(void) {
printf("CLOCKS_PER_SEC:[%lu]\n\n", (unsigned long)CLOCKS_PER_SEC);
clock_t start_t, end_t;
start_t = clock();
printf("Test Delay of 800 ms....\n");
delayMillis(800);
end_t = clock();
double elapsedSec = end_t/(double)CLOCKS_PER_SEC;
printf("\nstart[%lu], end[%lu], \nelapsedSec:[%f]\n", start_t, end_t, elapsedSec);
}
int main() {
testDelayMillis();
}
C11 has a function specifically for this:
#include <threads.h>
#include <time.h>
#include <stdio.h>
void sleep(time_t seconds) {
struct timespec time;
time.tv_sec = seconds;
time.tv_nsec = 0;
while (thrd_sleep(&time, &time)) {}
}
int main() {
puts("Sleeping for 5 seconds...");
sleep(5);
puts("Done!");
return 0;
}
Note that this is only available starting in glibc 2.28.
For C with gcc on Windows:
#include <windows.h>
then use Sleep(); // Sleep() with a capital S, not sleep() with a lowercase s
// Sleep(1000) is 1 sec
With clang (and POSIX systems generally), sleep() is available; sleep(1) gives a 1-second delay/wait.
For short delays (say, some microseconds) on Linux OS, you can use "usleep":
// C Program to perform short delays
#include <unistd.h>
#include <stdio.h>
int main(){
printf("Hello!\n");
usleep(1000000); // For a 1-second delay
printf("Bye!\n);
return 0;
system("timeout /t 60"); // waits 60s. this is only for windows vista,7,8
system("ping -n 60 127.0.0.1 >nul"); // waits 60s. for all windows
Write this code:
#include <stdio.h>

void delay(int x)
{
    volatile int i, j;  /* volatile, so the empty loops are not optimized away */
    for (i = 0; i < x; i++) {
        for (j = 0; j < 200000; j++) {}
    }
}

int main(void)
{
    while (1) {
        delay(500);
        printf("Host name");
        printf("\n");
    }
}