I've always used clock() to measure how much time my application took from start to finish, like this:
#include <time.h>

int main(int argc, char *argv[]) {
    const clock_t START = clock();
    // ... do the work ...
    const double T_ELAPSED = (double)(clock() - START) / CLOCKS_PER_SEC;
}
Since I've started using POSIX threads, this seems to fail. It looks like clock() increases N times faster with N threads. As I don't know how many threads are going to be running simultaneously, this approach fails. So how can I measure how much time has passed?
clock() measures the CPU time used by your process, not the wall-clock time. When you have multiple threads running simultaneously, you can obviously burn through CPU time much faster.
If you want to know the wall-clock execution time, you need to use an appropriate function. The only one in ANSI C is time(), which typically only has 1 second resolution.
However, as you've said you're using POSIX, that means you can use clock_gettime(), defined in time.h. The CLOCK_MONOTONIC clock in particular is the best to use for this:
struct timespec start, finish;
double elapsed;
clock_gettime(CLOCK_MONOTONIC, &start);
/* ... */
clock_gettime(CLOCK_MONOTONIC, &finish);
elapsed = (finish.tv_sec - start.tv_sec);
elapsed += (finish.tv_nsec - start.tv_nsec) / 1000000000.0;
(Note that I have done the calculation of elapsed carefully to ensure that precision is not lost when timing very short intervals).
If your OS doesn't provide CLOCK_MONOTONIC (which you can check at runtime with sysconf(_SC_MONOTONIC_CLOCK)), then you can use CLOCK_REALTIME as a fallback - but note that the latter has the disadvantage that it will generate incorrect results if the system time is changed while your process is running.
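As a rough sketch of that runtime check (pick_clock is just an illustrative name, not a standard function):

#include <time.h>   /* clockid_t, CLOCK_MONOTONIC, CLOCK_REALTIME */
#include <unistd.h> /* sysconf, _SC_MONOTONIC_CLOCK */

/* Prefer CLOCK_MONOTONIC; fall back to CLOCK_REALTIME if the monotonic
 * clock is reported as unavailable at runtime. */
static clockid_t pick_clock(void)
{
    if (sysconf(_SC_MONOTONIC_CLOCK) > 0)
        return CLOCK_MONOTONIC;
    return CLOCK_REALTIME;
}

You would then pass the returned clock ID to both clock_gettime() calls above.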
What timing resolution do you need? You could use time() from time.h for second resolution. If you need higher resolution, then you could use something more system-specific. See Timer function to provide time in nano seconds using C++.
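For the second-resolution case, a minimal sketch with time() and difftime() might look like this:

#include <stdio.h>
#include <time.h>

int main(void)
{
    time_t start = time(NULL);

    /* ... work to be timed ... */

    time_t stop = time(NULL);

    /* difftime() avoids assuming time_t is an integer type */
    printf("elapsed: about %.0f seconds\n", difftime(stop, start));
    return 0;
}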
Related
I am trying to understand the clock_t clock(void); function better and have the following question:
Did I understand correctly that clock measures the number of ticks of a process while it is actively running, and that sleep suspends the calling thread – in this case there is only one thread, namely the main thread – and therefore suspends the whole process? Which means that clock does not measure the CPU time (ticks) of the process while it sleeps, since it is not actively running?
If so what is the recommended way to measure the actual needed time?
"The clock() function returns an approximation of processor time used by the program." Source
"The CPU time (process time) is measured in clock ticks or seconds." Source
"The number of clock ticks per second can be obtained using: sysconf(_SC_CLK_TCK);" Source
#include <stdio.h>  // printf
#include <time.h>   // clock
#include <unistd.h> // sleep, sysconf

int main()
{
    printf("ticks per second: %ld\n", sysconf(_SC_CLK_TCK));

    clock_t ticks_since_process_startup_1 = clock();
    sleep(1);
    clock_t ticks_since_process_startup_2 = clock();

    printf("ticks_probe_1: %ld\n", (long)ticks_since_process_startup_1);
    printf("sleep(1);\n");
    printf("ticks_probe_2: %ld\n", (long)ticks_since_process_startup_2);
    printf("ticks diff: %ld <-- should be 100\n",
           (long)(ticks_since_process_startup_2 - ticks_since_process_startup_1));
    printf("ticks diff sec: %Lf <-- should be 1 second\n",
           (long double)(ticks_since_process_startup_2 - ticks_since_process_startup_1) / CLOCKS_PER_SEC);
    return 0;
}
Resulting Output:
ticks per second: 100
ticks_probe_1: 603
sleep(1);
ticks_probe_2: 616
ticks diff: 13 <-- should be 100
ticks diff sec: 0.000013 <-- should be 1 second
Does clock measure sleep i.e. suspended threads?
No.
(Well, it could measure it; there's nothing against it. You could have a very bad OS that implements sleep() as a while (!time_to_sleep_expired()) {} busy loop. But any self-respecting OS will try to make the process not use the CPU while it is in sleep().)
Which means that clock does not measure the CPU time (ticks) of the process while it sleeps, since it is not actively running?
Yes.
If so what is the recommended way to measure the actual needed time?
To measure "real-time" use clock_gettime(CLOCK_MONOTONIC, ...) on a POSIX system.
The number of clock ticks per second can be obtained using: sysconf(_SC_CLK_TCK);
Yes, but note that sysconf(_SC_CLK_TCK) is not CLOCKS_PER_SEC. You do not use, for example, times() in your code; you use clock(), so I do not really see why you print sysconf(_SC_CLK_TCK). Anyway, see for example sysconf(_SC_CLK_TCK) vs. CLOCKS_PER_SEC.
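To illustrate the difference, here is a small sketch (assuming a POSIX system): times() reports CPU time in ticks of 1/sysconf(_SC_CLK_TCK) seconds, while clock() reports it in ticks of 1/CLOCKS_PER_SEC seconds.

#include <stdio.h>
#include <time.h>      /* clock, CLOCKS_PER_SEC */
#include <sys/times.h> /* times, struct tms */
#include <unistd.h>    /* sysconf */

int main(void)
{
    struct tms t;
    times(&t);           /* CPU time in sysconf(_SC_CLK_TCK) units */
    clock_t c = clock(); /* CPU time in CLOCKS_PER_SEC units */

    printf("times(): %.6f s CPU (tick = 1/%ld s)\n",
           (double)(t.tms_utime + t.tms_stime) / sysconf(_SC_CLK_TCK),
           sysconf(_SC_CLK_TCK));
    printf("clock(): %.6f s CPU (tick = 1/%ld s)\n",
           (double)c / CLOCKS_PER_SEC, (long)CLOCKS_PER_SEC);
    return 0;
}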
I have read several forums with the same title, although what I am after is NOT a way to find out how long it takes my computer to execute the program.
I am interested in finding out how long the program is in use by the user. I have looked at several functions inside #include <time.h>; however, it seems as though these functions (like clock()) give the time it takes for my computer to execute my code, which is not what I am after.
Thank you for your time.
EDIT:
I have done:
clock_t start, stop;
long int x;
double duration;
start = clock();
//code
stop = clock(); // get number of ticks after loop
// calculate time taken for loop
duration = ( double ) ( stop - start ) / CLOCKS_PER_SEC;
printf( "\nThe number of seconds for loop to run was %.2lf\n", duration );
I receive 0.16 from the program; however, when I timed it I got 1 minute and 14 seconds. How does this possibly add up?
The clock function counts CPU time used, not elapsed time. For a program that's 100% CPU-intensive, the time reported by clock will be close to wall time, but otherwise it's likely to be less -- often much less.
One easy way of measuring elapsed or "wall clock" time is with the time function:
time_t start, stop;
start = time(NULL);
// code
stop = time(NULL);
printf("The number of seconds for loop to run was %ld\n", stop - start);
time() is part of the standard C library, so it should be available nearly everywhere, although on some freestanding (e.g. embedded) implementations it may be missing or may not return a meaningful value.
I think you're looking for the getrusage() function. It doesn't time a single function the way a stop-watch does; instead it reports the overall resource usage of the current process, split into user CPU time and system CPU time. This is roughly what the time command reports.
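A minimal sketch of how that might look (assuming a POSIX system that provides <sys/resource.h>):

#include <stdio.h>
#include <sys/resource.h> /* getrusage, struct rusage, RUSAGE_SELF */

int main(void)
{
    /* ... code to be measured ... */

    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) == 0) {
        /* ru_utime: CPU time spent in user mode; ru_stime: in the kernel */
        printf("user   %ld.%06ld s\n",
               (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec);
        printf("system %ld.%06ld s\n",
               (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec);
    }
    return 0;
}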
I code inside a virtual machine (Linux Ubuntu) that's installed on a Windows host.
According to this page, the CLOCKS_PER_SEC value in the <time.h> library on Linux should always be 1 000 000.
When I run this code:
#include <stdio.h>
#include <time.h>

int main()
{
    printf("%ld\n", (long)CLOCKS_PER_SEC);
    while (1)
        printf("%f, %f \n\n", (double)clock(), (double)clock() / CLOCKS_PER_SEC);
    return 0;
}
The first value is 1 000 000 as it should be; however, the value that should show the number of seconds does NOT increase at the normal pace (it takes between 4 and 5 seconds to increase by 1).
Is this due to my working on a virtual machine? How can I solve this?
This is expected.
The clock() function does not return the wall time (the time that real clocks on the wall display). It returns the amount of CPU time used by your program. If your program is not consuming every possible scheduler slice, then it will increase more slowly than wall time; if your program consumes slices on multiple cores at the same time, it can increase faster.
So if you call clock(), and then sleep(5), and then call clock() again, you'll find that clock() has barely increased at all. Even though sleep(5) waits for 5 real seconds, it doesn't consume any CPU, and CPU usage is what clock() measures.
If you want to measure wall clock time you will want clock_gettime() (or the older version gettimeofday()). You can use CLOCK_REALTIME if you want to know the civil time (e.g. "it's 3:36 PM") or CLOCK_MONOTONIC if you want to measure time intervals. In this case, you probably want CLOCK_MONOTONIC.
#include <stdio.h>
#include <time.h>

int main() {
    struct timespec start, now;
    clock_gettime(CLOCK_MONOTONIC, &start);
    while (1) {
        clock_gettime(CLOCK_MONOTONIC, &now);
        printf("Elapsed: %f\n",
               (now.tv_sec - start.tv_sec) +
               1e-9 * (now.tv_nsec - start.tv_nsec));
    }
}
The usual proscriptions against using busy-loops apply here.
I'm trying to measure execution time in C using clock() under Linux, using the following:
#include <time.h>
#include <stdio.h>
#include <unistd.h>
int main(int argc, char const* argv[])
{
    clock_t begin, end;
    begin = clock();
    sleep(2);
    end = clock();
    double spent = ((double)(end - begin)) / CLOCKS_PER_SEC;
    printf("%ld %ld, spent: %f\n", begin, end, spent);
    return 0;
}
The output is:
1254 1296, spent: 0.000042
The documentation says to divide the clock time by CLOCKS_PER_SEC to get the execution time in sec, but this seems pretty incorrect for a 2sec sleep.
What's the problem?
Sleeping takes almost no execution time. The program just has to schedule its wakeup and then put itself to sleep. While it's asleep, it is not executing. When it's woken up, it doesn't have to do anything at all. That all takes a very tiny fraction of a second.
It doesn't take more execution time to sleep longer. So the fact that there's a 2 second period when the program is not executing has no effect.
clock measures CPU time (in Linux at least). A sleeping process consumes no CPU time.
If you want to measure a time interval as if with a stopwatch, regardless of what your process is doing, use clock_gettime with CLOCK_MONOTONIC.
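Applied to the program in the question, a sketch measuring the same sleep with both clock() and CLOCK_MONOTONIC side by side (assuming a POSIX system) could look like this:

#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    struct timespec begin_ts, end_ts;
    clock_t begin = clock();
    clock_gettime(CLOCK_MONOTONIC, &begin_ts);

    sleep(2);

    clock_t end = clock();
    clock_gettime(CLOCK_MONOTONIC, &end_ts);

    /* CPU time: close to zero, since a sleeping process uses no CPU */
    printf("CPU time:  %f s\n", (double)(end - begin) / CLOCKS_PER_SEC);
    /* Wall-clock time: close to 2 seconds */
    printf("wall time: %f s\n",
           (end_ts.tv_sec - begin_ts.tv_sec) +
           1e-9 * (end_ts.tv_nsec - begin_ts.tv_nsec));
    return 0;
}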
man clock() has the answer:
The clock() function returns an approximation of processor time used by the program.
This clearly states that clock() returns the processor time used by the program, not what you were expecting: the total run time of the program, which is typically measured using gettimeofday.
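For completeness, a minimal gettimeofday()-based sketch (gettimeofday() is declared in <sys/time.h>; newer code would usually prefer clock_gettime()):

#include <stdio.h>
#include <sys/time.h> /* gettimeofday, struct timeval */
#include <unistd.h>   /* sleep */

int main(void)
{
    struct timeval begin, end;
    gettimeofday(&begin, NULL);

    sleep(2); /* or whatever is being timed */

    gettimeofday(&end, NULL);
    double spent = (end.tv_sec - begin.tv_sec) +
                   1e-6 * (end.tv_usec - begin.tv_usec);
    printf("spent: %f s\n", spent);
    return 0;
}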
I'm trying to port u8glib (a graphics library) to a MIPS processor on an OpenWrt router.
Here's an example in an ARM environment.
As such, one of the routines I must implement is:
delay_micro_seconds(uint32_t us)
Since this is a high resolution unit of time, how can I do this reliably in my environment? I've tried the following, but I'm not even sure how to validate it:
nanosleep((struct timespec[]){{0, 1000}}, NULL);
How can I validate this approach? If it's a bad approach, how could I reliably delay by 1 microsecond in C?
EDIT: I've tried this, but I'm getting strange output. I expect the difference between the two prints to be 1000 ns * 10 iterations = 10,000 ns, but it is actually closer to 670,000 nanoseconds:
#include <stdio.h>
#include <time.h>

int main(int argc, char **argv)
{
    long res, resb;
    struct timespec ts, tsb;
    int i;

    res = clock_gettime(CLOCK_REALTIME, &ts);
    for (i = 0; i < 10; i++) {
        nanosleep((struct timespec[]){{0, 1000}}, NULL);
    }
    resb = clock_gettime(CLOCK_REALTIME, &tsb);

    if (0 == res) printf("%ld %ld\n", ts.tv_sec, ts.tv_nsec);
    else perror("clock_gettime");
    if (0 == resb) printf("%ld %ld\n", tsb.tv_sec, tsb.tv_nsec);
    else perror("clock_gettime"); // getting 670k delta instead of 10k for tv_nsec

    return 0;
}
I assume that your code runs under Linux.
First, use clock_getres(2) to find the clock's resolution.
Then, clock_nanosleep(2) may give you a more accurate sleep.
To validate the sleep, I suggest you check the elapsed time with clock_gettime(2):
res = clock_gettime(CLOCK_REALTIME, &ts);
if (0 == res) printf("%ld %ld\n", ts.tv_sec, ts.tv_nsec);
else perror("clock_gettime");
clock_nanosleep(CLOCK_REALTIME, 0, &delay, NULL);
res = clock_gettime(CLOCK_REALTIME, &ts);
if (0 == res) printf("%ld %ld\n", ts.tv_sec, ts.tv_nsec);
else perror("clock_gettime");
Also, if necessary, you can recompile your kernel with higher HZ configuration.
It would be helpful to read the time(7) man page, especially the "The software clock, HZ, and jiffies" and "High-resolution timers" sections.
Though my man page says that high-resolution timers are not supported on the MIPS architecture, I just googled it and mips-linux apparently does support HRT.
Is this sleep synchronous? (I'm not familiar with your platform)
Validation
If you have development tools available, are there any profiling tools included? They can at least help you measure execution time.
You can also loop around your sleep call 1000+ times in a test program. Before you enter the loop, get a timestamp from the system clock. After the loop is done cycling, take another timestamp and compare it with the first. Remember the loop itself will have some time overhead, but otherwise this will let you know how accurately (overshoot or undershoot) your sleep performs. (If you cycle 1,000,000 times around a 1 microsecond sleep function, you would expect it to finish quite near to 1 second.)
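A sketch of that validation harness, reusing the nanosleep() call from the question and CLOCK_MONOTONIC timestamps (given the per-call overhead seen in the question, expect this to take noticeably longer than 1 second in practice):

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec start, stop, delay = { 0, 1000 }; /* request 1 us */
    const long iterations = 1000000;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        nanosleep(&delay, NULL);
    clock_gettime(CLOCK_MONOTONIC, &stop);

    double total = (stop.tv_sec - start.tv_sec) +
                   1e-9 * (stop.tv_nsec - start.tv_nsec);
    /* Ideally ~1 s; the excess is per-call overshoot (syscall and scheduling cost) */
    printf("requested 1 s of sleeps, measured %f s (%f us per call)\n",
           total, 1e6 * total / iterations);
    return 0;
}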
Alternative
Sleep functions are not always perfect to their resolution, but they promise to be simple to use and always get you in the neighborhood of what they say. Many individual statements, such as a++;, run much quicker than a microsecond.
Using a similar method to the one above, you could make a homemade synchronous timer with good accuracy using a for loop with some pointless statement inside it. Once you find out how many iterations land you nearest to 1 microsecond, that count should stay roughly the same on the same hardware and build, and you could hardcode a function out of it.
If you intend your delay to be asynchronous with a multi-tasking process in mind, this obviously would not cooperate with the other tasks well.
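Finally, here is a rough sketch of the calibrated busy-wait described above (all names are illustrative; the calibration depends on the CPU, compiler optimization level, and frequency scaling, so treat it as a starting point rather than a reliable delay_micro_seconds() implementation):

#include <stdio.h>
#include <stdint.h>
#include <time.h>

/* Busy-spin for roughly `us` microseconds using a previously calibrated
 * iterations-per-microsecond count; volatile keeps the compiler from
 * removing the empty loop. */
static void spin_us(uint32_t us, uint32_t iters_per_us)
{
    for (volatile uint32_t i = 0; i < us * iters_per_us; i++)
        ; /* pointless statement, as described above */
}

/* One-off calibration: time a large number of empty iterations against
 * CLOCK_MONOTONIC and derive iterations per microsecond. */
static uint32_t calibrate(void)
{
    const uint32_t n = 10000000;
    struct timespec a, b;
    clock_gettime(CLOCK_MONOTONIC, &a);
    for (volatile uint32_t i = 0; i < n; i++)
        ;
    clock_gettime(CLOCK_MONOTONIC, &b);
    double us = (b.tv_sec - a.tv_sec) * 1e6 + (b.tv_nsec - a.tv_nsec) * 1e-3;
    return (uint32_t)(n / us);
}

int main(void)
{
    uint32_t iters = calibrate();
    printf("~%u iterations per microsecond\n", (unsigned)iters);
    spin_us(10, iters); /* delay roughly 10 us */
    return 0;
}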