Interleaved usleep() calls appear to execute together at the end. Is this compiler optimization? - c

I have code that does something similar to the following repeatedly in a loop:
$ cat test.c
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
int main()
{
    char arr[6] = {'h','e','l','l','o','!'};
    for(int x=0; x<6 ; x++){
        printf("%c",arr[x]);
        usleep(1000000);
        printf("%c",arr[x]);
        usleep(1000000);
    }
    printf("\n");
    return 0;
}
I see that the printf() calls execute one after the other WITHOUT any delay from usleep(), and then the program sleeps for the total usleep() time at the end before the next iteration. It seems like all the usleep() calls happen together at the end.
I tried the -O0 flag in gcc, because I suspected it's the effect of compiler optimization. But I guess the -O0 flag does not disable whatever optimization category this case falls under (if my guess is correct about the compiler being the reason for this behavior).
I am trying to understand the reason for this behavior and how to achieve the desired behavior from my program.
Note: I know it might be possible to replace usleep() with some compute-heavy function call that takes an equivalent amount of time, but that is not the solution I am looking for.

You are using usleep() wrong. Use sleep(1) instead.
From man usleep:
EINVAL usec is greater than or equal to 1000000. (On systems where that is considered an error.)
Once you fix that, you should also call fflush(stdout) after printf() to avoid another surprise with output buffering.
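Putting both fixes together, here is a minimal sketch of the loop, assuming the goal is simply a one-second pause between characters (sleep(1) instead of usleep(1000000), and fflush(stdout) so each character leaves the stdio buffer immediately):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char arr[6] = {'h','e','l','l','o','!'};
    for (int x = 0; x < 6; x++) {
        printf("%c", arr[x]);
        fflush(stdout);   /* push the character to the terminal right away */
        sleep(1);         /* whole-second pause; avoids usleep()'s EINVAL limit */
        printf("%c", arr[x]);
        fflush(stdout);
        sleep(1);
    }
    printf("\n");
    return 0;
}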

Related

Why does the time taken to complete the program differ between usleep and sleep in C?

#include <stdio.h>

main()
{
    printf("Sleep for 5 milliseconds to exit.\n");
    sleep(0.005);
    printf("usleep for 5 milliseconds to exit.\n");
    usleep(5000);
    return 0;
}
Why does the sleep call take so much more time to execute than the usleep call, taking seconds rather than milliseconds?
The sleep function is declared in <unistd.h> on POSIX systems as:
unsigned int sleep(unsigned int seconds);
If you do not include <unistd.h>, the sleep function has no declaration and the compiler, using obsolete pre-ANSI semantics, infers an implicit declaration of int sleep() and passes the double argument as-is where an unsigned int is expected. This has undefined behavior. The actual unsigned int value received by the system could be anything at all; this can cause a long pause as you observe, but it could also crash depending on the system's ABI...
If you provide a proper declaration by including <unistd.h>, the 0.005 double argument will be implicitly converted to 0 and the program will not pause at all.
Note also that omitting the return type of main() is an obsolete feature. Avoid this old-style programming and upgrade your compiler.
It is recommended to enable all warnings to avoid such silly bugs: use the -Wall -Wextra -pedantic options with gcc and clang.
Note also that both sleep and usleep may cause the program to pause for longer than the specified argument, depending on the current system load.
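For reference, a corrected sketch of the program above, assuming the intent was a short sub-second pause (note that sleep() only accepts whole seconds, so anything shorter has to go through usleep() or nanosleep()):

#include <stdio.h>
#include <unistd.h>   /* declares sleep() and usleep() */

int main(void)
{
    printf("usleep for 5 milliseconds before exiting.\n");
    usleep(5000);      /* 5000 microseconds = 5 ms */

    printf("sleep for 1 whole second before exiting.\n");
    sleep(1);          /* sleep() takes whole seconds only */

    return 0;
}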

Sleep function in C (POSIX) breaks my program

This is my program code:
#include <unistd.h>
#include <stdio.h>
#include <time.h>
#include <stdlib.h>
#include <sys/types.h>
void function() {
    srand(time(NULL));
    while(1) {
        int n = rand();
        printf("%d ", n);
        //sleep(1);
    }
}

int main() {
    pid_t pid;
    pid = fork();
    if (pid == 0) {
        function();
    }
}
With the sleep line commented out (as in the code above), the program works fine: it prints a bunch of random numbers, too fast to even see whether they are actually random. But if I remove the comment, the program doesn't print anything and exits (not even the first number, before it gets to the sleep), even though it compiles without warnings or errors with or without the comment.
but if I remove the comment the program doesn't print anything and exits
It does not print, but it does not really exit either. It is still running as a process in the background, and that process runs your infinite while loop.
Using your code in p.c:
$ gcc p.c
$ ./a.out
$ ps -A | grep a.out
267282 pts/0 00:00:00 a.out
$ killall a.out
$ killall a.out
a.out: no process found
The problem is that printf does not really print. It only sends data to the output buffer. In order to force the output buffer to be printed, invoke fflush(stdout).
If you're not flushing, then you just rely on the behavior of the terminal you're using. It's very common for terminals to flush when you write a newline character to the output stream. That's one reason why it's preferable to use printf("data\n") instead of printf("\ndata"). See this question for more info: https://softwareengineering.stackexchange.com/q/381711/283695
I'd suspect that if you just leave your program running, it will eventually print. It makes sense that it has a finite buffer and that it flushes when it gets full. But that's just an (educated) guess, and it depends on your terminal.
it prints a bunch of random numbers too fast to even see if they are actually random
How do you see if a sequence of numbers is random? (Playing the devil's advocate.)
I believe you need to call fflush(3) from time to time. See also setvbuf(3) and stdio(3) and sysconf(3).
I guess that if you coded:
while(1) {
    int n = rand();
    printf("%d ", n);
    if (n % 4 == 0)
        fflush(NULL);
    sleep(1);
}
The behavior of your program might be more user-friendly. The stdout buffer might hold several dozen kilobytes at least.
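As an alternative to sprinkling fflush() calls, here is a small sketch of the setvbuf(3) route mentioned above, assuming a trailing newline per number is acceptable so that line buffering flushes each line as it is written:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    /* Must be done before the first output to stdout.
       _IOLBF = line-buffered: every '\n' flushes the buffer.
       Use _IONBF instead for completely unbuffered output. */
    setvbuf(stdout, NULL, _IOLBF, BUFSIZ);

    srand(time(NULL));
    for (int i = 0; i < 5; i++) {
        printf("%d\n", rand());   /* the newline triggers the flush */
        sleep(1);
    }
    return 0;
}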
BTW, I could be wrong. Check by reading a recent C draft standard (perhaps n2176).
At the very least, see this C reference website then syscalls(2), fork(2) and sleep(3).
You need to call waitpid(2) or a similar function for every successful fork(2).
If on Linux, read also Advanced Linux Programming and use both strace(1) and gdb(1) to understand the behavior of your program. With GCC don't forget to compile it as gcc -Wall -Wextra -g to get all warnings and debug info.
Consider also using the Clang static analyzer.
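To illustrate the waitpid(2) point, here is a minimal sketch of the fork/wait structure, with the child's loop bounded (my own change, so that there is something finite to wait for) and fflush(stdout) after each printf:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

static void function(void) {
    srand(time(NULL));
    for (int i = 0; i < 10; i++) {   /* bounded loop so the child exits */
        printf("%d ", rand());
        fflush(stdout);              /* force the number out immediately */
        sleep(1);
    }
    printf("\n");
}

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {                  /* child */
        function();
        _exit(0);
    } else if (pid > 0) {            /* parent: reap the child */
        int status;
        waitpid(pid, &status, 0);
    } else {
        perror("fork");
        return 1;
    }
    return 0;
}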

How to experience cache miss and hits in Linux system?

Hello, I've been trying to observe cache misses and hits on a Linux system.
To do so, I've written a program in C where I measure the time, in CPU cycles, taken by a printf() call. The first part measures the time needed for a miss and the second one for a hit. Here is the program:
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sched.h>
#include <sys/types.h>
#include <unistd.h>
#include <signal.h>
uint64_t rdtsc() {
    uint64_t a, d;
    asm volatile ("mfence");
    asm volatile ("rdtsc" : "=a" (a), "=d" (d));
    a = (d<<32) | a;
    asm volatile ("mfence");
    return a;
}

int main(int argc, char** argv)
{
    size_t time = rdtsc();
    printf("Hey ");
    size_t delta1 = rdtsc() - time;
    printf("delta: %zu\n", delta1);

    size_t time2 = rdtsc();
    printf("Hey ");
    size_t delta2 = rdtsc() - time2;
    printf("delta: %zu\n", delta2);

    sleep(100);
}
Now I would like to show that two processes (two terminals) share cache. So I thought that running this program in two terminals would result in:
Terminal 1:
miss
hit
Terminal 2:
hit
hit
But now I have something like:
Terminal 1:
miss
hit
Terminal 2:
miss
hit
Is my understanding incorrect? Or my program wrong?
Your assumption is somewhat correct.
printf is part of the libc library. If you use dynamic linking, the operating system may optimize memory usage by only loading the library once for all processes using it.
However, there are multiple reasons why I don't expect you to measure any sizable difference:
- compared to the difference between a cache hit and a cache miss, printf takes an enormous amount of time to complete, and there is a lot going on that introduces noise. With just a single measurement it is very unlikely that you can measure that tiny difference.
- the actual reason for the first measurement taking longer is likely the lazy binding of the library function printf being resolved by the loader (https://maskray.me/blog/2021-09-19-all-about-procedure-linkage-table), or some other magic happening (buffers being set up, etc.) for the first output.
- a lot of libc functions are used by many different processes. If the library is shared, it is likely that printf is already cached even though you did not use it.
I would suggest mounting a Flush+Reload attack (https://eprint.iacr.org/2013/448.pdf) on printf in one of the terminals and using it in the other terminal. Then you may see a timing difference.
Note: to find the actual address of printf for the attack, you need to be familiar with dynamic linking and the PLT. Just using something like void* addr = printf will probably not work!
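As an illustration of that note, here is a sketch of one way to ask the dynamic linker for the libc definition of printf instead of the local PLT stub. It assumes glibc with dynamic linking (link with -ldl on older glibc); whether this is the address a Flush+Reload probe actually needs still depends on how the target maps libc:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* &printf in this executable may resolve to its PLT stub;
       dlsym(RTLD_NEXT, ...) asks for the next definition along
       the search order, i.e. the one in libc. */
    void *plt_addr  = (void *)&printf;
    void *libc_addr = dlsym(RTLD_NEXT, "printf");

    printf("&printf (possibly a PLT stub): %p\n", plt_addr);
    printf("dlsym(RTLD_NEXT, \"printf\"):   %p\n", libc_addr);
    return 0;
}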

Returning From Catching A Floating Point Exception

So, I am trying to return from a floating point exception, but my code keeps looping instead. I can actually exit the process, but what I want to do is return and redo the calculation that causes the floating point error.
The reason the FPE occurs is that I have a random number generator that generates coefficients for a polynomial. Using some LAPACK functions, I solve for the roots and do some other things. Somewhere in this math-intensive chain, a floating point exception occurs. When this happens, what I want to do is advance the random number generator state and try again until the coefficients are such that the error doesn't materialize. It usually doesn't, but very rarely it does, and then it causes catastrophic results.
So I wrote a simple test program to learn how to work with signals. It is below:
In exceptions.h
#ifndef EXCEPTIONS_H
#define EXCEPTIONS_H
#define _GNU_SOURCE
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <math.h>
#include <errno.h>
#include <float.h>
#include <fenv.h>
void overflow_handler(int);
#endif // EXCEPTIONS_H //
In exceptions.c
#include "exceptions.h"
void overflow_handler(int signal_number)
{
    if (feclearexcept(FE_OVERFLOW | FE_UNDERFLOW | FE_DIVBYZERO | FE_INVALID)){
        fprintf(stdout, "Nothing Cleared!\n");
    }
    else{
        fprintf(stdout, "All Cleared!\n");
    }
    return;
}
In main.c
#include "exceptions.h"
int main(void)
{
    int failure;
    float oops;

    //===Enable Exceptions===//
    failure = 1;
    failure = feenableexcept(FE_OVERFLOW | FE_UNDERFLOW | FE_DIVBYZERO | FE_INVALID);
    if (failure){
        fprintf(stdout, "FE ENABLE EXCEPTIONS FAILED!\n");
    }

    //===Create Error Handler===//
    signal(SIGFPE, overflow_handler);

    //===Raise Exception===//
    oops = exp(-708.5);
    fprintf(stdout, "Oops: %f\n", oops);

    return 0;
}
The Makefile
#===General Variables===#
CC=gcc
CFLAGS=-Wall -Wextra -g3 -Ofast

#===The Rules===#
all: makeAll

makeAll: makeExceptions makeMain
	$(CC) $(CFLAGS) exceptions.o main.o -o exceptions -ldl -lm

makeMain: main.c
	$(CC) $(CFLAGS) -c main.c -o main.o

makeExceptions: exceptions.c exceptions.h
	$(CC) $(CFLAGS) -c exceptions.c -o exceptions.o

.PHONY: clean
clean:
	rm -f *~ *.o
Why doesn't this program terminate when I am clearing the exceptions, supposedly successfully? What do I have to do in order to return to main and exit?
If I can do this, I can put code in between returning and exiting, and do something after the FPE has been caught. I think that I will set some sort of flag, then clear the most recent info in the data structures, redo the calculation, etc., based on whether or not that flag is set. The point is, the real program must not abort nor loop forever, but instead must handle the exception and keep going.
Help?
"division by zero", overflow/underflow, etc. result in undefined behaviour in the first place. If the system, however, generates a signal for this, the effect of UB is "suspended". The signal handler takes over instead. But if the handler returns, the effect of UB will "resume".
Therefore, the standard disallows returning from such a situation.
Just think: how would the program have to recover from e.g. DIV0? The abstract machine has no idea about FPU registers or status flags, and even if it did - what result would have to be generated?
C also has no provisions for unwinding the stack properly the way C++ does.
Note also, that generating signals for arithmetic exceptions is optional, so there is no guarantee a signal will actually be generated. The handler is mostly meant to notify about the event and possibly clean up external resources.
Behaviour is different for signals which do not originate from undefined behaviour, but just interrupt program execution. That case is well defined, as the program state is well-defined.
Edit:
If you have to rely on the program continuing under all circumstances, you have to check all arguments of arithmetic operations before doing the actual operation and/or use only safe operations (re-order, use larger intermediate types, etc.). One example for integers might be to use unsigned instead of signed integers, as for those overflow behavior is well-defined (wrap), so intermediate results overflowing will not cause trouble as long as that is corrected afterwards and the wrap is not too large. (Disclaimer: that does not always work, of course.)
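A minimal sketch of that check-before-operating idea, using the exp(-708.5) case from the question; the thresholds are illustrative assumptions, not values from the original post (compile with -lm):

#include <math.h>
#include <float.h>
#include <stdio.h>

/* Guarded division: reject zero/subnormal divisors before dividing.
   (A tiny but normal divisor could still overflow; real code would
   also bound the quotient.) */
static int safe_div(double num, double den, double *out)
{
    if (fabs(den) < DBL_MIN)
        return -1;
    *out = num / den;
    return 0;
}

/* Guarded exp(): below about -708.4 the result of exp() drops into
   the subnormal range and raises FE_UNDERFLOW, so clamp to 0 first. */
static double safe_exp(double x)
{
    if (x < -708.0)
        return 0.0;
    return exp(x);
}

int main(void)
{
    double r;
    if (safe_div(1.0, 0.0, &r) != 0)
        printf("division rejected instead of trapping\n");
    printf("safe_exp(-708.5) = %g\n", safe_exp(-708.5));
    return 0;
}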
Update:
While I am still not completely sure, according to comments, the standard might allow, for a hosted environment at least, using LIA-1 traps and recovering from them (see Annex H). As these are not necessarily precise, I suspect recovery is not possible under all circumstances. Also, math.h might present additional aspects which have to be carefully evaluated.
Finally: I still think there is nothing gained by such an approach, only some uncertainty added compared to using safe algorithms. It would be different if there were not so many different components involved. For a bare-metal embedded system, the view might be completely different.
I think you're supposed to mess around with the calling stack frame if you want to skip an instruction or break out of exp or whatever. This is high voodoo and bound to be unportable.
The GNU C library lets you call sigsetjmp() outside of a signal handler and siglongjmp() back to that point from inside the handler. This seems like a better way to go. Here is a self-contained modification of your program showing how to do it:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <setjmp.h>
#include <math.h>
#include <errno.h>
#include <float.h>
#include <fenv.h>
sigjmp_buf oh_snap;
void overflow_handler(int signal_number) {
    if (feclearexcept(FE_OVERFLOW | FE_UNDERFLOW | FE_DIVBYZERO | FE_INVALID)){
        fprintf(stdout, "Nothing Cleared!\n");
    }
    else{
        fprintf(stdout, "All Cleared!\n");
    }
    siglongjmp(oh_snap, 1);
    return;
}

int main(void) {
    int failure;
    float oops;

    failure = 1;
    failure = feenableexcept(FE_OVERFLOW | FE_UNDERFLOW | FE_DIVBYZERO | FE_INVALID);
    if (failure){
        fprintf(stdout, "FE ENABLE EXCEPTIONS FAILED!\n");
    }

    signal(SIGFPE, overflow_handler);

    if (sigsetjmp(oh_snap, 1)) {
        printf("Oh snap!\n");
    } else {
        oops = exp(-708.5);
        fprintf(stdout, "Oops: %f\n", oops);
    }

    return 0;
}

Error with -mno-sse flag and gettimeofday() in C

A simple C program which uses gettimeofday() works fine when compiled without any flags (gcc 4.5.1), but doesn't give output when compiled with the flag -mno-sse.
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>   /* needed for gettimeofday() and struct timeval */

int main()
{
    struct timeval s,e;
    float time;
    int i;

    gettimeofday(&s, NULL);
    for( i=0; i< 10000; i++);
    gettimeofday(&e, NULL);

    time = e.tv_sec - s.tv_sec + e.tv_usec - s.tv_usec;
    printf("%f\n", time);
    return 0;
}
I have CFLAGS=-march=native -mtune=native
Could someone explain why this happens?
The program returns a correct value normally, but prints "0" when compiled with -mno-sse enabled.
The flag -mno-sse causes floating point arguments to be passed on the stack, whereas the usual x86_64 ABI specifies that they should be passed via SSE registers.
Since printf() in your C library was compiled without -mno-sse, it is expecting floating point arguments to be passed in accordance with the ABI. This is why your code fails. It has nothing to do with gettimeofday().
If you wish to use printf() from your code compiled with -mno-sse and pass it floating point arguments, you will need to recompile your C library with that option and link against that version.
It appears that you are using a loop which does nothing in order to observe a time difference. The problem is, the compiler may optimize this loop away entirely. The issue may not be with -mno-sse itself, but rather that it allows an optimization that removes the loop, thus giving you the same time each time you run it.
I would recommend putting something in that loop which can't be optimized out (such as incrementing a number which you print out at the end). See if you still get the same behavior. If not, I'd recommend looking at the generated assembly (gcc -S) to see what the code differences are.
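A small sketch of that suggestion, assuming a volatile counter is enough to keep the compiler from removing the loop (volatile accesses must be performed); the elapsed time is also computed in integer microseconds so no float ever reaches printf:

#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval s, e;
    volatile long counter = 0;   /* volatile: the updates cannot be optimized out */

    gettimeofday(&s, NULL);
    for (int i = 0; i < 10000; i++)
        counter++;
    gettimeofday(&e, NULL);

    long usec = (e.tv_sec - s.tv_sec) * 1000000L + (e.tv_usec - s.tv_usec);
    printf("counter = %ld, elapsed = %ld usec\n", (long)counter, usec);
    return 0;
}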
The fields tv_usec and tv_sec are usually longs.
Declaring the variable time as a long integer solved the issue.
The following link addresses the issue.
http://gcc.gnu.org/ml/gcc-patches/2006-10/msg00525.html
Working code:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>   /* needed for gettimeofday() and struct timeval */

int main()
{
    struct timeval s,e;
    long time;
    int i;

    gettimeofday(&s, NULL);
    for( i=0; i< 10000; i++);
    gettimeofday(&e, NULL);

    time = e.tv_sec - s.tv_sec + e.tv_usec - s.tv_usec;
    printf("%ld\n", time);
    return 0;
}
Thanks for the prompt replies. Hope this helps.
What do you mean by "doesn't give output"?
0 (zero) is a perfectly reasonable output to expect.
Edit: Try compiling to assembler (gcc -S ...) and see the differences between the normal and the no-sse version.
