Calling pthread_join with one argument causes a segmentation fault? - c

If I call pthread_join continuously (without using other functions) it will cause a segmentation fault.
I can solve the problem by inserting a sleep(); , printf() or anything else between two calls of pthread_join.
OS & GCC Version:
gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Compile command:
gcc demo_thread.c -lpthread -o demo_thread.out
Source code (demo_thread.c):
#include <stdio.h>
#include <stdlib.h>
void *f1(void *);
int main() {
int k = 2;
pthread_t fa[k];
for(int i=0; i<k; i++) {
pthread_create(&fa[i], NULL, f1, NULL);
}
for(int i=0; i<k; i++) {
// printf("%ul\n", fa[i]); // uncomment this line and the problem disappears.
pthread_join(fa[i]);
}
}
void *f1(void *arg) {
for(int i=0; i<4;i++) {
printf("%d\n",i );
}
return 0;
}

How did you even make that compile? I just now realized you did not use #include <pthread.h> and you used one argument instead of two for pthread_join.
If I leave out the include I get
error: unknown type name ‘pthread_t’
And if I do include it then I get
error: too few arguments to function ‘pthread_join’
Oh, I see: if I include <stdlib.h> and leave out <pthread.h>, there is a definition of pthread_t but no declaration of pthread_join. There are still plenty of warnings, though:
warning: implicit declaration of function ‘pthread_join’
You should always build programs with the -Wall -W -pedantic arguments to the compiler. And fix the warnings.
And to explain the crash: because you don't pass a second argument to pthread_join, it receives whatever value happens to be in that argument's register and, unless that value is NULL, writes the thread's return value through it as if it were a valid pointer. It isn't, so the call either overwrites memory it shouldn't or gets a segmentation fault.
And to explain how printf or sleep fixes the problem: on x86-64 the second function argument is passed in the RSI register, and making those calls must leave RSI holding either NULL or a pointer that happens to be writable.
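For reference, here is a minimal corrected sketch of the program from the question. It includes <pthread.h> so the compiler sees the real prototype, and it passes NULL as the second argument to pthread_join because the thread's return value is not needed. The helper name run_threads and the MAX_THREADS bound are illustrative assumptions, not part of the original post.

```c
#include <pthread.h>
#include <stdio.h>

#define MAX_THREADS 16   /* illustrative upper bound, not from the question */

/* Thread body: same work as f1 in the question. */
static void *f1(void *arg) {
    (void)arg;                      /* unused */
    for (int i = 0; i < 4; i++) {
        printf("%d\n", i);
    }
    return NULL;
}

/* Hypothetical helper: start k threads, then join them all.
 * Returns 0 on success, -1 if k is out of range, or the first
 * non-zero error code reported by pthread_create/pthread_join. */
int run_threads(int k) {
    pthread_t fa[MAX_THREADS];

    if (k < 0 || k > MAX_THREADS) {
        return -1;
    }
    for (int i = 0; i < k; i++) {
        int rc = pthread_create(&fa[i], NULL, f1, NULL);
        if (rc != 0) {
            /* Join the threads that were already started. */
            for (int j = 0; j < i; j++) {
                pthread_join(fa[j], NULL);
            }
            return rc;
        }
    }
    for (int i = 0; i < k; i++) {
        /* Second argument is NULL: the return value is discarded. */
        int rc = pthread_join(fa[i], NULL);
        if (rc != 0) {
            return rc;
        }
    }
    return 0;
}
```

Calling run_threads(2) from main and building with gcc -Wall -Wextra -pedantic -pthread should no longer produce the implicit-declaration warning.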

Related

Why am I getting the error: undefined reference to `sqrt' when passing a variable, although it compiles successfully when passing a constant as an argument? [duplicate]

I created a small program, as follows:
#include <math.h>
#include <stdio.h>
#include <unistd.h>
int main(int argc, char *argv[]) {
int i;
double tmp;
double xx;
for(i = 1; i <= 30; i++) {
xx = (double) i + 0.01;
tmp = sqrt(xx);
printf("the square root of %0.4f is %0.4f\n", xx,tmp);
sleep(1);
xx = 0;
}
return 0;
}
When I try to compile this with the following command, I get a compiler error.
gcc -Wall calc.c -o calc
returns:
/tmp/ccavWTUB.o: In function `main':
calc.c:(.text+0x4f): undefined reference to `sqrt'
collect2: ld returned 1 exit status
If I replace the variable in the call to sqrt(xx) with a constant like sqrt(10.2), it compiles just fine. Or, if I explicitly link like the following:
gcc -Wall -lm calc.c -o calc
It also works just fine. Can anyone tell me what's causing this? I've been a C programmer for a long time (and I've written similar small programs using math.h) and I have never seen anything like this.
My version of gcc follows:
$ gcc --version
gcc (Ubuntu 4.3.3-5ubuntu4) 4.3.3
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$
If you look at the output of the compiler in the case where you used sqrt(10.2), I'll bet you see that a call to sqrt() isn't actually made.
This happens because GCC recognizes several functions that it can treat specially. This gives it the ability to do certain optimizations, in this case Constant folding. Such special functions are called Built-ins.
In the case where it must link to the math library (because you're calling it with a variable), you need to link it explicitly. Some operating systems/compilers do it for you, which is why you might not have noticed in the past.

Error: Cannot convert 'char(*)[a]' to 'char(*)[2]'. I don't understand why there's an error

In the code below, if I use a number instead of a, it works fine. But if I use a, it gives the following error:
Cannot convert 'char(*)[a]' to 'char(*)[2]' for argument '2' to 'void displayNumbers(int, char(*)[2])'
#include <stdio.h>
void displayNumbers(int,char num[2][2]);
int main()
{
int a=2;
char num[a][a];// if i write 2 instead of a, it works fine!
displayNumbers(a,num);
return 0;
}
void displayNumbers(int a,char num[2][2])
{
printf("%c\n",a+ num[1][1]);
}
Why does using a or 2 make a difference here? I have a feeling the reason may be trivial, but it would be really great if someone could help. The IDE I am using is Dev C++.
You have three primary problems.
First, the declaration of displayNumbers hard-codes num[2][2], defeating the purpose of using a Variable Length Array (the variable part of the name is at issue). While num[2][2] is valid, it is better written as num[a][a], or best as (*num)[a] (a pointer to an array of char [a]).
Next, num is completely uninitialized. What do you expect to print? Attempting to read an uninitialized value invokes Undefined Behavior.
Last, your format string in printf ("%c\n", a + num[1][1]); is suspect. The function name is displayNumbers, yet you are attempting to print a character. Further, if the value is below ' ' (space, i.e. 0x20, 32), nothing visible will print, because you would be in the non-printable range. If you want to print the number, use the "%d" format specifier.
Putting it all together, you could do something similar to:
#include <stdio.h>
void displayNumbers (int a, char (*num)[a]);
int main (void) {
int a=2;
char num[a][a];
for (int i = 0; i < a; i++)
for (int j = 0; j < a; j++)
num[i][j] = i + j;
displayNumbers (a, num);
return 0;
}
void displayNumbers (int a, char (*num)[a])
{
printf ("%d\n", a + num[1][1]);
}
Example Use/Output
$ ./bin/dispnum
4
Look things over and let me know if you have further questions.
Example Compile and Use on TDM-GCC 4.9.2
I have TDM-GCC 4.9.2 on a Win7 box. It works fine. Here is the version, compile and run on Windows 7:
C:\Users\david\Documents\dev\src-c\tmp>gcc --version
gcc (tdm-1) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
C:\Users\david\Documents\dev\src-c\tmp>gcc -Wall -Wextra -pedantic -Ofast -std=gnu11 -o bin\dispnum dispnum.c
C:\Users\david\Documents\dev\src-c\tmp>bin\dispnum
4
(I left the full directory and path information as an example that you can compile from anywhere as long as gcc.exe is in your path)

Erroneous Code + GCC 5.4 Optimisation Causes Infinite Loop

Disclaimer:
The code I'm sharing here is an isolated version of the real code, but it reproduces the same behaviour.
The following code, compiled using gcc 5.4.0 with optimisation enabled on Ubuntu 16.04, runs into an infinite loop when executed:
#include <stdio.h>
void *loop(char *filename){
int counter = 10;
int level = 0;
char *filenames[10];
filenames[0] = filename;
while (counter-- > 0) {
level++;
if (level > 10) {
break;
}
printf("Level %d - MAX_LEVELS %d\n", level, 10);
filenames[level] = filename;
}
return NULL;
}
int main(int argc, char *argv[]) {
loop(argv[0]);
}
The compiler versions:
gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
The compilation command used:
gcc infinite.c -O2 -o infinite
I know that it is caused by the optimisation flag "-O2" because it doesn't happen without it. I also know that adding volatile to the variable "level" fixes the error. But I can't add this keyword to all my variables.
My question is: why does this happen, and what can I do to avoid it in the future?
Is there any gcc flag that still optimises the code at a level similar to -O2 without this kind of problem?
You've found an example of undefined behaviour causing the optimizer to do something unexpected! GCC can see that if the loop runs 10 times then there is undefined behaviour.
This code will write filenames[1] through filenames[10], i.e. the 2nd through 11th elements of filenames. However, filenames is only 10 elements long, so the last write is out of bounds, which is undefined behaviour.
Because a valid program never has undefined behaviour, the compiler can assume that the loop will stop some other way before it gets to 10 (perhaps you have a modified version of printf that will call exit?).
And it sees that if the loop is going to stop anyway before it gets to 10, there is no point having code to make it only run 10 times. So it removes that code.
To prevent this optimization, you can use "-fno-aggressive-loop-optimizations".
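The flag only hides the symptom. A sketch of fixing the out-of-bounds write itself might look like the following. It assumes the intent was to stop once the array is full. The MAX_LEVELS macro (named after the text in the question's printf), the const qualifiers, and the int return value are additions for illustration and are not in the original code.

```c
#include <stdio.h>

#define MAX_LEVELS 10   /* array size; the macro itself is an assumption */

/* Same loop as in the question, but the bounds check uses >= so that
 * filenames[MAX_LEVELS] is never written. Returns the final level. */
int loop(const char *filename) {
    int counter = 10;
    int level = 0;
    const char *filenames[MAX_LEVELS];

    filenames[0] = filename;
    while (counter-- > 0) {
        level++;
        if (level >= MAX_LEVELS) {   /* valid indices are 0..MAX_LEVELS-1 */
            break;
        }
        printf("Level %d - MAX_LEVELS %d\n", level, MAX_LEVELS);
        filenames[level] = filename;
    }
    (void)filenames;   /* the array is otherwise unused in this reduced example */
    return level;
}
```

With the index kept inside the array there is no undefined behaviour left for the optimizer to reason from, so the loop terminates at -O2 as well.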

difference between time() and gettimeofday() and why does one cause seg fault

I'm trying to measure the amount of time for a system call, and I tried using time(0) and gettimeofday() in this program, but whenever I use gettimeofday() it seg faults. I suppose I can just use time(0) but I'd like to know why this is happening. And I know you guys can just look at it and see the problem. Please don't yell at me!
I want to get the time but not save it anywhere.
I've tried every combination of code I can think of but I pasted the simplest version here. I'm new to C and Linux. I look at the .stackdump file but it's pretty meaningless to me.
GetRDTSC is in util.h and it does rdtsc(), as one might expect. Now it's set to 10 iterations but later the loop will run 1000 times, without printf.
#include <stdio.h>
#include <time.h>
#include "util.h"
int main() {
int i;
uint64_t cycles[10];
for (i = 0; i < 10; ++i) {
// get initial cycles
uint64_t init = GetRDTSC();
gettimeofday(); // <== time(0) will work here without a seg fault.
// get cycles after
uint64_t after = GetRDTSC();
// save cycles for each operation in an array
cycles[i] = after - init;
printf("%i\n", (int)(cycles[i]));
}
}
The short version
gettimeofday() requires a pointer to a struct timeval to fill with time data.
So, for example, you'd do something like this:
#include <sys/time.h>
#include <stdio.h>
int main() {
struct timeval tv;
gettimeofday(&tv, NULL); // timezone should be NULL
printf("%ld seconds\n", (long)tv.tv_sec);
return 0;
}
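Since the goal in the question was to time a call, here is a rough sketch of measuring elapsed time with two gettimeofday() samples. The helper names elapsed_usec and time_one_call are hypothetical. Microsecond resolution is likely too coarse for a single call, so in practice one would time many iterations or keep using the cycle counter from the question.

```c
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical helper: microseconds elapsed from *start to *end. */
long long elapsed_usec(const struct timeval *start, const struct timeval *end) {
    long long sec  = (long long)end->tv_sec  - (long long)start->tv_sec;
    long long usec = (long long)end->tv_usec - (long long)start->tv_usec;
    return sec * 1000000LL + usec;
}

/* Hypothetical helper: time a single gettimeofday() call. */
long long time_one_call(void) {
    struct timeval before, probe, after;

    gettimeofday(&before, NULL);
    gettimeofday(&probe, NULL);    /* the call being measured */
    gettimeofday(&after, NULL);
    (void)probe;                   /* result not needed, only the cost */
    return elapsed_usec(&before, &after);
}
```

Note that gettimeofday() reports wall-clock time, which can jump if the system clock is adjusted, so the difference is not guaranteed to be non-negative.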
The long version
The real problem is that gcc is automatically including vdso on your system, which contains a symbol for the syscall gettimeofday. Consider this program (entire file):
int main() {
gettimeofday();
return 0;
}
By default, gcc will compile this without warning. If you check the symbols it's linked against, you'll see:
ternus#event-horizon ~> gcc -o foo foo.c
ternus#event-horizon ~> ldd foo
linux-vdso.so.1 => (0x00007ffff33fe000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f56a5255000)
/lib64/ld-linux-x86-64.so.2 (0x00007f56a562b000)
You just happen to be using a function that has a defined symbol, but without the prototype, the compiler has no way to tell how many arguments it's supposed to take.
If you compile it with -Wall, you'll see:
ternus#event-horizon ~> gcc -Wall -o foo foo.c
foo.c: In function ‘main’:
foo.c:2:3: warning: implicit declaration of function ‘gettimeofday’ [-Wimplicit-function-declaration]
Of course, it'll segfault when you try to run it. Interestingly, it'll segfault in kernel space (this is on MacOS):
cternus#astarael ~/foo> gcc -o foo -g foo.c
cternus#astarael ~/foo> gdb foo
GNU gdb 6.3.50-20050815 (Apple version gdb-1822) (Sun Aug 5 03:00:42 UTC 2012)
[etc]
(gdb) run
Starting program: /Users/cternus/foo/foo
Reading symbols for shared libraries +.............................. done
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000001
0x00007fff87eeab73 in __commpage_gettimeofday ()
Now consider this program (again, no header files):
typedef struct {
long tv_sec;
long tv_usec;
} timeval;
int main() {
timeval tv;
gettimeofday(&tv, 0);
return 0;
}
This will compile and run just fine -- no segfault. You've provided it with the memory location it expects, even though there's still no gettimeofday prototype provided.
More information:
Can anyone understand how gettimeofday works?
Is there a faster equivalent of gettimeofday?
The POSIX gettimeofday specification

