C Pthreads Problem, Can't Pass the Info I Want? - c

So I'm trying to make it so the thread's startup function opens a file that was given on the command line, one file for each thread, but I also need the startup function to get my results array. So basically I need to get a string (the filename) and a 2D array of results to my thread's startup function somehow, and I'm thoroughly confused.
Does anyone have any tips or ideas? Thanks.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void* func(void *args);
int main(int argc, const char * argv[])
{
int nthreads = 0;
int i = 0;
long **results;
printf("Enter number of threads to use:\n> ");
scanf("%d", &nthreads);
pthread_t threadArray[nthreads];
// results 2d array; 3 rows by nthreads cols
results = malloc((nthreads*4) * sizeof(long *));
for(i = 0; i<nthreads; i++) {
pthread_create(&threadArray[i], NULL, wordcount, HELP!!!!);
}
for(i = 0; i<nthreads; i++) {
pthread_join(threadArray[i], NULL);
}
pthread_exit(NULL);
}
void * func(void *arguments)
{
FILE *infile = stdin;
infile = fopen(filename, "rb");
fclose (infile);
}

Generally a structure that contains the data for the thread is declared and initialized, and a pointer to that structure is passed as the thread argument.
The thread function then casts the void* back to the structure pointer and has at the data.
Just remember that the structure must still be alive when the thread actually gets scheduled and reads it (which means you need to be very careful if it's a local variable). And as Jonathan Leffler pointed out, pass each thread its own instance of the structure, or be very careful how you reuse it. Otherwise a thread may read data intended for a different thread if the structure gets reused before the thread is finished with it.
Probably the simplest way to manage those issues is to malloc() a structure for each thread, initialize it, pass the pointer to the thread and let the thread free() it when it's done with the data.
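As a minimal sketch of the malloc()-per-thread approach described above: each thread gets its own heap-allocated argument block holding the filename and a pointer to its own result slot, and the thread frees the block when it is done. The names struct thread_args, count_bytes and count_all are hypothetical (they are not from the question), and counting bytes merely stands in for whatever per-file work the thread is supposed to do.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-thread argument block; the field names are illustrative. */
struct thread_args {
    const char *filename; /* file this thread should open */
    long *result;         /* result slot that only this thread writes to */
};

/* Thread start routine: counts the bytes in the file, stores the count
   (or -1 if the file cannot be opened), then frees its own argument block. */
static void *count_bytes(void *p)
{
    struct thread_args *args = p;
    long n = -1;
    FILE *f = fopen(args->filename, "rb");
    if (f != NULL) {
        n = 0;
        while (fgetc(f) != EOF)
            n++;
        fclose(f);
    }
    *args->result = n;
    free(args); /* the thread owns the block once it has been handed over */
    return NULL;
}

/* Starts one thread per file name and waits for all started threads.
   Returns 0 on success, -1 if an allocation or pthread_create failed. */
int count_all(int nfiles, const char *const names[], long results[])
{
    assert(names != NULL && results != NULL);
    if (nfiles <= 0)
        return 0;
    pthread_t *tids = malloc((size_t)nfiles * sizeof *tids);
    if (tids == NULL)
        return -1;
    int started = 0, rc = 0;
    for (int i = 0; i < nfiles; i++) {
        struct thread_args *args = malloc(sizeof *args);
        if (args == NULL) { rc = -1; break; }
        args->filename = names[i];
        args->result = &results[i];
        if (pthread_create(&tids[i], NULL, count_bytes, args) != 0) {
            free(args); /* the thread never started, so we still own it */
            rc = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    free(tids);
    return rc;
}
```

From main() this could be called as count_all(argc - 1, (const char *const *)(argv + 1), results), with results being an array of at least argc - 1 longs. Because every thread writes only to its own slot, the results array itself needs no locking in this sketch.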

The last parameter to pthread_create can be a pointer to any object you want, so for example you could have:
struct ThreadArguments {
    const char *filename;
    // additional parameters
};
void *ThreadFunction(void *arg) {
    if (arg == NULL)
        return NULL;
    struct ThreadArguments *thread_arg = arg;
    // now you can access the other parameters through this thread_arg object
    // ...
    return NULL;
}
// ...
struct ThreadArguments *arg = malloc(sizeof *arg);
// check arg for NULL and fill in its fields before starting the thread
ret = pthread_create(&thread_id, attributes, ThreadFunction, arg);
// make sure to check ret
// ...
pthread_join(thread_id, NULL);
free(arg);

Related

Thread safe, reentrant, async-signal safe putenv

I apologise in advance for what will be a bit of a code dump, I've trimmed as much unimportant code as possible:
// Global vars / mutex stuff
extern char **environ;
pthread_mutex_t env_mutex = PTHREAD_MUTEX_INITIALIZER;
int
putenv_r(char *string)
{
int len;
int key_len = 0;
int i;
sigset_t block;
sigset_t old;
sigfillset(&block);
pthread_sigmask(SIG_BLOCK, &block, &old);
// This function is thread-safe
len = strlen(string);
for (int i=0; i < len; i++) {
if (string[i] == '=') {
key_len = i; // Thanks Klas for pointing this out.
break;
}
}
// Need a string like key=value
if (key_len == 0) {
errno = EINVAL; // putenv doesn't normally return this err code
return -1;
}
// We're moving into environ territory so start locking stuff up.
pthread_mutex_lock(&env_mutex);
for (i = 0; environ[i] != NULL; i++) {
if (strncmp(string, environ[i], key_len) == 0) {
// Pointer assignment, so if string changes so does the env.
// This behaviour is POSIX conformant, instead of making a copy.
environ[i] = string;
pthread_mutex_unlock(&env_mutex);
return(0);
}
}
// If we get here, the env var didn't already exist, so we add it.
// Note that malloc isn't async-signal safe. This is why we block signals.
environ[i] = malloc(sizeof(char *));
environ[i] = string;
environ[i+1] = NULL;
// This ^ is possibly incorrect, do I need to grow environ some how?
pthread_mutex_unlock(&env_mutex);
pthread_sigmask(SIG_SETMASK, &old, NULL);
return(0);
}
As the title says, I'm trying to code a thread safe, async-signal safe reentrant version of putenv. The code works in that it sets the environment variable like putenv would, but I do have a few concerns:
My method for making it async-signal safe feels a bit ham-handed, just blocking all signals (except SIGKILL/SIGSTOP of course). Or is this the most appropriate way to go about it.
Is the location of my signal blocking too conservative? I know strlen isn't guaranteed to be async-signal safe, meaning that my signal blocking has to occur beforehand, but perhaps I'm mistaken.
I'm fairly sure that it is thread safe, considering that all the functions are thread-safe and that I lock interactions with environ, but I'd love to be proven otherwise.
I'm really not too sure about whether it's reentrant or not. While not guaranteed, I imagine that if I tick the other two boxes it'll most likely be reentrant?
I found another solution to this question here, in which they just set up the appropriate signal blocking and mutex locking (sick rhymes) and then call putenv normally. Is this valid? If so, it's obviously far simpler than my approach.
Sorry about the large block of code, I hope I've established a MCVE. I'm missing a bit of error checking in my code for brevity's sake. Thanks!
Here is the rest of the code, including a main, if you wish to test the code yourself:
#include <string.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
// Prototypes
static void thread_init(void);
int putenv_r(char *string);
int
main(int argc, char *argv[]) {
int ret = putenv_r("mykey=myval");
printf("%d: mykey = %s\n", ret, getenv("mykey"));
return 0;
}
This code is a problem:
// If we get here, the env var didn't already exist, so we add it.
// Note that malloc isn't async-signal safe. This is why we block signals.
environ[i] = malloc(sizeof(char *));
environ[i] = string;
It allocates room for a single char * on the heap, assigns the address of that allocation to environ[i], then immediately overwrites that value with the address contained in string, so the allocated memory is simply leaked. That's not going to work: nothing here grows the environ array itself, so there is no guaranteed room for the new entry plus the terminating NULL.
That matters because char **environ is a pointer to an array of pointers. The last pointer in the array is NULL - that's how code can tell it's reached the end of the list of environment variables.
Something like this should work better:
unsigned int envCount;
for ( envCount = 0; environ[ envCount ]; envCount++ )
{
/* empty loop */;
}
/* since environ[ envCount ] is NULL, the environ array
of pointers has envCount + 1 elements in it */
envCount++;
/* grow the environ array by one pointer */
char ** newEnviron = realloc( environ, ( envCount + 1 ) * sizeof( char * ) );
/* add the new envval */
newEnviron[ envCount - 1 ] = newEnvval;
/* NULL-terminate the array of pointers */
newEnviron[ envCount ] = NULL;
environ = newEnviron;
Note that there's no error checking, and it assumes the original environ array was obtained via a call to malloc() or similar. If that assumption is wrong, the behavior is undefined.
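One way to sidestep the "was environ obtained from malloc()?" assumption is to build a fresh array instead of calling realloc() on the existing one. The following is only a sketch under that idea, with hypothetical helper names (env_find, env_append); it works on any NULL-terminated array of strings, so it can be tried without touching the real environ. env_find also requires an '=' directly after the key, so that a key such as PATH does not match an entry such as PATHEXT=....

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the index of the entry whose key equals the first key_len bytes
   of `entry`, or -1 if there is none. The '=' check prevents a key from
   matching a longer key that merely starts with the same characters. */
int env_find(char *const env[], const char *entry, size_t key_len)
{
    assert(env != NULL && entry != NULL);
    for (int i = 0; env[i] != NULL; i++) {
        if (strncmp(env[i], entry, key_len) == 0 && env[i][key_len] == '=')
            return i;
    }
    return -1;
}

/* Builds a new malloc'd, NULL-terminated array containing every pointer
   from env followed by entry. The old array is neither modified nor freed,
   because we do not know how it was allocated. Returns NULL on failure. */
char **env_append(char *const env[], char *entry)
{
    assert(env != NULL && entry != NULL);
    size_t n = 0;
    while (env[n] != NULL)
        n++;
    char **grown = malloc((n + 2) * sizeof *grown); /* old + new + NULL */
    if (grown == NULL)
        return NULL;
    memcpy(grown, env, n * sizeof *grown);
    grown[n] = entry;
    grown[n + 1] = NULL;
    return grown;
}
```

In putenv_r, the result of env_append(environ, string) would be assigned to environ while env_mutex is held. The price of not freeing the previous array is a small leak each time a new variable is added, which may or may not be acceptable for a given program.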

How to use the atexit() function to free up memory? [duplicate]

I am developing a project in C, and I need to free the allocated memory and also close all the open files before it exits.
I decided to implement a clean function that will do all this stuff and call it with atexit because there are a lot of possible exit scenarios.
The problem is that atexit doesn't allow me to set functions with parameters, so I can't send to clean the pointers that need to be freed in the end of the process.
So do I need to declare as global variables every pointer that may need to be freed and every file that may remain open in the program? (I already did that, but it doesn't look good.) Or does a function similar to atexit exist that allows passing parameters? Or, more probably, is there another way that I am missing?
Using a static pointer inside a function:
#include <stdio.h>
#include <stdlib.h>
void atexit_clean(void *data);
static void clean(void)
{
atexit_clean(NULL);
}
void atexit_clean(void *data)
{
static void *x;
if (data) {
x = data;
atexit(clean);
} else {
free(x);
}
}
int main(void)
{
int *a = malloc(sizeof(int));
atexit_clean(a);
return 0;
}
Another method using a single global variable: you can store all objects to be freed in an array of pointers or a linked list, this example uses realloc (doesn't check (m/re)alloc for brevity):
#include <stdio.h>
#include <stdlib.h>
static void **vclean;
static size_t nclean;
void atexit_add(void *data)
{
vclean = realloc(vclean, sizeof(void *) * (nclean + 1));
vclean[nclean++] = data;
}
void clean(void)
{
size_t i;
for (i = 0; i < nclean; i++) {
free(vclean[i]);
}
free(vclean);
}
int main(void)
{
int *a, *b, *c;
double *d;
int e = 1;
atexit(clean);
a = &e;
b = malloc(sizeof(int));
atexit_add(b);
c = malloc(sizeof(int));
atexit_add(c);
d = malloc(sizeof(double));
atexit_add(d);
return 0;
}
There is no way to pass any parameters to atexit(), so you're stuck using global variables.
When your program terminates normally, through exit() or by returning from main(), it will automatically flush and close any open streams and (under most operating systems) free allocated memory. However, it is good practice to explicitly clean up your resources before the program terminates, because it typically leads to a more structured program. Sometimes the cleanest way to write your program is to just exit and leave the cleanup to the implementation.
But be warned that you should always check the return value of fclose(). See "What are the reasons to check for error on close()?" for an anecdote about what could happen when you don't.
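Since atexit() handlers take no arguments, one possible compromise is a single registry of (function, argument) pairs that one atexit() handler walks at exit. The sketch below illustrates that idea with hypothetical names (cleanup_push, cleanup_run_all, cleanup_pending, cleanup_fclose). It still relies on file-scope static variables, but only on one registry rather than on a global per resource, and it handles files as well as memory.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* One registered cleanup action: a function and the argument to give it. */
struct cleanup_entry {
    void (*fn)(void *);
    void *arg;
};

static struct cleanup_entry *entries; /* growable array of actions */
static size_t nentries;               /* number of pending actions */
static int registered;                /* has atexit() been called yet? */

/* Runs all pending actions in reverse order of registration and empties
   the registry, so calling it a second time is harmless. */
void cleanup_run_all(void)
{
    while (nentries > 0) {
        struct cleanup_entry e = entries[--nentries];
        e.fn(e.arg);
    }
    free(entries);
    entries = NULL;
}

/* Registers fn(arg) to run at exit. Returns 0 on success, -1 on failure. */
int cleanup_push(void (*fn)(void *), void *arg)
{
    assert(fn != NULL);
    if (!registered) {
        if (atexit(cleanup_run_all) != 0)
            return -1;
        registered = 1;
    }
    struct cleanup_entry *grown =
        realloc(entries, (nentries + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;
    entries = grown;
    entries[nentries].fn = fn;
    entries[nentries].arg = arg;
    nentries++;
    return 0;
}

/* Number of actions that have been registered but not yet run. */
size_t cleanup_pending(void)
{
    return nentries;
}

/* Adapter so that a FILE * can be registered; reports fclose() failures. */
void cleanup_fclose(void *stream)
{
    if (fclose(stream) != 0)
        perror("fclose");
}
```

Typical use would be cleanup_push(free, ptr) right after a successful malloc() and cleanup_push(cleanup_fclose, fp) right after a successful fopen(). A resource that is released early must not also stay registered, otherwise it would be released twice at exit; this sketch has no way to unregister an entry.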

Accessing the variable inside another code

Is there a way to access a variable initialized in one program from another program? For example, my code1.c is as follows:
#include <stdio.h>
#include <unistd.h> /* for sleep() */
int main()
{
int a=4;
sleep(99);
printf("%d\n", a);
return 0;
}
Now, is there any way that I can access the value of a from inside another C code (code2.c)? I am assuming, I have all the knowledge of the variable which I want to access, but I don't have any information about its address in the RAM. So, is there any way?
I know about the extern, what I am asking for here is a sort of backdoor. Like, kind of searching for the variable in the RAM based on some properties.
Your example has one caveat, setting aside possible optimizations that could make the variable disappear: the variable a only exists while the function is executing and has not yet returned.
Well, given that the function is main(), that shouldn't be a problem, at least for standard C programs. So if you have a program like this:
# include <stdio.h>
int main()
{
int a=4;
printf("%d\n", a);
return 0;
}
Chances are that this code will call some functions. If one of them needs to access a to read and write to it, just pass a pointer to a as an argument to the function.
#include <stdio.h>
void somefunction(int *n);
int main()
{
    int a = 4;
    somefunction(&a);
    printf("%d\n", a);
    return 0;
}
void somefunction(int *n)
{
    /* Whatever you do with *n you are actually
       doing it with a */
    (*n)++; /* actually increments a; a bare *n++ would advance the pointer instead */
}
But if the function that needs to access a is deep in the function call stack, all the parent functions need to pass the pointer to a even if they don't use it, adding clutter and lowering the readability of code.
The usual solution is to declare a as global, making it accessible to every function in your code. If that scenario is to be avoided, you can make a visible only to the functions that need to access it. To do that, put all the functions that need to use a in a single source file, and declare a there as a static global variable. Then only the functions written in that source file will know about a, and no pointer is needed. It doesn't matter how deeply nested the functions are in the call stack: intermediate functions won't need to pass along any additional information for a nested function to know about a.
So, you would have code1.c with main() and all the functions that need to access a
/* code1.c */
# include <stdio.h>
static int a;
void somefunction (void);
int main()
{
a=4;
somefunction();
printf("%d\n", a);
return 0;
}
void somefunction (void)
{
a++;
}
/* end of code1.c */
About trying to figure out where in RAM is a specific variable stored:
Kind of. You can walk the function stack frames from yours up to the main() stack frame, and inside those stack frames lie the local variables of each function, but there is no supplementary information in RAM about which variable is located at which position, and the compiler may choose to put it wherever it likes within the stack frame (or even in a register, so there would be no trace of it in RAM, except for pushes and pops to/from general registers, which would be even harder to follow).
So unless that variable has a non trivial value, it's the only local variable in its stack frame, compiler optimizations have been disabled, your code is aware of the architecture and calling conventions being used, and the variable is declared as volatile to stop being stored in a CPU register, I think there is no safe and/or portable way to find it out.
OTOH, if your program has been compiled with -g flag, you might be able to read debugging information from within your program and find out where in the stack frame the variable is, and crawl through it to find it.
code1.c:
#include <stdio.h>
void doSomething(); // so that we can use the function from code2.c
int a = 4; // global variable accessible in all functions defined after this point
int main()
{
printf("main says %d\n", a);
doSomething();
printf("main says %d\n", a);
return 0;
}
code2.c
#include <stdio.h>
extern int a; // gain access to variable from code1.c
void doSomething()
{
a = 3;
printf("doSomething says %d\n", a);
}
output:
main says 4
doSomething says 3
main says 3
You can use extern int a; in every file in which you must use a (code2.c in this case), except for the file in which it is defined without extern (code1.c in this case). For this approach to work, a must be defined at file scope (not inside a function).
One approach is to have the separate executable have the same stack layout as the program in question (since the variable is placed on the stack, and we need the relative address of the variable), therefore compile it with the same or similar compiler version and options, as much as possible.
On Linux, we can read the running code's data with ptrace(PTRACE_PEEKDATA, pid, …). Since on current Linux systems the start address of the stack varies, we have to account for that; fortunately, this address can be obtained from the 28th field of /proc/…/stat.
The following program (compiled with cc Debian 4.4.5-8 and no code generator option on Linux 2.6.32) works; the pid of the running program has to be specified as the program argument.
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
void *startstack(char *pid)
{ // The address of the start (i. e. bottom) of the stack.
char str[FILENAME_MAX];
FILE *fp = fopen(strcat(strcat(strcpy(str, "/proc/"), pid), "/stat"), "r");
if (!fp) perror(str), exit(1);
if (!fgets(str, sizeof str, fp)) exit(1);
fclose(fp);
unsigned long address;
int i = 28; char *s = str; while (--i) s += strcspn(s, " ") + 1;
sscanf(s, "%lu", &address);
return (void *)address;
}
static int access(void *a, char *pidstr)
{
if (!pidstr) return 1;
int pid = atoi(pidstr);
if (ptrace(PTRACE_ATTACH, pid, 0, 0) < 0) return perror("PTRACE_ATTACH"), 1;
int status;
// wait for program being signaled as stopped
if (wait(&status) < 0) return perror("wait"), 1;
// relocate variable address to stack of program in question
a = a-startstack("self")+startstack(pidstr);
int val;
if (errno = 0, val = ptrace(PTRACE_PEEKDATA, pid, a, 0), errno)
return perror("PTRACE_PEEKDATA"), 1;
printf("%d\n", val);
return 0;
}
int main(int argc, char *argv[])
{
int a;
return access(&a, argv[1]);
}
Another, more demanding approach would be, as mcleod_ideafix indicated at the end of his answer, to implement the bulk of a debugger and use the debug information (provided it is present) to locate the variable.

Accessing global variables in pthreads in different c-files

I have a main.c with a global variable, int countboards. In main() I start a pthread that listens on ONE TCP connection and handles it (progserver.c); that is, this thread never returns. Then main() calls the function rmmain(...), which lives in rm.c (RM = Resource Manager). rm.c reads countboards, and the pthread in progserver.c writes to it (both files get access through extern int countboards).
So the problem is: when I write to countboards in the pthread and then read the variable in rm.c after it has been written, it still has the old value (in this case 0 instead of, for example, 10). Why?
main.c:
int countboards;
int main(int argc, char** argv) {
countboards = 0;
pthread_t thread;
pthread_create(&thread, NULL, startProgramserver, NULL);
rmmain();
return 0;
}
rm.c:
extern int countboards;
int rmmain(vhbuser* vhbuserlist, int countvhbuser,
userio* useriolist, int countios, int usertorm, int rmtosslserver, int sslservertorm) {
while(1) {
int n;
n=read(usertorm,buf,bufc); // blocks until command comes from the user
...
board* b = findAFreeBoard(boardlist, countboards, usagelist); // here countboards should be >0, but it isn't
...
}
}
programserver.c:
extern int countboards;
void* startProgramserver(void*) {
...
sock = tcp_listen();
...
http_serve(ssl,s, sslpipes);
}
static int http_serve(SSL *ssl, int s, void* sslpipes) {
...
countboards = countboards + countboardscommands;
...
// here countboards has the new value
}
You're seeing a cached copy in each thread. I would suggest declaring it volatile int countboards, except that's really not a good way to go about things: volatile gives you neither atomicity nor ordering between threads.
Globals are kinda evil. You'd be better served by passing a pointer to each thread and synchronizing with a mutex.
Edit: To expand on this since I was in a hurry last night ...
http://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming/
As KasigiYabu mentions in the comments below, creating a "context" structure that contains all the information you want to share between the threads and passing that in to pthread_create as the last arg is a sound approach and is what I do as well in most cases.
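A minimal sketch of such a context structure, assuming hypothetical names (struct board_ctx, board_ctx_init, board_ctx_add, board_ctx_get, program_server) and a made-up increment of 10 that stands in for countboardscommands: the counter and the mutex that protects it travel together, and every read or write goes through a function that takes the lock.

```c
#include <assert.h>
#include <pthread.h>

/* Shared state plus the mutex that guards it. */
struct board_ctx {
    pthread_mutex_t lock;
    int countboards;
};

/* Initializes the context. Returns 0 on success, like pthread_mutex_init. */
int board_ctx_init(struct board_ctx *ctx)
{
    assert(ctx != NULL);
    ctx->countboards = 0;
    return pthread_mutex_init(&ctx->lock, NULL);
}

/* Adds n to the counter while holding the lock. */
void board_ctx_add(struct board_ctx *ctx, int n)
{
    pthread_mutex_lock(&ctx->lock);
    ctx->countboards += n;
    pthread_mutex_unlock(&ctx->lock);
}

/* Reads the counter while holding the lock. */
int board_ctx_get(struct board_ctx *ctx)
{
    pthread_mutex_lock(&ctx->lock);
    int n = ctx->countboards;
    pthread_mutex_unlock(&ctx->lock);
    return n;
}

/* Stand-in for the server thread: receives the context through the last
   argument of pthread_create and reports 10 new boards (illustrative). */
void *program_server(void *p)
{
    struct board_ctx *ctx = p;
    board_ctx_add(ctx, 10);
    return NULL;
}
```

The thread would be started with pthread_create(&thread, NULL, program_server, &ctx), and the resource manager would receive the same &ctx as a parameter and call board_ctx_get(&ctx) instead of reading a global. The context must outlive the thread; since the server thread in the question never returns, a context object that lives for the whole run of main() would do.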
