I wanted to know whether it is possible to return a pointer (an address) from the main function in C. Here is a very short example:
int main(){
    int i = 0;
    return &i; // won't work: the type differs, and i will be deallocated
}
So is there any way to do this?
And second: I want to do this in order to return a heap object from one program to another.
Is it possible to keep the heap object alive if the called program terminates on the main() but continues running on a second thread which was started from main?
Thanks in advance!
In short, the answer is no.
The return value from main() is typically for error codes. Data to be kept persistent after the end of the process should be communicated in some other way. When you return from main(), your process (and all its threads) has ended. The memory space allocated by the operating system for the process has also been freed. This includes your heap memory. In short, once the process is over, there is no heap, and your object in the heap is gone.
You seem to have discovered the difficulty of inter-process communication. There are many possible techniques to allow one process to communicate with another. Some of them are
storing data to a file
pipes
message passing interface (MPI)
shared memory
Different techniques suit different situations, which is why there are multiple options.
Is it possible to keep the heap object alive if the called program terminates on the main() but continues running on a second thread which was started from main?
No, it is not. When you return from main, it is equivalent to calling exit. Hence, the program terminates. No threads can be alive after that.
From the C99 Standard:
5.1.2.2.3 Program termination
1 If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument;
You cannot return it in a strictly-conforming program nor on most implementations.
From the C11 standard draft N1570, §5.1.2.2.1:
The function called at program startup is named main. [...] It shall
be defined with a return type of int and with no parameters:
int main(void) { /* ... */ }
or with two parameters [...]:
int main(int argc, char *argv[]) { /* ... */ }
[...] or in some other implementation-defined manner.
Mangling the pointer into an int would be possible but that would be implementation-defined, possibly undefined, and... nasty.
Is it possible to keep the heap object alive if the called program
terminates on the main() but continues running on a second thread
which was started from main?
A program runs as a process, and main runs in one of its threads. Heap memory is shared between all threads of a process, so yes, sharing heap memory between threads is entirely possible.
However, main returning is equivalent to the program terminating, so you need to find another way to share it between threads. An example would be to set a pointer pointing to heap memory in one thread (maybe make that thread yield), and use that pointer in the other threads.
i want to do this in order to return a heap object from one program to
anotherone..
Which is it: between processes or between threads? Heap memory is shared between threads but not between multiple processes. Sharing between multiple processes is done using shared memory (mmap, shm_open, shmget, etc.).
Two problems:
When main returns, i ceases to exist; any pointer value you returned would be invalid.
I am not aware of any runtime environment that expects to receive a pointer value from any executed program; main returns an int because the runtime environment expects an int.
I am trying to implement a solution based partly on the discussions below. Basically, I want to malloc() in the main thread and free() in a secondary thread. The discussions linked deal with LabWindows, but I think my question is more aimed at C programmers in general.
LabWindows: implementing thread safe queues that can handle string
elements
How to use a thread safe queue to store strings
I create a pointer to char in the main thread and allocate storage using malloc(). I copy some data into the storage, and assign the pointer to an array element (CmtWriteTSQData expects an array). This array gets passed to a thread safe queue.
In a secondary thread, the thread safe queue is read. The data is assigned to a new array.
How do I free the memory allocated within the secondary thread, as the pointer variable is no longer in scope?
Can I just call free() on the array element? Or do I need to create another pointer to char in the secondary thread, copy the array element to it, and then call free() on the pointer?
There doesn't appear to be a return value with free(), so I can't figure out how to ensure the call succeeds.
// Main thread
char *ptr = NULL;
char *array1[1] = {0};
ptr = (char *) malloc (3 * sizeof (char));
strcpy (ptr, "hi");
array1[0] = ptr;
CmtWriteTSQData (queue, array1, 1, 0, NULL);
// Secondary thread
char *array2[1] = {0};
CmtReadTSQData (queue, array2, 1, 0, 0);
printf ("%s", array2[0]); // Prints "hi"
free (array2[0]); // Does this work?
The short answer is yes.
When you call malloc(), you are asking the allocator (which in turn obtains memory from the operating system) for a chunk of usable memory of at least the requested size, and it responds by passing you a pointer, which holds the virtual address of that chunk of memory.
When a program has multiple threads, this means that several lightweight processes are sharing the same virtual address space, which means if there are several copies of some valid pointer among several threads of the same process, then they must all point to the same memory location.
The operating system does not care which thread asked for the memory, and it does not care which thread gives it back. When malloc() returns a pointer in a process, all of the threads in that process may use it, and when one of the threads free()s that pointer, that memory location becomes invalid for all threads in the process.
I don't know how the functions CmtWriteTSQData() and CmtReadTSQData() behave, but as long as printf("%p\n",ptr) in main and printf("%p\n",array2[0]) in the secondary thread produce the same hex value, your code is good.
Hope that helps!
I'm writing a very simple code in which i need to use some threads.
When I create first type of thread i pass argument with pthread_create:
fman thread_arg;
thread_arg.sd=sda;
char* split = strtok(buffer, "|");
thread_arg.wcount=atoi(split);
split = strtok(NULL,"");
strcpy(thread_arg.id, split);
pthread_create(&thread_temp, NULL, registerF, &thread_arg);
And everything works fine, but in function registerF I need to do something like this:
wman thread_arg;
thread_arg.sd=foremans_fd[ix];
thread_arg.fmanix=ix;
strcpy(thread_arg.id,tmpr);
pthread_create(&thread_temp, NULL, registerW, &thread_arg);
Those arguments are structures defined by me:
typedef struct fman
{
int sd;
char id[100];
int wcount;
} fman;
typedef struct wman
{
int sd;
int fmanix;
char id[100];
} wman;
And when I check it with printf("%x", args), I get the same address, but the values inside are different. Where is my mistake?
One likely problem is here:
fman thread_arg;
[...]
pthread_create(&thread_temp, NULL, registerF, &thread_arg);
Note that the thread_arg object is located on the stack, and thus will be destroyed (and likely overwritten by other stack variables) when the function it is declared in returns.
pthread_create(), on the other hand, launches a thread that will run asynchronously with this function, which means that the thread can (and often will) run after the function you excerpted has returned, which means that by the time the thread dereferences &thread_arg, thread_arg has likely already been destroyed and that pointer is now pointing to some other data that was written into the same stack location later on.
Unless you are doing something special to make sure that the struct's lifetime is long enough to include all of the spawned thread's accesses to the struct, then the fact that this code ever works is pure luck (i.e. the scheduler just happened to schedule the thread to run and perform all of its accesses to the struct before the struct was destroyed/overwritten). You definitely can't depend on that.
In order to fix the problem, you need to either allocate the struct on the heap (so that it won't be destroyed when the function returns -- the spawned thread can then free the struct when it is done using it), or use some kind of synchronization mechanism (e.g. a condition variable) to cause the main thread to block inside your function until the spawned thread has indicated that it is done accessing the struct.
The argument struct is treated inside the running thread as a block of memory and accessed using offsets. Since your fman and wman structs have different layouts (4+100+4 vs. 4+4+100 bytes), it's likely that you're getting the right struct but reading from different memory locations, given that the struct passed to this thread is an fman and it is being accessed as a wman.
Try giving both structs the same member order (int, int, char[100]) and it should work.
I'm trying to understand the details of the TCB (thread control block) and the differences between per-thread state and shared state. My book has its own implementation of pthreads, so it gives an example with this mini C program (I've not typed the whole thing out):
#include <stdio.h>
#include "thread.h"
#define NTHREADS 10
static void go(int n);
static thread_t threads[NTHREADS];
int main(int argc, char **argv) {
int i;
long exitValue;
for (i = 0; i < NTHREADS; i++) {
thread_create(&threads[i], &go, i);
}
for (i = 0; i < NTHREADS; i++) {
exitValue = thread_join(threads[i]);
}
printf("Main thread done.\n");
return 0;
}
void go(int n) {
printf("Hello from thread %d\n", n);
thread_exit(100 + n);
}
What would the variables i and exitValue (in the main() function) be examples of? They're not shared state since they're not global variables, but I'm not sure if they're per-thread state either. The i is used as the parameter for the go function when each thread is being created, so I'm a bit confused about it. The exitValue's scope is limited only to main() so that seems like it would just be stored on the process' stack. The int n as the parameter for the void go() would be a per-thread variable because its value is independent for each thread. I don't think I fully understand these concepts so any help would be appreciated! Thanks!
Short Answer
All of the variables in your example program are automatic variables. Each time one of them comes into scope, storage for it is allocated, and when it leaves its scope it is no longer valid. This concept is independent of whether the variable is shared or not.
Longer Answer
The scope of a variable determines where in the program it can be accessed; for automatic variables it also bounds their lifetime. In your program the variables i and exitValue are scoped to the main function. Typically a compiler will allocate space on the stack which is used to store the values for these variables.
The variable n in function go is a parameter to the function, and so it also acts as a local variable in go. Each time go is executed, the compiler will allocate space in the stack frame for the variable n (although the compiler may be able to perform optimization to keep the local variables in registers rather than actually allocating stack space). However, as a parameter, n will be initialized with whatever value it was called with (its actual parameter).
To make this more concrete, here is what the values of the variables in the program would be after the first loop has completed 2 iterations (assuming that the spawned threads haven't finished executing).
Main thread: i = 2, exitValue = indeterminate (not yet assigned)
Thread 0: n = 0
Thread 1: n = 1
The thing to note is that there are multiple independent copies of the variable n. Each n gets a copy of the value of i when thread_create is executed, but the values of i and n are independent after that.
Finally I'm not certain what is supposed to happen with the statement exitValue = thread_join(threads[i]); since this is a variation of pthreads. But what probably happens is that it makes the value available when another thread calls thread_join. So in that way you do get some data sharing between threads, but the sharing is synchronized by the thread_join command.
They're objects with automatic storage, casually known as "local variables" although the latter is ambiguous since C and C++ both allow objects with local scope but that only have one global instance via the static keyword.
What is the advantage of the static keyword in block scope vs. using malloc?
For example:
Function A:
void f(void) {
    static int x = 7;
}
Function B:
void f(void) {
    int *x = malloc(sizeof(int));
    if (x != NULL)
        *x = 7;
}
If I am understanding this correctly, both programs create an integer 7 that is stored on the heap. In A, the variable is created at the very beginning in some permanent storage, before the main method executes. In B, you are allocating the memory on the spot once the function is called and then storing a 7 where that pointer points. In what type of situations might you use one method over the other? I know that you cannot free the x in function A, so wouldn't that make B generally more preferable?
Both programs create an integer 7 that is stored on the heap
No, they don't.
static creates an object with static storage duration, which remains alive throughout the lifetime of the program, whereas a dynamically allocated object (created by malloc) remains in memory until it is explicitly released with free. The two provide distinct functionality: a static object maintains its state across function calls, while each call to malloc creates a new object.
In what type of situations might you use one method over the other?
You use static when you want the object to be alive throughout the lifetime of the program and to maintain its state across function calls. If you are working in a multithreaded environment, the same static object is shared by all threads and hence needs synchronization.
You use malloc when you explicitly want to control the lifetime of the object, e.g. to make sure the object lives long enough for the caller to access it after the function returns. (An automatic/local object is deallocated once the scope { } of the function ends.) Unless the caller explicitly calls free, the allocated memory is leaked until the OS reclaims it at program exit.
In Function A, you're allocating x with static storage duration, which generally means it is not on (what most people recognize as) the heap. Rather, it's just memory that's guaranteed to exist the entire time your program is running.
In Function B, you're allocating the storage every time you enter the function, and then (unless there's a free you haven't shown) leaking that memory.
Given only those two choices, Function A is clearly preferable. It has shortcomings (especially in the face of multi-threading) but at least there are some circumstances under which it's correct. Function B (as it stands) is just plain wrong.
Forget stack v. heap. That is not the most important thing that is going on here.
Sometimes static modifies scope and sometimes it modifies lifetime. Prototypical example:
int foo(void) {
static int count = 0;
return count++;
}
Try calling this repeatedly, perhaps from several different functions or files even, and you'll see that count keeps increasing, because in this case static gives the variable a lifetime equal to that of the entire execution of the program.
Read http://www.teigfam.net/oyvind/pub/notes/09_ANSI_C_static_vs_malloc.html
The static variable is created before main() runs, so no memory needs to be allocated for it while the program is running.
If I am understanding this correctly, both programs create an integer 7 that is stored on the heap
No, static variables are placed in the data or BSS segment, and they live for the entire lifetime of the program. Memory obtained with malloc() is allocated on the heap and must be explicitly released with a call to free().
In what type of situations might you use one method over the other?
Well, you use the first method when you want access to the same variable across multiple invocations of the same function; i.e., in your example, x will be initialized only once, and when you call the function a second time, the same x variable is used.
The second method can be used when you don't want to share the variable across invocations of the function, so that when the function is called a second time, x is malloc'ed again.
You must free x every time.
You can see the difference by calling f() 2 times, for each kind of f()
...
f();
f();
...
void f(void){
    static int x = 7;
    printf("x is : %d\n", x++);
}
void f(void){
    int *x = malloc(sizeof(int));
    if (x != NULL) {
        *x = 7;
        printf("x is : %d\n", (*x)++); // only dereference after the NULL check
    }
    free(x); // never forget this
}
the results will be different
First things first: static is a storage-class specifier, while malloc() is a library function that obtains memory for the heap from the operating system (on Linux, typically through the brk() or mmap() system calls).
If I am understanding this correctly, both programs create an integer
7 that is stored on the heap ?
No. Static variables are stored in the data section of the memory allocated to the program. A static variable can still be accessed outside its scope (through a pointer, for example), which shows that the contents of the data segment have a lifetime independent of scope.
In what type of situations might you use one method over the other?
If you want more control over your memory within a given scope, use malloc()/free(); otherwise the simpler (and cleaner) way is to use static.
In terms of performance, declaring a variable static is much faster than allocating it on the heap, since heap-management algorithms are complex and the time needed to service a heap request varies with the algorithm used.
One more reason I can think of for suggesting static is that static variables are initialized to zero by default, so that is one less thing to worry about.
Consider the example below to understand how static works. Generally we use the static keyword to define the scope of a variable or function; e.g. a variable defined as static inside a function is restricted to that function and retains its value between calls.
But as the sample program below shows, if you pass the address of the static variable to another function, that function can still update the same variable.
Strictly speaking, a static variable dies only when the program terminates, at which point its memory is freed.
#include <stdio.h>
void f2(int *j)
{
(*j)++;
printf("%d\n", *j);
}
void f1()
{
static int i = 10;
printf("%d\n", i);
f2(&i);
printf("%d\n", i);
}
int main()
{
f1();
return 0;
}
With malloc(), on the other hand, the memory stays allocated until the programmer releases it with free(); nothing frees it automatically while the program is running.
This gives you control over the variable's lifespan, but beware: you have to be very precise about allocating and freeing the memory when you choose dynamic memory allocation.
If you forget to free memory in a long-running program, the leak accumulates, the process keeps growing, and the whole system can slow down as memory runs short. On mainstream operating systems the memory is returned to the system when the process terminates, so a reboot is not needed; that may not hold on small embedded systems without process isolation.
I have a question about pthreads: when I create a variable inside a thread with malloc and then pass its pointer to a shared structure (e.g. a FIFO), will the pointer passed by thread-1 be accessible to thread-2?
Please note that I have no code for the question above; I'm just trying to understand threading better, and the snippet below is just what I'm thinking about. The environment is pthreads, C and Linux.
As far as I know, threads share the memory of their parent process. If that's the case, the code below should be correct.
void *thread_1(void *pointer)
{
    int *intp = malloc(sizeof *intp);
    send_to_fifo(intp);
    return NULL;
}
void *thread_2(void *pointer)
{
    int *iptr;
    iptr = read_from_fifo();
    do_something(iptr);
    free(iptr);
    return NULL;
}
Will the pointer passed by thread-1 be accessible to thread-2?
Yes: since all threads operate in a common memory space, this is allowed.
malloc, free, and other memory management functions are thread-safe by default, unless compiled with NO_THREADS.
Of course you can do this. However, you must be careful not to write to a variable while it is being used by another thread; you need synchronization.
In your case, you have a race condition if the threads run concurrently (thread2 not waiting for thread1 to finish): thread2 may run all of its code either before or after thread1 puts anything into the FIFO, so read_from_fifo() has to wait until data is available.