CreateProcess and lpCommandLine lifetime - c

I allocate memory for the lpCommandLine parameter of the CreateProcess function, either with malloc or on the stack.
Can I free/release that memory immediately after the call, or do I have to wait until the process finishes?

The buffer referred to by lpCommandLine needs to be valid only for the duration of the call to CreateProcess. Once CreateProcess returns, it will not refer to that buffer again.
Imagine if you did have to keep that buffer alive. Were that the case, all parent processes would have to outlive all of their children. That's clearly a ridiculous proposition, and I'm sure you know from experience that there is no such requirement.
There is a general principle here. By and large, API functions will not refer to their arguments after the function returns. If they do need to do so, then it will be explicitly called out in the documentation, or it will be blatantly obvious from the intent of the function. As an example of the latter I am thinking of passing a window procedure to RegisterClass. It is quite clear that the window procedure must remain valid for as long as there exists a window of that class.
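As an illustration of the lifetime rule for lpCommandLine, here is a minimal sketch. The helper names dup_command_line and launch_and_wait are hypothetical, and the Windows-specific part is guarded with _WIN32 so that the file still compiles elsewhere. The point is only that the buffer is freed as soon as CreateProcessA returns, while the child may still be running.

```c
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: returns a malloc'd, writable copy of cmd (caller frees). */
char *dup_command_line(const char *cmd)
{
    size_t n = strlen(cmd) + 1;
    char *buf = malloc(n);
    if (buf != NULL)
        memcpy(buf, cmd, n);
    return buf;
}

#ifdef _WIN32
/* Hypothetical helper: starts cmd, frees the buffer, then waits for the child.
   Returns 0 on success and -1 on failure. */
int launch_and_wait(const char *cmd)
{
    STARTUPINFOA si;
    PROCESS_INFORMATION pi;
    BOOL ok;
    char *buf = dup_command_line(cmd);

    if (buf == NULL)
        return -1;
    ZeroMemory(&si, sizeof si);
    si.cb = sizeof si;
    ZeroMemory(&pi, sizeof pi);

    ok = CreateProcessA(NULL, buf, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi);
    free(buf);   /* the call has returned, so the buffer is no longer needed */
    if (!ok)
        return -1;

    /* The child may still be running here, long after buf was freed. */
    WaitForSingleObject(pi.hProcess, INFINITE);
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    return 0;
}
#endif
```

On Windows, a call such as launch_and_wait("notepad.exe") would exercise the sketch; the command name is only an example.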

Related

Clarifying how GNU C Library defines nonreentrant functions

Taken from: https://www.gnu.org/software/libc/manual/html_node/Nonreentrancy.html
For example, suppose that the signal handler uses gethostbyname. This function returns its value in a static object, reusing the same object each time. If the signal happens to arrive during a call to gethostbyname, or even after one (while the program is still using the value), it will clobber the value that the program asked for.
I fail to see how the above scenario is non-reentrant. It seems to me that gethostbyname is a (read-only) getter function that merely reads from memory (as opposed to modifying memory). Why is gethostbyname non-reentrant?
As the word says, reentrancy is the ability of a function to be safely called again while an earlier call to it is still in progress, whether the new call comes from a signal handler that interrupted it or from another thread. The scenario you quote is exactly where reentrancy is exercised. gethostbyname(3) is not a read-only getter: it writes its result into a static object. While one call is writing that return structure, or after it has returned and the caller is still reading it, a second call can overwrite it and destroy the first result. When the interrupted call (not the interrupting one) gets control again, all its data has been overwritten by the interrupting call.
A common way to avoid this problem with interruptions is to disable interrupts (in a user-space program, block the signal) while the function is executing. That way it cannot be interrupted by a new call to itself.
If two threads call the same piece of code, and all the parameters and local variables are stored on the stack, each thread has its own copy of the data, so there is no problem in running both calls at the same time: the data they touch lives in different stacks. This does not hold for static variables, whether they have local scope, compilation-unit scope, or global scope. The problem arises when the same piece of code is called twice, so everything one call can reach, the other can reach too.
Static data such as buffers (look at the buffered stdio routines) generally means that the routines are not reentrant.
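To make the clobbering concrete, here is a small sketch that does not use gethostbyname itself. The functions label_static, label_r and simulated_handler are hypothetical stand-ins: the first returns a pointer into a static buffer, the second writes into caller-supplied storage, and the third plays the role of a signal handler that calls the static version while the caller is still using its result.

```c
#include <stdio.h>
#include <string.h>

/* Non-reentrant: every call writes into, and returns, the same static buffer. */
const char *label_static(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "host-%d", id);
    return buf;
}

/* Reentrant variant: the caller owns the storage, so calls cannot collide. */
const char *label_r(int id, char *buf, size_t len)
{
    snprintf(buf, len, "host-%d", id);
    return buf;
}

/* Stands in for a signal handler that runs while the caller still holds
   the pointer returned by an earlier label_static() call. */
void simulated_handler(void)
{
    (void)label_static(2);
}
```

If a caller keeps the pointer from label_static(1) and simulated_handler() then runs, the caller sees "host-2" through that pointer. With label_r and two separate buffers, both results survive.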

When are thread function local variables allocated with Posix?

I know it's a very specific question and not very interesting for a high-level programmer, but I would like to know exactly when the local variables of a thread function are allocated. In other words, after
pthread_create(&thread, &function, ...)
is executed, can I say that they exist in memory or not (considering that the scheduler may not have executed the thread yet)?
I tried to search the POSIX library code, but it's not easy to understand. I arrived at the clone function, written in assembly, but then I could not find the code of the system call service routine sys_clone to understand what exactly it does. I see the invocation of the thread function in the clone code, but I think this should happen only in the created thread (which may never have been run by the scheduler by the time pthread_create returns) and not in the creator.
in other words after
pthread_create(&thread, &function, ...)
is executed, can I say that they exist in memory or not (considering that the scheduler may not have executed the thread yet)?
POSIX does not give you any reason for confidence that the local variables of the initial call to function function() in the created thread will have been allocated by the time pthread_create() returns. They might or might not have been, and indeed, the answer might not even be well defined inasmuch as different threads do not necessarily have a consistent view of machine state.
There is no special significance to the local variables of a thread's start function relative to the local variables of any other function called in that thread. Moreover, although pthread_create() will not return successfully until the new thread has been created, that's a separate question from whether the start function has even been entered, much less whether its local variables have been allocated.
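If the creating thread really needs to know that the start function has been entered, it has to synchronize explicitly. The following is a minimal sketch of one way to do that with a mutex and a condition variable; struct handshake, start_fn and create_and_wait_for_entry are hypothetical names, not part of any library.

```c
#include <pthread.h>

/* Hypothetical handshake state shared by the creator and the new thread. */
struct handshake {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             entered;   /* nonzero once start_fn is running */
};

void *start_fn(void *arg)
{
    struct handshake *h = arg;
    int local = 42;   /* exists only once this function has been entered */

    pthread_mutex_lock(&h->lock);
    h->entered = local;
    pthread_cond_signal(&h->cond);
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

/* Returns the value published by the new thread, or -1 on failure. */
int create_and_wait_for_entry(void)
{
    struct handshake h;
    pthread_t t;
    int seen;

    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.cond, NULL);
    h.entered = 0;

    if (pthread_create(&t, NULL, start_fn, &h) != 0)
        return -1;

    /* The thread exists now, but start_fn may not have run yet,
       so wait until it says so. */
    pthread_mutex_lock(&h.lock);
    while (!h.entered)
        pthread_cond_wait(&h.cond, &h.lock);
    seen = h.entered;
    pthread_mutex_unlock(&h.lock);

    pthread_join(t, NULL);   /* h must outlive the thread, so join before returning */
    pthread_cond_destroy(&h.cond);
    pthread_mutex_destroy(&h.lock);
    return seen;
}
```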

What happens to the parameters to execv?

I was always a bit hazy on this little bit of C magic. When you call execv, you're "replacing the process image." What exactly does that mean? Just the DATA segment? Everything allocated to the process? The stack? The heap?
My question is about what happens to the storage used by the parameters that you pass to execv? If they were local variables to the function that called execv, then they're on the stack. But if you replace the process image, and call the new process's main() function, bad things would happen when main() returned, because the stack information that points to the return location from the main call was replaced by the new process image.
Same thing for variables, yes? And what if those variables were allocated on the heap?
Inquiring minds are inquiring to anybody who knows.
The exec family of functions replaces the process image wholesale: data, stack, text, heap, everything. Some file descriptors can stay open (those opened by the original process without FD_CLOEXEC set). But apart from that, you pretty much get a whole new process - see the link for all the details.
What happens to the parameters you passed in is the OS's problem - it has to make sure they're passed to the new process's main function in a way that complies with the standard, but I don't think POSIX dictates exactly how it does that.
For Linux, you can look at the fs/exec.c file to see the implementation. Jump near the end (line 1484 as I post this) to look at the do_execveat_common function which is the main part of the implementation. You'll see the arguments are copied into the new address space (calls to copy_strings near the end of the function).
Just the DATA segment?
No, all memory mappings are erased and re-created for the new executable.
Everything allocated to the process? The stack? The heap?
Yes, all memory. Some kernel resources, documented here, are preserved across the exec though, such as file descriptors. These resources are managed by the kernel and are not part of the process memory. All of this is quite operating-system specific; the OS can accomplish it through various means as long as it complies with the mentioned exec() documentation.
what happens to the storage used by the parameters that you pass to execv?
Typically the kernel makes a copy of those arguments, and injects them into the memory of the new executable.
But if you replace the process image, and call the new process's main() function, bad things would happen when main() returned,
No, when main() returns, that process ends. The code and memory of the original process that called exec() don't exist any more; there's nothing to return to.
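A short sketch of the point about argument storage, assuming a POSIX system with a shell at /bin/sh. The function run_child_with_exit_code is a hypothetical helper: the child builds its argv entirely from stack variables, calls execv, and the new program still receives the strings because the kernel copies them before the old image is discarded.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that execs "/bin/sh -c 'exit <code>'" with an argv built
   on the stack. Returns the child's exit status, or -1 on failure. */
int run_child_with_exit_code(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        char script[32];   /* stack storage; gone once execv succeeds */
        char *argv[4];

        snprintf(script, sizeof script, "exit %d", code);
        argv[0] = "sh";
        argv[1] = "-c";
        argv[2] = script;
        argv[3] = NULL;

        execv("/bin/sh", argv);
        _exit(127);        /* reached only if execv failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The exit status that comes back is the one encoded in the stack-allocated string, which shows that the arguments reached the new program even though the stack they lived on was replaced.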

Is memory that is allocated in a thread still alive after the thread terminates?

I am experiencing some memory allocation problems and am trying to find possible reasons for them.
There are many possible reasons, and many hours would have to be spent checking each of them.
One of the possible reasons is that a memory buffer is allocated within a thread and is then used after the thread terminates.
So, if there is a chance that thread termination causes memory deallocation, then many hours of debugging may be avoided.
Thank you very much in advance.
I don't think it does, although it of course might depend on your particular details.
Generally, memory allocation from the operating system's point of view is a per-process activity, while threads exist inside the process. So if one thread allocates memory and then dies, the operating system doesn't clean that up since the process is still alive. Memory is shared inside the process, so the OS can't know that the memory no longer is used and can be cleaned up.
No, threads that 'die' do not deallocate any memory.
When a thread ends, the thread itself vanishes from memory, like a function does once it's done executing. It will take all the 'stack' objects with it, but all the memory you allocated yourself (i.e. malloc) will still be there.
As such, before you end your thread, you should make sure that all dynamic memory that was used by the thread and is not needed any more is freed properly.
Anything on the thread's stack (a local variable, for example) becomes invalid when the thread ends. However, if the data is in the heap, then the memory is still valid as long as the process is running. Of course, you'll need to save the pointer to that heap allocation somewhere outside that thread.
Memory allocated by a thread behaves like memory allocated by a method call:
variables on the stack will be deallocated when the method returns (thread terminates)
variables on the heap will continue to be allocated unless explicitly deallocated.
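The stack-versus-heap distinction above can be sketched as follows. The names worker_make_message and message_from_finished_thread are hypothetical; the worker returns a malloc'd buffer through pthread_join, and the buffer is still valid after the worker has terminated.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* The worker allocates on the heap and hands the pointer back as its
   return value. Returning the address of a local variable instead would
   leave the caller with a dangling pointer. */
void *worker_make_message(void *arg)
{
    char *heap = malloc(32);

    (void)arg;
    if (heap != NULL)
        strcpy(heap, "from worker");
    return heap;
}

/* Returns the worker's buffer after the worker has terminated.
   The caller must free() it. Returns NULL on failure. */
char *message_from_finished_thread(void)
{
    pthread_t t;
    void *result = NULL;

    if (pthread_create(&t, NULL, worker_make_message, NULL) != 0)
        return NULL;
    if (pthread_join(t, &result) != 0)
        return NULL;
    return result;   /* the thread is gone, the allocation is not */
}
```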
In addition to the other answers, I'd like to note that pthreads has thread-specific data (TLS) keys, which are registered with pthread_key_create; it takes a pointer to the key and a destructor function. On pthread_exit, a static pthread_key_clean_all() is called that iterates through the keys and invokes the assigned destructors, which may deallocate memory (by application design).
So, to investigate, search your code for all pthread_key_create invocations, check whether a destructor is assigned, and then put breakpoints on all of them to see what is destroyed and in which order.
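Here is a minimal sketch of such a destructor freeing memory when a thread ends. The names buffer_key, free_buffer, tls_worker and run_tls_demo are hypothetical, and the counter exists only so the effect can be observed after the join.

```c
#include <pthread.h>
#include <stdlib.h>

pthread_key_t buffer_key;
int destructor_calls;   /* read only after pthread_join() */

/* Destructor registered with the key: runs in the exiting thread for
   every non-NULL value that thread stored under buffer_key. */
void free_buffer(void *p)
{
    free(p);
    destructor_calls++;
}

void *tls_worker(void *arg)
{
    void *buf = malloc(64);

    (void)arg;
    if (buf != NULL && pthread_setspecific(buffer_key, buf) != 0)
        free(buf);
    return NULL;   /* no explicit free: the destructor releases buf */
}

/* Returns how many times the destructor ran, or -1 on failure. */
int run_tls_demo(void)
{
    pthread_t t;

    destructor_calls = 0;
    if (pthread_key_create(&buffer_key, free_buffer) != 0)
        return -1;
    if (pthread_create(&t, NULL, tls_worker, NULL) != 0) {
        pthread_key_delete(buffer_key);
        return -1;
    }
    pthread_join(t, NULL);
    pthread_key_delete(buffer_key);
    return destructor_calls;
}
```

A breakpoint on free_buffer is the kind of breakpoint suggested above: it shows which thread's exit released which buffer.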

pthread Linux data runtime

I have threads in my application that wait on a condition variable. When the condition is met, a thread starts to work and reads some data. My data is a global variable. Is it possible to pass data at runtime without using global data? I read something about thread-specific data, but I don't know if it is useful in this case. Thank you!
Yes, you can pass it to your thread routine: pthread_create(thread, attr, function, *USER_ARG*). Simply create a struct holding the data the thread needs in order to execute.
Where *USER_ARG* is stored in memory is important. You will often want to use the free store (malloc it) for the argument; otherwise, if it lives on the stack of the thread that called pthread_create, that stack frame may be gone or reused by the time the new thread reads it.
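A minimal sketch of that pattern, with hypothetical names (struct job, job_worker, start_job): the argument struct is allocated with malloc by the creator and freed by the worker, so it stays valid even after the function that called pthread_create has returned.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical argument struct: everything the worker needs. */
struct job {
    int  input;
    int *result;   /* where to store the answer; must outlive the worker */
};

void *job_worker(void *arg)
{
    struct job *j = arg;

    *j->result = j->input * 2;
    free(j);       /* the worker owns the argument and releases it */
    return NULL;
}

/* Starts a worker for the given input. Returns 0 on success, -1 on failure.
   The caller must pthread_join(*t) before reading *result. */
int start_job(pthread_t *t, int input, int *result)
{
    struct job *j = malloc(sizeof *j);

    if (j == NULL)
        return -1;
    j->input = input;
    j->result = result;

    if (pthread_create(t, NULL, job_worker, j) != 0) {
        free(j);
        return -1;
    }
    return 0;      /* this frame goes away, but *j is on the heap */
}
```

Had struct job been a local variable of start_job, the worker could have read it after start_job returned, which is the problem described above.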

Resources