How is thread stack created in C?

How is thread stack created in C? - c

Let's say we have the following program:
int main() {
pthread_t tid;
Pthread_create(&tid, NULL, thread, NULL);
Pthread_join(tid, NULL);
... //do some other work
exit(0);
}
void *thread(void *vargp) {
...//do sth
return NULL;
}
Below is a picture that shows the main thread stack:
My question is, after a new thread is created, how does the new thread's own stack look like? does the beginning of the new stack start right after the main thread as:
or the new thread's stack's beginning address can be any random address, therefore leaving "splinters" as:
I know due to virtual address, the virual pages can be anywhere in the physical disk, but I just want to know if the virtual address itself is continuous or not.

This depends on the operating system.
For security reasons, the layout of the virtual address space is randomized in most modern operating systems. This is called Address Space Layout Randomization (ASLR).
Therefore, it is unlikely that the virtual memory reserved for the thread's main stack will be directly adjacent to that of another thread. Even without ASLR, there will probably be at least one guard page (probably more) between the two stacks to detect and protect against a stack overflow.

Related

Cygwin/Cygserver Shared memory

I was trying to migrate some shared memory code from CENTOS(3.5) to CYGWIN(2.8.1, win10).
the shared memory generally work like this:
Spawn a shared memory at a process by shmget.
Map the shared memory on this process by the shmat and record the location, then fill some information into the memory.
Map the shared memory on another process by the "shmat", pass the location of last process recorded, because we expect that both processes will mapping the shared memory at the same address.
Here are some code to explain:
// one process
size_t size = 1024 * 1024;//1M
int id = shmget(IPC_PRIVATE, size, 0660);
char *madr = 0;
char *location = shmat(id, madr, 0);
// another process
char *location1 = shmat(id, location , 0);
// !!!we hope location1 and location should be the same!!!
On Centos it works well.
On Cygwin one process mapped the shared memory at 0xffd90000, another process is not same with it but mapped at oxffdb0000. we check that the memory 0xffd90000 is available on that process.

Wrong expectation also on Linux, see
https://linux.die.net/man/2/shmat
Be aware that the shared memory segment attached in this way may be
attached at different addresses in different processes. Therefore, any
pointers maintained within the shared memory must be made relative
(typically to the starting address of the segment), rather than
absolute.

nonexistant memory in gdb

I'm working on an OS class project with a variant of HOCA system.
I'm trying to create the interrupt handler part of the OS where I/O device interrupts are detected and handled.
(If you have no idea about HOCA, that's fine) My question is really about the internal manipulation of C.
The whole system work like this:
Main function of the OS calls an init() where all the parts are initialized.
After initializing the OS, the root process is created and the first application is schedule()'ed to the specific application. Then the application processes are created and schedule()'ed in a tree structure which rooted from the root process.
void schedule(){
proc_t *front;
front = headQueue(RQ); //return the first available process in the Ready Queue
if (checkPointer(front)) {
intschedule(); // load a timeslice to the OS
LDST(&(front->p_s)); // load the state to the OS
// so that OS can process the application specified by p_s
// LDST() is system function to load a state to processor
}
else {
intdeadlock(); // unlock a process from the blocked list and put in RQ
}
}
Using gdb, I see everything is ok, until it processes right before if(checkPointer(front))
int checkPointer(void *p){
return ((p != (void *) ENULL)&&(p != (void *)NULL));
}
gdb respond:
trap: nonexistant memory address: -1 memory size: 131072 ERROR:
address greater than MEMORYSIZE
what's going wrong with this?
checkPointer() is located in another file.
Your help is much appreciated.

fork() system call and memory space of the process

I quote "when a process creates a new process using fork() call, Only the shared memory segments are shared between the parent process and the newly forked child process. Copies of the stack and the heap are made for the newly created process" from "operating system concepts" solutions by Silberschatz.
But when I tried this program out
#include <stdio.h>
#include <sys/types.h>
#define MAX_COUNT 200
void ChildProcess(void); /* child process prototype */
void ParentProcess(void); /* parent process prototype */
void main(void)
{
pid_t pid;
char * x=(char *)malloc(10);
pid = fork();
if (pid == 0)
ChildProcess();
else
ParentProcess();
printf("the address is %p\n",x);
}
void ChildProcess(void)
{
printf(" *** Child process ***\n");
}
void ParentProcess(void)
{
printf("*** Parent*****\n");
}
the result is like:
*** Parent*****
the address is 0x1370010
*** Child process ***
the address is 0x1370010
both parent and child printing the same address which is in heap.
can someone explain me the contradiction here. please clearly state what are all the things shared by the parent and child in memory space.

Quoting myself from another thread.
When a fork() system call is issued, a copy of all the pages
corresponding to the parent process is created, loaded into a separate
memory location by the OS for the child process. But this is not
needed in certain cases. Consider the case when a child executes an
"exec" system call or exits very soon after the fork(). When the
child is needed just to execute a command for the parent process,
there is no need for copying the parent process' pages, since exec
replaces the address space of the process which invoked it with the
command to be executed.
In such cases, a technique called copy-on-write (COW) is used. With
this technique, when a fork occurs, the parent process's pages are not
copied for the child process. Instead, the pages are shared between
the child and the parent process. Whenever a process (parent or child)
modifies a page, a separate copy of that particular page alone is made
for that process (parent or child) which performed the modification.
This process will then use the newly copied page rather than the
shared one in all future references. The other process (the one which
did not modify the shared page) continues to use the original copy of
the page (which is now no longer shared). This technique is called
copy-on-write since the page is copied when some process writes to it.
Also, to understand why these programs appear to be using the same space of memory (which is not the case), I would like to quote a part of the book "Operating Systems: Principles and Practice".
Most modern processors introduce a level of indirection, called
virtual addresses. With virtual addresses, every process's memory
starts at the "same" place, e.g., zero.
Each process thinks that it has the entire machine to itself, although
obviously that is not the case in reality.
So these virtual addresses are translations of physical addresses and doesn't represent the same physical memory space, to leave a more practical example we can do a test, if we compile and run multiple times a program that displays the direction of a static variable, such as this program.
#include <stdio.h>
int main() {
static int a = 0;
printf("%p\n", &a);
getchar();
return 0;
}
It would be impossible to obtain the same memory address in two
different programs if we deal with the physical memory directly.
And the results obtained from running the program several times are...

Yes, both processes are using the same address for this variable, but these addresses are used by different processes, and therefore aren't in the same virtual address space.
This means that the addresses are the same, but they aren't pointing to the same physical memory. You should read more about virtual memory to understand this.

The address is the same, but the address space is not. Each process has its own address space, so parent's 0x1370010 is not the same as child's 0x1370010.

You're probably running your program on an operating system with virtual memory. After the fork() call, the parent and child have separate address spaces, so the address 0x1370010 is not pointing to the same place. If one process wrote to *x, the other process would not see the change. (In fact those may be the same page of memory, or even the same block in a swap-file, until it's changed, but the OS makes sure that the page is copied as soon as either the parent or the child writes to it, so as far as the program can tell it's dealing with its own copy.)

When the kernel fork()s the process, the copied memory information inherits the same address information since the heap is effectively copied as-is. If addresses were different, how would you update pointers inside of custom structs? The kernel knows nothing about that information so those pointers would then be invalidated. Therefore, the physical address may change (and in fact often will change even during the lifetime of your executable even without fork()ing, but the logical address remains the same.

Yes address in both the case is same. But if you assign different value for x in child process and parent process and then also prints the value of x along with address of x, You will get your answer.
#include <stdio.h>
#include <sys/types.h>
#include <stdlib.h>
#include <unistd.h>
#define MAX_COUNT 200
void ChildProcess(void); /* child process prototype */
void ParentProcess(void); /* parent process prototype */
void main(void)
{
pid_t pid;
int * x = (int *)malloc(10);
pid = fork();
if (pid == 0) {
*x = 100;
ChildProcess();
}
else {
*x = 200;
ParentProcess();
}
printf("the address is %p and value is %d\n", x, *x);
}
void ChildProcess(void)
{
printf(" *** Child process ***\n");
}
void ParentProcess(void)
{
printf("*** Parent*****\n");
}
Output of this will be:
*** Parent*****
the address is 0xf70260 and value is 200
*** Child process ***
the address is 0xf70260 and value is 100
Now, You can see that value is different but address is same. So The address space for both the process is different. These addresses are not actual address but logical address so these could be same for different processes.

Segfault occurs on initialization in pthread only

I cannot understand why the following pseudo code is causing a segfault.
Using pthreads to run a function I run into a SEGFAULT initializing an integer to zero.
When my_threaded_function not in threaded context or if I called function from the main thread there is no issue.
The SEGFAULT doesn't occur on initializing rc=0;bu only inside the maze_init function.
I have confirmed that I am out of stack space. But I can't think of what is causing the function to behave differently inside the pthread (no shared memory involved), the address &aa cannot be accessed according to gdb.
Why would a stack variable's address not be on the stack?
int maze_init(Maze*m, char* filename)
{
FILE *fp;
int aa, bb, rc;
aa = 0; /// SEGFAULT HERE
...
return 1;
}
void* my_threaded_function(void* arg)
{
Maze maze;
int rc;
rc = 0;
rc = maze_init(&maze,"test.txt");
return rc;
pthread_exit((void*)1);
}
int main(int argc,char** argv){
pthread_t t;
pthread_create(&t, NULL, my_threaded_function,(void*)0);
sleep(10);
}
edit (fixed code typo to return rc)

I have confirmed that I am out of stack space. But I can't think of
what is causing the function to behave differently inside the pthread
Well for one by default secondary threads have smaller stacks than the "main" thread. You can set the size with pthread_attr_setstacksize.
TLPI says:
Each thread has its own stack whose size is fixed when the thread is
created. On Linux/x86-32, for all threads other than the main thread,
the default size of the per-thread stack is 2 MB. The main thread has
a much larger space for stack growth
So that is one reason why it would work when called normally and fail when called from a secondary thread.

98th call to pthread_create() fails

I'm running the following program. It simply creates threads that die straight away.
What I have found is that after 93 to 98 (it varies slightly) successful calls, every next call to pthread_create() fails with error 11: Resource temporarily unavailable. I think I'm closing the thread correctly so it should give up on any resources it has but some resources become unavailable.
The first parameter of the program allows me to set the interval between calls to pthread_create() but testing with different values, I've learned that the interval does not matter (well, I'll get the error earlier): the number of successful calls will be the same.
The second parameter of the program allows me to set a sleep interval after a failed call but the length of the interval does not seem to make any difference.
Which ceiling am I hitting here?
EDIT: found the error in doSomething(): change lock to unlock and the program runs fine. The question remains: what resource is depleted with the error uncorrected?
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>
#include <pthread.h>
#include <errno.h>
pthread_mutex_t doSomethingLock;
void milliSleep(unsigned int milliSeconds)
{
struct timespec ts;
ts.tv_sec = floorf(((float)milliSeconds / 1000));
ts.tv_nsec = ((((float)milliSeconds / 1000) - ts.tv_sec)) * 1000000000;
nanosleep(&ts, NULL);
}
void *doSomething(void *args)
{
pthread_detach(pthread_self());
pthread_mutex_lock(&doSomethingLock);
pthread_exit(NULL);
}
int main(int argc, char **argv)
{
pthread_t doSomethingThread;
pthread_mutexattr_t attr;
int threadsCreated = 0;
if (argc != 3)
{
fprintf(stderr, "usage: demo <interval between pthread_create() in ms> <time to wait after fail in ms>\n");
exit(1);
}
pthread_mutexattr_init(&attr);
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE_NP);
pthread_mutex_init(&doSomethingLock, &attr);
while (1)
{
pthread_mutex_lock(&doSomethingLock);
if (pthread_create(&doSomethingThread, NULL, doSomething, NULL) != 0)
{
fprintf(stderr, "%d pthread_create(): error %d, %m\n", threadsCreated, errno);
milliSleep(atoi(argv[2]));
}
else threadsCreated++;
milliSleep(atoi(argv[1]));
}
}

If you are on a 32 bit distro, you are probably hitting address space limits. The last I checked, glibc will allocate about 13MB for stack space in every thread created (this is just the size of the mapping, not allocated memory). With 98 threads, you will be pushing past a gigabyte of address space of the 3G available.
You can test this by freezing your process after the error (e.g. sleep(1000000) or whatever) and looking at its address space with pmap.
If that is the problem, then try setting a smaller stack size with pthread_attr_setstack() on the pthread_attr_t you pass to pthread_create. You will have to be the judge of your stack requirements obviously, but often even complicated code can run successfully in only a few kilobytes of stack.

Your program does not "create threads that simply die away". It does not do what you think it does.
First, pthread_mutex_unlock() only unlocks a pthread_mutex_t that has been locked by the same thread. This is how mutexes work: they can only be unlocked by the same thread that locked them. If you want the behaviour of a semaphore semaphore, use a semaphore.
Your example code creates a recursive mutex, which the doSomething() function tries to lock. Because it is held by the original thread, it blocks (waits for the mutex to become free in the pthread_mutex_lock() call). Because the original thread never releases the lock, you just pile up new threads on top of the doSomethingLock mutex.
Recursivity with respect to mutexes just means a thread can lock it more than once; it must unlock it the same number of times for the mutex to be actually released.
If you change the pthread_mutex_lock() in doSomething() to pthread_mutex_unlock(), then you're trying to unlock a mutex not held by that thread. The call fails, and then the threads do die immediately.
Assuming you fix your program, you'll next find that you cannot create more than a hundred or so threads (depending on your system and available RAM).
The reason is well explained by Andy Ross: the fixed size stacks (getrlimit(RLIMIT_STACK, (struct rlimit *)&info) tells you how much, unless you set it via thread attributes) eat up your available address space.
The original stack given to the process is resized automatically, but for all other threads, the stack size is fixed. By default, it is very large; on my system, 8388608 bytes (8 megabytes).
I personally create threads with very small stacks, usually 65536 bytes, which is more than enough unless your functions use local arrays or large structures, or do insanely deep recursion:
#ifndef THREAD_STACK_SIZE
#define THREAD_STACK_SIZE 65536
#endif
pthread_attr_t attrs;
pthread_t thread[N];
int i, result;
/* Create a thread attribute for the desired stack size. */
pthread_attr_init(&attrs);
pthread_attr_setstacksize(&attrs, THREAD_STACK_SIZE);
/* Create any number of threads.
* The attributes are only a guide to pthread_create(),
* they are not "consumed" by the call. */
for (i = 0; i < N; i++) {
result = pthread_create(&thread[i], &attrs, some_func, (void *)i);
if (result) {
/* strerror(result) describes the error */
break;
}
}
/* You should destroy the attributes when you know
* you won't be creating any further threads anymore. */
pthread_attr_destroy(&attrs);
The minimum stack size should be available as PTHREAD_STACK_MIN, and should be a multiple of sysconf(_SC_PAGESIZE). Currently PTHREAD_STACK_MIN == 16384, but I recommend using a larger power of two. (Page sizes are always powers of two on any binary architecture.)
It is only the minimum, and the pthread library is free to use any larger value it sees fit, but in practice the stack size seems to be what you set it to, plus a fixed value depending on the architecture, kernel, and the pthread library version. Using a compile-time constant works well for almost all cases, but if your application is complex enough to have a configuration file, it might be a good idea to let the user override the compile-time constant if they want to, in the config file.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

How is thread stack created in C? - c

Related

Cygwin/Cygserver Shared memory

nonexistant memory in gdb

fork() system call and memory space of the process

Segfault occurs on initialization in pthread only

98th call to pthread_create() fails

Categories

Resources