Where is a file descriptor stored in process memory? - c

When a function A is called from a point of execution, internally it is a JMP to the address of function A: the current point of execution is saved onto the stack, the PC is loaded with the address of the called function, and execution continues there.
To get back to the point of execution after the function call, the function body should have matching pushes and pops on the stack. Normally in C, the stack variables defined in a function are destroyed on exit (which I presume means popped off the stack), but I decided to define a file descriptor variable inside my function. The code is below:
#include <stdio.h>
#include <fcntl.h>

void func_call(void);

int main(void) {
    printf("In the beginning there was main()\n");
    func_call();
    printf("func_call complete\n");
    while (1);
}

void func_call(void) {
    int fp;
    // Opening a file to get a handle (a file descriptor) to it.
    fp = open("stack_flush.c", O_RDONLY);
    if (fp < 0) {
        perror("fp could not open stack_flush.c");
        return;
    }
}
On running this program and checking lsof, I can see that the fd is still open even after func_call() has exited.
stack_flu 3791 vvdnlt260 0u CHR 136,1 0t0 4 /dev/pts/1
stack_flu 3791 vvdnlt260 1u CHR 136,1 0t0 4 /dev/pts/1
stack_flu 3791 vvdnlt260 2u CHR 136,1 0t0 4 /dev/pts/1
stack_flu 3791 vvdnlt260 3r REG 8,3 526 24660187 /home/vvdnlt260/Nishanth/test_space/stack_flush.c
I checked the Wikipedia entry for file descriptors and found this:
To perform input or output, the process passes the file descriptor to the kernel through a system call, and the kernel will access the file on behalf of the process. The process does not have direct access to the file or inode tables.
From the above statement it's clear that the file descriptor's integer value is stored in process memory. But although it was defined in a function, the file descriptor was not local to the function, since it did not get removed on function exit.
So my question is twofold:
1) If the file descriptor is part of the func_call() stack, then how does the code return to its pre-function-call execution point even though the descriptor has not been popped off? And in that case, why does it persist after the function call exits?
2) If it is not part of the func_call() stack, where in the process memory does the file descriptor reside?

The variable int fp; is only visible inside the function func_call(), and after this function finishes executing it is popped off the stack; that memory will probably be overwritten when a new function is entered. The fact that you destroy an int value referring to the file does not mean that you close said file. What if you did something like:
#include <fcntl.h>

int global_fd;

void foo(void) {
    int local_fd = open("bar.txt", O_RDONLY); // the descriptor lives in a local...
    global_fd = local_fd;                     // ...but its value can be copied out
}
and then called foo()? Would you expect to no longer be able to use global_fd after foo exits?
It is helpful in this case to think of the file descriptor as a kind of pointer. You ask the kernel to give you the file, and it gives you a value that you can use as a token for this specific file; this token is what you use to tell the kernel which file a function like read or lseek should act on. Whether the token is passed around or destroyed, the file remains open, just as destroying a pointer does not free the allocated memory.
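A minimal sketch of that token behavior (bar.txt is hypothetical and error handling is elided):

#include <fcntl.h>
#include <unistd.h>

int main(void) {
    int fd = open("bar.txt", O_RDONLY); // the kernel hands back a token
    int copy = fd;                      // copying the int copies the token
    fd = -1;                            // "destroying" the original does not close the file
    char c;
    read(copy, &c, 1);                  // the copy still refers to the open file
    close(copy);                        // only close() releases the kernel's entry
    return 0;
}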

When you open a file, there is a table in the kernel where the descriptors of open files are stored. So when you opened your file, you created an entry in that table. If you don't close the file (via its descriptor), the entry is never deleted (which doesn't mean you cannot open the file again).
If the file descriptor is part of the func_call() stack, then how does the code return to its pre-function-call execution point even though the descriptor has not been popped off? And in that case, why does it persist after the function call exits?
As far as I know, there's only one stack per process, not one per function. So the fp variable is stored on the stack of the process and is removed from there when the function ends.

File descriptors are special. As you know, they're just ints. But they "contain" a fair amount of information about the file being read (the location of the file on disk, the position of the read/write pointer within the file, etc.), so where is that information stored? The answer is that it's stored somewhere in the OS kernel. It's stored in the OS kernel because it's the kernel's job to manage file I/O for you. When we say that the int referring to the open file is a "file descriptor", we mean that the int refers to information stored somewhere else, sort of like a pointer. That word "descriptor" is important. Another word that's sometimes used for this sort of situation is "handle".
As you know, the memory for local variables is generally stored on the stack. When you return from a function, releasing the memory for the function's local variables is very simple -- they basically disappear along with the function's stack frame. And when they disappear, they do just disappear: there's no way (in C) to have some action associated with their disappearing. In particular, there's no way to have the effect of a call to close() for variables that happen to be file descriptors.
(If you want to have a cleanup action take place when a variable disappears, one way is to use C++, and use a class variable, and define an explicit destructor.)
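In C the cleanup has to be explicit. A minimal sketch of the question's func_call with the leak fixed (same includes as the question's program, plus unistd.h for close):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

void func_call(void) {
    int fp = open("stack_flush.c", O_RDONLY);
    if (fp < 0) {
        perror("fp could not open stack_flush.c");
        return;
    }
    /* ... use the file ... */
    close(fp); // without this, the kernel's file-table entry outlives fp
}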
A similar situation arises when you call malloc. In this function:
#include <stdlib.h>

void f(void)
{
    char *p = malloc(10);
}
we call malloc to allocate 10 bytes of memory and store the returned pointer in a local pointer variable p, which disappears when function f returns. So we lose the pointer to the allocated memory, but there's no call to free(), so the memory remains allocated. (This is an example of a memory leak.)
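The fix is the same in both cases: release the resource explicitly before its handle disappears. A sketch:

#include <stdlib.h>

void f(void)
{
    char *p = malloc(10);
    /* ... use the memory ... */
    free(p); // explicit release, just as close() is for a file descriptor
}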

Related

Why does printing to stderr cause segmentation fault when dealing with ucontext?

I was working on a project for a course on Operating Systems. The task was to implement a library for dealing with threads, similar to pthreads but much simpler. Its purpose is to practice scheduling algorithms. The final product is a .a file. The course is over and everything worked just fine (in terms of functionality).
Still, I got curious about an issue I faced. In three different functions of my source file, if I add the following line, for instance:
fprintf(stderr, "My lucky number is %d\n", 4);
I get a segmentation fault. The same doesn't happen if stdout is used instead, or if the formatting doesn't contain any variables.
That leaves me with two main questions:
Why does it only happen in three functions of my code, and not the others?
Could the creation of contexts using getcontext() and makecontext(), or the switching of contexts using setcontext() or swapcontext(), mess with the standard file descriptors?
My intuition says those functions could be responsible, all the more so given that the three functions of my code in which this happens are functions whose contexts other parts of the code switch to, usually by setcontext(), though swapcontext() is used to go to the scheduler for choosing another thread to execute.
Additionally, if that is the case, then:
What is the proper way to create threads using those functions?
I'm currently doing the following:
/*------------------------------------------------------------------------------
Funct: Creates an execution context for the function and arguments passed.
Input: uc -> Pointer where the context will be created.
funct -> Function to be executed in the context.
arg -> Argument to the function.
Return: If the function succeeds, 0 will be returned. Otherwise -1.
------------------------------------------------------------------------------*/
static int create_context(ucontext_t *uc, void (*funct)(void *), void *arg)
{
    if (getcontext(uc) != 0) // Gets a context "model"
    {
        return -1;
    }
    void *sp = malloc(STACK_SIZE); // Stack area for the execution context
    if (!sp) // A stack area is mandatory
    {
        return -1;
    }
    uc->uc_stack.ss_sp = sp;           // Sets stack pointer
    uc->uc_stack.ss_size = STACK_SIZE; // Sets stack size
    uc->uc_link = &context_end;        // Sets the context to go to after execution
    makecontext(uc, (void (*)(void))funct, 1, arg); // "Makes everything work" (returns void, can't fail)
    return 0;
}
This code is probably a little modified, but it originally comes from an online example of how to use ucontext.
Assuming glibc, the explanation is that fprintf with an unbuffered stream (such as stderr by default) internally creates an on-stack buffer which has a size of BUFSIZ bytes. See the function buffered_vfprintf in stdio-common/vfprintf.c. BUFSIZ is 8192, so you end up with a stack overflow because the stack you create is too small.
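So the fix is to give each context a stack comfortably larger than BUFSIZ. A minimal sketch; the 64 KiB figure is an assumed comfortable value, not a measured minimum:

/* BUFSIZ is 8192 in glibc, and fprintf to an unbuffered stream such as
   stderr places a BUFSIZ-byte buffer on the context's stack, so STACK_SIZE
   must sit well above 8192 plus ordinary call overhead. */
#define STACK_SIZE (64 * 1024)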

Changing global variable using stat.c

I am trying to print logs dynamically. What I have done is add a debug variable which I set in my own stat_my.c file. Below is the show_stat function.
extern int local_debug_lk;
extern int int_num;         // defined in the driver file
extern int intr_list_seq[]; // recorded IRQ numbers, defined in the driver file

static int show_stat(struct seq_file *p, void *v)
{
    int temp = 0;
    if (local_debug_lk == 0)
    {
        seq_printf(p, "local_debug_lk=0, enabling,int_num=%d\n", int_num);
        local_debug_lk = 1;
    }
    else
    {
        seq_printf(p, "local_debug_lk=:%d,int_num=%d\n", local_debug_lk, int_num);
        while (temp < int_num) {
            seq_printf(p, "%d\n", intr_list_seq[temp]);
            temp++;
        }
        local_debug_lk = 0;
        int_num = 0;
    }
    return 0;
}
Driver file
int local_debug_lk, int_num;

isr_root(...) {
    /*
     * logic to extract IRQ number, saved in vect variable
     */
    if (local_debug_lk && (int_num < 50000)) {
        intr_list_seq[int_num] = vect;
        int_num++;
    }
}
What I expect is that on the first "cat /proc/show_stat" it will enable the local_debug_lk flag, and whenever an interrupt occurs in the driver its number will be stored in the intr_list_seq[] array. And when I do "cat /proc/stat_my" a second time, it should print the IRQ sequence and disable IRQ recording by setting local_debug_lk=0.
But what's happening is that I always get the
"local_debug_lk=0, enabling,int_num=0" log on cat; i.e. local_debug_lk is always zero; it never gets enabled.
Also, when my driver is not running, it works fine!
On two consecutive "cat /proc/stat_my", the value is first set to 1 and then back to 0.
Is it possible my driver is not picking up the latest value of the local_debug_lk variable?
Could you please let me know what I am doing wrong here?
There can be more calls to the .show function than reads of the file (with cat /proc/show_stat). Moreover, the underlying system expects stable results from .show: if called with the same parameters, the function should print the same information to the seq_file.
Because of that, switching a flag in the .show function makes little sense, and making the function's output depend on this flag is simply wrong.
Generally, changing any kernel state when a file is read is not what the user expects. It is better to use write functionality for that.
The .show function actually prints its information into a temporary kernel buffer. If everything goes OK, the information from the buffer is transmitted into the user buffer and eventually printed by cat. But if the kernel buffer is too small, the information printed into it is discarded. In that case the underlying system allocates a bigger buffer and calls .show again.
Also, .show is rerun if the user buffer is too small to accommodate all the information printed.
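A minimal sketch of that write-based approach, assuming a recent kernel using struct proc_ops (the handler name and buffer size are made up for illustration):

#include <linux/proc_fs.h>
#include <linux/uaccess.h>

/* Toggle recording from userspace with: echo 1 > /proc/stat_my */
static ssize_t stat_my_write(struct file *file, const char __user *ubuf,
                             size_t count, loff_t *ppos)
{
    char kbuf[8];

    if (count == 0 || count >= sizeof(kbuf))
        return -EINVAL;
    if (copy_from_user(kbuf, ubuf, count))
        return -EFAULT;
    kbuf[count] = '\0';

    if (kbuf[0] == '1')
        local_debug_lk = 1; /* enable IRQ recording */
    else if (kbuf[0] == '0')
        local_debug_lk = 0; /* disable it */
    else
        return -EINVAL;

    return count; /* consumed all input */
}

This keeps .show a pure reader: it can run any number of times and always print the same thing for the same state.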

Are fclose(), fprintf(), ftell() thread-safe only in terms of each function itself?

Glibc says fclose()/fopen()/fprintf()/ftell() are thread-safe. But what happens when one thread is writing to or reading from the file and another thread is closing it?
Say I have a function that looks like this:
FILE *f; // f is opened when the program starts

int log(char *str)
{
    fprintf(f, "%s", str);
    if (ftell(f) > SIZE_LIMIT) {
        pthread_mutex_lock(&mutex);
        if (ftell(f) > SIZE_LIMIT) {
            fclose(f);
            rename(OLD_PATH, NEW_PATH);
            f = fopen(OLD_PATH, "a");
        }
        pthread_mutex_unlock(&mutex);
    }
    return 0;
}
This function is used by multiple threads to write to the file. Is it safe, i.e. no crashes? Note that the function returning an error is fine; my experiments show that the program crashes intermittently.
EDIT:
1. As #2501 pointed out, "The value of a pointer to a FILE object is indeterminate after the associated file is closed"; this explains the intermittent crashes.
What if I rewrite the code using freopen?
pthread_mutex_lock(&mutex);
if (ftell(f) > SIZE_LIMIT) {
rename(OLD_PATH, NEW_PATH);
f = freopen(OLD_PATH, "a", f);
}
pthread_mutex_unlock(&mutex);
Each of those functions locks a mutex associated with the FILE*. So those functions are 'atomic' with respect to the particular FILE* object. But once the FILE* object is closed, it's invalid for use. So if the FILE* gets closed and another thread tries to use that closed FILE*, then you'll have a failure due to trying to write to a closed file.
Note that this is aside from any data race you might have with the f variable being changed without synchronization with other threads. (From the code snippet we see, it's not clear whether there's a race there, but I'm guessing that there probably is.)
After the stream is closed using fclose, the value of the FILE pointer is indeterminate. This means that using it causes undefined behavior.
7.21.3 Files
... The value of a pointer to a FILE object is
indeterminate after the associated file is closed ...
Since the fprintf call may happen on other threads in the window between the fclose() and the open(), when the value of the pointer f is indeterminate, the behavior of your code is undefined.
To make the code defined, the fprintf call, and any other call using the pointer, should be locked by the mutex as well.
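For illustration, a sketch of that fully locked version, reusing the names from the question (the return convention is an assumption):

int log(char *str)
{
    int rc = 0;
    pthread_mutex_lock(&mutex);  /* every use of f happens under the lock, */
    fprintf(f, "%s", str);       /* so no thread can observe f mid-reopen  */
    if (ftell(f) > SIZE_LIMIT) {
        rename(OLD_PATH, NEW_PATH);
        f = freopen(OLD_PATH, "a", f); /* reuses the FILE object, as in the EDIT */
        if (f == NULL)
            rc = -1; /* reopen failed; per the question, returning an error is fine */
    }
    pthread_mutex_unlock(&mutex);
    return rc;
}

The cost is that every log call now serializes on the mutex, which is the price of making every use of f well-defined.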

segfault during fclose()

fclose() is causing a segfault. I have:
char buffer[L_tmpnam];
char *pipeName = tmpnam(buffer);
FILE *pipeFD = fopen(pipeName, "w"); // open for writing
...
...
...
fclose(pipeFD);
I don't do any file-related stuff in the ... yet, so that doesn't affect it. However, my MAIN process communicates with another process through shared memory where pipeName is stored; the other process fopen's this pipe for reading to communicate with MAIN.
Any ideas why this is causing a segfault?
Thanks,
Hristo
Pass pipeFD to fclose. fclose closes the file by file handle (FILE*), not by filename (char*). With C (unlike C++) you can do implicit type conversions of pointer types (in this case char* to FILE*), so that's where the bug comes from.
Check if pipeFD is non-NULL before calling fclose.
Edit: You confirmed that the error was due to fopen failing; you need to check for the error like so:
pipeFD = fopen(pipeName, "w");
if (pipeFD == NULL)
{
perror ("The following error occurred");
}
else
{
fclose (pipeFD);
}
Example output:
The following error occurred: No such file or directory
A crash in fclose implies the FILE * passed to it has been corrupted somehow. This can happen if the pointer itself is corrupted (check in your debugger to make sure it has the same value at the fclose as was returned by the fopen), or if the FILE data structure gets corrupted by some random pointer write or buffer overflow somewhere.
You could try using valgrind or some other memory corruption checker to see if it can tell you anything, or use a data breakpoint in your debugger on the address of the pipeFD variable. Using a data breakpoint on the FILE itself is tricky, as it is multiple words and is modified by normal file I/O operations.
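For instance, a quick session along these lines (the program name is hypothetical):

$ valgrind ./mainproc    # memcheck reports invalid reads/writes with stack traces
$ gdb ./mainproc
(gdb) watch pipeFD       # data breakpoint: stops when the pointer's value changes
(gdb) run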
You should close pipeFD instead of pipeName.

Seg fault with open command when trying to open very large file

I'm taking a networking class at school and am using C/GDB for the first time. Our assignment is to make a webserver that communicates with a client browser. I am well underway and can open files and send them to the client. Everything goes great until I open a very large file, and then I seg fault. I'm not a pro at C/GDB, so I'm sorry if that is causing me to ask silly questions and not be able to see the solution myself, but when I looked at the dumped core I saw that my seg fault comes here:
if (-1 == (openfd = open(path, O_RDONLY)))
Specifically, we are tasked with opening the file and then sending it to the client browser. My algorithm goes:
Open/Error catch
Read the file into a buffer/Error catch
Send the file
We were also tasked with making sure that the server doesn't crash when SENDING very large files. But my problem seems to be with opening them. I can send all my smaller files just fine. The file in question is 29.5 MB.
The whole algorithm is:
ssize_t send_file(int conn, char *path, int len, int blksize, char *mime) {
int openfd; // File descriptor for file we open at path
int temp; // Counter for the size of the file that we send
char buffer[len]; // Buffer to read the file we are opening that is len big
// Open the file
if (-1 == (openfd = open(path, O_RDONLY))) {
send_head(conn, "", 400, strlen(ERROR_400));
(void) send(conn, ERROR_400, strlen(ERROR_400), 0);
logwrite(stdout, CANT_OPEN);
return -1;
}
// Read from file
if (-1 == read(openfd, buffer, len)) {
send_head(conn, "", 400, strlen(ERROR_400));
(void) send(conn, ERROR_400, strlen(ERROR_400), 0);
logwrite(stdout, CANT_OPEN);
return -1;
}
(void) close(openfd);
// Send the buffer now
logwrite(stdout, SUC_REQ);
send_head(conn, mime, 200, len);
send(conn, &buffer[0], len, 0);
return len;
}
I don't know if it is just that I am a Unix/C novice. Sorry if it is. =( But your help is much appreciated.
It's possible I'm just misunderstanding what you meant in your question, but I feel I should point out that in general, it's a bad idea to try to read an entire file at once, in case you're dealing with something that's just too big for your memory to handle.
It's smarter to allocate a buffer of a specific size, say 8192 bytes (well, that's what I tend to do a lot, anyway), and just read and send that much, as much as necessary, until your read() operation returns 0 (with no errno set) for end of stream.
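A sketch of that chunked loop (error handling is abbreviated, and send() is assumed to write the full chunk; a real server would loop on partial sends):

#include <unistd.h>
#include <sys/socket.h>

static ssize_t send_file_chunked(int conn, int openfd)
{
    char chunk[8192]; // fixed-size buffer: safe on any stack
    ssize_t n, total = 0;

    while ((n = read(openfd, chunk, sizeof chunk)) > 0) {
        if (send(conn, chunk, (size_t)n, 0) == -1)
            return -1; // socket error or client went away
        total += n;
    }
    return (n < 0) ? -1 : total; // n == 0 means clean end of file
}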
I suspect you have a stack overflow (I should get bonus points for using that term on this site).
The problem is you are allocating the buffer for the entire file on the stack all at once. For larger files, this buffer is larger than the stack, and the next time you try to call a function (and thus put some parameters for it on the stack) the program crashes.
The crash appears at the open line because allocating the buffer on the stack doesn't actually write any memory; it just changes the stack pointer. When your call to open tries to write the parameters to the stack, the top of the stack is now overflowed and this causes the crash.
The solution is, as Platinum Azure or dreamlax suggest, to read in the file a little at a time or to allocate your buffer on the heap with malloc or new.
Rather than using a variable-length array, perhaps try allocating the memory using malloc:
char *buffer = malloc (len);
...
free (buffer);
I just did some simple tests on my system, and when I use variable length arrays of a big size (like the size you're having trouble with), I also get a SEGFAULT.
You're allocating the buffer on the stack, and it's way too big.
When you allocate storage on the stack, all the compiler does is decrease the stack pointer enough to make that much room (this keeps stack variable allocation to constant time). It does not try to touch any of this stacked memory. Then, when you call open(), it tries to put the parameters on the stack and discovers it has overflowed the stack and dies.
You need to either operate on the file in chunks, memory-map it (mmap()), or malloc() storage.
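A sketch of the mmap() route (error handling simplified, and send() is again assumed to write everything in one call):

#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>

static ssize_t send_file_mapped(int conn, int openfd)
{
    struct stat st;
    if (fstat(openfd, &st) == -1)
        return -1;

    /* Map the file read-only; nothing file-sized ever lands on the stack. */
    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, openfd, 0);
    if (map == MAP_FAILED)
        return -1;

    ssize_t sent = send(conn, map, (size_t)st.st_size, 0);
    munmap(map, (size_t)st.st_size);
    return sent;
}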
Also, path should be declared const char*.
