Pointer to local variable outside the scope of its declaration - c

Let's say I have a structure representing a PDF document pdf and a structure representing one of its pages pdf_page:
typedef struct pdf_page {
int page_no;
pdf_page *next_page;
char *content;
} pdf_page;
typedef struct {
pdf_page *first_page, *last_page;
} pdf;
From my main(), I call create_pdf_file(pdf *doc):
void main() {
pdf doc;
create_pdf_file(&doc);
// reading the linked list of pages here
}
Assume that create_pdf_file is something along these lines:
void
create_pdf_file(pdf *doc) {
for (int i = 0; i < 10; i++) {
pdf_page p;
p.page_no = i;
p.contents = "Hello, World!";
doc->last_page->next_page = p;
}
}
(This is merely an example source code, so no list processing is shown. Obviously, the first_page and last_page members of pdf need to be set first.)
My question: If I access doc->first_page - as well as the other pages in the linked list - after the create_pdf_file() call in my main(), is it possible that I get segmentation faults because of "taking the local variable p out of its context"?
(I am not sure whether I have guaranteed that the corresponding memory location will not be used for something else.)
If so, how do I avoid this?

yes, p is a local variable stored on the stack, when the lifetime ends (every loop iteration) any pointer to it gets invalid. you need to allocate every page with malloc() and free() it after you are finished.
this would look similar to:
for (int i = 0; i < 10; i++)
{
pdf_page* p = malloc(sizeof(pdf_page));
p->page_no = i;
p->contents = "Hello, World!";
doc->last_page->next_page = p;
}
and when you call your function you have to pass a pointer to doc:
create_pdf_file(&doc);

is it possible that I get segmentation faults because of "taking the local variable p out of its context"?
Once the block in which p is declared terminates, any pointer to p is invalid (a "dangling pointer") and attempting to dereference such a pointer is Undefined Behaviour. In other words, don't do it: you could get segmentation faults, or any other behaviour (including random memory corruption or the use of the wrong data without any error condition.)
(I am not sure whether I have guaranteed that the corresponding memory location will not be used for something else.)
You've guaranteed that the lifetime of p is shorter than a pointer to p.
If so, how do I avoid this?
Use malloc to dynamically allocate a memory region of the correct size to hold the datum. Don't forget to free the memory when you no longer need it.

Related

Memory allocation using for loop

My Doubt is regarding only memory allocation so don't think about program output
#include<stdio.h>
int main(){
for(int i=0;i<20;i++){
char *str=malloc(sizeof(char)*6); //assuming length of each string is 6
scanf("%s",str);
insertinlinkedlist(str);
}
}
whenever i allocate memory here as shown above only the base address of char array will pass to linked list,and that is the memory block allocated for char array is inside main only and i am storing the base address of that array in str which is local to main and is passed to insetinlinkedlist
I want to ask whenever memory is allocated inside loop than why the number of
memory blocks(no of char arrays declared ) are created equal to n (number of time loop runs) since variable name is same we should be directed to same memory location
Note I have checked in compiler by running the loop all the times when loop runs memory the value of str is different
is The above method is correct of allocating memory through loop and through same variable "Is the method ensures that every time we allocate memory in above manner their will be no conflicts while memory allocation and every time we will get the address of unique memory block"
Now above doubt also creates a doubt in my mind
That if we do something like that
int main(){
for(int i=0;i<n;i++){
array[50];
}
}
then it will also create 50 array inside stack frame
malloc returns a pointer to the first allocated byte. Internally it keeps track of how much memory was allocated so it knows how much to free (you do need to insert calls to free() or you'll leak memory, by the way). Usually, it does this by allocating a little bit of memory before the pointer it gives you and storing the length there, however it isn't required to do it that way.
The memory allocated by malloc is not tied to main in any way. Currently main is the only function whose local variables have a pointer to that memory, but you could pass the pointer to another function, and that function would also be able to access the memory. Additionally, when the function that called malloc returns, that memory will remain allocated unless manually freed.
The variable name doesn't matter. A pointer is (to first approximation) just a number. Much like how running int a = 42; a = 20; is permitted and replaces the previous value of a with a new one, int *p = malloc(n); p = malloc(n); will first assign the pointer returned by the first malloc call to p, then will replace it with the return value of the second call. You can also have multiple pointers that point to the same address:
int *a = malloc(42);
int *b = malloc(42);
int *c = a;
a = malloc(42);
At the end of that code, c will be set to the value returned by the first malloc call, and a will have the value returned by the last malloc call. Just like if you'd done:
//assume here that f() returns a different value each time
//it's called, like malloc does
int a = f();
int b = f();
int c = a;
a = f();
As for the second part of your question:
for(int i=0;i<n;i++){
int array[50];
}
The above code will create an array with enough space for 50 ints inside the current stack frame. It will be local to the block within the for loop, and won't persist between iterations, so it won't create n separate copies of the array. Since arrays declared this way are part of the local stack frame, you don't need to manually free them; they will cease to exist when you exit that block. But you could pass a pointer to that array to another function, and it would be valid as long as you haven't exited the block. So the following code...
int sum(int *arr, size_t n) {
int count = 0;
for (size_t i = 0; i < n; i++) {
count += arr[i];
}
return count;
}
for(int i=0;i<n;i++){
int array[50];
printf("%d\n", sum(array, 50));
}
...would be legal (from a memory-management perspective, anyway; you never initialize the array, so the result of the sum call is not defined).
As a minor side note, sizeof(char) is defined to be 1. You can just say malloc(6) in this case. sizeof is necessary when allocating an array of a larger type.

Fortune while returning C arrays?

I'm newbie with C and stunned with some magic while using following C functions. This code works for me, and prints all the data.
typedef struct string_t {
char *data;
size_t len;
} string_t;
string_t *generate_test_data(size_t size) {
string_t test_data[size];
for(size_t i = 0; i < size; ++i) {
string_t string;
string.data = "test";
string.len = 4;
test_data[i] = string;
}
return test_data;
}
int main() {
size_t test_data_size = 10;
string_t *test_data = generate_test_data(test_data_size);
for(size_t i = 0; i < test_data_size; ++i) {
printf("%zu: %s\n", test_data[i].len, test_data[i].data);
}
}
Why function generate_test_data works only when "test_data_size = 10", but when "test_data_size = 20" process finished with exit code 11? HOW does it possible?
This code will never work perfectly, it just happens to be working. In C, you have to manage the memory yourself. If you make a mistake, the program might continue to work... or something might scribble all over the memory you thought was yours. This often manifests itself as weird errors like you're having: it works when the length is X, but fails when the length is Y.
If you turn on -Wall, or if you're using clang even better -Weverything, you'll get a warning like this.
test.c:18:12: warning: address of stack memory associated with local variable 'test_data' returned
[-Wreturn-stack-address]
return test_data;
^~~~~~~~~
The two important kinds of memory in C are: stack and heap. Very basically, stack memory is only good for the duration of the function. Anything declared on the stack will be freed automatically when the function returns, sort of like local variables in other languages. The rule of thumb is if you don't explicitly allocate it, it's on the stack. string_t test_data[size]; is stack memory.
Heap memory you allocate and free yourself, usually using malloc or calloc or realloc or some other function doing this for you like strdup. Once allocated, heap memory stays around until it's explicitly deallocated.
Rule of thumb: heap memory can be returned from a function, stack memory cannot... well, you can but that memory slot might then be used by something else. That's what's happening to you.
So you need to allocate memory, not just once, but a bunch of times.
Allocate memory for the array of pointers to string_t structs.
Allocate memory for each string_t struct in the array.
Allocate memory for each char string (really an array) in each struct.
And then you have to free all that. Sound like a lot of work? It is! Welcome to C. Sorry. You probably want to write functions to allocate and free string_t.
static string_t *string_t_new(size_t size) {
string_t *string = malloc(sizeof(string_t));
string->len = 0;
return string;
}
static void string_t_destroy(string_t *self) {
free(self);
}
Now your test data function looks like this.
static string_t **generate_test_data_v3(size_t size) {
/* Allocate memory for the array of pointers */
string_t **test_data = calloc(size, sizeof(string_t*));
for(size_t i = 0; i < size; ++i) {
/* Allocate for the string */
string_t *string = string_t_new(5);
string->data = "test";
string->len = 4;
test_data[i] = string;
}
/* Return a pointer to the array, which is also a pointer */
return test_data;
}
int main() {
size_t test_data_size = 20;
string_t **test_data = generate_test_data_v3(test_data_size);
for(size_t i = 0; i < test_data_size; ++i) {
printf("%zu: %s\n", test_data[i]->len, test_data[i]->data);
}
/* Free each string_t in the array */
for(size_t i = 0; i < test_data_size; i++) {
string_t_destroy(test_data[i]);
}
/* Free the array */
free(test_data);
}
Instead of using pointers you could instead copy all the memory you use, which is sort of what you were previously doing. That's easier for the programmer, but inefficient for the computer. And if you're coding in C, it's all about being efficient for the computer.
Because the space for test_data in v1 gets created in the function, and that space gets reclaimed when the function returns (and can thus be used for other things); in v2, the space is set aside outside of the function, so doesn't get reclaimed.
Why function generate_test_data_v1 works only when "test_data_size = 10", but when "test_data_size = 20" process finished with exit code 11?
I see no reason why function generate_test_data_v1() should ever fail, but you cannot use its return value. It returns a pointer to an automatic variable belonging to the function's scope, and automatic variables cease to exist when the function to which they belong returns. Your program produces undefined behavior when it dereferences that pointer. I can believe that it appears to work as you intended for some sizes, but even in those cases the program is wrong.
Moreover, your program is very unlikely to be producing an exit code of 11, but it may well be terminating abruptly with a segmentation fault, which is signal 11.
And why generate_test_data_v2 works perfectly?
Function generate_test_data_v2() populates elements of an existing array belonging to function main(). That array is in scope for substantially the entire life of the program.

c char pointers memory and global variables

The pointer to my global variable is turning to crud after freeing the local resource that I use to set the value in c.
this is the .c class
char* resource_directory;
void getResourcePath()
{
char *basePath = SDL_GetBasePath();
char* resource_dir = (char*)malloc(37 * sizeof(char));
for(int i = 0; i < 25; i++)
{
resource_dir[i] = basePath[i];
}
strcat(resource_dir, "resources/");
resource_dir[36] = '\0';
*resource_directory = *resource_dir;
free(basePath);
// free(resource_dir); <--- If I free here the value goes to crud
}
(this line below should say the value at resresource_directorydir equals the value at resource_dir) right?
*resource_directory = *resource_dir;
so the value at the address of the first pointer should get the value of the address at the 2nd but after trying to free the resource towards the end of the function.
even doing a print statement of the addresses show that they have different addresses.
SDL_Log("%d, %d", &resource_directory, &resource_dir);
example output : 245387384, 1361037488
I get the feeling that I am making a silly mistake here but I don't know what it is.
This line,
*resource_directory = *resource_dir;
is assigning the first value resource_dir points to, to the uninitialized pointer resource_directory, it's equivalent to
resource_directory[0] = resource_dir[0];
which is clearly not what you want.
You need to assign the pointer
resource_directory = resource_dir;
but you shouldn't use a global variable for that, and specially
Don't malloc() it, you have to free() everything you malloc() and global variables make it hard.
Don't use malloc() for fixed size objects, instead declare it as an array with the appropriate size, like this
char resource_directory[37];
Copy strings with strcpy() instead of writing a loop your self
for(int i = 0; i < 25; i++)
{
resource_dir[i] = basePath[i];
}
woule be
strcpy(resource_dir, basePath);
One thing you should notice when using a global variable like this is that if you call getResourcesPath() more than once, you are going to leak resources, if you must use global variables to carry values that need to live as long as the whole program lives, try to make their initialization static, and you can completely avoid using global variables for that, because everything that you declare and initialized in the stack frame of main() will hafe the same lifetime as the program, so you can pass it as parameters to any function that requires them from within main(), if you have many of these variables, create a struct to hold them, and pass the struct across the functions that need these resources, this is a very common technique in fact.
This statements
SDL_Log("%d, %d", &resource_directory, &resource_dir);
outputs as integer values the addresses of global variables resource_directory and local variable resource_dir. Of course their addresses are different.
As for this statement
*resource_directory = *resource_dir;
then it stores the first character of the string pointed to by resource_dir at the address that is stored in pointer resource_directory. However initially resource_directory was initialized by zero as an object with the static storage duration. So the program has undefined behaviour.
I think you mean the following
resource_directory = resource_dir;
That is you wanted that resource_directory would point to the string built in the function.
And there is no sense to use statement
free(resource_dir); <--- If I free here the value goes to crud
necause in this case statement
resource_directory = resource_dir;
also does not have sense.
If resource_directory need to point to the built in the function string then you shall not destroy it.
Take into account that using magic numbers 37 and 25 makes the program unclear and error-prone.
A pointer is a variable that contains a numeric value which happens to be the address of a memory location. For example, a NULL pointer is a variable containing the value 0. Assigning the notion of "isa pointer" to is a way of notifying the compiler of your intended use: that is, you cannot use pointer syntax on non-pointer variables. But until you do use pointer syntax, they more or less behave like normal variables.
int i = 0;
int* p = &i;
int j;
int* q;
j = i; // j has the same value as i
q = p; // q has the same value as p
At the end of the above code, p and q point to the same address. We only use the * syntax when we want to dereference a pointer:
q = p;
*q = *p; // copies the `int` pointed to by `p` to
// the `int` memory location pointed to by `q`,
// which is the same location.
Note that it is possible to nest pointers:
int** pp = &p;
p is int*, so &p is int**.
pp is a numeric value, it is the address of a memory location that contains an int*. *pp means fetch the value at the memory location contained in pp which will retrieve a second numeric value - the value that is in p, which is itself an int* and thus a second address. **pp will retrieve us the integer to which p points.
Your assignment
*resource_directory = *resource_dir;
copies the pointed-to-values, not the addresses.
Since you have tagged your question as C++ I'm going to conclude by offering this alternative implementation:
std::string resource_dir;
void getResourcePath()
{
char *basePath = SDL_GetBasePath();
resource_dir = base_path;
resource_dir += "resources/";
free(basePath);
}
If you need the c-string value of resource_dir subsequently, use resource_dir.c_str().
Alternatively, if you require a C implementation:
char* resource_dir;
void getResourcePath()
{
const char subPath[] = "resources/";
size_t length;
char *basePath = SDL_GetBasePath();
if (resource_dir)
free(resource_dir);
length = strlen(basePath) + sizeof(subPath);
resource_dir = (char*)malloc(length);
snprintf(resource_dir, length, "%s%s", basePath, subPath);
free(basePath);
}

Variable scope inside while loop

This is perhaps one of the most odd things I've ever encountered. I don't program much in C but from what I know to be true plus checking with different sources online, variables macroName and macroBody are only defined in scope of the while loop. So every time the loop runs, I'm expecting marcoName and macroBody to get new addresses and be completely new variables. However that is not true.
What I'm finding is that even though the loop is running again, both variables share the same address and this is causing me serious headache for a linked list where I need to check for uniqueness of elements. I don't know why this is. Shouldn't macroName and macroBody get completely new addresses each time the while loop runs?
I know this is the problem because I'm printing the addresses and they are the same.
while(fgets(line, sizeof(line), fp) != NULL) // Get new line
{
char macroName[MAXLINE];
char macroBody[MAXLINE];
// ... more code
switch (command_type)
{
case hake_macro_definition:
// ... more code
printf("**********%p | %p\n", &macroName, &macroBody);
break;
// .... more cases
}
}
Code that is part of my linked-list code.
struct macro {
struct macro *next;
struct macro *previous;
char *name;
char *body;
};
Function that checks if element already exists inside linked-list. But since *name has the same address, I always end up inside the if condition.
static struct macro *macro_lookup(char *name)
{
struct macro *temp = macro_list_head;
while (temp != NULL)
{
if (are_strings_equal(name, temp->name))
{
break;
}
temp = temp->next;
}
return temp;
}
These arrays are allocated on the stack:
char macroName[MAXLINE];
char macroBody[MAXLINE];
The compiler has pre-allocated space for you that exists at the start of your function. In other words, from the computer's viewpoint, the location of these arrays would the same as if you had defined them outside the loop body at the top of your function body.
The scope in C merely indicates where an identifier is visible. So the compiler (but not the computer) enforces the semantics that macroName and macroBody cannot be referenced before or after the loop body. But from the computer's viewpoint, the actual data for these arrays exists once the function starts and only goes away when the function ends.
If you were to look at the assembly dump of your code, you'd likely see that your machine's frame pointer is decremented by a big enough amount for your function's call stack to have space for all of your local variables, including these arrays.
What I need to mention in addition to chrisaycock's answer: you should never use pointers to local variables outside function these variables were defined in. Consider this example:
int * f()
{
int local_var = 0;
return &local_var;
}
int g(int x)
{
return (x > 0) ? x : 0;
}
int main()
{
int * from_f = f(); //
*from_f = 100; //Undefined behavior
g(15); //some function call to change stack
printf("%d", *from_f); //Will print some random value
return 0;
}
The same, actually, applies to a block. Technically, block-local variables can be cleaned out after the block ends. So, on each iteration of a loop old addresses can be invalid. It will not be true since C compiler indeed puts these vars to the same address for perfomance reasons, but you can not rely on it.
What you need to understand is how memory is allocated. If you want to implement a list, it is a structure that grows. Where does the memory come from? You can not allocate much memory from the stack, plus the memory is invalidated once you return from a function. So, you will need to allocate it from the heap (using malloc).

pointer and which is pointed by the pointer

Update : Sorry, just a big mistake. It is meaningless to write int *a = 3; But please just think the analogy to the case like TCHAR *a = TEXT("text"); (I edited my question, so some answers and comments are strange, since they are for my original question which is not suitable)
In main function, suppose I have a pointer TCHAR *a = TEXT("text"); Then it excutes the following code:
int i;
for (i = 0; i < 1000; i++) {
a = test(i);
}
with the function TCHAR* test(int par) defined by:
TCHAR* test(int par)
{
TCHAR *b = TEXT("aaa");
return b;
}
My question is, after executing the above code, but before the program ends, in the memory:
1. the pointer `a` remains?
2. The 1000 pointers `b` are deleted each time the function test(...) exits ?
3. But there are still 1000 memory blocks there?
In fact, my question is motivated from the following code, which shows a tooltip when mouse is over a tab item in a tab control with the style TCS_TOOLTIPS:
case WM_NOTIFY
if (lpnmhdr->code == TTN_GETDISPINFO) {
LPNMTTDISPINFO lpnmtdi;
lpnmtdi = (LPNMTTDISPINFO)lParam;
int tabIndex = (int) wParam; // wParam is the index of the tab item.
lpnmtdi->lpszText = SetTabToolTipText(panel->gWin.At(tabIndex));
break;
}
I am thinking if the memory usage increases each time it calls
SetTabToolTipText(panel->gWin.At(tabIndex)), which manipulates with TCHAR and TCHAR* and return a value of type LPTSTR.
Yes, the pointer a remains till we return from the main function
The variable b (a 4-byte pointer) is automatic. It is created each time we call test function. Once we return from it, the variable disappears (the pointer). Please note, the value to which b points isn't affected.
No. In most of the cases, I think, there will be only one block allocated during compilation time (most likely in the read-only memory) and the function will be returning the same pointer on every invocation.
If SetTabToolTipText allocates a string inside using some memory management facilities new/malloc or some os-specific, you should do an additional cleanup. Otherwise there'll be a memory leak.
If nothing like this happens inside (it's not mentioned in the documentation or comments etc), it's most likely returning the pointer to some internal buffer which you typically use as readonly. In this case, there should be no concerns about a memory consumption increase.
You dont allocate any memory so you don't have to worry about memory being freed. When your vaiables go out of scope they will be freed automatically. In this function
int test(int par)
{
int *b = par;
}
you don't have a return value even though the function says that is will return an int, so you should probably do so as in this line
for (i = 0; i < 1000; i++) {
a = test(i);
}
you assign to a the value that is returned by test(). Also
int* a = 3;
int* b = par;
are asking for trouble. You are assigning integer values to a pointer variable. You should probably rethink your above code.
Pointer should contain adress... so int* a = 3 is something meaningless... And in function you don't allocate memory for int (only for par variable, which then destroy when the function ends), you allocate memory for storing adress in int* b, this memory also free when the funciton ends.

Resources