can pointers exist outside their scope - c

Why does the pointer "a" point to the correct location when I call the function a second time?
During the second call to "cr" the statements in the if block are not executed, so how does the pointer "a" remember its previous location even though it is not a static variable?
code:
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
typedef struct heavy{
int data;
struct heavy *nxt;
}ode;
void cr(ode**);
void prin(ode*);
int main()
{
ode*head=NULL;
cr(&head);cr(&head);
prin(head);
getch();
return 0;
}
void cr(ode**p)
{
ode*temp,*a;
temp=*p;
if(temp==NULL)
{
a=(ode*)malloc(sizeof(ode));
a->data=1;
a->nxt=(ode*)malloc(sizeof(ode));
*p=a;
a=a->nxt;
a->nxt=NULL;
}else{
a->data=2;
a->nxt=NULL;
}
}
void prin(ode*head)
{
if(head==NULL)
printf("list is empty");
while(head!=NULL)
{
printf("%d",head->data);
head=head->nxt;
}
}

Reading a local variable that has never been assigned a value (here, a) is undefined behavior: the program may do anything. Your program works correctly only by chance, and you should always assign a value to a variable before using it.
In this particular case I can guess why it works every time you run it; it has to do with how function calls work in C. Let me explain.
When we call a function, a new frame (layer) is created on the stack (a place in memory where local variables and other "local-ish" things are stored). As you can expect from what I just said, the stack is organized in layers. Here is an example.
If in a particular function called George() I declare and use 2 local variables
int George(){
int a;
int b;
a = 5;
return 0;
}
the compiler will know that space for 2 variables is needed, so when I call the function it will reserve space for those 2 local variables to be stored. The new stack frame will be something like:
| 'a': ____ | <-- 4 bytes space for variable a
| 'b': ____ | <-- 4 bytes space for variable b
|-----------|
(While keeping in mind that this is not a realistic representation of the stack, it's good enough to explain what's going on)
Those spaces reserved for the values of the variables are NOT set to a default value, they contain what there was in memory before (and right now we can't make any guess).
When I call this function from another function (i.e. main) a stack frame with that shape will be added (push) to the stack:
int main(){
int hello = 7;
int hey;
hey = George(); // Here we make the function call
return 0;
}
The stack will then be something like:
STACK:
| 'a': ____ | <- 'George' stack frame, containing
| 'b': ____ | local variables of George
|--------------|
| 'hello': 7 | <- 'main' stack frame, containing
| 'hey': ____ | local variables of main
|--------------|
After the 3rd line of George, just before the return, the stack will be:
STACK:
| 'a': 5 | <- Variable a has been set to 5
| 'b': ____ |
|--------------|
| 'hello': 7 |
| 'hey': ____ |
|--------------|
and then there will be a return. Here, the stack is pop'd, which means a frame is deleted (we are returning in the main function "domain", so we discard local variables of George).
STACK:
| 'hello': 7 | <- 'main' stack frame, with hey replaced with
| 'hey': 0 | George return value (0)
|--------------|
And everything works fine, BUT the memory we just pop'd is not set to 0 or some other default value. It stays as it was until something else (typically another function call) overwrites it. Which means that, while we are not using those values anymore, they're probably still there.
In this state, if we call George another time, we will have our stack pushed another time:
STACK:
| 'a': ____ | <- 'a' address is in the same position
| 'b': ____ | where there was the 'a' in the previous
|--------------| function call to George
| 'hello': 7 |
| 'hey': ____ |
|--------------|
In this state, if we don't assign any value to a, it will (probably) contain the value that a had during the previous call: the 'new' a has the same address as the 'previous' a, because the same function is called and no other function call in between could have overwritten the old value. So before we assign anything to the new local variable a, it contains 5.
The same happens to your program when you run
cr(&head);
cr(&head);
twice in a row. The value of your local variable a is probably left unchanged between the two calls.
Anyway, explanation apart, NEVER rely on this kind of behavior in your code. This is a very, very bad way of programming, and the outcome is usually unpredictable.
I hope my explanation was not too bad.
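For comparison, here is a minimal sketch of how the list could be extended without depending on a leftover value: instead of hoping that a still holds its old contents, the function walks the list from *head to find the last node. The names node, append_value and free_list are made up for this illustration and do not come from the question.

```c
#include <stdlib.h>

/* Hypothetical node type for this sketch (the question calls it "ode"). */
typedef struct node {
    int data;
    struct node *nxt;
} node;

/* Append a value at the end of the list. Returns 0 on success and -1 if
   malloc fails. Every pointer is assigned before it is read. */
int append_value(node **head, int value)
{
    node *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return -1;
    fresh->data = value;
    fresh->nxt = NULL;

    if (*head == NULL) {
        *head = fresh;            /* empty list: the new node becomes the head */
        return 0;
    }
    node *tail = *head;
    while (tail->nxt != NULL)     /* walk to the last node */
        tail = tail->nxt;
    tail->nxt = fresh;
    return 0;
}

/* Free every node of the list. */
void free_list(node *head)
{
    while (head != NULL) {
        node *next = head->nxt;
        free(head);
        head = next;
    }
}
```

Starting from head = NULL, calling append_value(&head, 1) and then append_value(&head, 2) builds the list 1 -> 2 no matter what happens to be left on the stack between the two calls.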

Related

Creating a singly-linked list

I have a function to join two structs to create a linked list.
Here, is the code:
struct point{
int x;
int y;
struct point *next;
};
void printPoints(struct point *);
void printPoint(struct point *);
struct point * append(struct point *, struct point *);
void main(){
struct point pt1={1,-1,NULL};
struct point pt2={2,-2,NULL};
struct point pt3={3,-3,NULL};
struct point *start, *end;
start=end=&pt1;
end=append(end,&pt2);
end=append(end,&pt3);
printPoints(start);
}
void printPoint(struct point *ptr){
printf("(%d, %d)\n", ptr->x, ptr->y);
}
struct point * append(struct point *end, struct point *newpt){
end->next=newpt;
return newpt;
}
void printPoints(struct point *start){
while(start!=NULL){
printPoint(start);
start=start->next;
}
}
Here, the append function's task involves changing the end pointer.
Both the arguments of append function are pointers; in 1st case, 1st argument is &pt1 and 2nd argument is &pt2.
The function makes a copy of the end pointer which has the type struct point.
Since &pt1 is passed then this duplicate end pointer has x component as 1 and y component as -1 and next component as NULL.
Now we change this copy's next component to newpt pointer and return the newpt pointer.
Back to the main function, the original end pointer now has the value of &pt2.
end->next = newpt; shouldn't produce any change in the original end pointer in main because only the local pointer was changed.
So why do I get a linked list?
What I get:
(1, -1)
(2, -2)
(3, -3)
What I think I should get:
(1, -1)
end->next = newpt; shouldn't produce any change in the original end pointer in main. Because, only the local pointer was changed
Not quite correct. It is true that a copy of end is made when you call append. However, the -> operator dereferences the pointer and accesses a member of the object it points to; you would get the same behavior with (*end).next. Since the end in main and the end in append hold the same address, they both point to the same object. You could have 100 copies of a pointer, all pointing to the same thing. If you pick one, follow it and change the object it points to, then you have changed the object that the other 99 pointers point to as well. Furthermore, you reassign end in main with the newpt that append returns, so each call to append results in an updated end. The output you observe is correct. Consider the condensed stack frames:
In main, at first call to append:
____main____
|___pt1____| <----+ <-+
|x=1 y=-1 | | |
|_next=NULL| | |
|___pt2____| | |
|___pt3____| | |
|__start___|------+ |
|___end____|-----------+ // cell sizes NOT drawn to scale
// start and end both point to pt1
Now, on the first call to append, the main stack frame stays the same, and a new one is created for append, where end and the address to pt2 are passed in.
|___main____
|___pt1____| <----+ <-+
|_next=NULL| | | // x and y omitted for brevity
|___pt2____| | |
|___pt3____| | |
|___end____|------+ |
|
___append__ |
|___&pt2___| |
|___end____|-----------+ // also points to pt1 back in main
When you use the -> operator, you dereference the pointer. In this case that gives pt1, because both end in main and end in append point to pt1. In append, you do
end->next = newpt;
which is the address of pt2. So now your stack frames look like this:
|___main____
|___pt1____| <-----------+ <-+
|_next=&pt2|------+ | | // x and y omitted for brevity
| | | | | // (made pt1 cell bigger just to make the picture clearer, no other reason)
|__________| | | |
|___pt2____| <----+ | |
|___pt3____| | |
|___end____|-------------+ |
|
___append__ |
|___&pt2___| |
|___end____|------------------+ // also points to pt1 back in main
Finally, when you return from append, you return the address of pt2 and assign it to end, so your stack in main looks like this before the 2nd call to append (again, some cells made larger for picture clarity, this does not suggest anything grew in size):
____main____
|___pt1____| <-----------+
|_next=&pt2|---+ |
| | | |
|__________| | |
|___pt2____| <-+ <-+ |
|___pt3____| | |
|___end____|-------+ |
|___start__|-------------+ // flipped start and end position to make diagram cleaner, they don't really change positions on the stack
And you do it all again with your next call to append, passing in end (which now points to pt2) and the address of pt3. After all the calls to append, start points to pt1, pt1.next points to pt2, and pt2.next points to pt3, just as you see in the output.
One final note: you have an incorrect function signature for main. It should return int, for example int main(void).
As in the illustration, start still points to pt1 and end points to pt3.
void main(){
struct point pt1={1,-1,NULL};
struct point pt2={2,-2,NULL};
struct point pt3={3,-3,NULL};
struct point *start, *end;
start=end=&pt1;
end=append(end,&pt2);
end=append(end,&pt3);
printPoints(start);
}
In the main function, you make both start and end point at pt1.
That's why any change made to pt1 through end is also seen through start.
struct point * append(struct point *end, struct point *newpt){
end->next=newpt;
return newpt;
}
In the append function, end->next=newpt sets the next member of the object that end points to. In the first call, end points at pt1, so pt1.next is set to point at pt2. This change in the list is also visible from start.
Hence, the output you are getting is correct.
Changing Pointers
When you pass a pointer to a function, the value of the pointer (that is, the address) is copied into the function, not the value it is pointing at.
So, when you dereference the pointer and change what it points to, the change is also seen through any pointer that contains the same address.
Remember that p->next is the same as (*p).next.
#include <stdio.h>
void change_the_pointer_reference(int* ip)
{
int i = 1;
*ip = i;
printf("%d\n", *ip); // outputs 1
}
int main()
{
int i = 0;
change_the_pointer_reference(&i);
printf("%d\n", i); // outputs 1
}
But as the value of the pointer is copied, if you assign to the pointer, this change is only seen locally.
#include <stdio.h>
void change_the_pointer(int* ip)
{
int i = 1;
ip = &i;
printf("%d\n", *ip); // outputs 1
}
int main()
{
int i = 0;
change_the_pointer(&i);
printf("%d\n", i); // outputs 0
}
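If a function really has to change the caller's pointer, one common approach is to pass the address of that pointer (an int **), much like the cr(ode **p) function earlier on this page does with the list head. A minimal sketch; the names redirect, target and demo_redirect are made up for this example:

```c
/* Hypothetical object with static storage, so its address stays valid
   after any function returns. */
static int target = 42;

/* Receives the address of the caller's pointer, so assigning through
   *ipp changes the caller's pointer itself, not a local copy. */
void redirect(int **ipp)
{
    *ipp = &target;
}

/* Small demo: returns the value the pointer refers to after the call. */
int demo_redirect(void)
{
    int i = 0;
    int *p = &i;
    redirect(&p);     /* p now points to target instead of i */
    return *p;
}
```

Here the assignment inside redirect is visible in the caller because the function works on the caller's pointer through one more level of indirection.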
One last note: the signature of main is incorrect. It should be int main(void) rather than void main().

How is memory allocated to multi-nested structs in C?

A couple of days ago, I asked this question. I duplicated 90% of the code given in the answer to my previous question. However, when I used Valgrind to do memcheck, it told me that there were memory leaks. But I don't think it was that 10% difference that caused the memory leaks. In addition to the memory leak issue, I have a couple of other questions.
A brief summary of my previous post:
I have a multi-nested struct. I need to correctly allocate memory to it and free the memory later on. The structure of the entire struct should look like this:
College Sys
| | | ... |
ColleA ColleB ColleC ... ColleX
| | | | | | | | | | | | ... | | | |
sA sB sC sD sA sB sC sD sA sB sC sD ... sA sB sC sD
| | | | | | | | | | | | ... | | | |
fam fam ...
// Colle is short for college
// s is short for stu (which is short for student)
There could be arbitrary number of colleges and students, which is controllable by #define MAX_NUM _num_.
As per the previous answer, I should allocate memory in the order of "outermost to innermost" and free the memory "innermost to outermost". I basically understand the logic behind the pattern. The following questions are extensions to it.
1) Does
CollegeSys -> Colle = malloc(sizeof(*(CollegeSys -> Colle)) * MAX_NUM);
CollegeSys -> Colle -> stu = malloc(sizeof(*(CollegeSys -> Colle -> stu)) * MAX_NUM);
CollegeSys -> Colle -> stu -> fam = malloc(sizeof(*(CollegeSys -> Colle -> stu -> fam)));
mean " there are MAX_NUM colleges under the college system, each of which has MAX_NUM students — each of which has one family"?
1.a)If yes, do I still need for loops to initialize every single value contained in this huge struct?
For example, the possibly correct way:
for (int i = 0; i < MAX_NUM; i++) {
strcpy(CollegeSys -> Colle[i].name, "collestr");
for (int n = 0; n < MAX_NUM; n++) {
strcpy(CollegeSys -> Colle[i].stu[n].name, "stustr");
...
}
}
the possibly incorrect way:
strcpy(CollegeSys -> Colle -> name, "collestr");
strcpy(CollegeSys -> Colle -> stu -> name, "stustr");
I tried the "possibly incorrect way". There was no syntax error, but it would only initialize CollegeSys -> Colle[0].name and ... -> stu[0].name. So, the second approach is very likely to be incorrect if I want to initialize every single attribute.
2) If I modularize this whole process, separating the process into several functions that return corresponding struct pointers — newSystem(void), newCollege(void), newStudent(void) (arguments might not necessarily be void; we might also pass a str as the name to the functions; besides, there might be a series of addStu(), etc... to assign those returned pointers to the corresponding part of CollegeSys). When I create a new CollegeSys in newSystem(), is it correct to malloc memory to every nested struct once and for all within newSystem()?
2.a) If I allocate memory to all parts of the struct in newSystem(), the possible consequence I can think of so far is there would be memory leaks. Since we've allocated memory to all parts when creating the system, we inevitably have to create a new struct pointer and allocate adequate memory to it in the other two functions too. For instance,
struct Student* newStudent(void) {
struct Student* newStu = malloc(sizeof(struct Student));
newStu -> fam = malloc(sizeof(*(newStu -> fam)));
// I'm not sure if I'm supposed to also allocate memory to fam struct
...
return newStu;
}
If so, we actually allocate the same amount of memory for an instance at least twice: once in newSystem(void), and once in newStudent(void). If I'm correct so far, this is definitely a memory leak. And the memory we allocate to the pointer newStu in newStudent(void) can never be freed (I think so). Then, what is the correct way to allocate memory to the whole structure when we separate the whole memory allocation process into several small steps?
3) Do we have to use sizeof(*(layer1 -> layer2 -> ...)) when malloc'ing memory to a struct nested in a struct? Can we directly specify the type of it? For example, doing this
CollegeSys -> Colle = malloc(sizeof(struct College) * MAX_NUM);
// instead of
// CollegeSys -> Colle = malloc(sizeof(*(CollegeSys -> Colle)) * MAX_NUM);
4) It seems even if we allocate a certain amount of memory to a pointer, we still can't prevent segfault. For example, we code
// "struct type" stands for any struct type, just to show what I mean
struct type *ptr = malloc(sizeof(struct type) * 3);
We can still access ptr[3], ptr[4] and so on, and the program will print out nonsense. Sometimes the program crashes, but sometimes it does not. So, essentially, we can't rely on malloc (or calloc and so forth) to prevent segfaults?
I'm sorry about writing such a long text. Thanks for your patience.
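As a rough sketch of the "outermost to innermost" allocation and "innermost to outermost" release described in the question: every college needs its own student array and every student its own family, so the allocations happen inside loops. The struct layouts, the value of MAX_NUM and the function names new_system and free_system are assumptions made for this example, not the code from the earlier question.

```c
#include <stdlib.h>

#define MAX_NUM 3   /* assumed value for the example */

struct Family     { int members; };
struct Student    { char name[32]; struct Family *fam; };
struct College    { char name[32]; struct Student *stu; };
struct CollegeSys { struct College *Colle; };

/* Release everything, innermost first. This also works on a partially
   built system: calloc zero-fills (NULL pointers on mainstream platforms)
   and free(NULL) does nothing. */
void free_system(struct CollegeSys *sys)
{
    if (sys == NULL)
        return;
    if (sys->Colle != NULL) {
        for (int i = 0; i < MAX_NUM; i++) {
            struct Student *stu = sys->Colle[i].stu;
            if (stu != NULL) {
                for (int n = 0; n < MAX_NUM; n++)
                    free(stu[n].fam);      /* families first */
                free(stu);                 /* then the student array */
            }
        }
        free(sys->Colle);                  /* then the college array */
    }
    free(sys);                             /* the system itself last */
}

/* Allocate outermost first: system, colleges, students, families. */
struct CollegeSys *new_system(void)
{
    struct CollegeSys *sys = calloc(1, sizeof *sys);
    if (sys == NULL)
        return NULL;
    sys->Colle = calloc(MAX_NUM, sizeof *sys->Colle);
    if (sys->Colle == NULL)
        goto fail;
    for (int i = 0; i < MAX_NUM; i++) {
        struct College *c = &sys->Colle[i];
        c->stu = calloc(MAX_NUM, sizeof *c->stu);
        if (c->stu == NULL)
            goto fail;
        for (int n = 0; n < MAX_NUM; n++) {
            c->stu[n].fam = calloc(1, sizeof *c->stu[n].fam);
            if (c->stu[n].fam == NULL)
                goto fail;
        }
    }
    return sys;
fail:
    free_system(sys);
    return NULL;
}
```

With one function that builds the whole tree and one that tears it down, each block is allocated exactly once and freed exactly once, which is the property Valgrind checks for.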

Why is the following code susceptible to heap overflow attack

I'm new to cyber security, and I am trying to understand why the following code is susceptible to a heap overflow attack...
struct data {
char name[128];
};
struct fp {
int (*fp)();
};
void printName() {
printf("Printing function...\n");
}
int main(int argc, char **argv) {
struct data *d;
struct fp *f;
d = malloc(sizeof(struct data));
f = malloc(sizeof(struct fp));
f->fp = printName;
read(stdin,d->name,256);
f->fp();
}
Is it because of the read(stdin, d->name, 256) as it is reading more than the allocated buffer size of 128 for char name in struct data?
Any help would be great
A heap overflow attack is similar to a stack-based buffer overflow attack, except instead of overwriting values in the stack, the attacker tramples data in the heap.
Notice in your code that there are two dynamically allocated values:
d = malloc(sizeof(struct data));
f = malloc(sizeof(struct fp));
So d now holds the address of a 128-byte chunk of memory in the heap, while f holds the address of an 8-byte (assuming a 64-bit machine) chunk of memory. In principle, these two addresses could be nowhere near each other, but since both requests are small and made back to back, it's likely that the allocator carved them out of the same contiguous region and gave you pointers that are close to each other.
So once you run f->fp = printName;, your heap looks something like this:
Note: Each row is 8 bytes wide
| |
+------------------------+
f -> | <Address of printName> |
+------------------------+
| ▲ |
| 11 more rows |
| not shown |
| |
d -> | <Uninitialized data> |
+------------------------+
| |
Your initial assessment of where the vulnerability comes from is correct. d points to 128 bytes of memory, but you let the user write 256 bytes to that area. C has no mechanism for bounds checking, so the compiler is perfectly happy to let you write past the edge of the d memory. If f is right next to d, you'll fall over the edge of d and into f. Now, an attacker has the ability to modify the contents of f just by writing to d.
To exploit this vulnerability, an attacker feeds the address of some code that they've written to d by repeating it for all 256 bytes of input. If the attacker has stored some malicious code at address 0xbadc0de, they feed in 0xbadc0de to stdin 32 times (256 bytes) so that the heap gets overwritten.
| 0xbadc0de |
+-------------+
f -> | 0xbadc0de |
+-------------+
| ... |
| 0xbadc0de |
| 0xbadc0de |
d -> | 0xbadc0de |
+-------------+
| |
Then, your code reaches the line
f->fp();
which is a function call using the address stored in f. The machine goes to memory location f and retrieves the value stored there, which is now the address of the attacker's malicious code. Since we're calling it as a function, the machine now jumps to that address and begins executing the code stored there, and now you've got a lovely arbitrary code execution attack vector on your hands.
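One way to close the hole is to never ask for more bytes than the buffer can hold. A minimal sketch, assuming POSIX read() and a hypothetical helper read_name that is not part of the original program:

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

struct data {
    char name[128];
};

/* Read at most sizeof d->name - 1 bytes from fd and NUL-terminate the
   result. Returns the number of bytes stored, or -1 on error. */
ssize_t read_name(int fd, struct data *d)
{
    ssize_t n = read(fd, d->name, sizeof d->name - 1);
    if (n < 0)
        return -1;
    d->name[n] = '\0';
    return n;
}
```

In the program above this would be called as read_name(STDIN_FILENO, d). Note that read() takes a file descriptor such as STDIN_FILENO, not the FILE pointer stdin. Because the limit is derived from sizeof d->name, input longer than the buffer is simply left unread instead of spilling into f.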

How can I free a struct pointer that i need

I am trying to avoid memory leaks in my code. I need to de-allocate pElement, line and pSecond, without losing the values inside pImage. Because I need to print those values inside my print function.
My add function contains struct GraphicElement *pElements;, struct GraphicElement *pSecond;, struct Point point;.
I allocate memory using malloc for each struct, add the values, and then pass the final values into pImage. All my other functions work perfectly, except that I always end up with 3 memory leaks because I did not call free(pSecond), free(pElement) and free(line).
If I try to free them before my function exits, after passing the values into pImage, my values all get erased.
How can I free those values inside my add function locally?
struct Point
{
int x, y;
};
struct Line
{
Point start;
Point end;
};
struct GraphicElement
{
enum{ SIZE = 256 };
unsigned int numLines; //number of lines
Line* pLines; //plines points to start and end
char name[SIZE];
};
typedef struct
{
unsigned int numGraphicElements;
GraphicElement* pElements; //the head points to pLines
} VectorGraphic;
void InitVectorGraphic(VectorGraphic*); //initializes pImage->pElement
void AddGraphicElement(VectorGraphic*); //Used to add
void ReportVectorGraphic(VectorGraphic*); // prints pImage contents
void CleanUpVectorGraphic(VectorGraphic*); //deallocates memory
How can I free those values inside my add function locally?
You cannot free memory and still keep what was stored in it. Once a block has been freed, it must not be accessed again, and the data stored in it is lost. If pImage still refers to that memory, it has to stay allocated until you are done with pImage.
In C, you have two options for allocating memory: on the heap or on the stack. Memory reserved on the heap can be accessed from anywhere in the program and remains allocated until it is explicitly freed. Memory reserved on the stack is only valid while you stay within the context in which it was created.
Let's say you execute the following code :
void func()
{
int x = 3; // [2]
int * p = & x; // [3]
}
int main()
{
func(); // [1]
// [4]
return 0;
}
Instruction [2] allocates some memory on the stack for x. Instruction [3] does the same for p and stores the address of the first variable in the new memory slot. After the function returns ([4]), this memory is released. Graphically, here is what happens:
Context STACK Address
+---------+
| | 0xa1
main | | 0xa0
+---------+
[1] +---------+
=====> | |
func | | 0xa2
+---------+
| | 0xa1
main | | 0xa0
+---------+
[2] +---------+
=====> | |
func | 3 | 0xa2 <-- x
+---------+
| | 0xa1
main | | 0xa0
+---------+
[3] +---------+
=====> | 0xa2 | 0xa3 <-- p
func | 3 | 0xa2 <-- x
+---------+
| | 0xa1
main | | 0xa0
+---------+
[4] +---------+
=====> | | 0xa1
main | | 0xa0
+---------+
So if I use malloc inside a function, is the memory allocated on the heap freed automatically once I exit the function?
It's the opposite. If you use a function like malloc, the memory slot is allocated on the heap. So if we change line [3] above to something like
int * p = malloc(sizeof(int)); // [3]
The memory allocated on the stack is released when you leave the function, but the memory allocated on the heap remains allocated and is still accessible until you free it. Graphically:
HEAP Address Free (y/n)
+---------+
| | 0xb4 - Yes
| | 0xb3 - Yes
+---------+
Context STACK Address
[3] +---------+ +---------+
=====> | 0xb4 | 0xa3 <-- p | | 0xb4 - No
func | 3 | 0xa2 <-- x | | 0xb3 - Yes
+---------+ +---------+
| | 0xa1
main | | 0xa0
+---------+
[4] +---------+ +---------+
=====> | | 0xa1 | | 0xb4 - No !!! Leak !!!
main | | 0xa0 | | 0xb3 - Yes
+---------+ +---------+
As you can see, after you leave the function, you have a memory leak as you don't have any pointer to the dynamically allocated memory. One way to avoid this is to return the pointer (so to pass the address of the new memory slot to the calling function) or to store it somewhere to free it later. It's also possible to allocate the memory before calling the function and to pass it to the function as a parameter. It really depends on your application.
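The first option, returning the pointer so that the caller can free it later, could be sketched like this. The function names make_int and use_and_release are made up for the example:

```c
#include <stdlib.h>

/* Allocate an int on the heap and hand ownership to the caller.
   Returns NULL if malloc fails. */
int *make_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;          /* the address survives the return; the local p does not */
}

/* The caller owns the memory and releases it when it is done with it. */
int use_and_release(void)
{
    int *p = make_int(3);
    if (p == NULL)
        return -1;
    int value = *p;    /* still valid: heap memory outlives make_int */
    free(p);           /* freed exactly once, so there is no leak */
    return value;
}
```

The point is that free is called only after the last use of the data; freeing inside make_int would leave the caller with a pointer it must not use.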

Seg fault when using structure pointers to access struct members in C

What is wrong with my program? I get a seg fault when I try to print the values.
My aim is to assign some values in sample_function,
and in the main function I want to copy the structure to another structure.
#include<stdio.h>
#include<string.h>
typedef struct
{
char *name;
char *class;
char *rollno;
} test;
test *
sample_function ()
{
test *abc;
abc = (test *)malloc(sizeof(test));
strcpy(abc->name,"Microsoft");
abc->class = "MD5";
abc->rollno = "12345";
printf("%s %s %s\n",abc->name,abc->class,abc->rollno);
return abc;
}
int main(){
test *digest_abc = NULL;
test *abc = NULL;
abc = sample_function();
digest_abc = abc;
printf(" %s %s %s \n",digest_abc->name,digest_abc->class,digest_abc->rollno);
return 1;
}
Pointers have always been a nightmare for me; I never understood them.
test * sample_function ()
{
test *abc;
strcpy(abc->name,"Surya");
What do you think abc points to, here? The answer is, it doesn't really point to anything. You need to initialize it to something, which in this case means allocating some memory.
So, let's fix that first issue:
test * sample_function ()
{
test *abc = malloc(sizeof(*abc));
strcpy(abc->name,"Surya");
Now, abc points to something, and we can store stuff in there!
But ... abc->name is a pointer too, and what do you think that points to? Again, it doesn't really point to anything, and you certainly can't assume it points somewhere you can store your string.
So, let's fix your second issue:
test * sample_function ()
{
test *abc = malloc(sizeof(*abc));
abc->name = strdup("Surya");
/* ... the rest is ok ... */
return abc;
}
Now, there's one last issue: you never release the memory you just allocated (this probably isn't an issue here, but it'd be a bug in a full-sized program).
So, at the end of main, you should have something like
free(abc->name);
free(abc);
return 1;
}
The final issue is a design one: you have three pointers in your structure, and only convention to help you remember which is dynamically allocated (and must be freed) and which point to string literals (and must not be freed).
That's fine, so long as this convention is followed everywhere. As soon as you dynamically allocate class or rollno, you have a memory leak. As soon as you point name at a string literal, you'll have a crash and/or heap damage.
As japreiss points out in a comment, a good way to enforce your convention is to write dedicated functions, like:
void initialize_test(test *obj, const char *name, char *class, char *rollno) {
obj->name = strdup(name);
...
}
void destroy_test(test *obj) {
free(obj->name);
}
test *malloc_test(const char *name, ...) {
test *obj = malloc(sizeof(*obj));
initialize_test(obj, name, ...);
return obj;
}
void free_test(test *obj) {
destroy_test(obj);
free(obj);
}
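Filled in completely, such a set of functions might look like the sketch below. It assumes that all three strings are copied and owned by the structure, which is one possible convention rather than the one the answer prescribes. dup_string is a hypothetical stand-in for strdup so that the example only needs standard C, and the int return value of initialize_test is also an addition for error reporting.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *name;
    char *class;    /* a valid identifier in C, though not in C++ */
    char *rollno;
} test;

/* Hypothetical helper: heap copy of s, or NULL if malloc fails. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Release the strings owned by obj (free(NULL) is harmless). */
void destroy_test(test *obj)
{
    free(obj->name);
    free(obj->class);
    free(obj->rollno);
}

/* Copy all three strings into obj. Returns 0 on success, -1 on failure. */
int initialize_test(test *obj, const char *name, const char *class,
                    const char *rollno)
{
    obj->name = dup_string(name);
    obj->class = dup_string(class);
    obj->rollno = dup_string(rollno);
    if (obj->name == NULL || obj->class == NULL || obj->rollno == NULL) {
        destroy_test(obj);
        return -1;
    }
    return 0;
}

/* Allocate and initialize in one step; returns NULL on failure. */
test *malloc_test(const char *name, const char *class, const char *rollno)
{
    test *obj = malloc(sizeof *obj);
    if (obj != NULL && initialize_test(obj, name, class, rollno) != 0) {
        free(obj);
        obj = NULL;
    }
    return obj;
}

/* Counterpart of malloc_test: releases the strings, then the object. */
void free_test(test *obj)
{
    if (obj != NULL) {
        destroy_test(obj);
        free(obj);
    }
}
```

Because every member is always a heap copy, destroy_test can free all three unconditionally and there is no need to remember which pointer came from a string literal.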
In your function sample_function you return the value of the pointer abc. That is fine as long as abc holds an address obtained from malloc. What you cannot do in C is return the address of a local variable, because of the way activation records are organized.
An activation record is a data structure that contains all the relevant information for a function call: parameters, return address, local variables, and so on.
When you call a function, a new activation record gets pushed onto the stack. It could look something like this:
// Record for some function f(a, b)
| local variable 1 | <- stack pointer (abc in your case)
| local variable 2 |
| old stack pointer | <- base pointer
| return address |
| parameter 1 |
| parameter 2 |
---------------------
| caller activation |
| record |
When you return from a function, this same activation record gets popped off the stack. But what happens if you returned the address of a variable that was in the old record?
// popped record
| local variable 1 | <- address of abc #
| local variable 2 | #
| old stack pointer | # Unallocated memory, any new function
| return address | # call could overwrite this
| parameter 1 | #
| parameter 2 | #
--------------------- <- stack pointer
| caller activation |
| record |
If you then use such an address, the behavior is undefined: the memory it refers to no longer belongs to any live variable, so the program may crash or silently read whatever a later function call has written there.
You also have problems with allocation, but other answers have already covered that.
In sample_function you declare abc as a pointer to a test structure, but you never initialize it. It's just pointing off into the weeds somewhere. Then you try to dereference it to store values - BOOM.
Your program doesn't need any pointers at all; structures can be passed by value in C.
If you do want to keep similar interfaces to what you have now, you're going to have to add some dynamic allocations (malloc/free calls) to make sure your structures are actually allocated and that your pointers actually point to them.
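A pointer-free variant of sample_function, returning the structure by value, might look like the following sketch. It assumes the members only ever point at string literals, which is why they are declared const char * here; that is a change from the original struct.

```c
typedef struct {
    const char *name;
    const char *class;
    const char *rollno;
} test;

/* Build the structure in a local variable and return a copy of it.
   String literals have static storage duration, so the pointers inside
   the copy stay valid after the function returns. */
test sample_function(void)
{
    test abc;
    abc.name = "Microsoft";
    abc.class = "MD5";
    abc.rollno = "12345";
    return abc;
}
```

In main, test abc = sample_function(); followed by test digest_abc = abc; copies the whole structure with plain assignment, and there is nothing to malloc or free.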

Resources