In-order traversal deviating to other locations in memory - c

I'm trying to do an in-order traversal on a BST. In the first call of inOrder() everything works as expected: *node points to the root and in the debugger I can see that the whole tree is represented correctly (i.e., the root's descendants are correctly represented).
However, in the next call on the left child of the root (i.e., *node now represents the root's left child), the tree is not represented correctly anymore. The only things that are correct are the value of *node and the right child being NULL. The left child is not the node with the value 3 that was appended before, but it has some weird values - it appears to point at a random memory location.
(If I run further, Xcode terminates saying: EXC_BAD_ACCESS...)
Can you explain why this is so?
#include <stdio.h>
typedef struct Node Node;
struct Node
{
int value;
Node *left;
Node *right;
};
void inOrder(Node *node)
{
if(node!=NULL)
{
inOrder(node->left);
printf("%d,", node->value);
inOrder(node->right);
}
}
void append(Node *root)
{
Node n = {3,NULL};
root->left->left = &n;
}
int main(int argc, const char * argv[])
{
Node a = {10, NULL};
Node b = {5, NULL};
Node root = {8, &b, &a};
// appends a node with value 3 to the node 5 (just a test)
append(&root);
inOrder(&root);
puts("\n");
}

Inside append() function,
void append(Node *root)
{
Node n = {3,NULL};
root->left->left = &n;
}
you're storing the address of the local variable n in the tree. Once append() returns, n no longer exists and its address is invalid. Using that address invokes undefined behaviour.
Solution: Make n a pointer to Node, allocate the node dynamically using malloc() and store that pointer in the tree. Dynamically allocated memory remains valid until it is deallocated.

root->left->left = &n;
The variable n is local to the function append(), and once append() returns, n is no longer valid.
Accessing the variable after its lifetime has ended leads to undefined behavior.
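The fix both answers describe could look roughly like the sketch below. The name appendLeft and its return convention are assumptions made for illustration, not part of the original code; the point is only that the new node comes from malloc() and therefore outlives the function call.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node Node;
struct Node
{
    int value;
    Node *left;
    Node *right;
};

/* Hypothetical replacement for append(): the new node lives on the heap,
   so it stays valid after this function returns.
   Returns the new node, or NULL if the allocation fails. */
Node *appendLeft(Node *parent, int value)
{
    assert(parent != NULL);      /* the caller must pass an existing node */

    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;             /* allocation failed, tree is unchanged */

    n->value = value;
    n->left = NULL;
    n->right = NULL;
    parent->left = n;            /* overwrites any existing left child */
    return n;                    /* the caller must eventually free() it */
}
```

With the tree from the question, calling appendLeft(root->left, 3) inside append() would attach the node with value 3 below the node 5, and the node should be passed to free() once the traversal is finished.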

Related

Why does my Recursion using Pointer in C crash?

I'm currently studying Computer Science and we started working with Pointers. I had the feeling that I started to understand pointers but I ran into a problem and can't figure out what went wrong.
We defined a Tree like this:
typedef struct node *tree;
struct node {int key; tree left, right;};
Now we should write a function that creates nodes with three parameters, the key of the node, the left node and the right node that should be "below" the node. I did it like this and it seemed to work:
tree createNode(int n, tree l, tree r){
tree node = (tree) malloc(sizeof(tree));
node->key = n;
node->left = l;
node->right = r;
return node;
}
Finally we should write a function that multiplies all leaves of the tree, and I thought the easiest way would be to start at the root, search for the leaves through recursion and then multiply them. But when I call the function, the program seems to crash in the middle of the function. My function looks like this:
int leafprod(tree t){
printf("%d\n", t->key);
if (t->left == NULL){
if (t->right == NULL){
printf("$1\n\n");
return t->key;
}
printf("$2\n\n");
return leafprod(t->right);
}
if (t->right == NULL){
printf("$3\n\n");
return leafprod(t->left);
}
printf("$4\n\n");
return leafprod(t->left) * leafprod(t->right);
}
and I call the function in the main function like this:
int main(){
tree a = createNode(1, NULL, NULL);
tree b = createNode(2, NULL, NULL);
tree c = createNode(3, a, NULL);
tree d = createNode(4, b, c);
int n = leafprod(d);
printf("end: %d", n);
free(a);
free(b);
free(c);
free(d);
return 0;
}
I used the print statements to follow the program and try to locate the error, but in most cases it prints nothing. Then sometimes it prints:
4
$4
2
$2
And only two times the program went through the whole code. I believe maybe I am using the malloc function wrong but I cannot tell.
The problem is in this line:
tree node = (tree) malloc(sizeof(tree));
tree is typedef for a pointer to struct node.
Therefore sizeof(tree) is just the size of the pointer.
You should use one of the following instead:
tree node = malloc(sizeof(*l));
Or:
tree node = malloc(sizeof(*r));
*l and *r are of type struct node (not a pointer) and this is the element you are trying to create.
Another option, as @IanAbbott commented, is:
tree node = malloc(sizeof(*node));
Note that node here is the name of the variable, not the type (which in C must be prefixed with struct, i.e. struct node).
The advantage of this approach is that the statement is not dependent on other variables.
Side notes:
You shouldn't cast the result of malloc. See: Do I cast the result of malloc?
It's not good practice to hide pointer types behind typedefs. Consider avoiding it (you can use typedef struct node Node if you want to avoid writing the struct keyword everywhere).
Don't use the typedef for hiding pointer types.
With typedef struct node *tree; you have no idea what tree is in your code hence the confusion.
Typically here tree node = (tree) malloc(sizeof(tree)) you did it wrong, and the main reason was probably because you didn't know anymore what tree actually was.
...
struct node
{
int key;
struct node* left;
struct node *right;
};
struct node* createNode(int n, struct node*l, struct node*r) {
struct node* node = malloc(sizeof(*node)); // don't use the cast, it's useless
...
}
int leafprod(struct node* t) {
...
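Putting the answers together, a complete version might look like the following sketch. The freeTree helper and the simplified shape of leafprod are assumptions added for illustration; only the sizeof *node allocation is taken from the answers above.

```c
#include <assert.h>
#include <stdlib.h>

struct node
{
    int key;
    struct node *left;
    struct node *right;
};

/* Allocates room for a whole struct node, not just for a pointer. */
struct node *createNode(int n, struct node *l, struct node *r)
{
    struct node *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;             /* out of memory */
    node->key = n;
    node->left = l;
    node->right = r;
    return node;
}

/* Product of the keys of all leaves. Like the original function,
   this expects a non-empty tree. */
int leafprod(const struct node *t)
{
    assert(t != NULL);
    if (t->left == NULL && t->right == NULL)
        return t->key;                       /* t is a leaf */
    if (t->left == NULL)
        return leafprod(t->right);           /* only a right subtree */
    if (t->right == NULL)
        return leafprod(t->left);            /* only a left subtree */
    return leafprod(t->left) * leafprod(t->right);
}

/* Hypothetical helper: frees every node reachable from t. */
void freeTree(struct node *t)
{
    if (t == NULL)
        return;
    freeTree(t->left);
    freeTree(t->right);
    free(t);
}
```

For the tree built in the question's main(), the leaves are the nodes with keys 2 and 1, so leafprod(d) evaluates to 2, and a single freeTree(d) replaces the four separate free() calls.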

Malloc function in dynamic lists

I'm getting started with dynamic lists and I don't understand why it is necessary to use the malloc function even when declaring the first node in main(). The piece of code below should just print the data contained in the first node, but if I don't initialize the node with malloc it just doesn't work:
struct node{
int data;
struct node* next;
};
void insert(int val, struct node*);
int main() {
struct node* head ;
head->data = 2;
printf("%d \n", head->data);
}
You don’t technically, but maintaining all nodes with the same memory pattern is only an advantage to you, with no real disadvantages.
Just assume that all nodes are stored in the dynamic memory.
Your “insert” procedure would be better named something like “add” or (for full functional context) “cons”, and it should return the new node:
struct node* cons(int val, struct node* next)
{
struct node* this = (struct node*)malloc( sizeof(struct node) ); // sizeof needs parentheses around a type name
if (!this) return next; // or some other error condition!
this->data = val;
this->next = next;
return this;
}
Building lists is now very easy:
int main()
{
struct node* xs = cons( 2, cons( 3, cons( 5, cons( 7, NULL ) ) ) );
// You now have a list of the first four prime numbers.
And it is easy to handle them.
// Let’s print them!
{
struct node* p = xs;
while (p)
{
printf( "%d ", p->data );
p = p->next;
}
printf( "\n" );
}
// Let’s get the length!
int length = 0;
{
struct node* p = xs;
while (p)
{
length += 1;
p = p->next;
}
}
printf( "xs is %d elements long.\n", length );
By the way, you should try to be as consistent as possible when naming things. You have named the node data “data” but the constructor’s argument calls it “val”. You should pick one and stick to it.
Also, it is common to:
typedef struct node node;
Now in every place except inside the definition of struct node you can just use the word node.
Oh, and I almost forgot: Don’t forget to clean up with a proper destructor.
node* destroy( node* root )
{
if (!root) return NULL;
destroy( root->next );
free( root );
return NULL;
}
And an addendum to main():
int main()
{
node* xs = ...
...
xs = destroy( xs );
}
When you declare a variable, you define the type of the variable, then its
name and optionally its initial value.
Every type needs a specific amount of memory. For example, an int is typically
32 bits long on both 32-bit and 64-bit platforms, while a pointer is typically
32 bits long on the former and 64 bits long on the latter.
A variable declared in a function is usually stored in the stack associated
with the function. When the function returns, the stack for that function is
no longer available and the variable no longer exists.
When you need the value/object of the variable to exist even after a function
returns, then you need to allocate memory on a different part of the program,
usually the heap. That's exactly what malloc, realloc and calloc do.
Doing
struct node* head ;
head->data = 2;
is just wrong. You're declaring a pointer named head of type struct node *,
but you are not assigning anything to it. So it points to an unspecified
location in memory. head->data = 2 tries to store a value at an unspecified
location and the program will most likely crash with a segfault.
In main you could do this:
int main(void)
{
struct node head;
head.data = 2;
printf("%d \n", head.data);
return 0;
}
head will be saved in the stack and will persist as long as main doesn't
return. But this is only a very small example. In a complex program where you
have many more variables, objects, etc. it's a bad idea to simply declare all
variables you need in main. So it's best that objects get created when they
are needed.
For example you could have a function that creates the object and another one
that calls create_node and uses that object.
struct node *create_node(int data)
{
struct node *head = malloc(sizeof *head);
if(head == NULL)
return NULL; // no more memory left
head->data = data;
head->next = NULL;
return head;
}
struct node *foo(void)
{
struct node *head = create_node(112);
// do something with head
return head;
}
Here create_node uses malloc to allocate memory for one struct node
object, initializes the object with some values and returns a pointer to that memory location.
foo calls create_node and does something with it and it returns the
object. If another function calls foo, this function will get the object.
There are also other reasons for malloc. Consider this code:
void foo(void)
{
int numbers[4] = { 1, 3, 5, 7 };
...
}
In this case you know that you will need 4 integers. But sometimes you need an
array where the number of elements is only known during runtime, for example
because it depends on some user input. For this you can also use malloc.
void foo(int size)
{
int *numbers = malloc(size * sizeof *numbers);
// now you have "size" elements
...
free(numbers); // freeing memory
}
When you use malloc, realloc or calloc, you'll need to free the memory. If
your program does not need the memory anymore, you have to use free (like in
the last example). Note that for simplicity I omitted the use of free in the
examples with struct node.
What you have invokes undefined behavior because you don't really have a node; you have a pointer to a node that doesn't actually point to a node. Using malloc and friends creates a memory region where an actual node object can reside, and where a node pointer can point to.
In your code, struct node* head is a pointer that points to nowhere, and dereferencing it as you have done is undefined behavior (which can commonly cause a segfault). You must point head to a valid struct node before you can safely dereference it. One way is like this:
int main() {
struct node* head;
struct node myNode;
head = &myNode; // assigning the address of myNode to head, now head points somewhere
head->data = 2; // this is legal
printf("%d \n", head->data); // will print 2
}
But in the above example, myNode is a local variable, and will go out of scope as soon as the function exits (in this case main). As you say in your question, for linked lists you generally want to malloc the data so it can be used outside of the current scope.
int main() {
struct node* head = malloc(sizeof(struct node)); // sizeof needs parentheses around a type name
if (head != NULL)
{
// we received a valid memory block, so we can safely dereference
// you should ALWAYS initialize/assign memory when you allocate it.
// malloc does not do this, but calloc does (initializes it to 0) if you want to use that
// you can use malloc and memset together.. in this case there's just
// two fields, so we can initialize via assignment.
head->data = 2;
head->next = NULL;
printf("%d \n", head->data);
// clean up memory when we're done using it
free(head);
}
else
{
// we were unable to obtain memory
fprintf(stderr, "Unable to allocate memory!\n");
}
return 0;
}
This is a very simple example. Normally for a linked list, you'll have insert function(s) (where the mallocing generally takes place) and remove function(s) (where the freeing generally takes place). You'll at least have a head pointer that always points to the first item in the list, and for a double-linked list you'll want a tail pointer as well. There can also be print functions, deleteEntireList functions, etc. But one way or another, you must allocate space for an actual object. malloc is a way to do that so the memory stays valid for as long as your program needs it.
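As a rough illustration of such insert and remove functions, here is a minimal sketch for a singly linked list. The names pushFront and popFront and their return codes are assumptions chosen for this example; the allocation happens in the insert and the free in the remove.

```c
#include <assert.h>
#include <stdlib.h>

struct node
{
    int data;
    struct node *next;
};

/* Hypothetical insert: allocates a node and makes it the new head.
   Returns 0 on success, -1 if malloc fails (the list is left unchanged). */
int pushFront(struct node **head, int data)
{
    assert(head != NULL);        /* we need the address of the head pointer */

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = data;
    n->next = *head;             /* old first node becomes the second */
    *head = n;
    return 0;
}

/* Hypothetical remove: unlinks and frees the first node, optionally
   storing its data in *out. Returns 0 on success, -1 on an empty list. */
int popFront(struct node **head, int *out)
{
    assert(head != NULL);

    struct node *n = *head;
    if (n == NULL)
        return -1;               /* nothing to remove */
    if (out != NULL)
        *out = n->data;
    *head = n->next;
    free(n);
    return 0;
}
```

Both functions take a struct node ** so that they can update the caller's head pointer; starting from struct node *head = NULL, every node in the list is then guaranteed to come from malloc.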
edit:
Incorrect. This absolutely applies to int and int*; it applies to any object and pointer(s) to it. If you were to have the following:
int main() {
int* head;
*head = 2; // head uninitialized and unassigned, this is UB
printf("%d\n", *head); // UB again
return 0;
}
this is every bit as much undefined behavior as what you have in your question. A pointer must point to something valid before you can dereference it. In the above code, head is uninitialized, it doesn't point to anything deterministically, and as soon as you do *head (whether to read or write), you're invoking undefined behavior. Just as with your struct node, you must do something like the following to be correct:
int main() {
int myInt; // creates space for an actual int in automatic storage (most likely the stack)
int* head = &myInt; // now head points to a valid memory location, namely myInt
*head = 2; // now myInt == 2
printf("%d\n", *head); // prints 2
return 0;
}
or you can do
int main() {
int* head = malloc(sizeof(int)); // silly to malloc a single int, but this is for illustration purposes
if (head != NULL)
{
// space for an int was returned to us from the heap
*head = 2; // now the unnamed int that head points to is 2
printf("%d\n", *head); // prints out 2
// don't forget to clean up
free(head);
}
else
{
// handle error, print error message, etc
}
return 0;
}
These rules are true for any primitive type or data structure you're dealing with. Pointers must point to something, otherwise dereferencing them is undefined behavior, and you hope you get a segfault when that happens so you can track down the errors before your TA grades it or before the customer demo. Murphy's law dictates UB will always crash your code when it's being presented.
Statement struct node* head; defines a pointer to a node object, but not the node object itself. As you do not initialize the pointer (i.e. by letting it point to a node object created by, for example, a malloc-statement), dereferencing this pointer as you do with head->data yields undefined behaviour.
There are two ways to overcome this: (1) allocate memory dynamically, yielding an object with dynamic storage duration, or (2) define the object itself, for example as a local variable with automatic storage duration:
(1) dynamic storage duration
int main() {
struct node* head = calloc(1, sizeof(struct node));
if (head) {
head->data = 2;
printf("%d \n", head->data);
free(head);
}
}
(2) automatic storage duration
int main() {
struct node head;
head.data = 2;
printf("%d \n", head.data);
}

Binary Search Tree insertion not working when inserting nodes using while loop

I have posted the link to my BST code on ideone: http://ideone.com/P7850n
In the main function I am getting an error when I read values in the while loop and insert into BST, but it works fine if I use a for loop. What could be the possible explanation for this error which occurs only with the while loop ?
#include <stdio.h>
#include <stdlib.h>
//data struct for BST node
typedef struct BST
{
int data;
struct BST *left;
struct BST *right;
}node;
//make node from given data
node* makeNode(int data)
{
node *n=(node*)malloc(sizeof(node));
n->data=data;
n->left=NULL;
n->right=NULL;
return n;
}
//insert node in BST
node* insert(node* root,int key)
{
if(root==NULL)
return makeNode(key);
if(key < root->data)
root->left=insert(root->left,key);
else
root->right=insert(root->right,key);
return root;
}
//inorder printing prints in sorted order
void inorder(node* root)
{
if(root==NULL)
return;
inorder(root->left);
printf("%d ",root->data);
inorder(root->right);
}
//driver function
int main(void) {
// your code goes here
node *root;
int s,i,key;
scanf("%d",&s);
while(s--)
//for(i=0;i<s;i++)
{
scanf("%d",&key);
root=insert(root,key);
}
inorder(root);
return 0;
}
Most probably this is an uninitialized variable root.
The compiler re-uses the same memory for variables, either declared in your program or used internally, once they are no longer needed, so that other variables later occupy the same memory. In C (unlike, say, Perl), when memory is assigned to a variable, it is not automatically cleared: you should do it yourself, which is called initialization. Typically, as soon as you declare a variable, you should assign it some value: int year = 2014;. If you use a variable before you assign it a value, its value will be whatever happens to be in the memory that it occupies, left over from earlier use of that memory.
In your case, when you initialize the for loop with i=0, this 0 probably uses the memory later used for root, so accidentally it works. When you initialize the while loop with non-zero s, root uses memory that happens to be non-zero.
The solution is to initialize root = NULL;, and in general it's a good habit to always initialize all variables.
Without node *root = NULL; you are trying to access an undefined memory address, as root will contain arbitrary data. So you can get valid behavior or any other behavior, including a crash.
As root is not initialized, the check if(root==NULL) in the insert() function may or may not be true, and hence you will get different behavior.
This has nothing to do with for or while loop.
It is always a good idea to initialize every pointer variable to NULL and every other variable to 0 when writing code; otherwise you risk unpredictable crashes or results.
In this case, change:
node *root;
int s,i,key;
to
node *root = NULL;
int s =0;
int i = 0;
int key= 0;
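To make the effect of the initialization concrete, here is a minimal sketch that builds the tree from an array instead of from scanf input. The helpers buildTree, inorderToArray and freeTree are assumptions introduced for this example; makeNode and insert follow the question's code, with the malloc result checked.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct BST
{
    int data;
    struct BST *left;
    struct BST *right;
} node;

node *makeNode(int data)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;             /* out of memory: the key is dropped */
    n->data = data;
    n->left = NULL;
    n->right = NULL;
    return n;
}

node *insert(node *root, int key)
{
    if (root == NULL)            /* only meaningful if root was initialized */
        return makeNode(key);
    if (key < root->data)
        root->left = insert(root->left, key);
    else
        root->right = insert(root->right, key);
    return root;
}

/* Hypothetical helper: inserts count keys into an initially empty tree. */
node *buildTree(const int *keys, size_t count)
{
    assert(keys != NULL || count == 0);

    node *root = NULL;           /* the empty tree; insert() relies on this */
    for (size_t i = 0; i < count; i++)
        root = insert(root, keys[i]);
    return root;
}

/* Hypothetical helper: writes the keys in sorted order into out,
   starting at index pos, and returns the next free index. */
size_t inorderToArray(const node *root, int *out, size_t pos)
{
    if (root == NULL)
        return pos;
    pos = inorderToArray(root->left, out, pos);
    out[pos++] = root->data;
    return inorderToArray(root->right, out, pos);
}

/* Hypothetical helper: frees every node of the tree. */
void freeTree(node *root)
{
    if (root == NULL)
        return;
    freeTree(root->left);
    freeTree(root->right);
    free(root);
}
```

Because root starts as NULL, the first call to insert() takes the root==NULL branch regardless of which kind of loop drives the insertion.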

Call by value vs call by reference while dealing with binary tree

When we want to change the value of an ordinary variable in a function, we pass it using call by reference. But I am not able to understand the intricacies when we have to pass a pointer variable (like a node of a binary tree) using call by reference. I understand that if we want to modify the pointer variable to point to another node we have to use call by reference. But what if we have to modify the data element of the root? I thought that to change it we would also need call by reference. But the following code snippet gives an output of 10, 10, 10 even though I have passed the root node of the tree using call by value to the function modifyTree. Am I missing something here?
#include<stdio.h>
#include<stdlib.h>
struct node
{
int data;
struct node* left;
struct node* right;
};
/* Helper function that allocates a new node with the
given data and NULL left and right pointers. */
struct node* newNode(int data)
{
struct node* node = (struct node*)malloc(sizeof(struct node));
node->data = data;
node->left = NULL;
node->right = NULL;
return(node);
}
/* This function sets the data fields of some of the nodes of tree to 10*/
void modifyTree(struct node* node)
{
node->data = 10;
node->left->data = 10;
node->right->data = 10;
}
int main()
{
struct node *root = newNode(1);
root->left = newNode(2);
root->right = newNode(3);
root->left->left = newNode(4);
root->left->right = newNode(5);
modifyTree(root);
printf("%d\n", root->data);
printf("%d\n", root->left->data);
printf("%d\n", root->right->data);
getchar();
return 0;
}
Passing a pointer by value means the called function receives the exact same pointer value that the caller used, so any accesses through that pointer will refer to the same memory.
You would need a double pointer if you wanted the function to modify the pointer value the caller has (for instance by allocating a new tree, thus "creating" a new pointer value).
You are passing the pointer by value, but the pointer still points at the same thing. I'll use some hypothetical values to demonstrate.
In main you allocate a new struct node. Let's say it gets created at memory location 0x12345. So now your struct node *root contains 0x12345.
You now call modifyTree(root);. root gets passed by value to the node parameter of modifyTree.
That node parameter now contains 0x12345. It's pointing at the same memory location.
So when you access that location with node->data = 10, you are accessing the same memory you created in main.
You pass the pointer by value yes, but what you are changing inside the modifyTree function are elements of the struct that the pointer is pointing to. Passing the pointer to the struct by value will not prevent you from changing the internal contents of the struct being pointed at by your parameter. If it were the pointer itself that you were changing, then you would see the behaviour that you are expecting.
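The difference the answers describe can be shown with three small functions. This is only a sketch; the function names setData, retargetCopy and retarget are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node
{
    int data;
    struct node *left;
    struct node *right;
};

/* Receives a copy of the pointer. The copy refers to the same node,
   so the write is visible to the caller. */
void setData(struct node *node, int value)
{
    assert(node != NULL);
    node->data = value;
}

/* Changes only the local copy of the pointer; the caller's pointer
   still points at the original node afterwards. */
void retargetCopy(struct node *node, struct node *other)
{
    node = other;
    (void)node;                  /* silence "set but unused" warnings */
}

/* Receives the address of the caller's pointer, so it can change
   where that pointer points. */
void retarget(struct node **node, struct node *other)
{
    assert(node != NULL);
    *node = other;
}
```

Given struct node *root pointing at some node a, setData(root, 10) changes a.data, retargetCopy(root, &b) leaves root pointing at a, and retarget(&root, &b) makes root point at b.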

Segmentation Fault - Displaying Tree

I get a segfault when calling viewTree(root);
struct treeElement {
unsigned long weight;
unsigned short id;
char chr;
struct treeElement *lchild, *rchild, *parent;
};
typedef struct treeElement node;
node *root;
//INITIALIZE TREE
void initTree() {
root = malloc(sizeof(node));
currentNYT = root;
} //initTree
//VIEW TREE
void viewTree(node *tree) {
printf("%5d%5d%5d%5d%5c%lu", tree->id, tree->parent->id, tree->lchild->id, tree->rchild->id, tree->chr, tree->weight);
viewTree(tree->lchild);
viewTree(tree->rchild);
}
//ADD NODE
void addNode(char newNodeChr) {
node *newNYT, *newExternal;
newNYT = malloc(sizeof(node));
newNYT->id=maxNodes-idCount; idCount++;
newNYT->chr='\0';
newNYT->weight=0;
newNYT->parent=currentNYT;
newNYT->lchild=newNYT->rchild=NULL;
newExternal = malloc(sizeof(node));
newExternal->id=maxNodes-idCount;
newExternal->chr=newNodeChr;
newExternal->weight=1;
newExternal->parent=currentNYT;
newExternal->lchild=newExternal->rchild=NULL;
currentNYT->lchild = newNYT;
currentNYT->rchild = newExternal;
currentNYT=newNYT;
} //addNode
int main()
{
initTree();
addNode('a');
addNode('b');
viewTree(root);
getchar();
return 0;
}
Does the root node have a parent? Do the child leaf nodes have left and right children?
I think most of your problem lies in your printf statement - you don't check whether or not any of the objects you're accessing actually exist before you try to print their ids. Add some if statements in there and see if it helps.
In your viewTree(node *tree) you are not checking if tree is null or not. Definite recipe for segfault when you try to access tree->id when tree is null.
null will be passed for a subtree in a recursive call eventually.
EDIT: In general you have to check for null every time you need to access a member of an object. So tree != null must hold before reading tree->id, and tree->lchild != null must hold before reading tree->lchild->id.
Don't just allocate the root node, but initialize it, especially the pointers to the children and to the parent (set them to NULL). You are using the uninitialized pointers when adding nodes.
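Combining these suggestions, a guarded version could look like the sketch below. The helpers newNode, setChildren, idOf and freeTree are assumptions introduced here, and printing -1 for a missing relative and '-' for an empty character is an arbitrary choice for the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct treeElement {
    unsigned long weight;
    unsigned short id;
    char chr;
    struct treeElement *lchild, *rchild, *parent;
};
typedef struct treeElement node;

/* Hypothetical helper: allocates a node with every pointer set to NULL. */
node *newNode(unsigned short id, char chr)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->weight = 0;
    n->id = id;
    n->chr = chr;
    n->lchild = n->rchild = n->parent = NULL;
    return n;
}

/* Hypothetical helper: links l and r below parent. */
void setChildren(node *parent, node *l, node *r)
{
    assert(parent != NULL);
    parent->lchild = l;
    parent->rchild = r;
    if (l != NULL) l->parent = parent;
    if (r != NULL) r->parent = parent;
}

/* Hypothetical helper: id of n, or -1 when n is NULL. */
int idOf(const node *n)
{
    return n != NULL ? (int)n->id : -1;
}

void viewTree(const node *tree)
{
    if (tree == NULL)            /* stops the recursion below a leaf */
        return;
    printf("%5d%5d%5d%5d%5c %lu\n",
           idOf(tree), idOf(tree->parent),
           idOf(tree->lchild), idOf(tree->rchild),
           tree->chr != '\0' ? tree->chr : '-',
           tree->weight);
    viewTree(tree->lchild);
    viewTree(tree->rchild);
}

/* Hypothetical helper: frees every node of the tree. */
void freeTree(node *tree)
{
    if (tree == NULL)
        return;
    freeTree(tree->lchild);
    freeTree(tree->rchild);
    free(tree);
}
```

Every pointer is either NULL or points at a real node, so neither the root's missing parent nor a leaf's missing children are ever dereferenced.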

Resources