Is binary tree search lying to me? - c

Hey I'm trying to write a program that will take a list of strings (these are in order):
polymorphism
object
templates
structure
class
pointer
reference
traversal
inheritance
exceptions
recursive
overloading
And then store these strings in a binary tree and finally do an in-order traversal.
However, I'm having a problem that I just can't figure out. My function to add nodes keeps telling me that I've already added the node, but it never actually gets added. My output looks like this:
ADDED NODE: polymorphism
ERROR: Same Data: object, object
ERROR: Same Data: templates, templates
ERROR: Same Data: structure, structure
ERROR: Same Data: class, class
ERROR: Same Data: pointer, pointer
(etc...)
ERROR: overloading, overloading
ERROR: overloading, overloading
FINISHED BUILDING
overloading
Finally, here's the source code:
#include <stdlib.h>
#include <stdio.h>
struct tree {
char* data;
struct tree *left;
struct tree *right;
};
void buildTree(struct tree**);
void printAlpha(struct tree*);
void insert(struct tree **root, char *n);
int main(int argc, char* argv[]) {
struct tree* myTree = NULL;
buildTree(&myTree);
printf("FINISHED BUILDING\n\n");
printAlpha(myTree);
system("PAUSE");
return 0;
}
/*Builds tree from text file*/
void buildTree(struct tree **root) {
FILE* fIn = fopen("./in.txt", "r");
char* input = (char*) malloc(sizeof(char));
if(!fIn) {
printf("ERROR: Cannot find file\n");
return;
}
while(!feof(fIn) && fscanf(fIn, "%s", input)) {
// printf("INP:%s\n", input);
insert(root, input);
}
}
void insert(struct tree **root, char *n) {
if (*root == NULL) {
// found the spot, create and insert the node
struct tree *newNode = NULL;
newNode = (struct tree*) malloc(sizeof(struct tree) );
newNode->data = n;
newNode->left = NULL;
newNode->right = NULL;
*root = newNode;
printf("ADDED NODE: %s\n", newNode->data);
}
else if(strcmp(n, (*root)->data) < 0)
insert(&((*root)->left), n);
else if(strcmp(n, (*root)->data) > 0)
insert(&((*root)->right), n);
else
printf("ERROR: Same data: %s, %s\n", (*root)->data, n);
}
/*In order traversal*/
void printAlpha(struct tree *root) {
struct tree *curNode = root;
/*If empty something went wrong*/
if(!curNode) {
printf("Error: Binary Tree Is Empty!\n");
// return;
}
if(curNode->left != NULL) {
printAlpha(root->left);
}
printf("%s\n", curNode->data);
if(curNode->right != NULL) {
printAlpha(curNode->right);
}
}

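You are creating a single string (char* input = (char*) malloc(sizeof(char));) and overwriting its contents each time. You insert this single string into the tree, then the next time compare it against itself. Note also that malloc(sizeof(char)) reserves only one byte, so fscanf writes past the end of the buffer for every word.
Solution: Move the malloc inside the loop, and allocate enough room for the longest word plus the terminating null character.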
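A minimal sketch of that idea, assuming words are at most 63 characters long (the limit and the helper name copyWord are illustrative, not from the question). Instead of one malloc per read inside the loop, this variant reads into a fixed local buffer and lets insert() make a private heap copy, which has the same effect: later reads can no longer overwrite data already stored in the tree.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct tree {
    char *data;
    struct tree *left;
    struct tree *right;
};

/* Hypothetical helper: returns a private heap copy of s, or NULL on failure. */
static char *copyWord(const char *s) {
    char *copy = malloc(strlen(s) + 1);   /* +1 for the terminating '\0' */
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* Same logic as the question's insert(), but the node owns a copy of n. */
void insert(struct tree **root, const char *n) {
    assert(root != NULL);
    if (*root == NULL) {
        struct tree *newNode = malloc(sizeof *newNode);
        if (newNode == NULL)
            return;
        newNode->data = copyWord(n);
        if (newNode->data == NULL) {
            free(newNode);
            return;
        }
        newNode->left = newNode->right = NULL;
        *root = newNode;
        return;
    }
    int cmp = strcmp(n, (*root)->data);
    if (cmp < 0)
        insert(&(*root)->left, n);
    else if (cmp > 0)
        insert(&(*root)->right, n);
    /* cmp == 0: duplicate word, nothing to add */
}

/* Reads whitespace-separated words from fIn; returns how many were read. */
int buildTree(struct tree **root, FILE *fIn) {
    char input[64];                       /* assumed maximum word length: 63 */
    int count = 0;
    /* %63s bounds the read; testing the return value replaces the feof() check */
    while (fscanf(fIn, "%63s", input) == 1) {
        insert(root, input);
        count++;
    }
    return count;
}
```

Here buildTree() takes an already opened FILE* so that the caller decides which file to read and how to report a failed fopen().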

Related

Passing arguments in a BST insert function by extracting data using sscanf

I am writing a program that reads students (id, name, surname, grade) from a text file (line by line) and then stores them to a Binary Search Tree by using id as a key. To read each line i use fgets() and to extract the words from the line is use sscanf().
struct TreeNode* root = NULL;
FILE *text;
char *id, *onoma, *epitheto, *word, *line;
onoma = (char *)malloc(20 * sizeof(char));
epitheto = (char *)malloc(30 * sizeof(char));
id = (char *)malloc(9 * sizeof(char));
float vathmos;
text = fopen("students.txt", "r");
if (text == NULL) {
printf("Cannot read from the file!");
exit(1);
}
This is the loop where the data are extracted for each student:
while (fgets(line, 50, text) != NULL) {
printf("%d \n", root);
sscanf(line, "%s %s %s %f", id, onoma, epitheto, &vathmos);
printf("%s %s %s %.3f \n", id, onoma, epitheto, vathmos);
root = Insert(root, id);
}
And this is the insert function for the node:
TreeNode *Insert(struct TreeNode* root, char *data) {
if (root == NULL) { // empty tree
root = CreateNewNode(data);
}
// if data to be inserted is lesser, insert in left subtree.
else if ((strcmp(data, root->id)) <= 0) {
root->left = Insert(root->left,data);
}
// else, insert in right subtree.
else if ((strcmp(data, root->id)) > 0) {
root->right = Insert(root->right,data);
}
return root;
}
When I insert nodes "by hand" e.g.:
root = Insert(root, "AY881159");
root = Insert(root, "AA564510");
root = Insert(root, "AB784123");
the program works and the nodes are created and the tree can be manipulated.
But when the tree is created in the fgets() loop by getting the data from the sscanf(), there is a problem. While the variables store the data correctly (that's why I have the printf() after the sscanf() to check this), the root seems to reset and only the last student is kept in the tree.
Any ideas?
The code for the nodes is:
typedef struct TreeNode {
char *id;
struct TreeNode *left;
struct TreeNode *right;
} TreeNode;
and
TreeNode *CreateNewNode(char *data) {
struct TreeNode *NewNode = (TreeNode *)malloc(sizeof(TreeNode));
NewNode->id = data;
NewNode->left = NewNode->right = NULL;
return NewNode;
}
The code you posted cannot be compiled, which makes it more difficult to answer the question.
You create all nodes in the loop from the same id buffer. You need to make a copy of the buffer, either when calling Insert or preferably in the CreateNewNode() function. You did not post the code for that, nor did you post the definition of type TreeNode. Here is a possibility:
TreeNode *CreateNewNode(const char *data) {
TreeNode *node = calloc(1, sizeof(*node));
if (node != NULL) {
node->id = strdup(data);
node->left = node->right = NULL;
}
return node;
}
There is no need to allocate the arrays for the parse phase; local char arrays are fine for this. But data you store in the tree should be duplicated so you can reuse the buffers from the parsing code. Make the argument to Insert a const char *data to indicate that the buffer will not be modified by the tree, nor owned by it after the call.
You must also pass maximum field widths to sscanf to prevent buffer overflow.
Here is a modified version of the calling code:
int main(void) {
struct TreeNode *root = NULL;
FILE *text;
char id[9], onoma[20], epitheto[30], line[256];
float vathmos;
text = fopen("students.txt", "r");
if (text == NULL) {
printf("Cannot read from the file!");
exit(1);
}
// This is the loop where the data are extracted for each student:
while (fgets(line, sizeof line, text) != NULL) {
printf("%p \n", (void *)root);
if (sscanf(line, "%8s %19s %29s %f", id, onoma, epitheto, &vathmos) == 4) {
printf("%s %s %s %.3f \n", id, onoma, epitheto, vathmos);
root = Insert(root, id);
} else {
printf("invalid line: %s", line);
}
// I'm curious how you are going to store the other data...
}
...
}
The Insert function can be simplified:
TreeNode *Insert(struct TreeNode *root, const char *data) {
if (root == NULL) { // empty tree
root = CreateNewNode(data);
} else {
if (strcmp(data, root->id) <= 0) {
// if data to be inserted is lesser or equal, insert in left subtree.
root->left = Insert(root->left, data);
} else {
// else insert in the right subtree
root->right = Insert(root->right, data);
}
}
return root;
}
A better API for Insert would be to take a pointer to the root pointer and return a pointer to the new node:
TreeNode *Insert(struct TreeNode **nodep, const char *data) {
while (*nodep != NULL) {
if (strcmp(data, (*nodep)->id) <= 0) {
nodep = &(*nodep)->left;
} else {
nodep = &(*nodep)->right;
}
}
return *nodep = CreateNewNode(data);
}
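To make the calling convention concrete, here is a small self-contained sketch of that pointer-to-pointer version. The node helper copies the key with malloc/memcpy rather than strdup (which is POSIX, and only became part of ISO C in C23), and CountNodes is a purely illustrative helper that is not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct TreeNode {
    char *id;
    struct TreeNode *left;
    struct TreeNode *right;
} TreeNode;

/* Allocates a node that owns its own copy of data; returns NULL on failure. */
TreeNode *CreateNewNode(const char *data) {
    size_t len = strlen(data) + 1;
    TreeNode *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->id = malloc(len);
    if (node->id == NULL) {
        free(node);
        return NULL;
    }
    memcpy(node->id, data, len);
    node->left = node->right = NULL;
    return node;
}

/* Walks down to the empty link where data belongs and fills it in. */
TreeNode *Insert(TreeNode **nodep, const char *data) {
    assert(nodep != NULL);
    while (*nodep != NULL) {
        if (strcmp(data, (*nodep)->id) <= 0)
            nodep = &(*nodep)->left;
        else
            nodep = &(*nodep)->right;
    }
    return *nodep = CreateNewNode(data);
}

/* Illustrative helper: number of nodes in the tree. */
size_t CountNodes(const TreeNode *node) {
    if (node == NULL)
        return 0;
    return 1 + CountNodes(node->left) + CountNodes(node->right);
}
```

With this signature the caller passes the address of the root, as in Insert(&root, id), and no longer assigns the result back to root. Because each node owns its copy, the same id buffer can be reused for the next line.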

Why does this program always print the same node from a binary search tree

I have this modular program that should store a text file line by line and store each line as a string in each node of a binary search tree, we have to use a binary search tree for the assignment. Then when the user is prompted to enter a word, e.g. "Bus" it should print any string which contain the word bus. But whichever word I input, it always prints the last line of the text file (last string put in to the BST):
Enter word to search for a phone number: Bus
General enquires 01179222000
phone.txt:
Household waste and street maintenance 01179222100
Textphone for deaf people only 01173574444
Council housing 01179222200
Housing benefit and council tax reduction 01179222300
Council tax 01179222900
Electoral services 01179223400
Planning and building regulations 01179223000
Home Choice Bristol 01179222400
Pest control and dog wardens pollution and public safety 01179222500
report anti social behaviour and nuisance 01179222500
Bus passes and disabled parking permits 0117 922 2600
Residents parking permits 01179222600
Adult care and social services 01179222700
Transport and streets 01179222100
Register office 01179222800
Business rates 01179223300
Highways 01179222100
General enquires 01179222000
map.h
#include <stdbool.h>
// node structure
struct node;
// tree wrapper structure
struct tree;
typedef struct tree Tree;
typedef struct node Node;
// creates a new tree
Tree *new_tree();
// create a new node
Node *NewNode(char *data);
// insert in to the binary tree
Node *insert(Node *node, char *data);
// search for nodes to see if they exist
bool NodeSearch(Node *node, char *data);
map.c
// Binary search tree implementation
#include <stdio.h>
#include <stdlib.h>
#include "map.h"
struct node {
char *data;
struct node *left;
struct node *right;
} ;
// tree wrapper structure
struct tree {
struct node *root;
} ;
//create a new tree
Tree *new_tree() {
Tree *t = malloc(sizeof(Tree));
t->root = NULL;
return t;
}
//create a new node
Node *NewNode(char *data) {
Node *node = malloc(sizeof *node);
node->data = malloc(sizeof data);
node->left = NULL;
node->right = NULL;
return(node);
}
// insert in to the binary tree
Node *insert(Node *node, char *data) {
// 1. If the tree is empty, return a new, single node
if (node == NULL) {
return (NewNode(data));
}
else {
// 2. Otherwise, recur down the tree
if (data <= node->data)
node->left = insert(node->left, data);
else
node->right = insert(node->right, data);
return (node); // return the (unchanged) node pointer
}
}
// search for nodes to see if they exist
bool NodeSearch(Node* node,char *data) {
if (node == NULL)
return false;
else if(node->data == data)
return true;
else if(data <= node->data)
return NodeSearch(node->left, data);
else
return NodeSearch(node->right, data);
}
mainphone.c
#include "map.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h>
int main() {
new_tree();
Node *node = NULL;
FILE *f;
char s[300];
f = fopen("phone.txt", "r");
while (fgets(s, 300, f) != NULL);
NewNode(s);
insert(node, s);
printf("Enter a word to search for a phone number: ");
char word[100];
scanf("%s", word);
NodeSearch(node, word);
if (strstr(word, s) == NULL) {
printf("%s\n", s);
}
else {
printf("ERROR\n");
}
fclose(f);
return 0;
}
The most important mistake I can find, after fixing the code formatting (which was very hard to read), is
while (fgets(s, 300, f) != NULL);
//                              ^ I don't think so
It's very unlikely that the semicolon was intentional: it makes the loop body empty, so the loop only reads every line into s, and NewNode(s) and insert(node, s) then run once, on the last line.
The underlying reason for such a mistake is of course that the code is a mess, but there are other mistakes too.
This allocation seems to be wrong because it does not make sense
node->data = malloc(sizeof data);
this will allocate space to hold a pointer, sizeof data is probably 8 or 4, I don't think you want that but rather
node->data = malloc(1 + strlen(data));
You always assume that the allocation was successful, you must check if malloc() returns NULL which indicates a problem, most commonly that there is not enough available memory to allocate.
When you create a node you never copy the actual data into the data field of your structure, you should at least do something like this
size_t length;
length = strlen(data);
node->data = malloc(length + 1);
if (node->data != NULL)
memcpy(node->data, data, length + 1);
else
warn_about_memory_allocation_issue();
If you let this go and ignore the error, then you should always check the data field for NULL before dereferencing it.
If you intend to build a sorted tree, this will not work
if (data <= node->data)
node->left = insert(node->left, data);
else
node->right = insert(node->right, data);
return node;
because you are comparing the addresses held in the pointers, not the strings. You need strcmp() instead (it is declared in <string.h>, which map.c does not include yet)
if (strcmp(data, node->data) < 0)
node->left = insert(node->left, data);
else
node->right = insert(node->right, data);
return node;
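The points above still leave the original goal open: printing every stored line that contains the search word. A substring match cannot use the tree's ordering, so one option is to visit every node and test it with strstr(), whose first argument is the string to search in and whose second is the text to look for (the question's call has them the other way round). A minimal sketch, assuming the node layout from map.c and lines stored without their trailing newline; PrintMatches is a hypothetical name:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct node {
    char *data;
    struct node *left;
    struct node *right;
};

/* Hypothetical helper: prints every stored line that contains word,
   visiting the tree in order. Returns the number of matching lines. */
int PrintMatches(const struct node *node, const char *word, FILE *out) {
    assert(word != NULL && out != NULL);
    if (node == NULL)
        return 0;
    int found = PrintMatches(node->left, word, out);
    /* strstr(haystack, needle): search the stored line for the word */
    if (strstr(node->data, word) != NULL) {
        fprintf(out, "%s\n", node->data);
        found++;
    }
    return found + PrintMatches(node->right, word, out);
}
```

A return value of 0 tells the caller that no line matched, which is where an error message would belong.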

Binary search tree output file is not the expected output data

Okay so I am creating a Binary Search Tree that reads in strings and stores them in the tree. I am trying to confirm that each string has it's own node and each string is actually being read in. When my program is run, I believe it is creating seven nodes, one for each of the strings in the input file. So I created an Output file that prints the string that was just read to make sure each string is being stored in a node. There are seven strings in my input file :
bring
awake
anger
carry
global
fixed
halt
Here is the code for my program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXLEN 15
typedef struct treeNode{
char string[MAXLEN+1];
struct treeNode *left;
struct treeNode *right;
}treeNode;
treeNode * insert(treeNode *node, char s[MAXLEN]){
puts("running insert");
if(node == NULL){
node = (treeNode *)malloc(sizeof(treeNode));
strncpy(node -> string, s, MAXLEN);
node -> left = NULL;
node -> right = NULL;
}
else if(strcmp(node->string, s)>0){
node -> right = insert(node->right, s);
}
else if(strcmp(node->string, s)<0){
node -> left = insert(node->left, s);
}
else if(strcmp(node->string, s) == 0){
node -> left = insert(node->left, s);
}
return node;
}
int main(int argc, char *argv[]){
treeNode *root = NULL;
FILE *ifp;
FILE *ofp;
char s[MAXLEN+1];
if(argc != 3){
fprintf(stderr, "Usage: %s file\n", argv[1]); exit(1);
}
if((ifp = fopen(argv[2], "r")) == NULL){
fprintf(stderr, "Could not open file: %s\n", argv[2]); exit(1);
}
ofp = fopen("output.txt", "w+");
while(fscanf(ifp, "%s\n", &s) != EOF){
root = insert(root, s);
fprintf(ofp, "%s\n", root->string);
}
return 0;
}
And this is what the output file looks like after running the program:
bring
bring
bring
bring
bring
bring
bring
Now there are seven strings in each file so i am assuming each one is read. But how can I know if my program successfully created a node for each string?
How can I fix the problem? Any help would be appreciated! Thanks!
Full repaired code: http://pastebin.com/5BTnxTcd
Further optimization is left to you.
It seems your problem is the return value. It needs to be like this, because insert() has to return the root of the tree:
treeNode * insert(treeNode *node, char s[MAXLEN]){
puts("running insert");
if(node == NULL){
node = (treeNode *)malloc(sizeof(treeNode));
strncpy(node -> string, s, MAXLEN);
node -> left = NULL;
node -> right = NULL;
}
else if(strcmp(node->string, s)>0){
node -> right = insert(node->right, s);
}
else if(strcmp(node->string, s)<0){
node -> left = insert(node->left, s);
}
else if(strcmp(node->string, s) == 0){
node -> left = insert(node->left, s);
}
return node;
}
EDIT: It seems you also have a problem with argv: argv[0] is the name of the program, so
fprintf(stderr, "Usage: %s file\n", argv[1]); exit(1);
needs to be
fprintf(stderr, "Usage: %s file\n", argv[0]); exit(1);
EDIT 2:
You need a function that processes the tree recursively, printing the left subtree first, then the node itself, then the right subtree. It would be:
void treeprint( treeNode *node , FILE *OUTPUT_FILE)
{
if ( node != NULL)
{
treeprint(node->left , OUTPUT_FILE);
fprintf(OUTPUT_FILE , "%s\n" , node->string);
treeprint(node->right, OUTPUT_FILE);
}
}
Call it after the while loop, like treeprint(root, ofp);.
The issue is with how you are printing. In your insert algorithm, you never modify the string in the node. But, when you print, you are printing the root each time. So, you have a couple of choices:
Use a debugger, as @JoachimPileborg suggested.
Print the input string s to see if it was read from the input properly.
Write another function to traverse your tree and print out the string in each node as you go. This is the most involved, but is likely to be useful later.
Let's come to the issue you are facing:
You want to know whether your program successfully created a node for each string. There is more than one way to tackle this problem. The easiest is to print out the value of the node when it is actually being created. Your code would then look something like:
treeNode * insert(treeNode *node, char s[MAXLEN]){
puts("running insert");
if(node == NULL){
node = (treeNode *)malloc(sizeof(treeNode));
strncpy(node -> string, s, MAXLEN);
node -> left = NULL;
node -> right = NULL;
printToFile(node->string);
}
else if(strcmp(node->string, s)>0){
node -> right = insert(node->right, s);
}
else if(strcmp(node->string, s)<0){
node -> left = insert(node->left, s);
}
else if(strcmp(node->string, s) == 0){
node -> left = insert(node->left, s);
}
return node;
}
Here are the helper functions for printing to the file:
FILE *ofp; // Needs to be declared globally or passed to the functions as a parameter
void openWritableFile()
{
ofp = fopen("output.txt", "w+");
}
void printToFile( char* data )
{
fprintf(ofp, "%s\n", data );
}
Another approach would be to travel the whole tree using tree traversal algorithms such as in-order, post-order or pre-order, but only after creation of the complete tree.
Here is how you can do inorder traversal by using your root element and thus printing contents of the tree:
void inorder(treeNode *node)
{
if (node == NULL)
return;
inorder(node->left);
printToFile(node->string);
inorder(node->right);
}
You can learn about tree traversal here: Tree Traversal

Binary Tree segmentation fault after implementing search function

I am trying to write a program that will do the following:
- read a file from stdin
- read each line, and add each line to a binary tree
* if the name is already in the binary tree, don't add the name to the tree again but update its count of repetitions
- print out the binary tree
The file being read in looks something like
dylan
bob
dylan
randall
randall
So when I print out the binary tree I would like it to print out
bob 1
dylan 2
randall 2
I was able to successfully print out the names without worrying about repetitions. I have commented out the blocks of code that mess my program up, which is anything interacting with the search function that I added after the fact to take care of repetitions. The code builds a binary tree with each "leaf" being a structure of 4 parts: the name, the count, and the pointers to the left and right children.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct node {
char* name;
int count;
struct node* left;
struct node* right;
};
struct node* addNode(char* string);
void insert(struct node *root, char* stringgg);
void preorder(struct node *root);
int search(struct node* leaf,char* string2find);
int main()
{
char buffer[20];
struct node *root = NULL;
while( fgets(buffer, sizeof(buffer), stdin) != NULL )
{
if(root == NULL)
root = addNode(buffer);
else
insert(root,buffer);
}
preorder(root);
}
struct node* addNode(char* string)
{
struct node *temp = malloc(sizeof(struct node));
temp->name = malloc(strlen(string) + 1);
strcpy(temp->name,string);
temp->left = NULL;
temp->right = NULL;
return temp;
}
void insert(struct node *root, char* stringgg)
{
/* int flag = 5;
flag = search(root,stringgg);
if(flag == 1)
return; */
if(strcmp(stringgg,root->name) < 0)
{
if(root->left == NULL)
root->left = addNode(stringgg);
else
insert(root->left, stringgg);
}
else
{
if(root->right == NULL)
root->right = addNode(stringgg);
else
insert(root->right,stringgg);
}
}
/*int search(struct node* leaf,char* string2find)
{
if(strcmp(string2find,leaf->name) == 0)
{
leaf->count = leaf->count + 1;
return 1;
}
else if(strcmp(string2find,leaf->name) < 0)
{
return search(leaf->left,string2find);
}
else
{
return search(leaf->right,string2find);
}
return 0;
} */
void preorder(struct node *root)
{
if(root == NULL)
return;
printf("%s",root->name);
preorder(root->left);
preorder(root->right);
}
The above code prints out all the names even if they're already in the tree. I was hoping that someone would be able to point out the errors in my search function so that it won't cause a segmentation fault when printing. A possible cause may be my inappropriate use of return: I am trying to return to main if flag == 1, which means a match was found, so no node should be added. If flag does not equal 1, no match was found, so go about adding nodes.
In main, strip the trailing newline that fgets() leaves in the buffer:
while( fgets(buffer, sizeof(buffer), stdin) != NULL ){
char *p = strchr(buffer, '\n');
if(p) *p=0;//remove '\n'
In addNode, initialize the counter before returning:
temp->count = 1;//initialize counter
return temp;
In insert, handle an equal name by incrementing its counter instead of adding a node:
void insert(struct node *root, char* stringgg){
int cmp_stat = strcmp(stringgg,root->name);
if(cmp_stat == 0)
root->count++;
else if(cmp_stat < 0) {
if(root->left == NULL)
root->left = addNode(stringgg);
else
insert(root->left, stringgg);
} else {
if(root->right == NULL)
root->right = addNode(stringgg);
else
insert(root->right,stringgg);
}
}
In preorder, print the count next to the name:
printf("%s %d\n",root->name, root->count);
The error is that search() never checks for an empty (sub)tree. Whenever the name differs from the current node's name, you call
search(leaf->left, string2find) or search(leaf->right, string2find)
but that child may be NULL, so in search() you call
strcmp(string2find, leaf->name)
with leaf == NULL and the program crashes.
A cure: do not search BEFORE you update your tree, but rather search TO update it.
struct node* update(struct node* nd, const char* str)
{
int cmp;
// (sub)tree is empty? - create a new node with cnt==1
if(nd == NULL)
return CreateNode(str);
// test if the node found
cmp = strcmp(str, nd->name);
if(cmp == 0) // YES
nd->count ++; // update the counter
else if(cmp < 0) // NO - search in a subtree
nd->left = update(nd->left, str);
else
nd->right = update(nd->right, str);
return nd; // return the updated subtree
}
Then in main() you just update the tree and store it:
root = update(root, buffer);
Actually, the root value will change only once, on the first call, and all subsequent assignments will not change its value. However that makes the code much more readable.
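update() relies on a CreateNode() helper that is not shown. A minimal sketch of what it could look like, assuming it mirrors the question's addNode() but also sets count to 1 (the body of CreateNode below is an assumption, not the answerer's code). update() is repeated, slightly condensed, only so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
    char *name;
    int count;
    struct node *left;
    struct node *right;
};

/* Assumed helper: new leaf holding a copy of str, seen once so far. */
struct node *CreateNode(const char *str) {
    struct node *nd = malloc(sizeof *nd);
    if (nd == NULL)
        return NULL;
    nd->name = malloc(strlen(str) + 1);
    if (nd->name == NULL) {
        free(nd);
        return NULL;
    }
    strcpy(nd->name, str);
    nd->count = 1;
    nd->left = nd->right = NULL;
    return nd;
}

/* Search and insert in one pass: bump the counter on a match,
   otherwise descend and attach a new node at the empty link. */
struct node *update(struct node *nd, const char *str) {
    assert(str != NULL);
    if (nd == NULL)
        return CreateNode(str);
    int cmp = strcmp(str, nd->name);
    if (cmp == 0)
        nd->count++;
    else if (cmp < 0)
        nd->left = update(nd->left, str);
    else
        nd->right = update(nd->right, str);
    return nd;
}
```

Because the NULL check comes before any strcmp(), an empty tree or empty subtree is handled without dereferencing a null pointer.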

Binary Search Tree Issues

I'm working on a binary tree with a list tacked on to the data, yet I can't tell if the list is being populated or not. The code runs alright but when I try to call to print out the tree I get a freeze in my code. I believe everything is being pointed to properly but it's obvious there is a flaw in the logic somewhere.
struct declarations
typedef struct lineList
{
int lineNum;
LIST *next;
}LIST;
typedef struct nodeTag{
char data[80];
LIST *lines;
struct nodeTag *left;
struct nodeTag *right;
} NODE;
declaration and pass to function from main
NODE *root = NULL;
readFromFile(argv[1], root);
readfromfile(working function) then calls insertword
insertWord(root, keyword, lineNum);
insertWord, addToList functions(problem area)
NODE *allocateNode(char *data, int line)
{
NODE *root;
LIST *newNum;
if(!(root = (NODE *) malloc (sizeof(NODE))))
printf( "Fatal malloc error!\n" ), exit(1);
strcpy(root->data, data); //copy word
(root)->left = (root)->right = root->lines = NULL; //initialize
if (!(newNum =(LIST *) malloc (sizeof(LIST))))
printf( "Fatal malloc error!\n" ), exit(1);
newNum->lineNum = line;
root->lines = newNum;
return root;
}
/****************************************************************
ITERATIVE Insert
*/
NODE *insertWord(NODE *root, char *data, int line)
{
NODE *ptr_root = root;
printf("inserting %s\n", data);
if(root == NULL)
{
root = allocateNode(data, line);
return root;
}
while(ptr_root)
{
if (strcmp(data, ptr_root->data > 0))
{
if(ptr_root->right)
ptr_root = ptr_root->right; //traverse right
else
ptr_root->right = allocateNode(data, line);
}
else if (strcmp(data, ptr_root->data) < 0)
{
if(ptr_root->left) //traverse left
ptr_root = ptr_root->left;
else
ptr_root->left = allocateNode(data, line);
}
else
{
printf("Node already in the tree!\n");
addToList(ptr_root, line);
}
}
printf("5\n");
return root;
}
void printTreeInorder(NODE *root)//simple print, freeze on call to function
{
if(root)
{
printTreeInorder(root->left);
printf( "%s\n", root->data );
printTreeInorder(root->right);
}
return;
}
Let's look at insertWord():
At the end of your while loop, we know that ptr_root == NULL.
We then allocate memory for ptr_root.
We then initialize the contents of ptr_root.
We then perform a memory leak on ptr_root.
Note that you need to retain the parent of the new node, and you need to point its left or right pointer to this new node.
It also sounds like you understand how to use a debugger. If that's true, you should be able to see that root doesn't change between calls to insertWord().
In the code that you've posted with an attempted fix, you're missing one key thing. Let's look at a function:
void foo(NODE *root) {
printf("before malloc: %p\n", root);
root = malloc(sizeof(NODE));
printf("after malloc: %p\n", root);
}
int main() {
NODE *root = NULL;
printf("before function: %p\n", root);
foo(root);
printf("after function: %p\n", root);
}
This code will produce:
before function: 0x0
before malloc: 0x0
after malloc: 0x123ab129
after function: 0x0
Note that any change to the value of root is not propagated out of the function. Changes made to *root would be, though.
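One way to let the callee change the caller's pointer is to pass the address of that pointer. A minimal sketch using a stripped-down NODE (the allocateRoot name and the reduced struct are illustrative only, not taken from the question):

```c
#include <assert.h>
#include <stdlib.h>

typedef struct nodeTag {
    char data[80];
    struct nodeTag *left;
    struct nodeTag *right;
} NODE;

/* Takes the ADDRESS of the caller's pointer, so writing through rootp
   changes the caller's variable. Returns 0 on success, -1 on failure. */
int allocateRoot(NODE **rootp) {
    assert(rootp != NULL);
    NODE *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data[0] = '\0';            /* empty word */
    n->left = n->right = NULL;    /* no children yet */
    *rootp = n;                   /* visible to the caller */
    return 0;
}
```

Called as allocateRoot(&root), root is non-NULL after the function returns. The alternative, which insertWord() already supports through its return value, is to assign the result back at the call site: root = insertWord(root, keyword, lineNum);.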
