C - Loop seems to be halved whenever I encounter spaces

I'm writing a program that parses words from a line, adding a word to a tree whenever it hits a non-alphanumeric character. Everything works when a line contains no spaces. However, when a line does contain non-alphanumeric characters, the loop in question (beginning at the line commented in the code) runs only about half as many iterations as expected!
Why does the loop halve?
Tree addin (char* filee, Tree tree)
{
int i;
FILE *fp;
fp = fopen(filee, "r");
char* hold2 = malloc(99);
int count=-1;
char* hold;
while ((hold=getLine(fp))!=NULL)
{
count=-1;
for (i=0; i<strlen(hold); i++) //The loop in question
{
count++;
if ((isalnum(hold[count])==0)&&(hold[count]!='\n'))
{
strncpy(hold2, hold, count);
hold2[count]='\0';
hold=strdup(&hold[count+1]);
count=-1;
tree = insertT(tree, hold2);
}
}
tree = insertT(tree, hold);
}
free(hold);
fclose(fp);
return tree;
}

When you find a non-alphanumeric character, your program moves hold to point to the remainder of your string, but doesn't reset i. That means you continue iterating from the new hold pointer, which is partway into the original one, plus whatever i happened to be at that time. Doing so presumably at least skips a bunch of characters, and possibly makes you start operating on memory outside of the string, which is definitely bad news.

It may be because you change the value of hold within the loop, since strlen(hold) is reevaluated at each iteration. A solution could be to save the value of strlen(hold) before entering the for loop.
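Both answers point at the same root cause: the loop bound and the buffer it indexes change underneath the loop. A minimal sketch of one way to avoid that, assuming words are at most WORD_MAX - 1 characters long (split_words and WORD_MAX are hypothetical names, not part of the original program): compute the length once, never reassign the line pointer, and track where the current word starts.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define WORD_MAX 32  /* assumed maximum word size, including the terminator */

/* Copies each run of alphanumeric characters in line into words[].
 * Returns the number of words stored (at most max_words).
 * Words longer than WORD_MAX - 1 characters are truncated. */
static size_t split_words(const char *line, char words[][WORD_MAX],
                          size_t max_words)
{
    size_t len = strlen(line);  /* computed once; line is never reassigned */
    size_t start = 0;           /* index where the current word begins */
    size_t count = 0;

    for (size_t i = 0; i <= len && count < max_words; i++) {
        /* The terminator at i == len acts as one final delimiter. */
        if (i == len || !isalnum((unsigned char)line[i])) {
            if (i > start) {    /* skip empty runs between delimiters */
                size_t n = i - start;
                if (n >= WORD_MAX)
                    n = WORD_MAX - 1;
                memcpy(words[count], line + start, n);
                words[count][n] = '\0';
                count++;
            }
            start = i + 1;
        }
    }
    return count;
}
```

In the original program, each words[k] would be passed to insertT instead of being kept in an array. Because nothing is allocated with strdup inside the loop, there is also nothing to leak.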

Related

How to read in the entire word, and not just the first character?

I am writing a method in C in which I have a list of words from a file that I am redirecting from stdin. However, when I attempt to read in the words into the array, my code will only output the first character. I understand that this is because of a casting issue with char and char *.
While I am challenging myself to not use any of the functions from string.h, I have tried iterating through and am thinking of writing my own strcpy function, but I am confused because my input is coming from a file that I am redirecting from standard input. The variable numwords is inputted by the user in the main method (not shown).
I am trying to debug this issue via dumpwptrs to show me what the output is. I am not sure what in the code is causing me to get the wrong output - whether it is how I read in words to the chunk array, or if I am pointing to it incorrectly with wptrs?
//A huge chunk of memory that stores the null-terminated words contiguously
char chunk[MEMSIZE];
//Points to words that reside inside of chunk
char *wptrs[MAX_WORDS];
/** Total number of words in the dictionary */
int numwords;
.
.
.
void readwords()
{
//Read in words and store them in chunk array
for (int i = 0; i < numwords; i++) {
//When you use scanf with '%s', it will read until it hits
//a whitespace
scanf("%s", &chunk[i]);
//Each entry in wptrs array should point to the next word
//stored in chunk
wptrs[i] = &chunk[i]; //Assign address of entry
}
}
Do not re-use char chunk[MEMSIZE]; used for prior words.
Instead use the next unused memory.
char chunk[MEMSIZE];
char *pool = chunk; // location of unassigned memory pool
// scanf("%s", &chunk[i]);
// wptrs[i] = &chunk[i];
scanf("%s", pool);
wptrs[i] = pool;
pool += strlen(pool) + 1; // Beginning of next unassigned memory
Robust code would check the return value of scanf() and ensure that i and the pool pointer do not exceed their limits (MAX_WORDS and MEMSIZE).
I'd go for a fgets() solution as long as words are entered a line at a time.
char chunk[MEMSIZE];
char *pool = chunk;
// return word count
int readwords2() {
int word_count;
// limit words to MAX_WORDS
for (word_count = 0; word_count < MAX_WORDS; word_count++) {
intptr_t remaining = &chunk[MEMSIZE] - pool;
if (remaining < 2) {
break; // out of useful pool memory
}
if (fgets(pool, remaining, stdin) == NULL) {
break; // end-of-file/error
}
pool[strcspn(pool, "\n")] = '\0'; // lop off potential \n
wptrs[word_count] = pool;
pool += strlen(pool) + 1;
}
return word_count;
}
While I am challenging myself to not use any of the functions from string.h, ...
The best way to challenge yourself to not use any of the functions from string.h is to write them yourself and then use them.
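As a rough illustration of that advice, here is a sketch of hand-written replacements for two string.h functions. The names my_strlen and my_strcpy are made up for this example, and the behavior is only assumed to match the standard functions for ordinary null-terminated, non-overlapping strings.

```c
#include <stddef.h>

/* Counts the characters before the terminating null character. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Copies src, including its terminator, into dst and returns dst.
 * The caller must make sure dst has room for my_strlen(src) + 1 bytes. */
char *my_strcpy(char *dst, const char *src)
{
    char *p = dst;
    while ((*p++ = *src++) != '\0') {
        /* the copy happens in the loop condition */
    }
    return dst;
}
```

With these, the pool-advancing line above could be written as pool += my_strlen(pool) + 1; without including string.h.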
Your program reads each word into chunk starting at index i, so word i begins just one byte after word i-1 (as long as i stays below the size of chunk). Each read overwrites everything but the first character of the previous word, including its null terminator. The pointers in wptrs then all point into one run of characters: the first string consists of the first letters of every word but the last, followed by the complete last word; the second is the same string starting at its second character; the third starts at the third; and so on.
One fix is to write your own version of strdup(3): use chunk as a temporary buffer for each word, make a dynamically allocated copy of it with your strdup, and store that pointer in wptrs.
Finally, when you are finished, just free all the allocated strings and voilà!
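A minimal sketch of such a hand-rolled strdup, written without string.h to fit the self-imposed challenge. The name my_strdup is hypothetical, and the caller is assumed to free() every pointer it returns.

```c
#include <stdlib.h>

/* Returns a heap-allocated copy of s, or NULL if allocation fails.
 * The caller owns the result and must free() it. */
char *my_strdup(const char *s)
{
    size_t len = 0;
    while (s[len] != '\0')
        len++;

    char *copy = malloc(len + 1);   /* +1 for the terminator */
    if (copy == NULL)
        return NULL;

    for (size_t i = 0; i <= len; i++)   /* <= copies the terminator too */
        copy[i] = s[i];
    return copy;
}
```

In readwords(), each word would then be read into a scratch buffer with a bounded scanf and stored as wptrs[i] = my_strdup(buffer);, with a check for NULL.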
Also, and this is very important, read How to create a Minimal, Complete, and Verifiable example. Posted code frequently leaves out the very error being asked about, because it was trimmed away without anyone noticing (you don't normally know where the error is, or you would have corrected it and had no question to ask here, right?).

How can I find the initial letters of all the words in a .txt file

I'm a bit green in C and in programming as a whole, so I need help with a task.
I am trying to find the answer to this question, and all I came up with is this code, which runs but produces the wrong output.
output:
vhiag
hwag
tiatg
required output:
vhiag
hw
tiat
size=sizeof(ss)/sizeof(ss[0]);
if(strcmp(op,"first")==0){
while(1){
if(fgets(ss,512,fp)==NULL){
break;
}
first(ss,size);
}
}
void first(char spaces[],int size)
{
int i=1;
char r[size];
r[0]=spaces[0];
int j;
for(j=0;j<size;j++)
{
if(spaces[j]==' ')
{
r[i]=spaces[j+1];
i++;
}
}
r[i]='\0';
printf("%s\n",&r);
return;
}
Your first() function scans the whole array presented to it, all size bytes, without regard to the presence of a string terminator within. Therefore, if an input line is shorter than the previous one, your function blithely scans the overlay of the second line on the first.
To stop your scan at the end of the line, break from the loop when you see the terminator:
for (j = 0; spaces[j] != '\0'; j++)
You may also break on the condition that j reaches or exceeds size (as an additional, not alternative condition), but it's not really necessary in your case because you can rely on fgets() to provide that terminator within the number of bytes specified to it.
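Putting those points together, a possible rewrite of first() might look like the sketch below. It is an illustration, not the asker's code: the name initials, the return value, and the choice to write into a caller-supplied buffer rather than print are all assumptions made so the function can be checked in isolation.

```c
#include <stddef.h>

/* Writes the first character of each space-separated word in line to out.
 * Scanning stops at the string terminator or at a newline, so leftovers
 * from a previous, longer line in the same buffer are never visited.
 * Returns the number of initials written (excluding the terminator). */
static size_t initials(const char *line, char *out, size_t outsize)
{
    size_t n = 0;
    int at_word_start = 1;  /* the first non-space character starts a word */

    if (outsize == 0)
        return 0;

    for (size_t j = 0; line[j] != '\0' && line[j] != '\n'; j++) {
        if (line[j] == ' ') {
            at_word_start = 1;
        } else if (at_word_start) {
            if (n + 1 < outsize)    /* keep room for the terminator */
                out[n++] = line[j];
            at_word_start = 0;
        }
    }
    out[n] = '\0';
    return n;
}
```

Tracking at_word_start instead of copying spaces[j+1] also copes with consecutive spaces and with a trailing space at the end of a line.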

fscanf won't work twice?

I'm just trying to get fscanf to read all the characters in a file, along with all the words, but whenever I try to run a while loop on the file I opened twice it doesn't seem to work? It only seems to be able to use fscanf on a file one time. I found a workaround where I can scan the same file twice, except I need to open the file a 2nd time for this to work. How can I use fscanf on the same instance of a file twice?
/*Description: Program will open a file named Story.txt
then counts the number of words and characters
and prints them out */
int main(){
char word[225]; // will be used to hold words (array of chars)
char c; // will be used to hold chars
int wordCount = 0; // will be used to hold the number of words in the file
int charCount = 0; // will be used to hold the number of chars in the file
FILE* wordFile = fopen("Story.txt","r"); // opens the file Story.txt for counting words
printf("Words: "); // indicates that the following output will be the words of the file
while(fscanf(wordFile,"%s",&word)==1){ // a loop to scan the file for all the words in it until the end of the file
printf(" %s ",word); // prints out the word in the given cycle
wordCount = wordCount + 1; // keeps count of the words the loop has scanned up till now
}
fclose(wordFile); // closes wordFile and frees the memory
FILE* charFile = fopen("Story.txt","r"); // opens file Story.txt for counting chars
printf("\n\nChars: "); // indicates the following output will be the chars of the file
while(fscanf(charFile,"%c",&c)==1){ // a loop to scan the file for all the chars in it until the end of the file
if(c!=' '){ // will check if the char is a space, if it is it will not count it
printf(" %c ",c); // prints out the char for the given cycle
charCount = charCount + 1; // keeps count of the chars the loop has scanned up till now
}
}
fclose(charFile); // closes charFile and frees the memory
printf("\n\nWord count: %d and Char count: %d",wordCount,charCount-1); // prints out the word count and char count
return 0;
}
As you can see i have to create two instances of the file or else it will not work. The first instance is called wordFile and the 2nd instance is called charFile. Here's the thing though: Both loops work, it's just that I can't use them on the same file twice. How can I make it so that I will only need to open one instance of the file and then use it to count both the words and the chars in it?
Things I tried: adding the space as suggested here didn't work: C: Multiple scanf's, when I enter in a value for one scanf it skips the second scanf (i searched fscanf but scanf is all that came up so i went off of that).
Another workaround that I found strange: if I use wordFile in the 2nd while loop to search for chars it works, the only problem is I have to call fclose(wordFile); right before it's used in the while loop. I thought fclose was supposed to close the file and make it unusable? Anyway, that worked, but what I really want is to use one instance of the file to read all the chars and strings in it.
Do something like below - the lazy way of course
char word[225];
char c;
int wordCount = 0;
int charCount = 0;
FILE* wordFile = fopen("Story.txt","r");
printf("Words: ");
while(fscanf(wordFile,"%s",word)==1){ // pass word, not &word: the array already decays to a char *
wordCount = wordCount + 1;
}
printf("%d\n",wordCount);
fseek(wordFile,0,SEEK_SET); // setting the file pointer to point to the beginning of file
printf("Chars: ");
while(fscanf(wordFile,"%c",&c)==1){
if(c!=' '&& c!='\n' && c!='\t'){
charCount = charCount + 1;
}
}
printf("%d\n",charCount);
fclose(wordFile); // Closing the file once for all
return 0;
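The reason a second loop on the same FILE * finds nothing is that the first loop leaves the stream at end-of-file; the position has to be reset before reading again. A minimal sketch of the single-handle approach, wrapped in a hypothetical helper named count_words_and_chars so it can be tested on its own. It uses rewind(), which resets the position and clears the end-of-file indicator, and it swaps in fgetc() and isspace() for the fscanf("%c") loop; both substitutions are choices made for this sketch.

```c
#include <ctype.h>
#include <stdio.h>

/* Counts whitespace-separated words and non-whitespace characters
 * using one open stream. Returns 0 on success, -1 on a read error.
 * A word longer than 224 characters is counted as more than one word. */
static int count_words_and_chars(FILE *fp, long *words, long *chars)
{
    char word[225];
    int c;

    *words = 0;
    *chars = 0;

    rewind(fp);                               /* first pass: words */
    while (fscanf(fp, "%224s", word) == 1)    /* width limit prevents overflow */
        (*words)++;
    if (ferror(fp))
        return -1;

    rewind(fp);                               /* second pass: characters */
    while ((c = fgetc(fp)) != EOF) {
        if (!isspace(c))
            (*chars)++;
    }
    return ferror(fp) ? -1 : 0;
}
```

The caller opens Story.txt once, checks that fopen() did not return NULL, calls the helper, and closes the file once.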
fscanf(wordFile,"%s",&word);
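When you pass the array word to a function, it decays to a pointer to its first element (a char *), which is exactly what %s expects. &word is a pointer to the whole array (type char (*)[225]); it holds the same address, but its type does not match %s. You should rather use: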
fscanf(wordFile,"%s",word);

Replace value in string with another from another string

I've been stuck for a while now. The program i'm writing basically changes the false words with the correct ones from the dictionary. However, when i run the program, it gives me no warnings or errors, but it doesn't display anything. Can you please help me?
#include<stdio.h>
#include<ctype.h>
#include<string.h>
int main(void){
char fname[20],word[2500], dictn[50];
int i,j;
float len1, len2;
FILE *inp, *dict, *outp, *fopen();
fpos_t pos1, pos2;
dict= fopen("dictionary.txt", "r");
printf("Enter the path of the file you want to check:\n");
scanf("%s", fname);
inp= fopen(fname, "r");
for(i=0;(fscanf(inp, "%s", word) != EOF); i++){
for(j=0;fscanf(dict, "%s", dictn) != EOF; j++){
fgetpos(inp, &pos1);
fgetpos(dictn, &pos2);
len1=(float)strlen(word);
len2=(float) strlen(dictn);
if(len1<=(0.6*len2)){
fsetpos(dictn, &pos1);
}
if(strncmp(word, dictn, 1)==0){
fsetpos(dictn, &pos1);
}
if(strcmp(word, dictn)==0){
fsetpos(dictn, &pos1);
}
}
printf("%s ", word);
}
fclose(inp);
fclose(dict);
return(0);
}
You can use
sprintf(word, "%s ", dictn);
If your code works with printf, it should work with sprintf, provided you don't overflow word (counting the terminating null character), so you might have to resize word if it is smaller than dictn.
First of all, I'm assuming you have created the arrays word and dictn with enough space to hold the longest string in any of your files.
First fault:
In the loops you've created, i counts the strings read from the input file and j counts the strings read from the dictionary. word holds the current input string and dictn holds the current dictionary string. But you then use i and j to retrieve and alter the ith character of word and the jth character of dictn. This can go wrong, as in the following case:
Suppose there are 10 words in the inp file and 100 words in the dictionary. At some point in your loops, i has the value 8 and j has the value 88. For these values, word holds, say, apple, and dictn holds apple as well. So apple is the 8th word in the input file and the 88th word in the dictionary file. If one of those if conditions is satisfied, the program executes a statement like word[i]=dictn[j];, which means word[8] = dictn[88]; in this example. But both strings hold apple, which has only 5 characters! This is an error, because you are reading the 88th character of a 5-character string and assigning it to the 8th character of another 5-character string. So your code is wrong, and it will only work in rare cases.
Second fault:
I assume you want to read the whole dictionary file for every word in the input file, but you can only read it for the first word, since you don't reopen it or reset the position indicator to the beginning of the dictionary file after you have read through it.
Third fault:
Be careful with your first if statement if len1 and len2 are declared as integers. The expression 0.6*len2 is still evaluated as a double, because the integer operand is converted, but any integer division such as len1/len2 would truncate toward 0. Also note that since fscanf() with %s skips whitespace, len1 and len2 will always be at least 1.
Fourth fault:
Your else if statement will also never be reached: if two strings are equal, their first characters are equal too, so the earlier if statement that compares their first characters is always taken first.
I would write code as a solution, but first you need to correct the things that are logically wrong, because I do not know what you are really trying to achieve with your code (this answer is full of assumptions). But I can provide you some guidelines:
Change your len1 and len2 variables from int to float and cast the values returned by strlen() to float.
Reopen your dict file for every iteration of the outer loop, or reset its position to the beginning. (And do not forget to close it.)
To change your inp file, you can use a variable of type fpos_t to track the position indicator of the file (fgetpos() gets the current position and fsetpos() restores a position saved in an fpos_t variable; you can look them up) and write the word with fprintf() or fputs() at that location to change the string.
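The second fault is the one most likely to explain why nothing useful happens after the first word, so here is a small sketch of scanning the whole dictionary for every input word. It resets the stream with rewind() instead of reopening the file, which is a substitution made for this sketch; in_dictionary is a hypothetical helper, and the 50-byte entry buffer simply mirrors the size of dictn in the question.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if word appears as a whitespace-separated entry in dict,
 * 0 otherwise. The stream is rewound first, so the search always
 * starts at the beginning no matter where a previous call stopped. */
static int in_dictionary(FILE *dict, const char *word)
{
    char entry[50];

    rewind(dict);   /* also clears the end-of-file indicator */
    while (fscanf(dict, "%49s", entry) == 1) {  /* width limit prevents overflow */
        if (strcmp(entry, word) == 0)
            return 1;
    }
    return 0;
}
```

The outer loop then becomes while (fscanf(inp, "%2499s", word) == 1), calling in_dictionary(dict, word) for each word. Comparing the fscanf result against 1 rather than EOF also stops the loop cleanly on a matching failure.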

printf("%s") not working properly

So I have this big program (so I won't be able to put the entire thing here).
But at one point I have this.
while(ptr1!=NULL)
{
printf("%sab ",ptr1->name);
puts(ptr1->name);
ptr1=ptr1->next;
}
Now my ptr1 points to an entry of an array of structures (each entry being a linked list), and the structure was populated from a file.
Now in this loop it prints
FIRSTab FIRST
SECONDab SECOND
THIRD
Now why doesn't my THIRD get printed twice?
Also if i do
printf(" %s",ptr1->name); // i.e. one space before %s
I get
THIRDD
Putting 2 spaces before %s gives me
THIRDRD
3 spaces gives
THIRDIRD
And so on.
Also, if I try strcmp(ptr1->name,"THIRD"), the comparison for THIRD does not come out equal.
Why??
Here is how i populated my structure.
// G is the structure, fp is passed as argument to function.
//The file format is like this.
//FIRST SECOND THIRD
//NINE ELEVEN
//FOUR FIVE SIX SEVEN
// and so on.
int i=0,j=0,k=0;
char string[100];
while(!feof(fp))
{
if(fgets(string,100,fp))
{
G[i].index=i;
k=0;j=0;
//\\printf("%d",i);
//puts(string);
node *new=(node*)malloc(sizeof(node));
new->next=NULL;
G[i].ptr=new;
node* pointer;
pointer=G[i].ptr;
while(string[j]!='\n')
{
if(string[j]==' ')
{
pointer->name[k]='\0';
k=0;
node *new=(node*)malloc(sizeof(node));
new->next=NULL;
pointer->next=new;
pointer=pointer->next;
j++;
}
else
{
pointer->name[k++]=string[j];
j++;
}
}
pointer->name[k]='\0';
i++;
}
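Your third string probably contains the characters THIRD followed by \r (carriage return). On a terminal, a carriage return moves the cursor back to the start of the line, so whatever is printed next overwrites the earlier output; that matches the THIRDD, THIRDRD, THIRDIRD pattern you see, and the extra character is also why strcmp() reports a mismatch. Why the string contains it can only be determined by knowing the contents of the file and how you read it.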
It is likely that you are either working on a system that uses a single newline character as a line terminator (but the file you are opening comes from a system that uses a carriage return and newline pair) or that the file pointer that you were passed (fp) was opened in binary mode.
If you can't change the file pointer to be opened in text mode then a quick fix might be to change this condition while(string[j]!='\n') to while(string[j]!='\n' && string[j] != '\r'), although you might want a more robust solution that handles multiple whitespace characters.
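Another option is to strip the line ending right after fgets() returns, before any parsing starts. A minimal sketch, assuming the only unwanted trailing characters are \r and \n (strip_line_ending is a hypothetical helper name):

```c
#include <string.h>

/* Cuts the string at the first carriage return or newline, if any.
 * strcspn() returns the length of the initial segment that contains
 * none of the listed characters, so this is safe even when the line
 * has no line ending at all. */
static char *strip_line_ending(char *s)
{
    s[strcspn(s, "\r\n")] = '\0';
    return s;
}
```

If the line is stripped this way, the inner loop has to stop at '\0' instead of '\n', which also protects it when the last line of the file has no newline.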
