C - char array getting phantom values after memset - c

My program reads in a text file line by line and prints out the largest word in each sentence line. However, it sometimes prints out previous highest words although they have nothing to do with the current sentence and I reset my char array at the end of processing each line. Can someone explain to me what is happening in memory to make this happen? Thanks.
//Program Written and Designed by R.Sharpe
//LONGEST WORD CHALLENGE
//Purpose: Find the longest word in a sentence
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "memwatch.h"
int main(int argc, char** argv)
{
FILE* file;
file = fopen(argv[1], "r");
char* sentence = (char*)malloc(100*sizeof(char));
while(fgets(sentence, 100, file) != NULL)
{
//printf("%s\n", sentence);
char sub[100];
char maxWord[100];
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
char* word;
int maxLength = 0;
word = strtok(sentence, " ");
while(word != NULL)
{
if(strlen(word) > maxLength)
{
maxLength = strlen(word);
strcpy(maxWord, word);
//printf("%s\n", maxWord);
}
word = strtok(NULL, " ");
}
printf("%s\n", maxWord);
memset(maxWord, 0, sizeof(char));
maxLength = 0; //reset for next sentence;
}
free(sentence);
return 0;
}
my text file contains . .
some line with text
another line of words
Jimmy John took the a apple and something reallyreallylongword it was nonsense
test test BillGatesSteveJobsWozWasMagnificant
a b billy
the output of the program is . .
some
another
reallyreallylongword
BillGatesSteveJobsWozWasMagnificantllyreallylongword
BillGatesSteveJobsWozWasMagnificantllyreallylongword //should be billy
Also when I arbitrarily change the length of the 5th sentence the last word sometimes
comes out to be "reallyreallylongword" which is odd.
EDIT: Even when I comment MEMSET out I still get the same result so it may not have anything to do with memset but not completely sure

Trailing NULL bytes (\0) are the bane of string manipulation. You have a copy sequence that is not quite doing what you desire of it:
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
Sentence is copied into sub, and then back again. Except, strncpy does not copy the '\0' out of sentence. When you copy the string from sub back into sentence, you are copying an unknown length of data back into sentence. Because the stack is being reused and the char arrays are uninitialized, the data is likely residing there from the previous iteration and thus being seen by the next execution.
Adding the following between the two strcpys fixes the problem:
sub[strlen(sentence) - 1] = '\0';

You've got a missing null terminator.
char sub[100];
char maxWord[100];
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
When you strncpy, if src is longer than the number of characters to be copied, no null terminator is added. You've guaranteed this is the case, so sub has no terminator, and you're rapidly running into behavior you don't want. It looks like you're trying to trim the last character from the string; the easier way to do that is simply set the character at index strlen(sentence)-1 to '\0'.

This is bad:
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
The strncpy function does not null-terminate its buffer if the source string doesn't fit. By doing strlen(sentence)-1 you guaranteed it doesn't fit. Then the strcpy causes undefined behaviour because sub isn't a string.
My advice is to not use strncpy, it is almost never a good solution to a problem. Use strcpy or snprintf.
In this case you never even use sub so you could replace these lines with:
sentence[ strlen(sentence) - 1 ] = 0;
which has the effect of removing the \n on the end that was left by fgets. (If the input was longer than 100 then this deletes a character of input).

Find the corrected code in below
int main(int argc, char** argv)
{
FILE* file;
file = fopen(argv[1], "r");
char sub[100];
char maxWord[100];
char* word;
int maxLength = 0;
char* sentence = (char*)malloc(100*sizeof(char));
while(fgets(sentence, 100, file) != NULL)
{
maxLength = 0;
strncpy(sub, sentence, strlen(sentence)-1);
sub[strlen(sentence) - 1] = '\0'; //Fix1
strcpy(sentence, sub);
word = strtok(sentence, " ");
while(word != NULL)
{
if(strlen(word) > maxLength)
{
maxLength = strlen(word);
strcpy(maxWord, word);
}
word = strtok(NULL, " ");
}
printf("%s\n", maxWord);
memset(maxWord, 0, sizeof(char));
maxLength = 0; //reset for next sentence;
}
free(sentence);
fclose (file); //Fix2
return 0;
}
Ensure that the file is closed at the end. It is good practice.

Related

Seperating all characters in a string before a strstr pointer in C

I am building a find and replace program in C. it needs to be able to replace whatever is searched for in a text file with some thing else.
Eg. Search elo in input.txt replace with ELO
Result: developers devELOpers
eg search el replace EL hello hELlo welcome wELcome
I am having real trouble getting all the characters in a string before a pointer. I can get the single words and the pointer for the search word and i can get everything after the search term but i cannot get everything before the searched characters.
Tried strlen on the whole word and subtracting to no avail. I also can get the length of the first part of the word i need.
Thanks for any help.
search for era in
operating:
Outputs are
firstMatch= erating
End part =ting
startLength 2
#include <stdio.h>
#include<stdlib.h>
#include <string.h>
#define LINE_LENGTH 1000
//Function to remove a newline character and replace it with a null terminator.
void remove_newline(char *str)
{
int len =strlen (str);
if (len>0 &&str[len -1] == '\n')
str[len -1] = '0';
}
int main(int argc, char **argv)
{
char* searchWord = argv[1]; //Define the string searched for as argument 1.
char* replaceWord = ""; // Initialise the word for replacing search word.
char* text_File= ""; // Initialise the text file
for(int i = 0; i < argc; i++)
{
if (strcmp(argv[i],"-r")==0)
{
replaceWord = argv[i+1]; // Identify the replacemnt word as the argument that comes after "-r".
}
if (strcmp(argv[i],"-i")==0)
{
text_File = argv[i+1]; // Identify the file word as the argument that comes after "-i".
}
}
char slen= strlen(searchWord);
printf("You have searched for %s\n", searchWord); // Clarify for the user their search terms and input file.
printf("In the file %s\n", text_File);
FILE *input_file = fopen(text_File, "r"); // Open the text file
char line [LINE_LENGTH];
FILE *write_file = fopen("output.txt", "w"); //Create a new file for writing to.
` while (fgets(line, LINE_LENGTH, input_file) != NULL) ***// Loop through the contents of the input file.***
{
char *currentWord; ***// Separate the words at any of "\n ,.-".***
char *tempWord;
currentWord = strtok(line, "\n ,.-");
while (currentWord != NULL) ***// Loop through every seperated word.***
{
remove_newline(currentWord); //Remove the newline character form the current word.
if (strstr(currentWord, searchWord)!= NULL) ***// Check if the current word contains the searh word***
{
printf ("%s ", currentWord); ***// If it does print to console the word containing it***
char currLength= strlen(currentWord);
printf("%d\n", currLength);
char* firstMatch= strstr(currentWord, searchWord); ***// Find the first occurence of the searched term***
printf ("firstMatch %s\n ", firstMatch); ***//Print evrything after and including search.***
char* lastPart = firstMatch + slen; ***// Get the part after the search***
printf ("End part %s\n ", lastPart);
char rest = strlen(firstMatch);
char startLength = currLength - rest;
This is where it doesn't work.
char* startPart = firstMatch - startLength;
printf ("start Part %s\n ", startPart);
printf ("startLength %d\n\n ", startLength);`
The reason for the unexpected result when you try to print the "before the match" portion of the word is that there's nothing in the word string that will cause the
printf("start Part %s\n ", startPart);
call to stop after it has printed the first startLength characters of the word. When printf is told to print a string, it prints all characters from the starting point until it encounters a \0 terminator. Here, the only \0 is at the end of the word, so printf prints the entire word.
If you want to only print the first few characters of the word, you have to either construct a \0-terminated string that only contains those first few characters or you have to print them by using a mechanism that does not try to treat them as a string.
To construct a \0-terminated start string you could temporarily overwrite the first character of the match with a \0, then call printf, and then restore the match character. Something like:
char savedFirstMatch = *firstMatch;
*firstMatch = '\0';
printf("start Part %s\n ", startPart);
*firstMatch = savedFirstMatch;
If you don't want to do that then you could use a for loop to print only the first startLength characters as individual characters, not as a string, preceded and followed by a printf or puts that emits whatever extra stuff you want to print around those characters. In this case the extra stuff is a "start Part " string before the characters, and a newline and space afterwards (assuming that that space isn't just a typo). That would look something like:
puts("start Part ");
unsigned startIndex;
for (startIndex = 0; startIndex < startLength; ++startIndex) {
putchar(startPart[startIndex]);
}
puts("\n ");
Of course if you aren't comfortable with puts and putchar you can use printf("%s", ...) and printf("%c", ...) instead.

Code not letting me input and ends immediately

I have created a function that reverses all the words in a sentence, meaning that if the input is "Hello World" the output is supposed to be "World Hello". The code below is the function.
char* reversesentence(char sent[]) {
int lth = strlen(sent);
int i;
for(i = lth -1; i >= 0; i--) {
if(sent[i] == ' ') {
sent[i] = '\0';
printf("%s ", &(sent[i]) + 1);
}
}
printf("%s", sent);
}
In the main I am trying to ask the user for the sentence and calling the function in the main.
int main(void)
{
char sentence[2000];
printf("Please enter the sentence you want to be reversed.\n");
scanf("%s", sentence);
reversesentence(sentence);
printf("%s", sentence);
}
It seems that the array is only storing the first word of the sentence only.
Output:
Please enter the sentence you want to be reversed.
hello my name is
hellohello
Process finished with exit code 0`
Can someone help me fix this please? Searched online and found nothing useful.
scanf stops reading when it occurs whitespace,tabs or newline.
Matches a sequence of non-white-space characters; the next pointer
must be a pointer to character array that is long enough to hold the
input sequence and the terminating null byte ('\0'), which is added
automatically. The input string stops at white space or at the maximum
field width, whichever occurs first.
Thus you are not reading the entire string as you input.
Try using fgets as below.
fgets(sentence, sizeof(sentence), stdin);
Note fgets appends \n to end of the string. see how to trim the
new line from
fgets
You have two problems
as it was already said you only read one word using scanf
reversesentence just replace all spaces by a null character, so even you read a full line you cut it at the first space. so if you enter "hello world" the result will be "hello", and if you enter " hello world" the result will be an empty string
The simple way is to read words using scanf in a loop until it returns EOF, memorizing them, to at the end produce the list of words returned
For instance :
#include <stdlib.h>
#include <string.h>
int main()
{
size_t len = 0;
char * result = 0;
char word[256];
while (scanf("%256s", word) != EOF) {
if (result == 0) {
result = strdup(word);
len = strlen(word);
}
else {
size_t l = strlen(word);
char * r = malloc(len + l + 2);
strcpy(r, word);
r[l] = ' ';
strcpy(r + l + 1, result);
free(result);
result = r;
len += l + 1;
}
}
puts(result);
free(result);
return 0;
}
The reading finishes at the end of the input (^d under a linux shell), the words can be given on several lines.
With the input
hello world
how
are you
?
that prints
? you are how world hello

C - Find longest word in a sentence

Hi I have this program that reads a text file line by line and it's supposed to output the longest word in each sentence. Although it works to a degree, it's overwriting the biggest word with an equally big word which is something I am not sure how to fix. What do I need to think about when editing this program? Thanks
//Program Written and Designed by R.Sharpe
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "memwatch.h"
int main(int argc, char** argv)
{
FILE* file;
file = fopen(argv[1], "r");
char* sentence = (char*)malloc(100*sizeof(char));
while(fgets(sentence, 100, file) != NULL)
{
char* word;
int maxLength = 0;
char* maxWord;
maxWord = (char*)calloc(40, sizeof(char));
word = (char*)calloc(40, sizeof(char));
word = strtok(sentence, " ");
while(word != NULL)
{
//printf("%s\n", word);
if(strlen(word) > maxLength)
{
maxLength = strlen(word);
strcpy(maxWord, word);
}
word = strtok(NULL, " ");
}
printf("%s\n", maxWord);
maxLength = 0; //reset for next sentence;
}
return 0;
}
My textfile that the program is accepting contains this
some line with text
another line of words
Jimmy John took the a apple and something reallyreallylongword it was nonsense
and my output is this
text
another
reallyreallylongword
but I would like the output to be
some
another
reallyreallylongword
EDIT: If anyone plans on using this code, remember when you fix the newline character issue don't forget about the null terminator. This is fixed by setting
sentence[strlen(sentence)-1] = 0 which in effect gets rid of newline character and replaces it with null terminating.
You get each line by using
fgets(sentence, 100, file)
The problem is, the new line character is stored inside sentence. For instance, the first line is "some line with text\n", which makes the longest word "text\n".
To fix it, remove the new line character every time you get sentence.

Arrays in C not working

Well, I declared a global array of chars like this char * strarr[];
in a method I am tokenising a line and try to put everything into that array like this
*line = strtok(s, " ");
while (line != NULL) {
*line = strtok(NULL, " ");
}
seems like this is not working.. How can I fix it?
Thanks
Any number of things could be going wrong with the code you haven't shown us, such as undefined behaviour by strtoking a string constatnt, or getting your parameters wrong when calling the function.
But the most likely problem from the code we can see is the use of *line instead of line, assuming that line is of type char *.
Use the following code as a baseline:
#include <stdio.h>
#include <string.h>
int main (void) {
char str[] = "My name is paxdiablo";
// Start tokenising words.
char *line = strtok (str, " ");
while (line != NULL) {
// Print current token and get next word.
printf ("[%s]\n", line);
line = strtok(NULL, " ");
}
return 0;
}
This outputs:
[My]
[name]
[is]
[paxdiablo]
and should be easily modifiable into something you can use.
Be aware that, if you're trying to save the character pointers returned from strtok (which would make sense for using *line), they are transitory and will not be what you expect after you're done. That's because modifications are made in-place within the source string. You can do it with something like:
#include <stdio.h>
#include <string.h>
int main (void) {
char *word[4]; // The array of words.
size_t i; // General counter.
size_t nextword = 0; // For preventing array overflow.
char str[] = "My name is paxdiablo";
// Start tokenising.
char *line = strtok (str, " ");
while (line != NULL) {
// If array not full, duplicate string to array and advance index.
if (nextword < sizeof(word) / sizeof(*word))
word[nextword++] = strdup (line);
// Get next word.
line = strtok(NULL, " ");
}
// Print out all stored words.
for (i = 0; i < nextword; i++)
printf ("[%s]\n", word[i]);
return 0;
}
Note the specific size of the word array in that code above. The use of char * strarr[] in your code, along with the message tentative array definition assumed to have one element is almost certainly where the problem lies.
If your implementation doesn't come with a strdup, you can get a reasonably-priced one here :-)

help with C string i/o

Reads through a .txt file and puts all chars that pass isalpha() in a char array. for spaces it puts \0, so the characters in the array are separated by strings. This works.
I need help with the second part. I need to read a string that the user inputs (lets call it the target string), find all instances of the target string in the char array, and then for each instance:
1. print the 5 words before the target string
2. print the target string itself
3. and print the 5 words after the target string
I can't figure it out, i'm new to C in general, and I find this i/o really difficult after coming from Java. Any help would be appreciated, here's the code I have right now:
#include <stdio.h>
#include <string.h>
main(argc, argv)
int argc;
char *argv[];
{
FILE *inFile;
char ch, ch1;
int i, j;
int arrayPointer = 0;
char wordArray [150000];
for (i = 0; i < 150000; i++)
wordArray [i] = ' ';
/* Reading .txt, strip punctuation, conver to lowercase, add char to array */
void extern exit(int);
if(argc > 2) {
fprintf(stderr, "Usage: fread <filename>\n");
exit(-1);
}
inFile = fopen(argv[1], "r");
ch = fgetc(inFile);
while (ch != EOF) {
if(isalpha(ch)) {
wordArray [arrayPointer] = tolower(ch);
arrayPointer++;
}
else if (isalpha(ch1)) {
wordArray [arrayPointer] = '\0';
arrayPointer++;
}
ch1 = ch;
ch = fgetc(inFile);
}
fclose;
/* Getting the target word from the user */
char str [20];
do {
printf("Enter a word, or type \"zzz\" to quit: ");
scanf ("%s", str);
char* pch;
pch = strstr(wordArray, str);
printf("Found at %d\n", pch - wordArray + 1);
pch = strstr(pch + 1, str);
} while (pch != NULL);
}
There are a number of problems here, but the one that is probably tripping you up the most is the use of strstr as you've got it. Both parameters are strings; the first is the haystack, and the second is the needle. The definition of a C string is (basically) a sequence of characters terminated by '\0'. Take a look at how you've constructed your wordArray; it's effectively a series of strings one right after the other. So when you are using strstr the first time, you are only ever looking at the first string.
I realize this isn't the entire answer you are looking for, but hopefully it points you in the right direction. You may want to consider building up an array of char * that points into your wordArray at each word. Iterate over that new array checking for the string the user is looking for. If you find it, you now have an index you can work backwards and forwards from.

Resources