How to read only the first word from each line? - c

I've done many simple procedures, but now I'm just trying to read the first word of each line of a text file into a char word[30].
I've tried, without success. I also have to reuse that array for every line (each word goes into an ordered list as soon as it is read).
Can anyone show me a simple, clean way to read a file like this?
FILE *fp;
char word[30];
fp = fopen("/myhome/Desktop/tp0_test.txt", "r");
if (fp == NULL) {
printf("Error opening file!\n");
} else {
while (!feof(fp)) {
fscanf(fp,"%*[^\n]%s",word);//not working very well...
printf("word read is: %s\n", word);
strcpy(word,""); //is this correct?
}
}
fclose(fp);
For example for a file that contains:
word1 word5
word2 kkk
word3 1322
word4 synsfsdfs
it prints only this:
word read is: word2
word read is: word3
word read is: word4
word read is:

Just swap the conversion specifications in your format string:
// fscanf(fp,"%*[^\n]%s",word);//not working very well...
fscanf(fp,"%s%*[^\n]",word);
Read the first word and ignore the rest, rather than ignore the line and then read the first word of the next one.
Edit: some explanation
%s skips leading whitespace, so if the input buffer has " forty two", scanf skips the first space, copies "forty" to the destination, and leaves the buffer positioned at the space before "two".
%*[^\n] discards everything up to a newline, excluding the newline. So a buffer containing "one \n two" gets positioned at the newline after the scanf (as if it was "\n two"). The next %s then skips that newline as leading whitespace.
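Here is a minimal sketch of the swapped format used in a loop. It assumes the 30-byte word buffer from the question; the helper name read_first_word is made up for illustration. It bounds %s with a width so a long word cannot overflow the buffer, and it tests fscanf's return value instead of feof(), so no empty "word" is reported at the end of the file.

```c
#include <stdio.h>

/* Hypothetical helper: reads the first word of the next non-blank line
 * into word (30 bytes, as in the question) and discards the rest of
 * that line. Returns 1 on success, 0 at end of file or on error. */
int read_first_word(FILE *fp, char word[30])
{
    /* %29s skips leading whitespace (including the newline left over
     * from the previous line) and stores at most 29 characters + '\0'.
     * %*[^\n] then discards everything up to, but not including, '\n'.
     * The suppressed conversion is not counted in the return value. */
    return fscanf(fp, "%29s%*[^\n]", word) == 1;
}
```

A caller would loop with something like while (read_first_word(fp, word)) printf("word read is: %s\n", word);. Because %s skips newlines too, blank lines are silently skipped rather than reported as empty words.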

so ross$ expand < first.c
#include <stdio.h>
int main(void) {
char line[1000], word[1000];
while(fgets(line, sizeof line, stdin) != NULL) {
word[0] = '\0';
sscanf(line, " %s", word);
printf("%s\n", word);
}
return 0;
}
so ross$ ./a.out < first.c
#include
int
char
while(fgets(line,
word[0]
sscanf(line,
printf("%s\n",
}
return
}
Update: Ok, here is one that just uses scanf(). Really, scanf doesn't deal well with discrete lines and you lose the option of avoiding word buffer overflow by setting the word buffer to be the same size as the line buffer, but, for what it's worth...
so ross$ expand < first2.c
#include <stdio.h>
int main(void) {
char word[1000];
for(;;) {
if(feof(stdin) || scanf(" %s%*[^\n]", word) == EOF)
break;
printf("%s\n", word);
}
return 0;
}
so ross$ ./a.out < first2.c
#include
int
char
for(;;)
if(feof(stdin)
break;
printf("%s\n",
}
return
}

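Have a look at the strtok function; it does what we need. Its second argument lists the delimiter characters at which the string is split. For example, strtok(singleLine, " ,'("); cuts at the first space, comma, apostrophe, or opening parenthesis.
To split at spaces only, use strtok(singleLine, " ");.
#include <stdio.h>
#include <string.h>
int main(void)
{
    FILE *fPointer = fopen("words.txt", "r");
    FILE *fWordCopy = fopen("wordscopy.txt", "a");
    char singleLine[150];
    char *pch;
    if (fPointer == NULL || fWordCopy == NULL)
        return 1;
    // loop on fgets itself; while(!feof(...)) would process the last line twice
    while (fgets(singleLine, sizeof singleLine, fPointer) != NULL)
    {
        // '\n' is a delimiter too, so a one-word line loses its newline
        pch = strtok(singleLine, " ,'(\n");
        if (pch != NULL) // NULL when the line holds only delimiters
            fprintf(fWordCopy, "%s\n", pch); // never pass input as the format string
    }
    fclose(fPointer);
    fclose(fWordCopy);
    return 0;
}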
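strtok writes '\0' bytes into the line it splits. If the line must stay intact, the same delimiter-set idea can be sketched with strspn and strcspn instead. This is an alternative technique, not part of the answer above, and the helper name first_word is made up for illustration.

```c
#include <string.h>

/* Hypothetical helper: copies the first word of line into out (outsz
 * bytes) without modifying line. Words are separated by any character
 * in delims. Returns the length of the copied word, which is 0 when
 * the line contains no word. Longer words are truncated to fit. */
size_t first_word(const char *line, const char *delims,
                  char *out, size_t outsz)
{
    if (outsz == 0)
        return 0;
    const char *start = line + strspn(line, delims); /* skip leading delimiters */
    size_t len = strcspn(start, delims);             /* length of the first word */
    if (len >= outsz)
        len = outsz - 1;                             /* leave room for '\0' */
    memcpy(out, start, len);
    out[len] = '\0';
    return len;
}
```

For example, first_word("  hello, world\n", " ,\n", out, sizeof out) stores "hello" in out and returns 5.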

Related

Conditionally skipping entire line with just fscanf

Need to skip entire line (comment) if the first character is #
Some solutions in other posts suggested fgets
but in my case fscanf is the preferred option as I need to parse each word 1 by 1 later. How can this be done with just fscanf ?
Thank you.
File to be read
#This line is a comment <== skip this entire line
BEGIN {
THIS IS A WORD
}
CODE
void read_file(FILE *file_pointer, Program *p)
{
char buffer[FILESIZE];
int count = 0;
while (fscanf(file_pointer, "%s", buffer) != EOF)
{
if (buffer[0] == '#')
{
continue; // <== need to skip until the end of the line
}
else
{
strcpy(p->wds[count++], buffer);
}
}
}
How can this be done with just fscanf?
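fscanf(stdin, "%*[^\n]"); reads and discards characters up to, but not including, the next new-line character (or until an error or end-of-file occurs). The new-line itself stays in the stream, where the next %s skips it as leading whitespace.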
* says to suppress assignment of the data being read.
By default, [ starts a list of characters to accept. However, ^ negates that; [^\n] says to accept all characters other than a new-line character.
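Applied to the question's loop, the idea can be sketched as follows. The helper name next_word and the WORD_MAX buffer size are assumptions made for illustration; they do not come from the question. Like the question's code, the sketch treats any word that starts with # as the beginning of a comment, not only one at the start of a line.

```c
#include <stdio.h>

#define WORD_MAX 64 /* assumed buffer size, not from the question */

/* Hypothetical helper: stores the next word that is not part of a
 * '#' comment in buf. Returns 1 on success, 0 at end of file. */
int next_word(FILE *fp, char buf[WORD_MAX])
{
    while (fscanf(fp, "%63s", buf) == 1) {
        if (buf[0] != '#')
            return 1;
        /* Comment: discard the rest of the line. If the newline comes
         * right away, the directive fails to match and returns 0, which
         * is harmless. EOF means nothing at all is left to read. */
        if (fscanf(fp, "%*[^\n]") == EOF)
            return 0;
    }
    return 0;
}
```

The caller's loop then shrinks to while (next_word(file_pointer, buffer)) followed by the strcpy into p->wds, with no special case for comments.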
Some solutions in other posts suggested fgets but in my case fscanf is the preferred option as I need to parse each word 1 by 1 later. How can this be done with just fscanf ?
I recommend that you use fgets to read a whole line and then you can use sscanf instead of fscanf to read that line word by word.
#include <stdio.h>
void read_file( FILE *file_pointer )
{
char line[100];
while ( fgets( line, sizeof line, file_pointer) != NULL )
{
char *p = line;
char word[50];
int chars_read;
//skip line if it starts with "#"
if ( line[0] == '#' )
{
continue;
}
//read all words on the line one by one
while ( sscanf( p, "%49s%n", word, &chars_read ) == 1 )
{
//do something with the word
printf( "Found word: %s\n", word );
//make p point past the end of the word
p += chars_read;
}
}
}
int main( void )
{
//this function can also be called with an opened file,
//however for simplicity, I will simply pass "stdin"
read_file( stdin );
}
With the input
This is a test.
#This line should be ignored.
This is another test.
this program has the following output:
Found word: This
Found word: is
Found word: a
Found word: test.
Found word: This
Found word: is
Found word: another
Found word: test.
As you can see, the line starting with the # was successfully skipped.
Use fgets() after you've read the first word to read the rest of the line.
while (fscanf(file_pointer, "%s", buffer) != EOF)
{
if (buffer[0] == '#')
{
fgets(buffer, sizeof buffer, file_pointer);
}
else
{
strcpy(p->wds[count++], buffer);
}
}

How to print a string with its \n characters included?

Let's say we have char* str = "Hello world!\n". Obviously when you print this you will see Hello world!, but I want to make it so it will print Hello world!\n. Is there any way to print a string with its line break characters included?
Edit: I want to print Hello world!\n without changing the string itself. Obviously I could just do char* str = "Hello world \\n".
Also, the reason I'm asking this question is because I'm using fopen to open a txt file with a ton of line breaks. After making the file into a string, I want to split the string by each of its line breaks so I can modify each line individually.
I think it's a typical case of an XY Problem: you ask about a particular solution without really focusing on the original problem first.
After making the file into a string
Why do you think you need to read the entire file in at once? That's not normally necessary.
I want to split the string by each of its line breaks so I can modify each line individually.
You don't need to print the string to do that (you wanted "to make it so it will print Hello world!\n"). You don't need to modify the string either. You just need to read the file in line by line! That's what fgets is for:
void printFile(void)
{
FILE *file = fopen("myfile.txt", "r");
if (file) {
char linebuf[1024];
int lineno = 1;
while (fgets(linebuf, sizeof(linebuf), file)) {
// here, linebuf contains each line
char *end = linebuf + strlen(linebuf) - 1;
if (*end == '\n')
*end = '\0'; // remove the '\n'
printf("%5d:%s\\n\n", lineno ++, linebuf);
}
fclose(file);
}
}
I want to make it so it will print Hello world!\n
If you really wanted to do that, you'd have to translate each ASCII LF (that's what \n represents) into the two characters \ and n on output, for example like this:
#include <stdio.h>
#include <string.h>
void fprintWithEscapes(FILE *file, const char *str)
{
const char *cr;
while ((cr = strchr(str, '\n'))) {
fprintf(file, "%.*s\\n", (int)(cr - str), str);
str = cr + 1;
}
if (*str) fprintf(file, "%s", str);
}
int main() {
fprintWithEscapes(stdout, "Hello, world!\nA lot is going on.\n");
fprintWithEscapes(stdout, "\nAnd a bit more...");
fprintf(stdout, "\n");
}
Output:
Hello, world!\nA lot is going on.\n\nAnd a bit more...

C - Find longest word in a sentence

Hi I have this program that reads a text file line by line and it's supposed to output the longest word in each sentence. Although it works to a degree, it's overwriting the biggest word with an equally big word which is something I am not sure how to fix. What do I need to think about when editing this program? Thanks
//Program Written and Designed by R.Sharpe
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "memwatch.h"
int main(int argc, char** argv)
{
FILE* file;
file = fopen(argv[1], "r");
char* sentence = (char*)malloc(100*sizeof(char));
while(fgets(sentence, 100, file) != NULL)
{
char* word;
int maxLength = 0;
char* maxWord;
maxWord = (char*)calloc(40, sizeof(char));
word = (char*)calloc(40, sizeof(char));
word = strtok(sentence, " ");
while(word != NULL)
{
//printf("%s\n", word);
if(strlen(word) > maxLength)
{
maxLength = strlen(word);
strcpy(maxWord, word);
}
word = strtok(NULL, " ");
}
printf("%s\n", maxWord);
maxLength = 0; //reset for next sentence;
}
return 0;
}
My textfile that the program is accepting contains this
some line with text
another line of words
Jimmy John took the a apple and something reallyreallylongword it was nonsense
and my output is this
text
another
reallyreallylongword
but I would like the output to be
some
another
reallyreallylongword
EDIT: If anyone plans on using this code, remember that removing the newline character must still leave the string null-terminated. Both are achieved by setting
sentence[strlen(sentence)-1] = 0, which overwrites the newline character with the null terminator.
You get each line by using
fgets(sentence, 100, file)
The problem is, the new line character is stored inside sentence. For instance, the first line is "some line with text\n", which makes the longest word "text\n".
To fix it, remove the new line character every time you get sentence.
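One way to apply this fix is sketched below, under the assumption that words are separated by single spaces as in the question. The helper name longest_word is made up for illustration. It cuts the line at the newline with strcspn, which also copes with a last line that has no newline, and then runs the same strtok loop as the question.

```c
#include <string.h>

/* Hypothetical helper: finds the first longest space-separated word in
 * line and copies it to out (outsz bytes). Like strtok, it modifies
 * line. Words too long to fit in out are ignored. */
void longest_word(char *line, char *out, size_t outsz)
{
    if (outsz == 0)
        return;
    out[0] = '\0';
    line[strcspn(line, "\n")] = '\0'; /* cut at the newline, if there is one */

    size_t maxLength = 0;
    for (char *word = strtok(line, " "); word != NULL;
         word = strtok(NULL, " ")) {
        size_t len = strlen(word);
        /* strict '>' keeps the first of several equally long words */
        if (len > maxLength && len < outsz) {
            maxLength = len;
            memcpy(out, word, len + 1); /* copy the word and its '\0' */
        }
    }
}
```

With the newline gone, "text" is no longer counted as five characters, so the first line of the sample file yields "some".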

C - char array getting phantom values after memset

My program reads in a text file line by line and prints out the largest word in each sentence line. However, it sometimes prints out previous highest words although they have nothing to do with the current sentence and I reset my char array at the end of processing each line. Can someone explain to me what is happening in memory to make this happen? Thanks.
//Program Written and Designed by R.Sharpe
//LONGEST WORD CHALLENGE
//Purpose: Find the longest word in a sentence
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "memwatch.h"
int main(int argc, char** argv)
{
FILE* file;
file = fopen(argv[1], "r");
char* sentence = (char*)malloc(100*sizeof(char));
while(fgets(sentence, 100, file) != NULL)
{
//printf("%s\n", sentence);
char sub[100];
char maxWord[100];
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
char* word;
int maxLength = 0;
word = strtok(sentence, " ");
while(word != NULL)
{
if(strlen(word) > maxLength)
{
maxLength = strlen(word);
strcpy(maxWord, word);
//printf("%s\n", maxWord);
}
word = strtok(NULL, " ");
}
printf("%s\n", maxWord);
memset(maxWord, 0, sizeof(char));
maxLength = 0; //reset for next sentence;
}
free(sentence);
return 0;
}
my text file contains . .
some line with text
another line of words
Jimmy John took the a apple and something reallyreallylongword it was nonsense
test test BillGatesSteveJobsWozWasMagnificant
a b billy
the output of the program is . .
some
another
reallyreallylongword
BillGatesSteveJobsWozWasMagnificantllyreallylongword
BillGatesSteveJobsWozWasMagnificantllyreallylongword //should be billy
Also when I arbitrarily change the length of the 5th sentence the last word sometimes
comes out to be "reallyreallylongword" which is odd.
EDIT: Even when I comment MEMSET out I still get the same result so it may not have anything to do with memset but not completely sure
Trailing NUL bytes (\0) are the bane of string manipulation. You have a copy sequence that is not quite doing what you want:
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
sentence is copied into sub, and then back again. However, this strncpy call does not copy the terminating '\0' out of sentence, because the count passed to it is shorter than the string. sub is therefore not a terminated string, and strcpy copies an unknown length of data back into sentence. Because the stack is reused and the char arrays are uninitialized, the bytes after the copied text are probably left over from the previous iteration, which is why old words show up in the next one.
Adding the following between the strncpy and strcpy calls fixes the problem:
sub[strlen(sentence) - 1] = '\0';
You've got a missing null terminator.
char sub[100];
char maxWord[100];
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
When you strncpy, if src is longer than the number of characters to be copied, no null terminator is added. You've guaranteed this is the case, so sub has no terminator, and you're rapidly running into behavior you don't want. It looks like you're trying to trim the last character from the string; the easier way to do that is simply set the character at index strlen(sentence)-1 to '\0'.
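The claim about strncpy can be checked with a small sketch. The helper name strncpy_terminates is made up for illustration; it fills the destination with 'X' first so that stale bytes, like the ones left on the stack in the question, are visible.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if strncpy(dst, src, n) left a '\0'
 * within the first n bytes of dst, 0 otherwise.
 * Assumes n <= 100, the buffer size used in the question. */
int strncpy_terminates(const char *src, size_t n)
{
    char dst[100];
    memset(dst, 'X', sizeof dst); /* stand-in for leftover stack data */
    strncpy(dst, src, n);
    return memchr(dst, '\0', n) != NULL;
}
```

Called with n equal to strlen(src) - 1, as in the question, the function returns 0: no terminator was written. With n at least strlen(src) + 1 it returns 1, because strncpy then copies the '\0' and pads any remaining bytes with more of them.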
This is bad:
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
The strncpy function does not null-terminate its buffer if the source string doesn't fit. By doing strlen(sentence)-1 you guaranteed it doesn't fit. Then the strcpy causes undefined behaviour because sub isn't a string.
My advice is to not use strncpy, it is almost never a good solution to a problem. Use strcpy or snprintf.
In this case you never even use sub so you could replace these lines with:
sentence[ strlen(sentence) - 1 ] = 0;
which removes the \n that fgets left at the end. (If fgets stopped before reaching a newline, because the line did not fit in the 100-byte buffer or the file ends without one, this deletes a real character of input.)
The corrected code is below:
int main(int argc, char** argv)
{
FILE* file;
file = fopen(argv[1], "r");
char sub[100];
char maxWord[100];
char* word;
int maxLength = 0;
char* sentence = (char*)malloc(100*sizeof(char));
while(fgets(sentence, 100, file) != NULL)
{
maxLength = 0;
strncpy(sub, sentence, strlen(sentence)-1);
sub[strlen(sentence) - 1] = '\0'; //Fix1
strcpy(sentence, sub);
word = strtok(sentence, " ");
while(word != NULL)
{
if(strlen(word) > maxLength)
{
maxLength = strlen(word);
strcpy(maxWord, word);
}
word = strtok(NULL, " ");
}
printf("%s\n", maxWord);
memset(maxWord, 0, sizeof(char));
maxLength = 0; //reset for next sentence;
}
free(sentence);
fclose (file); //Fix2
return 0;
}
Also make sure the file is closed at the end (Fix2). It is good practice.

C : Comparing 2 Strings

I have to compare 2 strings, one from a file, and one from a user input, here is the file:
Password
abcdefg
Star_wars
jedi
Weapon
Planet
long
nail
car
fast
cover
machine
My_little
Alone
Love
Ghast
The code for getting the string from the line is fine but the code for comparing the 2 strings does not give the right output
int main(void){
int loop, line;
char str[512];
char string[512];
FILE *fd = fopen("Student Passwords.txt", "r");
if (fd == NULL) {
printf("Failed to open file\n");
return -1;
}
printf("Enter the string: ");
scanf("%s",string);
printf("Enter the line number to read : ");
scanf("%d", &line);
for(loop = 0;loop<line;++loop){
fgets(str, sizeof(str), fd);
}
printf("\nLine %d: %s\n", line, str);
if(strcmp(string,str) == 0 ){
printf("Match");
}else{
printf("No Match");
}
fclose(fd);
getch();
return 0;
}
Perhaps the str resets but i don't know, perhaps some of the talented programmers here can see the problem.
Anyone know what is wrong with my string comparison?
Correct output:
Input: jedi, 4 Output: Match
edit: Both strings are the same, in the same case
edit: dreamlax fixed this.
fgets() does not discard any newline character after reading, so it will be part of str, which will cause the comparison to fail since string won't have a newline character. To get around this, you simply need to remove the newline character from str.
str[strlen(str) - 1] = '\0';
if (strcmp(string, str) == 0)
// ...
Ideally, make sure strlen(str) > 0 first, otherwise you will invoke undefined behaviour.
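A minimal sketch of that guarded version follows. The helper name chomp is made up for illustration. It checks the length first and removes the last character only if it really is a newline, so a final line that lacks one is left untouched.

```c
#include <string.h>

/* Hypothetical helper: removes one trailing '\n' from str, if present.
 * Returns 1 if a newline was removed, 0 otherwise. */
int chomp(char *str)
{
    size_t len = strlen(str);
    if (len > 0 && str[len - 1] == '\n') { /* len > 0 guards str[len - 1] */
        str[len - 1] = '\0';
        return 1;
    }
    return 0;
}
```

In the question's program, calling chomp(str) right after the fgets loop would let strcmp(string, str) compare "jedi" with "jedi" rather than with "jedi\n".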

Resources