Best way to iterate through a file with new line separators - c

Let's say I want to read a file where each line has a string, and when there is a new line or the end of the file, I print the number of characters read. For example,
abcdf
asd

sdfsd
aa
This would print (counting new line characters at the end of each string):
10
8
(there is no newline at the end of the last line, so we get 8 instead of 9). I could do something like this:
FILE* f;
// ...
int charCount = 0;
char line[20];
while (fgets(line, sizeof line, f))
{
if (strcmp(line, "\n") == 0)
{
printf("%d\n", charCount);
charCount = 0;
}
else
{
charCount += strlen(line);
}
}
printf("%d\n", charCount);
Notice that I have to repeat the printf after the loop ends, because if I don't, I wouldn't print the last value (because the file reached the end and there is not a new line at the end). For a printf, this is not that bad, but if I had something more complicated, it would result in a lot of repeated code. My workaround is putting what I want inside a function and just call the function after the loop, but I feel like there has to be a better way. Is there a better way to parse through a file like this? Preferably not character by character in case I have some formatted data that I need to use fscanf with.

You can move the fgets call into the body of the while loop and check its result both in the loop condition and in the printing condition. The result pointer has to be initialized to a non-NULL value before the loop so that the body runs at least once.
FILE* f;
// ...
int charCount = 0;
char line[20];
char *result = line;
while (result)
{
result = fgets(line, sizeof line, f);
if ( result == NULL || strcmp(line, "\n") == 0 )
{
printf("%d\n", charCount);
charCount = 0;
}
else
{
charCount += strlen(line);
}
}

You could just do it the caveman way...
int ch; // int, not char: fgetc returns an int so that EOF can be told apart from real characters
int i = 0;
FILE *fp = fopen("yourfile.txt", "r");
while (feof(fp) == 0)
{
i++;
if ((ch = fgetc(fp)) == '\n')
printf("%d\n", --i), i = 0;
}
if (i > 1) printf("%d\n", --i);
fclose(fp);

Related

Find text inside the beginning and end () parentheses in a text file and read/print into a buffer, in C

I am new to C and am getting very frustrated with learning this language. Currently I'm trying to write a program that reads in a program text file and prints all the string literals and tokens, each on a separate line. I have most of it except for one snag. Within the text file there is a line such as: (..text..). I need to be able to search, read and print all the text that is inside the parentheses on its own line. Here is an idea I have so far:
#define KEY 32
#define BUFFER_SIZE 500
FILE *fp, *fp2;
int main()
{
char ch, buffer[BUFFER_SIZE], operators[] = "+-*%=", separators[] = "(){}[]<>,";
char *pus;
char source[200 + 1];
int i, j = 0, k = 0;
char *words = NULL, *word = NULL, c;
fp = fopen("main.txt", "r");
fp2 = fopen ("mynewfile.txt","w") ;
while ((ch = fgetc(fp)) != EOF)
{
// pus[k++] = ch;
if( ch == '(')
{
for ( k = 0;, k < 20, K++){
buffer[k] = ch;
buffer[k] = '\0';
}
printf("%s\n", buffer)
}
....
The textfile is this:
#include <stdio.h>
int main(int argc, char **argv)
{
for (int i = 0; i < argc; ++i)
{
printf("argv[%d]: %s\n", i, argv[i]);
}
}
So far I've been able to read char by char and place it into a buffer. But this idea just isn't working, and I'm stumped. I've tried dabbling with strcpy() and strtok(), but they all take char arrays. Any ideas would be appreciated, thank you.
Most likely the best way would be to use fgets() with a file to read in each line as a string (char array) and then delimit that string. See the short example below:
char buffer[BUFFER_SIZE];
int current_line = 0;
// Keep reading lines until nothing is left.
while (fgets(buffer, sizeof buffer, fp) != NULL)
{
// The line from the file is now in buffer; we can split it.
char copy[BUFFER_SIZE];
// Work on a copy, because strtok overwrites the string it scans.
strcpy(copy, buffer);
printf("Line: %d - %s", current_line, buffer); // Print the line.
char *found = strtok(copy, separators); // Split on any character in separators.
while (found != NULL)
{
printf("%s\n", found); // One token per line.
found = strtok(NULL, separators);
}
current_line++;
}
strtok returns a char pointer to the start of the next token. It overwrites the delimiter that ends that token with the null terminator, thereby making a "new" string inside the buffer. Passing NULL as the first argument tells strtok to continue where it left off. Using this, we can parse a file line by line based on multiple delimiters. You could save these individual strings or evaluate them further.

strncmp gives 0 even when strings are NOT equal - C

I am having a problem with the strncmp function in C: it returns 0 even when the words do not match. In the example below, I am testing it with the letter 'R', and when I run the code it returns 0 even though the compared word in the txt document is 'RUN'.
Am I missing something in the strncmp function or somewhere else in my code?
Thank you for your input.
bool lookup(string s);
int main(void) {
char *s;
s = "R";
if (lookup(s)) {
printf("Word found =)\n");
} else {
printf("Word not found =(\n");
}
}
// Looks up word, s, in txt document.
bool lookup(string s)
{
// TODO
char *wordtosearch;
wordtosearch = s;
int lenwordtosearch = strlen(wordtosearch);
char arraywordindic[50];
// Open txt file
FILE *file = fopen("text.txt", "r");
if (file == NULL)
{
printf("Cannot open file, please try again...\n");
return false;
}
while (!feof(file)) {
if (fgets(arraywordindic, 50, file) != NULL) {
char *wordindic;
wordindic = arraywordindic;
int result = strncmp(wordindic, wordtosearch, lenwordtosearch);
if (result == 0) {
printf("%i\n", result);
printf("%s\n", wordindic);
printf("%s\n", wordtosearch);
fclose(file);
return true;
}
}
}
fclose(file);
return false;
}
The problem is that it compares R with RUN and returns 0. I want it to return 0 only when it finds exactly R.
In this case you need to compare whole words using the function strcmp instead of comparing only lenwordtosearch characters using the function strncmp.
Note that fgets stores the newline character '\n' in the buffer when it reads one. You need to remove it before comparing the strings.
if (fgets(arraywordindic, 50, file) != NULL) {
arraywordindic[ strcspn( arraywordindic, "\n" ) ] = '\0';
int result = strcmp(arraywordindic, wordtosearch);
if (result == 0) {
printf("%i\n", result);
printf("%s\n", arraywordindic);
printf("%s\n", wordtosearch);
As a result these declarations
int lenwordtosearch = strlen(wordtosearch);
and
char *wordindic;
wordindic = arraywordindic;
may be removed.
And the condition of the while loop should be written like
while ( fgets(arraywordindic, 50, file) != NULL ) {
arraywordindic[ strcspn( arraywordindic, "\n" ) ] = '\0';
int result = strcmp(arraywordindic, wordtosearch);
if (result == 0) {
printf("%i\n", result);
printf("%s\n", arraywordindic);
printf("%s\n", wordtosearch);
//...
int result = strncmp(wordindic, wordtosearch, lenwordtosearch);
This is going to give you zero if the first lenwordtosearch characters of wordtosearch match the first lenwordtosearch characters of any word in the dictionary.
Given that the word you're searching for is R, any word in the dictionary that starts with R is going to give you a match.
You should probably be checking the entire word. That probably means cleaning up the word you've read in from the file (i.e., removing the newline) and using strcmp() instead, something like:
wordindic = arraywordindic;
// Add this:
size_t sz = strlen(wordindic);
if (sz > 0 && wordindic[sz - 1] == '\n')
wordindic[sz - 1] = '\0';
// Modify this:
// int result = strncmp(wordindic, wordtosearch, lenwordtosearch);
int result = strcmp(wordindic, wordtosearch);

I am kinda new to C programming. I need help to push a string from a file (test.txt) into an array without the commas

I am trying to solve a problem on dynamic memory allocation with malloc(), free() and realloc(), reading the input from a file. I just need help pushing the strings from the file into an array, without the commas. My test.txt file is as follows:
a,5,0
a,25,1
a,1,2
r,10,1,3
f,2
int i;
int count;
char line[256];
char *str[20];//to store the strings without commas
char ch[20];
int main (void)
{
FILE *stream;
if ( (stream = fopen ( "test.txt", "r" )) == NULL )
{ printf ("Cannot read the new file\n");
exit (1);
}
while(fgets(line, sizeof line, stream))
{
printf ("%s", line);
int length = strlen(line);
strcpy(ch,line);
for (i=0;i<length;i++)
{
if (ch[i] != ',')
{
printf ("%c", ch[i]);
}
}
}
//i++;
//FREE(x);
//FREE(y);
//FREE(z);
fclose (stream);
The str[] array should only store values like a520 (excluding the commas).
First of all, DO NOT use global variables unless it is absolutely required.
I am assuming you want str as array of pointers and str[0] stores first line, str[1] stores second line and so on.
For this:
int line_pos = 0; //index of the current line in str[]
int char_pos = 0; //write position in str[line_pos]
while(fgets(line, sizeof(line), stream))
{
printf ("%s", line);
int length = strlen(line);
str[line_pos] = calloc(length + 1, sizeof(char)); //allocating memory, +1 for the terminating '\0'
if (str[line_pos] == NULL)
break; //allocation failed
for (i=0;i<length;i++)
{
if (line[i] != ',' && line[i] != '\n') //read from line directly; ch[20] is too small for a 256-byte line
{
str[line_pos][char_pos] = line[i]; //setting value of str[line_pos][char_pos]
char_pos++;
}
}
char_pos = 0;
line_pos++;
}
printf("%s\n", str[0]); //print first line without commas
Note that this only works for 20 lines (because you declared *str[20]); a 21st or later line writes past the end of the array, which can cause a variety of disasters. You can include:
if (line_pos >= 20)
break;
as a safety measure.
Note that slightly more memory is allocated for str than needed (memory allocated = memory required + number of commas). To prevent this you can first copy the text without commas into ch (which must be large enough to hold the line):
int j = 0; //stores position in ch
for (i=0;i<length;i++)
{
if (line[i] != ',')
{
ch[j++] = line[i];
}
}
ch[j] = '\0'; //terminate ch so that strlen works
Then allocate memory for str[line_pos] like:
str[line_pos] = calloc(strlen(ch) + 1, sizeof(char)); //+1 for the terminating '\0'

Program to remove preceding whitespace

I am working on a program that should remove preceding spaces and tabs from each line of text in a given file (case b). I read the file from stdin, which I got working fine. However, I am getting a nasty seg fault that I can't figure out. It happens when I call strcat() in case b. Basically what I was trying to do in case b is iterate through each line (80 characters) in the text file, remove any preceding tabs or spaces from the line, then put these lines back into finalText. Can anyone see where I am going wrong? Or if there might be a simpler approach?
Here's my code:
int main(int argc, char* argv[]) {
int x = 0;
int i = 0;
int j = 0;
int y = 0;
int count = 1;
char *text = malloc(sizeof(char) * 1024);
char *finalText = malloc(sizeof(char) * 1024);
char buff[80];
while(fgets(buff, 80, stdin) != NULL){
strcat(text, buff);
}
while ((x = getopt(argc, argv, "bic:")) != -1){
switch (x){
case 'b':
for(; text[i] != EOF; i += 80){
char buff2[80];
char *buff3;
j = i;
y = 0;
while(j != (80 * count)){
buff2[y] = text[j];
y++;
j++;
}
buff3 = buff2;
while(*buff3 && isspace(*buff3)){
++buff3;
}
count++;
strcat(finalText, buff3);
}
printf(finalText);
break;
default:
break;
}
}
return 0;
}
#include <stdio.h>
int main(){
char buff[80];
int n;
while(fgets(buff, sizeof(buff), stdin)){
sscanf(buff, " %n", &n);
if(n && buff[n-1] == '\n')//only whitespaces line.(nothing first word)
//putchar('\n');//output a newline.
fputs(buff, stdout);//output as itself .
else
fputs(buff + n, stdout);
}
return 0;
}
First, there is another problem before the 'b' case. You have allocated 1024 bytes for text, and each line you read from stdin is concatenated onto it. If the total number of characters read from stdin exceeds what fits in 1024 bytes, strcat writes past the end of the buffer, which typically ends in a segmentation fault.
For your problem in the 'b' case:
Why search for EOF? EOF is not a character stored in the string, so your loop keeps incrementing i until you receive a segmentation fault. You just want to iterate until the end of the string, whose length you can get with strlen(), for example.

The curse of the \n character in C

I've made a simple spellchecker that reads in a dictionary and a user text file to check against it. The program needs to display the line and word index of any word not in the dictionary. It works fine until the user text file has a newline character \n in it (at the end of a paragraph or sentence). Then Hello is actually tested against the dictionary as Hello\n, and the program believes it's spelled incorrectly. Can anyone advise a method to remove the \n character? Here is my code:
#include <stdio.h>
#include <string.h>
void StrLower(char str[])
{
int i;
for (i = 0; str[i] != '\0'; i++)
str[i] = (char)tolower(str[i]);
}
int main (int argc, const char * argv[]) {
FILE *fpDict, *fpWords;
fpWords = fopen(argv[2], "r");
if((fpDict = fopen(argv[1], "r")) == NULL) {
printf("No dictionary file\n");
return 1;
}
char dictionaryWord[50]; // current word read from dictionary
char line[100]; // line read from spell check file (max 50 chars)
int isWordfound = 0; // 1 if word found in dictionary
int lineCount = 0; // line in spellcheck file we are currently on
int wordCount = 0; // word on line of spellcheck file we are currently on
while ( fgets ( line, sizeof line, fpWords ) != NULL )
{
lineCount ++;
wordCount = 0;
char *spellCheckWord;
spellCheckWord = strtok(line, " ");
while (spellCheckWord != NULL) {
wordCount++;
spellCheckWord = strtok(NULL, " ,");
if(spellCheckWord==NULL)
continue;
StrLower(spellCheckWord);
printf("'%s'\n", spellCheckWord);
while(!feof(fpDict))
{
fscanf(fpDict,"%s",dictionaryWord);
int res = strcmp(dictionaryWord, spellCheckWord);
if(res==0)
{
isWordfound = 1;
break;
}
}
if(!isWordfound){
printf("word '%s' not found in Dictionary on line: %d, word index: %d\n", spellCheckWord, lineCount, wordCount); //print word and line not in dictionary
}
rewind(fpDict); //resets dictionarry file pointer
isWordfound = 0; //resets wordfound for next iteration
}
}
fclose(fpDict);
fclose(fpWords);
return 0;
}
Wow thanks for the quick responses everyone. You guys are great, over the moon with that!
Remove the '\n' immediately after the fgets() call:
while ( fgets ( line, sizeof line, fpWords ) != NULL )
{
size_t linelen = strlen(line);
assert((linelen > 0) && "this can happen only when file is binary");
if (line[linelen - 1] == '\n') line[--linelen] = 0; /* remove trailing '\n' and update linelen */
Try adding \n to the argument you pass to strtok.
If you simply want to remove the character for the sake of comparison, and know it will be at the end of a line, then when you read the word into your buffer, do a strchr() for \n and then replace that position with \0 if you find it.
How about:
size_t length = strlen(dictionaryWord);
if (length > 0 && dictionaryWord[length-1] == '\n') {
dictionaryWord[length-1] = 0;
}
