File I/O word count program in C - c

I know there are several asks already regarding this topic, but it seemed like those were either not completely answered or hard to apply to my code, so I apologize if this is a repeat.
I am having trouble with the function below, which is part of a larger I/O program that also does word and line count (those work). char* filename is taken from the command line. In this example it reads a txt file containing lorem ipsum (69 words). In theory the function should read from filename into an array, then scan that array and check whether the current character is a space ' ' and the next character is not. It currently returns 0 regardless.
int wordcount(char* filename) {
int wc=0,i=0,z=0;
char w, test[1000];
FILE *fp;
fp = fopen(filename, "r");
while (feof(fp) == 0) {
fscanf(fp, "%c", &test[i]);
i++;
}
while (z>i-1) {
if (test[z] = ' ' && test[z+1] != ' ' ) {
wc++;z++;
}
}
return wc;
}
NOTES: I know it's super inefficient to declare a 1000 char array, but I wasn't sure how else to do it. If you have any improvements or other methods to accomplish this, it would be greatly appreciated if you shared. Also, I'm aware that this currently ignores other types of whitespace, but I am just testing this first and will expand after.
Thanks for any assistance.

Here is a sample function that does what you need. Some suggestions for your code: fopen() must be followed by fclose() when you no longer need the file. Always check that the pointer returned by fopen() is not NULL; if it is, do nothing further and just return an error code. The start of a new word can be detected as a non-space character that follows a space character (or the start of the file); in that case increment the word count with ++wc. Use getc() to read one character at a time from the file and isspace() to check whether a character is whitespace. You don't need an array to store the file, as long as nobody modifies the file during the word count run.
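int wordcount(const char* filename)
{
    int wc = 0;
    int c; /* int, not char: getc() returns an int so EOF can be told apart from data */
    FILE *fp;
    fp = fopen(filename, "r");
    if (fp == NULL)
    {
        return -1; /* could not open the file */
    }
    bool previsspace = true;
    while ((c = getc(fp)) != EOF)
    {
        /* a word starts at a non-space character that follows whitespace */
        if (!isspace(c) && previsspace) ++wc;
        previsspace = isspace(c);
    }
    fclose(fp);
    return wc;
}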
You will need the following include files:
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>

The concrete problem why you always get a result of 0 words:
while (z>i-1) {
z is never larger than i-1. Probably you meant to loop while z is smaller than i-1 instead:
while (z<i-1) {
Additionally you only increment z when you find a word. You should increment it for every character you test, no matter if it's a space or not.
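For reference, a minimal sketch of the array-based approach with these loop problems fixed might look like the following. It assumes the text is already in memory; count_words_in_buffer is a hypothetical helper name, not something from the question. It uses a comparison rather than the assignment in test[z] = ' ', advances z on every character, and swaps in isspace() so that tabs and newlines also separate words.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the words in text[0..len-1]; a word is a run of non-whitespace. */
int count_words_in_buffer(const char *text, size_t len)
{
    int wc = 0;
    int in_word = 0;
    for (size_t z = 0; z < len; z++) {   /* z advances on every character */
        if (isspace((unsigned char)text[z])) {
            in_word = 0;                 /* whitespace ends the current word */
        } else if (!in_word) {
            in_word = 1;                 /* first character of a new word */
            wc++;
        }
    }
    return wc;
}
```

Counting word starts this way also counts a word at the very beginning of the buffer, which the "space followed by non-space" test would miss.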

Related

Why should I put SEEK_SET twice

I want to replace some vowels in a file with "5". The following code works. However, I do not understand why I need to call fseek twice.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
void print_file_contents(const char *filename)
{
FILE *fp;
char letter;
if((fp=fopen(filename,"r+"))==NULL)
{
printf("error\n");
exit(1);
}
fseek(fp,0,SEEK_END);
int size=ftell(fp);
rewind(fp);
for(int i=0;i<size;i++)
{
fseek(fp,i,SEEK_SET);
letter=fgetc(fp);
if((letter=='a') || (letter=='e') || (letter=='i'))
{
fseek(fp,i,SEEK_SET); // WHY THIS FSEEK ?
fwrite("5",1,sizeof(char),fp);
}
}
fclose(fp);
}
int main(int argc, char *argv[])
{
print_file_contents("myfile");
return 0;
}
In my opinion, the first fseek(fp, i, SEEK_SET) is used to set the file position indicator to the current character being processed, so that the character can be read using fgetc. Hence, the cursor is updated every time so there is no need to add another fseek(fp, i, SEEK_SET);.
The fgetc advanced the file position; if you want to replace the character you just read, you need to rewind back to the same position you were in when you read the character to replace.
Note that the C standard mandates a seek-like operation when you switch between reading and writing (and between writing and reading).
§7.21.5.3 The fopen function ¶7:
¶7 When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values), both input and output may be performed on the associated stream. However, output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), and input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file.
Also, calling fgetc() moves the file position forward one character; if the write worked (it's undefined behaviour if you omit the seek-like operation), you'd overwrite the next character, not the one you just read.
Your intuition is correct: two of the three fseek calls in this program are unnecessary.
The necessary fseek is the one inside the if((letter=='a') || (letter=='e') || (letter=='i')) conditional. That one is needed to back up the file position so you overwrite the character you just read (i.e. the vowel), not the character after the vowel.
The fseek inside the loop (but outside the if) is unnecessary because both fgetc and fwrite advance the file position, so it will always set the file position to the position it already has. And the fseek before the loop is unnecessary because you do not need to know how big the file is to implement this algorithm.
This code can be tightened up considerably. I'd write it like this:
#include <stdio.h>
#include <stdlib.h> // for exit
void replace_aei_with_5_in_place(const char *filename)
{
    FILE *fp = fopen(filename, "r+"); // (1)
    if (!fp) {
        perror(filename); // (2)
        exit(1);
    }
    int letter;
    while ((letter = fgetc(fp)) != EOF) { // (3)
        if (letter == 'a' || letter == 'e' || letter == 'i') { // (4)
            fseek(fp, -1, SEEK_CUR); // (5)
            fputc('5', fp);
            if (fflush(fp)) { // (6)
                perror(filename);
                exit(1);
            }
        }
    }
    if (fclose(fp)) { // (7)
        perror(filename);
        exit(1);
    }
}
int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s filename\n", argv[0]);
        return 1;
    }
    replace_aei_with_5_in_place(argv[1]); // (8)
    return 0;
}
Notes:
(1) It is often (but not always) better to write operations with side effects, like fopen, separately from conditionals checking whether they succeeded.
(2) When a system-level operation fails, always print both the name of any file involved, and the decoded value of errno. perror(filename) is a convenient way to do this.
(3) You don't need to know the size of the file you're crunching because you can use a loop like this, instead. Also, this is an example of an exception to (1).
(4) Why not 'o' and 'u' also?
(5) Here's the necessary call to fseek, and the other reason you don't need to know the size of the file: you can use SEEK_CUR to back up by one character.
(6) This fflush is necessary because we're switching from writing to reading, as stated in Jonathan Leffler's answer. Inconveniently, it also consumes the notification for some (but not all) I/O errors, so you have to check whether it failed.
(7) Because you are writing to the file, you must also check for delayed I/O errors, reported only on fclose. (This is a design error in the operating system, but one that we are permanently stuck with.)
(8) Best practice is to pass the name of the file to munge on the command line, not to hardcode it into the program.
@Jonathan Leffler explains well why the code uses multiple fseek() calls: to cope with changing between reading and writing.
int size=ftell(fp); is weak because ftell() returns a long, whose range may exceed that of int.
Seeking in a text file (as OP has) also risks undefined behavior (UB).
For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET. C17dr § 7.21.9.2 4.
Better to use an approach like @zwol's, with a small change.
Do not assume a smooth linear mapping. Instead, note the location and then return to it as needed.
int replacement = '5';
for (;;) {
long position = ftell(fp);
if (position == -1) {
perror(filename);
exit(1);
}
int letter = fgetc(fp);
if (letter == EOF) {
break;
}
if (letter == 'a' || letter == 'e' || letter == 'i') {
fseek(fp, position, SEEK_SET);
fputc(replacement, fp);
if (fflush(fp)) {
perror(filename);
exit(1);
}
}
}
Research fgetpos(), fsetpos() for an even better solution that handles all file sizes, even ones longer than LONG_MAX.
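As a rough illustration of that suggestion, the same loop could be written with fgetpos() and fsetpos() along these lines. This is only a sketch: replace_aei_using_fgetpos is a hypothetical name, the stream is assumed to be already open in an update mode such as "r+", and errors are reported through the return value instead of calling exit().

```c
#include <stdio.h>

/* Replace every 'a', 'e' or 'i' in an update-mode stream with '5'.
   Returns the number of replacements, or -1 on an I/O error. */
long replace_aei_using_fgetpos(FILE *fp)
{
    long replaced = 0;
    for (;;) {
        fpos_t position;
        if (fgetpos(fp, &position) != 0)      /* remember where this character starts */
            return -1;
        int letter = fgetc(fp);
        if (letter == EOF)
            break;
        if (letter == 'a' || letter == 'e' || letter == 'i') {
            if (fsetpos(fp, &position) != 0)  /* go back; also the required read-to-write switch */
                return -1;
            if (fputc('5', fp) == EOF)
                return -1;
            if (fflush(fp) != 0)              /* required before reading again */
                return -1;
            replaced++;
        }
    }
    return ferror(fp) ? -1 : replaced;
}
```

Because fpos_t is an opaque type, this does not depend on the file offset fitting in a long.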

How to save every line in file (IN C) in a variable? :)

I need to save every line of text file in c in a variable.
Here's my code
int main()
{
char firstname[100];
char lastname[100];
char string_0[256];
char string[256] = "Vanilla Twilight";
char string2[256];
FILE *file;
file = fopen("record.txt","r");
while(fgets(string_0,256,file) != NULL)
{
fgets(string2, 256, file);
printf("%s\n", string2);
if(strcmp(string, string2)==0)
printf("A match has been found");
}
fclose(file);
return 0;
}
Some lines are stored in the variable and printed on the cmd but some are skipped.
What should I do? When I tried sscanf(), all lines were complete but only the first word of each line is printed. I also tried ffscanf() but isn't working too. In fgets(), words per line are complete, but as I've said, some lines are skipped (even the first line).
I'm just a beginner in programming, so I really need help. :(
Every other line is skipped because each iteration makes two successive fgets() calls but only one strcmp(): the line read in the loop condition is never printed or compared. Reduce your code to
while(fgets(string_0,256,file) != NULL)
{
    if( ! strcmp(string_0, string) )
        printf("A match has been found\n");
}
FWIW, fgets() reads and stores the trailing newline, which can cause problems in string comparison; you need to take care of that, too.
As a note, you should always check the return value of fopen() for success before using the returned pointer.
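One common way to take care of the trailing newline is to cut the line at the first '\n' with strcspn() before comparing. A minimal sketch, assuming the caller has already opened the file and checked the pointer (count_matching_lines is a hypothetical helper name):

```c
#include <stdio.h>
#include <string.h>

/* Count the lines of fp that are exactly equal to target. */
int count_matching_lines(FILE *fp, const char *target)
{
    char line[256];
    int matches = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* drop the newline kept by fgets */
        if (strcmp(line, target) == 0)
            matches++;
    }
    return matches;
}
```

With the question's data this would be called as count_matching_lines(file, "Vanilla Twilight") after a successful fopen("record.txt", "r").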

Array of strings being overwritten

I have a program that is trying to take a text file that consists of the following and feed it to my other program.
Bruce, Wayne
Bruce, Banner
Princess, Diana
Austin, Powers
This is my C code. It is trying to get the number of lines in the file, parse the comma-separated keys and values, and put them all in a list of strings. Lastly, it is trying to iterate through the list of strings and print them out. The output of this is just Austin Powers over and over again. I'm not sure if the problem is how I'm appending the strings to the list or how I'm reading them off.
#include<stdio.h>
#include <stdlib.h>
int main(){
char* fileName = "Example.txt";
FILE *fp = fopen(fileName, "r");
char line[512];
char * keyname = (char*)(malloc(sizeof(char)*80));
char * val = (char*)(malloc(sizeof(char)*80));
int i = 0;
int ch, lines;
while(!feof(fp)){
ch = fgetc(fp);
if(ch == '\n'){ //counts how many lines there are
lines++;
}
}
rewind(fp);
char* targets[lines*2];
while (fgets(line, sizeof(line), fp)){
strtok(line,"\n");
sscanf(line, "%[^','], %[^',']%s\n", keyname, val);
targets[i] = keyname;
targets[i+1] = val;
i+=2;
}
int q = 0;
while (q!=i){
printf("%s\n", targets[q]);
q++;
}
return 0;
}
The problem is with the two lines:
targets[i] = keyname;
targets[i+1] = val;
These do not make copies of the string - they only copy the address of whatever memory they point to. So, at the end of the while loop, each pair of target elements point to the same two blocks.
To make copies of the string, you'll either have to use strdup (if provided), or implement it yourself with strlen, malloc, and strcpy.
Also, as @mch mentioned, you never initialize lines, so while it may be zero, it may also be any garbage value (which can cause char* targets[lines*2]; to fail).
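If strdup is not available, the copy can be sketched with strlen, malloc, and strcpy as below. copy_string is a hypothetical helper name, and the caller is assumed to free() each copy when done.

```c
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated copy of s, or NULL if malloc fails.
   The caller owns the copy and must free() it. */
char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);   /* +1 for the terminating '\0' */
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}
```

In the question's loop this would become targets[i] = copy_string(keyname); and targets[i+1] = copy_string(val);, so each element points at its own block instead of the two shared buffers.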
First, open the file. Then, in the while loop, read one character at a time and test for \n or EOF to decide when to stop. Inside the loop, if you get anything other than an alphanumeric character, end the current token and store it in the string array. Increment the count when you encounter \n or EOF. A do { } while (ch != EOF); loop works better here.

Passing a file as an argument and reading the data

So I am trying to write a C code that takes in a file name as the argument and reads the file and stores it into an array. I have tried but failed epically :(
Can anyone please point me in the right direction? Here is what I came up with (I know it may be completely off track :/ )
#include <stdio.h>
int main (int argc, char *argv[]) {
char content[500];
int k=0;
FILE* inputF;
inputF = fopen("argv[0]", "r");
do {
fscanf(inputF, "%c", &content[k]);
k++;
} while (content[k] != EOF );
return 0;
}
You passed the string literal "argv[0]" to fopen; I'm sure that isn't the name of the file you are trying to open.
You should pass a pointer to a string that contains the file name.
inputF = fopen(argv[1], "r");
Also note the usage of argv[1], not argv[0].
argv[0] holds the name used to invoke the executable (often including its path), and argv[1] holds the first string entered as a command-line parameter.
A couple of points to help get you started:
argc is the number of arguments, and the first argv pointer is the name of the executable file. The second is what you want.
You have to check that your file pointer is valid before trying to use it.
Maybe look at using fgetc to read each character, and test for EOF.
You need to check that you don't overrun your content buffer.
If you're stuck, here's an example of a main loop using a do while:
do {
ch = fgetc(fp);
content[a] = ch;
a++;
} while (ch != EOF && a < 500);
Note that this also stores the EOF value (if reached) in your array. Declare ch as int so that the comparison with EOF works.
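Putting those points together, one possible shape for the reading part is sketched below. read_file_into is a hypothetical helper name; it checks the file pointer, stops before the buffer overflows, and leaves EOF out of the array.

```c
#include <stdio.h>

/* Read at most capacity - 1 bytes of the file at path into content and
   NUL-terminate it. Returns the byte count, or -1 if capacity is 0 or
   the file cannot be opened. */
long read_file_into(const char *path, char *content, size_t capacity)
{
    if (capacity == 0)
        return -1;
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    size_t n = 0;
    int ch;   /* int, so EOF is distinguishable from data */
    while (n + 1 < capacity && (ch = fgetc(fp)) != EOF)
        content[n++] = (char)ch;
    content[n] = '\0';
    fclose(fp);
    return (long)n;
}
```

From main, after checking that argc is at least 2, this could be called as read_file_into(argv[1], content, sizeof content).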

Find and Replace in a C File

The problem is to find and replace a string in a file using C.
I am new to file handling in C. I have tried the following code but I didn't get any output:
#include<stdio.h>
#include<string.h>
int main()
{
FILE *f1,*f2;
char *src,*dest,*s1,ch,ch1,ch2,ch3;
int i;
f1=fopen("input.txt","rw");
f2=fopen("dummy.txt","rw");
src="mor";
dest="even";
while(ch!=EOF)
{
ch=fgetc(f1);
if(ch==src[0]) //Finding 1st char of src
{
fgets(s1,strlen(src),f1);
if(strcmp(src+1,s1)==0) //Finding occurance of "src" in file
{
fseek(f1,strlen(src)-1,SEEK_CUR);
while(ch1!=EOF) //Copying remaining data into another file
{
ch1=fgetc(f1);
fputc(ch1,f2);
}
fseek(f1,-strlen(src),SEEK_CUR);
for(i=0;i<strlen(dest);i++) //replacing "src" with "dest"
{
ch2=dest[i];
fputc(ch2,f1);
}
fclose(f1);
f1=fopen("input.txt","a");
while(ch3!=EOF) //Appending previosly copied data into 1st file
{
ch3=fgetc(f2);
fputc(ch3,f1);
}
}
}
}
fclose(f1);
fclose(f2);
}
The Contents of input.txt is "morning".
Kindly point the ERROR in the logic and also give an efficient code for the same.
Thanks in Advance.
Reading files in C is usually a bit messy. The first problem that I see is the way ch is used in the main loop. The first time
while (ch != EOF)
is executed, ch is uninitialized, and if it happens to hold EOF, the main loop will not execute at all. I usually use the following structure for reading from files:
FILE *fInput = fopen("input.txt", "r");
int ch; /* need an int to hold EOF */
for (;;)
{
ch = fgetc(fInput);
if (ch == EOF) break;
...
}
In addition, you may need to read up on how the file position works. For example, after reading the remainder of src, you fseek() forward and skip some more characters before you copy data to f2. Essentially, you read m, then read or with fgets() (into an unallocated buffer s1, which will go ka-boom on you some time in the near future), skip 2 more characters (now your position is at the last n of "morning"), copy "ng" into f2, and try to write EOF to f2 in this loop (hence the above pattern for reading until EOF). You then seek two characters back (which may fail once you reach EOF; my C file functions are a bit rusty these days) and write "even" to f1. If I am wrong about seeking after EOF, that should set the input file to "mornieven"; if I am correct, it leaves the file unchanged. In summary, I don't think the code does what you intend it to do.
I would recommend building up your function. Each one of the following can be written as a program that you should test and finish before going to next step:
1. read the file safely, and print it out
2. detect the contents of src, and print the rest of input
3. save the rest of the input to second file instead of printing
4. replace src with dest in first file, and ignore the rest (since you open input file with 'rw', this will truncate the rest of input). You may need to do an fseek() to clear the EOF status. Also look at ftell() to record a position that you can jump back to using fseek()
5. finally, copy in everything you have saved to second file after replacing src with dest (no need to close f1 here. But it is better to open f2 as write, close after copy from first file, and reopen as read to perform the copy back to f1).
In addition, when you need a buffer (such as s1), just use a large enough array for now, but look into malloc() and free() functions to perform dynamic memory allocations for situations like these.
One simple way to do the replace would be to first read in the whole file into a buffer
e.g.
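FILE* fpIn = fopen("file.txt","rb");
fseek(fpIn, 0L, SEEK_END);
long s = ftell(fpIn); /* ftell returns a long */
fseek(fpIn, 0L, SEEK_SET);
char* buf = malloc(s); /* char*, so pointer arithmetic on it is valid */
fread(buf,1,s,fpIn);
fclose(fpIn);
now while writing the file, check for your string
/* assumption: the result is written back to the same file */
FILE* fpOut = fopen("file.txt","wb");
char src[] = "mor";
char dest[] = "even";
size_t lenSrc = strlen(src);
size_t lenDest = strlen(dest);
for (char* ch = buf; ch < buf + s; )
{
    /* only compare when at least lenSrc bytes remain in the buffer */
    if ( (size_t)(buf + s - ch) >= lenSrc && !memcmp( ch, src, lenSrc ) )
    {
        fwrite( dest, 1, lenDest, fpOut );
        ch += lenSrc; /* skip past the matched text */
    }
    else
    {
        fputc( *ch, fpOut );
        ++ch;
    }
}
fclose(fpOut);
free(buf);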
disclaimer: haven't compiled this
You are printing the wrong thing in your output. Print ch, not the file pointer.
while(ch!=EOF)
{
ch=getc(f1);
printf("%c",ch);
}
while(ch!=EOF)
{
ch=getc(f2);
printf("%c",ch);
}
Also, f2 is closed at the end during your output. You'll have to reopen it (just like you do with f1.)
At first glance, I see that your code to call fgets is wrong. You have not allocated any memory and you are reading a string into an uninitialized pointer. Read into an array or dynamically allocated memory.
Another problem is that you are declaring ch as char. fgetc() returns an int, and for good reason: it has to be able to return any possible character or EOF, so EOF must not be a valid character value, which means fgetc() needs a return type bigger than char.
The upshot is that the loop may well never end, since ch can't possibly hold EOF on some standard-conforming implementations (those where char is unsigned). Declare it (and ch1 and ch3) as int.
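For the replacement itself, a simpler route than seeking around in the input file is to hold the text in memory and write the result to a separate output stream. The following is a minimal sketch under that assumption, not the asker's method: write_replaced is a hypothetical helper, the text is assumed to be NUL-terminated, and strstr() is used for the search.

```c
#include <stdio.h>
#include <string.h>

/* Write text to out, replacing every occurrence of src with dest.
   Returns the number of replacements, or -1 if src is empty. */
int write_replaced(const char *text, const char *src, const char *dest, FILE *out)
{
    size_t src_len = strlen(src);
    if (src_len == 0)
        return -1;
    int count = 0;
    const char *hit;
    while ((hit = strstr(text, src)) != NULL) {
        fwrite(text, 1, (size_t)(hit - text), out);  /* unchanged part before the match */
        fputs(dest, out);
        text = hit + src_len;                        /* continue after the match */
        count++;
    }
    fputs(text, out);                                /* tail after the last match */
    return count;
}
```

With the question's input, write_replaced("morning", "mor", "even", out) writes "evenning" to out.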
