Counting characters one by one from a textfile - c

My program needs to read input from a text file and output it on a single line.
However, if the number of characters on a single line exceeds 60, a new line should start.
The code I have so far:
int main(int argc, char *argv[]){
FILE *pFile;
char x[10];
char *token;
pFile = fopen("test5.txt","r");
int wordLength;
int totalCharLength;
char newline[10] = "newline";
if(pFile != NULL){
while (fgets(x, sizeof(x), pFile) != NULL) {
token = strtok(x,"\r\n");
if(token != NULL){
printf("%s",x);
}
else{
printf("%s",x);
}
//counter++;
wordLength = strlen(x);
totalCharLength += wordLength;
printf("%d",totalCharLength);
if(totalCharLength == 60){
printf("%s",newline);
}
}
fclose(pFile);
}
}
Textfile:
a dog chased
a cat up the
tree. The cat
looked at the dog from
the top of the tree.
The text file has 5 separate lines, as displayed above; it is not all on one line.
Output:
a dog chased a cat up the tree. The cat looked at the dog from the top of the tree.
Now the program needs to take input text in that format and print it out on a single line.
So in the above output there should be a new line at the 60th character, which is before the second word "dog".
To do this, I want to add a character counter that prints a new line when the character count = 60.
Now with some more debugging I added the code
printf("%d",totalCharLength);
just before the if(totalCharLength==60) line and I realize that the characters get counted in random increments instead of one by one. Output:
a dog cha9sed 13a cat up 22the 26tree. The35 cat 40looked at49 the dog 58 from 63 the top o72f the tre81e. 85
So this shows that it does not count character by character. However, when I change char x[10] to a lower value, it does not print everything on one line; it leaves gaps.
Example: changing char x[10] to char x[2].
This correctly counts character by character, but the output is not all on one line.
The only time there should be a new line is when the characters on a line exceed 60, not after 14 (the first line).
a2 3d4o5g6 7c8h9a10s11e12d13 14
30a17 18c19a20t21 22u23p24 25t26h27e28 29
46t32r33e34e35.36 37T38h39e40 41c42a43t44 45
71l48o49o50k51e52d53 54a55t56 57t58h59e60newline 61d62o63g64 65f66r67o68m69 70
72t73h74e75 76t77o78p79 80o81f82 83t84h85e86 87t88r89e90e91.92 93 94

I think you are best off printing each character on its own and checking for a space before adding the newline.
Something like:
if ((charCount > 60) && (currentChar == ' ')) {
// print newline and reset counter
printf("\n");
charCount = 0;
}
Edit:
Before anything else, check whether the current character is itself a newline and replace it with a space.
Skip the carriage return \r as well, since on Windows line endings are \r\n.
#include <stdio.h>
int main(void) {
FILE *f = fopen("test.txt", "r");
if(!f) {
printf("File not found.\n");
return 1;
}
int c; /* int, not char: fgetc() returns an int so that EOF stays distinct from every valid character */
int cCount = 0;
while((c = fgetc(f)) != EOF) {
if(c == '\r') continue;
if(c == '\n') c = ' ';
printf("%c", c);
if((cCount++ > 60) && (c == ' ')) {
printf("\n");
cCount = 0;
}
}
fclose(f);
return 0;
}
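The loop above breaks at the first space after the count passes 60, so a line can run slightly over the limit. As a minimal sketch of a stricter variant (the name wrap_words and the 255-character word buffer are assumptions made for illustration, not part of the answer), each word can be buffered and the line broken before any word that would not fit:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: copy `in` to `out`, joining all lines into one stream
   and starting a new line before any word that would push past `width`.
   Words longer than 255 characters are split into pieces; that limit is arbitrary. */
void wrap_words(FILE *in, FILE *out, size_t width)
{
    char word[256];
    size_t wlen = 0;   /* length of the word being collected */
    size_t col = 0;    /* characters already written on the current line */
    int c;

    assert(width > 0);
    for (;;) {
        c = fgetc(in);
        if (c != EOF && !isspace(c) && wlen < sizeof word - 1) {
            word[wlen++] = (char)c;
            continue;
        }
        if (wlen > 0) {                      /* a word just ended: place it */
            word[wlen] = '\0';
            if (col > 0 && col + 1 + wlen > width) {
                fputc('\n', out);            /* word does not fit: wrap first */
                col = 0;
            }
            if (col > 0) {                   /* separate words with one space */
                fputc(' ', out);
                col++;
            }
            fputs(word, out);
            col += wlen;
            wlen = 0;
        }
        if (c == EOF)
            break;
        if (!isspace(c))                     /* first char after an over-long piece */
            word[wlen++] = (char)c;
    }
    if (col > 0)
        fputc('\n', out);
}
```

Calling wrap_words(f, stdout, 60) in place of the while loop would keep every line at 60 characters or fewer, provided no single word is longer than that.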

There you go
int main() {
FILE *pFile = fopen("test.txt","r");
int c, counter = 0;
if (pFile != NULL) {
while ((c = getc(pFile)) != EOF) {
counter++;
if (c != '\n') { putchar(c); }
if (counter == 60) {
counter = 0;
printf("\n");
}
}
fclose(pFile); /* close only a stream that was actually opened */
}
}

if(token != NULL){
printf("%s",x);
strcat(newArray, x);
}
After printing out the token, have a separate array that you concatenate the token to. This will give you the entire file on one line. From there it will be much easier to split off 60 characters at a time and print the lines as you go.
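A rough sketch of that idea, under the assumption that the whole file fits in a caller-supplied buffer and uses LF or CRLF line endings. The names join_lines and print_chunks are hypothetical, and memcpy with an explicit length stands in for strcat so the buffer cannot overflow:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append every line of `in` to `dst` (capacity `cap`),
   writing a single space where each line ended. Returns the final length,
   or (size_t)-1 if the text does not fit. */
size_t join_lines(FILE *in, char *dst, size_t cap)
{
    char chunk[64];
    size_t len = 0;

    assert(cap > 0);
    dst[0] = '\0';
    while (fgets(chunk, sizeof chunk, in) != NULL) {
        size_t n = strcspn(chunk, "\r\n");        /* text before the line ending */
        int ended = strchr(chunk, '\n') != NULL;  /* did this chunk finish a line? */
        if (len + n + 2 > cap)                    /* room for text, space and '\0' */
            return (size_t)-1;
        memcpy(dst + len, chunk, n);
        len += n;
        if (ended)
            dst[len++] = ' ';
        dst[len] = '\0';
    }
    return len;
}

/* Print `text` in pieces of at most `width` characters, one piece per line. */
void print_chunks(const char *text, size_t width, FILE *out)
{
    size_t len = strlen(text);

    assert(width > 0);
    for (size_t i = 0; i < len; i += width) {
        size_t piece = len - i < width ? len - i : width;
        fprintf(out, "%.*s\n", (int)piece, text + i);
    }
}
```

With a buffer such as char all[4096], join_lines(pFile, all, sizeof all) followed by print_chunks(all, 60, stdout) would print the file in 60-character rows. This breaks in the middle of words, just like a plain character counter does.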

It goes like this,
#include<stdio.h>
//#pragma warning(disable : 4996)
int main()
{
FILE *pfile;
int data; /* int, so the comparison with EOF is reliable */
int count = 1;
pfile = fopen("test.txt", "r");
printf("Opening file...\n");
if (pfile == NULL)
{
printf("Error!\n");
return 1; /* nothing to read; do not fall through to fgetc(NULL) */
}
while ((data = fgetc(pfile)) != EOF)
{
if (data == '\n')
{
continue; /* drop the file's own line breaks */
}
if (count == 60)
{
printf("\n"); /* wrap first, so the character just read is not lost */
count = 1;
}
printf("%c", data);
count++;
}
fclose(pfile);
return 0;
}

In the first case, when you had char x[10], fgets will read up to 9 characters, or until a newline is found. So the output is
a dog cha9sed 13a cat up 22the 26
because the first call to fgets reads 9 characters (and then prints 9), the second call reads up to the newline (prints 13), the third call reads nine characters (prints 22), and the fourth call reads up to the newline (prints 26).
When you change to char x[2], fgets will only read one character at a time. And this causes strtok to work differently than you expect. When a string contains only delimiter characters, strtok will return NULL and the string is unmodified. So if the string contains only a newline character, strtok won't remove the newline, like you expect.
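The strtok behaviour described above is easy to check in isolation. A small sketch follows; strtok_finds_token and chomp are hypothetical helper names, and chomp uses strcspn, a common alternative that is not part of the original code:

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if strtok() found a token in `buf`, 0 if it returned NULL.
   When it returns NULL, `buf` is left exactly as it was. */
int strtok_finds_token(char *buf)
{
    return strtok(buf, "\r\n") != NULL;
}

/* Cut `buf` at the first '\r' or '\n'. Unlike strtok(), this also
   works when the string consists of nothing but a newline. */
void chomp(char *buf)
{
    buf[strcspn(buf, "\r\n")] = '\0';
}
```

For a buffer holding "cat\n", strtok_finds_token returns 1 and the buffer becomes "cat". For a buffer holding only "\n", it returns 0 and the newline is still there, which is why the char x[2] version prints the line breaks.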
To fix the code, use fgetc to read characters and don't use strtok at all.
#include <stdio.h>
#include <stdlib.h> /* for exit() */
int main( void )
{
FILE *pFile;
if ( (pFile = fopen("test5.txt","r")) == NULL )
exit( 1 );
int c, count;
count = 0;
while ( (c = fgetc(pFile)) != EOF ) {
if ( c == '\n' )
putchar( ' ' );
else
putchar( c );
count++;
if ( count == 60 ) {
putchar( '\n' );
count = 0;
}
}
putchar( '\n' );
fclose(pFile);
}
Here's the file that I tested with
a dog chased
a cat up the
tree. The cat
looked at the dog from
the top of the tree.

#include <stdio.h>
#include <ctype.h>
int main(void){//unused argv
FILE *pFile;
int ch, prev = 0;
int totalCharLength = 0;
int wrap_size = 60;
//char newline[] = "newline";
pFile = fopen("test5.txt","r");
if(pFile != NULL){
while((ch = fgetc(pFile)) != EOF){
if(isspace(ch))
ch = ' ';
if(prev == ' ' && ch == ' ')
continue;//unified
prev = ch;
putchar(ch);
if(++totalCharLength == wrap_size -1){//-1 : include newline
putchar('\n');//printf("%s", newline);
totalCharLength = 0;
}
}
fclose(pFile);//put inside if-block
}
return 0;
}

Related

Counting chars, words and lines in a file

I am trying to count the number of characters, words and lines in a file.
The txt file is:
The snail moves like a
Hovercraft, held up by a
Rubber cushion of itself,
Sharing its secret
And here is the code:
void count_elements(FILE* fileptr, char* filename, struct fileProps* properties) // counts chars, words and lines
{
fileptr = fopen(filename, "rb");
int chars = 0, words = 0, lines = 0;
char ch;
while ((ch = fgetc(fileptr)) != EOF )
{
if(ch != ' ') chars++;
if (ch == '\n') // check lines
lines++;
if (ch == ' ' || ch == '\t' || ch == '\n' || ch == '\0') // check words
words++;
}
fclose(fileptr);
properties->char_count = chars;
properties->line_count = lines;
properties->word_count = words;
}
But when I print the number of chars, words and lines, the outputs are 81, 18 and 5 respectively.
What am I missing?
(The read mode does not change anything; I tried "r" as well.)
The solution I whipped up gives me the same results as the gedit document statistics:
#include <stdio.h>
void count_elements(char* filename)
{
// This can be a local variable as its not used externally. You do not have to put it into the functions signature.
FILE *fileptr = fopen(filename, "rb");
int chars = 0, words = 0, lines = 0;
int read;
unsigned char last_char = ' '; // Save the last char to see if really a new word was there or multiple spaces
while ((read = fgetc(fileptr)) != EOF) // Read is an int as fgetc returns an int, which is a unsigned char that got casted to int by the function (see manpage for fgetc)
{
unsigned char ch = (unsigned char)read; // This cast is safe: EOF was already ruled out, so read holds an unsigned char value.
if (ch >= 33 && ch <= 126) // only do printable chars without spaces
{
++chars;
}
else if (ch == '\n' || ch == '\t' || ch == '\0' || ch == ' ')
{
// Only if the last character was printable we count it as new word
if (last_char >= 33 && last_char <= 126)
{
++words;
}
if (ch == '\n')
{
++lines;
}
}
last_char = ch;
}
fclose(fileptr);
printf("Chars: %d\n", chars);
printf("Lines: %d\n", lines);
printf("Words: %d\n", words);
}
int main()
{
count_elements("test");
}
Please see the comments in the code for remarks and explanations. The code also ignores other control characters, such as the CR in Windows CRLF line endings, and counts only the LF.
Your function takes both a FILE* and filename as arguments and one of them should be removed. I've removed filename so that the function can be used with any FILE*, like stdin.
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
typedef struct { /* type defining the struct for easier usage */
uintmax_t char_count;
uintmax_t word_count;
uintmax_t line_count;
} fileProps;
/* a helper function to print the content of a fileProps */
FILE* fileProps_print(FILE *fp, const fileProps *p) {
fprintf(fp,
"chars %ju\n"
"words %ju\n"
"lines %ju\n",
p->char_count, p->word_count, p->line_count);
return fp;
}
void count_elements(FILE *fileptr, fileProps *properties) {
if(!fileptr) return;
properties->char_count = 0;
properties->line_count = 0;
properties->word_count = 0;
int ch; /* int, because fgetc() returns an int and EOF must not collide with a valid character */
while((ch = fgetc(fileptr)) != EOF) {
++properties->char_count; /* count all characters */
/* use isspace() to check for whitespace characters */
if(isspace((unsigned char)ch)) {
++properties->word_count;
if(ch == '\n') ++properties->line_count;
}
}
}
int main() {
fileProps p;
FILE *fp = fopen("the_file.txt", "r");
if(fp) {
count_elements(fp, &p);
fclose(fp);
fileProps_print(stdout, &p);
}
}
Output for the file you showed in the question:
chars 93
words 17
lines 4
Edit: I just noticed your comment "trying to count only alphabetical letters as a char". For that you can use isalpha and replace the while loop with:
while((ch = fgetc(fileptr)) != EOF) {
if(isalpha((unsigned char)ch)) ++properties->char_count;
else if(isspace((unsigned char)ch)) {
++properties->word_count;
if(ch == '\n') ++properties->line_count;
}
}
Output with the modified version:
chars 74
words 17
lines 4
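Both loops above count one word per whitespace character, so consecutive spaces or a file without a final newline would shift the total. As a hedged sketch of an alternative (textStats and count_text are illustrative names, not from the answer), words can be counted at the transition from whitespace to non-whitespace:

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {       /* illustrative struct, separate from fileProps */
    uintmax_t letters;
    uintmax_t words;
    uintmax_t lines;
} textStats;

void count_text(FILE *in, textStats *s)
{
    int ch;            /* int, so EOF stays distinct from every byte value */
    int in_word = 0;   /* 1 while the previous character was non-whitespace */

    assert(in != NULL && s != NULL);
    s->letters = s->words = s->lines = 0;
    while ((ch = fgetc(in)) != EOF) {
        if (isalpha(ch))
            ++s->letters;
        if (isspace(ch)) {
            in_word = 0;
            if (ch == '\n')
                ++s->lines;
        } else if (!in_word) {
            in_word = 1;   /* first character of a new word */
            ++s->words;
        }
    }
}
```

For the input "a  b\n\nc" this yields 3 letters, 3 words and 2 lines, whereas counting whitespace characters would report 4 words.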
A version capable of reading "wide" characters (multibyte):
#include <locale.h>
#include <stdint.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>
typedef struct {
uintmax_t char_count;
uintmax_t word_count;
uintmax_t line_count;
} fileProps;
FILE* fileProps_print(FILE *fp, const fileProps *p) {
fprintf(fp,
"chars %ju\n"
"words %ju\n"
"lines %ju\n",
p->char_count, p->word_count, p->line_count);
return fp;
}
void count_elements(FILE *fileptr, fileProps *properties) {
if(!fileptr) return;
properties->char_count = 0;
properties->line_count = 0;
properties->word_count = 0;
wint_t ch;
while((ch = fgetwc(fileptr)) != WEOF) {
if(iswalpha(ch)) ++properties->char_count;
else if(iswspace(ch)) {
++properties->word_count;
if(ch == '\n') ++properties->line_count;
}
}
}
int main() {
setlocale(LC_ALL, "sv_SE.UTF-8"); // set your locale
FILE *fp = fopen("the_file.txt", "r");
if(fp) {
fileProps p;
count_elements(fp, &p);
fclose(fp);
fileProps_print(stdout, &p);
}
}
If the_file.txt contains one line with öäü it'll report
chars 3
words 1
lines 1
and for your original file, it'd report the same as above.

C programming- read from text file

This is about reading from a text file.
I have 3 command-line arguments:
name of the text file
delay time
how many lines to read
I want to read the text file in batches of a user-specified number of lines until the file ends.
For example, the first time I read 5 lines, and then the program asks "how many line(s) do you want to read?". If I enter 7, it reads lines 6 to 12.
This repeats until the end of the file.
#include <stdlib.h>
#include <stdio.h>
#include<time.h>
#include <string.h>
void delay(unsigned int mseconds)
{
clock_t goal = mseconds + clock();
while (goal > clock());
}
int countlines(const char *filename) {
FILE *fp = fopen(filename, "r");
int ch, last = '\n';
int lines = 0;
if (fp != NULL) {
while ((ch = fgetc(fp)) != EOF) {
if (ch == '\n')
lines++;
last = ch;
}
fclose(fp);
if (last != '\n')
lines++;
}
return lines;
}
int main(int argc, char *arg[])
{
FILE *ptDosya;
char ch;
ch = arg[1][0];
int s2;
int satir = 0;
int spaceCounter=0;
int lineCount, x = 0;
lineCount = atoi(arg[3]);
s2 = atoi(arg[2]);
printf("dosya %d satir icerir.\n", countlines(arg[1]));
ptDosya = fopen(arg[1], "r");
if (ptDosya != NULL)
{
while (ch != EOF&& x < lineCount)
{
ch = getc(ptDosya);
printf("%c", ch);
if (ch == '\n')
{
delay(s2);
x++;
}
}
while (x < countlines(arg[1]))
{
printf("satir sayisi giriniz:");
scanf("%d", &lineCount);
// i don't know what should i do in this loop..
x=x+lineCount;
}
}
else {
printf("dosya bulunamadi");
}
printf("\n\nend of file!\n");
fclose(ptDosya);
return 0;
system("PAUSE");
}
Your delay function uses a busy loop. This is unnecessarily expensive in terms of computing power, and it would be very unwelcome on a battery-operated device. Furthermore, clock() does not necessarily return a number of milliseconds. The unit used by the clock() function can be determined using the CLOCKS_PER_SEC macro. Unfortunately, there is no portable way to specify a delay expressed in milliseconds; POSIX-conformant systems have usleep() and nanosleep().
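As an illustration of the non-spinning alternative, here is a minimal sketch built on nanosleep(). It assumes a POSIX system, and delay_ms is a hypothetical replacement name for delay:

```c
#define _POSIX_C_SOURCE 199309L   /* expose nanosleep() under strict -std= modes */
#include <errno.h>
#include <time.h>

/* Sleep for roughly `ms` milliseconds without spinning the CPU.
   POSIX only. Returns 0 on success, -1 on error. */
int delay_ms(unsigned int ms)
{
    struct timespec req;

    req.tv_sec = ms / 1000;
    req.tv_nsec = (long)(ms % 1000) * 1000000L;
    /* When a signal interrupts nanosleep(), it stores the unslept time
       in its second argument, so retry with the remainder. */
    while (nanosleep(&req, &req) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}
```

delay_ms(s2) could then stand in for delay(s2), with the process sleeping instead of polling clock().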
Your line counting function is incorrect: you count 1 line too many, unless the file ends without a trailing linefeed.
Here is an improved version:
int countlines(const char *filename) {
FILE *fp = fopen(filename, "r");
int ch, last = '\n';
int lines = 0;
if (fp != NULL) {
while ((ch = fgetc(fp)) != EOF) {
if (ch == '\n')
lines++;
last = ch;
}
fclose(fp);
if (last != '\n')
lines++;
}
return lines;
}
There are issues in the main() function too:
You do not verify that enough arguments are passed on the command line.
You do not check for EOF in the main reading loop.
You do not repeat the process in a loop until end of file, nor do you even ask the question how many line(s) do you want to read? after reading the specified number of lines...
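One way to address the last two points, sketched under the assumption that reading is factored into a helper (print_lines is a hypothetical name, not from the question):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy up to `n` lines from `in` to `out`.
   Returns the number of lines copied, which is less than `n`
   only when end of file was reached. */
int print_lines(FILE *in, FILE *out, int n)
{
    int ch = 0;
    int copied = 0;
    int pending = 0;   /* 1 while a line has been started but not finished */

    assert(n >= 0);
    while (copied < n && (ch = fgetc(in)) != EOF) {
        fputc(ch, out);
        pending = (ch != '\n');
        if (ch == '\n')
            copied++;
    }
    if (ch == EOF && pending)   /* last line had no trailing newline */
        copied++;
    return copied;
}
```

main() would first check that argc is at least 4, then call print_lines(ptDosya, stdout, lineCount) and keep prompting with scanf("%d", &lineCount) for as long as the previous call returned the full number of lines requested.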
First, if the file cannot be found, the countlines function returns zero. You should use that value to write the error message, and skip the rest of the code.
Second, in the next loop, you use
if (ch != '\n') {
printf("%c", ch);
} else {
printf("\n");
delay(s2);
x++;
}
Why the two printf statements? They will print the same thing.
Perhaps something like this:
ch = getc(ptDosya);
/* exit the loop here if you hit EOF */
printf("%c", ch); /* Why not putc() or putchar() ? */
if (ch == '\n') {
x++;
if ( x == lineCount ) {
x = 0;
lineCount = requestNumberOfLinesToRead();
} else {
sleep(s2); /* instead of delay(); note that sleep() takes seconds. Remember to #include <unistd.h> */
}
}

Counting number of lines in the file in C

I'm writing a function that counts the number of lines in the given file. Some text files may not end with a newline character.
int line_count(const char *filename)
{
int ch = 0;
int count = 0;
FILE *fileHandle;
if ((fileHandle = fopen(filename, "r")) == NULL) {
return -1;
}
do {
ch = fgetc(fileHandle);
if ( ch == '\n')
count++;
} while (ch != EOF);
fclose(fileHandle);
return count;
}
Now the function doesn't count the number of lines correctly, but I can't figure out where the problem is. I would be really grateful for your help.
Here is another option (other than keeping track of last character before EOF).
int ch;
int charsOnCurrentLine = 0;
while ((ch = fgetc(fileHandle)) != EOF) {
if (ch == '\n') {
count++;
charsOnCurrentLine = 0;
} else {
charsOnCurrentLine++;
}
}
if (charsOnCurrentLine > 0) {
count++;
}
fgets() reads until a newline character or until the buffer is full:
char buf[200];
while(fgets(buf,sizeof(buf),fileHandle) != NULL)
{
count++;
}
fgetc() is an issue here because, when the last line has no trailing newline, you hit EOF and leave the do-while loop without ever seeing a \n for that line, so count is never incremented for it. If the file consists of a single such line, the count will be 0.
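One caveat with counting fgets() calls is that a line longer than the buffer is returned in several pieces and would be counted more than once. A hedged sketch that counts a piece only when it completes a line (count_lines_fgets is an illustrative name, and the 16-byte buffer is deliberately small to show the effect):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count lines with fgets(). A piece counts as a line only if it ends
   in '\n'; a final piece without one is an unterminated last line. */
long count_lines_fgets(FILE *in)
{
    char buf[16];
    long count = 0;
    int partial = 0;   /* 1 while we are in the middle of a line */

    assert(in != NULL);
    while (fgets(buf, sizeof buf, in) != NULL) {
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            count++;
            partial = 0;
        } else {
            partial = 1;
        }
    }
    return count + partial;
}
```

This handles both long lines and a missing final newline, at the cost of one strlen() call per piece.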

Count paragraph in file

I am working with files and trying to count the paragraphs in a file, but I am not getting the right result.
Please suggest how I can do that. I tried this, but it does not give me what I want:
int main()
{
FILE *fp=fopen("200_content.txt ","r");
int pCount=0;
char c;
while ((c=fgetc(fp))!=EOF)
{
if(c=='\n'){pCount++;}
else{continue;}
}
printf("%d",pCount);
return 0;
}
You should declare c as int instead of char.
Also, remember to fclose(fp); before main() returns.
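Paragraphs are separated by two consecutive '\n's. Use a variable to count consecutive '\n's, like this:
#include <stdio.h>
int main()
{
    FILE *fp=fopen("200_content.txt","r");
    int pCount=0;
    int c; /* int, so EOF can be told apart from valid characters */
    int newln_cnt=0;
    if(fp==NULL){return 1;}
    while ((c=fgetc(fp))!=EOF)
    {
        if(c=='\n')
        {
            newln_cnt++;
            if(newln_cnt==2) /* a blank line ends a paragraph */
            {
                pCount++;
            }
        }
        else{newln_cnt=0;} /* any other character breaks the run of newlines */
    }
    fclose(fp);
    printf("%d",pCount);
    return 0;
}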
Your code counts the number of newline '\n' characters, not the empty lines which demarcate the paragraphs. Use fgets to read lines from the file. I suggest this:
#include <stdio.h>
// maximum length a line can have in the file.
// +1 for the terminating null byte added by fgets
#define MAX_LEN (100+1)
int main(void) {
char line[MAX_LEN];
FILE *fp = fopen("200_content.txt", "r");
if(fp == NULL) {
printf("error in opening the file\n");
return 1;
}
int pcount = 0;
int temp = 0;
while(fgets(line, sizeof line, fp) != NULL) {
if(line[0] == '\n') {
// if newline is found and temp is 1 then
// this means end of the paragraph. increase
// the paragraph counter pcount and set temp to 0
if(temp == 1)
pcount++;
temp = 0;
}
else {
// if a non-empty line is found, this means
// the start of the paragraph
temp = 1;
}
}
// if the last para doesn't end with empty line(s)
if(temp == 1)
pcount++;
printf("number of para in the file is %d\n", pcount);
return 0;
}
For starters, I assume that you consider a new line to be a new paragraph.
i.e.
This is line 1.
This is line 2.
has 2 paragraphs.
What your code does is neglect the case where there is an EOF and not a newline character (\n) after This is line 2.
One way to fix this is to use an extra variable that remembers the last character read.
#include <stdio.h>
int main()
{
    FILE *fp=fopen("200_content.txt","r");
    int pCount=0;
    int c; // int, because fgetc returns an int and EOF is not a valid char
    int last_c = '\n'; // record of the last character read in the loop
    if(fp==NULL){return 1;}
    while ((c=fgetc(fp))!=EOF)
    {
        if(c=='\n'){pCount++;}
        last_c = c;
    }
    if (last_c != '\n') pCount++; // the file ended without a final '\n'
    fclose(fp);
    printf("%d",pCount);
    return 0;
}
#include <ctype.h>
#include <stdio.h>
void analyze_file(const char *filename) {
    FILE *in_file = fopen(filename, "r");
    if (in_file == NULL)
    {
        printf("Error(analyze_file): Could not open file %s\n", filename);
        return;
    }
    int ch;             /* int, so EOF is distinguishable from real characters */
    int prev = '\n';    /* previous character read */
    int alpha_count = 0, num_count = 0, non_alnum = 0, charac = 0;
    int word_count = 0, line = 0, para = 0;
    int in_word = 0;    /* currently inside a word */
    int in_para = 0;    /* currently inside a paragraph */
    while ((ch = fgetc(in_file)) != EOF)   /* test the read itself, not feof() */
    {
        if (isalpha(ch))
            alpha_count++;
        else if (isdigit(ch))
            num_count++;
        else if (!isspace(ch))
            ++non_alnum;
        if (isspace(ch))
            in_word = 0;
        else
        {
            if (!in_word) { in_word = 1; word_count++; }   /* a word starts */
            if (!in_para) { in_para = 1; para++; }         /* a paragraph starts */
        }
        if (ch == '\n')
        {
            line++;
            if (prev == '\n')
                in_para = 0;   /* an empty line ends the paragraph */
        }
        prev = ch;
    }
    if (prev != '\n')
        line++;   /* last line without a trailing newline */
    charac = alpha_count + non_alnum + num_count;
    fclose(in_file);
    printf("#Paragraphs = %d\n", para);
    printf("#lines = %d\n", line);
    printf("#Words = %d\n", word_count);
    printf("#Characters = %d\n", charac);
    printf("Alpha = %d\n", alpha_count);
    printf("Numerical = %d\n", num_count);
    printf("Other = %d\n", non_alnum);
    printf("\n");
}

How to count blank lines from file in C?

So what I'm trying to do is count blank lines, which means lines containing not just '\n' but possibly space and tab characters as well. Any help is appreciated! :)
char line[300];
int emptyline = 0;
FILE *fp;
fp = fopen("test.txt", "r");
if(fp == NULL)
{
perror("Error while opening the file. \n");
system("pause");
}
else
{
while (fgets(line, sizeof line, fp))
{
int i = 0;
if (line[i] != '\n' && line[i] != '\t' && line[i] != ' ')
{
i++;
}
emptyline++;
}
printf("\n The number of empty lines is: %d\n", emptyline);
}
fclose(fp);
You should try and get your code right when posting on SO. You are incrementing both i and emptyline but then use el in your call to printf(). And then I don't know what that is supposed to be in your code where it has }ine. Please, at least make an effort.
For starters, you are incrementing emptyline for every line because it is outside of your if statement.
Second, you need to test the entire line to see if it contains any character that is not a whitespace character. Only if that is true should you increment emptyline.
int IsEmptyLine(char *line)
{
while (*line)
{
if (!isspace((unsigned char)*line++)) /* needs <ctype.h>; the cast keeps negative char values out of isspace() */
return 0;
}
return 1;
}
Increment the emptyLine counter before entering the per-character loop; if a non-whitespace character is encountered, decrement the counter and break out of the loop.
#include <stdio.h>
#include <string.h>
int getEmptyLines(const char *fileName)
{
char line[300];
int emptyLine = 0;
FILE *fp = fopen(fileName, "r"); /* use the parameter instead of a hard-coded name */
if (fp == NULL) {
printf("Error: Could not open specified file!\n");
return -1;
}
else {
while(fgets(line, 300, fp)) {
int i = 0;
int len = strlen(line);
emptyLine++;
for (i = 0; i < len; i++) {
if (line[i] != '\n' && line[i] != '\t' && line[i] != ' ') {
emptyLine--;
break;
}
}
}
fclose(fp);
return emptyLine;
}
}
int main(void)
{
const char fileName[] = "text.txt";
int emptyLines = getEmptyLines(fileName);
if (emptyLines >= 0) {
printf("The number of empty lines is %d", emptyLines);
}
return 0;
}
You are incrementing emptyline on every iteration, so you should wrap it in an else block.
Let's think of this problem logically, and let's use functions to make it clear what is going on.
First, we want to detect lines that only consist of whitespace. So let's create a function to do that.
bool StringIsOnlyWhitespace(const char * line) {
int i;
for (i=0; line[i] != '\0'; ++i)
if (!isspace((unsigned char)line[i]))
return false;
return true;
}
Now that we have a test function, let's build a loop around it.
while (fgets(line, sizeof line, fp)) {
if (StringIsOnlyWhitespace(line))
emptyline++;
}
printf("\n The number of empty lines is: %d\n", emptyline);
Note that fgets() will not return a full line (just part of it) on lines that have at least sizeof(line) characters.
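To sidestep that buffer-size limit entirely, the same test can be done one character at a time. A minimal sketch, assuming a blank line means "only whitespace before the newline" (count_blank_lines is a hypothetical name):

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Count lines that contain nothing but whitespace, reading one
   character at a time so line length does not matter. */
int count_blank_lines(FILE *in)
{
    int ch;
    int blank = 1;   /* current line has only whitespace so far */
    int any = 0;     /* current line has at least one character */
    int count = 0;

    assert(in != NULL);
    while ((ch = fgetc(in)) != EOF) {
        if (ch == '\n') {
            if (blank)
                count++;
            blank = 1;   /* start tracking the next line */
            any = 0;
        } else {
            any = 1;
            if (!isspace(ch))
                blank = 0;
        }
    }
    if (any && blank)    /* whitespace-only last line without '\n' */
        count++;
    return count;
}
```

Whether a completely empty file or a whitespace-only final line without a newline should count is a design choice; this sketch counts the latter and reports 0 for an empty file.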

Resources