Removing punctuation and capitalizing in C - c

I'm writing a program for school that asks to read text from a file, capitalizes everything, and removes the punctuation and spaces. The file "Congress.txt" contains
(Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech, or of the press; or the right of the people peaceably to assemble, and to petition the government for a redress of grievances.)
It reads in correctly but what I have so far to remove the punctuation, spaces, and capitalize causes some major problems with junk characters. My code so far is:
void processFile(char line[]) {
FILE *fp;
int i = 0;
char c;
if (!(fp = fopen("congress.txt", "r"))) {
printf("File could not be opened for input.\n");
exit(1);
}
line[i] = '\0';
fseek(fp, 0, SEEK_END);
fseek(fp, 0, SEEK_SET);
for (i = 0; i < MAX; ++i) {
fscanf(fp, "%c", &line[i]);
if (line[i] == ' ')
i++;
else if (ispunct((unsigned char)line[i]))
i++;
else if (islower((unsigned char)line[i])) {
line[i] = toupper((unsigned char)line[i]);
i++;
}
printf("%c", line[i]);
fprintf(csis, "%c", line[i]);
}
fclose(fp);
}
I don't know if it's an issue but I have MAX defined as 272 because that's what the text file is including punctuation and spaces.
My output I am getting is:
C╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠Press any key to continue . . .

The fundamental algorithm needs to be along the lines of:
while next character is not EOF
if it is alphabetic
save the upper case version of it in the string
null terminate the string
which translates into C as:
int c;
int i = 0;
while ((c = getc(fp)) != EOF)
{
if (isalpha(c))
line[i++] = toupper(c);
}
line[i] = '\0';
This code doesn't need the (unsigned char) cast with the functions from <ctype.h> because c is guaranteed to contain either EOF (in which case it doesn't get into the body of the loop) or the value of a character converted to unsigned char anyway. You only have to worry about the cast when you use char c (as in the code in the question) and try to write toupper(c) or isalpha(c). The problem is that plain char can be a signed type, so some characters, notoriously ÿ (y-umlaut, U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS), will appear as a negative value, and that breaks the requirements on the inputs to the <ctype.h> functions. This code will attempt to case-convert characters that are already upper-case, but that's probably cheaper than a second test.
What else you do in the way of printing, etc is up to you. The csis file stream is a global scope variable; that's a bit (tr)icky. You should probably terminate the output printing with a newline.
The code shown is vulnerable to buffer overflow. If the length of line is MAX, then you can modify the loop condition to:
while (i < MAX - 1 && (c = getc(fp)) != EOF)
If, as would be a better design, you change the function signature to:
void processFile(int size, char line[]) {
and assert that the size is strictly positive:
assert(size > 0);
and then the loop condition changes to:
while (i < size - 1 && (c = getc(fp)) != EOF)
Obviously, you change the call too:
char line[4096];
processFile(sizeof(line), line);

in the posted code, there is no intermediate processing,
so the following code ignores the 'line[]' input parameter
void processFile()
{
FILE *fp = NULL;
if (!(fp = fopen("congress.txt", "r")))
{
printf("File could not be opened for input.\n");
exit(1);
}
// implied else, fopen successful
unsigned int c; // must be integer so EOF (-1) can be recognized
while( EOF != (c =(unsigned)fgetc(fp) ) )
{
if( (isalpha(c) || isblank(c) ) && !ispunct(c) ) // a...z or A...Z or space
{
// note toupper has no effect on upper case characters
// note toupper has no effect on a space
printf("%c", toupper(c));
fprintf(csis, "%c", toupper(c));
}
}
printf( "\n" );
fclose(fp);
} // end function: processFile

Okay so what I did was created a second character array. My first array read in the entire file. I created a second array which would only take in alphabetical characters from the first array then make them uppercase. My correct and completed function for that part of my homework is as follows:
void processFile(char line[], char newline[]) {
FILE *fp;
int i = 0;
int j = 0;
if (!(fp = fopen("congress.txt", "r"))) { //checks file open
printf("File could not be opened for input.\n");
exit(1);
}
line[i] = '\0';
fseek(fp, 0, SEEK_END); //idk what they do but they make it not crash
fseek(fp, 0, SEEK_SET);
for (i = 0; i < MAX; ++i) { //reads the file into the first array
fscanf(fp, "%c", &line[i]);
}
for (i = 0; i < MAX; ++i) {
if (isalpha(line[i])){ //if it's an alphabetical character
newline[j] = line[i]; //read into new array
newline[j] = toupper(newline[j]); //makes that letter capitalized
j++;
}
}
fclose(fp);
}
Just make sure that after creating the new array, it will be smaller than your defined MAX. To make it easy I just counted the now missing punctuation and spaces (which was 50) so for future "for" loops it was:
for (i = 0; i < MAX - 50; ++i)

Related

Convert a column in CSV to a string in C

So, I was trying to write a program to take the first column of a CSV file and copy that do a string, but it's not going well. I put the code and the CSV file and I really appreciate any help.
int main () {
FILE *fp;
fp = fopen("filePATH", "r");
char column[80];
int line_n = 0;
char ch;
while ((ch = fgetc(fp)) != EOF) {
fgets(column, sizeof column, fp);
for (int i = 0; i < sizeof column; ++i){
fscanf(fp, "%[^;]", column);
}
printf("%s \n", column);
}
fclose(fp);
return 0;
}
CSV file:
2;51.5;144.0;24.80
5;62.3;157.0;25.30
10;52.8;141.0;26.60
10;34.5;120.0;24.00
1;41.6;131.0;24.20
5;49.0;144.0;23.80
6;47.1;142.0;23.50
2;51.8;144.5;24.80
1;55.6;135.0;30.50
9;51.9;150.0;23.10
9;48.5;139.0;25.10
The output I have is:
5
10
10
1
5
6
2
1
9
9
48.5;139.0;25.10
So, I don't understand why the program shows me the first column but copies only the last line for the string column.
To check the string column, I used:
char copy[20];
strncpy(copy, column, 18);
printf("%s ", copy);
And the output is:
48.5;139.0;25.10
Almost certainly you want to restructure the loops, but here are some minimal changes to your code that might be instructive:
#include<stdio.h>
char sample_input[] =
"2;51.5;144.0;24.80\n"
"5;62.3;157.0;25.30\n"
"10;52.8;141.0;26.60\n"
"10;34.5;120.0;24.00\n"
"1;41.6;131.0;24.20\n"
"5;49.0;144.0;23.80\n"
"6;47.1;142.0;23.50\n"
"2;51.8;144.5;24.80\n"
"1;55.6;135.0;30.50\n"
"9;51.9;150.0;23.10\n"
"9;48.5;139.0;25.10\n"
;
int
main(void)
{
FILE *fp = fmemopen(sample_input, sizeof sample_input, "r");
char column[80];
int line_n = 0;
int ch; /* fgetc returns an int. */
while( (ch = fgetc(fp)) != EOF ){
/* put ch back so it can be read by scanf */
ungetc(ch, fp);
/* This fgets does not seem to serve any purpose.
/* fgets(column, sizeof column, fp); */
ch = ';';
while( ch == ';' && fscanf(fp, "%79[^;\n]", column) == 1 ){
ch = fgetc(fp); /* Consume the ; or \n */
printf("%s%c", column, ch);
}
}
fclose(fp);
return 0;
}
It would probably be better to use the fgets to read each line of data and then use sscanf to parse the line, but I wanted to show that your current code can be made to (mostly) work with minimal changes. Note that the variable used to store the value returned by fgetc must be int rather than char in order to properly compare to EOF. Also, whenever you use %s or %[] in scanf, it is best to add a width modifier to prevent a buffer overflow.

"How to make my program read visible characters and spaces from a file

I am supposed to be "fixing" code given to me to make it display the correct number of visible characters in a file (spaces too). The correct number is supposed to be 977. I have never dealt with files before and I don't understand what I need to do to display the correct number.
* Driver Menu System for Homework
* Andrew Potter - Mar 5, 2019 <-- Please put your name/date here
*/
#include <stdio.h>//header file for input/output -
#include <stdlib.h>
#include <ctype.h>
// since you will place all your assigned functions (programs) in this file, you do not need to include stdio.h again!
int menu(void); //prototype definition section
void hello(void);
void countall(void);
int main(void)
{
int selection = menu();
while(selection != 99) {
switch(selection) {
case 1:
hello();
break;
case 2:
countall();
break;
case 3:
break;
case 4:
break;
default:
printf("Please enter a valid selection.\n");
}
selection = menu();
}
return 0;
}
int menu(void) {
int choice;
printf("***************************\n");
printf(" 1. Hello \n");
printf(" 2. Countall\n");
printf(" 3. \n");
printf(" 4. \n");
printf("99. Exit\n");
printf("Please select number and press enter:\n");
printf("***************************\n");
scanf("%d", &choice);
getchar();
return choice;
}
void hello(void) {
printf("Hello, World!!!\n");
}
//*****Andrew 5/1/19*****
#define SLEN 81 /* from reverse.c */
/* original header: int count(argc, *argv[]) */
void countall(void)
{
int ch; // place to store each character as read
FILE *fp; // "file pointer"
long unsigned count = 0;
char file[SLEN]; /* from reverse.c */
/*Checks whether a file name was included when run from the command prompt
* The argument count includes the program file name. A count of 2 indicates
* that an additional parameter was passed
if (argc != 2)
{
printf("Usage: %s filename\n", argv[0]);
exit(EXIT_FAILURE);
}
* The following uses the second parameter as the file name
* and attempts to open the file
if ((fp = fopen(argv[1], "r")) == NULL)
{
printf("Can't open %s\n", argv[1]);
exit(EXIT_FAILURE);
} */
/*************************************
Code from reverse.c included to make the program work from within our IDE
*************************************/
puts("Enter the name of the file to be processed:");
scanf("%s", file);
if ((fp = fopen(file,"rb")) == NULL) /* read mode */
{
printf("count program can't open %s\n", file);
exit(EXIT_FAILURE);
}
/* EOF reached when C realizes it tried to reach beyond the end of the file! */
/* This is good design - see page 573 */
while ((ch = getc(fp)) != EOF)
{
if (isprint(ch)) {
count++;
}
else if (isprint(ch)) {
count++;
}
putc(ch,stdout); // same as putchar(ch);
count++;
}
fclose(fp);
printf("\nFile %s has %lu characters\n", file, count);
}
I expected I would get the correct number of visible characters using the combination of isprint and isspace but I usually get 2086.
The assignment directions are: "Word identifies 977 characters including spaces. Your current countall() believes there are 1043. Make the corrections necessary to your code to count only the visible characters and spaces! (Hint: check out 567 in your textbook.)" Before I edited any code the count was 1043, now i am getting 2020. I need 977.
isprint() returns a Boolean result - zero if the character is not "printable", and non-zero if it is. As such isprint(ch) != '\n'makes no sense. Your complete expression in the question makes even less sense, but I'll come on to that at the end.
isprint() on its own returns true (non-zero) for all printable characters, so you need no other tests. Moreover you increment count unconditionally and in every conditional block, so you are counting every character and some twice.
You just need:
if( isprint(ch) )
{
count++;
}
putc( ch, stdout ) ;
While your code is clearly an incomplete fragment, it is not clear where or how your are reading ch. You need a getc() or equivalent in there somewhare.
while( (ch = getc(fp)) != EOF )
{
if( isprint(ch) )
{
count++;
}
putc( ch, stdout ) ;
}
It is not clear whether you need to count all whitespace (including space, tab and newline) or just "spaces" as you stated. If so be clear that isprint() will match space, but not control characters newline or tab. isspace() matches all these, but should not be counted separately to isprint() because 'space' is in both white-space and printable sets. If newline and tab are to be counted (and less likely; "vertical tab") then:
while( (ch = getc(fp)) != EOF )
{
if( isprint(ch) || isspace(ch) )
{
count++;
}
putc( ch, stdout ) ;
}
Another aspect of C that you seem to misunderstand is how Boolean expressions work. To test a single variable for multiple values you must write:
if( var == x || var == y || var == z )
You have written:
if( var == x || y || z )
which may make sense in English (or other natural language) when you read it out aloud, but in C it means:
if( var == (x || y || z ) )
evaluating (x || y || z ) as either true or false and comparing it to var.
It is probably worth considering the semantics of your existing solution to show why it actually compiles, but produces the erroneous result it does.
Firstly,
isprint(ch) != '\n' || '\t' || '\0'
is equivalent to isprint(ch) != true, for the reasons described earlier. So you increment the counter for all characters that are not printable.
Then here:
isspace(ch) == NULL
NULL is a macro representing an invalid pointer, and isspace() does not return a pointer. However NULL will implicitly cast to zero (or false). So here you increment the counter for all printable characters that are not spaces.
Finally, you unconditionally count every character here:
putc(ch,stdout); // same as putchar(ch);
count++;
So your result will be:
number-of-non-printing-characters +
number-of-printing-characters - number-of-spaces +
total-number-of-characters
which is I think (2 x file-length) - number-of-spaces
Finally note that if you open a text file that has CR+LF line ends (conventional for text files on Windows) in "binary" mode, isspace() will count two characters for every new-line. Be sure to open in "text" mode (regardless of the platform).
From isprint():
A printable character is a character that occupies a printing position on a display (this is the opposite of a control character, checked with iscntrl).
and
A value different from zero (i.e., true) if indeed c is a printable character. Zero (i.e., false) otherwise.
So that function should be sufficient. Please note that you have to make sure to feed all these is...() functions from <ctype.h> unsigned values. So if you use it with a value of uncertain origin, better cast to char unsigned.
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
int main(void)
{
char const *filename = "test.txt";
FILE *input = fopen(filename, "r");
if (!input) {
fprintf(stderr, "Couldn't open \"%s\" for reading. :(\n\n", filename);
return EXIT_FAILURE;
}
long long unsigned count = 0;
for (int ch; (ch = fgetc(input)) != EOF;) {
if (isprint(ch))
++count;
}
fclose(input);
printf("Count: %llu\n\n", count);
}
If I wasn't lucky enough to guess which characters you want to be counted, have a look at ctype.h, there is a table.
if ((ch == '\t') || isprint(ch))
count++;
If you want to handle tabs differently (maybe to count how many spaces they use):
if (ch == '\t') {
/* Do smth */
} else if (isprint(ch)) {
count++;
}
This should be enough.

C Reading a file of digits separated by commas

I am trying to read in a file that contains digits operated by commas and store them in an array without the commas present.
For example: processes.txt contains
0,1,3
1,0,5
2,9,8
3,10,6
And an array called numbers should look like:
0 1 3 1 0 5 2 9 8 3 10 6
The code I had so far is:
FILE *fp1;
char c; //declaration of characters
fp1=fopen(argv[1],"r"); //opening the file
int list[300];
c=fgetc(fp1); //taking character from fp1 pointer or file
int i=0,number,num=0;
while(c!=EOF){ //iterate until end of file
if (isdigit(c)){ //if it is digit
sscanf(&c,"%d",&number); //changing character to number (c)
num=(num*10)+number;
}
else if (c==',' || c=='\n') { //if it is new line or ,then it will store the number in list
list[i]=num;
num=0;
i++;
}
c=fgetc(fp1);
}
But this is having problems if it is a double digit. Does anyone have a better solution? Thank you!
For the data shown with no space before the commas, you could simply use:
while (fscanf(fp1, "%d,", &num) == 1 && i < 300)
list[i++] = num;
This will read the comma after the number if there is one, silently ignoring when there isn't one. If there might be white space before the commas in the data, add a blank before the comma in the format string. The test on i prevents you writing outside the bounds of the list array. The ++ operator comes into its own here.
First, fgetc returns an int, so c needs to be an int.
Other than that, I would use a slightly different approach. I admit that it is slightly overcomplicated. However, this approach may be usable if you have several different types of fields that requires different actions, like a parser. For your specific problem, I recommend Johathan Leffler's answer.
int c=fgetc(f);
while(c!=EOF && i<300) {
if(isdigit(c)) {
fseek(f, -1, SEEK_CUR);
if(fscanf(f, "%d", &list[i++]) != 1) {
// Handle error
}
}
c=fgetc(f);
}
Here I don't care about commas and newlines. I take ANYTHING other than a digit as a separator. What I do is basically this:
read next byte
if byte is digit:
back one byte in the file
read number, irregardless of length
else continue
The added condition i<300 is for security reasons. If you really want to check that nothing else than commas and newlines (I did not get the impression that you found that important) you could easily just add an else if (c == ... to handle the error.
Note that you should always check the return value for functions like sscanf, fscanf, scanf etc. Actually, you should also do that for fseek. In this situation it's not as important since this code is very unlikely to fail for that reason, so I left it out for readability. But in production code you SHOULD check it.
My solution is to read the whole line first and then parse it with strtok_r with comma as a delimiter. If you want portable code you should use strtok instead.
A naive implementation of readline would be something like this:
static char *readline(FILE *file)
{
char *line = malloc(sizeof(char));
int index = 0;
int c = fgetc(file);
if (c == EOF) {
free(line);
return NULL;
}
while (c != EOF && c != '\n') {
line[index++] = c;
char *l = realloc(line, (index + 1) * sizeof(char));
if (l == NULL) {
free(line);
return NULL;
}
line = l;
c = fgetc(file);
}
line[index] = '\0';
return line;
}
Then you just need to parse the whole line with strtok_r, so you would end with something like this:
int main(int argc, char **argv)
{
FILE *file = fopen(argv[1], "re");
int list[300];
if (file == NULL) {
return 1;
}
char *line;
int numc = 0;
while((line = readline(file)) != NULL) {
char *saveptr;
// Get the first token
char *tok = strtok_r(line, ",", &saveptr);
// Now start parsing the whole line
while (tok != NULL) {
// Convert the token to a long if possible
long num = strtol(tok, NULL, 0);
if (errno != 0) {
// Handle no value conversion
// ...
// ...
}
list[numc++] = (int) num;
// Get next token
tok = strtok_r(NULL, ",", &saveptr);
}
free(line);
}
fclose(file);
return 0;
}
And for printing the whole list just use a for loop:
for (int i = 0; i < numc; i++) {
printf("%d ", list[i]);
}
printf("\n");

How to calculate number of lines in file?

I work in C-language at first time and have some question.
How can I get the number of lines in file?
FILE *in;
char c;
int lines = 1;
...
while (fscanf(in,"%c",&c) == 1) {
if (c == '\n') {
lines++;
}
}
Am I right? I actually don't know how to get the moment , when string cross to the new line.
OP's code functions well aside from maybe an off-by-one issue and a last line issue.
Standard C library definition
A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined. C11dr §7.21.2 2
A line ends with a '\n' and the last line may or may not end with a '\n'.
If using the idea that the last line of a file does not require a final '\n, then the goal is to count the number of occurrences that a character is read after a '\n'.
// Let us use a wide type
// Start at 0 as the file may be empty
unsigned long long line_count = 0;
int previous = '\n';
int ch;
while ((ch = fgetc(in)) != EOF) {
if (previous == '\n') line_count++;
previous = ch;
}
printf("Line count:%llu\n", line_count);
Reading a file one character at a time may be less efficient than other means, but functionally meets OP goal.
This answer uses (ch = fgetc(in)) != EOF instead of fscanf(in,"%c",&c) == 1 which is typically ""faster", but with an optimizing compiler, either may emit similar performance code. Such details of speed can be supported with analysis or profiling. When in doubt, code for clarity.
Can use this utility function
/*
* count the number of lines in the file called filename
*
*/
int countLines(char *filename)
{
FILE *in = fopen(filename,"r");
int ch=0;
int lines=0;
if(in == NULL){
return 0; // return lines;
}
while((ch = fgetc(in)) != EOF){
if(ch == '\n'){
lines++;
}
}
fclose(in);
return lines;
}
When counting the 'number of \n characters', you have to remember that you are counting the separators, and not the items. See 'Fencepost Error'
Your example should work, but:
if the file does not end with a \n, then you might be off-by-one (depending on your definition of 'a line').
depending on your definition of 'a line' you may be missing \r characters in the file (typically used by Macs)
it will not be very efficient or quick (calling scanf() is expensive)
The example below will ingest a buffer each time, looking for \r and \n characters. There is some logic to latch these characters, so that the following line endings should be handled correctly:
\n
\r
\r\n
#include <stdio.h>
#include <errno.h>
int main(void) {
FILE *in;
char buf[4096];
int buf_len, buf_pos;
int line_count, line_pos;
int ignore_cr, ignore_lf;
in = fopen("my_file.txt", "rb");
if (in == NULL) {
perror("fopen()");
return 1;
}
line_count = 0;
line_pos = 0;
ignore_cr = 0;
ignore_lf = 0;
/* ingest a buffer at a time */
while ((buf_len = fread(&buf, 1, sizeof(buf), in)) != 0) {
/* walk through the buffer, looking for newlines */
for (buf_pos = 0; buf_pos < buf_len; buf_pos++) {
/* look for '\n' ... */
if (buf[buf_pos] == '\n') {
/* ... unless we've already seen '\r' */
if (!ignore_lf) {
line_count += 1;
line_pos = 0;
ignore_cr = 1;
}
/* look for '\r' ... */
} else if (buf[buf_pos] == '\r') {
/* ... unless we've already seen '\n' */
if (!ignore_cr) {
line_count += 1;
line_pos = 0;
ignore_lf = 1;
}
/* on any other character, count the characters per line */
} else {
line_pos += 1;
ignore_lf = 0;
ignore_cr = 0;
}
}
}
if (line_pos > 0) {
line_count += 1;
}
fclose(in);
printf("lines: %d\n", line_count);
return 0;
}

What is the easiest way to count the newlines in an ASCII file?

Which is the fastest way to get the lines of an ASCII file?
Normally you read files in C using fgets. You can also use scanf("%[^\n]"), but quite a few people reading the code are likely to find that confusing and foreign.
Edit: on the other hand, if you really do just want to count lines, a slightly modified version of the scanf approach can work quite nicely:
while (EOF != (scanf("%*[^\n]"), scanf("%*c")))
++lines;
The advantage of this is that with the '*' in each conversion, scanf reads and matches the input, but does nothing with the result. That means we don't have to waste memory on a large buffer to hold the content of a line that we don't care about (and still take a chance of getting a line that's even larger than that, so our count ends up wrong unless we got to even more work to figure out whether the input we read ended with a newline).
Unfortunately, we do have to break up the scanf into two pieces like this. scanf stops scanning when a conversion fails, and if the input contains a blank line (two consecutive newlines) we expect the first conversion to fail. Even if that fails, however, we want the second conversion to happen, to read the next newline and move on to the next line. Therefore, we attempt the first conversion to "eat" the content of the line, and then do the %c conversion to read the newline (the part we really care about). We continue doing both until the second call to scanf returns EOF (which will normally be at the end of the file, though it can also happen in case of something like a read error).
Edit2: Of course, there is another possibility that's (at least arguably) simpler and easier to understand:
int ch;
while (EOF != (ch=getchar()))
if (ch=='\n')
++lines;
The only part of this that some people find counterintuitive is that ch must be defined as an int, not a char for the code to work correctly.
Here's a solution based on fgetc() which will work for lines of any length and doesn't require you to allocate a buffer.
#include <stdio.h>
int main()
{
FILE *fp = stdin; /* or use fopen to open a file */
int c; /* Nb. int (not char) for the EOF */
unsigned long newline_count = 0;
/* count the newline characters */
while ( (c=fgetc(fp)) != EOF ) {
if ( c == '\n' )
newline_count++;
}
printf("%lu newline characters\n", newline_count);
return 0;
}
Maybe I'm missing something, but why not simply:
#include <stdio.h>
int main(void) {
int n = 0;
int c;
while ((c = getchar()) != EOF) {
if (c == '\n')
++n;
}
printf("%d\n", n);
}
if you want to count partial lines (i.e. [^\n]EOF):
#include <stdio.h>
int main(void) {
int n = 0;
int pc = EOF;
int c;
while ((c = getchar()) != EOF) {
if (c == '\n')
++n;
pc = c;
}
if (pc != EOF && pc != '\n')
++n;
printf("%d\n", n);
}
Common, why You compare all characters? It is very slow. In 10MB file it is ~3s.
Under solution is faster.
unsigned long count_lines_of_file(char *file_patch) {
FILE *fp = fopen(file_patch, "r");
unsigned long line_count = 0;
if(fp == NULL){
return 0;
}
while ( fgetline(fp) )
line_count++;
fclose(fp);
return line_count;
}
What about this?
#include <stdio.h>
#include <string.h>
#define BUFFER_SIZE 4096
int main(int argc, char** argv)
{
int count;
int bytes;
FILE* f;
char buffer[BUFFER_SIZE + 1];
char* ptr;
if (argc != 2 || !(f = fopen(argv[1], "r")))
{
return -1;
}
count = 0;
while(!feof(f))
{
bytes = fread(buffer, sizeof(char), BUFFER_SIZE, f);
if (bytes <= 0)
{
return -1;
}
buffer[bytes] = '\0';
for (ptr = buffer; ptr; ptr = strchr(ptr, '\n'))
{
++count;
++ptr;
}
}
fclose(f);
printf("%d\n", count - 1);
return 0;
}

Resources