This question already has answers here:
Counting lines, words, characters and top ten words?
(5 answers)
Count lines, words and characters in C
(4 answers)
Closed 9 years ago.
So the assignment is to emulate the unix command wc in C. I've got most of the structure down but I've got some problems with the actual counting pieces.
#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
int main(int argc, char *argv[]){
int file;
int newLine=0, newWord=0, newChar=0, i=0;
char *string;
char buf[101];
file = open(argv[1], O_RDONLY, 0644);
int charRead=read(file, buf, 101);
if (file == -1){
printf("file does not exist");
}
else{
for (i; i<100; i++){
if (buf[i]!='\0'){
newChar++;
}
if (buf[i]==' '){
newWord++;
}
if (buf[i]=='\n'){
newLine++;
}
}
}
printf("%d\n",newWord);
printf("%d\n",newLine);
printf("%d\n",newChar);
printf("%s",argv[1]);
close(file);
}
So the line counter works perfectly well.
The word count is always one short unless there is a space at the end of the word. I've tried to ameliorate this by making the special case:
if(buf[i]!='\0' || (buf[i]=='\0' && buf[i]!=' '))
but this doesn't seem to work either.
The other problem is that the character count is always way off. I think it has something to do with the buffer size, but I can't seem to find much documentation on how to make the buffer work in this scenario.
Please advise. Thanks!
EDIT I looked at the answers given in the "duplicate" questions, and I don't think they really addressed the question you had. I have written a short program that does what you want (and is "safe" in that it handles any size of input). I tested it against a short file, where it gave identical results to wc.
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[]) {
// count characters, words, lines
int cCount = 0, wCount = 0, lCount = 0, length;
char buf[1000];
FILE *fp;
if (argc < 2) {
printf("usage: wordCount fileName\n");
return -1;
}
if((fp = fopen(argv[1], "r")) == NULL) {
printf("unable to open %s\n", argv[1]);
return -1;
}
while(fgets(buf, 1000, fp)!=NULL) {
int ii, isWord, isNonWhite;
lCount++;
isWord = 0;
length = strlen(buf);
cCount += length;
for(ii = 0; ii<length; ii++) {
/* 1 when the current character is not whitespace */
isNonWhite = (buf[ii]!=' ' && buf[ii]!= '\n' && buf[ii] != '\t') ? 1 : 0;
if (isNonWhite == 1) {
if(isWord != 1) wCount++; /* first character of a new word */
isWord = 1;
}
if(isNonWhite == 0 && isWord == 1) {
isWord = 0;
}
}
}
printf("Characters: %d\nWords: %d\nLines: %d\n\n", cCount, wCount, lCount);
return 0;
}
Note - if you have a line that does not fit in the 1000-byte buffer, the above may give a false result, because fgets() then returns the line in several pieces and each piece is counted as a line; this could be addressed by using getline(), which is a POSIX function (not part of standard C) that takes care of allocating enough memory for the line that is read in. I don't think you need to worry about it here.
If you do worry about it, you can use the same trick as above (where you have the "isWord" state) and extend it to isLine (reset when you encounter a \n). Then you don't need the inner for loop. It is marginally more memory efficient, but slower.
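The state-based idea from the last paragraph can be sketched as follows. This is a minimal sketch, not the answer's original code: count_stream and struct counts are names made up for illustration. It reads one character at a time, so line length no longer matters, and it counts newline characters as lines, which is what wc reports.

```c
#include <stdio.h>
#include <ctype.h>

/* Hypothetical helper type, not from the original answer. */
struct counts {
    long chars;
    long words;
    long lines;
};

/* Count characters, words and newlines from an open stream. */
void count_stream(FILE *fp, struct counts *out)
{
    int c;
    int in_word = 0; /* 1 while we are inside a run of non-whitespace */

    out->chars = out->words = out->lines = 0;
    while ((c = getc(fp)) != EOF) {
        out->chars++;
        if (c == '\n')
            out->lines++;
        if (isspace(c)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;  /* first character of a new word */
            out->words++;
        }
    }
}
```

To use it, open the file with fopen(argv[1], "r"), pass the stream to count_stream, and print the three fields with %ld.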
The file wordlist.txt contains:
able
army
bird
boring
sing
song
I want to use fscanf() to read this text file line by line and store the words in a string array, one word per index, like this:
src = [able army bird boring sing song]
where src[0] = "able", src[1] = "army" and so on. But my code only outputs src[0] = "a", src[1] = "b"... Could someone help me figure out what's going wrong in my code?
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
FILE *fp = fopen("wordlist.txt", "r");
if (fp == NULL)
{
printf("%s", "File open error");
return 0;
}
char src[1000];
for (int i = 0; i < sizeof(src); i++)
{
fscanf(fp, "%[^EOF]", &src[i]);
}
fclose(fp);
printf("%c", src[0]);
getchar();
return 0;
}
Much appreciated!
For example like this.
#include <stdio.h>
#include <string.h>
#include <errno.h>
#define MAX_ARRAY_SIZE 1000
#define MAX_STRING_SIZE 100
int main(int argc, char *argv[]) {
FILE *fp = fopen("wordlist.txt", "r");
if (fp == NULL) {
printf("File open error\n");
return 1;
}
char arr[MAX_ARRAY_SIZE][MAX_STRING_SIZE];
int index = 0;
while (1) {
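/* %99s leaves room for the terminator in a MAX_STRING_SIZE (100) buffer */
int ret = fscanf(fp, "%99s", arr[index]);
if (ret != 1) break; /* EOF or read error */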
++index;
if (index == MAX_ARRAY_SIZE) break;
}
fclose(fp);
for (int i = 0; i < index; ++i) {
printf("%s\n", arr[i]);
}
getchar();
return 0;
}
Some notes:
If there is an error, it is better to return 1 and not 0, for 0 means successful execution.
A string is an array of char, so an array of strings needs two dimensions: either a 2D char array as above, or an array of char pointers (char *words[]). It takes a while to get used to, but it is handy.
Also, a check of the return value of the fscanf would be great.
For fixed size arrays, it is useful to define the sizes using #define so that it is easier to change later if you use it multiple times in the code.
The format "%[^EOF]" does not mean "read until end of file": it is a scanset that reads characters until it meets an 'E', 'O' or 'F'. Also, src is a plain char array, so src[0] holds a single character such as the 'a' of able, not a whole word; that is why you see "a", "b" and so on. One approach is to read one character at a time and watch for a space or newline, so that the characters before it can be saved as one word in its own small array. If you want a single array at the end, concatenate these small arrays with spaces in between.
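The word-splitting approach described here could look roughly like this. It is only a sketch under assumptions: read_words, MAX_WORDS and MAX_WORD_LEN are illustrative names and limits, not part of the question or the answers. It reads one character at a time and closes the current word whenever it meets whitespace.

```c
#include <stdio.h>
#include <ctype.h>

#define MAX_WORDS 1000   /* illustrative limits, chosen for this sketch */
#define MAX_WORD_LEN 100

/* Read whitespace-separated words from fp into words[]; return how many. */
int read_words(FILE *fp, char words[][MAX_WORD_LEN], int max_words)
{
    int count = 0;  /* completed words */
    int len = 0;    /* length of the word being built */
    int c;

    while (count < max_words && (c = getc(fp)) != EOF) {
        if (isspace(c)) {
            if (len > 0) {              /* a word just ended */
                words[count][len] = '\0';
                count++;
                len = 0;
            }
        } else if (len < MAX_WORD_LEN - 1) {
            words[count][len++] = (char)c;  /* extra characters are dropped */
        }
    }
    if (len > 0 && count < max_words) { /* last word without trailing newline */
        words[count][len] = '\0';
        count++;
    }
    return count;
}
```

After the call, words[0] holds "able", words[1] holds "army" and so on, which is the indexing the question asks for.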
Why does my output contain extra characters? Why is only the first line of every file via notepad++ being encrypted and not the entire file?
Happy coding!
P.S I have the Second Edition of C programming language by Kernighan and Ritchie
EDIT: This is my code after I fixed it; the question has been answered. Thank you guys!
Here is my new source code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h> /* errno, used with strerror() below */
#define XOR_BYTE 0x9E
char * xorBuffer(char *buffer, long bufferSize){
long i;
for(i = 0;i < bufferSize;i++){ /* < not <=: do not touch the byte past the end */
buffer[i] ^= XOR_BYTE;
}
return buffer;
}
int xorFile(char *fileIn, char * fileOut){
FILE *fpi, *fpo;
char *fileBuffer = NULL;
fpi = fopen(fileIn,"rb");
fpo = fopen(fileOut,"wb");
if(NULL == fpi){
printf("Error opening input file %s: %s\n", fileIn, strerror(errno));
return 1;
}
if(NULL == fpo){
printf("Error opening output file %s: %s\n", fileOut, strerror(errno));
return 2;
}
fseek(fpi,0L,SEEK_END);
long fileSize = ftell(fpi);
fileBuffer = malloc(sizeof(char)* (fileSize + 1));
fseek(fpi,0L,SEEK_SET);
size_t length = fread(fileBuffer, sizeof(char), fileSize,fpi);
fileBuffer = xorBuffer(fileBuffer,(long)length); /* only the bytes actually read */
size_t c;
for(c = 0;c < length;c++){
putc(fileBuffer[c],fpo);
}
fclose(fpi);
fclose(fpo);
free(fileBuffer);
return 0;
}
int main(int argc, char*argv[]){
if(argc == 3){
if(xorFile(argv[1],argv[2]) == 0)
printf("File encryption was successful.");
else
printf("An error occured.");
}else{
printf("usage --- xor [input file][output file]");
}
}
Your prototype for XOR_FILE is incorrect: you should take 2 strings.
There are more issues in your code:
you must learn to indent your code and use spaces wisely. Use the Kernighan and Ritchie style shown in the book.
you cannot reliably get the file size with fseek and ftell; it is not needed in general, and you can implement a buffered version with a fixed-size buffer anyway.
avoid overwriting the input file in your program. If you make a mistake, it will be lost.
you do not need to null terminate the array into which you read the file, just iterate over all bytes, but stop at the size read: use for (i = 0; i < newLen; i++) otherwise you would output an extra byte when encrypting and one more when deciphering...
do not iterate until '\0' in XOR_BUFFER(char *FILE_BUFFER), pass the size and use it. Otherwise you will fail to encrypt binary files that contain null bytes.
you forget to close fpo with fclose(fpo);
do not redefine standard functions such as getchar() and putchar().
do not use uppercase letters for function names and/or variable names, but it is indeed common practice to use uppercase letters for macros.
Here is a simplified version:
#include <stdio.h>
#include <string.h>
#include <errno.h>
#define XOR_BYTE 0x9E
int xor_file(const char *infile, const char *outfile) {
FILE *fpi, *fpo;
int c;
if ((fpi = fopen(infile, "rb")) == NULL) {
fprintf(stderr, "cannot open input file %s: %s\n", infile, strerror(errno));
return 1;
}
if ((fpo = fopen(outfile, "wb")) == NULL) {
fprintf(stderr, "cannot open output file %s: %s\n", outfile, strerror(errno));
fclose(fpi);
return 2;
}
while ((c = getc(fpi)) != EOF) {
putc(c ^ XOR_BYTE, fpo);
}
fclose(fpi);
fclose(fpo);
return 0;
}
int main(int argc, char *argv[]) {
if (argc == 3) {
xor_file(argv[1], argv[2]);
} else {
fprintf(stderr, "usage: xor_file input_file output_file\n");
}
//getch(); // avoid the need for this by running your program in the terminal
return 0;
}
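The list of issues mentions a version with a fixed-size buffer. A possible sketch is shown here; xor_stream and CHUNK_SIZE are hypothetical names, and 4096 is an arbitrary size. It uses the byte count returned by fread rather than a terminating '\0', so files containing null bytes are handled.

```c
#include <stdio.h>

#define XOR_BYTE 0x9E
#define CHUNK_SIZE 4096  /* arbitrary buffer size chosen for this sketch */

/* XOR every byte from in to out; return 0 on success, -1 on I/O error. */
int xor_stream(FILE *in, FILE *out)
{
    unsigned char buf[CHUNK_SIZE];
    size_t n, i;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        for (i = 0; i < n; i++)   /* stop at the count read, not at '\0' */
            buf[i] ^= XOR_BYTE;
        if (fwrite(buf, 1, n, out) != n)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

Both streams should be opened in binary mode ("rb" and "wb"), as in the simplified version. Running the output through the function again restores the original bytes.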
Well, you are passing strings into a function that expects FILE* parameters. This tells me that the compiler is complaining big time at you and you are ignoring it.
Also, you are not testing any return values from your fopen calls.
So fix those 2 things, then repost.
I am a biology student and I am trying to learn perl, python and C and also use the scripts in my work. So, I have a file as follows:
>sequence1
ATCGATCGATCG
>sequence2
AAAATTTT
>sequence3
CCCCGGGG
The output should look like this, that is the name of each sequence and the count of characters in each line and printing the total number of sequences in the end of the file.
sequence1 12
sequence2 8
sequence3 8
Total number of sequences = 3
I could make the perl and python scripts work, this is the python script as an example:
#!/usr/bin/python
import sys
my_file = open(sys.argv[1]) #open the file
my_output = open(sys.argv[2], "w") #open output file
total_sequence_counts = 0
for line in my_file:
    if line.startswith(">"):
        sequence_name = line.rstrip('\n').replace(">","")
        total_sequence_counts += 1
        continue
    dna_length = len(line.rstrip('\n'))
    my_output.write(sequence_name + " " + str(dna_length) + '\n')
my_output.write("Total number of sequences = " + str(total_sequence_counts) + '\n')
Now, I want to write the same script in C, this is what I have achieved so far:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
input = FILE *fopen(const char *filename, "r");
output = FILE *fopen(const char *filename, "w");
double total_sequence_counts = 0;
char sequence_name[];
char line [4095]; // set a temporary line length
char buffer = (char *) malloc (sizeof(line) +1); // allocate some memory
while (fgets(line, sizeof(line), filename) != NULL) { // read until new line character is not found in line
buffer = realloc(*buffer, strlen(line) + strlen(buffer) + 1); // realloc buffer to adjust buffer size
if (buffer == NULL) { // print error message if memory allocation fails
printf("\n Memory error");
return 0;
}
if (line[0] == ">") {
sequence_name = strcpy(sequence_name, &line[1]);
total_sequence_counts += 1
}
else {
double length = strlen(line);
fprintf(output, "%s \t %ld", sequence_name, length);
}
fprintf(output, "%s \t %ld", "Total number of sequences = ", total_sequence_counts);
}
int fclose(FILE *input); // when you are done working with a file, you should close it using this function.
return 0;
int fclose(FILE *output);
return 0;
}
But this code is, of course, full of mistakes. My problem is that despite studying a lot, I still can't properly understand and use memory allocation and pointers, so I know I especially have mistakes in that part. It would be great if you could comment on my code and show how it can turn into a script that actually works. By the way, in my actual data, the length of each line is not defined, so I need to use malloc and realloc for that purpose.
For a simple program like this, where you look at short lines one at a time, you shouldn't worry about dynamic memory allocation. It is probably good enough to use local buffers of a reasonable size.
Another thing is that C isn't particularly suited for quick-and-dirty string processing. For example, there isn't a strstrip function in the standard library. You usually end up implementing such behaviour yourself.
An example implementation looks like this:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define MAXLEN 80 /* Maximum line length, including null terminator */
int main(int argc, char *argv[])
{
FILE *in;
FILE *out;
char line[MAXLEN]; /* Current line buffer */
char ref[MAXLEN] = ""; /* Sequence reference buffer */
int nseq = 0; /* Sequence counter */
if (argc != 3) {
fprintf(stderr, "Usage: %s infile outfile\n", argv[0]);
exit(1);
}
in = fopen(argv[1], "r");
if (in == NULL) {
fprintf(stderr, "Couldn't open %s.\n", argv[1]);
exit(1);
}
out = fopen(argv[2], "w");
if (out == NULL) {
fprintf(stderr, "Couldn't open %s for writing.\n", argv[2]);
exit(1);
}
while (fgets(line, sizeof(line), in)) {
int len = strlen(line);
/* Strip whitespace from end */
while (len > 0 && isspace((unsigned char)line[len - 1])) len--;
line[len] = '\0';
if (line[0] == '>') {
/* First char is '>': copy from second char in line */
strcpy(ref, line + 1);
} else {
/* Other lines are sequences */
fprintf(out, "%s: %d\n", ref, len);
nseq++;
}
}
fprintf(out, "Total number of sequences. %d\n", nseq);
fclose(in);
fclose(out);
return 0;
}
A lot of code is about enforcing arguments and opening and closing files. (You could cut out a lot of code if you used stdin and stdout with file redirections.)
The core is the big while loop. Things to note:
fgets returns NULL on error or when the end of file is reached.
The first lines determine the length of the line and then remove white-space from the end.
It is not enough to decrement length, at the end the stripped string must be terminated with the null character '\0'
When you check the first character in the line, you should check against a char, not a string. In C, single and double quotes are not interchangeable. ">" is a string literal of two characters, '>' and the terminating '\0'.
When dealing with countable entities like chars in a string, use integer types, not floating-point numbers. (I've used (signed) int here, but because there can't be a negative number of chars in a line, it might have been better to have used an unsigned type.)
The notation line + 1 is equivalent to &line[1].
The code I've shown doesn't check that there is always one reference per sequence. I'll leave this as an exercise to the reader.
For a beginner, this can be quite a lot to keep track of. For small text-processing tasks like yours, Python and Perl are definitely better suited.
Edit: The solution above won't work for long sequences; it is restricted to MAXLEN characters. But you don't need dynamic allocation if you only need the length, not the contents of the sequences.
Here's an updated version that doesn't read lines, but reads characters instead. In '>' context, it stores the reference. Otherwise it just keeps a count:
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h> /* for isspace() */
#define MAXLEN 80 /* Maximum line length, including null terminator */
int main(int argc, char *argv[])
{
FILE *in;
FILE *out;
int nseq = 0; /* Sequence counter */
char ref[MAXLEN]; /* Reference name */
in = fopen(argv[1], "r");
out = fopen(argv[2], "w");
/* Snip: Argument and file checking as above */
while (1) {
int c = getc(in);
if (c == EOF) break;
if (c == '>') {
int n = 0;
c = fgetc(in);
while (c != EOF && c != '\n') {
if (n < sizeof(ref) - 1) ref[n++] = c;
c = fgetc(in);
}
ref[n] = '\0';
} else {
int len = 0;
int n = 0;
while (c != EOF && c != '\n') {
n++;
if (!isspace(c)) len = n;
c = fgetc(in);
}
fprintf(out, "%s: %d\n", ref, len);
nseq++;
}
}
fprintf(out, "Total number of sequences. %d\n", nseq);
fclose(in);
fclose(out);
return 0;
}
Notes:
fgetc reads a single byte from a file and returns this byte or EOF when the file has ended. In this implementation, that's the only reading function used.
Storing a reference string is implemented via fgetc here too. You could probably use fgets after skipping the initial angle bracket, too.
The counting just reads bytes without storing them. n is the total count, len is the count up to the last non-space. (Your lines probably consist only of ACGT without any trailing space, so you could skip the test for space and use n instead of len.)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
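int main(int argc, char *argv[]){
if (argc != 3) {
fprintf(stderr, "usage: %s infile outfile\n", argv[0]);
return 1;
}
FILE *my_file = fopen(argv[1], "r");
FILE *my_output = fopen(argv[2], "w");
if (my_file == NULL || my_output == NULL) {
perror("fopen");
return 1;
}
int total_sequence_counts = 0;
char *sequence_name = NULL;
size_t dna_length;
char *line = NULL;
size_t size = 0;
while(-1 != getline(&line, &size, my_file)){ /* getline() and strdup() are POSIX, not ISO C */
line[strcspn(line, "\n")] = '\0'; /* strip the trailing newline, if any */
if(line[0] == '>'){
free(sequence_name); /* release the previous name before replacing it */
sequence_name = strdup(line + 1);
total_sequence_counts +=1;
continue;
}
dna_length = strlen(line);
fprintf(my_output, "%s %zu\n", sequence_name ? sequence_name : "", dna_length);
}
fprintf(my_output, "Total number of sequences = %d\n", total_sequence_counts);
fclose(my_file);
fclose(my_output);
free(sequence_name);
free(line);
return (0);
}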
I wrote a simple program that would open a csv file, read it, make a new csv file, and only write some of the columns (I don't want all of the columns and am hoping removing some will make the file more manageable). The file is 1.15GB, but fopen() doesn't have a problem with it. The segmentation fault happens in my while loop shortly after the first progress printf().
I tested on just the first few lines of the csv and the logic below does what I want. The strange section for when index == 0 is due to the last column being in the form (xxx, yyy)\n (the , in a comma separated value file is just ridiculous).
Here is the code, the while loop is the problem:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv) {
long size;
FILE* inF = fopen("allCrimes.csv", "rb");
if (!inF) {
puts("fopen() error");
return 0;
}
fseek(inF, 0, SEEK_END);
size = ftell(inF);
rewind(inF);
printf("In file size = %ld bytes.\n", size);
char* buf = malloc((size+1)*sizeof(char));
if (fread(buf, 1, size, inF) != size) {
puts("fread() error");
return 0;
}
fclose(inF);
buf[size] = '\0';
FILE *outF = fopen("lessColumns.csv", "w");
if (!outF) {
puts("fopen() error");
return 0;
}
int index = 0;
char* currComma = strchr(buf, ',');
fwrite(buf, 1, (int)(currComma-buf), outF);
int progress = 0;
while (currComma != NULL) {
index++;
index = (index%14 == 0) ? 0 : index;
progress++;
if (progress%1000 == 0) printf("%d\n", progress/1000);
int start = (int)(currComma-buf);
currComma = strchr(currComma+1, ',');
if (!currComma) break;
if ((index >= 3 && index <= 10) || index == 13) continue;
int end = (int)(currComma-buf);
int endMinusStart = end-start;
char* newEntry = malloc((endMinusStart+1)*sizeof(char));
strncpy(newEntry, buf+start, endMinusStart);
newEntry[end+1] = '\0';
if (index == 0) {
char* findNewLine = strchr(newEntry, '\n');
int newLinePos = (int)(findNewLine-newEntry);
char* modifiedNewEntry = malloc((strlen(newEntry)-newLinePos+1)*sizeof(char));
strcpy(modifiedNewEntry, newEntry+newLinePos);
fwrite(modifiedNewEntry, 1, strlen(modifiedNewEntry), outF);
}
else fwrite(newEntry, 1, end-start, outF);
}
fclose(outF);
return 0;
}
Edit: It turned out the problem was that the csv file had , in places I was not expecting which caused the logic to fail. I ended up writing a new parser that removes lines with the incorrect number of commas. It removed 243,875 lines (about 4% of the file). I'll post that code instead as it at least reflects some of the comments about free():
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv) {
long size;
FILE* inF = fopen("allCrimes.csv", "rb");
if (!inF) {
puts("fopen() error");
return 0;
}
fseek(inF, 0, SEEK_END);
size = ftell(inF);
rewind(inF);
printf("In file size = %ld bytes.\n", size);
char* buf = malloc((size+1)*sizeof(char));
if (fread(buf, 1, size, inF) != size) {
puts("fread() error");
return 0;
}
fclose(inF);
buf[size] = '\0';
FILE *outF = fopen("uniformCommaCount.csv", "w");
if (!outF) {
puts("fopen() error");
return 0;
}
int numOmitted = 0;
int start = 0;
while (1) {
char* currNewLine = strchr(buf+start, '\n');
if (!currNewLine) {
puts("Done");
break;
}
int end = (int)(currNewLine-buf);
char* entry = malloc((end-start+2)*sizeof(char));
strncpy(entry, buf+start, end-start+1);
entry[end-start+1] = '\0';
int commaCount = 0;
char* commaPointer = entry;
for (; *commaPointer; commaPointer++) if (*commaPointer == ',') commaCount++;
if (commaCount == 14) fwrite(entry, 1, end-start+1, outF);
else numOmitted++;
free(entry);
start = end+1;
}
fclose(outF);
printf("Omitted %d lines\n", numOmitted);
return 0;
}
you're malloc'ing but never freeing. possibly you run out of memory, one of your mallocs returns NULL, and the subsequent call to str(n)cpy segfaults.
adding free(newEntry);, free(modifiedNewEntry); immediately after the respective fwrite calls should solve your memory shortage.
also note that inside your loop you compute offsets into the buffer buf which contains the whole file. these offsets are held in variables of type int whose maximum value on your system may be too small for the numbers you are handling. also note that adding large ints may result in a negative value which is another possible cause of the segfault (negative offsets into buf take you to some address outside the buffer possibly not even readable).
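Both concerns (per-entry malloc and int offsets) can be avoided by working directly on the file buffer with size_t offsets. The following is a sketch only; filter_lines is a hypothetical helper whose name and signature are made up here. It mirrors the comma-count filter from the question's second program. Note also that in the first program, newEntry[end+1] = '\0' indexes the small entry buffer with an offset into the whole file, which by itself writes far outside the allocation.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write to out every newline-terminated line of buf[0..size) that contains
 * exactly `wanted` commas. Returns the number of lines omitted.
 * Hypothetical helper: the name and signature are made up for this sketch.
 */
size_t filter_lines(const char *buf, size_t size, int wanted, FILE *out)
{
    size_t start = 0;      /* size_t can index any object; int may be too small */
    size_t omitted = 0;

    while (start < size) {
        const char *nl = memchr(buf + start, '\n', size - start);
        if (nl == NULL)
            break;                           /* ignore an unterminated tail */
        size_t len = (size_t)(nl - (buf + start)) + 1;  /* include '\n' */
        int commas = 0;
        for (size_t i = 0; i < len; i++)
            if (buf[start + i] == ',')
                commas++;
        if (commas == wanted)
            fwrite(buf + start, 1, len, out); /* write straight from buf */
        else
            omitted++;
        start += len;
    }
    return omitted;
}
```

Because each line is written straight from buf, there is nothing to allocate or free inside the loop.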
The malloc(3) function can (and sometimes does) fail.
At least code something like
/* needs <errno.h> for errno and <string.h> for strerror() */
char* buf = malloc(size+1);
if (!buf) {
fprintf(stderr, "failed to malloc %ld bytes - %s\n",
size+1, strerror(errno));
exit (EXIT_FAILURE);
}
And I strongly suggest to clear with memset(buf, 0, size+1) the successful result of a malloc (or otherwise use calloc ....), not only because the following fread could fail (which you are testing) but to ease debugging and reproducibility.
and likewise for every other call to malloc or calloc (you should always test them against failure)....
Notice that by definition sizeof(char) is always 1. Hence I removed it.
As others pointed out, you have a memory leak because you don't call free appropriately. A tool like valgrind could help.
You need to learn how to use the debugger (e.g. gdb). Don't forget to compile with all warnings and debugging information (e.g. gcc -Wall -g). And improve your code till you get no warnings.
Knowing how to use a debugger is an essential required skill when programming (particularly in C or C++). That debugging skill (and ability to use the debugger) will be useful in every C or C++ program you contribute to.
BTW, you could read your file line by line with getline(3) (which can also fail and you should test that).
I am currently attempting to write a program that will tell its user how many times the specified 8-bit byte appears in the specified file.
I have some ground work laid out, but when it comes to making sure that the file makes it in to an array or buffer or whatever format I should put the file data into to check for the bytes, I feel I'm probably very far off from using the correct methods.
After that, I need to check whatever the file data gets put in to for the byte specified, but I am also unsure how to do this.
I think I may be over-complicating this quite a bit, so explaining anything that needs to be changed or that can just be scrapped completely is greatly appreciated.
Hopefully didn't leave out any important details.
Everything seems to be running (this code compiles), but when I try to printf the final statement at the bottom, it does not spit out the statement.
I have a feeling I just did not set up the final for loop correctly at all..
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
//#define BUFFER_SIZE (4096)
main(int argc, char *argv[]){ //argc = arg count, argv = array of arguements
char buffer[4096];
int readBuffer;
int b;
int byteCount = 0;
b = atoi(argv[2]);
FILE *f = fopen(argv[1], "rb");
unsigned long count = 0;
int ch;
if(argc!=3){ /* required number of args = 3 */
fprintf(stderr,"Too few/many arguements given.\n");
fprintf(stderr, "Proper usage: ./bcount path byte\n");
exit(0);
}
else{ /*open and read file*/
if(f == 0){
fprintf(stderr, "File could not be opened.\n");
exit(0);
}
}
if((b <= -1) || (b >= 256)){ /*checks to see if the byte provided is between 0 & 255*/
fprintf(stderr, "Byte provided must be between 0 and 255.\n");
exit(0);
}
else{
printf("Byte provided fits in range.\n");
}
int i = 0;
int k;
int newFile[i];
fseek(f, 0, SEEK_END);
int lengthOfFile = ftell(f);
for(k = 0; k < sizeof(buffer); k++){
while(fgets(buffer, lengthOfFile, f) != NULL){
newFile[i] = buffer[k];
i++;
}
}
if(newFile[i] = buffer[k]){
printf("same size\n");
}
for(i = 0; i < sizeof(newFile); i++){
if(b == newFile[i]){
byteCount++;
}
printf("Final for loop is working???\n");
}
}
OP is mixing fgets() with binary reads of a file.
fgets() reads a file up to the buffer size provided or until it reaches a \n byte. It is intended for text processing. The typical way to determine how much data was read via fgets() is to look for a final \n, which may or may not be there. The data read could have embedded NUL bytes in it, so it becomes problematic to know when to stop scanning the buffer: on a NUL byte or on a \n.
Fortunately this can all be dispensed with, including the file seek and buffers.
// "rb" should be used when looking at a file in binary. C11 7.21.5.3 3
FILE *f = fopen(argv[1], "rb");
b = atoi(argv[2]);
unsigned long byteCount = 0;
int ch;
while ((ch = fgetc(f)) != EOF) {
if (ch == b) {
byteCount++;
}
}
The OP error checking is good. But the for(k = 0; k < sizeof(buffer); k++){ loop and its contents had various issues. OP had if(b = newFile[i]){ which should have been if(b == newFile[i]){
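The fragment above can be wrapped into two small functions. This is a sketch with hypothetical names (parse_byte and count_byte), not the answerer's code. It swaps atoi for strtol, because atoi returns 0 for non-numeric input, so an argument like "abc" would silently be treated as byte 0.

```c
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

/* Parse s as a decimal byte value 0..255; return 0 on success, -1 otherwise. */
int parse_byte(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v < 0 || v > 255)
        return -1;
    *out = (int)v;
    return 0;
}

/* Count how often byte value b occurs in the rest of stream f. */
unsigned long count_byte(FILE *f, int b)
{
    unsigned long n = 0;
    int ch;

    while ((ch = fgetc(f)) != EOF)
        if (ch == b)
            n++;
    return n;
}
```

In main, one would check argc first, call parse_byte on argv[2], open argv[1] with "rb", and print the result of count_byte with %lu.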
Not really an ANSWER --
Chux corrected the code, this is just more than fits in a comment.
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
struct stat st;
int rc=0;
if(argv[1])
{
rc=stat(argv[1], &st);
if(rc==0)
printf("bytes in file %s: %lld\n", argv[1], (long long)st.st_size); /* st_size is an off_t */
else
{
perror("Cannot stat file");
exit(EXIT_FAILURE);
}
return EXIT_SUCCESS;
}
return EXIT_FAILURE;
}
The stat() call is handy for getting file size and for determining file existence at the same time.
Applications use stat instead of reading the whole file, which is great for gigantic files.