using memcmp to compare buffers with 0 bytes - c

I am trying to compare two texts from files byte by byte using memcmp, after reading both of them into memory, one file into a buffer (char* or char[], I tried both). The problem is that the file I read into the buffer has a lot of 0 bytes, which makes memcmp stop at the first 0 byte, thinking it is a null terminator, and that causes a segmentation fault. How can I make the function keep comparing bytes even though there are 0 bytes?
I already checked whether the buffer is filled: I printed it byte by byte and it showed all of the bytes, including the 0 bytes. When I print it in one go using printf("%s", buffer), I get only the first byte (the second byte is a 0 byte).
void detect_virus(char *buffer, unsigned int size){
link* l = (link*) malloc(sizeof(link));
load(l);
unsigned int location = 0;
while(l != NULL){
location = 0;
while(location < size - l->vir->SigSize){
int isVirus = memcmp(buffer + location, l->vir->sig, l->vir->SigSize);
if(isVirus == 0)
printf("%d, %s, %d\n", location, l->vir->virusName, l->vir->SigSize);
location++;
}
}
free(l);
}
void detect(link* list){
char filename[50];
fgets(filename, 50, stdin);
sscanf(filename, "%s", filename);
FILE* file = fopen(filename, "rb");
char* buffer = (char*) malloc(10000);
fseek(file, 0, SEEK_END);
unsigned int size = ftell(file);
fseek(file, 0, SEEK_SET);
fread(buffer, 1, size, file);
detect_virus(buffer, size);
fclose(file);
}
I get a segmentation fault the first time the memcmp function is called, instead of a full comparison of the texts. Any ideas how to fix that?
edit
code for load function:
void load(link* list){
printf("Enter Viruses file name: \n");
char* filename = (char*) malloc(100);
fgets(filename, 100, stdin);
sscanf(filename, "%s", filename);
FILE* file = fopen(filename, "r");
while(!feof(file)){
short length = 0;
fread(&length, 2, 1, file);
if(length == 0)
break;
struct virus* v = (struct virus*)malloc(length);
fseek(file, -2, SEEK_CUR);
fread(v, length, 1, file);
v->SigSize = v->SigSize - 18;
list_append(list, v);
}
list = list->nextVirus;
free(filename);
fclose(file);
}
as a note, I tested the function before and it worked.
edit
I found out the problem, thank you all!

Per 7.21.6.7 The sscanf function, paragraph 2 of the C standard (bolding mine):
The sscanf function is equivalent to fscanf, except that input is obtained from a string (specified by the argument s) rather than from a stream. Reaching the end of the string is equivalent to encountering end-of-file for the fscanf function. If copying takes place between objects that overlap, the behavior is undefined.
Note the last sentence, which is the bolded portion.
In your code:
sscanf(filename, "%s", filename);
the filename array certainly overlaps with the filename array, thus invoking undefined behavior.
Remove that line of code.
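If the sscanf call was only there to drop the trailing newline that fgets leaves in the buffer, one common alternative is strcspn. A minimal sketch, where the helper names strip_newline and read_filename are made up for illustration and are not part of the original code:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n', if any. */
static void strip_newline(char *s)
{
    s[strcspn(s, "\r\n")] = '\0';
}

/* Hypothetical helper: read one line from `in` into `buf` (capacity `cap`)
   and trim the line ending. Returns 1 on success, 0 on end-of-file or
   read error. */
static int read_filename(FILE *in, char *buf, int cap)
{
    if (fgets(buf, cap, in) == NULL)
        return 0;
    strip_newline(buf);   /* edits buf in place, no overlapping copy */
    return 1;
}
```

In detect, this could be called as read_filename(stdin, filename, (int)sizeof filename) in place of the fgets/sscanf pair. Unlike "%s", it keeps any spaces inside the name.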
You also need to add error checking, especially checking that the return from fopen() is not NULL.
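As for the original concern: memcmp compares exactly the number of bytes it is given and does not treat a 0 byte as a terminator, so it is fine for binary data as long as both pointers are valid for that many bytes. The sketch below shows a signature scan under that assumption. find_signature is a hypothetical helper, not the poster's function. It also guards against the buffer being shorter than the signature, a case in which the unsigned subtraction size - l->vir->SigSize from the question would wrap around to a huge value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first occurrence of
   sig[0..sig_len) inside buf[0..size), or (size_t)-1 if there is none. */
static size_t find_signature(const unsigned char *buf, size_t size,
                             const unsigned char *sig, size_t sig_len)
{
    if (sig_len == 0 || sig_len > size)
        return (size_t)-1;              /* avoids size - sig_len wrapping */

    for (size_t i = 0; i + sig_len <= size; i++) {
        /* memcmp looks at exactly sig_len bytes, zero bytes included */
        if (memcmp(buf + i, sig, sig_len) == 0)
            return i;
    }
    return (size_t)-1;
}
```

Note also that the question reads size bytes into a fixed 10000-byte allocation; allocating size bytes instead would avoid an overflow for larger files.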

Related

C: Reading binary file outputting zeroes

I am trying to read from a Binary file called "binary.bin" which has the content of "this". I was expecting it to give me the ASCII values of "t", "h", "i", "s" respectively, but it's giving me 5 zeroes.
void bin_byte_by_byte(char *filename) {
FILE *fptr;
unsigned long len;
int *buffer;
fptr = fopen(filename, "rb");
if(!fptr) {
printf("error: file does not exist");
return;
}
// get file lenght - create a function to this
fseek(fptr, 0, SEEK_END);
len = ftell(fptr);
fseek(fptr, 0, SEEK_SET);
buffer = (int*)malloc(sizeof(int) * len);
if(!buffer) {
printf("error: unable to allocate memory");
fclose(fptr);
return;
}
fread(&buffer, sizeof(buffer), len, fptr);
printf("len = %d\n", len);
for(int i = 0; i < len; i++) {
printf("%d ", buffer[i]);
}
if(fclose(fptr) != 0) {
printf("File did not close as expected");
}
free(buffer);
}
Your file is supposed to be binary, but it seems that you pass a text file to your program. The file is 5 bytes long, which is consistent with the content "this" (presumably plus a trailing newline). If you read this file as binary, it probably makes more sense to read bytes and not ints. If you want to read bytes into an int array, you should read byte-wise and store each byte in one position of your int array.
In the program you've listed there are a few mistakes.
buffer = (int*)malloc(sizeof(int) * len);
The line above allocates an array of 5 ints, so it takes 20 bytes (assuming a 4-byte int).
Then you read from the file:
fread(&buffer, sizeof(buffer), len, fptr);
This line asks for len items of sizeof(buffer) bytes each. Since buffer is a pointer, sizeof(buffer) is the size of a pointer (4 bytes on a 32-bit platform), so the call requests 20 bytes from the file although it is only 5 bytes long. Also, you pass the address of the pointer variable, but you need to pass the address of the buffer. So, it should be just buffer and not &buffer.
But the main point is that 4 bytes go into buffer[0]. So 't', 'h', 'i', 's' all end up in the first element of buffer.
So either use char as the array element type, or read byte by byte and store each byte in a separate element of the buffer.
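A sketch of the byte-wise approach, assuming the whole file fits in memory. read_all_bytes is a hypothetical helper name. It reads into an unsigned char buffer so that each array element holds exactly one byte of the file.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read the whole file at `path` into a freshly
   allocated buffer. On success returns the buffer (the caller frees it)
   and stores the byte count in *out_len; on failure returns NULL. */
static unsigned char *read_all_bytes(const char *path, size_t *out_len)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    long len = -1;
    if (fseek(fp, 0, SEEK_END) == 0)
        len = ftell(fp);
    if (len < 0 || fseek(fp, 0, SEEK_SET) != 0) {
        fclose(fp);
        return NULL;
    }

    /* allocate at least 1 byte so an empty file is not mistaken for failure */
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    /* element size 1, count len: the return value is the number of bytes read */
    *out_len = fread(buf, 1, (size_t)len, fp);
    fclose(fp);
    return buf;
}
```

For a file whose first four bytes are "this", printing each buf[i] with "%d " gives 116 104 105 115 on an ASCII system.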

Why doesn't file read output anything?

In the binary file mydata.dat, I've written a string: "this is a test". That's the full contents of the file. I want to read the string back but I don't see any output. The program runs without error though. Any idea what I'm doing wrong?
FILE *f = fopen("mydata.dat", "rb");
char content[100];
while(fread(content, sizeof(content), 1, f) == 1){
printf("%s", content);
}
fclose(f);
First, if you want to read characters, you should use fgets(). But let's say that you really want to use fread().
You must understand that fread() returns the number of items read, so in your case it returns 0: you ask fread() to read 1 element of 100 bytes, which always yields 0 if your file has fewer than 100 bytes. You have swapped the size of an element and the number of elements.
Also, if you want your array to be a valid C string, you must put a NUL terminator byte at the end, because fread() will not do it for you.
Example:
#include <stdio.h>
int main(void) {
FILE *f = fopen("mydata.dat", "rb");
if (f == NULL) { // Error check
perror("fopen()");
return 1;
}
char content[100];
size_t ret;
// We loop on the file to read 99 bytes at each loop
// sizeof *content is the size of an element of content
while ((ret = fread(content, sizeof *content, sizeof content - 1, f)) > 0) {
content[ret] = '\0'; // We use ret to nul terminate our string
printf("%s", content);
fflush(stdout); // flush the standard output
}
fclose(f);
}

not getting all data in file using fopen

I'm using the fopen with fread for this:
FILE *fp;
if (fopen_s(&fp, filePath, "rb"))
{
printf("Failed to open file\n");
//exit(1);
}
fseek(fp, 0, SEEK_END);
int size = ftell(fp);
rewind(fp);
char buffer = (char)malloc(sizeof(char)*size);
if (!buffer)
{
printf("Failed to malloc\n");
//exit(1);
}
int charsTransferred = fread(buffer, 1, size, fp);
printf("charsTransferred = %d, size = %d\n", charsTransferred, strlen(buffer));
fclose(fp);
I'm not getting the file data in the new file. Here is a comparison between the original file (right) and the one that was sent over the network (left):
Any issues with my fopen calls?
EDIT: I can't do away with the null terminators, because this is a PDF. If i get rid of them the file will corrupt.
Be reassured: the way you're doing the read ensures that you're reading all the data.
you're using "rb" so even in windows you're covered against CR+LF conversions
you're computing the size all right using ftell when at the end of the file
you rewind the file
you allocate properly.
BUT you're not storing the right variable type:
char buffer = (char)malloc(sizeof(char)*size);
should be
char *buffer = malloc(size);
(That is very wrong and you should correct it, but since you successfully print some data, it is not the main issue. Next time, enable and read the compiler warnings. And don't cast the return value of malloc: it's error-prone, especially in your case.)
Now for the display using printf and strlen, which is what confuses you.
Since the file is binary, you meet a \0 somewhere, and printf prints only the start of the file. If you want to print the contents, you have to loop and print each character (using charsTransferred as the limit).
The same goes for strlen, which stops at the first \0 character.
The value in charsTransferred is correct.
To display the data, you could use fwrite to stdout (redirect the output or this can crash your terminal because of all the junk chars)
fwrite(buffer, 1, size, stdout);
Or loop and print only if the char is printable (I'd compare ascii codes for instance)
int charsTransferred = fread(buffer, 1, size, fp);
int i;
for (i=0;i<charsTransferred;i++)
{
char b = buffer[i];
putchar((b >= ' ') && (b < 128) ? b : '-'); // '-' must be a char literal, not the string "-"
if (i % 80 == 0) putchar('\n'); // optional linefeed every now and then...
}
fflush(stdout);
that code prints dashes for characters outside the standard printable ASCII-range, and the real character otherwise.
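The same idea as a small reusable function, sketched with isprint from <ctype.h> instead of comparing ASCII codes by hand. The name to_printable is hypothetical. The key point is that the length comes from the fread return value, never from strlen.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy n bytes from src to dst, replacing every
   non-printable byte (including '\0') with '-'. dst must have room for
   n + 1 chars; the result is NUL-terminated, so it is safe for "%s". */
static void to_printable(const unsigned char *src, size_t n, char *dst)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = isprint(src[i]) ? (char)src[i] : '-';
    dst[n] = '\0';
}
```

With the code above, it would be called with buffer and charsTransferred, and a destination of at least charsTransferred + 1 bytes.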

how to write one binary file to another in c

I have a few binary files that I want to write into an output file.
So I wrote this function using a char as a buffer naively thinking it would work.
//Opened hOutput for writing, hInput for reading
void faddf(FILE* hOutput, FILE* hInput) {
char c;
int scan;
do{
scan = fscanf(hInput, "%c", &c);
if (scan > 0)
fprintf(hOutput, "%c", c);
} while (scan > 0 && !feof(hInput));
}
Executing this function outputs only the few readable characters at the beginning of the binary file. So I tried it this way:
void faddf(FILE* hOutput, FILE* hInput) {
void * buffer;
int scan;
buffer = malloc(sizeof(short) * 209000000);
fread(buffer, sizeof(short), 209000000, hInput);
fwrite(buffer, sizeof(short), 209000000, hOutput);
free(buffer);
}
This "works", but it only works when the file is smaller than my "magic number". Is there a better way?
Although your new code (in the answer) is much better than the old code, it can still be improved and simplified.
Specifically, you can avoid any memory problems by copying the file in chunks.
void faddf( FILE *fpout, FILE *fpin )
{
char buffer[4096];
size_t count;
while ( (count = fread(buffer, 1, sizeof buffer, fpin)) > 0 )
fwrite(buffer, 1, count, fpout);
}
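The loop above ignores write failures. A variant with basic error reporting might look like the following; copy_stream and its return convention are assumptions for illustration, not part of the answer.

```c
#include <stdio.h>

/* Hypothetical helper: copy everything from `in` to `out` in 4096-byte
   chunks. Returns 0 on success, -1 on a read or write error. */
static int copy_stream(FILE *in, FILE *out)
{
    char buffer[4096];
    size_t count;

    while ((count = fread(buffer, 1, sizeof buffer, in)) > 0) {
        /* a short write means the data did not all reach `out` */
        if (fwrite(buffer, 1, count, out) != count)
            return -1;
    }
    /* fread returns 0 both at end-of-file and on error; tell them apart */
    return ferror(in) ? -1 : 0;
}
```

Both streams should be opened in binary mode ("rb" and "wb") so that no newline translation changes the data.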
You should avoid reading byte by byte. Use the fgets() function instead of fscanf().
Please refer to: Man fgets() (for Windows)
When you open both files next to each other (input and output), you say that the output file only contains readable characters... But can your text editor display unreadable characters in the input file?
I should not have asked the question in the first place but here is how I ended up doing it:
void faddf(FILE* hOutput, FILE* hInput) {
void * buffer;
int scan,size;
size_t read;
//get the input file size
fseek(hInput, 0L, SEEK_END);
size = ftell(hInput);
fseek(hInput, 0L, SEEK_SET);
//place the get space
buffer = malloc(size);
if (buffer == NULL)exit(1);//should fail silently instead
//try to read everything to buffer
read = fread(buffer, 1, size, hInput);
//write what was read
fwrite(buffer, 1, read, hOutput);
//clean up
free(buffer);
}

fscanf doesn't read the first char of first word (in c)

I'm using the fscanf function in C code to read a file that contains 1 line of words separated by white space. If the first word is 1234, for example, then when I print it the output is 234, while the other words in the file are read correctly. Any ideas?
FILE* file = fopen(path, "r");
char arr = getc(file);
char temp[20];
while(fscanf(file,"%s",temp)!= EOF && i<= column)
{
printf("word %d: %s\n",i, temp);
}
char arr = getc(file);
The line above is probably what causes the first character to be lost.
Here is the posted code, with my comments
When asking a question about a run time problem,
post code that cleanly compiles, and demonstrates the problem
FILE* file = fopen(path, "r");
// missing check of `file` to assure the fopen() was successful
char arr = getc(file);
// this consumed the first byte of the file, (answers your question)
char temp[20];
while(fscanf(file,"%s",temp)!= EOF && i<= column)
// missing length modifier. format should be: "%19s"
// 19 because fscanf() automatically appends a NUL byte to the input
// 19 because otherwise the input buffer could be overrun,
// resulting in undefined behaviour and possible seg fault event
// should be checking (also) for returned value == 1
// note: "%s" skips leading white space by itself, so no leading
// space is needed in the format string
{
printf("word %d: %s\n",i, temp);
// the variable 'i' is neither declared nor modified
// within the scope of the posted code
}
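Putting those comments together, a corrected loop might look like the sketch below. print_words and max_words are hypothetical names standing in for the poster's undeclared i and column.

```c
#include <stdio.h>

/* Hypothetical helper: print up to max_words whitespace-separated words
   from `file` and return how many were printed. */
static int print_words(FILE *file, int max_words)
{
    char temp[20];
    int i = 0;

    /* "%19s" leaves room for the terminating NUL in temp[20];
       comparing with 1 ends the loop on anything but a successful read */
    while (i < max_words && fscanf(file, "%19s", temp) == 1) {
        i++;
        printf("word %d: %s\n", i, temp);
    }
    return i;
}
```

There is no getc call before the loop, so the first word is read whole.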
char arr = getc(file);
reads the first character from the file stream and advances the stream's position past it.
You can call rewind(file) after char arr = getc(file) to reset your file stream to the beginning.
Other example:
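#include <stdio.h>
#include <string.h> // for memset()
int main(void)
{
FILE *f;
FILE *r;
char str[100];
size_t buf;
memset(str, 0, sizeof(str));
r = fopen("in.txt", "r");
f = fopen("out.txt", "w+b");
if (r == NULL || f == NULL) { // error check
perror("fopen()");
return (1);
}
fscanf(r, "%99s", str); // consumes the first word
rewind(r); // without this, the first word won't be written
buf = fread(str, 1, sizeof(str), r); // number of bytes actually read
fwrite(str, 1, buf, f); // write only those bytes
fclose(r);
fclose(f);
return (0);
}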

Resources