Scanning a file for a signature (using fread) - c

The function takes two file streams: a signature file and a file to scan.
It is supposed to scan the second file and return 1 (file infected) if the signature occurs anywhere in it.
I tried passing the same file for both parameters and it still returned 0. I tried debugging but I can't find the problem.
(The sizeOfFile function returns the size of the file in bytes.)
int scanFile(FILE* signatureFile, FILE* scannedFile)
{
const size_t signatureSize = sizeOfFile(signatureFile);
const size_t scannedFileSize = sizeOfFile(scannedFile);
size_t l1 = 0;
size_t l2 = 0;
unsigned char currChar = ' ';
unsigned char currSignatureChar = ' ';
int i = 0;
unsigned char signatureFirstChar = fread(&signatureFirstChar, 1, 1, signatureFile);
if (scannedFileSize >= signatureSize)
{
while ((l1 = fread(&currChar, 1, 1, scannedFile)) != 0)
{
if (currChar == signatureFirstChar)
{
fseek(scannedFile, -1, SEEK_CUR);
fseek(signatureFile, 0, SEEK_SET);
currSignatureChar = signatureFirstChar;
while (currChar == currSignatureChar)
{
if ((l1 = fread(&currChar, 1, 1, scannedFile)) != 0 && (l2 = fread(&currSignatureChar, 1, 1, signatureFile)) != 0)
{
i++;
if (i == signatureSize)
{
return 1;
}
}
else
{
break;
}
}
}
}
}
else
{
return 0;
}
return 0;
}
Any kind of help would be appreciated.

The fread() function returns the number of objects read, not the data itself. You are assigning that return value to signatureFirstChar, overwriting the byte that was just read, when you do
unsigned char signatureFirstChar = fread(&signatureFirstChar, 1, 1, signatureFile);
Change it to
unsigned char signatureFirstChar;
fread(&signatureFirstChar, 1, 1, signatureFile);
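The search itself can also be done without seeking back and forth in the streams. The following is only a minimal sketch, assuming both files are small enough to be read completely into memory first (for example with fread); the helper name contains_signature and its parameters are illustrative and do not come from the question. It slides over the scanned buffer and compares each window with memcmp.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the sig_len bytes at sig occur anywhere
 * in the data_len bytes at data, 0 otherwise. Both buffers are assumed to
 * have been filled beforehand, e.g. with fread. */
int contains_signature(const unsigned char *data, size_t data_len,
                       const unsigned char *sig, size_t sig_len)
{
    if (sig_len == 0 || sig_len > data_len)
        return 0;
    /* Try every start offset where a full signature still fits. */
    for (size_t start = 0; start + sig_len <= data_len; start++) {
        if (memcmp(data + start, sig, sig_len) == 0)
            return 1;
    }
    return 0;
}
```

With this approach a file compared against itself matches at offset 0. Treating an empty signature as "not found" is a choice made for this sketch, not something the question specifies.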

How can I convert this method to become recursive?

The method counts the occurrences of each character and returns 1 if every count is even; otherwise it returns 0. The string is passed via str[]. Every value in chars[] is set to one at the start. It's hard to picture this becoming recursive; any help or teaching is appreciated.
int recursionCheckEven(int i, int j, char str[], int chars[20]) {
for (i = 0; i < strlen(str); i+=2) {
int count = 0;
for (j = i; j < strlen(str); j+=2) {
if (str[i] == str[j] && chars[j] == 1) {
count++;
chars[i] = 2;
chars[j] = 2;
}
}
if (count % 2 != 0) {
chars[i] = 0;
}
}
for (int k = 0; k < 20; k++) {
if (chars[k] == 0) {
return 0;
break;
}
}
return 1;
}
How I call this:
for (unsigned int i = 0; i < stringcount; i++) {
int chars[20] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
if(recursionCheckEven(0, 0, strings[i], chars)) {
printf("The %dth string has even number of characters\n", i);
}
}
You can go through the characters with a loop in a non-recursive way, and this is the recommended approach. Recursion replaces the loop with repeated function calls, which is not recommended here: each call uses stack memory, and deep recursion can overflow the stack.
For a recursive check, you can use a pointer to check one element, then advance to the next element and call the same function again.
To help you get started, this is a recursive function which takes a string and counts the occurrences of a given character.
#include <stdio.h>
int recursive(int total, char* ptr, char ch)
{
if (*ptr == '\0')
return total;
if (*ptr == ch)
total++;
return recursive(total, ptr + 1, ch);
}
int main(void)
{
char *str = "111";
char ch = '1';
int total = recursive(0, str, ch);
printf("total of char %c in %s: %d\n", ch, str, total);
return 0;
}
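Building on the counting function, the even-count check can itself be written recursively. This is only a sketch under one assumption: the goal is read as "every character in the string occurs an even number of times", which ignores the step of 2 used in the question's loops. The names count_char, all_even_from and all_counts_even are made up for illustration.

```c
/* Counts how often ch occurs in the string s, one character per call. */
static int count_char(const char *s, char ch)
{
    if (*s == '\0')
        return 0;
    return (*s == ch) + count_char(s + 1, ch);
}

/* Checks the characters from pos to the end of str: returns 1 if each of
 * them occurs an even number of times in the whole of str, else 0. */
static int all_even_from(const char *str, const char *pos)
{
    if (*pos == '\0')
        return 1;                       /* nothing left to check */
    if (count_char(str, *pos) % 2 != 0)
        return 0;                       /* found a character with an odd count */
    return all_even_from(str, pos + 1);
}

/* Hypothetical entry point: 1 if all character counts are even, else 0. */
int all_counts_even(const char *str)
{
    return all_even_from(str, str);
}
```

A call such as all_counts_even(strings[i]) would replace the call in the question's loop. No chars[] bookkeeping array is needed, at the cost of recounting repeated characters.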

RLE implementation failing every once in a while

I implemented the following two functions for RLE compression of binary files.
char* RLEcompress(char* data, size_t origSize, size_t* compressedSize) {
char* ret = calloc(2 * origSize, 1);
size_t retIdx = 0, inIdx = 0;
size_t retSize = 0;
while (inIdx < origSize) {
size_t count = 1;
size_t contIdx = inIdx;
while (contIdx < origSize - 1 && data[inIdx] == data[++contIdx]) {
count++;
}
size_t tmpCount = count;
// break down counts with 2 or more digits into counts ≤ 9
while (tmpCount > 9) {
tmpCount -= 9;
ret[retIdx++] = data[inIdx];
ret[retIdx++] = data[inIdx];
ret[retIdx++] = '9';
retSize += 3;
}
ret[retIdx++] = data[inIdx];
retSize += 1;
if (tmpCount > 1) {
// repeat character (this tells the decompressor that the next digit
// is in fact the # of consecutive occurrences of this char)
ret[retIdx++] = data[inIdx];
// convert single-digit count to string
ret[retIdx++] = '0' + tmpCount;
retSize += 2;
}
inIdx += count;
}
*compressedSize = retSize;
return ret;
}
char* RLEdecompress(char* data, size_t compressedSize, size_t uncompressedSize, size_t extraAllocation) {
char* ret = calloc(uncompressedSize + extraAllocation, 1);
size_t retIdx = 0, inIdx = 0;
while (inIdx < compressedSize) {
ret[retIdx++] = data[inIdx];
if (data[inIdx] == data[inIdx + 1]) { // next digit is the # of occurrences
size_t occ = ((data[inIdx + 2]) - '0');
for (size_t i = 1; i < occ && retIdx < compressedSize; i++) {
ret[retIdx++] = data[inIdx];
}
inIdx += 2;
}
inIdx += 1;
}
return ret;
}
They seem to work fine, i.e. diff doesn't produce any output when comparing the original files to the compressed-then-uncompressed versions.
However, every once in a while, the files will differ indicating there is a bug somewhere. I haven't been able to find a pattern in the files that exhibit this, but I'll give you an example of what the difference looks like.
The lower one is the original.
As you can see, the byte 21 is repeated twice in the compressed-then-uncompressed version. I haven't been able to identify the issue. Unfortunately the bug happens with very few files: so far I've only observed it with two PDF files, including the one shown above. I can't share them because they are copyrighted content, but I'm working on coming up with another file that fails so I can provide an example.
I have a feeling there is something "obvious" wrong with the code above and I'm just missing it. Help is greatly appreciated.
EDIT:
Here's a test program I'm using to read the offending file, compress it, then decompress it. I'm also saving the compressed data to disk as an intermediate step to have more debug data.
int main(int argc, char** argv) {
size_t compsz;
FILE* fp = fopen(argv[1], "r");
if (!fp) {
perror("fp");
return 1;
}
if (fseek(fp, 0L, SEEK_END) == -1) {
return -1;
}
// get file size
size_t filecontentLen = ftell(fp);
if (filecontentLen < 0) {
return -1;
}
rewind(fp);
char* filecontentBuf = calloc(filecontentLen, 1);
if (!filecontentBuf) {
fclose(fp);
errno = ENOMEM;
return -1;
}
// read original
if (fread(filecontentBuf, sizeof(char), filecontentLen, fp) <= 0) {
int errnosave = errno;
if (ferror(fp)) {
fclose(fp);
free(filecontentBuf);
errno = errnosave;
return -1;
}
}
// write compressed
char* compressed = RLEcompress(filecontentBuf, filecontentLen, &compsz);
FILE* fpcompWrite = fopen("compressed", "w+");
if (fwrite(compressed, compsz, 1, fpcompWrite) == -1) {
perror("fwrite");
}
fclose(fpcompWrite);
// read compressed
FILE* fpcompRead = fopen("compressed", "r");
if (!fpcompRead) {
perror("fpcompRead");
return 1;
}
char* compBuf = calloc(compsz * 2, 1);
fread(compBuf, compsz, 1, fpcompRead);
fclose(fpcompRead);
// decompress and write file
char* uncompBuf = RLEdecompress(compBuf, compsz, filecontentLen, 0);
FILE* funcomp = fopen("uncompressed", "w+");
fwrite(uncompBuf, filecontentLen, 1, funcomp);
fclose(funcomp);
}
I think the problem is that
for (size_t i = 1; i < occ && retIdx < compressedSize; i++) {
ret[retIdx++] = data[inIdx];
}
should be changed to
for (size_t i = 1; i < occ && retIdx < uncompressedSize; i++) {
ret[retIdx++] = data[inIdx];
}
in the decompression algorithm, since retIdx indexes the output buffer and is therefore bounded by uncompressedSize, not compressedSize. With the original condition, maybe in some rare cases it copies fewer bytes than it should.
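To make the bounds explicit, here is a sketch of the same encoding (a doubled byte followed by a digit from '2' to '9' stands for a run) in which the decoder is limited by the capacity of the output buffer. It is a rewrite under stated assumptions, not the code from the question: the names rle_compress and rle_decompress, the use of unsigned char, and the SIZE_MAX error return are all choices made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Encodes n bytes. A run of 2..9 equal bytes b becomes b, b, digit;
 * longer runs are split into chunks of at most 9. Caller frees the result. */
unsigned char *rle_compress(const unsigned char *in, size_t n, size_t *out_len)
{
    /* Worst case is 3 output bytes per 2 input bytes, so 2*n+1 suffices. */
    unsigned char *out = malloc(2 * n + 1);
    size_t i = 0, o = 0;
    if (out == NULL)
        return NULL;
    while (i < n) {
        size_t run = 1;
        while (run < 9 && i + run < n && in[i + run] == in[i])
            run++;
        out[o++] = in[i];
        if (run > 1) {
            out[o++] = in[i];                       /* doubled byte marks a run */
            out[o++] = (unsigned char)('0' + run);  /* run length as a digit */
        }
        i += run;
    }
    *out_len = o;
    return out;
}

/* Decodes into out, never writing more than out_cap bytes. Returns the
 * number of bytes written, or SIZE_MAX on malformed input or overflow. */
size_t rle_decompress(const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_cap)
{
    size_t i = 0, o = 0;
    while (i < in_len) {
        size_t count = 1, step = 1;
        /* Only look at in[i + 1] and in[i + 2] when they exist. */
        if (i + 2 < in_len && in[i] == in[i + 1]) {
            if (in[i + 2] < '2' || in[i + 2] > '9')
                return SIZE_MAX;
            count = (size_t)(in[i + 2] - '0');
            step = 3;
        }
        for (size_t k = 0; k < count; k++) {
            if (o >= out_cap)        /* bounded by the OUTPUT size */
                return SIZE_MAX;
            out[o++] = in[i];
        }
        i += step;
    }
    return o;
}
```

Returning the decoded length lets the caller compare it with the original size instead of assuming the round trip succeeded. When testing with real files, opening them in binary mode ("rb" and "wb") avoids newline translation on platforms that perform it.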

String isn't getting mutated after using a function, despite being mutated inside the function c language

Quick summary: while testing my functions, I decided to write my own add_as_first() and add_as_last() functions, which add a character to an existing string by allocating a new array and copying the old contents, without using the standard C string functions. For some reason, one of my two strings isn't behaving the way I expect. I added strlen() calls for debugging, and I can see that inside print_file() the string is perfectly healthy and grows as the function goes through its loop iterations. But when the function returns, the string "res" is back to length 0, while the second string, "body", was properly mutated and has the value it is supposed to have.
The difference between them is that "res" is filled using add_as_last() inside the print_file() loop, while "body" is filled using add_as_first() inside forming_list(), which is called from print_file(). Any idea why it behaves this way? It looks like a scoping issue, but it should work, considering that "body" and "res" are used in about the same way.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define x_size 4
#define y_size 4
char* add_as_first(char *given_arr, char given_char, int check) {
int strLen = 0;
while (given_arr[strLen] != '\0') {
strLen++;
}
char *new_arr;
new_arr = (char*)malloc((strLen + 1 + 1) * sizeof(char));
new_arr[0] = given_char;
int i = 0;
for (; i < strLen; i++) {
new_arr[i + 1] = given_arr[i];
}
new_arr[i + 1] = '\0';
if ((i + 1 + 1) != (strLen + 1 + 1)) {
printf("WE GOT A MEMORY LEAK OUT THERE\n");
}
if (check != 0) {
free(given_arr);
}
return new_arr;
}
char* add_as_last(char *given_arr, char given_char, int check) {
int strLen = 0;
while (given_arr[strLen] != '\0') {
strLen++;
}
char* newString = (char*)malloc((strLen + 1 + 1) * sizeof(char));
int i = 0;
for(; i < strLen; i++) {
newString[i] = given_arr[i];
}
newString[i] = given_char;
newString[i + 1] = '\0';
if ((i + 1 + 1) != (strLen + 1 + 1)) {
printf("WE GOT A MEMORY LEAK OUT THERE\n");
}
return newString;
}
void forming_list(int file_arr[][y_size], int i, int j, char body[]) {
char thing = '0';
thing = j + '0';
body = add_as_first(body, thing, 1);
body = add_as_first(body, ',', 1);
thing = i + '0';
body = add_as_first(body, thing, 1);
body = add_as_first(body, ' ', 1);
if ((file_arr[i][j] == 1) || (file_arr[i][j] == 5)) {
if (j < x_size - 1) {
forming_list(file_arr, i, j + 1, body);
} else {
forming_list(file_arr, i + 1, 0, body);
}
} else if ((file_arr[i][j] == 2) || (file_arr[i][j] == 6)) {
if (j != 0) {
forming_list(file_arr, i, j - 1, body);
} else {
forming_list(file_arr, i - 1, x_size - 1, body);
}
} else if ((file_arr[i][j] == 3) || (file_arr[i][j] == 7)) {
if (i == 0) {
forming_list(file_arr, y_size - 1, j, body);
} else {
forming_list(file_arr, i - 1, j, body);
}
} else if ((file_arr[i][j] == 4) || (file_arr[i][j] == 8)) {
if (i == y_size - 1) {
forming_list(file_arr, 0, j, body);
} else {
forming_list(file_arr, i + 1, j, body);
}
}
}
void print_file(int file_arr[][y_size], char *res, char *body) {
int i = 0;
int j = 0;
for(i; i < y_size; i++) {
for (j; j < x_size; j++) {
if ((file_arr[i][j] >= 5) && (file_arr[i][j] <= 9)) {
res = add_as_last(res, '#', 1);
}
else if ((file_arr[i][j] >= 1) && (file_arr[i][j] <= 4)) {
res = add_as_last(res, '#', 1);
forming_list(file_arr, i, j, body);
} else {
res = add_as_last(res, 'x', 1);
}
}
res = add_as_last(res, '\n', 1);
printf("Length of string = %zu \n",strlen(res));
j = 0;
}
}
int main()
{
int disp[x_size][y_size] = {
{0, 1, 8, 0},
{0, 0, 8, 0},
{0, 9, 6, 0},
{0, 0, 0, 0}
};
char *res;
char *body;
res = (char*)malloc((1) * sizeof(char));
body = (char*)malloc((1) * sizeof(char));
res[0] = '\0';
body[0] = '\0';
printf("Length of string = %zu \n",strlen(res));
print_file(disp, res, body);
printf("Length of string = %zu \n",strlen(res));
printf("%s\n", res);
printf("%s\n", body);
return 0;
}
The output is:
Length of string = 0 //After creating a res line
Length of string = 5 //After the first outer for loop
Length of string = 10 //After the second outer for loop
Length of string = 15 //After the third outer for loop
Length of string = 20 //After the fourth outer for loop just before exiting function
Length of string = 0 //Outside the function before printing "res" string
//Printing empty "res" string
2,1 2,2 1,2 0,2 0,1 //Printing properly filled "body" string
Why are the values of "body" and "res" so different after my program finishes, despite the two being handled so similarly?
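Your print_file function modifies res, but the new pointer never gets passed back into main: C passes pointers by value, so print_file only changes its own copy.
main's res therefore still points to the original one-byte allocation. add_as_last never frees its input, so that allocation still holds the empty string and strlen reports 0. body is the opposite case: add_as_first frees the old buffer, so main's body is left dangling, and printing it is undefined behavior that merely happens to show the expected text.
You should pass a pointer to the pointer (char **), or return the new pointer, so the modification gets back into main. The same applies to body in forming_list.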
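As an illustration of passing a pointer to the pointer, here is a minimal sketch. The helper name append_char is hypothetical, and unlike the question's functions it uses realloc rather than a manual allocate-and-copy, so treat it as one possible shape rather than a drop-in replacement.

```c
#include <stdlib.h>

/* Hypothetical helper: appends c to the heap-allocated string *s and
 * updates the caller's pointer. Returns 0 on success, -1 if allocation
 * fails (in that case *s is unchanged and still valid). */
int append_char(char **s, char c)
{
    size_t len = 0;
    while ((*s)[len] != '\0')             /* manual length, as in the question */
        len++;
    char *grown = realloc(*s, len + 2);   /* old text + c + '\0' */
    if (grown == NULL)
        return -1;
    grown[len] = c;
    grown[len + 1] = '\0';
    *s = grown;                           /* the caller now sees the new buffer */
    return 0;
}
```

In main this would be called as append_char(&res, '#'). Inside a function like print_file, the parameter would become char **res and be forwarded unchanged.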

Element disappears while calling a function - LZW Compression

I did some research but found nothing that really matched my problem.
I'm trying to code LZW compression for school, and I need a function to check whether an element is in my dictionary.
However, when I call this function, it tries to access the 64th element of my dictionary, but the element has disappeared! I checked just before the call and it was there. Worse, earlier calls of the function could access this element without any problem.
Could you help me please?
The function:
int is_in_dictionnary(dico * p_pRoot, char * p_string){
int i = 0, j = 0;
char a[1024] = { 0 }, b[1024] = { 0 };
//strcpy(b, p_pRoot->m_dico[64].m_info);
for (i = 0; i < p_pRoot->m_index; i++){
printf("dico %s\n", p_pRoot->m_dico[i].m_info);
strcpy(a, p_string);
strcpy(b, p_pRoot->m_dico[i].m_info);
j = strcmp(a, b);
if (j == 0)
return i;
}
return -1;
}
The console output; here we can see that the function previously accessed the 64th element "#" without any problem:
The error in Visual Studio:
Some people asked me to add the part of the code that is not working:
void lzw_compress(dico *p_pRoot, char * path)
{
FILE *pFile = NULL, *pCompFile = NULL;
int len_c = 0, size_tamp = 0, i = 0, masked_tamp = 0, tamp_to_write = 0, index_tamp = 0, a;
unsigned char char_tamp = 0, cAndTamp[1024] = { 0 }, tampon[1024] = { 0 }, c = '\0', temp[2] = { 0 };
char test[128] = { 0 };
pFile = fopen(path, "r+");
if (!pFile)
{
printf("problem while opening file to compress");
return;
}
size_t len = strlen(path); //creation of the output file name : path + ".lzw"
unsigned char *compress_name = malloc(len + 4 + 1);
strcpy(compress_name, path);
compress_name[len] = '.';
compress_name[len + 1] = 'l';
compress_name[len + 2] = 'z';
compress_name[len + 3] = 'h';
compress_name[len + 4] = '\0';
pCompFile = fopen(compress_name, "w"); //creation of the output file
free(compress_name);
while (1)
{
if (feof(pFile))
break;
c = freadByte(pFile);
for (i = 0; i < 1024; i++)
cAndTamp[i] = 0;
temp[0] = c;
strcat(cAndTamp, tampon);
strcat(cAndTamp, temp);
strcpy(test, p_pRoot->m_dico[64].m_info);
a = 0;
if (is_in_dictionnary(p_pRoot, cAndTamp) > -1)
{
strcpy(tampon, cAndTamp);
a = 0;
}
else
{
if (is_in_dictionnary(p_pRoot, tampon) < 256) //write the character in the file
{
char_tamp = tampon[0];
fwrite(&char_tamp, sizeof(char), 1, pCompFile);
a = 0;
}
else
{
a = 0;
index_tamp = is_in_dictionnary(p_pRoot, tampon);
a = 0;
for (i = 0; i < p_pRoot->m_size; i++)
{
mask = 1 << i;
masked_tamp = index_tamp & mask;
tamp_to_write = masked_tamp >> i;
fwriteBit(tamp_to_write, pCompFile);
flush(pCompFile);
}
}
strcpy(test, p_pRoot->m_dico[64].m_info); //HERE IT'S OK
add_dictionnary(p_pRoot, cAndTamp, size_tamp + 1); //add the string tamp + read byte to the dictionary
strcpy(test, p_pRoot->m_dico[64].m_info); //HERE IT IS NOT OK
strcpy(tampon, temp);
}
strcpy(test, p_pRoot->m_dico[64].m_info);
size_tamp = is_in_dictionnary(p_pRoot, tampon);
}
if (tampon < 256) //write the character in the file
{
char_tamp = (char)tampon;
fwrite(&char_tamp, sizeof(char), 1, pCompFile);
}
else
{
index_tamp = is_in_dictionnary(p_pRoot, tampon);
for (i = 0; i < p_pRoot->m_size; i++)
{
mask = 1 << i;
masked_tamp = index_tamp & mask;
tamp_to_write = masked_tamp >> i;
fwriteBit(tamp_to_write, pCompFile);
flush(pCompFile);
}
}
fclose(pFile);
fclose(pCompFile);
}
The function where I think the problem is:
void add_dictionnary(dico * p_pRoot, char * p_string, int p_stringSize)
{
p_pRoot->m_index++;
if (p_pRoot->m_index == pow(2, p_pRoot->m_size))
realloc_dictionnary(p_pRoot);
p_pRoot->m_dico[p_pRoot->m_index].m_info = (char*)calloc(p_stringSize, sizeof(char));
strcpy(p_pRoot->m_dico[p_pRoot->m_index].m_info, p_string);
}
Thank you again, guys!
I showed the program to my teacher again and he found the problem!
The problem is that I never use malloc and rarely use realloc, so this is where the mistake was:
void realloc_dictionnary(dico * p_pRoot)
{
int real = p_pRoot->m_size + 1;
int size = pow(2, real);
printf("index %d, previous pow %d, new power %d, size %d\n", p_pRoot->m_index, p_pRoot->m_size, real, size);
p_pRoot->m_dico = (code*) realloc(p_pRoot->m_dico, size);
p_pRoot->m_size = real;
}
Here size is a number of entries, but realloc expects a size in bytes.
So the correction is size * sizeof(code):
void realloc_dictionnary(dico * p_pRoot)
{
int real = p_pRoot->m_size + 1;
int size = pow(2, real);
printf("index %d, previous pow %d, new power %d, size %d\n", p_pRoot->m_index, p_pRoot->m_size, real, size);
p_pRoot->m_dico = (code*) realloc(p_pRoot->m_dico, size * sizeof(code));
p_pRoot->m_size = real;
}
First of all, sorry for such a small error, and many thanks for your great patience!
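For reference, a common way to write such a growth step is sketched below. The types entry and table are hypothetical stand-ins for the post's code and dico, whose full definitions are not shown. The element count is multiplied by the element size, the result of realloc goes into a temporary so the old block is not lost on failure, and the capacity is doubled with integer arithmetic instead of pow, which works in double.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    char *m_info;        /* string stored for this dictionary entry */
} entry;                 /* hypothetical stand-in for the post's `code` type */

typedef struct {
    entry *items;        /* dynamically sized array of entries */
    size_t capacity;     /* number of entries allocated, not bytes */
} table;                 /* hypothetical stand-in for the post's `dico` type */

/* Doubles the capacity (starting at 256 entries). Returns 0 on success,
 * -1 on allocation failure, in which case the table is left unchanged. */
int table_grow(table *t)
{
    size_t new_cap = t->capacity ? t->capacity * 2 : 256;
    entry *p = realloc(t->items, new_cap * sizeof *p);   /* size in BYTES */
    if (p == NULL)
        return -1;
    for (size_t i = t->capacity; i < new_cap; i++)
        p[i].m_info = NULL;            /* new slots start out empty */
    t->items = p;
    t->capacity = new_cap;
    return 0;
}
```

Writing sizeof *p rather than naming the type keeps the size correct even if the element type changes later.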

Deallocating 2D array and creating in function C

I have a function "n" which creates a 2D array. In main() I call this function when the key "n" is entered. If "n" has already been called, the program must deallocate the array (p_spz), then run "n" again to create the array anew. My code:
if ((p_file != NULL) && (p_spz == NULL) && (funkcia == 'n'))
{
p_spz = n(p_file);
}
else if ((p_file != NULL) && (p_spz != NULL) && (funkcia == 'n'))
{
for(int i = 0; i < 5; i++)
{
free(p_spz[i]);
}
free(p_spz);
p_spz = n(p_file);
}
After the first call of n everything is OK, but the second time the program fails. How do I deallocate it properly?
Here is the "n" function.
char** n(FILE *p_file) {
int poc = 0, i = 1, j = 0, l, k=0;
char ch;
while (!feof(p_file)) {
ch = fgetc(p_file);
if (ch == '\n') {
poc++;
}
}
poc = poc / 5;
char **p_spz = (char **)malloc(sizeof(char *)*poc);
fseek(p_file, 0L, SEEK_SET);
char riadok[50];
while (fgets(riadok, sizeof(riadok), p_file)) {
if (i == 2) {
p_spz[k] = (char *)malloc(sizeof(char) * 7);
for (l = 0; l < 7; l++) {
p_spz[k][l] = riadok[l];
}
p_spz[k][7] = 0;
k++;
}
else if (i == 5) {
i = 0;
}
i++;
}
fseek(p_file, 0L, SEEK_SET);
return p_spz;
}
It reads a file (the file is already open; that happens in the function above this one). It takes the second line and then every fifth line after it, and stores each of those strings in the array.
Everything works properly until I want to deallocate the array.
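Two details in the code above are likely to matter here. Each row is allocated with 7 bytes, yet p_spz[k][7] = 0 writes an eighth byte, which is out of bounds. The free loop is also hard-coded to 5 rows, while the number of rows allocated is poc. The sketch below shows one way to keep allocation and deallocation matched; the helper names copy_rows and free_rows are invented for illustration, and the rows are copied from an in-memory array of lines rather than from a file.

```c
#include <stdlib.h>
#include <string.h>

#define FIELD_LEN 7   /* characters kept per row, as in the question */

/* Hypothetical helper: frees the first `count` rows, then the row array. */
void free_rows(char **rows, size_t count)
{
    if (rows == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(rows[i]);
    free(rows);
}

/* Hypothetical helper: copies up to FIELD_LEN characters of each line into
 * its own heap-allocated row. Returns NULL if any allocation fails. */
char **copy_rows(const char *const *lines, size_t count)
{
    char **rows = calloc(count ? count : 1, sizeof *rows);
    if (rows == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        rows[i] = malloc(FIELD_LEN + 1);      /* +1 for the terminator */
        if (rows[i] == NULL) {
            free_rows(rows, i);               /* free only what was allocated */
            return NULL;
        }
        strncpy(rows[i], lines[i], FIELD_LEN);
        rows[i][FIELD_LEN] = '\0';
    }
    return rows;
}
```

The caller has to remember the row count and pass the same value to free_rows. Setting the pointer to NULL after freeing keeps a check like p_spz == NULL meaningful.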
