pattern searching of binary data - c

I am trying to build an antivirus in C.
I do it like this:
Read the data of the virus and of the picture file to be scanned.
Check whether the virus data appears in the picture data.
I read the data of the scanned file and of the virus file like this (I open the file in binary mode because it is a picture, .png):
// open file
file = fopen(filePath, "rb");
if (!file)
{
printf("Error: can't open file.\n");
return 0;
}
// Allocate memory for fileData
char* fileData = calloc(fileLength + 1, sizeof(char));
// Read data of file.
fread(fileData, fileLength, 1, file);
After I read the file data and the virus data, I check whether the virus appears in the file like this:
char* ret = strstr(fileData, virusID);
if (ret != NULL)
printf("Infected file");
It does not work, even though my picture contains virusID.
I want to check whether the binary data of the virus appears in the binary data of the picture.
For example: my binary data of my virus http://pastebin.com/xZbWA9qu
And the binary data of my picture(with the virus): http://pastebin.com/yjXr84kr

First, note the order of the arguments of fread: fread(void *ptr, size_t size, size_t nmemb, FILE *stream). To get the number of bytes read, it is better to call fread(fileData, 1, fileLength, file);. As written, your call returns 0 or 1 depending on whether the file holds a full fileLength bytes, not the number of bytes actually read.
Second, strstr searches strings, not memory blocks: it stops at the first zero byte, and binary data such as a PNG contains many of them. To search binary blocks, you need to write your own search function, or you can use the GNU extension function memmem.
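// memmem is a GNU extension: define _GNU_SOURCE before including <string.h>
// Allocate memory for fileData
char *fileData = malloc(fileLength);
// Read data of file; nread is the number of bytes actually read.
size_t nread = fread(fileData, 1, fileLength, file);
// Search the nread bytes for the virusLen-byte signature.
void *ret = memmem(fileData, nread, virusID, virusLen);
if (ret != NULL)
printf("Infected file");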

Search for the first byte of the virus signature. If you find it, check whether the next byte matches the second byte of the signature, and so on until you have checked all bytes of the signature. If they all match, the file is infected. If any byte does not match, resume the search for the first byte of the signature from the position after the previous candidate.
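The byte-by-byte search described above can be sketched roughly as follows. This is a minimal, unoptimized illustration rather than a definitive implementation; the name find_bytes and its parameters are hypothetical and do not come from the question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the first occurrence of the
   needle_len-byte block `needle` inside the haystack_len-byte block
   `haystack`, or NULL if it does not occur. Zero bytes are treated like
   any other byte, unlike strstr. */
const unsigned char *find_bytes(const unsigned char *haystack, size_t haystack_len,
                                const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0)
        return haystack;            /* an empty signature matches at the start */
    if (needle_len > haystack_len)
        return NULL;                /* signature cannot fit */

    for (size_t i = 0; i + needle_len <= haystack_len; i++) {
        /* Cheap first-byte check, then compare the whole candidate. */
        if (haystack[i] == needle[0] &&
            memcmp(haystack + i, needle, needle_len) == 0)
            return haystack + i;
    }
    return NULL;
}
```

With the buffers from the question, the call would look something like find_bytes((unsigned char *)fileData, nread, (unsigned char *)virusID, virusLen), where nread and virusLen are the byte counts actually read from each file.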

Related

Reading a text file full with null characters and texts using fread

I am trying to design a small file system.
I have created a text file to store the files data in.
int kufs_create_disk(char* disk_name, int disk_size){
FILE* file_ptr = fopen(disk_name, "w");
if (file_ptr == NULL)
return -1;
fseek (file_ptr, disk_size * 1024-1, SEEK_SET);
fwrite("", 1, sizeof(char), file_ptr); // to make a size for the file
fclose(file_ptr);
DiskName=disk_name;
return 0;
}
After writing to the file I get a file with the size I specified when calling the function.
kufs_create_disk("test.txt", 5);
which creates a 5 KB file, padded with '\0' up to that size.
I have created another function to write to this file in different places of the file which works just fine and I won't paste the code for simplicity.
When I try to read from the file using fread(), I'm not getting all the data I have written into the memory; rather I get just some of the data.
My read implementation would be:
int kufs_read(int fd, void* buf, int n){
FILE *file_ptr= fopen("test.txt","a+");
fseek (file_ptr, FAT[fd].position, SEEK_SET); //where FAT[fd].position is where I want to start my read and fd is for indexing purposes
fread(buf, 1, n, file_ptr); //n is the number of bytes to be read
FAT[fd].position = FAT[fd].position + n;
}
The thing is, the read returns some of the characters written and not the rest. I did a little test by looping over the whole file to check whether everything is being read: fread reads everything, but in buf I only get some of the characters I've written.
The text file looks something like this:
0\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00written string1written string2 0\00\00\00\00\00\00\00\00\00\00\00\000\00\00\00\00\00\00\00\00\00\00\00\00writtenstring 3 \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00
I get writtenstring1 and writtenstring2 in the buffer but I don't get writtenstring 3 for example.
Can you explain why?

Writing raw data to a file

I have a uint8_t array of raw data that I want to write to a file (I have its length).
The problem is that, because I'm dealing with raw data, there might be a 0x00 (aka null terminator) somewhere, meaning fputs is not reliable. The obvious alternative is a loop that calls fputc(), but is there a way I can do it without that?
Is there, say, a function that takes a pointer and a size and writes that amount of data from the pointer's location to the file?
In addition to the problem with the null character, there is a problem reading binary data when the file is opened in text mode (for example, fgets stops at a newline, 0x0A, and on Windows a text-mode stream treats 0x1A as end of file).
Open the file in binary mode instead, and use fread/fwrite:
FILE *fout = fopen("test.bin", "wb");
fread and fwrite are your friends.
uint8_t TheData[NUMBER_OF_ARRAY_ITEMS] = {0};
// ... Transformations to your data ...
// Persist the data
FILE *fHandleOutput = fopen("test.bin", "wb");
if(!fHandleOutput){
printf("Error: Output file handle was NULL!\n");
return;
}
// SIGNATURE: fwrite(const void *restrict ptr, size_t size, size_t nitems, FILE *restrict stream);
fwrite(TheData, sizeof(TheData[0]), NUMBER_OF_ARRAY_ITEMS, fHandleOutput);
fflush(fHandleOutput); // Flush buffered data to the OS (fclose would also do this)
fclose(fHandleOutput);
fHandleOutput = NULL;
// Read the data
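// Incoming data buffer (declare it in a separate function or scope from the
// write code above, otherwise TheData would be defined twice)
uint8_t TheData[NUMBER_OF_ARRAY_ITEMS] = {0};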
// Attempt file open for binary mode
FILE *fHandleInput = fopen("test.bin", "rb");
if(!fHandleInput){
printf("Error: Input file handle was NULL!\n");
return;
}
// SIGNATURE: fread(void *restrict ptr, size_t size, size_t nitems, FILE *restrict stream);
size_t iRead = fread(TheData, sizeof(TheData[0]), NUMBER_OF_ARRAY_ITEMS, fHandleInput);
fclose(fHandleInput);
fHandleInput = NULL;
It's worth noting that the return value of fread can be used to detect End-of-File (EOF) and I/O errors. If iRead < NUMBER_OF_ARRAY_ITEMS, then either an error occurred, or there were only iRead-number of sizeof(TheData[0])-byte segments between the filepointer's position and the EOF. (feof(...) or ferror(...) can be used to determine the cause of a low item read count.)
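As a rough illustration of the point above, a short read can be classified with feof/ferror right after the fread call. The helper name read_items and the enum read_status are hypothetical, introduced only for this sketch, which assumes item_size is nonzero.

```c
#include <stdio.h>

/* Hypothetical status codes for this sketch. */
enum read_status { READ_OK, READ_EOF, READ_ERROR };

/* Hypothetical helper: reads up to `count` items of `item_size` bytes and
   reports why fewer items than requested were read, if that happens.
   Assumes item_size > 0. Returns the number of complete items read. */
size_t read_items(void *buf, size_t item_size, size_t count, FILE *fp,
                  enum read_status *status)
{
    size_t got = fread(buf, item_size, count, fp);

    if (got == count)
        *status = READ_OK;
    else if (ferror(fp))
        *status = READ_ERROR;   /* an I/O error cut the read short */
    else
        *status = READ_EOF;     /* end of file reached before `count` items */

    return got;
}
```

In the answer's code this would replace the bare fread call, for example read_items(TheData, sizeof(TheData[0]), NUMBER_OF_ARRAY_ITEMS, fHandleInput, &status).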

Displaying size of a file [C]

I'm making a simple sockets program to send a text file or a picture file over to another socket connected to a port. However, I want to also send the size of the file over to the client socket so that it knows how many bytes to receive.
I also want to implement something where I can send a certain number of bytes instead of the file itself. For example, if a file I wanted to send was 14,003 bytes and I felt like sending 400 bytes, then only 400 bytes would be sent.
I am implementing something like this:
#include <stdio.h>
int main(int argc, char* argv[]) {
FILE *fp;
char* file = "text.txt";
int offset = 40;
int sendSize = 5;
int fileSize = 0;
if ((fp = fopen(file, "r")) == NULL) {
printf("Error: Cannot open the file!\n");
return 1;
} else {
/* Seek from offset into the file */
//fseek(fp, 0L, SEEK_END);
fseek(fp, offset, sendSize + offset); // seek to sendSize
fileSize = ftell(fp); // get current file pointer
//fseek(fp, 0, SEEK_SET); // seek back to beginning of file
}
printf("The size is: %d", fileSize);
}
offset is pretty much going to go 40 bytes into the file and then send whatever sendSize bytes over to the other program.
I keep getting an output of 0 instead of 5. Any reason behind this?
You can try this.
#include <stdio.h>
int main(int argc, char* argv[]) {
FILE *fp;
char* file = "text.txt";
int offset = 40;
int sendSize = 5;
int fileSize = 0;
if ((fp = fopen(file, "r")) == NULL) {
printf("Error: Cannot open the file!\n");
return 1;
} else {
fseek(fp, 0L, SEEK_END);
fileSize = ftell(fp);
}
printf("The size is: %d", fileSize);
}
The fseek() to the end, then ftell() method is a reasonably portable way of getting the size of a file, but not guaranteed to be correct. It won't transparently handle newline / carriage return conversions, and as a result, the standard doesn't actually guarantee that the return from ftell() is useful for any purpose other than seeking to the same position.
The only portable way is to read the file until data runs out and keep a count of bytes. Or stat() the file using the (non-ANSI) Unix standard function.
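The read-and-count approach mentioned above might look like the following sketch. The function name count_bytes is hypothetical; it assumes the stream was opened in binary mode ("rb") and counts from the current position to the end of the file.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes between the current position of
   `fp` and end of file by reading them. Returns -1 on a read error. */
long long count_bytes(FILE *fp)
{
    unsigned char buf[4096];
    long long total = 0;
    size_t n;

    /* fread returns 0 at end of file or on error. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long long)n;

    return ferror(fp) ? -1 : total;
}
```

This costs a full pass over the file, so it trades speed for portability compared with fseek/ftell or stat().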
You may be opening the file in text mode as Windows can open a file in text mode even without the "t" option.
And you can't use ftell() to get the size of a file opened in text mode. Per 7.21.9.4 The ftell function of the C Standard:
For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file position indicator for the stream to its position at the time of the ftell call; the difference between two such return values is not necessarily a meaningful measure of the number of characters written or read.
Even if it does return the "size" of the file, the translation to "text" may change the actual number of bytes read.
It's also not portable or standard-conforming to use fseek() to find the end of a binary file. Per 7.21.9.2 The fseek function:
A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END.
I think your seek does not work because of the third parameter. Try seeking with
fseek(fp, offset, SEEK_SET);
Your call passes the number sendSize + offset as the "origin" constant. That argument is compared against the three constants listed below (typically 0, 1 and 2); 45 matches none of them, so the seek most likely fails, the position stays at the start of the file, and ftell returns 0.
http://www.cplusplus.com/reference/cstdio/fseek/
Parameters
stream, offset, origin
origin: Position used as reference for the offset. It is specified by one of the following constants defined in <cstdio> exclusively to be used as arguments for this function:
Constant    Reference position
SEEK_SET    Beginning of file
SEEK_CUR    Current position of the file pointer
SEEK_END    End of file
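Putting the corrected seek together with the goal from the question (skip offset bytes, then read sendSize bytes), a sketch could look like this. The helper name read_slice is hypothetical, and the file is assumed to be opened in binary mode ("rb") so that offsets are plain byte counts.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to `send_size` bytes starting `offset`
   bytes from the beginning of the file. Returns the number of bytes
   actually read, which is smaller than send_size near the end of the
   file and 0 if the seek fails. */
size_t read_slice(FILE *fp, long offset, void *buf, size_t send_size)
{
    /* The third argument must be SEEK_SET, SEEK_CUR or SEEK_END. */
    if (fseek(fp, offset, SEEK_SET) != 0)
        return 0;

    return fread(buf, 1, send_size, fp);
}
```

The return value, not ftell, is the number to report as the amount of data that will be sent.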

how to split a file into pages and set each page address

My software needs to read a file and write it to a device. It should split the file into smaller pages that have a maximum size (say M bytes), and also set the page address for each cycle. How can I implement it in C?
Thanks!
Hetty
It's not clear what you are going to do with this data, but to read a file chunk by chunk you just need to use fread:
FILE *file = fopen("yourfile.dat", "rb");
size_t amount;
unsigned char buffer[PAGE_SIZE];
while ((amount = fread(buffer, 1, PAGE_SIZE, file)) > 0)
{
..
}
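One way the loop above might be extended to track a page address is sketched below. Everything device-specific is an assumption: the callback type page_writer, the function send_in_pages, the value of PAGE_SIZE, and the idea that addresses advance by one page size per cycle are all hypothetical, since the question does not describe the device. log_page is only a stand-in for the real device write.

```c
#include <stdio.h>
#include <stddef.h>

#define PAGE_SIZE 256   /* assumed maximum page size (the "M bytes") */

/* Hypothetical callback: writes one page to the device, returns 0 on success. */
typedef int (*page_writer)(unsigned long address, const unsigned char *data,
                           size_t len, void *ctx);

/* Hypothetical helper: reads `file` page by page and hands each page to
   `write_page` together with its address. Returns the number of pages
   sent, or -1 on a read or write error. */
long send_in_pages(FILE *file, unsigned long base_address,
                   page_writer write_page, void *ctx)
{
    unsigned char buffer[PAGE_SIZE];
    unsigned long address = base_address;
    long pages = 0;
    size_t amount;

    while ((amount = fread(buffer, 1, PAGE_SIZE, file)) > 0) {
        if (write_page(address, buffer, amount, ctx) != 0)
            return -1;
        address += PAGE_SIZE;   /* assumed: next page starts one page further */
        pages++;
    }
    return ferror(file) ? -1 : pages;
}

/* Stand-in for a real device write: just records what it was given. */
struct page_log { unsigned long last_address; size_t total_bytes; };

int log_page(unsigned long address, const unsigned char *data,
             size_t len, void *ctx)
{
    struct page_log *pl = ctx;
    (void)data;
    pl->last_address = address;
    pl->total_bytes += len;
    return 0;
}
```

The last page is usually shorter than PAGE_SIZE, which is why the callback receives the actual length rather than assuming a full page.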

Copying a file in C with fwrite

I am new to C and was trying to write a program just to copy a file so that I could learn the basics of files. My code takes a file as input and figures out its length by subtracting its start from its end using fseek and ftell. Then it uses fwrite to write, based on what I could get from its man page, ONE element of data, (END - START) bytes long, to the stream pointed to by OUT, obtaining it from the location given by FI. The problem is that, although it does produce "copy output," the file is not the same as the original. I tried reading the input file into a variable and then writing from there, but that didn't help either. What am I doing wrong?
Thanks
int main(int argc, char* argv[])
{
FILE* fi = fopen(argv[1], "r"); //create the input file for reading
if (fi == NULL)
return 1; // check file exists
int start = ftell(fi); // get file start address
fseek(fi, 0, SEEK_END); // go to end of file
int end = ftell(fi); // get file end address
rewind(fi); // go back to file beginning
FILE* out = fopen("copy output", "w"); // create the output file for writing
fwrite(fi,end-start,1,out); // write the input file to the output file
}
Should this work?
{
FILE* out = fopen("copy output", "w");
int* buf = malloc(end-start); fread(buf,end-start,1,fi);
fwrite(buf,end-start,1,out);
}
This isn't how fwrite works.
To copy a file, you'd typically allocate a buffer, then use fread to read one buffer of data, followed by fwrite to write that data back out. Repeat until you've copied the entire file. Typical code is something on this general order:
#define SIZE (1024*1024)
char buffer[SIZE];
size_t bytes;
while (0 < (bytes = fread(buffer, 1, sizeof(buffer), infile)))
fwrite(buffer, 1, bytes, outfile);
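A slightly fuller version of that loop, with the error checks the fragment leaves out, might look like this sketch. The function name copy_stream and the 64 KB buffer size are assumptions made for illustration; both streams are expected to be opened in binary mode ("rb" and "wb").

```c
#include <stdio.h>

/* Hypothetical helper: copies everything from `in` (starting at its
   current position) to `out`. Returns 0 on success, -1 on a read or
   write error. */
int copy_stream(FILE *in, FILE *out)
{
    char buffer[64 * 1024];
    size_t bytes;

    while ((bytes = fread(buffer, 1, sizeof buffer, in)) > 0) {
        /* Write exactly the bytes that were read, not the buffer size. */
        if (fwrite(buffer, 1, bytes, out) != bytes)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

Because the loop never needs the file's length, it avoids the fseek/ftell size calculation from the question entirely.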
The first parameter of fwrite is a pointer to the data to be written to the file, not a FILE* to read from. You have to read the data from the first file into a buffer, then write that buffer to the output file. http://www.cplusplus.com/reference/cstdio/fwrite/
Perhaps a look through an open-source copy tool in C would point you in the right direction.
Here is how it can be done:
Option 1: Dynamic "Array"
Nested Level: 0
// Variable Definition
char *cpArr;
FILE *fpSourceFile = fopen(<Your_Source_Path>, "rb");
FILE *fpTargetFile = fopen(<Your_Target_Path>, "wb");
// Code Section
// Get The Size In Bytes Of The Source File
fseek(fpSourceFile, 0, SEEK_END); // Go To The End Of The File
long lSize = ftell(fpSourceFile); // Size Of The File In Bytes
cpArr = (char *)malloc(sizeof(*cpArr) * lSize); // Create An Array At That Size
fseek(fpSourceFile, 0, SEEK_SET); // Return The Cursor To The Start
// Read From The Source File - "Copy" (Pass The Buffer Itself, Not Its Address)
size_t nRead = fread(cpArr, 1, lSize, fpSourceFile);
// Write To The Target File - "Paste" (Only The Bytes Actually Read)
fwrite(cpArr, 1, nRead, fpTargetFile);
// Close The Files
fclose(fpSourceFile);
fclose(fpTargetFile);
// Free The Used Memory
free(cpArr);
Option 2: Char By Char
Nested Level: 1
// Variable Definition
char cTemp;
FILE *fpSourceFile = fopen(<Your_Source_Path>, "rb");
FILE *fpTargetFile = fopen(<Your_Target_Path>, "wb");
// Code Section
// Read From The Source File - "Copy"
while(fread(&cTemp, 1, 1, fpSourceFile) == 1)
{
// Write To The Target File - "Paste"
fwrite(&cTemp, 1, 1, fpTargetFile);
}
// Close The Files
fclose(fpSourceFile);
fclose(fpTargetFile);
