Writing raw data to a file - c

I have a uint8_t array of raw data that I want to write to a file (I have its length).
The problem is that, because I'm dealing with raw data, there might be a 0x00 (aka null terminator) somewhere, meaning fputs is not reliable. The obvious alternative is a loop that calls fputc(), but is there a way I can do it without that?
Is there, say, a function that takes a pointer and a size and writes that amount of data from the pointer's location to the file?

In addition to the problem with the null character, there is a problem reading binary data when the file is opened in text mode (for example, fgets stops when it encounters a newline, 0x0A, and on Windows a text-mode read also stops at a 0x1A character).
Open the file in binary mode instead, and use fread/fwrite:
FILE *fout = fopen("test.bin", "wb");
fwrite and fread each take a pointer, an element size, and an element count, so an embedded 0x00 byte is not treated specially.
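As a minimal sketch of the "pointer plus size" function the question asks for, the helper below wraps fopen, fwrite and fclose. The name write_bytes and its 0 / -1 return convention are assumptions made for illustration, not part of any library.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes exactly len bytes from data to the file at path, in binary mode.
 * Returns 0 on success and -1 on failure.
 * (write_bytes is a hypothetical helper name, not a standard function.) */
int write_bytes(const char *path, const uint8_t *data, size_t len)
{
    FILE *fout = fopen(path, "wb");
    if (fout == NULL)
        return -1;

    /* Element size 1 and count len, so the return value is a byte count. */
    size_t written = fwrite(data, 1, len, fout);

    /* fclose flushes buffered data, so a failure here also means data loss. */
    int close_failed = (fclose(fout) != 0);

    return (written == len && !close_failed) ? 0 : -1;
}
```

Called as write_bytes("test.bin", array, length), it writes every byte, including any 0x00, 0x0A or 0x1A values, unchanged.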

fread and fwrite are your friends.
uint8_t TheData[NUMBER_OF_ARRAY_ITEMS] = {0};
// ... Transformations to your data ...
// Persist the data
FILE *fHandleOutput = fopen("test.bin", "wb");
if(!fHandleOutput){
printf("Error: Output file handle was NULL!\n");
return;
}
// SIGNATURE: fwrite(const void *restrict ptr, size_t size, size_t nitems, FILE *restrict stream);
fwrite(TheData, sizeof(TheData[0]), NUMBER_OF_ARRAY_ITEMS, fHandleOutput);
fflush(fHandleOutput); // Ensure changes get written to disk before we close
fclose(fHandleOutput);
fHandleOutput = NULL;
// Read the data (a separate snippet from the write above)
// Incoming data buffer
uint8_t TheData[NUMBER_OF_ARRAY_ITEMS] = {0};
// Attempt file open for binary mode
FILE *fHandleInput = fopen("test.bin", "rb");
if(!fHandleInput){
printf("Error: Input file handle was NULL!\n");
return;
}
// SIGNATURE: fread(void *restrict ptr, size_t size, size_t nitems, FILE *restrict stream);
size_t iRead = fread(TheData, sizeof(TheData[0]), NUMBER_OF_ARRAY_ITEMS, fHandleInput);
fclose(fHandleInput);
fHandleInput = NULL;
It's worth noting that the return value of fread can be used to detect End-of-File (EOF) and I/O errors. If iRead < NUMBER_OF_ARRAY_ITEMS, then either an error occurred, or there were only iRead-number of sizeof(TheData[0])-byte segments between the filepointer's position and the EOF. (feof(...) or ferror(...) can be used to determine the cause of a low item read count.)
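The check described above could be sketched roughly as follows. The function name read_items and the hit_error out-parameter are assumptions made for this example; only fopen, fread, ferror and fclose come from the standard library.

```c
#include <stdint.h>
#include <stdio.h>

/* Reads up to max_items bytes from the file at path into buf.
 * Returns the number of bytes actually read. *hit_error is set to 1 if the
 * file could not be opened or an I/O error occurred, and to 0 otherwise
 * (a short count with *hit_error == 0 means end-of-file was reached).
 * (read_items is a hypothetical helper name.) */
size_t read_items(const char *path, uint8_t *buf, size_t max_items, int *hit_error)
{
    *hit_error = 0;

    FILE *fin = fopen(path, "rb");
    if (fin == NULL) {
        *hit_error = 1;
        return 0;
    }

    size_t got = fread(buf, sizeof buf[0], max_items, fin);
    if (got < max_items && ferror(fin)) {
        /* Short count caused by an I/O error rather than by EOF. */
        *hit_error = 1;
    }

    fclose(fin);
    return got;
}
```

A caller that expects a fixed amount of data can treat got < max_items with *hit_error == 0 as a truncated file.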

Related

Is there a way to get size of a file on Windows using C?

I'm currently trying to read the full contents of a file on Windows, using C's fread function. This function requires the size of the buffer that is being read into to be passed as an argument. And because I want the whole file to be read, I need to pass in the size of the file in bytes.
I've tried getting the size of a file on Windows through the Win32 API, more specifically using GetFileSizeEx. The snippet below is from an existing Stack Overflow answer.
__int64 GetFileSize(const char* name)
{
HANDLE hFile = CreateFile(name, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if(hFile == INVALID_HANDLE_VALUE)
return -1; // error condition, could call GetLastError to find out more
LARGE_INTEGER size;
if(!GetFileSizeEx(hFile, &size))
{
CloseHandle(hFile);
return -1; // error condition, could call GetLastError to find out more
}
CloseHandle(hFile);
return size.QuadPart;
}
The returned size from this function is bigger than the actual file size. After executing the following code block
FILE* file = fopen(path, "r");
long size = (long)GetFileSize(path);
char* buffer = new char[size + 1];
fread(buffer, 1, size, file);
buffer[size] = '\0';
the buffer contains garbage bytes at the end of it. I've checked by hand, and the returned size is surely bigger than the actual size in bytes.
I've tried the other methods described in the same Stack Overflow answer linked above, but they all result in garbage bytes at the end of the buffer.
FILE* file = fopen(path, "r"); should be FILE* file = fopen(path, "rb");. If you want the size to match what you read, open the file in binary mode.
On Windows, reading a file in text mode converts "\r\n" sequences to "\n", so fewer bytes are read than the size on disk suggests.
The usual way to get a file's size on any system, using only C standard library functions, is to use fseek() and ftell():
#include <stdio.h>
long get_file_len(char *filename)
{
long int size=0;
FILE *fp= fopen ( filename , "rb" );
if (!fp)
return 0;
fseek (fp,0,SEEK_END); //move file pointer to end of file
size= ftell (fp);
fclose(fp);
return size;
}
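As a variant, on POSIX systems you can also use lseek(), which takes a file descriptor rather than a FILE * (on Windows the equivalent is _lseek from <io.h>):
#include <stdio.h>
#include <unistd.h> // lseek (POSIX, not standard C)
long get_file_len(char *filename)
{
long int size=0;
FILE *fp= fopen ( filename , "rb" );
if (!fp)
return 0;
size = lseek (fileno(fp),0,SEEK_END); //seek the underlying descriptor to the end; returns the offset
fclose(fp);
return size;
}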
You should open the file in binary mode and use fseek and ftell to get the file size; that is the portable way. That way you get rid of the Windows text-mode conversions.
FILE* file = fopen(path, "rb");
fseek(file,0,SEEK_END) ; //move to the end of the file (offset 0 from SEEK_END)
long size=ftell(file); //get the size (position at the end)
rewind(file); //same as fseek(file,0,SEEK_SET), move the position to the beginning
char* buffer = new char[size + 1];
size_t bytes_read=fread(buffer, 1, size, file);
buffer[bytes_read]=0;
if (bytes_read!=(size_t)size)
{
// short read: use feof(file) / ferror(file) to find out why
}
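Since the question is tagged C while the snippets use new, here is a hedged plain-C sketch of the same fseek/ftell approach with the error checks filled in. The name read_whole_file and the out_len parameter are assumptions made for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads the whole file at path into a malloc'd buffer with a trailing '\0'.
 * On success returns the buffer and stores the byte count in *out_len;
 * on failure returns NULL. The caller frees the buffer.
 * (read_whole_file is a hypothetical helper name.) */
char *read_whole_file(const char *path, size_t *out_len)
{
    FILE *file = fopen(path, "rb"); /* binary mode: no "\r\n" translation */
    if (file == NULL)
        return NULL;

    char *buffer = NULL;
    if (fseek(file, 0, SEEK_END) == 0) {
        long size = ftell(file); /* position at the end == size in bytes */
        if (size >= 0 && fseek(file, 0, SEEK_SET) == 0) {
            buffer = malloc((size_t)size + 1);
            if (buffer != NULL) {
                size_t got = fread(buffer, 1, (size_t)size, file);
                if (ferror(file)) {
                    free(buffer);
                    buffer = NULL;
                } else {
                    buffer[got] = '\0'; /* terminate at what was really read */
                    *out_len = got;
                }
            }
        }
    }

    fclose(file);
    return buffer;
}
```

Because ftell returns a long, this sketch is limited to files under 2 GB on platforms where long is 32 bits, which includes Windows.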

Adapt code to copy/paste .zip and .tar.gzip files?

Introduction
I'm writing my own cp program. With the code I currently have I'm able to copy and paste files.
Code
char *buf;
int fd;
int ret;
struct stat sb;
FILE *stream;
/*opening and getting size of file to copy*/
fd = open(argv[1],O_RDONLY);
if(fd == -1)
{
perror("open");
return 1;
}
/*obtaining size of file*/
ret = fstat(fd,&sb);
if(ret)
{
perror("stat");
return 1;
}
/*opening a stream for reading/writing file*/
stream = fdopen(fd,"rb");
if(!stream)
{
perror("fdopen");
return 1;
}
/*allocating space for reading binary file*/
buf = malloc(sb.st_size);
/*reading data*/
if(!fread(buf,sb.st_size,1,stream))
{
perror("fread");
return 1;
}
/*writing file to a duplicate*/
fclose(stream);
stream = fopen("duplicate","wb");
if(!fwrite(buf,sb.st_size,1,stream))
{
perror("fwrite");
return 1;
}
fclose(stream);
close(fd);
free(buf);
return 0;
The problem
I'm unable to copy .zip files and .tar.gz files. If I alter the code to give the output an extension, such as 'duplicate.zip' (assuming I'm copying a zip file), and then copy a .zip file,
everything is copied; however, the new duplicate does not act like a zip file. When I use cat it outputs nothing, and I get this error when I attempt to unzip it anyway:
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
So how do I go about copying .zip files and also .tar.gz files? Any pointers will be helpful, thanks in advance.
You are using malloc() incorrectly. You want to allocate sb.st_size bytes.
malloc(sb.st_size * sizeof buf)
should be
malloc(sb.st_size)
The use of fread() is dubious and you are throwing away the result of fread(). Instead of
if(!fread(buf,sb.st_size,1,stream))
you should have
size_t num_bytes_read = fread (buf, 1, sb.st_size, stream);
if (num_bytes_read < sb.st_size)
You are using strlen() incorrectly. The content of buf is not guaranteed to be a string; and anyway you already know how many bytes you have in buf: sb.st_size. (Because if fread() returned a smaller number of bytes read you got angry and terminated the process.) So instead of
fwrite(buf,strlen(buf),1,stream)
you should have
fwrite (buf, 1, sb.st_size, stream)
In addition to AlexP's notes...
/*obtaining size of file*/
ret = fstat(fd,&sb);
if(ret)
{
perror("stat");
return 1;
}
// ...some code...
/*allocating space for reading binary file*/
buf = malloc(sb.st_size);
/*reading data*/
if(!fread(buf,sb.st_size,1,stream))
{
perror("fread");
return 1;
}
You have a race condition here. If the file size changes between your fstat call and malloc or fread you will read too much or too little of the file.
Fixing this leads us to the next issue: you're slurping the entire file into memory. While this might work for small files, it uses memory very inefficiently on large ones. A very large file might be too big for a single malloc, and you're not checking whether your malloc succeeds.
Instead, read and write the file a piece at a time. And read until there isn't any more to read.
uint8_t buffer[4096]; // 4K buffer (an array of bytes, not of pointers)
size_t num_read;
while( (num_read = fread(buffer, sizeof(uint8_t), sizeof(buffer), in)) != 0 ) {
if( fwrite( buffer, sizeof(uint8_t), num_read, out ) != num_read ) {
perror("fwrite");
break; // stop copying after a failed or short write
}
}
This avoids the race condition by not having to call fstat in the first place. And it avoids allocating a potentially enormous hunk of memory. Instead it can all be done on the stack.
I've used uint8_t to get a hunk of bytes. It's a standard fixed width integer type from stdint.h. You can also use unsigned char to read bytes, and that's probably what uint8_t really is, but uint8_t makes it explicit.
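Putting the chunked loop into a complete function might look something like the sketch below. The name copy_file, the 0 / -1 return convention, and the use of unsigned char instead of uint8_t are choices made for this example only.

```c
#include <stdio.h>

/* Copies the file at src to dst in 4 KiB chunks, in binary mode.
 * Returns 0 on success and -1 on failure.
 * (copy_file is a hypothetical helper name.) */
int copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;

    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    unsigned char buffer[4096];
    size_t num_read;
    int status = 0;

    /* Read until fread returns 0, writing exactly what was read each time. */
    while ((num_read = fread(buffer, 1, sizeof buffer, in)) > 0) {
        if (fwrite(buffer, 1, num_read, out) != num_read) {
            status = -1; /* failed or short write */
            break;
        }
    }

    /* fread returning 0 means either EOF or an error; tell them apart. */
    if (ferror(in))
        status = -1;

    fclose(in);
    if (fclose(out) != 0) /* the final flush can fail too */
        status = -1;

    return status;
}
```

Nothing here depends on the file type, so a .zip or .tar.gz is copied byte for byte like any other file.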

Reading a text file full with null characters and texts using fread

I am trying to design a small file system.
I have created a text file to store the files data in.
int kufs_create_disk(char* disk_name, int disk_size){
FILE* file_ptr = fopen(disk_name, "w");
if (file_ptr == NULL)
return -1;
fseek (file_ptr, disk_size * 1024-1, SEEK_SET);
fwrite("", 1, sizeof(char), file_ptr); // to make a size for the file
fclose(file_ptr);
DiskName=disk_name;
return 0;
}
After writing to the file I get a file with the size I determine when I call the function.
kufs_create_disk("test.txt", 5);
which creates a 5 KB file, filled with '\0' up to that size.
I have created another function to write to this file in different places of the file which works just fine and I won't paste the code for simplicity.
When I try to read from the file using fread(), I'm not getting all the data I have written into the memory; rather I get just some of the data.
My read implementation would be:
int kufs_read(int fd, void* buf, int n){
FILE *file_ptr= fopen("test.txt","a+");
fseek (file_ptr, FAT[fd].position, SEEK_SET); //where FAT[fd].position is where I want to start my read and fd is for indexing purposes
fread(buf, 1, n, file_ptr); //n is the number of bytes to be read
FAT[fd].position = FAT[fd].position + n;
}
The thing is, the read returns some of the characters written and not the rest. I did a little test by looping over the whole file to check whether everything is being read, and fread does read everything, but in buf I only get some of the characters I've written.
The text file looks something like this:
0\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00written string1written string2 0\00\00\00\00\00\00\00\00\00\00\00\000\00\00\00\00\00\00\00\00\00\00\00\00writtenstring 3 \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00
I get writtenstring1 and writtenstring2 in the buffer but I don't get writtenstring 3 for example.
Can you explain why?

pattern searching of binary data

I am trying to build an antivirus in C.
I do it like this:
Read the data of the virus and of the picture file to be scanned.
Check whether the virus data appears in the picture data.
I read the data of the scanned file and the virus file like this (I read the file in binary mode because it is a picture, .png):
// open file
file = fopen(filePath, "rb");
if (!file)
{
printf("Error: can't open file.\n");
return 0;
}
// Allocate memory for fileData
char* fileData = calloc(fileLength + 1, sizeof(char));
// Read data of file.
fread(fileData, fileLength, 1, file);
After I read the file data and the virus data, I check whether the virus appears in the file like this:
char* ret = strstr(fileData, virusID);
if (ret != NULL)
printf("Infected file");
It does not work, even though my picture contains the virusID.
I want to check if the binary data of virus appear in binary data of picture.
For example: my binary data of my virus http://pastebin.com/xZbWA9qu
And the binary data of my picture(with the virus): http://pastebin.com/yjXr84kr
First, note the order of the arguments of fread: fread(void *ptr, size_t size, size_t nmemb, FILE *stream);. To get the number of bytes read, it's better to call fread(fileData, 1, fileLength, file);. Your call returns 0 or 1, depending on whether the file held a full fileLength bytes, not the number of bytes it read.
Second, strstr searches for strings, not memory blocks. To search binary blocks you need to write your own function, or you can use the GNU extension function memmem.
// With glibc, define _GNU_SOURCE before including <string.h> to get memmem
// Allocate memory for fileData
char *fileData = malloc(fileLength);
// Read data of file.
size_t nread = fread(fileData, 1, fileLength, file);
void *ret = memmem(fileData, nread, virusID, virusLen);
if (ret != NULL)
printf("Infected file");
Search for the first byte of the virus signature, if you find it then see if the next byte is the second byte of the signature, and so on until you have checked and matched all bytes of the signature. Then the file is infected. If not all bytes matches then search again for the first byte of the signature.
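The byte-by-byte search just described could be written roughly like this, for platforms without memmem. The name find_bytes and its parameter names are assumptions made for this sketch; it is the plain quadratic search, not an optimized algorithm.

```c
#include <stddef.h>

/* Returns a pointer to the first occurrence of needle (needle_len bytes)
 * inside haystack (hay_len bytes), or NULL if it does not occur.
 * Lengths are explicit, so 0x00 bytes are compared like any other byte.
 * (find_bytes is a hypothetical helper name.) */
const unsigned char *find_bytes(const unsigned char *haystack, size_t hay_len,
                                const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0)
        return haystack; /* an empty pattern matches at the start */
    if (needle_len > hay_len)
        return NULL;

    /* Try every start position where the whole needle still fits. */
    for (size_t start = 0; start + needle_len <= hay_len; start++) {
        size_t matched = 0;
        while (matched < needle_len &&
               haystack[start + matched] == needle[matched])
            matched++;
        if (matched == needle_len)
            return haystack + start; /* all signature bytes matched */
    }
    return NULL;
}
```

With the names from the answer above, the check would be find_bytes((const unsigned char *)fileData, nread, (const unsigned char *)virusID, virusLen) != NULL.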

Binary file reading

I am dealing with code that reads data from a binary file. The code is given here. Would anyone please explain the role of fseek and fread here?
fc = fopen(CLOUDS_FILE, "rb");
if (fc == NULL){ fputs("File open error.\n", stderr); exit(1); }
crs = aux[CLRS];
fpos = (int) (pixel[2]*crs*crs + pixel[1]*crs + pixel[0]);
flsz = sizeof(fd);
fseek(fc, fpos*flsz, 0);
rd = fread((void *) &fd, flsz, 1, fc);
if (rd != 1){ fputs("Read error.\n", stderr); exit(1); }
fclose(fc);
fseek() changes the file offset. fread() reads data starting from the current offset, incrementing the offset by the number of elements read.
(Or is the question something else entirely? I mean, the above is something one can trivially figure by reading the manpages)
The binary file reading is done with an internal 'pointer', just like text editors have a cursor position when editing something. When opening the file in reading mode (using fopen) the pointer will be at the beginning of the file. Read operations (like fread, which will read a specified number of bytes from the stream) start reading at the pointer position and usually advance the pointer when they're done. If it is only necessary to read a specific part of the file, it is possible to manually set the pointer to a certain (relative or absolute) position, this is what fseek is used for.
#include <stdio.h>
int fseek(FILE *stream, long offset, int whence);
The fseek() function sets the file position indicator for the stream
pointed to by stream. The new position, measured in bytes, is obtained
by adding offset bytes to the position specified by whence. If whence
is set to SEEK_SET, SEEK_CUR, or SEEK_END, the offset is relative to
the start of the file, the current position indicator, or end-of-file,
respectively.
#include <stdio.h>
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
The function fread() reads nmemb elements of data, each size bytes
long, from the stream pointed to by stream, storing them at the
location given by ptr.
Sure, fseek is forwarding the "read from" index in the file to a calculated offset in CLOUDS_FILE, while fread is reading one object of size sizeof(fd) (whatever fd is, as that's not in your pasted code) into fd.
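To make the seek-then-read pattern concrete, here is a small sketch that fetches one fixed-size record by index. The question does not show the type of fd, so the float element type is an assumption, as are the name read_float_at and its parameters.

```c
#include <stdio.h>

/* Reads the record at position index from a binary file of floats.
 * Returns 0 on success and -1 on failure (open, seek, or read error,
 * including an index past the end of the file).
 * (read_float_at is a hypothetical helper; float is an assumed record type.) */
int read_float_at(const char *path, long index, float *out)
{
    FILE *fc = fopen(path, "rb");
    if (fc == NULL)
        return -1;

    int status = -1;

    /* fseek moves the file position to index * record size, measured from
     * the start of the file; fread then reads one record from there. */
    if (fseek(fc, index * (long)sizeof *out, SEEK_SET) == 0 &&
        fread(out, sizeof *out, 1, fc) == 1)
        status = 0;

    fclose(fc);
    return status;
}
```

In the question's code, the index is computed from the pixel coordinates and the third argument to fseek is the literal 0, which is the value SEEK_SET has on common implementations.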

Resources