Lack of understanding when fread a file into an array - c

I'm doing CS50 (Week 4 Lab 4 Volume) and I have to copy a header from an input WAV file to output file (the 44 first bytes).
To start, I created the array uint8_t header[HEADER_SIZE];. Then I used fread. But why is the right solution fread(header, HEADER_SIZE, 1, input); and not something like fread(header, sizeof(uint8_t), HEADER_SIZE, input);?
Doesn't it mean that the whole header will be copied at once into a single element of the header array (which would be an error, since each element holds only 1 byte due to its type uint8_t)? Does the compiler know how to handle reading many bytes at once into an array with only 1 byte per element?
Thank you!
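A minimal sketch of why both calls end up with the same array contents, assuming a scratch file from tmpfile() stands in for the WAV input (the function name header_reads_match is made up for illustration). fread only sees a pointer to 44 contiguous bytes of storage; size and count are multiplied to get the number of bytes to transfer, and only the return value differs between the two forms.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define HEADER_SIZE 44

/* Fill a scratch file with HEADER_SIZE known bytes, read it back with both
   fread argument orders, and report whether the two arrays match.
   Returns 1 if identical, 0 if not, -1 on I/O failure. */
int header_reads_match(void)
{
    uint8_t a[HEADER_SIZE], b[HEADER_SIZE];

    /* uint8_t is one byte, so each array is exactly HEADER_SIZE bytes. */
    assert(sizeof a == HEADER_SIZE);

    FILE *f = tmpfile();               /* hypothetical stand-in for input */
    if (f == NULL)
        return -1;

    for (int i = 0; i < HEADER_SIZE; i++)
        fputc(i, f);

    rewind(f);
    size_t n1 = fread(a, HEADER_SIZE, 1, f);               /* one 44-byte item */
    rewind(f);
    size_t n2 = fread(b, sizeof(uint8_t), HEADER_SIZE, f); /* 44 one-byte items */
    fclose(f);

    if (n1 != 1 || n2 != HEADER_SIZE)
        return -1;
    return memcmp(a, b, HEADER_SIZE) == 0;
}
```

The array elements are never treated as separate destinations by fread; the bytes are simply stored one after another starting at header[0].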

Related

How to use fread correctly?

I've seen two ways of providing the size of a file to fread.
Let's say I have a char array data, a file pointer file, and filesize is the size of the file in bytes. Then the first is:
fread(data, 1, filesize, file);
And the second:
fread(data, filesize, 1, file);
My question is, is there any difference between the two lines of code,
and which line of code is more "correct"?
Also, I'm assuming the 1 in the two lines of code actually means sizeof(char), is that correct?
Argument 2: size of each member
Argument 3: Number of objects you want to read
Now your question:
is there any difference between the two lines of code?
fread(data, 1, filesize, file);
Reads up to filesize objects of 1 byte each into the buffer pointed to by data. If fewer than filesize bytes are available, the bytes that are there are still read, and the return value tells you how many.
fread(data, filesize, 1, file);
Reads 1 object of filesize bytes into the buffer pointed to by data. If fewer than filesize bytes are available, fread returns 0 because no complete object was read, even though some bytes may already have been stored in data.
Do whatever is the requirement of your program.
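To make the difference in return values concrete, here is a small sketch (try_read is a hypothetical helper, and the scratch file comes from tmpfile()). It writes a given number of bytes and then asks fread for more than are available, once as many 1-byte items and once as a single large item.

```c
#include <assert.h>
#include <stdio.h>

/* Write `available` bytes to a scratch file, then try to read
   `nmemb` items of `size` bytes each from it.
   Returns fread's return value, or (size_t)-1 on setup failure. */
size_t try_read(size_t available, size_t size, size_t nmemb)
{
    char data[64];

    /* The request must fit in the local buffer. */
    assert(size * nmemb <= sizeof data);

    FILE *f = tmpfile();
    if (f == NULL)
        return (size_t)-1;

    for (size_t i = 0; i < available; i++)
        fputc('x', f);
    rewind(f);

    size_t got = fread(data, size, nmemb, f);  /* number of complete items */
    fclose(f);
    return got;
}
```

With 10 bytes in the file, try_read(10, 1, 16) reports 10 items, while try_read(10, 16, 1) reports 0 because the single 16-byte item could not be completed.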
The first tells fread to read elements of size 1, filesize of them.
The second tells fread to read 1 element of size filesize.
In theory both produce the same result.
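In both theory and practice the two calls store the same bytes when the whole file is available. But if you respect the fread interface, the first line is the more correct one, since its return value is then the number of bytes actually read.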

Explanation of HEX value representation and Endianness

I was working on a script to basically output some sample data as a binary blob.
I'm a new intern in the software field and vaguely remember the idea of endianness.
I realize that in big-endian the most significant byte comes first and the rest follow down the memory block.
If I have 0x03000201 and the data is being parsed to output 0 1 2, how does this happen, and what is being done in terms of bits and bytes to make that work?
I am wondering, in the example posted below, how the numbers are extracted to form 0 1 2 when printing out the data stored in the variables.
For example: I am creating a couple lines of the binary blob using this file:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *file;
    int buffer = 0x03000201;
    int buffer2 = 0x010203;

    file = fopen("test.bin", "wb");
    if (file != NULL)
    {
        fwrite(&buffer, sizeof(buffer), 1, file);
        fwrite(&buffer2, sizeof(buffer2), 1, file);
        fclose(file);
    }
    return 0;
}
I then created a Python script to parse this data:
Info About Parse
import struct
with open('test.bin', 'rb') as f:
    while True:
        data = f.read(4)
        if not data:
            break
        var1, var2, var3 = struct.unpack('=BHB', data)
        print(var1, var2, var3)
Big or little endianness defines how to interpret a sequence of bytes longer than one byte and how to store those in memory. Wikipedia will help you with that.
I was really just looking to understand how 0x03000201, when read 2 bytes at a time and reprinted, yields 0 1 2.
You don't read 2 bytes at a time, you read 4 bytes: data = f.read(4)
f.read(size) reads some quantity of data and returns it as a string (a bytes object when the file is opened in binary mode, as it is here).
You unpack data using =BHB - byte, 2 bytes, byte. Endianness comes into play only when you unpack data, all other IO calls in your code deal with byte sequences.
Experiment with unpack() and read the "Byte Order, Size, and Alignment" section of the struct documentation. You may also look at the file data with a hex editor of your choice.
And if, after your research, you have a concrete question, ask here.
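For what it's worth, here is a sketch in C of what the write and the =BHB unpack amount to, assuming the program ran on a little-endian machine (typical for x86) with a 4-byte int. The helper names store_le32 and unpack_bhb_le are made up for illustration; the byte order is spelled out with shifts so the result does not depend on the host.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Store a 32-bit value least significant byte first (little-endian order),
   which is what fwrite(&buffer, ...) produces on a little-endian host. */
void store_le32(uint32_t value, uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}

/* Decode 4 bytes as: one byte, one little-endian 16-bit value, one byte.
   This mirrors struct.unpack('=BHB', data) on a little-endian host. */
void unpack_bhb_le(const uint8_t in[4], unsigned *b1, unsigned *h, unsigned *b2)
{
    assert(in != NULL);
    *b1 = in[0];
    *h  = (unsigned)in[1] | ((unsigned)in[2] << 8);
    *b2 = in[3];
}
```

Under those assumptions 0x03000201 lands in the file as the bytes 01 02 00 03, so the three fields decode to 1, 2 and 3. The second value, 0x010203, becomes 03 02 01 00 and decodes to 3, 258 and 0.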

How can I find the bytes of a wav file ?

I was 100% sure that the size of a wav file in bytes is chunkSize + 8. What I've been trying to do
is:
fseek(file_pointer, chunkSize+8-4, SEEK_SET)
and then use
fread(rev, 4, 1, file_pointer)
to put the last 4 bytes into the array rev, declared as unsigned char rev[4]. But the bytes it puts in rev are definitely not the last 4 bytes. I've been working on the project for so many hours and I still can't find why this isn't working. If someone can tell me the correct answer I will build a statue of him right now.
If you want to access the last 4 bytes of a file, you can use SEEK_END, like so:
fseek(file_pointer, -4, SEEK_END);
The fread should then return the last four bytes.
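A minimal sketch of that approach with the return values checked (read_last4 is a hypothetical helper name). It assumes the file was opened in binary mode and that the platform supports SEEK_END on binary streams, which is the case on the usual desktop systems.

```c
#include <assert.h>
#include <stdio.h>

/* Read the final 4 bytes of an open binary stream into rev.
   Returns 0 on success, -1 if the seek or the read fails
   (for example, when the file is shorter than 4 bytes). */
int read_last4(FILE *fp, unsigned char rev[4])
{
    assert(fp != NULL);
    if (fseek(fp, -4L, SEEK_END) != 0)
        return -1;
    return fread(rev, 1, 4, fp) == 4 ? 0 : -1;
}
```

Seeking from the end avoids relying on chunkSize + 8 matching the real file size, which may not hold if the file carries extra data after the RIFF chunk.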

Reading a binary file 1 byte at a time

I am trying to read a binary file in C 1 byte at a time and after searching the internet for hours I still can not get it to retrieve anything but garbage and/or a seg fault. Basically the binary file is in the format of a list that is 256 items long and each item is 1 byte (an unsigned int between 0 and 255). I am trying to use fseek and fread to jump to the "index" within the binary file and retrieve that value. The code that I have currently:
unsigned int buffer;
int index = 3; // any index value
size_t indexOffset = 256 * index;
fseek(file, indexOffset, SEEK_SET);
fread(&buffer, 256, 1, file);
printf("%d\n", buffer);
Right now this code is giving me random garbage numbers and seg faulting. Any tips as to how I can get this to work right?
You're confusing bytes with ints. The usual C type for a byte is unsigned char. Most bytes are 8 bits wide. If the data you are reading is 8 bits wide, you need to read it into 8-bit storage:
#define BUFFER_SIZE 256
unsigned char buffer[BUFFER_SIZE];
/* Read in 256 8-bit numbers into the buffer */
size_t bytes_read = 0;
bytes_read = fread(buffer, sizeof(unsigned char), BUFFER_SIZE, file_ptr);
// Note: sizeof(unsigned char) is for emphasis
The reason for reading all the data into memory is to keep the I/O flowing. There is an overhead associated with each input request, regardless of the quantity requested. Reading one byte at a time, or seeking to one position at a time is the worst case.
Here is an example of the overhead required for reading 1 byte:
Tell OS to read from the file.
OS searches to find the file location.
OS tells disk drive to power up.
OS waits for disk drive to get up to speed.
OS tells disk drive to position to the correct track and sector.
-->OS tells disk to read one byte and put into drive buffer.
OS fetches data from drive buffer.
Disk spins down to a stop.
OS returns 1 byte to your program.
In your program design, the above steps will be repeated 256 times. With everybody's suggestion, the line marked with "-->" will read 256 bytes. Thus the overhead is executed only once instead of 256 times to get the same quantity of data.
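The read-once idea can be sketched like this (load_table and TABLE_SIZE are illustrative names, and the function assumes the table starts at the beginning of the file). After one fread call, looking up an item is an ordinary array access.

```c
#include <assert.h>
#include <stdio.h>

#define TABLE_SIZE 256

/* Load the whole 256-entry table with a single fread call.
   Returns 0 on success, -1 if fewer than TABLE_SIZE bytes were read. */
int load_table(FILE *fp, unsigned char table[TABLE_SIZE])
{
    assert(fp != NULL);
    rewind(fp);   /* assumption: the table begins at offset 0 */
    return fread(table, 1, TABLE_SIZE, fp) == TABLE_SIZE ? 0 : -1;
}
```

Once load_table succeeds, table[index] gives the value for any index from 0 to 255 without touching the file again.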
In your code you are trying to read 256 bytes to the address of one int. If you want to read one byte at a time, call fread(&buffer, 1, 1, file); (See fread).
But a simpler solution will be to declare an array of bytes, read it all together and process it after that.
unsigned char buffer; // note: 1 byte
fread(&buffer, 1, 1, file);
It is time to read the man pages, I believe.
Couple of problems with the code as it stands.
The prototype for fread is:
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
You've set the size to 256 (bytes) and the count to 1. That's fine, that means "read one lump of 256 bytes, shove it into the buffer".
However, your buffer is on the order of 2-8 bytes long (or, at least, vastly smaller than 256 bytes), so you have a buffer overrun. You probably want to use fread(&buffer, 1, 1, file).
Furthermore, you're writing byte data through an int pointer. This will work on one endianness (little-endian, in fact), so you'll be fine on Intel architecture and from that learn bad habits that WILL come back and bite you, one of these days.
Try real hard to only write byte data into byte-organised storage, rather than into ints or floats.
You are trying to read 256 bytes into a 4-byte integer variable called "buffer". You are overwriting the next 252 bytes of other data.
It seems like buffer should either be unsigned char buffer[256]; or you should be doing fread(&buffer, 1, 1, f) and in that case buffer should be unsigned char buffer;.
Alternatively, if you just want a single character, you could just leave buffer as int (unsigned is not needed because C99 guarantees a reasonable minimum range for plain int) and simply say:
buffer = fgetc(f);
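A sketch of the single-byte variant with fgetc (byte_at is a hypothetical helper). Since the question describes each item as 1 byte, this assumes the byte offset of an item is simply its index rather than 256 * index.

```c
#include <assert.h>
#include <stdio.h>

/* Return the byte at position `index` (0-based) as a value 0..255,
   or EOF if the seek fails or the file is too short. */
int byte_at(FILE *fp, long index)
{
    assert(fp != NULL && index >= 0);
    if (fseek(fp, index, SEEK_SET) != 0)
        return EOF;
    return fgetc(fp);   /* int return type leaves room for EOF */
}
```

Keeping the result in an int matters here: it lets the caller tell a valid byte value of 255 apart from EOF.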

Reading Binary file in C

I am having the following issue with reading a binary file in C.
I have read the first 8 bytes of a binary file. Now I need to start reading from the 9th byte. The following is the code:
fseek(inputFile, 2*sizeof(int), SEEK_SET);
However, when I print the contents of the array where I store the retrieved values, it still shows me the first 8 bytes which is not what I need.
Can anyone please help me out with this?
Assuming:
FILE* file = fopen(FILENAME, "rb");
char buf[8];
You can read the first 8 bytes and then the next 8 bytes:
/* Read first 8 bytes */
fread(buf, 1, 8, file);
/* Read next 8 bytes */
fread(buf, 1, 8, file);
Or skip the first 8 bytes with fseek and read the next 8 bytes (8 .. 15 inclusive, if counting first byte in file as 0):
/* Skip first 8 bytes */
fseek(file, 8, SEEK_SET);
/* Read next 8 bytes */
fread(buf, 1, 8, file);
The key to understand this is that the C library functions keep the current position in the file for you automatically. fread moves it when it performs the reading operation, so the next fread will start right after the previous has finished. fseek just moves it without reading.
P.S.: My code here reads bytes as your question asked. (Size 1 supplied as the second argument to fread)
fseek just moves the position pointer of the file stream; once you've moved the position pointer, you need to call fread to actually read bytes from the file.
However, if you've already read the first eight bytes from the file using fread, the position pointer is left pointing to the ninth byte (assuming no errors happen and the file is at least nine bytes long, of course). When you call fread, it advances the position pointer by the number of bytes that are read. You don't have to call fseek to move it.
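The way the position indicator advances can be observed with ftell. This is only a sketch (read_two_blocks is a made-up name); it reads two consecutive 8-byte blocks without any fseek in between and records the position after each read.

```c
#include <assert.h>
#include <stdio.h>

/* Read 8 bytes twice in a row and record the stream position after each
   read. Returns 0 on success, -1 on a short read or ftell failure. */
int read_two_blocks(FILE *fp, char first[8], char second[8], long pos[2])
{
    assert(fp != NULL);
    rewind(fp);

    if (fread(first, 1, 8, fp) != 8)
        return -1;
    pos[0] = ftell(fp);          /* position after the first read */

    if (fread(second, 1, 8, fp) != 8)
        return -1;
    pos[1] = ftell(fp);          /* position after the second read */

    return (pos[0] < 0 || pos[1] < 0) ? -1 : 0;
}
```

On a file of at least 16 bytes the recorded positions are 8 and 16, and second holds bytes 8 through 15, which is the "start from the 9th byte" behaviour the question asks for.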

Resources