Bitmap file header size - file-format

I'm new to working with BMP files, and I read this page to learn about the BMP header:
http://www.daubnet.com/en/file-format-bmp
It seems that the header of a BMP file is 54 bytes.
Using Paint, I created a simple 10x10 image and saved it as 24-bit.
By simple math, the file size should be 10*10*3 + 54 = 354 bytes,
but the hex editor and the file explorer both report a size of 374 bytes.
That is a difference of 20 bytes, and I don't know where it comes from.
Could you tell me why, please?
Thanks a lot!

Lines in BMPs are padded out to multiples of 4 bytes.
Without padding, each line is 3*10 = 30 bytes. With padding, each line is 32 bytes, so the image data is 320 bytes in size. Thus, the file size is 54+320 = 374 bytes.
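The padding arithmetic above can be sketched in C. This is only an illustration: the function names are made up for this example, and the 54-byte header is assumed (no palette, no compression), as in the question.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per stored row: the pixel bytes rounded up to the next multiple of 4. */
uint32_t bmp_row_stride(uint32_t width_px, uint32_t bits_per_pixel)
{
    assert(bits_per_pixel > 0);
    return ((width_px * bits_per_pixel + 31u) / 32u) * 4u;
}

/* Expected file size, assuming a 54-byte header and uncompressed pixel data. */
uint32_t bmp_file_size(uint32_t width_px, uint32_t height_px,
                       uint32_t bits_per_pixel)
{
    return 54u + bmp_row_stride(width_px, bits_per_pixel) * height_px;
}
```

For the 10x10, 24-bit image from the question, bmp_row_stride(10, 24) gives 32 and bmp_file_size(10, 10, 24) gives 374.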

Related

Parsing ID3V2 Frames in C

I have been attempting to retrieve ID3V2 Tag Frames by parsing through the mp3 file and retrieving each frame's size. So far I have had no luck.
I have effectively allocated memory to a buffer to aid in reading the file and have been successful in printing out the header version but am having difficulty in retrieving both the header and frame sizes. For the header framesize I get 1347687723, although viewing the file in a hex editor I see 05 2B 19.
Two snippets of my code:
typedef struct {                  // typedef structure used to read tag information
    char tagid[3];                // 0-2 "ID3"
    unsigned char tagversion;     // 3 $04
    unsigned char tagsubversion;  // 4 00
    unsigned char flags;          // 5-6 %abc0000
    uint32_t size;                // 7-10 4 * %0xxxxxxx
} ID3TAG;
if (buff) {
    fseek(filename, 0, SEEK_SET);
    fread(&Tag, 1, sizeof(Tag), filename);
    if (memcmp(Tag.tagid, "ID3", 3) == 0)
    {
        printf("ID3V2.%02x.%02x.%02x \nHeader Size:%lu\n", Tag.tagversion,
               Tag.tagsubversion, Tag.flags, Tag.size);
    }
}
Due to memory alignment, the compiler has inserted 2 bytes of padding between flags and size. If your struct were laid out in memory without padding, size would sit at offset 6 from the beginning of the struct. Since a 4-byte element is normally placed at an offset that is a multiple of 4, the compiler adds 2 bytes so that size moves to the next such offset, which here is 8. So when you read from your file, size receives file bytes 8-11 instead of bytes 6-9. The bytes you wanted ended up in the padding: if you dump the four bytes starting at (unsigned char *)&Tag.size - 2, you should see the size bytes from the file. (Note that &Tag.size - 2 without the cast would step back two uint32_t elements, that is 8 bytes, not 2.)
To fix that, you can read fields one by one.
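One way to read the fields one by one is to fetch the 10 header bytes into a plain byte buffer and copy each field out by offset, so struct padding never comes into play. The sketch below assumes the ID3v2 header layout (3-byte identifier, version, revision, flags, 4 size bytes); the type and function names are hypothetical, and the size bytes are kept raw here because they still need decoding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ID3V2_HEADER_LEN 10

typedef struct {                 /* hypothetical name, for illustration only */
    char          tagid[3];      /* file bytes 0-2: "ID3" */
    unsigned char tagversion;    /* file byte 3 */
    unsigned char tagsubversion; /* file byte 4 */
    unsigned char flags;         /* file byte 5 */
    unsigned char size_raw[4];   /* file bytes 6-9, still encoded */
} Id3Header;

/* Fill *out from the first 10 bytes of a file.
   Returns 1 if the bytes start with "ID3", 0 otherwise. */
int id3_parse_header(const unsigned char raw[ID3V2_HEADER_LEN], Id3Header *out)
{
    assert(raw != NULL && out != NULL);
    memcpy(out->tagid, raw, 3);
    out->tagversion    = raw[3];
    out->tagsubversion = raw[4];
    out->flags         = raw[5];
    memcpy(out->size_raw, raw + 6, 4);
    return memcmp(out->tagid, "ID3", 3) == 0;
}

/* Read the 10 header bytes from an open file and parse them.
   Returns 1 on success, 0 on a read error or a missing "ID3" marker. */
int id3_read_header(FILE *fp, Id3Header *out)
{
    unsigned char raw[ID3V2_HEADER_LEN];
    if (fseek(fp, 0, SEEK_SET) != 0) return 0;
    if (fread(raw, 1, sizeof raw, fp) != sizeof raw) return 0;
    return id3_parse_header(raw, out);
}
```

Because every member of this struct is a char type, the compiler has no reason to pad it, but the code does not rely on that: each field is copied from an explicit file offset.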
ID3v2 header structure is consistent across all ID3v2 versions (ID3v2.0, ID3v2.3 and ID3v2.4).
Its size is stored as a big-endian synchsafe 32-bit integer.
Synchsafe integers are
integers that keep its highest bit (bit 7) zeroed, making seven bits
out of eight available. Thus a 32 bit synchsafe integer can store 28
bits of information.
Example:
255 (%11111111) encoded as a 16 bit synchsafe integer is 383
(%00000001 01111111).
Source: http://id3.org/id3v2.4.0-structure § 6.2
Below is a straightforward, real-life C# implementation that you can easily adapt to C:
public int DecodeSynchSafeInt32(byte[] bytes)
{
return
bytes[0] * 0x200000 + //2^21
bytes[1] * 0x4000 + //2^14
bytes[2] * 0x80 + //2^7
bytes[3];
}
=> Using the values you read in your hex editor (00 05 2B 19), the actual tag size should be 87449 bytes.
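A possible C adaptation of the C# function above is sketched here. The function name is made up for this example, and masking each byte with 0x7F is an addition of this sketch (the C# version trusts the top bit to be zero already).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode four big-endian synchsafe bytes (7 payload bits each) into an integer.
   The top bit of each byte is masked off in case the input is malformed. */
uint32_t decode_synchsafe32(const unsigned char b[4])
{
    assert(b != NULL);
    return ((uint32_t)(b[0] & 0x7F) << 21) |
           ((uint32_t)(b[1] & 0x7F) << 14) |
           ((uint32_t)(b[2] & 0x7F) << 7)  |
            (uint32_t)(b[3] & 0x7F);
}
```

With the example from the specification, the bytes 00 00 01 7F decode to 255.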
By coincidence, I am also working on an ID3v2 reader. The doc says that the size is encoded in four bytes that carry 7 bits each, so you need another step to convert the byte array into an integer. I don't think just reading those bytes as an int will work, because the top bit of every byte is always zero.

Maximum file size supported by a file representation of a node?

For part (a), I think it would be 63504 bytes, because the file size would be (496/4)*512 + 16 bytes. But I can't get that into the requested format, which leads me to believe I approached it wrong.
For part (b), I have no idea how to approach it. Any help or hints would be appreciated.
Part [a]
Starting with
Extent size 1 ==> File size will be 2^0 * 512 bytes
Extent size 2 ==>File size will be 2^1 * 512 bytes
Extent size 3 ==>File size will be 2^2 * 512 bytes
......
Extent size 12 ==>File size will be 2^11 * 512 bytes
Extent size x ==>File size will be 2^(x-1) * 512 bytes
Part [b]
Taking 512 bytes as the block size,
the number of disk accesses required to read 100,000 bytes is
ceil(100000/512) = 196 (100000/512 is about 195.3, and a partly used block still costs one access).
I hope that makes sense.
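The two calculations in this answer can be written as small C helpers. This is only a sketch of the arithmetic as the answer states it (one disk access per 512-byte block, and 2^(x-1) blocks for extent size x); the function names are invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fixed-size blocks needed to hold `bytes` bytes, rounding up. */
uint64_t blocks_needed(uint64_t bytes, uint64_t block_size)
{
    assert(block_size > 0);
    return (bytes + block_size - 1) / block_size;
}

/* File size for extent size x under the pattern listed in Part [a]:
   2^(x-1) blocks of block_size bytes each. */
uint64_t extent_file_size(unsigned x, uint64_t block_size)
{
    assert(x >= 1 && x <= 32);
    return ((uint64_t)1 << (x - 1)) * block_size;
}
```

blocks_needed(100000, 512) gives 196, matching Part [b], and extent_file_size(12, 512) gives 2^11 * 512 = 1048576 bytes.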

What is the meaning of `16 longs` and `110 words` when reading bits from a file

I'm trying to read the PUD file format, which is used for Warcraft 2 game maps.
In the description of the file structure, there are some short phrases I don't understand.
What do "16 longs" and "110 words" mean?
Here is an example:
16 longs -------> Units and buildings allowed. (16 players)
units bit order:
0000000000000000000000000000000x bit0: footman/grunt
000000000000000000000000000000x0 bit1: peasant/peon
00000000000000000000000000000x00 bit2: ballista/catapult
0000000000000000000000000000x000 bit3: knight/ogre
000000000000000000000000000x0000 bit4: archer/axe thrower
00000000000000000000000000x00000 bit5: mage/death knights
0000000000000000000000000x000000 bit6: tanker
000000000000000000000000x0000000 bit7: destroyer
00000000000000000000000x00000000 bit8: transport
0000000000000000000000x000000000 bit9: battleship/juggernault
000000000000000000000x0000000000 bit10: submarine/turtle
00000000000000000000x00000000000 bit11: flying machine/balloon
0000000000000000000x000000000000 bit12: gryphon/dragon
000000000000000000x0000000000000 bit13: unused/unused
00000000000000000x00000000000000 bit14: demo. squad/sapper
0000000000000000x000000000000000 bit15: aviary/roost
000000000000000x0000000000000000 bit16: farm
00000000000000x00000000000000000 bit17: barracks
0000000000000x000000000000000000 bit18: lumber mill
000000000000x0000000000000000000 bit19: stables/mound
00000000000x00000000000000000000 bit20: mage tower/temple
0000000000x000000000000000000000 bit21: foundry
000000000x0000000000000000000000 bit22: refinery
00000000x00000000000000000000000 bit23: inventor/alchemist
0000000x000000000000000000000000 bit24: church/altar storms
000000x0000000000000000000000000 bit25: tower
00000x00000000000000000000000000 bit26: town hall/great hall
0000x000000000000000000000000000 bit27: keep/stronghold
000x0000000000000000000000000000 bit28: castle/fortress
00x00000000000000000000000000000 bit29: blacksmith
0x000000000000000000000000000000 bit30: shipyard
x0000000000000000000000000000000 bit31: unused
Does this mean 16 longs = 16 * 4 bytes = 64 bytes, or 16 * 32 bits = 512 bits, or something else?
The same question applies to 110 words.
They're referring to C types on a particular architecture. In C, long is a variable type whose size depends on the compiler and platform, but in this case it is a 32-bit value. Words are processor words, which are often taken to be 32 bits in modern parlance. However, Warcraft 2 was written a long time ago for x86 PCs, where a word is traditionally 16 bits. As Sean pointed out in a comment, words are 16 bits in this context.
To answer the question in the comment:
16 longs = 16 * 32 bits = 512 bits = 64 bytes.
110 words = 110 * 16 bits = 1760 bits = 220 bytes.
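As a rough sketch of how the 64-byte block of 16 longs could be used: each player gets one 32-bit value, and each bit in it is tested with a shift and a mask. The helper names are invented for this example, and little-endian byte order is an assumption (it is the usual choice for PC file formats of that era, but check it against real map files).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PUD_PLAYERS = 16, PUD_LONG_BYTES = 4, PUD_WORD_BYTES = 2 };

/* Assemble a 32-bit "long" from 4 bytes, assuming little-endian storage. */
uint32_t read_long_le(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return 1 if bit `bit` (0 = least significant) is set for `player`
   in a block of 16 longs (64 bytes), 0 otherwise. */
int unit_allowed(const unsigned char block[PUD_PLAYERS * PUD_LONG_BYTES],
                 int player, int bit)
{
    assert(player >= 0 && player < PUD_PLAYERS && bit >= 0 && bit < 32);
    uint32_t mask = read_long_le(block + player * PUD_LONG_BYTES);
    return (int)((mask >> bit) & 1u);
}
```

With the table from the question, unit_allowed(block, 0, 16) would report whether player 0 may build farms.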
I've never known "16 longs, 110 words" to have any particular meaning other than long being 64 bit numbers, and words being 32 bit data. I would perform some experiments and see what values are contained in the first 16 8-byte chunks and then 110 4-byte chunks to see if values are relatively consistent.
If by looking at the word values, you see one bit on like in your table above, then presumably you're reading it correctly. However, generally speaking, there's no way to know for sure if you're right for these sorts of things, only ways to know if you're wrong.
Edit: Of course, the sizes have changed over the years and "long" may be 4 bytes, not 8. Likewise words would be 2 bytes, not 4.

Deinterleaving of a 2-channel WAV file into two text files containing raw data [closed]

I have an assignment based on the above stated problem. The sampling frequency and size per sample would be known in the problem. I just need an idea about the kind of coding that would be required for this.
Use a file format spec such as this one to see how to read the file header, determining sample rate, bit rate etc.
The canonical WAVE format starts with the RIFF header:
Offset  Size  Name            Description
0       4     ChunkID         Contains the letters "RIFF" in ASCII form
                              (0x52494646 big-endian form).
4       4     ChunkSize       36 + SubChunk2Size, or more precisely:
                              4 + (8 + SubChunk1Size) + (8 + SubChunk2Size)
                              This is the size of the rest of the chunk
                              following this number. This is the size of the
                              entire file in bytes minus 8 bytes for the
                              two fields not included in this count:
                              ChunkID and ChunkSize.
8       4     Format          Contains the letters "WAVE"
                              (0x57415645 big-endian form).
The "WAVE" format consists of two subchunks: "fmt " and "data":
The "fmt " subchunk describes the sound data's format:
12      4     Subchunk1ID     Contains the letters "fmt "
                              (0x666d7420 big-endian form).
16      4     Subchunk1Size   16 for PCM. This is the size of the
                              rest of the Subchunk which follows this number.
20      2     AudioFormat     PCM = 1 (i.e. Linear quantization)
                              Values other than 1 indicate some
                              form of compression.
22      2     NumChannels     Mono = 1, Stereo = 2, etc.
24      4     SampleRate      8000, 44100, etc.
28      4     ByteRate        == SampleRate * NumChannels * BitsPerSample/8
32      2     BlockAlign      == NumChannels * BitsPerSample/8
                              The number of bytes for one sample including
                              all channels. I wonder what happens when
                              this number isn't an integer?
34      2     BitsPerSample   8 bits = 8, 16 bits = 16, etc.
        2     ExtraParamSize  if PCM, then doesn't exist
        X     ExtraParams     space for extra parameters
The "data" subchunk contains the size of the data and the actual sound:
36      4     Subchunk2ID     Contains the letters "data"
                              (0x64617461 big-endian form).
40      4     Subchunk2Size   == NumSamples * NumChannels * BitsPerSample/8
                              This is the number of bytes in the data.
                              You can also think of this as the size
                              of the rest of the subchunk following this
                              number.
44      *     Data            The actual sound data.
After that, you'll find raw PCM data, interleaved like this:
[sample 1      ][sample 2      ]
[s1,ch1][s1,ch2][s2,ch1][s2,ch2]
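The header fields listed above could be pulled out of the first 44 bytes along these lines. This is a sketch that assumes the canonical PCM layout shown in the table, with the "fmt " chunk at offset 12 and the "data" chunk at offset 36; real files may contain extra chunks and would need a chunk-walking parser. The numeric fields in a RIFF/WAVE file are little-endian. The struct and function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {              /* hypothetical holder for the fields needed here */
    uint16_t num_channels;
    uint32_t sample_rate;
    uint16_t bits_per_sample;
    uint32_t data_bytes;
} WavInfo;

/* Little-endian helpers: low byte first. */
uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse a 44-byte canonical PCM header. Returns 1 on success, or 0 if the
   chunk IDs are not where the canonical layout puts them. */
int wav_parse_canonical(const unsigned char h[44], WavInfo *out)
{
    assert(h != NULL && out != NULL);
    if (memcmp(h, "RIFF", 4) != 0 || memcmp(h + 8, "WAVE", 4) != 0) return 0;
    if (memcmp(h + 12, "fmt ", 4) != 0 || memcmp(h + 36, "data", 4) != 0) return 0;
    out->num_channels    = le16(h + 22);
    out->sample_rate     = le32(h + 24);
    out->bits_per_sample = le16(h + 34);
    out->data_bytes      = le32(h + 40);
    return 1;
}
```

The number of sample frames then follows as data_bytes / (num_channels * bits_per_sample / 8).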
You could open one output file per channel in write mode, then loop over the audio data, reading the bytes for a single sample of one channel at a time and using fprintf (for text) or fwrite (for raw bytes) to write them to the correct file.
The sampling frequency is irrelevant for this, but the size per sample (typically 8 or 16 bits per channel per sample) decides which element type you need for the buffers. Here is an example for 8 bits per channel:
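// begin and end delimit the interleaved 8-bit stereo data;
// numsamples is the number of samples per channel.
const char *reader = begin;        // interleaved input
char *left = malloc(numsamples);   // de-interleaved outputs (needs <stdlib.h>)
char *right = malloc(numsamples);
char *l = left;                    // write cursors, so that left and right
char *r = right;                   // keep pointing at the start of each buffer
while (reader + 1 < end) {         // stop if fewer than two bytes remain
    *l++ = *reader++;              // first byte of the pair: left channel
    *r++ = *reader++;              // second byte of the pair: right channel
}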
To do the same for 2-channel 16-bit interleaved audio, declare all the pointers as short* and allocate numsamples * sizeof(short) bytes for each output buffer instead.
Assuming you've loaded the WAV data into memory, all you need to do is:
1. Open the two output files (using fopen()).
2. Loop over the sample data, and for each sample:
   - Put the left channel's value in the first file
   - Put the right channel's value in the second file
3. Close the files.
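Those steps might look like this for 16-bit stereo data that is already in memory. It is a sketch under a few assumptions: samples are signed 16-bit values in host byte order, "text file" means one decimal number per line, and the function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Split `frames` interleaved stereo frames (L, R, L, R, ...) into two arrays. */
void split_stereo16(const int16_t *in, size_t frames,
                    int16_t *left, int16_t *right)
{
    assert(in != NULL && left != NULL && right != NULL);
    for (size_t i = 0; i < frames; ++i) {
        left[i]  = in[2 * i];
        right[i] = in[2 * i + 1];
    }
}

/* Write one channel as decimal text, one sample per line.
   Returns 0 on success, -1 if the file cannot be opened or written. */
int write_channel_text(const char *path, const int16_t *samples, size_t count)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL) return -1;
    for (size_t i = 0; i < count; ++i) {
        if (fprintf(fp, "%d\n", samples[i]) < 0) {
            fclose(fp);
            return -1;
        }
    }
    return fclose(fp) == 0 ? 0 : -1;
}
```

A caller would split once and then call write_channel_text twice, for example with the hypothetical file names "left.txt" and "right.txt".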

Reading data from GIF Headers in C

The first 13 bytes of any GIF image file are as follows:
3 bytes - the ascii characters "GIF"
3 bytes - the version - either "87a" or "89a"
2 bytes - width in pixels
2 bytes - height in pixels
1 byte - packed fields
1 byte - background color index
1 byte - pixel aspect ratio
I can get the first six bytes myself by using some sort of code like:
int G = getchar();
int I = getchar();
int F = getchar();
and so on, doing the same for the 87a/89a part. That gets me the first 6 bytes, giving the ASCII characters for, say, GIF87a.
However, I can't figure out how to get the rest of the information. I tried the same getchar() approach, but the result is not what I expected. Say I have a 350x350 GIF file: since the width and height are 2 bytes each, I call getchar() twice and end up with the width being "94" and "1", two numbers, because there are two bytes. How do I turn these into the actual base-10 width and height? I tried bitwise-ANDing 94 and 1, but then realized that gives 0.
I figure that if I can find out how to get the width and height, I'll be able to get the rest of the information on my own.
Pixel width and height are stored in little-endian format.
It's just like any other number broken into parts with a limited range. For example, look at 43. Each digit has a limited range, from 0 to 9. So the next digit is the number of 10's, then hundreds (10*10) and so on. In this case, the values can range from 0 to 255, so the next number is the number of 256's.
256 * 1 + 94 = 350
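In C, the same calculation is a shift and an OR. The sketch below uses made-up function names; the second helper reads the two bytes with getc(), which behaves like the getchar() calls in the question but works on any stream.

```c
#include <assert.h>
#include <stdio.h>

/* Combine two bytes, low byte first, into one value: lo + 256 * hi. */
unsigned combine_le16(unsigned lo, unsigned hi)
{
    assert(lo <= 255 && hi <= 255);
    return lo | (hi << 8);
}

/* Read a 16-bit little-endian value from a stream.
   Returns -1 if the stream ends before two bytes were read. */
long read_le16(FILE *fp)
{
    int lo = getc(fp);
    int hi = getc(fp);
    if (lo == EOF || hi == EOF) return -1;
    return (long)combine_le16((unsigned)lo, (unsigned)hi);
}
```

With the bytes from the question, combine_le16(94, 1) gives 350.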
The standard should specify the byte order, that is, whether the most significant byte comes first (called big endian) or the least significant byte comes first (called little endian).
Byte Order: Little-endian
Typically, to read a compressed bitstream or image data, you open the file in binary read mode, read the data, and interpret it through a getBits-style helper. For example, consider the sample below:
fptr = fopen("myfile.gif", "rb");
// Read a word
fread(&cache, sizeof(unsigned char), 4, fptr);
//Read your width through getBits
width = getBits(cache, position, number_of_bits);
For more details on getBits, see the question "K & R Question: Need help understanding "getbits()" method in Chapter 2".
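A getBits-style helper in the spirit of the K&R function might look like the sketch below. The name and the argument checks are choices made for this example; the convention (p is the position of the field's leftmost bit, counting from 0 at the least significant end) follows K&R.

```c
#include <assert.h>
#include <stdint.h>

/* Return the n-bit field of x whose leftmost bit is at position p,
   where bit 0 is the least significant bit. */
uint32_t get_bits(uint32_t x, int p, int n)
{
    assert(p >= 0 && p < 32 && n > 0 && n < 32 && n <= p + 1);
    return (x >> (p + 1 - n)) & (((uint32_t)1 << n) - 1u);
}
```

For the packed-fields byte in the GIF header, the GIF89a specification puts the global color table flag in bit 7 and the color table size field in bits 0-2, so get_bits(packed, 7, 1) and get_bits(packed, 2, 3) would extract those two values.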
