The first 13 bytes of any GIF image file are as follows:
3 bytes - the ASCII characters "GIF"
3 bytes - the version - either "87a" or "89a"
2 bytes - width in pixels
2 bytes - height in pixels
1 byte - packed fields
1 byte - background color index
1 byte - pixel aspect ratio
I can get the first six bytes myself by using some sort of code like:
int G = getchar();
int I = getchar();
int F = getchar();
etc., doing the same for the 87a/89a part. All this gets the first 6 bytes, providing the ASCII characters for, say, GIF87a.
Well, I can't manage to figure out how to get the rest of the information. I try going along with the same getchar() method, but it's not what I would expect it to be. Say I have a 350x350 GIF file; since the width and height are 2 bytes each, I use getchar() twice, and I end up with the width being "94" and "1", two numbers, as there are two bytes. But how would I use this information to get the actual width and height in base 10? I tried bitwise-ANDing 94 and 1, but then realized that returns 0.
I figure maybe if I can find out how to get the width and height I'll be able to access the rest of the information on my own.
Pixel width and height are stored in little-endian format.
It's just like any other number broken into parts with a limited range. For example, look at 43. Each digit has a limited range, from 0 to 9. So the next digit is the number of 10's, then hundreds (10*10) and so on. In this case, the values can range from 0 to 255, so the next number is the number of 256's.
256 * 1 + 94 = 350
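In C, a minimal sketch of combining those two bytes (low byte first, reading with getchar() as in the question) could be:

int lo = getchar();           /* low byte,  e.g. 94 */
int hi = getchar();           /* high byte, e.g. 1  */
int width = lo | (hi << 8);   /* 94 + 256 * 1 = 350 */

The same pattern applies to the 2-byte height that follows the width.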
The standard should specify the byte order, that is, whether the most significant byte comes first (called big-endian) or the least significant byte comes first (called little-endian).
Byte Order: Little-endian
Typically for reading a compressed bitstream or image data, we could open the file in binary read mode, read the data, and interpret it through a getBits function. For example, consider the sample below:
FILE *fptr = fopen("myfile.gif", "rb");
unsigned int cache = 0;

// Read 4 bytes into the cache
fread(&cache, sizeof(unsigned char), 4, fptr);

// Read your width through getBits
width = getBits(cache, position, number_of_bits);
Please refer to the K&R question "Need help understanding "getbits()" method in Chapter 2" for more details on getBits.
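If it helps, here is a minimal sketch of such a helper. The bit-indexing convention (counting from the least significant bit) is my assumption and differs slightly from the K&R getbits(), which counts positions from the top bit of the field.

/* Extract n bits starting at bit position pos, where bit 0 is the
   least significant bit. Assumes 0 < n < 32. */
unsigned getBits(unsigned value, unsigned pos, unsigned n)
{
    return (value >> pos) & ((1u << n) - 1u);
}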
I have been attempting to retrieve ID3V2 Tag Frames by parsing through the mp3 file and retrieving each frame's size. So far I have had no luck.
I have effectively allocated memory to a buffer to aid in reading the file and have been successful in printing out the header version, but I am having difficulty retrieving both the header and frame sizes. For the header frame size I get 1347687723, although viewing the file in a hex editor I see 05 2B 19.
Two snippets of my code:
typedef struct{                 //typedef structure used to read tag information
    char tagid[3];              //0-2 "ID3"
    unsigned char tagversion;   //3 $04
    unsigned char tagsubversion;//4 00
    unsigned char flags;        //5-6 %abc0000
    uint32_t size;              //7-10 4 * %0xxxxxxx
}ID3TAG;
if(buff){
    fseek(filename,0,SEEK_SET);
    fread(&Tag, 1, sizeof(Tag),filename);

    if(memcmp(Tag.tagid,"ID3", 3) == 0)
    {
        printf("ID3V2.%02x.%02x.%02x \nHeader Size:%lu\n",Tag.tagversion,
               Tag.tagsubversion, Tag.flags ,Tag.size);
    }
}
Due to memory alignment, the compiler has inserted 2 bytes of padding between flags and size. If your struct members were laid out directly in memory with no padding, size would be at offset 6 (from the beginning of the struct). Since a 4-byte element must sit at an address that is a multiple of 4, the compiler adds 2 bytes, so that size moves to the closest multiple-of-4 offset, which here is 8. So when you read from your file, size ends up containing file bytes 8-11. If you print *(uint32_t *)((char *)&Tag.size - 2), you should get the bytes you expected.
To fix that, you can read fields one by one.
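For example, a sketch of reading the header fields one by one (keeping the asker's filename as the FILE pointer and the ID3TAG struct from the question), so that struct padding no longer matters:

ID3TAG Tag;

fseek(filename, 0, SEEK_SET);
fread(Tag.tagid,          1, 3, filename);   /* bytes 0-2: "ID3"          */
fread(&Tag.tagversion,    1, 1, filename);   /* byte  3                   */
fread(&Tag.tagsubversion, 1, 1, filename);   /* byte  4                   */
fread(&Tag.flags,         1, 1, filename);   /* byte  5                   */
fread(&Tag.size,          1, 4, filename);   /* bytes 6-9: raw big-endian */
                                             /* synchsafe value           */

The size field read this way still holds the raw big-endian synchsafe bytes, which need to be decoded as described below.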
ID3v2 header structure is consistent across all ID3v2 versions (ID3v2.0, ID3v2.3 and ID3v2.4).
Its size is stored as a big-endian synch-safe int32
Synchsafe integers are
integers that keep its highest bit (bit 7) zeroed, making seven bits
out of eight available. Thus a 32 bit synchsafe integer can store 28
bits of information.
Example:
255 (%11111111) encoded as a 16 bit synchsafe integer is 383
(%00000001 01111111).
Source : http://id3.org/id3v2.4.0-structure § 6.2
Below is a straightforward, real-life C# implementation that you can easily adapt to C
public int DecodeSynchSafeInt32(byte[] bytes)
{
    return
        bytes[0] * 0x200000 +   //2^21
        bytes[1] * 0x4000 +     //2^14
        bytes[2] * 0x80 +       //2^7
        bytes[3];
}
=> Using the values you read in your hex editor (00 05 EB 19), the actual tag size should be 112025 bytes.
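A direct C adaptation could look like the sketch below (the function name is mine):

#include <stdint.h>

/* Decode a 4-byte big-endian synchsafe integer (7 useful bits per byte). */
uint32_t decodeSynchSafeInt32(const uint8_t bytes[4])
{
    return ((uint32_t)bytes[0] << 21) |   /* 2^21 */
           ((uint32_t)bytes[1] << 14) |   /* 2^14 */
           ((uint32_t)bytes[2] << 7)  |   /* 2^7  */
            (uint32_t)bytes[3];
}

Feeding it {0x00, 0x05, 0xEB, 0x19} gives the same 112025.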
By coincidence I am also working on an ID3V2 reader. The doc says that the size is encoded in four 7-bit bytes. So you need another step to convert the byte array into an integer... I don't think just reading those bytes as an int will work because of the null bit on top.
I need to create a large binary matrix that is over the array size limit for MATLAB.
By default, MATLAB creates integer arrays as double precision arrays. But since my matrix is binary, I am hoping that there is a way to create an array of bits instead of doubles and consume far less memory.
I created a random binary matrix A and converted it to a logical array B:
A = randi([0 1], 1000, 1000);
B=logical(A);
I saved both as .mat files. They take up about the same space on my computer so I don't think MATLAB is using a more compact data type for logicals, which seems very wasteful. Any ideas?
Are you sure that the variables take the same amount of space? logical data matrices / arrays are inherently 1 byte per number, whereas randi produces double precision, which is 8 bytes per number. A simple call to whos will show you how much memory each variable takes:
>> A = randi([0 1], 1000, 1000);
>> B = logical(A);
>> whos
  Name      Size           Bytes  Class     Attributes

  A         1000x1000    8000000  double
  B         1000x1000    1000000  logical
As you can see, A takes 8 x 1000 x 1000 = 8M bytes whereas B takes up 1 x 1000 x 1000 = 1M bytes. There are most certainly memory savings between them.
The drawback with logicals is that they take 1 byte per number, and you're looking for 1 bit instead. The best thing I can think of is to use an unsigned integer type and pack chunks of N bits, where N is the bit width of the data type (uint8, uint16, uint32, etc.), into a single packed array. With uint32, for instance, 32 binary digits can be packed per number, and you can save this final matrix.
Going off on a tangent - Images
In fact, this is how Java packs colour pixels when reading images in using its BufferedImage class. Each pixel in an RGB image is 24 bits, with 8 bits per colour channel - red, green and blue. Each pixel is represented as a proportion of red, green and blue, and the trio of 8-bit values is concatenated into a single 24-bit integer. Usually, integers are represented as 32 bits, so you may think that 8 extra bits are being wasted. There is in fact an alpha channel that represents the transparency of each colour pixel, and that takes another 8 bits. If you don't use transparency, these are assumed to be all 1s, and so the collection of these 4 groups of 8 bits constitutes 32 bits per pixel. There are, however, compression algorithms that reduce the size of each pixel on average to significantly less than 32 bits per pixel, but that's outside the scope of what I'm talking about.
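As an illustration in C, packing four 8-bit channels into one 32-bit value could look like this sketch (the ARGB byte order used here is an assumption for the example):

#include <stdint.h>

/* Pack alpha, red, green, blue (8 bits each) into one 32-bit word. */
uint32_t packARGB(uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8)  |  (uint32_t)b;
}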
Going back to our discussion, one way to represent this binary matrix in bit form would be perhaps in a for loop like so:
Abin = zeros(1, ceil(numel(A)/32), 'uint32');
for ii = 1 : numel(Abin)
    val = A((ii-1)*32 + 1:ii*32);
    dec = bin2dec(sprintf('%d', val));
    Abin(ii) = dec;
end
Bear in mind that this will only work for matrices where the total number of elements is divisible by 32. I won't go into how to handle the case where it isn't because I solely want to illustrate the point that you can do what you ask, but it requires a bit of manipulation. Your case of 1000 x 1000 = 1M is certainly divisible by 32 (you get 1M / 32 = 31250), and so this will work.
This is probably not the most optimized code, but it gets the point across. Basically, we take chunks of 32 numbers (0/1), going column-wise from left to right, and determine the 32-bit unsigned integer representation of each chunk. We then store this in a single location in the matrix Abin. What you will get in the end, given your 1000 x 1000 matrix, is 31250 32-bit unsigned integers, which corresponds to 1000 x 1000 bits, or 1M bits = 125,000 bytes.
Try looking at the size of each variable now:
>> whos
  Name      Size           Bytes  Class     Attributes

  A         1000x1000    8000000  double
  Abin      1x31250       125000  uint32
  B         1000x1000    1000000  logical
To perform a reconstruction, try:
Arec = zeros(size(A));
for ii = 1 : numel(Abin)
val = dec2bin(Abin(ii), 32) - '0';
Arec((ii-1)*32 + 1:ii*32) = val(:);
end
Also not the most optimized, but it gets the point across. Given the "compressed" matrix Abin that we calculated before, for each element we reconstruct what the original 32-bit number was, then assign these numbers in 32-bit chunks to Arec.
You can verify that Arec is indeed equal to the original matrix A:
>> isequal(A, Arec)
ans =
1
Also, check out the workspace with whos:
>> whos
  Name      Size           Bytes  Class     Attributes

  A         1000x1000    8000000  double
  Abin      1x31250       125000  uint32
  Arec      1000x1000    8000000  double
  B         1000x1000    1000000  logical
You are storing your data in a compressed file format. For MAT-files in versions 7.0 and 7.3, gzip compression is used. The uncompressed data has different sizes, but after compression both are compressed down to roughly the same size. That happens because both arrays contain only 0s and 1s, which can be compressed efficiently.
I'm making a project about a digital recorder built around a microcontroller. I want to store the voice recorded from a microphone and build a .WAV file. I have the voice samples captured from the ADC, and I only know the structure of a WAV file (from this image), but I don't know anything else about it. Could you help me by giving me some information about the process of building this file type?
Thank you.
Now I can explain how I wrote the code. For some readers, parts of this explanation may be redundant, but I want to clearly spell out every step I didn't understand at first.
First, I want to explain every single field of the header of the .wav file, referring to the image I posted above.
The first segment, ChunkID, is simply the char vector "RIFF".
The second segment, ChunkSize, is the size from this point to the end of the file; because the first 2 segments total 8 bytes, the value of this segment is simply the total size of the file (in bytes) minus 8 bytes. Note that in a variable-length recording this value is not known at this point, so it is first filled with a placeholder value, and at the end of recording, when the total size of the file is known, it is filled in with the correct value.
Segment Format is a char vector "WAVE".
The segment Subchunk1ID is the char vector "fmt " (note the trailing space).
The segment Subchunk1Size is 16 (decimal value).
The AudioFormat segment is 2 byte, in my case is 1 for the PCM.
The NumChannel segment is 2 byte, and its value is 1 for mono and 2 for stereo.
The SampleRate segment is the sample frequency in Hz (e.g. 44100).
ByteRate segment is given by ByteRate = SampleRate * BlockAlign.
BlockAlign segment is given by BlockAlign = NumChannels * BitPerSample / 8.
The BitsPerSample segment is the number of bits that compose each sample. In my case of a 10-bit ADC, I truncated this value to 8 bits, dropping the 2 least significant bits.
Subchunk2ID segment is a char vector "data".
The Subchunk2Size segment contains the entire size of the acquired data (samples), and so it is the entire size of the file minus 44, because 44 is the byte count from the beginning of the file to this point. Another way to calculate this value is: Subchunk2Size = NumSamples * BlockAlign. In any case, this value is not known at this point, and it can only be calculated at the end of the recording.
The final segment, data, is the vector that contains the samples. It is the only one that doesn't have a fixed size (of course).
Each segment described follows the previous one, without any kind of delimiter, because the boundaries are implicit in the size of each segment.
Implementing this in C is very simple once each segment is well described.
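As a minimal C sketch (assuming a little-endian machine and a compiler that inserts no padding into this particular layout, which holds on common compilers since every field is naturally aligned), the 44-byte header could be declared and written like this:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct WavHeader {
    char     ChunkID[4];      /* "RIFF"                                  */
    uint32_t ChunkSize;       /* total file size - 8, patched at the end */
    char     Format[4];       /* "WAVE"                                  */
    char     Subchunk1ID[4];  /* "fmt " (note the trailing space)        */
    uint32_t Subchunk1Size;   /* 16 for PCM                              */
    uint16_t AudioFormat;     /* 1 = PCM                                 */
    uint16_t NumChannels;     /* 1 = mono, 2 = stereo                    */
    uint32_t SampleRate;      /* e.g. 44100                              */
    uint32_t ByteRate;        /* SampleRate * BlockAlign                 */
    uint16_t BlockAlign;      /* NumChannels * BitsPerSample / 8         */
    uint16_t BitsPerSample;   /* 8 in this example                       */
    char     Subchunk2ID[4];  /* "data"                                  */
    uint32_t Subchunk2Size;   /* NumSamples * BlockAlign, patched at end */
};

/* Write a provisional header; the two size fields are fixed up later. */
void writeHeader(FILE *f, uint32_t sampleRate)
{
    struct WavHeader h;
    memcpy(h.ChunkID,     "RIFF", 4);
    memcpy(h.Format,      "WAVE", 4);
    memcpy(h.Subchunk1ID, "fmt ", 4);
    memcpy(h.Subchunk2ID, "data", 4);
    h.Subchunk1Size = 16;
    h.AudioFormat   = 1;
    h.NumChannels   = 1;
    h.SampleRate    = sampleRate;
    h.BitsPerSample = 8;
    h.BlockAlign    = h.NumChannels * h.BitsPerSample / 8;
    h.ByteRate      = h.SampleRate * h.BlockAlign;
    h.ChunkSize     = 0;   /* placeholder, patched with fseek() later */
    h.Subchunk2Size = 0;   /* placeholder, patched with fseek() later */
    fwrite(&h, sizeof h, 1, f);
}

After recording, seek back and overwrite ChunkSize (file size minus 8) and Subchunk2Size (number of data bytes written), as described above.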
I am a newbie at solving these kinds of problems. I need to extract a variable number of bits from a single short value.
For example, if I have read something from an array and need to fill another array, first reading 10 bits from the value read earlier, and then another 6 bits into another short.
Like:
int pixelNo = 0;
short pixelValue_part = pixels[pixelNo];
// but here I need only 10 bits; in the second
// iteration I might need 4 bits, and so on and so forth.
After reading these shorts in parts, I will have to put these parts into the second array sequentially, in order to arrange all the pixels in sequence.
Note:
The problem is arranging the whole input pixel sequence in ascending order. I have to arrange the pixels, each of which is 10 bits in size. For this reason I need to read the first 10 bits of the short.
Edit:
Input array:  0,1 512,513 1024,1025 1536,1537 1,2,3 513,514,515 1025,1206,1027 1357,1538,1539
I have the above array as input and I want to produce output like the following array.
Output array: 0,1 2,3 4,5 6,7 ...... 513,514,515,1024,1025,1536,1537...
The values of the arrays are all pixels of some image. So in fact the pixels of the image were unarranged in the input array, and the second array is the array of arranged (sorted) pixels.
Assuming your original data is in src, and you want a span of n bits whose lowest bit is pos bits from the low end, this will extract those bits:
(src >> pos) & ((1<<n)-1)
Breaking it down:
(1<<n)-1 is a mask of n 1's (binary)
src >> pos slides the bits you want down to the "bottom" of the variable
Then we bitwise-and the two together, effectively erasing the bits you don't want, leaving behind the ones you do
You can do this for each piece you need. To put the pieces together, you'd use << to shift pieces where you need them to be and then | (bitwise-or) the pieces together.
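For instance, with the 10-bit pixels from the question (reusing the asker's pixels array and pixelNo; the other names are mine), a small sketch:

unsigned short src = pixels[pixelNo];                   /* one 16-bit value */

unsigned short low10 = src & ((1u << 10) - 1u);         /* pos = 0,  n = 10 */
unsigned short next6 = (src >> 10) & ((1u << 6) - 1u);  /* pos = 10, n = 6  */

/* To build an output word, shift each piece to where it belongs and OR.
   Here this just reassembles src, but the same pattern places pieces anywhere. */
unsigned int out = ((unsigned int)next6 << 10) | low10;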
Could somebody explain to me why, in a 24-bit RGB bitmap file, I have to add padding whose size depends on the width of the image? What for?
I mean I must add this code to my program (in C):
if( read % 4 != 0 ) {
    read = 4 - (read%4);
    printf( "Padding: %d bytes\n", read );
    fread( pixel, read, 1, inFile );
}
Because 24 bits is an odd number of bytes (3) and for a variety of reasons all the image rows are required to start at an address which is a multiple of 4 bytes.
According to Wikipedia, the bitmap file format specifies that:
The bits representing the bitmap pixels are packed in rows. The size of each row is rounded up to a multiple of 4 bytes (a 32-bit DWORD) by padding. Padding bytes (not necessarily 0) must be appended to the end of the rows in order to bring up the length of the rows to a multiple of four bytes. When the pixel array is loaded into memory, each row must begin at a memory address that is a multiple of 4. This address/offset restriction is mandatory only for Pixel Arrays loaded in memory. For file storage purposes, only the size of each row must be a multiple of 4 bytes while the file offset can be arbitrary. A 24-bit bitmap with Width=1, would have 3 bytes of data per row (blue, green, red) and 1 byte of padding, while Width=2 would have 2 bytes of padding, Width=3 would have 3 bytes of padding, and Width=4 would not have any padding at all.
The wikipedia article on Data Structure Padding is also an interesting read that explains the reasons that paddings are generally used in computer science.
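For reference, the padded row size of a 24-bit bitmap can be computed in C like this (a small sketch; the variable names are mine):

int width = 3;                               /* example width from the quote */
int bytesPerRow = width * 3;                 /* 3 bytes per 24-bit pixel     */
int padding     = (4 - bytesPerRow % 4) % 4; /* 0, 1, 2 or 3 bytes           */
int rowSize     = bytesPerRow + padding;     /* always a multiple of 4       */

With width = 3 this gives 3 bytes of padding and rowSize = 12, matching the Wikipedia example above.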
I presume this was a design decision to align rows for better memory access patterns while not wasting much space (for a 319 px wide image you would waste 3 bytes per row, roughly 0.3%).
Imagine you need to access some odd row directly. You could access the first 4 pixels of the n-th row by doing:
uint8_t *startRow = bmp + n * width * 3; //3 bytes per pixel
uint8_t r1 = startRow[0];
uint8_t g1 = startRow[1];
//... Repeat
uint8_t b4 = startRow[11];
Note that if n and width are odd (and bmp is even), startRow is going to be odd.
Now if you tried to do the following speedup:
uint32_t *startRow = (uint32_t *) (bmp + n * width * 3);
uint32_t a = startRow[0]; //Loading register at a time is MUCH faster
uint32_t b = startRow[1]; //but only if address is aligned
uint32_t c = startRow[2]; //else code can hit bus errors!
uint8_t r1 = (a & 0xFF000000) >> 24;
uint8_t g1 = (a & 0x00FF0000) >> 16;
//... Repeat
uint8_t b4 = (c & 0x000000FF) >> 0;
You'd run into lots of problems. In the best-case scenario (that is, an Intel CPU), every load of a, b and c would need to be broken into two loads, since startRow is not divisible by 4. In the worst-case scenario (e.g. Sun SPARC), your program would crash with a "bus error".
In newer designs it is common to force rows to be aligned to at least the L1 cache line size (64 bytes on Intel CPUs or 128 bytes on NVIDIA GPUs).
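The same rounding trick applies there too; for example, a sketch of rounding a 24-bit row up to a 64-byte cache line (the 64 is just illustrative, and the variable names are mine):

size_t width    = 319;                            /* example width             */
size_t rowBytes = width * 3;                      /* raw 24-bit row size       */
size_t stride   = (rowBytes + 63) & ~(size_t)63;  /* round up to 64-byte lines */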
Short version
Because the BMP file format specifies that rows must fit exactly into 32-bit "memory cells". Because pixels are 24 bits, some combinations of pixels will not sit perfectly in the 32-bit "cells". In this case, the row is "padded up" to the next full 32 bits.
8 bits per byte ∴
cell: 32 bits = 4 bytes ∴
pixel: 24 bits = 3 bytes
// If it doesn't fit perfectly in the 4-byte "cell"
if( read % 4 != 0 ) {
    // find the difference between the "cell" and "the partial fit"
    read = 4 - (read%4);
    printf( "Padding: %d bytes\n", read );
    // skip the difference
    fread( pixel, read, 1, inFile );
}
Long version
In computing, a word is the natural unit of data used by a particular processor design. A word is a fixed-sized piece of data handled as a unit by the instruction set or the hardware of the processor
-wiki: Word_(computer_architecture)
Computer systems basically have a preferred "word length" (though it is not so important these days). A standard data unit allows all sorts of optimisations in the architecture of a computer system (think what shipping containers did for the shipping industry). There is a 32-bit standard called DWORD, aka double word (I guess), and that's what typical bitmap images are optimised for.
So if you have 24 bits per pixel, there will be various row lengths (in literal pixels) that will not fit nicely into those 32 bits. So in that case, pad the row out.
Note: today, you are probably using a computer with a 64-bit word size. Check your processor.
It depends on the format whether or not there is padding at the end of each row.
There really isn't much reason for it for 3 x 8-bit channel images, since I/O is byte oriented anyway. For images with pixels packed into less than a byte (1 bit/pixel, for example), padding is useful so that each row starts at a whole-byte offset.
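For BMP in particular, though, the Wikipedia quote earlier applies to every bit depth, and the row size can be computed with one formula (a sketch; variable names are mine):

int width        = 1;                                       /* example width        */
int bitsPerPixel = 24;                                      /* could be 1, 4, 8, 24 */
int rowSize      = ((width * bitsPerPixel + 31) / 32) * 4;  /* bytes, multiple of 4 */

For width = 1 at 24 bpp this gives 4 bytes (3 data bytes + 1 padding), matching the earlier example.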