Decoding Binary via fget / buffer string (Trying to get mp3 header) - c

I'm writing some quick code to try and extract data from an mp3 file header.
The objective is to extract information from the header, such as the bitrate and other vital information, so that I can stream the file to an mp3 decoder with the necessary arguments.
Here is a wikipedia image showing the mp3header information:
http://upload.wikimedia.org/wikipedia/commons/0/01/Mp3filestructure.svg
My question is, am I attacking this correctly? Printing the data received is worthless -- I just get a bunch of random characters. I need to get to the binary so that I can decode it and determine vital information.
Here is my baseline code:
// mp3 Header File IO.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include "stdio.h"
#include "string.h"
#include "stdlib.h"
// Main function
int main (void)
{
// Declare variables
FILE *mp3file;
char *mp3syncword; // we will need to allocate memory to this!!
char requestedFile[255] = "";
unsigned long fileLength;
// Counters
int i;
// Memory allocation with malloc
mp3syncword=(char *)malloc(2000);
// Let's get the name of the requested file (hard-coded for now)
strcpy(requestedFile,"testmp3.mp3");
// Open the file with mode read, binary
mp3file = fopen(requestedFile, "rb");
if (!mp3file){
// If we can't find the file, notify the user of the problem and stop
printf("Not found!");
return 1;
}
// Let's get some header data from the file
fseek(mp3file,1,SEEK_SET);
fread(mp3syncword,32,1,mp3file);
// For debug purposes, lets print the received data
for(i = 0; i < 32; ++i)
printf("%c", ((char *)mp3syncword)[i]);
return 0;
}
Help appreciated.

You are printing the bytes out using %c as the format specifier. You need to use an unsigned numeric format specifier (e.g. %u for a decimal number or %x or %X for hexadecimal) to print the byte values.
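You should also declare your byte arrays as unsigned char, because plain char is signed by default with the Visual Studio compiler, so byte values above 0x7F would otherwise turn negative when they are promoted for printing.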
You might also want to print out a space (or other separator) after each byte value to make the output clearer.
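As a minimal sketch of the advice above, the hypothetical helper format_hex (a name that is not in the original post) writes each byte as two hexadecimal digits separated by spaces. It takes unsigned char data so that values above 0x7F are not sign-extended:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: formats n bytes as "FF FB 90 64" into out.
   out must have room for at least 3 * n characters (and at least 1). */
static void format_hex(const unsigned char *data, size_t n, char *out)
{
    size_t i;
    out[0] = '\0';
    for (i = 0; i < n; ++i) {
        /* %02X prints two upper-case hex digits; add a space between bytes */
        sprintf(out + 3 * i, (i + 1 < n) ? "%02X " : "%02X", (unsigned)data[i]);
    }
}
```

With the question's buffer this could be called as format_hex((unsigned char *)mp3syncword, 4, buf) followed by puts(buf), assuming buf is a char array of at least 12 characters.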
The standard printf does not provide a binary representation type specifier. Some implementations do have this but the version supplied with Visual Studio does not. In order to output this you will need to perform bit operations on the number to extract the individual bits and print each of them in turn for each byte. For example:
unsigned char byte = (unsigned char)mp3syncword[0]; // A byte read from the file
unsigned char mask = 1; // Bit mask
unsigned char bits[8];
// Extract the bits
for (int i = 0; i < 8; i++) {
// Mask each bit in the byte and store it
bits[i] = (byte & (mask << i)) >> i;
}
// The bits array now contains eight 1 or 0 values
// bits[0] contains the least significant bit
// bits[7] contains the most significant bit
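A small variation on the same idea, as a sketch only: the hypothetical helper byte_to_binary (not part of the original answer) produces a printable string of '0' and '1' characters, most significant bit first, which is the order in which header diagrams are usually drawn.

```c
/* Hypothetical helper: writes the 8 bits of byte into out as text,
   most significant bit first. out must hold 9 characters. */
static void byte_to_binary(unsigned char byte, char out[9])
{
    int i;
    for (i = 0; i < 8; ++i) {
        /* bit 7 goes to out[0], bit 0 goes to out[7] */
        out[i] = ((byte >> (7 - i)) & 1) ? '1' : '0';
    }
    out[8] = '\0';
}
```

The resulting string can then be printed with an ordinary %s specifier.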

C does not have a printf() specifier to print in binary. Most people print in hex instead, which will give you (typically) eight bits at a time:
printf("the first eight bits are %02x\n", (unsigned char) mp3syncword[0]);
You will need to interpret this manually to figure out the values of individual bits. The cast to unsigned char on the argument is to avoid surprises if it's negative.
To test bits, you can use the & operator together with the bitwise left shift operator, <<:
if(mp3syncword[2] & (1 << 2))
{
/* The third bit from the right of the third byte was set. */
}
If you want to be able to use "big" (larger than 7) indexes for bits, i.e. treat the data as a 32-bit word, it might be good to read it into e.g. an unsigned int, and then inspect that. Be careful with endian-ness when you do this reading, however.
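One way to sidestep the endianness issue is to assemble the 32-bit word from individual bytes with shifts, so the result does not depend on the host byte order. The sketch below assumes the field widths shown in the diagram linked in the question (a 12-bit sync word, then version, layer and so on); it has not been checked against the MPEG specification, and the names mp3_fields, load_be32 and parse_header are made up for this example.

```c
#include <stdint.h>

/* Assumed layout, taken from the diagram in the question:
   sync:12 version:1 layer:2 protection:1 bitrate:4 frequency:2
   padding:1 private:1 mode:2 ... (remaining bits not extracted here) */
struct mp3_fields {
    unsigned sync;
    unsigned version;
    unsigned layer;
    unsigned bitrate_index;
    unsigned frequency_index;
    unsigned padding;
    unsigned mode;
};

/* Builds a 32-bit word from four bytes, first byte most significant. */
static uint32_t load_be32(const unsigned char *b)
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Extracts each field by shifting it down to bit 0 and masking its width. */
static struct mp3_fields parse_header(const unsigned char *b)
{
    uint32_t w = load_be32(b);
    struct mp3_fields f;
    f.sync            = (w >> 20) & 0xFFF;
    f.version         = (w >> 19) & 0x1;
    f.layer           = (w >> 17) & 0x3;
    f.bitrate_index   = (w >> 12) & 0xF;
    f.frequency_index = (w >> 10) & 0x3;
    f.padding         = (w >> 9)  & 0x1;
    f.mode            = (w >> 6)  & 0x3;
    return f;
}
```

The bitrate and frequency values are indexes into lookup tables rather than actual rates, so a real decoder front end would still need those tables.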

Warning: there are probably errors with memory layout and/or endianness with this approach. It is not guaranteed that the struct members match the same bits from computer to computer.
In short: don't rely on this (I'll leave the answer, it might be useful for something else)
You can define a struct with bit fields:
struct MP3Header {
unsigned SyncWord : 12;
unsigned Version : 1;
unsigned Layer : 2;
unsigned ErrorProtection : 1;
unsigned BitRate : 4;
unsigned Frequency : 2;
unsigned PadBit : 1;
unsigned PrivBit : 1;
unsigned Mode : 2;
unsigned ModeExtension : 2;
unsigned Copy : 1;
unsigned Original : 1;
unsigned Emphasis : 2;
};
and then use each member as an isolated value:
struct MP3Header h;
/* ... */
fread(&h, sizeof h, 1, mp3file); /* error check!! */
printf("Frequency: %u\n", h.Frequency);

Related

C structure with bits printing in hexadecimal

I have defined a structure as below
struct {
UCHAR DSatasetMGMT : 1;
UCHAR AtriburDeallocate : 1;
UCHAR Reserved6 : 6;
UCHAR Reserved7 : 7;
UCHAR DSatasetMGMTComply : 1;
}DatasetMGMTCMDSupport;
It is a 2 byte structure represented in bits. How should I print the whole 2 bytes of structure in hexadecimal. I tried
"DatasetMGMTCMDSupport : 0x%04X\n"
And
0x%04I64X\n
But not getting expected result.
I am getting 0x3DC18003 with 0x%04X\n, while the correct data is 0x8003.
I am using 64 bit windows system.
I need to know how to print 2 byte structure in hexadecimal.
Try using 0x%04hx\n. This tells printf to print out only the two bytes. You can read more about it here: https://en.wikipedia.org/wiki/Printf_format_string#Length_field
In contrast, the I64 in 0x%04I64X\n tells printf to print out a 64 bit integer, which is 8 bytes, and 0x%04X\n tells it to print out a default-size integer, which might be 4 bytes on your system.
The width 04 specifies a minimum width. Since the value needs more digits, they are printed.
From a C Standard point of view, you cannot rely on a particular layout of bit fields. Hence, any solution will at best have implementation-defined behaviour.
That being said, your expected output can be obtained. The structure fits in 2 bytes and if you print sizeof(DatasetMGMTCMDSupport) it should give the result 2.
The byte representation of DatasetMGMTCMDSupport can be printed and that is what you were attempting, but since your system has integer size 4, two additional bytes are included. To fix this, the following can be done:
#include <stdint.h>
#include <string.h>
#include <stdio.h>
...
uint16_t a;
memcpy(&a, &DatasetMGMTCMDSupport, sizeof(a));
printf("0x%04X", (unsigned)a);
This copies the 2 bytes of DatasetMGMTCMDSupport into a 2-byte integer variable and prints the hexadecimal representation of those 2 bytes only. If you are on a little-endian system, you should see 0x8003.
A more general approach would be to directly print the bytes of DatasetMGMTCMDSupport:
for(unsigned i = 0; i < sizeof(DatasetMGMTCMDSupport); i++)
{
printf("%02X", (unsigned)((unsigned char *)&DatasetMGMTCMDSupport)[i]);
}
This will most likely print 0380 (notice the byte order: first byte printed first).
To reverse the byte order is straightforward:
for(unsigned i = 0; i < sizeof(DatasetMGMTCMDSupport); i++)
{
printf("%02X", (unsigned)((unsigned char *)&DatasetMGMTCMDSupport)[sizeof(DatasetMGMTCMDSupport)-1-i]);
}
which should give 8003.

How to print specific byte of unsigned integer?

I'm attempting to write a program in C that examines bytes in memory and prints their contents. Given a 4-byte unsigned integer, what would a function look like that prints a specific byte of the integer to stdout in hexadecimal? Does printf have some sort of capability like this built-in?
Here's the interface of what I'm looking for.
// number - the integer to be examined
// order - the byte to be examined, with 0 being the lowest-order
// (first) byte and 3 being the highest order (last) byte
void print_byte(unsigned number, unsigned order);
If it's important for the implementation, this would be a little-endian machine.
Please Try This...
#include <stdio.h>
void print_byte(unsigned number, unsigned order)
{
unsigned i = 0;
i = (number >> (8*order)) & 0x000000FF;
printf("Number:0x%08x, Byte:%02x, Order:%u\n",number,i,order);
return;
}
int main(void) {
print_byte(0x1f2e3d4c,0);
print_byte(0x1f2e3d4c,1);
print_byte(0x1f2e3d4c,2);
print_byte(0x1f2e3d4c,3);
return 0;
}
If using C99 or later, use the length modifier "hh" after shifting. This modifier will convert the integer to unsigned/signed char before printing. Use 8 or CHAR_BIT depending on meaning of "byte".
printf("%hhX", number >> (order * 8));
or
#include <limits.h>
printf("%hhX", number >> (order * CHAR_BIT));

Reading from a text file in C

I am having trouble reading a specific integer from a file and I am not sure why. First I read through the entire file to find out how big it is, and then I reset the pointer to the beginning. I then read 3 16-byte blocks of data, then 1 20-byte block, and then I would like to read 1 byte at the end as an integer. However, I had to write into the file as a character, but I do not think that should be a problem. My issue is that when I read it out of the file, instead of being the integer value of 15 it is 49. I checked in the ASCII table and it is not the hex or octal value of 1 or 5. I am thoroughly confused because my read statement is read(inF, pad, 1), which I believe is right. I do know that an integer variable is 4 bytes; however, there is only one byte of data left in the file, so I read in only the last byte.
My code, reduced to the relevant part, is below (it seems like a lot, but I don't think it is):
#include<math.h>
#include<stdio.h>
#include<string.h>
#include <fcntl.h>
int main(int argc, char** argv)
{
char x;
int y;
int bytes = 0;
int num = 0;
int count = 0;
num = open ("a_file", O_RDONLY);
bytes = read(num, y, 1);
printf("y %d\n", y);
return 0;
}
To sum up my question, how come when I read the byte that stores 15 from the text file, I can't view it as 15 from the integer representation?
Any help would be very appreciated.
Thanks!
You're reading only the first byte of an int (4 bytes) and then printing the int as a whole. If you want to read a single byte, you also need to treat it as a single byte, like this:
char temp; // one-byte signed integer
read(fd, &temp, 1); // read the integer from file
printf("%hhd\n", temp); // print one-byte signed integer
Or, you can use regular int:
int temp; // four byte signed integer
read(fd, &temp, 4); // read it from file
printf("%d\n", temp); // print four-byte signed integer
Note that this will work only on platforms with 32-bit integers, and also depends on platform's byte order.
What you're doing is:
int temp; // four byte signed integer
read(fd, &temp, 1); // read one byte from file into the integer
// now first byte of four is from the file,
// and the other three contain undefined garbage
printf("%d\n", temp); // print contents of mostly uninitialized memory
The read function system call has a declaration like:
ssize_t read(int fd, void* buf, size_t count);
So you should pass the address of the int variable that you want to read the data into,
i.e. use
bytes = read(num, &y, 1);
You can see all the details of file I/O in C from that link
Based on the read function, I believe it is reading the first byte in the first byte of the 4 bytes of the integer, and that byte is not placed in the lowest byte. This means whatever is in pad for the other 3 bytes will still be there, even if you initialized it to zero (then it will have zeros in the other bytes). I would read in one byte and then cast it to an integer (if you need a 4 byte integer for some reason), as shown below:
/* declare at the top of the program */
char temp;
/* Note line to replace read(inF,pad,1) */
read(inF,&temp,1);
/* Added to cast the value read in to an integer high order bit may be propagated to make a negative number */
pad = (int) temp;
/* Mask off the high order bits */
pad &= 0x000000FF;
Otherwise, you could change your declaration to be an unsigned char which would take care of the other 3 bytes.
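The unsigned char variant mentioned above could look like the following sketch. The helper name read_byte_as_int is invented for this example, and it assumes a POSIX system where read() is declared in <unistd.h>.

```c
#include <unistd.h>

/* Hypothetical helper: reads one byte from fd and returns it as an int
   in the range 0..255, or -1 if no byte could be read. */
static int read_byte_as_int(int fd)
{
    unsigned char b;
    if (read(fd, &b, 1) != 1)
        return -1;          /* end of file or read error */
    return (int)b;          /* unsigned char widens without sign extension */
}
```

Because the byte lands in a one-byte variable first, no uninitialized bytes of a wider int are ever involved, and no masking is needed afterwards.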

Why does C print my hex values incorrectly?

So I'm a bit of a newbie to C and I am curious to figure out why I am getting this unusual behavior.
I am reading a file 16 bits at a time and just printing them out as follows.
#include <stdio.h>
#define endian(hex) (((hex & 0x00ff) << 8) + ((hex & 0xff00) >> 8))
int main(int argc, char *argv[])
{
const int SIZE = 2;
const int NMEMB = 1;
FILE *ifp; // input file pointer
FILE *ofp; // output file pointer
int i;
short hex;
for (i = 2; i < argc; i++)
{
// Reads the header and stores the bits
ifp = fopen(argv[i], "r");
if (!ifp) return 1;
while (fread(&hex, SIZE, NMEMB, ifp))
{
printf("\n%x", hex);
printf("\n%x", endian(hex)); // this prints what I expect
printf("\n%x", hex);
hex = endian(hex);
printf("\n%x", hex);
}
}
}
The results look something like this:
ffffdeca
cade // expected
ffffdeca
ffffcade
0
0 // expected
0
0
600
6 // expected
600
6
Can anyone explain to me why the last line in each block doesn't print the same value as the second?
The placeholder %x in the format string interprets the corresponding parameter as unsigned int.
To print the parameter as short, add a length modifier h to the placeholder:
printf("%hx", hex);
http://en.wikipedia.org/wiki/Printf_format_string#Format_placeholders
This is due to integer type-promotion.
Your shorts are being implicitly promoted to int (which is 32 bits here), and because short is signed, the promotion sign-extends the value.
Therefore, your printf() is printing out the hexadecimal digits of the full 32-bit int.
When your short value is negative, the sign-extension will fill the top 16 bits with ones, thus you get ffffcade rather than cade.
The reason why this line:
printf("\n%x", endian(hex));
seems to work is because your macro is implicitly getting rid of the upper 16-bits.
You have implicitly declared hex as a signed value (to make it unsigned, write unsigned short hex), so any value over 0x7FFF is considered to be negative. When printf displays it as a 32-bit int value, it is sign-extended with ones, causing the leading Fs. When you print the return value of endian before truncating it by assigning it to hex, the full 32 bits are available and printed correctly.
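A minimal sketch of the unsigned approach, assuming a fixed-width uint16_t from <stdint.h> instead of short: with an unsigned 16-bit type there is no sign bit to extend, so the swapped value prints as four digits. The names swap16 and format_hex16 are invented for this example.

```c
#include <stdint.h>
#include <stdio.h>

/* Swaps the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    /* Work in unsigned int, then truncate back to 16 bits */
    return (uint16_t)(((unsigned)v << 8) | ((unsigned)v >> 8));
}

/* Hypothetical helper: formats v as four hex digits. out must hold 5 characters. */
static void format_hex16(uint16_t v, char out[5])
{
    sprintf(out, "%04x", (unsigned)v);
}
```

Using the values from the question's output, swapping 0xdeca gives cade, without the leading ffff.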

How to convert from integer to unsigned char in C, given integers larger than 256?

As part of my CS course I've been given some functions to use. One of these functions takes a pointer to unsigned chars to write some data to a file (I have to use this function, so I can't just make my own purpose built function that works differently BTW). I need to write an array of integers whose values can be up to 4095 using this function (that only takes unsigned chars).
However am I right in thinking that an unsigned char can only have a max value of 256 because it is 1 byte long? I therefore need to use 4 unsigned chars for every integer? But casting doesn't seem to work with larger values for the integer. Does anyone have any idea how best to convert an array of integers to unsigned chars?
Usually an unsigned char holds 8 bits, with a max value of 255. If you want to know this for your particular compiler, print out CHAR_BIT and UCHAR_MAX from <limits.h>. You could extract the individual bytes of a 32 bit int:
#include <stdint.h>
void
pack32(uint32_t val,uint8_t *dest)
{
dest[0] = (val & 0xff000000) >> 24;
dest[1] = (val & 0x00ff0000) >> 16;
dest[2] = (val & 0x0000ff00) >> 8;
dest[3] = (val & 0x000000ff) ;
}
uint32_t
unpack32(uint8_t *src)
{
uint32_t val;
val = (uint32_t)src[0] << 24; // cast first so the shift is not done in signed int
val |= (uint32_t)src[1] << 16;
val |= (uint32_t)src[2] << 8;
val |= (uint32_t)src[3];
return val;
}
An unsigned char has a size of 1 byte, therefore you can decompose any other type into an array of unsigned chars (e.g. for a 4 byte int you can use an array of 4 unsigned chars). Your exercise is probably about generics. You should write the file as a binary file using the fwrite() function, and just write byte after byte in the file.
The following example should write a number (of any data type) to the file. I am not sure if it works since you are forcing the cast to unsigned char * instead of void *.
int homework(unsigned char *foo, size_t size)
{
// open file for binary writing
FILE *f = fopen("work.txt", "wb");
if(f == NULL)
return 1;
// write all size bytes of the data to the file, starting at foo
fwrite(foo, sizeof(char), size, f);
fclose(f);
return 0;
}
I hope the given example at least gives you a starting point.
Yes, you're right; a char/byte only allows up to 8 distinct bits, so that is 2^8 distinct numbers, which is zero to 2^8 - 1, or zero to 255. Do something like this to get the bytes:
int x = 0;
char* p = (char*)&x;
for (int i = 0; i < sizeof(x); i++)
{
//Do something with p[i]
}
(This isn't officially C because of the order of declaration but whatever... it's more readable. :) )
Do note that this code may not be portable, since it depends on the processor's internal storage of an int.
If you have to write an array of integers then just convert the array into a pointer to char then run through the array.
int main()
{
int data[] = { 1, 2, 3, 4 ,5 };
size_t size = sizeof(data)/sizeof(data[0]); // Number of integers.
unsigned char* out = (unsigned char*)data;
for(size_t loop =0; loop < (size * sizeof(int)); ++loop)
{
MyProfSuperWrite(out + loop); // Write 1 unsigned char
}
}
Now people have mentioned that 4095 will fit in fewer bits than a normal integer. Probably true. Thus you can save space and not write out the top bits of each integer. Personally I think this is not worth the effort. The extra code to write the value and process the incoming data is not worth the savings you would get (maybe if the data was the size of the Library of Congress). Rule one: do as little work as possible (it's easier to maintain). Rule two: optimize if asked (but ask why first). You may save space, but it will cost in processing time and maintenance costs.
The part of the assignment that says "integers whose values can be up to 4095 using this function (that only takes unsigned chars)" should be giving you a huge hint. 4095 unsigned is 12 bits.
You can store the 12 bits in a 16 bit short, but that is somewhat wasteful of space, since you are only using 12 of the 16 bits of the short. Since you are dealing with more than 1 byte in the conversion of characters, you may need to deal with endianness of the result. This is the easiest option.
You could also do a bit field or some packed binary structure if you are concerned about space. More work.
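As a rough sketch of the two-bytes-per-value option, the hypothetical helpers pack12 and unpack12 below (names not from the assignment) split a value in the range 0..4095 into a high byte and a low byte using shifts. Because the byte order is fixed by the code rather than by the machine, the result is the same on little-endian and big-endian systems.

```c
/* Splits a value in 0..4095 into two bytes, high byte first. */
static void pack12(unsigned value, unsigned char out[2])
{
    out[0] = (unsigned char)((value >> 8) & 0x0F); /* top 4 of the 12 bits */
    out[1] = (unsigned char)(value & 0xFF);        /* low 8 bits */
}

/* Rebuilds the value from the two bytes written by pack12. */
static unsigned unpack12(const unsigned char in[2])
{
    return ((unsigned)in[0] << 8) | (unsigned)in[1];
}
```

The two bytes in out can then be handed to whatever function the assignment requires, which is assumed here to accept a pointer to unsigned char.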
It sounds like what you really want to do is call sprintf to get a string representation of your integers. This is a standard way to convert from a numeric type to its string representation. Something like the following might get you started:
char num[5]; // Room for 4095
// Array is the array of integers, and arrayLen is its length
for (i = 0; i < arrayLen; i++)
{
sprintf (num, "%d", array[i]);
// Call your function that expects a pointer to chars
printfunc (num);
}
Without information on the function you are directed to use regarding its arguments, return value and semantics (i.e. the definition of its behaviour) it is hard to answer. One possibility is:
Given:
void theFunction(unsigned char* data, int size);
then
int array[SIZE_OF_ARRAY];
theFunction((unsigned char*)array, sizeof(array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(*array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(int));
All of which will pass all of the data to theFunction(), but whether that makes any sense will depend on what theFunction() does.
