So I'm a bit of a newbie to C and I am curious to figure out why I am getting this unusual behavior.
I am reading a file 16 bits at a time and just printing them out as follows.
#include <stdio.h>
#define endian(hex) (((hex & 0x00ff) << 8) + ((hex & 0xff00) >> 8))
int main(int argc, char *argv[])
{
const int SIZE = 2;
const int NMEMB = 1;
FILE *ifp; //input file pointe
FILE *ofp; // output file pointer
int i;
short hex;
for (i = 2; i < argc; i++)
{
// Reads the header and stores the bits
ifp = fopen(argv[i], "r");
if (!ifp) return 1;
while (fread(&hex, SIZE, NMEMB, ifp))
{
printf("\n%x", hex);
printf("\n%x", endian(hex)); // this prints what I expect
printf("\n%x", hex);
hex = endian(hex);
printf("\n%x", hex);
}
}
}
The results look something like this:
ffffdeca
cade // expected
ffffdeca
ffffcade
0
0 // expected
0
0
600
6 // expected
600
6
Can anyone explain to me why the last line in each block doesn't print the same value as the second?
The placeholder %x in the format string interprets the corresponding parameter as unsigned int.
To print the parameter as short, add a length modifier h to the placeholder:
printf("%hx", hex);
http://en.wikipedia.org/wiki/Printf_format_string#Format_placeholders
This is due to integer type-promotion.
Your shorts are being implicitly promoted to int. (which is 32-bits here) So these are sign-extension promotions in this case.
Therefore, your printf() is printing out the hexadecimal digits of the full 32-bit int.
When your short value is negative, the sign-extension will fill the top 16 bits with ones, thus you get ffffcade rather than cade.
The reason why this line:
printf("\n%x", endian(hex));
seems to work is because your macro is implicitly getting rid of the upper 16-bits.
You have implicitly declared hex as a signed value (to make it unsigned write unsigned short hex) so that any value over 0x8FFF is considered to be negative. When printf displays it as a 32-bit int value it is sign-extended with ones, causing the leading Fs. When you print the return value of endian before truncating it by assigning it to hex the full 32 bits are available and printed correctly.
Related
size of short int is 2 bytes(16 bits) on my 64 bit processor and mingw compiler but when I convert short int variable to a binary string using itoa function
it returns string of 32 bits
#include<stdio.h>
int main(){
char buffer [50];
short int a=-2;
itoa(a,buffer,2); //converting a to binnary
printf("%s %d",buffer,sizeof(a));
}
Output
11111111111111111111111111111110 2
The answer is in understanding C's promotion of short datatypes (and char's, too!) to int's when those values are used as parameters passed to a function and understanding the consequences of sign extension.
This may be more understandable with a very simple example:
#include <stdio.h>
int main() {
printf( "%08X %08X\n", (unsigned)(-2), (unsigned short)(-2));
// Both are cast to 'unsigned' to avoid UB
return 0;
}
/* Prints:
FFFFFFFE 0000FFFE
*/
Both parameters to printf() were, as usual, promoted to 32 bit int's. The left hand value is -2 (decimal) in 32bit notation. By using the cast to specify the other parameter should not be subjected to sign extension, the printed value shows that it was treated as a 32 bit representation of the original 16 bit short.
itoa() is not available in my compiler for testing, but this should give the expected results
itoa( (unsigned short)a, buffer, 2 );
your problem is so simple , refer to itoa() manual , you will notice its prototype which is
char * itoa(int n, char * buffer, int radix);
so it takes an int that to be converted and you are passing a short int so it's converted from 2 byte width to 4 byte width , that's why it's printing a 32 bits.
to solve this problem :
you can simply shift left the array by 16 position by the following simple for loop :
for (int i = 0; i < 17; ++i) {
buffer[i] = buffer[i+16];
}
and it shall give the same result , here is edited version of your code:
#include<stdio.h>
#include <stdlib.h>
int main(){
char buffer [50];
short int a= -2;
itoa(a,buffer,2);
for (int i = 0; i < 17; ++i) {
buffer[i] = buffer[i+16];
}
printf("%s %d",buffer,sizeof(a));
}
and this is the output:
1111111111111110 2
I would like to know how the string is represented in integer, so I wrote the following program.
#include <stdio.h>
int main(int argc, char *argv[]){
char name[4] = {"#"};
printf("integer name %d\n", *(int*)name);
return 0;
}
The output is:
integer name 64
This is understandable because # is 64 in integer, i.e., 0x40 in hex.
Now I change the program into:
#include <stdio.h>
int main(int argc, char *argv[]){
char name[4] = {"##"};
printf("integer name %d\n", *(int*)name);
return 0;
}
The output is:
integer name 16448
I dont understand this. Since ## is 0x4040 in hex. So it should be 2^12+2^6 = 4160
If I count the '\0' at the end of the string, then it should be 2^16+2^10 = 66560
Could someone explain where 16448 comes from?
Your math is wrong: 0x4040 == 16448. The two fours are the 6th and 14th bits respectively.
Your code actually invokes undefined behavior because you must not alias a char * with an int *. This is known as the strict aliasing rule. To see just one reason why this should be disallowed, consider what would otherwise have to happen if the code is run on a little and a big endian machine.
If you want to see the hex pattern of the string, you should simply loop over its bytes and print out each byte.
void
print_string(const char * strp)
{
printf("0x");
do
printf("%02X", (unsigned char) *strp);
while (*strp++);
printf("\n");
}
Of course, instead of printing the bytes, you can shift them into an integer (that will very soon overflow) and only finally output that integer. While doing this, you'll be forced to take a stand on “your” endianness.
/* Interpreting as big endian. */
unsigned long
int_string(const char * strp)
{
unsigned long value = 0UL;
do
value = (value << 8) | (unsigned char) *strp;
while (*strp++);
return value;
}
This is how 16448 comes :
0x4040 can be written like this in binary :
4 0 4 0 -> Hex
0100 0000 0100 0000 -> Binary
2^14 2^6 = 16448
Because here 6th and 14th bit are set.
Hope you got it :)
I'm trying to extract 2nd and 3rd byte from a char array and interpret it's value as an integer. Here in this case, want to extract 0x01 and 0x18 and interpret its value as 0x118 or 280 (decimal) using strtol. But the output act len returns 0.
int main() {
char str[]={0x82,0x01,0x18,0x7d};
char *len_f_str = malloc(10);
int i;
memset(len_f_str,'\0',sizeof(len_f_str));
strncpy(len_f_str,str+1,2);
printf("%x\n",str[1] & 0xff);
printf("%x\n",len_f_str[1] & 0xff);
printf("act len:%ld\n",strtol(len_f_str,NULL,16));
return 0;
}
Output:
bash-3.2$ ./a.out
1
18
act len:0
What am I missing here? Help appreciated.
strtol converts an ASCII representation of a string to a value, not the actual bits.
Try this:
short* myShort;
myShort = (short*) str[1];
long myLong = (long) myShort;
strtol expects the input to be a sequence of characters representing the number in a printable form. To represent 0x118 you would use
char *num = "118";
If you then pass num to strtol, and give it a radix of 16, you would get your 280 back.
If your number is represented as a sequence of bytes, you could use simple math to compute the result:
unsigned char str[]={0x82,0x01,0x18,0x7d};
unsigned int res = str[1] << 8 | str[2];
As part of my CS course I've been given some functions to use. One of these functions takes a pointer to unsigned chars to write some data to a file (I have to use this function, so I can't just make my own purpose built function that works differently BTW). I need to write an array of integers whose values can be up to 4095 using this function (that only takes unsigned chars).
However am I right in thinking that an unsigned char can only have a max value of 256 because it is 1 byte long? I therefore need to use 4 unsigned chars for every integer? But casting doesn't seem to work with larger values for the integer. Does anyone have any idea how best to convert an array of integers to unsigned chars?
Usually an unsigned char holds 8 bits, with a max value of 255. If you want to know this for your particular compiler, print out CHAR_BIT and UCHAR_MAX from <limits.h> You could extract the individual bytes of a 32 bit int,
#include <stdint.h>
void
pack32(uint32_t val,uint8_t *dest)
{
dest[0] = (val & 0xff000000) >> 24;
dest[1] = (val & 0x00ff0000) >> 16;
dest[2] = (val & 0x0000ff00) >> 8;
dest[3] = (val & 0x000000ff) ;
}
uint32_t
unpack32(uint8_t *src)
{
uint32_t val;
val = src[0] << 24;
val |= src[1] << 16;
val |= src[2] << 8;
val |= src[3] ;
return val;
}
Unsigned char generally has a value of 1 byte, therefore you can decompose any other type to an array of unsigned chars (eg. for a 4 byte int you can use an array of 4 unsigned chars). Your exercise is probably about generics. You should write the file as a binary file using the fwrite() function, and just write byte after byte in the file.
The following example should write a number (of any data type) to the file. I am not sure if it works since you are forcing the cast to unsigned char * instead of void *.
int homework(unsigned char *foo, size_t size)
{
int i;
// open file for binary writing
FILE *f = fopen("work.txt", "wb");
if(f == NULL)
return 1;
// should write byte by byte the data to the file
fwrite(foo+i, sizeof(char), size, f);
fclose(f);
return 0;
}
I hope the given example at least gives you a starting point.
Yes, you're right; a char/byte only allows up to 8 distinct bits, so that is 2^8 distinct numbers, which is zero to 2^8 - 1, or zero to 255. Do something like this to get the bytes:
int x = 0;
char* p = (char*)&x;
for (int i = 0; i < sizeof(x); i++)
{
//Do something with p[i]
}
(This isn't officially C because of the order of declaration but whatever... it's more readable. :) )
Do note that this code may not be portable, since it depends on the processor's internal storage of an int.
If you have to write an array of integers then just convert the array into a pointer to char then run through the array.
int main()
{
int data[] = { 1, 2, 3, 4 ,5 };
size_t size = sizeof(data)/sizeof(data[0]); // Number of integers.
unsigned char* out = (unsigned char*)data;
for(size_t loop =0; loop < (size * sizeof(int)); ++loop)
{
MyProfSuperWrite(out + loop); // Write 1 unsigned char
}
}
Now people have mentioned that 4096 will fit in less bits than a normal integer. Probably true. Thus you can save space and not write out the top bits of each integer. Personally I think this is not worth the effort. The extra code to write the value and processes the incoming data is not worth the savings you would get (Maybe if the data was the size of the library of congress). Rule one do as little work as possible (its easier to maintain). Rule two optimize if asked (but ask why first). You may save space but it will cost in processing time and maintenance costs.
The part of the assignment of: integers whose values can be up to 4095 using this function (that only takes unsigned chars should be giving you a huge hint. 4095 unsigned is 12 bits.
You can store the 12 bits in a 16 bit short, but that is somewhat wasteful of space -- you are only using 12 of 16 bits of the short. Since you are dealing with more than 1 byte in the conversion of characters, you may need to deal with endianess of the result. Easiest.
You could also do a bit field or some packed binary structure if you are concerned about space. More work.
It sounds like what you really want to do is call sprintf to get a string representation of your integers. This is a standard way to convert from a numeric type to its string representation. Something like the following might get you started:
char num[5]; // Room for 4095
// Array is the array of integers, and arrayLen is its length
for (i = 0; i < arrayLen; i++)
{
sprintf (num, "%d", array[i]);
// Call your function that expects a pointer to chars
printfunc (num);
}
Without information on the function you are directed to use regarding its arguments, return value and semantics (i.e. the definition of its behaviour) it is hard to answer. One possibility is:
Given:
void theFunction(unsigned char* data, int size);
then
int array[SIZE_OF_ARRAY];
theFunction((insigned char*)array, sizeof(array));
or
theFunction((insigned char*)array, SIZE_OF_ARRAY * sizeof(*array));
or
theFunction((insigned char*)array, SIZE_OF_ARRAY * sizeof(int));
All of which will pass all of the data to theFunction(), but whether than makes any sense will depend on what theFunction() does.
I'm writing some quick code to try and extract data from an mp3 file header.
The objective is to extract information from the header such as the bitrate and other vital information so that I can appropriately stream the file to a mp3decoder with the necessary arguments.
Here is a wikipedia image showing the mp3header information:
http://upload.wikimedia.org/wikipedia/commons/0/01/Mp3filestructure.svg
My question is, am I attacking this correctly? Printing the data received is worthless -- I just get a bunch of random characters. I need to get to the binary so that I can decode it and determine vital information.
Here is my baseline code:
// mp3 Header File IO.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include "stdio.h"
#include "string.h"
#include "stdlib.h"
// Main function
int main (void)
{
// Declare variables
FILE *mp3file;
char *mp3syncword; // we will need to allocate memory to this!!
char requestedFile[255] = "";
unsigned long fileLength;
// Counters
int i;
// Memory allocation with malloc
mp3syncword=(char *)malloc(2000);
// Let's get the name of the requested file (hard-coded for now)
strcpy(requestedFile,"testmp3.mp3");
// Open the file with mode read, binary
mp3file = fopen(requestedFile, "rb");
if (!mp3file){
// If we can't find the file, notify the user of the problem
printf("Not found!");
}
// Let's get some header data from the file
fseek(mp3file,1,SEEK_SET);
fread(mp3syncword,32,1,mp3file);
// For debug purposes, lets print the received data
for(i = 0; i < 32; ++i)
printf("%c", ((char *)mp3syncword)[i]);
enter code here
return 0;
}
Help appreciated.
You are printing the bytes out using %c as the format specifier. You need to use an unsigned numeric format specifier (e.g. %u for a decimal number or %x or %X for hexadecimal) to print the byte values.
You should also declare your byte arrays as unsigned char as they are signed by default on Windows.
You might also want to print out a space (or other separator) after each byte value to make the output clearer.
The standard printf does not provide a binary representation type specifier. Some implementations do have this but the version supplied with Visual Studio does not. In order to output this you will need to perform bit operations on the number to extract the individual bits and print each of them in turn for each byte. For example:
unsigned char byte = // Read from file
unsigned char mask = 1; // Bit mask
unsigned char bits[8];
// Extract the bits
for (int i = 0; i < 8; i++) {
// Mask each bit in the byte and store it
bits[i] = (byte & (mask << i)) >> i;
}
// The bits array now contains eight 1 or 0 values
// bits[0] contains the least significant bit
// bits[7] contains the most significant bit
C does not have a printf() specifier to print in binary. Most people print in hex instead, which will give you (typically) eight bits at a time:
printf("the first eight bits are %02x\n", (unsigned char) mp3syncword[0]);
You will need to interpret this manually to figure out the values of individual bits. The cast to unsigned char on the argument is to avoid surprises if it's negative.
To test bits, you can use use the & operator together with the bitwise left shift operator, <<:
if(mp3syncword[2] & (1 << 2))
{
/* The third bit from the right of the third byte was set. */
}
If you want to be able to use "big" (larger than 7) indexes for bits, i.e. treat the data as a 32-bit word, it might be good to read it into e.g. an unsigned int, and then inspect that. Be careful with endian-ness when you do this reading, however.
Warning: there are probably errors with memory layout and/or endianess with this approach. It is not guaranteed that the struct members match the same bits from computer to computer.
In short: don't rely on this (I'll leave the answer, it might be useful for something else)
You can define a struct with bit fields:
struct MP3Header {
unsigned SyncWord : 12;
unsigned Version : 1;
unsigned Layer : 2;
unsigned ErrorProtection : 1;
unsigned BitRate : 4;
unsigned Frequency : 2;
unsigned PadBit : 1;
unsigned PrivBit : 1;
unsigned Mode : 2;
unsigned ModeExtension : 2;
unsigned Copy : 1;
unsigned Original : 1;
unsigned Emphasis : 2;
};
and then use each member as an isolated value:
struct MP3Header h;
/* ... */
fread(&h, sizeof h, 1, mp3file); /* error check!! */
printf("Frequency: %u\n", h.Frequency);