I found this program during an online test on C programming. I tried it at my level, but I cannot figure out why the output of this program comes out to be 64.
Can anyone explain the concept behind this?
#include <iostream>
#include <stdio.h>
using namespace std;
int main()
{
    int a = 320;
    char *ptr;
    ptr = (char *)&a;
    printf("%d", *ptr);
    return 0;
}
output:
64
Thank you.
A char * points to one byte only. Assuming a byte on your system is 8 bits, the number 320 occupies 2 bytes. The lower byte of those is 64, the upper byte is 1, because 320 = 256 * 1 + 64. That is why you get 64 on your computer (a little-endian computer).
But note that on other platforms, so-called big-endian platforms, the result could just as well be 1 (the most significant byte of a 16-bit/2-byte value) or 0 (the most significant byte of a value larger than 16 bits/2 bytes).
Note that all this assumes the platform has 8-bit bytes. If it had, say, 10-bit bytes, you would get a different result again. Fortunately, most computers have 8-bit bytes nowadays.
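If it helps to make this concrete, here is a minimal sketch of my own (not part of the original program) that prints every byte of the int in address order, assuming 8-bit bytes and a 4-byte int:

#include <stdio.h>

int main(void)
{
    int a = 320;                            /* 0x00000140 for a 4-byte int */
    unsigned char *p = (unsigned char *)&a; /* unsigned char avoids sign surprises */

    for (size_t i = 0; i < sizeof a; i++)
        printf("byte %zu: %u\n", i, (unsigned)p[i]);
    /* Typical little-endian output: 64 1 0 0
       Typical big-endian output:     0 0 1 64 */
    return 0;
}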
You won't be able to understand this unless you know about:
hex/binary representation, and
CPU endianness.
Type out the decimal number 320 in hex. Split it up into bytes. Assuming int is 4 bytes, you should be able to tell which parts of the number go in which bytes.
After that, consider the endianness of the given CPU and sort the bytes in that order (MS byte first or LS byte first).
The code accesses the byte allocated at the lowest address of the integer. What it contains depends on the CPU endianness. You'll get either hex 0x40 or hex 0x00.
Note: You shouldn't use char for this kind of thing, because it has implementation-defined signedness. In case the data bytes contain values larger than 0x7F, you might get some very weird bugs that appear and disappear inconsistently across compilers. Always use uint8_t* when doing any form of bit/byte manipulation.
You can expose this bug by replacing 320 with 384. Your little-endian system may then print either -128 or 128; you'll get different results on different compilers.
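To illustrate that pitfall, here is a rough sketch (mine, not from the answer), assuming a little-endian machine with 8-bit bytes and a 4-byte int:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int a = 384;                   /* 0x0180: the lowest-addressed byte is 0x80 */

    char    *cp = (char *)&a;      /* implementation-defined signedness */
    uint8_t *up = (uint8_t *)&a;   /* always unsigned, always 8 bits */

    printf("%d\n", *cp);           /* -128 where char is signed, 128 otherwise */
    printf("%d\n", *up);           /* always 128 */
    return 0;
}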
What @Lundin said is enough.
BTW, some basic background may help: 320 = 0x0140, and an int here is 4 chars wide. So when the first byte is printed, the output is 0x40 = 64 because of CPU endianness.
ptr is a char pointer to a, so *ptr gives a char-sized view of a. A char occupies only 1 byte, so values wrap around after 255: 256 becomes 0, 257 becomes 1, and so on. Thus 320 becomes 64. (Note that this view only matches the output on a little-endian machine, where the lowest-addressed byte is the least significant one.)
An int is a four-byte type while a char is a one-byte type, and a char pointer can address one byte at a time. The binary value of 320 is 00000000 00000000 00000001 01000000, and the char pointer ptr points to only the first (lowest-addressed) byte, which on a little-endian machine is the least significant one.
So *ptr, i.e. the content of that first byte, is 01000000, and its decimal value is 64.
Related
While playing around with pointers I came across something interesting.
I initialized a 16 bit unsigned int variable to the number 32771 and then assigned the address of that variable to an 8 bit unsigned int pointer.
Now 32771, in unsigned 16-bit form, has the binary representation 110000000000001. So when dereferencing the 8-bit pointer the first time, I expected it to print the value 11000000, which is 192, and after incrementing the pointer and then dereferencing it again, I expected it to print the value 00000001, which is 1.
In actuality, the first dereference printed 3, which is what I would get if I read 11000000 from left to right, and the second dereference printed 128.
#include <stdio.h>

int main(){
    __uint16_t a = 32771;
    __uint8_t *p = (__uint8_t *)&a;
    printf("%d", *p);  // Expected 192 but actual output 3.
    ++p;
    printf("%d", *p);  // Expected 1, but actual output 128
}
I know that bits are read from right to left; however, in this case the bits seem to be read from left to right. Why?
32771 is 32768 plus 3. In binary, it's 1000000000000011.
If you split it into the most significant byte and the least significant byte, you get 128 and 3, because 128 * 256 + 3 = 32771.
So the bytes will be 3 and 128. There's no particular reason one should occur before the other. Your CPU can store the bytes that compose a multi-byte number in whatever order it wants to. Apparently, yours stores the least significant byte at a lower address than the most significant byte.
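A small sketch of my own that makes this visible (using the <stdint.h> fixed-width types): it prints the two bytes of 32771 in increasing address order.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint16_t a = 32771;           /* 0x8003: the bytes are 0x03 (3) and 0x80 (128) */
    uint8_t *p = (uint8_t *)&a;

    printf("byte at lower address:  %u\n", (unsigned)p[0]); /* 3 on little endian, 128 on big endian */
    printf("byte at higher address: %u\n", (unsigned)p[1]); /* 128 on little endian, 3 on big endian */
    return 0;
}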
This is my program
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main() {
    struct bitfield {
        unsigned a:3;
        char b;
        unsigned c:5;
        int d;
    } bit;

    printf("%zu \n", sizeof(bit));  /* %zu is the correct conversion for size_t */
    return 0;
}
I was expecting the size of this structure to be quite a lot, but it comes out to be 8 on my machine, because unsigned is 4 bytes. The reason I was expecting more is that I would expect char b to be on a byte boundary, so that it is aligned correctly in memory. My guess now is that the compiler is putting a, b and c all in those first 4 bytes. I am new to C, so please bear with me. Is my assumption incorrect that all data types other than bit fields necessarily have to start on a byte boundary? If it is correct, I would expect a to take a whole unsigned int, b to take a byte followed by 3 bytes of padding, and so on. What am I missing here?
I don't see the conflict.
You basically have:
struct bitfield {
    unsigned a:3;
    unsigned padding1:5;
    char b;
    unsigned c:5;
    unsigned padding2:3;
    unsigned padding3:8;
    int d;
} bit;
a is on the byte 1 boundary; it uses three bits + 5 bits of padding (because there aren't any more bit fields to use up the leftover bits).
b is on the byte 2 boundary; it uses a whole byte.
c is on the byte 3 boundary; it uses five bits + 3 bits of padding (because there aren't any more bit fields to use up the leftover bits).
- the byte of int padding (padding3 above) comes here -
d is on an int boundary (4 bytes on your machine). It uses 1 byte for padding + 4 bytes for data.
All together, 8 bytes...
... although, as pointed out by @JonathanLeffler in the comments, this is implementation-specific and doesn't mean every compiler will behave the same.
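One way to check what your particular compiler did (a sketch of mine, not from the answer): offsetof() works on the non-bit-field members b and d, so you can at least see where they landed.

#include <stdio.h>
#include <stddef.h>

struct bitfield {
    unsigned a:3;
    char b;
    unsigned c:5;
    int d;
};

int main(void)
{
    /* On the layout described above you would expect 8, 1 and 4,
       but another compiler is free to choose differently. */
    printf("sizeof      = %zu\n", sizeof(struct bitfield));
    printf("offset of b = %zu\n", offsetof(struct bitfield, b));
    printf("offset of d = %zu\n", offsetof(struct bitfield, d));
    return 0;
}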
You usually align bit fields so that they share a single byte. When you place another variable in between bit fields, the unused bits end up occupying memory space that you don't need but still use.
I tested your code with some values and ran the program in gdb.
bit.a = 1;
bit.b = 'a';
bit.c = 4;
bit.d = 1337;
When printing the memory in gdb, the output looks like this
(gdb) x/8tb &bit
0x7fffffffe118: 00000001 01100001 00000100 00000000
00111001 00000101 00000000 00000000
So we see that the first byte is completely used by field a, although it only uses 3 bits. The character 'a' in field b also takes up a byte. The third byte (00000100) matches the value 4 of field c, and here's what's interesting:
You then have an integer which uses 4 bytes (the last 4 bytes in the gdb output above, on the bottom line), but there's an extra zero byte in between. This is common practice for compilers, as they try to align the memory in an optimized way.
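If you don't want to reach for gdb, a quick sketch like this (my addition) dumps the same bytes from inside the program; the hex values should correspond to the binary dump above, padding byte included.

#include <stdio.h>

struct bitfield {
    unsigned a:3;
    char b;
    unsigned c:5;
    int d;
} bit;

int main(void)
{
    bit.a = 1;
    bit.b = 'a';
    bit.c = 4;
    bit.d = 1337;

    const unsigned char *p = (const unsigned char *)&bit;
    for (size_t i = 0; i < sizeof bit; i++)
        printf("%02x ", (unsigned)p[i]);  /* e.g. 01 61 04 00 39 05 00 00 */
    printf("\n");
    return 0;
}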
I read that memory is arranged as groups of 4 bytes on a 32-bit processor and 8 bytes on a 64-bit processor at http://fresh2refresh.com/c-programming/c-structure-padding/, but it didn't clarify the difference between the two.
struct structure2
{
    int id1;
    char name;
    int id2;
    char c;
    float percentage;
};
By "32-bit processor" (more specifically, this refers to the width of the data bus rather than the size of the registers), it is meant that 32 bits (4 bytes) of data will be read and processed at a time.
Now, consider an int:
int a = 10; // assuming 4 bytes
00000000 00000000 00000000 00001010
Assuming little endian architecture, it would be stored as:
------------------------------------------------------------------------
| 00001010 | 00000000 | 00000000 | 00000000 | <something_else>
-------------------------------------------------------------------------
1st byte 2nd byte 3rd byte 4th byte
\--------------------------------------------------/
|
4 bytes processed together
In this case, when the processor reads the data to be processed, it can process the entire integer in one go (all 4 bytes together, i.e. in 1 machine cycle, strictly speaking).
However, consider a case where the same integer was stored as:
------------------------------------------------------------------------
|<something_else>| 00001010 | 00000000 | 00000000 | 00000000 |
-------------------------------------------------------------------------
1st byte 2nd byte 3rd byte 4th byte
\------------------------------------------------------/
|
4 bytes processed together
In this case, the processor would need 2 machine cycles to read the integer.
Most architectures try to minimize CPU cycles.
Hence the first arrangement in memory is preferred, and many compilers therefore enforce alignment requirements (padding).
So 4-byte ints are stored at addresses that are multiples of 4, chars at multiples of 1, 8-byte doubles at multiples of 8, 8-byte long long ints at multiples of 8, and so on...
Now consider your structure
struct structure2
{
    int id1;          // assuming 4-byte int
    char name;        // 1 byte
    int id2;          // 4 bytes
    char c;           // 1 byte
    float percentage; // assuming 4-byte float
};
id1 will get stored at some address in memory (a multiple of 4) and take 4 bytes.
name will take the next byte.
Now if id2 were stored in the next byte, it would break the alignment rule above. So 3 bytes of padding are left, and id2 is stored starting at the next address that is a multiple of 4, taking 4 bytes.
For c the same thing happens as for name: it takes the next byte and is followed by 3 bytes of padding.
Finally, percentage gets stored in the next 4 bytes.
So the total size of the structure becomes 20 bytes.
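You can verify those offsets and the 20-byte total on your own compiler with a short sketch like this (mine, assuming 4-byte int and float as in the walk-through):

#include <stdio.h>
#include <stddef.h>

struct structure2 {
    int id1;
    char name;
    int id2;
    char c;
    float percentage;
};

int main(void)
{
    printf("id1        at %zu\n", offsetof(struct structure2, id1));        /* 0  */
    printf("name       at %zu\n", offsetof(struct structure2, name));       /* 4  */
    printf("id2        at %zu\n", offsetof(struct structure2, id2));        /* 8  */
    printf("c          at %zu\n", offsetof(struct structure2, c));          /* 12 */
    printf("percentage at %zu\n", offsetof(struct structure2, percentage)); /* 16 */
    printf("sizeof     =  %zu\n", sizeof(struct structure2));               /* 20 */
    return 0;
}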
A more complicated case would be, say:
struct mystructure
{
    char a;   // 1-byte char
    double b; // 8-byte double
    int c;    // 4-byte int
};
Here one might at first glance say the size would be 20 bytes (1 byte for the char + 7 bytes of padding + 8 bytes for the double + 4 bytes for the int).
However, the actual size would be 24 bytes.
Say somebody declared an array of this structure:
struct mystructure arr[4];
Here (assuming a 20-byte structure), arr[0] would be properly aligned, but if you check carefully you'll find that arr[1].b would be misaligned. So 4 bytes of extra padding are added at the end of the structure to make its size a multiple of its alignment. (Every structure also has its own alignment requirement.)
Hence the total size would be 24 bytes.
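Again, a sketch of my own to check it: the trailing padding shows up as the gap between the end of the last member and sizeof, and it is exactly what keeps arr[1].b aligned.

#include <stdio.h>
#include <stddef.h>

struct mystructure {
    char a;   /* offset 0, then 7 bytes of padding           */
    double b; /* offset 8                                    */
    int c;    /* offset 16, then 4 bytes of trailing padding */
};

int main(void)
{
    struct mystructure arr[4];

    printf("end of last member = %zu\n",
           offsetof(struct mystructure, c) + sizeof(int));            /* 20 */
    printf("sizeof(struct)     = %zu\n", sizeof(struct mystructure)); /* 24 */
    printf("sizeof(arr)        = %zu\n", sizeof arr);                 /* 4 * 24 = 96 */
    return 0;
}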
The sizes of int, long, etc. are decided by the compiler. The compiler generally takes the processor architecture into account, but it may choose not to.
Similarly, whether to use padding or not is decided by the compiler. Not padding is known as packing. Some compilers have explicit options for allowing packing.
In GCC (the GNU C compiler) you can do it with __attribute__((__packed__)), so in the following code:
struct __attribute__((__packed__)) mystructure2
{
    char a;
    int b;
    char c;
};
mystructure2 has size 6 bytes because of the explicit request to pack the structure. This structure will be slower to process.
You can probably figure out by now what would happen on a 64-bit processor, or if the size of int were different.
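For comparison, here is a small GCC/Clang-only sketch (the struct names are mine) that puts the padded and packed layouts side by side:

#include <stdio.h>

struct padded {            /* same members, default alignment */
    char a;
    int b;
    char c;
};

struct __attribute__((__packed__)) packed {  /* no padding at all */
    char a;
    int b;
    char c;
};

int main(void)
{
    printf("padded: %zu\n", sizeof(struct padded)); /* typically 12 */
    printf("packed: %zu\n", sizeof(struct packed)); /* 6 */
    return 0;
}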
That website does not specify exactly which kind of 64-bit platform is used, but it seems to assume an ILP64 platform (int, long and pointers are all 64-bit) with length-aligned integers. That means an int is four bytes on a 32-bit processor and eight bytes on a 64-bit processor, and each must be aligned on a multiple of its own length.
The result is a change in the length of the padding between name and id2 (padding necessary to preserve id2's alignment).
On a 32-bit platform, there would be three bytes of padding; on a 64-bit platform, there would be seven.
The padding between c and percentage will likely not change, because the size of floating-point variables is not affected by the processor's bitness.
#include <stdio.h>
union bits_32 {
    unsigned int x;
    struct { char b4, b3, b2, b1; } byte;
};

int main(int argc, char **argv) {
    union bits_32 foo;
    foo.x = 0x100000FA;
    printf("%x", foo.byte.b4 & 0xFF);
}
This will output FA. Why doesn't it output 10 since b4 occupies the first space?
It depends on the endianness of your machine. If your machine is little endian it prints FA (yours is little endian, right?). If your machine is big endian it prints 10.
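A common way to check this at run time is a union-based sketch like the one below (my example, assuming a 4-byte unsigned int): look at which byte of a known constant sits at the lowest address.

#include <stdio.h>

int main(void)
{
    union {
        unsigned int x;
        unsigned char bytes[4];
    } u = { 0x100000FA };

    if (u.bytes[0] == 0xFA)
        printf("little endian: lowest-addressed byte is FA\n");
    else
        printf("big endian: lowest-addressed byte is %02X\n", (unsigned)u.bytes[0]); /* 10 */
    return 0;
}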
Storing Words in Memory
We've defined a word to mean 32 bits. This is the same as 4 bytes. Integers, single-precision floating point numbers, and MIPS instructions are all 32 bits long. How can we store these values into memory? After all, each memory address can store a single byte, not 4 bytes.
The answer is simple. We split the 32-bit quantity into 4 bytes. For example, suppose we have a 32-bit quantity written as 90AB12CD, which is hexadecimal. Since each hex digit is 4 bits, we need 8 hex digits to represent the 32-bit value.
So, the 4 bytes are: 90, AB, 12, CD where each byte requires 2 hex digits.
It turns out there are two ways to store this in memory.
Big Endian
In big endian, you store the most significant byte in the smallest address. Here's how it would look:
Address Value
1000 90
1001 AB
1002 12
1003 CD
Little Endian
In little endian, you store the least significant byte in the smallest address. Here's how it would look:
Address Value
1000 CD
1001 12
1002 AB
1003 90
Notice that this is in the reverse order compared to big endian. To remember which is which, recall whether the least significant byte is stored first (thus, little endian) or the most significant byte is stored first (thus, big endian).
Notice I said "byte" instead of "bit": least significant byte. I sometimes abbreviate these as LSB and MSB, with the 'B' capitalized to refer to a byte, and use a lowercase 'b' to represent a bit. I only refer to the most and least significant byte when it comes to endianness.
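To see which of the two layouts your own machine uses, here is a short sketch (mine, assuming a 4-byte unsigned int) that mirrors the tables above:

#include <stdio.h>

int main(void)
{
    unsigned int word = 0x90AB12CD;
    const unsigned char *p = (const unsigned char *)&word;

    for (size_t i = 0; i < sizeof word; i++)
        printf("address offset %zu: %02X\n", i, (unsigned)p[i]);
    /* Big endian prints    90 AB 12 CD in address order,
       little endian prints CD 12 AB 90. */
    return 0;
}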
I'm trying to read a binary file into a C# struct. The file was created from C and the following code creates 2 bytes out of the 50+ byte rows.
unsigned short nDayTimeBitStuffed = atoi( LPCTSTR( strInput) );
unsigned short nDayOfYear = (0x01FF & nDayTimeBitStuffed);
unsigned short nTimeOfDay = (0x01F & (nDayTimeBitStuffed >> 9) );
Binary values on the file are 00000001 and 00000100.
The expected values are 1 and 2, so I think some bit ordering/swapping is going on, but I'm not sure.
Any help would be greatly appreciated.
Thanks!
The answer is 'it depends' - most notably on the machine, and also on how the data is written to the file. Consider:
unsigned short x = 0x0102;
write(fd, &x, sizeof(x));
On some machines (Intel), the low-order byte (0x02) will be written before the high-order byte (0x01); on others (PPC, SPARC), the high-order byte will be written before the low-order one.
So, from a little-endian (Intel) machine, you'd see the bytes:
0x02 0x01
But from a big-endian (PPC) machine, you'd see the bytes:
0x01 0x02
Your bytes appear to be 0x01 and 0x04. Your calculation for 0x02 appears flawed.
The C code you show doesn't write anything. The value in nDayOfYear is the bottom 9 bits of the input value; the nTimeOfDay appears to be the next 5 bits (so 14 of the 16 bits are used).
For example, if the value in strInput is 12141 decimal, 0x2F6D, then the value in nDayOfYear would be 365 (0x16D) and the value in nTimeOfDay would be 23 (0x17).
It is a funny storage order: you can't simply compare two packed values, whereas if you packed the day of year into the more significant portion of the value and the time into the less significant portion, you could compare values as plain integers and get the correct ordering.
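A sketch of that unpacking (my own, reusing the 12141 example from above): day of year in the low 9 bits, time of day in the next 5 bits, with the reverse packing shown for completeness.

#include <stdio.h>

int main(void)
{
    unsigned short packed = 12141;                     /* 0x2F6D */

    unsigned short nDayOfYear = packed & 0x01FF;       /* 365 (0x16D) */
    unsigned short nTimeOfDay = 0x01F & (packed >> 9); /* 23  (0x17)  */

    /* Re-packing goes the other way round: */
    unsigned short repacked = (unsigned short)((nTimeOfDay << 9) | nDayOfYear);

    printf("day=%u time=%u repacked=0x%X\n",
           (unsigned)nDayOfYear, (unsigned)nTimeOfDay, (unsigned)repacked);
    return 0;
}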
The expected file contents are very much related to the processor and compiler used to create the file, if it's binary.
I'm assuming a Windows machine here, which uses 2 bytes for a short and puts them in little endian order.
Your comments don't make much sense either. If it's two bytes then it should be using two chars, not shorts. The range of the first is going to be 1-365, so it definitely needs more than a single byte to represent. I'm going to assume you want the first 4 bytes, not the first 2.
This means that the first byte will be bits 0-7 of the DayOfYear, the second byte will be bits 8-15 of the DayOfYear, the third byte will be bits 0-7 of the TimeOfDay, and the fourth byte will be bits 8-15 of the TimeOfDay.