Why does a reference to an int return only one memory address? - c

Example Program:
#include <stdio.h>

int main() {
    int x = 0;
    printf("%p", &x);
    return 0;
}
I have read that most machines are byte-addressable, meaning that only one
byte can be stored at a single memory address (e.g. 0xf4829cba stores the value 01101011). Assuming that x is a 32-bit integer, shouldn't the reference to the variable return four memory addresses, instead of one?
Please ELI5, as I am very confused right now.
Thank you so much for your time.
-Matt

The address (it's not a "reference") you're given is to the beginning of the memory where the variable is stored. The variable will then take as many bytes as needed according to its type. So if int is 32 bits in your target architecture, the address you get is of the first of four bytes used to store that int.
           +--------+
address--->| byte 0 |
           | byte 1 |
           | byte 2 |
           | byte 3 |
           +--------+

It may help to think in terms of objects¹ rather than bytes. Most useful data types in C take up more than a single byte.
As for an expression like &x evaluating to multiple addresses, think of it like the address of your house - you don't specify a distinct address for every room in the house, do you? No, for the purpose of telling other people where your house is, you only need to give one address. For the purpose of knowing where an int or double or struct humongous object is, we only need to know the address of its first byte.
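A quick way to see this on your own machine (a sketch; the printed address will differ from run to run):

#include <stdio.h>

int main(void) {
    int x = 0;
    /* One address is printed, but sizeof reports how many bytes
       the object occupies starting at that address. */
    printf("%p, %zu bytes\n", (void *)&x, sizeof x);
    return 0;
}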
You can access and manipulate individual bytes in a larger object in several different ways. You can use bit masking operations like
int x = some_value;
unsigned char aByte = (x & 0xFF000000) >> 24; // isolate the MSB
or you can map the object onto an array of unsigned char using a union:
union {
    int x;
    unsigned char b[sizeof (int)];
} u;

u.x = some_value;
aByte = u.b[0]; // access the initial byte - depending on byte ordering,
                // this may be the MSB or the LSB
or by creating a pointer to the first byte:
int x = some_value;
unsigned char *b = (unsigned char *) &x;
unsigned char aByte = b[0];
Byte ordering is a thing - some architectures store multi-byte values starting at the most significant byte, others starting at the least significant byte:
                 For any address A:

                 A+0 A+1 A+2 A+3
Big endian:     +---+---+---+---+
                |MSB|   |   |LSB|
                +---+---+---+---+
Little endian:   A+3 A+2 A+1 A+0
The M68K chips that powered the original Macintosh were big-endian, while x86 is little-endian.
Bitwise operators like & and | take byte ordering into account - x & 0xFF000000 will always isolate the MSB². When you map an object onto an array of unsigned char, the first element may map to the MSB, or it may map to the LSB, or it may map to something else (the old VAX architecture used a "middle-endian" ordering for 32-bit floats that went either 2301 or 1032, can't remember which offhand).
¹ In the C sense of a region of storage that may be used to hold a value, not the OOP sense of an instance of a class.
² Assuming 32-bit int and 8-bit bytes, anyway.
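Here's a small program tying those pieces together (a sketch, assuming 32-bit int and 8-bit bytes):

#include <stdio.h>

int main(void) {
    int x = 0x11223344;
    unsigned char msb_by_mask = (x & 0xFF000000) >> 24; /* always 0x11 */
    unsigned char first_byte  = *(unsigned char *)&x;   /* 0x44 on little-endian,
                                                           0x11 on big-endian */
    printf("%02x %02x\n", msb_by_mask, first_byte);
    return 0;
}

On a little-endian machine this prints 11 44; on a big-endian machine, 11 11.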

Related

Converting 32 bit number to four 8bit numbers

I am trying to convert the input from a device (always integer between 1 and 600000) to four 8-bit integers.
For example,
If the input is 32700, I want 188 127 00 00.
I achieved this by using:
32700 % 256
32700 / 256
The above works till 32700. From 32800 onward, I start getting incorrect conversions.
I am totally new to this and would like some help to understand how this can be done properly.
Major edit following clarifications:
Given that someone has already mentioned the shift-and-mask approach (which is undeniably the right one), I'll give another approach, which, to be pedantic, is not portable, machine-dependent, and possibly exhibits undefined behavior. It is nevertheless a good learning exercise, IMO.
For various reasons, your computer represents integers as groups of 8-bit values (called bytes); note that, although extremely common, this is not always the case (see CHAR_BIT). For this reason, values that are represented using more than 8 bits span multiple bytes (hence the sizes that are multiples of 8 bits). For a 32-bit value, you use 4 bytes and, in memory, those bytes always follow each other.
We call a pointer a value containing the address in memory of another value. In that context, a byte is defined as the smallest (in terms of bit count) value that can be referred to by a pointer. For example, your 32-bit value, covering 4 bytes, will have 4 "addressable" cells (one per byte) and its address is defined as the first of those addresses:
|==================|
| MEMORY | ADDRESS |
|========|=========|
|  ...   |   x-1   | <== Pointer to byte before
|--------|---------|
| BYTE 0 |   x     | <== Pointer to first byte (also pointer to 32-bit value)
|--------|---------|
| BYTE 1 |   x+1   | <== Pointer to second byte
|--------|---------|
| BYTE 2 |   x+2   | <== Pointer to third byte
|--------|---------|
| BYTE 3 |   x+3   | <== Pointer to fourth byte
|--------|---------|
|  ...   |   x+4   | <== Pointer to byte after
|==================|
So what you want to do (split the 32-bit word into 8-bits word) has already been done by your computer, as it is imposed onto it by its processor and/or memory architecture. To reap the benefits of this almost-coincidence, we are going to find where your 32-bit value is stored and read its memory byte-by-byte (instead of 32 bits at a time).
As all serious SO answers seem to do, let me cite the Standard (ISO/IEC 9899:2018, 6.2.5-20) to define the last thing I need (emphasis mine):
Any number of derived types can be constructed from the object and function types, as follows:
An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. [...] Array types are characterized by their element type and by the number of elements in the array. [...]
[...]
So, as elements in an array are defined to be contiguous, a 32-bit value in memory, on a machine with 8-bit bytes, really is nothing more, in its machine representation, than an array of 4 bytes!
Given a 32-bit signed value:
int32_t value;
its address is given by &value. Meanwhile, an array of 4 8-bit bytes may be represented by:
uint8_t arr[4];
notice that I use the unsigned variant because those bytes don't really represent a number per se so interpreting them as "signed" would not make sense. Now, a pointer-to-array-of-4-uint8_t is defined as:
uint8_t (*ptr)[4];
and if I assign the address of our 32-bit value to such a pointer, I will be able to index each byte individually, which means that I will be reading the bytes directly, avoiding any pesky shifting-and-masking operations!
uint8_t (*bytes)[4] = (void *) &value;
I need to cast the pointer ((void *)) because I can't bear that whining compiler: &value's type is "pointer-to-int32_t" while I'm assigning it to a "pointer-to-array-of-4-uint8_t", and this type mismatch is caught by the compiler and pedantically warned against by the Standard; this is a first warning that what we're doing is not ideal!
Finally, we can access each byte individually by reading it directly from memory through indexing: (*bytes)[n] reads the n-th byte of value!
To put it all together, given a send_can(uint8_t) function:
for (size_t i = 0; i < sizeof(*bytes); i++)
    send_can((*bytes)[i]);
and, for testing purpose, we define:
void send_can(uint8_t b)
{
    printf("%hhu\n", b);
}
which prints, on my machine, when value is 32700:
188
127
0
0
Lastly, this shows yet another reason why this method is platform-dependent: the order in which the bytes of the 32-bit word are stored isn't always what you would expect from a theoretical discussion of binary representation, i.e.:
byte 0 contains bits 31-24
byte 1 contains bits 23-16
byte 2 contains bits 15-8
byte 3 contains bits 7-0
Actually, AFAIK, the C language permits any of the 24 possible orderings of those 4 bytes (this is called endianness). Shifting and masking, meanwhile, will always get you the n-th "logical" byte.
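For comparison, a sketch of the portable shift-and-mask approach mentioned at the top (it prints 188 127 0 0 regardless of byte order):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t value = 32700;
    /* Logical byte n, independent of how the machine stores the word. */
    for (int i = 0; i < 4; i++)
        printf("%u\n", (unsigned)((value >> (8 * i)) & 0xFFu));
    return 0;
}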
It really depends on how your architecture stores an int. For example:

8- or 16-bit system: short=16, int=16, long=32
32-bit system:       short=16, int=32, long=32
64-bit system:       short=16, int=32, long=64

This is not a hard and fast rule - you need to check your architecture first. There is also long long, but some compilers do not recognize it, and its size varies by architecture.
Some compilers provide uint8_t etc. (from <stdint.h>), so you can specify exactly how many bits your number has instead of worrying about ints and longs.
Having said that, you wish to convert a number into four 8-bit values. You could have something like:
unsigned long x = 600000UL; // you need UL to indicate it is unsigned long
unsigned int b1 = (unsigned int)(x & 0xff);
unsigned int b2 = (unsigned int)(x >> 8) & 0xff;
unsigned int b3 = (unsigned int)(x >> 16) & 0xff;
unsigned int b4 = (unsigned int)(x >> 24);
Using shifts is a lot faster than multiplication, division, or mod. Which byte goes where depends on the endianness you wish to achieve - you could reverse the assignments, using b1 with the formula for b4, etc.
You could do some bit masking.
600000 is 0x927C0
600000 / (256 * 256) gets you the 9, no masking yet.
(600000 & (255 * 256)) >> 8 gets you the 0x27 == 39. Using an 8-bit-shifted mask of 8 set bits (256 * 255) and a right shift by 8 bits, the >> 8, which could also be done as (600000 / 256) & 255.
600000 % 256 gets you the 0xC0 == 192 as you did it. Masking would be 600000 & 255.
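Collected into one runnable sketch:

#include <stdio.h>

int main(void)
{
    unsigned long x = 600000UL;                  /* 0x927C0      */
    printf("%lu\n", x / (256UL * 256UL));        /* 9            */
    printf("%lu\n", (x & (255UL * 256UL)) >> 8); /* 0x27 == 39   */
    printf("%lu\n", x % 256UL);                  /* 0xC0 == 192  */
    return 0;
}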
I ended up doing this:
unsigned char bytes[4];
unsigned long n;
n = (unsigned long) sensore1 * 100;
bytes[0] = n & 0xFF;
bytes[1] = (n >> 8) & 0xFF;
bytes[2] = (n >> 16) & 0xFF;
bytes[3] = (n >> 24) & 0xFF;
CAN_WRITE(0x7FD,8,01,sizeof(n),bytes[0],bytes[1],bytes[2],bytes[3],07,255);
I have been in a similar situation while packing and unpacking huge custom packets of data to be transmitted/received. I suggest you try the approach below:
typedef union
{
    uint32_t u4_input;
    uint8_t  u1_byte_arr[4];
} UN_COMMON_32BIT_TO_4X8BIT_CONVERTER;

UN_COMMON_32BIT_TO_4X8BIT_CONVERTER un_t_mode_reg;
un_t_mode_reg.u4_input = input; /* your 32-bit input */
// 1st byte = un_t_mode_reg.u1_byte_arr[0];
// 2nd byte = un_t_mode_reg.u1_byte_arr[1];
// 3rd byte = un_t_mode_reg.u1_byte_arr[2];
// 4th byte = un_t_mode_reg.u1_byte_arr[3];
The largest positive value you can store in a 16-bit signed int is 32767. If you force a number bigger than that, you'll get a negative number as a result, hence unexpected values returned by % and /.
Use either unsigned 16-bit int for a range up to 65535 or a 32-bit integer type.
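A short sketch of the failure mode and the fix (the wrapped value on conversion to int16_t is implementation-defined, but typically negative):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int16_t n = (int16_t)32800;          /* out of range for 16-bit signed: typically wraps negative */
    printf("%d %d\n", n % 256, n / 256); /* negative operands break the byte extraction */

    uint16_t u = 32800;                  /* fits: unsigned 16-bit goes up to 65535 */
    printf("%d %d\n", u % 256, u / 256); /* 32 128 */
    return 0;
}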

How to convert to integer a char[4] of "hexadecimal" numbers [C/Linux]

So I'm working with system calls in Linux. I'm using "lseek" to navigate through the file and "read" to read. I'm also using Midnight Commander to see the file in hexadecimal. The next 4 bytes I have to read are in little-endian, and look like this: "2A 00 00 00". But of course, the bytes can be something like "2A 5F B3 00". I have to convert those bytes to an integer. How do I approach this? My initial thought was to read them into a vector of 4 chars, and then to build my integer from there, but I don't know how. Any ideas?
Let me give you an example of what I've tried. I have the following bytes in file "44 00". I have to convert that into the value 68 (4 + 4*16):
char value[2];
read(fd, value, 2);
int i = (value[0] << 8) | value[1];
The variable i is 17480 instead of 68.
UPDATE: Nvm. I solved it. I mixed up the indexes when I shifted. It should've been value[1] << 8 ... | value[0]
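For reference, the corrected snippet (using unsigned char also avoids sign-extension if the high byte happens to be 0x80 or above):

unsigned char value[2];
read(fd, value, 2);
int i = (value[1] << 8) | value[0]; /* little-endian: low byte first */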
General considerations
There seem to be several pieces to the question -- at least how to read the data, what data type to use to hold the intermediate result, and how to perform the conversion. If indeed you are assuming that the on-file representation consists of the bytes of a 32-bit integer in little-endian order, with all bits significant, then I probably would not use a char[] as the intermediate, but rather a uint32_t or an int32_t. If you know or assume that the endianness of the data is the same as the machine's native endianness, then you don't need any other.
Determining native endianness
If you need to compute the host machine's native endianness, then this will do it:
static const uint32_t test = 1;
_Bool host_is_little_endian = *(char *)&test;
It is worthwhile doing that, because it may well be the case that you don't need to do any conversion at all.
Reading the data
I would read the data into a uint32_t (or possibly an int32_t), not into a char array. Possibly I would read it into an array of uint8_t.
uint32_t data;
int num_read = fread(&data, 4, 1, my_file);
if (num_read != 1) { /* ... handle error ... */ }
Converting the data
It is worthwhile knowing whether the on-file representation matches the host's endianness, because if it does, you don't need to do any transformation (that is, you're done at this point in that case). If you do need to swap endianness, however, then you can use ntohl() or htonl():
if (!host_is_little_endian) {
    data = ntohl(data);
}
(This assumes that little- and big-endian are the only host byte orders you need to be concerned with. Historically, there have been others, which is why the byte-reorder functions come in pairs, but you are extremely unlikely ever to see one of the others.)
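Alternatively, a sketch that sidesteps host endianness entirely by reading raw bytes and assembling the value arithmetically (assuming the on-file data is little-endian, as described):

uint8_t buf[4];
if (fread(buf, 1, 4, my_file) != 4) { /* ... handle error ... */ }
uint32_t data = (uint32_t)buf[0]
              | ((uint32_t)buf[1] << 8)
              | ((uint32_t)buf[2] << 16)
              | ((uint32_t)buf[3] << 24);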
Signed integers
If you need a signed instead of unsigned integer, then you can do the same, but use a union (note that unsigned and signed are keywords, so the members need other names):
union {
    uint32_t u;
    int32_t  s;
} data;
In all of the preceding, use data.u in place of plain data, and at the end, read out the signed result from data.s.
Suppose you point into your buffer:
unsigned char *p = &buf[20];
and you want to see the next 4 bytes as an integer and assign them to your integer, then you can cast it:
int i;
i = *(int *)p;
You just said that p is now a pointer to an int; you de-referenced that pointer and assigned the result to i.
However, this depends on the endianness of your platform. If your platform has a different endianness, you may first have to reverse-copy the bytes to a small buffer and then use this technique. For example:
unsigned char ibuf[4];
for (i = 3; i >= 0; i--) ibuf[i] = *p++;
i = *(int *)ibuf;
EDIT
The suggestions and comments of Andrew Henle and Bodo could give:
unsigned char *p = &buf[20];
int i, j;
unsigned char *pi = (unsigned char *)&i;
for (j = 3; j >= 0; j--) *pi++ = *p++;

// and the other endian:
int i, j;
unsigned char *pi = ((unsigned char *)&i) + 3;
for (j = 3; j >= 0; j--) *pi-- = *p++;

Copying a 4 element character array into an integer in C

A char is 1 byte and an integer is 4 bytes. I want to copy byte-by-byte from a char[4] into an integer. I thought of different methods but I'm getting different answers.
char str[4]="abc";
unsigned int a = *(unsigned int*)str;
unsigned int b = str[0]<<24 | str[1]<<16 | str[2]<<8 | str[3];
unsigned int c;
memcpy(&c, str, 4);
printf("%u %u %u\n", a, b, c);
Output is
6513249 1633837824 6513249
Which one is correct? What is going wrong?
It's an endianness issue. When you interpret the char* as an int* the first byte of the string becomes the least significant byte of the integer (because you ran this code on x86 which is little endian), while with the manual conversion the first byte becomes the most significant.
To put this into pictures, this is the source array:
  a      b      c      \0
+------+------+------+------+
| 0x61 | 0x62 | 0x63 | 0x00 |  <---- bytes in memory
+------+------+------+------+
When these bytes are interpreted as an integer in a little endian architecture the result is 0x00636261, which is decimal 6513249. On the other hand, placing each byte manually yields 0x61626300 -- decimal 1633837824.
Of course treating a char* as an int* is undefined behavior, so the difference is not important in practice because you are not really allowed to use the first conversion. There is however a way to achieve the same result, which is called type punning:
union {
    char str[4];
    unsigned int ui;
} u;

strcpy(u.str, "abc");
printf("%u\n", u.ui);
Neither of the first two is correct.
The first violates aliasing rules and may fail because the address of str is not properly aligned for an unsigned int. To reinterpret the bytes of a string as an unsigned int with the host system byte order, you may copy it with memcpy:
unsigned int a; memcpy(&a, &str, sizeof a);
(Presuming the size of an unsigned int and the size of str are the same.)
The second may fail with integer overflow because str[0] is promoted to an int, so str[0]<<24 has type int, but the value required by the shift may be larger than is representable in an int. To remedy this, use:
unsigned int b = (unsigned int) str[0] << 24 | …;
This second method interprets the bytes from str in big-endian order, regardless of the order of bytes in an unsigned int in the host system.
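Putting both fixes together (a sketch, assuming a 4-byte unsigned int and 8-bit chars):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char str[4] = "abc";

    unsigned int a;
    memcpy(&a, str, sizeof a); /* bytes in host order, no aliasing violation */

    unsigned int b = (unsigned int)(unsigned char)str[0] << 24
                   | (unsigned int)(unsigned char)str[1] << 16
                   | (unsigned int)(unsigned char)str[2] << 8
                   | (unsigned char)str[3];

    printf("%u %u\n", a, b); /* host-order vs. big-endian interpretation */
    return 0;
}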
unsigned int a = *(unsigned int*)str;
This initialization is not correct and invokes undefined behavior. It violates C aliasing rules and potentially violates processor alignment requirements.
You said you want to copy byte-by-byte.
That means the line unsigned int a = *(unsigned int*)str; is not allowed. However, what you're doing is a fairly common way of reading an array as a different type (such as when you're reading a stream from disk).
It just needs some tweaking:
char *str = "abc";
int i;
unsigned a;
char *c = (char *)&a;

for (i = 0; i < sizeof(unsigned); i++) {
    c[i] = str[i];
}
printf("%u\n", a);
Bear in mind, the data you're reading may not share the same endianness as the machine you're reading from. This might help:
void
changeEndian32(void *data)
{
    uint8_t *cp = (uint8_t *)data;
    union
    {
        uint32_t word;
        uint8_t  bytes[4];
    } temp;

    temp.bytes[0] = cp[3];
    temp.bytes[1] = cp[2];
    temp.bytes[2] = cp[1];
    temp.bytes[3] = cp[0];
    *((uint32_t *)data) = temp.word;
}
Both are correct in a way:
Your first solution copies in native byte order (i.e. the byte order the CPU uses) and thus may give different results depending on the type of CPU.
Your second solution copies in big endian byte order (i.e. most significant byte at lowest address) no matter what the CPU uses. It will yield the same value on all types of CPUs.
What is correct depends on how the original data (array of char) is meant to be interpreted.
E.g. Java code (class files) always use big endian byte order (no matter what the CPU is using). So if you want to read ints from a Java class file you have to use the second way. In other cases you might want to use the CPU dependent way (I think Matlab writes ints in native byte order into files, c.f. this question).
If you're using the CVI (National Instruments) compiler you can use the function Scan to do this:
unsigned int a;
For big endian:
Scan(str,"%1i[b4uzi1o3210]>%i",&a);
For little endian:
Scan(str,"%1i[b4uzi1o0123]>%i",&a);
The o modifier specifies the byte order.
i inside the square brackets indicates where to start in the str array.

how to create pointer to a bit in c-language

As we know, in C a char pointer traverses memory byte by byte, i.e. 1 byte per increment,
while an integer pointer moves 4 bytes per increment (with gcc) or 2 bytes (with the TC compiler).
for example:
char *cptr; // if this points to 0x100
cptr++; // now it points to 0x101
int *iptr; // if this points to 0x100
iptr++; // now it points to 0x104
My question is:
How to create a bit pointer in c which on incrementing traverse memory bit by bit?
The char is the 'smallest addressable unit' in C. You can't point directly at something smaller than that (such as a bit).
You can't. Using pointers, it's not possible to manipulate bits directly. (Do you really expect poor hypothetical bit *p = 1; p++ to return 1.125?)
However, you can use bitwise operators, such as <<, >>, | and & to access a specific bit within a byte.
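For example (a minimal sketch of the usual read/set/clear/toggle idioms):

unsigned char byte = 0xA5;  /* 1010 0101 */
int bit3 = (byte >> 3) & 1; /* read bit 3 (LSB = bit 0): here 0 */
byte |= 1u << 3;            /* set bit 3    -> 1010 1101 */
byte &= ~(1u << 3);         /* clear bit 3  -> 1010 0101 */
byte ^= 1u << 3;            /* toggle bit 3 */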
Conceptually, a "bit pointer" is not a single scalar, but an ordered pair consisting of a byte pointer and a bit index within that byte. You can represent this with a structure containing both, or with two separate objects. Performing arithmetic on them requires some modular reduction on your part; for example, if you want to access the bit 10 bits past a given bit, you have to add 10 to the bit index, then reduce it modulo 8, and increment the byte pointer part appropriately.
Incidentally, on historical systems that only had word-addressable memory, not byte-addressable, char * consisted of a word pointer and a byte index within the word. This is the exact same concept. The difference is that, while C provides char * even on machines without byte-addressable memory, it does not provide any built-in "bit pointer" type. You have to create it yourself if you want it.
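A sketch of that pair representation (the names bitptr, bitptr_add, and bitptr_read are invented here for illustration; bit 0 is taken as the least significant bit of each byte):

#include <stddef.h>

struct bitptr {
    unsigned char *byte; /* byte pointer part */
    unsigned       bit;  /* bit index, 0..7   */
};

static struct bitptr bitptr_add(struct bitptr p, size_t nbits)
{
    size_t total = p.bit + nbits;
    p.byte += total / 8; /* carry whole bytes into the pointer part */
    p.bit   = total % 8; /* keep the remainder as the bit index     */
    return p;
}

static int bitptr_read(struct bitptr p)
{
    return (*p.byte >> p.bit) & 1;
}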
No, but you can write a function to read the bits one by one:
int readBit(char *byteData, int bitOffset)
{
    const int wholeBytes = bitOffset / 8;
    const int remainingBits = bitOffset % 8;
    return (byteData[wholeBytes] >> remainingBits) & 1;
    // or if you want the most significant bit to be bit 0:
    // return (byteData[wholeBytes] >> (7 - remainingBits)) & 1;
}
Usage:
char *data = /* any memory you like */;
int bitPointer = 0;

int bit0 = readBit(data, bitPointer);
bitPointer++;
int bit1 = readBit(data, bitPointer);
bitPointer++;
int bit2 = readBit(data, bitPointer);
Of course, if this kind of function had general value it would probably already exist. Operating bit-by-bit is just inefficient compared to using bit masks, shifts, etc.
I don't think that is possible, since modern computers are byte-addressable, which means that there is one address for each byte. A bit therefore has no address of its own, and a pointer can't point to it. You could use a char * and bitwise operations to determine the value of individual bits.
If you really want it, you could write a class that uses a char * to keep track of the address in memory and a char to keep track of which bit within that byte you are at (the value never needs to exceed 0000 0111, so a char keeps the memory footprint small), and then overload the operators so that it functions the way you want.
I am not sure what you are asking is possible. You need to do some magic with bit shifting to traverse all the bits of a byte pointed to by the pointer.
You could always cast your pointer to an integer type that is at least 3 bits wider than a byte pointer on your system. Then shift the pointer left by 3 bits after the cast and store the bit index in the least significant 3 bits.
This integer "bit pointer" can then be incremented with normal arithmetic.
Something like this:
#include <stdio.h>

#define bitptr long long
#define create_bitptr(pointer,bit) ((((bitptr)(pointer))<<3)|(bit))
#define get_bit(bptr)   ((bptr)&7)
#define get_value(bptr) (*((char*)((bptr)>>3)))
#define set_bit(bptr)   (get_value(bptr) |= 1<<get_bit(bptr))
#define clear_bit(bptr) (get_value(bptr) &= ~(1<<get_bit(bptr)))

int main(void)
{
    char variable = 0;
    bitptr p;

    p = create_bitptr(&variable, 0);
    set_bit(p);   p++; // 1
    clear_bit(p); p++; // 0
    set_bit(p);   p++; // 1
    clear_bit(p); p++; // 0
    clear_bit(p); p++; // 0
    clear_bit(p); p++; // 0
    clear_bit(p); p++; // 0
    clear_bit(p); p++; // 0

    printf("%d\n", variable); // prints 5 (binary 00000101: bits 0 and 2 set)
    return 0;
}
With pointers it does not look possible. But to read or write any bit of the data you can try this one:
unsigned char data;

struct _p
{
    unsigned char B0:1;
    unsigned char B1:1;
    unsigned char B2:1;
    unsigned char B3:1;
    unsigned char B4:1;
    unsigned char B5:1;
    unsigned char B6:1;
    unsigned char B7:1;
};

int main()
{
    data = 15;
    struct _p *point = (struct _p *)&data;
    // You can read and write any bit of the byte with point->Bx.
    // Ex: printf("%d", point->B0);  point->B5 = 1;
    return 0;
}

how is data stored at bit level according to "Endianness"?

I read about Endianness and understood squat...
so I wrote this
main()
{
    int k = 0xA5B9BF9F;
    BYTE *b = (BYTE *)&k; // value at *b is 9F
    b++;                  // value at *b is BF
    b++;                  // value at *b is B9
    b++;                  // value at *b is A5
}
k was equal to A5 B9 BF 9F,
and the (byte) pointer "walk" output was 9F BF B9 A5,
so I get it - bytes are stored backwards... ok.
So now I thought: how is it stored at the BIT level...
I mean, is "9F" (1001 1111) stored as "F9" (1111 1001)?
so I wrote this
int _tmain(int argc, _TCHAR* argv[])
{
    int k = 0xA5B9BF9F;
    void *ptr = &k;
    bool temp = TRUE;
    cout << "ready or not here I come \n" << endl;
    for (int i = 0; i < 32; i++)
    {
        temp = *((bool *)ptr + i);
        if (temp)
            cout << "1 ";
        if (!temp)
            cout << "0 ";
        if (i == 7 || i == 15 || i == 23)
            cout << " - ";
    }
}
I get some random output;
even for numbers like "32" I don't get anything sensible.
Why?
Just for completeness, machines are described in terms of both byte order and bit order.
The intel x86 is called Consistent Little Endian because it stores multi-byte values in LSB to MSB order as memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31.
The Motorola 68000 is called Inconsistent Big Endian because it stores multi-byte values in MSB to LSB order as memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31 (same as intel, which is why it is called 'Inconsistent' Big Endian).
The 32-bit IBM/Motorola PowerPC is called Consistent Big Endian because it stores multi-byte values in MSB to LSB order as memory address increases. Its bit numbering convention is b0 = 2^31 and b31 = 2^0.
Under normal high level language use the bit order is generally transparent to the developer. When writing in assembly language or working with the hardware, the bit numbering does come into play.
Endianness, as you discovered by your experiment, refers to the order in which the bytes of an object are stored.
Bits do not get stored differently: a byte is always 8 bits, and always "human readable" (high->low).
Now that we've discussed that you don't need your code... About your code:
for (int i = 0; i < 32; i++)
{
    temp = *((bool *)ptr + i);
    ...
}
This isn't doing what you think it's doing. You're iterating over 0-31, the 32 bits in a word - good. But your temp assignment is all wrong :)
It's important to note that a bool* is the same size as an int* is the same size as a BigStruct*. All pointers on the same machine are the same size - 32bits on a 32bit machine, 64bits on a 64bit machine.
ptr + i is adding i bytes to the ptr address. When i>3, you're reading a whole new word... this could possibly cause a segfault.
What you want to use is bit-masks, applied to the value the pointer refers to (not to the pointer itself). Something like this should work:
for (int i = 0; i < 32; i++) {
    unsigned int mask = 1u << i;
    bool bit_is_one = *static_cast<unsigned int *>(ptr) & mask;
    ...
}
Your machine almost certainly can't address individual bits of memory, so the layout of bits inside a byte is meaningless. Endianness refers only to the ordering of bytes inside multibyte objects.
To make your second program make sense (though there isn't really any reason to, since it won't give you any meaningful results) you need to learn about the bitwise operators - particularly & for this application.
Byte Endianness
On different machines this code may give different results:
union endian_example {
    unsigned long u;
    unsigned char a[sizeof(unsigned long)];
} x;

x.u = 0x0a0b0c0d;
int i;
for (i = 0; i < sizeof(unsigned long); i++) {
    printf("%u\n", (unsigned)x.a[i]);
}
This is because different machines are free to store values in any byte order they wish. This is fairly arbitrary. There is no backwards or forwards in the grand scheme of things.
Bit Endianness
Usually you don't have to ever worry about bit endianness. The most common way to access individual bits is with shifts (>>, <<), but those are really tied to values, not bytes or bits. They perform an arithmetic operation on a value. That value is stored in bits (which are in bytes).
Where you may run into a problem in C with bit endianness is if you ever use a bit field. This is a rarely used (for this reason and a few others) "feature" of C that allows you to tell the compiler how many bits a member of a struct will use.
struct thing {
    unsigned y:1; // y will be one bit and can have the values 0 and 1
    signed   z:1; // z can only have the values 0 and -1
    unsigned a:2; // a can be 0, 1, 2, or 3
    unsigned b:4; // b is just here to take up the rest of the byte
};
Here the bit endianness is compiler-dependent. Should y be the most or least significant bit in a thing? Who knows? If you care about the bit ordering (when describing things like the layout of an IPv4 packet header, the control registers of a device, or just a storage format in a file) then you probably don't want to worry about some different compiler doing this the wrong way. Also, compilers aren't always as smart about how they work with bit fields as one would hope.
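If the exact bit layout matters (packet headers, device registers, file formats), the usual remedy is to pack and unpack explicitly with shifts, choosing the layout yourself; a sketch for a struct like the one above:

/* Pack y (1 bit), z (1 bit), a (2 bits), b (4 bits) into one byte,
   with y in the least significant position - a layout we choose
   ourselves instead of leaving it to the compiler. */
unsigned char pack_thing(unsigned y, unsigned z, unsigned a, unsigned b)
{
    return (unsigned char)((y & 1u)
                         | ((z & 1u) << 1)
                         | ((a & 3u) << 2)
                         | ((b & 0xFu) << 4));
}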
This line here:
temp = *( (bool*)ptr + i );
... when you do pointer arithmetic like this, the compiler moves the pointer on by the number you added times the sizeof the thing you are pointing to. Because you are casting your void* to a bool*, the compiler will be moving the pointer along by the size of one "bool", which is probably just an int under the covers, so you'll be printing out memory from further along than you thought.
You can't address the individual bits in a byte, so it's almost meaningless to ask which way round they are stored. (Your machine can store them whichever way it wants and you won't be able to tell). The only time you might care about it is when you come to actually spit bits out over a physical interface like I2C or RS232 or similar, where you have to actually spit the bits out one-by-one. Even then, though, the protocol would define which order to spit the bits out in, and the device driver code would have to translate between "an int with value 0xAABBCCDD" and "a bit sequence 11100011... [whatever] in protocol order".
