Endianness macro in C

I recently saw this post about endianness macros in C and I can't really wrap my head around the first answer.
Code supporting arbitrary byte orders, ready to be put into a file
called order32.h:
#ifndef ORDER32_H
#define ORDER32_H
#include <limits.h>
#include <stdint.h>
#if CHAR_BIT != 8
#error "unsupported char size"
#endif
enum
{
    O32_LITTLE_ENDIAN = 0x03020100ul,
    O32_BIG_ENDIAN = 0x00010203ul,
    O32_PDP_ENDIAN = 0x01000302ul
};
static const union { unsigned char bytes[4]; uint32_t value; } o32_host_order =
    { { 0, 1, 2, 3 } };
#define O32_HOST_ORDER (o32_host_order.value)
#endif
You would check for little endian systems via
O32_HOST_ORDER == O32_LITTLE_ENDIAN
I do understand endianness in general. This is how I understand the code:
Create examples of little, middle and big endianness.
Compare the test case to the examples of little, middle and big endianness and decide which type the host machine is.
What I don't understand are the following aspects:
Why is a union needed to store the test case? Isn't uint32_t guaranteed to be able to hold 32 bits/4 bytes as needed? And what does the assignment { { 0, 1, 2, 3 } } mean? It assigns the value to the union, but why the strange notation with two braces?
Why the check for CHAR_BIT? One comment mentions that it would be more useful to check UINT8_MAX? Why is char even used here, when it's not guaranteed to be 8 bits wide? Why not just use uint8_t? I found this link to Google-Devs github. They don't rely on this check... Could someone please elaborate?

Why is a union needed to store the test case?
The entire point of the test is to alias the array with the magic value the array will create.
Isn't uint32_t guaranteed to be able to hold 32 bits/4 bytes as needed?
Well, more or less. It will, but beyond being exactly 32 bits there are no other guarantees (the type is even optional). It would be missing only on some really fringe architecture you will never encounter.
And what does the assignment { { 0, 1, 2, 3 } } mean? It assigns the value to the union, but why the strange markup with two braces?
The inner braces are for the array, which is the first member of the union.
Why the check for CHAR_BIT?
Because that's the actual guarantee. If that doesn't blow up, everything will work.
One comment mentions that it would be more useful to check UINT8_MAX? Why is char even used here, when it's not guaranteed to be 8 bits wide?
Because in fact it always is, these days.
Why not just use uint8_t? I found this link to Google-Devs github. They don't rely on this check... Could someone please elaborate?
Lots of other choices would work also.

The initialization has two sets of braces because the inner braces initialize the bytes array. So bytes[0] is 0, bytes[1] is 1, etc.
The union allows a uint32_t to lie on the same bytes as the char array and be interpreted in whatever the machine's endianness is. So if the machine is little endian, 0 is in the low order byte and 3 is in the high order byte of value. Conversely, if the machine is big endian, 0 is in the high order byte and 3 is in the low order byte of value.

{{0, 1, 2, 3}} is the initializer for the union, which results in the bytes member being filled with [0, 1, 2, 3].
Now, since the bytes array and the uint32_t occupy the same space, you can read the same storage as a native 32-bit integer. The value of that integer shows you how the array was shuffled - which really means which endian system you are using.
There are only 3 popular possibilities here - O32_LITTLE_ENDIAN, O32_BIG_ENDIAN, and O32_PDP_ENDIAN.
As for the char / uint8_t - I don't know. I think it makes more sense to just use uint8_t with no checks.
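To tie it all together, here is a minimal sketch (not part of the original post) that includes order32.h and prints the byte order it detects at run time:
#include <stdio.h>
#include "order32.h"

int main(void)
{
    if (O32_HOST_ORDER == O32_LITTLE_ENDIAN)
        puts("little-endian");
    else if (O32_HOST_ORDER == O32_BIG_ENDIAN)
        puts("big-endian");
    else if (O32_HOST_ORDER == O32_PDP_ENDIAN)
        puts("PDP-endian (middle-endian)");
    else
        puts("unknown byte order");
    return 0;
}
Note that O32_HOST_ORDER reads a union member at run time, so it can only be tested in an ordinary if, not in a preprocessor #if.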

Related

C - Big-endian struct interconvert with little-endian struct

I have two structs which have the same data members (one is a big-endian struct, the other little-endian), and now I have to convert between them. But when I wrote the code, I found that there is a lot of repeated code with only small changes. How can I make this code more elegant without the repetition? (By repeated code I mean that the mode == 1 and mode == 2 branches are almost identical, differing only in which side of the assignment each struct is on. It doesn't look elegant, but it works.)
here is my code:
#pragma scalar_storage_order big-endian
typedef struct {
    int a1;
    short a2;
    char a3;
    int a4;
} test_B;
#pragma scalar_storage_order default
typedef struct {
    int a1;
    short a2;
    char a3;
    int a4;
} test_L;
void interconvert(test_L *little, test_B *big, int mode) {
    // if mode == 1 , convert little to big
    // if mode == 2 , convert big to little
    // it may be difficult and redundant when the struct has lots of data member!
    if(mode == 1) {
        big->a1 = little->a1;
        big->a2 = little->a2;
        big->a3 = little->a3;
        big->a4 = little->a4;
    }
    else if(mode == 2) {
        little->a1 = big->a1;
        little->a2 = big->a2;
        little->a3 = big->a3;
        little->a4 = big->a4;
    }
    else return;
}
Note: the above code must run on gcc-7 or higher, because of the #pragma scalar_storage_order
An answer was posted which suggested using memcpy for this problem, but that answer has since been deleted. Actually that answer was right, if used correctly, and I want to explain why.
The #pragma specified by the OP is central, as he points out:
Note: the above code must run on gcc-7 or higher because of the #pragma scalar_storage_order
The struct from the OP:
#pragma scalar_storage_order big-endian
typedef struct {
    int a1;
    short a2;
    char a3;
    int a4;
} test_B;
means that the statement "test_B.a2 = 256" writes, into the two consecutive bytes belonging to the a2 member, the values 1 and 0 respectively. This is big-endian. The similar statement "test_L.a2 = 256" would instead store the bytes 0 and 1 (little-endian).
The following memcpy:
memcpy(&test_L, &test_B, sizeof test_L)
would make the bytes of test_L.a2 equal to 1 and 0, because that is the RAM content of test_B.a2. But now, reading test_L.a2 in little-endian mode, those two bytes mean 1. We wrote 256 and read back 1. This is exactly the wanted conversion.
To use this mechanism correctly, it is sufficient to write into one struct, memcpy() it into the other, and then read the other, member by member. What was big-endian becomes little-endian and vice versa. Of course, if the intention is to process the data and apply calculations to it, it is important to know what endianness the data has; if it matches the default mode, no transformation has to be done before the calculations, but the transformation has to be applied afterwards. Conversely, if the incoming data does not match the "default endianness" of the processor, it must be transformed first.
EDIT
After the comment of the OP, below, I investigated more. I took a look at this https://gcc.gnu.org/onlinedocs/gcc/Structure-Layout-Pragmas.html
Well, there are three #pragma settings available to choose the byte layout: big-endian, little-endian, and default. One of the first two is equal to the last: if the target machine is little-endian, default means little-endian; if it is big-endian, default means big-endian. This is only logical.
So doing a memcpy() between big-endian and default does nothing on a big-endian machine, and this too is logical. Let me also stress that memcpy() does absolutely nothing special per se: it only moves data from a RAM area treated in a certain manner to another area treated in a different manner. The two areas are treated differently only when a normal member access is done: that is where #pragma scalar_storage_order comes into play. And as I wrote before, it is important to know what endianness the data entering the program has. If it comes from a TCP network, for example, we know it is big-endian; more generally, if it is taken from outside the program and follows a protocol, we should know its endianness.
To convert from one endianness to the other, one should use little-endian and big-endian, NOT default, because default is surely equal to one of the former two.
Still another edit
Stimulated by the comments, and by Jamesdlin who used an online compiler, I tried it too. At this URL http://tpcg.io/lLe5EW
there is a demonstration that assigning to a member of one struct, doing a memcpy() to the other, and reading that member back performs the endian conversion. That's all.
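For concreteness, here is a minimal sketch of that write/memcpy/read round-trip, assuming gcc 7 or higher and the OP's test_B / test_L definitions above:
#include <stdio.h>
#include <string.h>

int main(void)
{
    test_B b = {0};
    test_L l;

    b.a2 = 256;               // stored big-endian inside b: bytes 0x01 0x00
    memcpy(&l, &b, sizeof l); // the raw bytes are copied unchanged
    printf("%d\n", l.a2);     // read back little-endian: prints 1

    return 0;
}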

Bit Field of a specific size and order

There are several times in C in which a type is guaranteed to be at LEAST a certain size, but not necessarily exactly that size (sizeof(int) can result in 2 or 4). However, I need to be absolutely certain of some sizes and memory locations. If I have a union such as below:
typedef union{
    struct{
        unsigned int a:1, b:1, c:1, d:1, e:1, f:1, g:1, h:1;
    };
    unsigned int val:8;
} foo;
Is it absolutely guaranteed that the value of val is 8 bits long? Moreover, is it guaranteed that a is the most significant bit of val, and b is the second-most significant bit? I wish to do something like this:
foo performLogicalNOT(foo x){
    foo product;
    product.val = ~x.val;
    return product;
}
And thus with an input of specific flags, return a union with exactly the opposite flags (11001100 -> 00110011). The actual functions are more complex, and require that the size of val be exactly 8. I also want to perform AND and OR in the same manner, so it is crucial that each a and b value be where I expect them to be and the size I expect them to be.
How the bits are packed is not standardized and is pretty much implementation-defined. Have a look at this answer.
Instead of relying on a union, it is better to use bitmasks to derive the values. For the above example, a plain char foo can be used. All operations (like ~) would be done on foo only. To get or set specific bits, an appropriate bitmask can be used.
#define BITMASK_A 0x80
#define BITMASK_B 0x40
and so on..
To get the value of 'a' bit, use:
foo & BITMASK_A
To set the bit to 1, use:
foo | BITMASK_A
To reset the bit to 0, use:
foo & (~BITMASK_A)
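Putting those snippets together, a minimal self-contained sketch of the bitmask approach (the names and values below are illustrative, not from the question):
#include <stdint.h>
#include <stdio.h>

#define BITMASK_A 0x80u   /* most significant bit */
#define BITMASK_B 0x40u   /* ...and so on, down to BITMASK_H 0x01u */

int main(void)
{
    uint8_t flags = 0xCCu;                    /* 11001100 */

    int a_is_set = (flags & BITMASK_A) != 0;  /* test bit 'a' */
    flags |= BITMASK_B;                       /* set bit 'b' to 1 */
    flags &= (uint8_t)~BITMASK_A;             /* reset bit 'a' to 0 */

    uint8_t inverted = (uint8_t)~flags;       /* 8-bit logical NOT */
    printf("a=%d flags=0x%02x inverted=0x%02x\n",
           a_is_set, (unsigned)flags, (unsigned)inverted);
    return 0;
}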

C programming: words from byte array

I have some confusion regarding reading a word from a byte array. The background context is that I'm working on a MIPS simulator written in C for an intro computer architecture class, but while debugging my code I ran into a surprising result that I simply don't understand from a C programming standpoint.
I have a byte array called mem defined as follows:
uint8_t *mem;
//...
mem = calloc(MEM_SIZE, sizeof(uint8_t)); // MEM_SIZE is pre defined as 1024x1024
During some of my testing I manually stored a uint32_t value into four of the blocks of memory at an address called mipsaddr, one byte at a time, as follows:
for(int i = 3; i >= 0; i--) {
    *(mem+mipsaddr+i) = value;
    value = value >> 8;
    // in my test, value = 0x1084
}
Finally, I tested trying to read a word from the array in one of two ways. In the first way, I basically tried to read the entire word into a variable at once:
uint32_t foo = *(uint32_t*)(mem+mipsaddr);
printf("foo = 0x%08x\n", foo);
In the second way, I read each byte from each cell manually, and then added them together with bit shifts:
uint8_t test0 = mem[mipsaddr];
uint8_t test1 = mem[mipsaddr+1];
uint8_t test2 = mem[mipsaddr+2];
uint8_t test3 = mem[mipsaddr+3];
uint32_t test4 = (mem[mipsaddr]<<24) + (mem[mipsaddr+1]<<16) +
                 (mem[mipsaddr+2]<<8) + mem[mipsaddr+3];
printf("test4= 0x%08x\n", test4);
The output of the code above came out as this:
foo= 0x84100000
test4= 0x00001084
The value of test4 is exactly as I expect it to be, but foo seems to have reversed the order of the bytes. Why would this be the case? In the case of foo, I expected the uint32_t* pointer to point to mem[mipsaddr], and since it's 32-bits long, it would just read in all 32 bits in the order they exist in the array (which would be 00001084). Clearly, my understanding isn't correct.
I'm new here, and I did search for the answer to this question but couldn't find it. If it's already been posted, I apologize! But if not, I hope someone can enlighten me here.
It is (among others) explained here: http://en.wikipedia.org/wiki/Endianness
When storing data larger than one byte into memory, it depends on the architecture (meaning, the CPU) in which order the bytes are stored: either the most significant byte is stored first and the least significant byte last, or vice versa. When you read back the individual bytes through byte access operations and then merge them to form the original value again, you need to take the endianness of your particular system into account.
In your for-loop, you are storing your value byte-wise, starting with the most significant byte (counting down the index is a bit misleading ;-). Your memory looks like this afterwards: 0x00 0x00 0x10 0x84.
You are then reading the word back with a single 32-bit (four-byte) access. Depending on your architecture, this will come out as either 0x00001084 (big-endian) or 0x84100000 (little-endian). Since you get the latter, you are working on a little-endian system.
In your second approach, you are using the same order in which you stored the individual bytes (most significant first), so you get back the same value which you stored earlier.
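To make the 32-bit read come out the same regardless of the host's byte order, reassemble the value from the individual bytes, as in your second approach. A small sketch (the helper name load_be32 is just illustrative):
#include <stdint.h>

/* Reassemble four bytes stored most-significant-first (as in the
   question's store loop) into a uint32_t, independent of host endianness. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
With that, uint32_t foo = load_be32(mem + mipsaddr); yields 0x00001084 on both little-endian and big-endian machines.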
It seems to be a problem of endianness; it probably comes from casting (uint8_t *) to (uint32_t *).

Portability of C code for different memory addressing schemes

If I understand correctly, the DCPU-16 specification for 0x10c describes a 16-bit address space where each offset addresses a 16-bit word, instead of a byte as in most other memory architectures. This has some curious consequences, e.g. I imagine that sizeof(char) and sizeof(short) would both return 1.
Is it feasible to keep C code portable between such different memory addressing schemes? What would be the gotchas to keep in mind?
edit: perhaps I should have given a more specific example. Let's say you have some networking code that deals with byte streams. Do you throw away half of your memory by putting only one byte at each address so that the code can stay the same, or do you generalize everything with bitshifts to deal with N bytes per offset?
edit2: The answers seem to focus on the issue of data type sizes, which wasn't the point - I shouldn't even have mentioned it. The question is about how to cope with losing the ability to address any byte in memory with a pointer. Is it reasonable to expect code to be agnostic about this?
It's totally feasible. Roughly speaking, C's basic integer data types have sizes that uphold:
sizeof (char) <= sizeof (short) <= sizeof (int) <= sizeof (long)
The above is not exactly what the spec says, but it's close.
As pointed out by awoodland in a comment, you'd also expect a C compiler for the DCPU-16 to have CHAR_BIT == 16.
Bonus for not assuming that the DCPU-16 would have sizeof (char) == 2, that's a common fallacy.
When you say 'losing the ability to address a byte', I assume you mean an octet rather than a char. Portable code should only assume CHAR_BIT >= 8. In practice, architectures that don't have byte addressing often define CHAR_BIT == 8 and let the compiler generate the instructions for accessing the byte.
I actually disagree with the answers suggesting CHAR_BIT == 16 as a good choice. I'd prefer CHAR_BIT == 8, with sizeof(short) == 2. The compiler can handle the shifting / masking for byte access in this case, just as it does for many RISC architectures.
I imagine Notch will revise and clarify the DCPU-16 spec further; there are already requests for an interrupt mechanism, and further instructions. It's an aesthetic backdrop for a game, so I doubt there will be an official ABI spec any time soon. That said, someone will be working on it!
Edit:
Consider an array of char in C. The compiler packs 2 bytes in each native 16-bit word of DCPU memory. So if we access, say, the 10th element (index 9), fetch the word # [9 / 2] = 4, and extract the byte # [9 % 2] = 1.
Let 'X' be the start address of the array, and 'I' be the index:
SET J, I
SHR J, 1 ; J = I / 2
ADD J, X ; J holds word address
SET A, [J] ; A holds word
AND I, 0x1 ; I = I % 2 {0 or 1}
MUL I, 8 ; I = {0 or 8} ; could use: SHL I, 3
SHR A, I ; right shift by I bits for hi or lo byte.
The register A holds the 'byte' - it's a 16 bit register, so the top half can be ignored.
Alternatively, the top half can be zeroed:
AND A, 0xff ; mask lo byte.
This is not optimized, but it conveys the idea.
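In C terms, the same byte access could be sketched like this (a hypothetical illustration of what a compiler might emit, not code from any actual DCPU-16 toolchain):
#include <stdint.h>

/* Read the i-th 8-bit "byte" out of word-addressed 16-bit memory,
   mirroring the assembly sequence above. */
static uint8_t read_byte(const uint16_t *mem, uint32_t i)
{
    uint16_t word  = mem[i / 2];   /* fetch the containing 16-bit word */
    unsigned shift = (i % 2) * 8;  /* 0 for the low byte, 8 for the high byte */
    return (uint8_t)((word >> shift) & 0xff);
}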
The relation goes rather like this:
1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)
The short type can have size 1, and as a matter of fact you may even want the int type to have size 1 too (I didn't read the spec, but I'm assuming the natural data type is 16-bit). This stuff is defined by the compiler.
For practicality, the compiler may want to make long larger than int, even if that requires it to do some extra work (like implementing addition/multiplication etc. in software).
This isn't a memory addressing issue, but rather a granularity question.
Yes, it is entirely possible to port C code.
In terms of data transfer it would be advisable either to pack the bits (or use compression) or to send the data in 16-bit bytes.
Because the CPU will almost exclusively communicate with (game-)internal devices that will likely also be 16-bit, this should be no real problem.
BTW, I agree that CHAR_BIT should be 16: since (IIRC) each char must be addressable, making CHAR_BIT == 8 would REQUIRE sizeof(char*) == 2, which would make everything else overcomplicated.

Using array of chars as an array of long ints

On my AVR I have an array of chars that hold color intensity information in the form of {R,G,B,x,R,G,B,x,...} (x being an unused byte). Is there any simple way to write a long int (32-bits) to char myArray[4*LIGHTS] so I can write a 0x00BBGGRR number easily?
My typecasting is rough, and I'm not sure how to write it. I'm guessing just make a pointer to a long int type and set that equal to myArray, but then I don't know how to arbitrarily tell it to set group x to myColor.
uint8_t myLights[4*LIGHTS];
uint32_t *myRGBGroups = myLights; // ?
*myRGBGroups = WHITE; // sets the first 4 bytes to WHITE
// ...but how to set the 10th group?
Edit: I'm not sure if typecasting is even the proper term, as I think that would be if it just truncated the 32-bit number to 8-bits?
typedef union {
    struct {
        uint8_t red;
        uint8_t green;
        uint8_t blue;
        uint8_t alpha;
    } rgba;
    uint32_t single;
} Color;

Color colors[LIGHTS];
colors[0].single = WHITE;
colors[0].rgba.red -= 5;
NOTE: On a little-endian system, the low-order byte of the 4-byte value will be the alpha value; whereas it will be the red value on a big-endian system.
Your code is essentially valid (you just need an explicit cast: uint32_t *myRGBGroups = (uint32_t *)myLights;). You can use myRGBGroups as a regular array, so to access the 10th pixel you can use
myRGBGroups[9]
Think of using a C union, where the first field is an int32 and the second is an array of 4 chars. But I'm not sure if this is the best way for you.
You need to account for the endianness of uint32_t on the AVR to make sure the components are being stored in the correct order (for later dereferencing via myLights array) if you're going to do this. A quick Google seems to indicate that AVRs store data in memory little-endian, but other registers vary in endianness.
Anyway, assuming you've done that, you can dereference myRGBGroups using array indexing (where each index will reference a block of 4 bytes). So, to set the 10th group, you can just do myRGBGroups[ 9 ] = COLOR.
You can use pointer arithmetic on myRGBgroups; for example, myRGBgroups++ will give the next group, and similarly you can use plus, minus, etc. Those operators work in units of the pointed-to type's size, rather than in single bytes:
myRGBgroups[10] // access group as int
((uint8_t*)(myRGBgroups + 10)) // access group as uint8 array
A union of a struct and a uint32_t is a much better idea than making a uint8_t array of size 4 * LIGHTS. Another fairly common way to do this is to use macros or inline functions that do the bitwise arithmetic necessary to create the correct uint32_t:
#define MAKE_RGBA32(r,g,b,a) (uint32_t)(((r)<<24)|((g)<<16)|((b)<<8)|(a))
uint32_t colors[NUM_COLORS];
colors[i] = MAKE_RGBA32(255,255,255,255);
Depending on your endianness the values may need to be placed into the int in a different order. This technique is common because for older 16bpp color formats like RGBA5551 or RGB565, it makes more sense to think of the colors in terms of the bitwise arithmetic than in units of bytes.
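As an illustration of that point, a sketch of an RGB565 packing macro (assuming the usual 5-6-5 layout with red in the most significant bits) could look like this:
#include <stdint.h>

/* Pack 5 bits of red, 6 bits of green and 5 bits of blue into one
   16-bit pixel value. */
#define MAKE_RGB565(r, g, b) \
    ((uint16_t)((((r) & 0x1fu) << 11) | (((g) & 0x3fu) << 5) | ((b) & 0x1fu)))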
You can perform something similar using struct assignment - this gets around the endian problem:
typedef struct Color {
    unsigned char r, g, b, a;
} Color;
const Color WHITE = {0xff, 0xff, 0xff, 0};
const Color RED = {0xff, 0, 0, 0};
const Color GREEN = {0, 0xff, 0, 0};
Color colors[] = {WHITE, RED};
Color c;
colors[1] = GREEN;
c = colors[1];
However, comparison is not defined by the standard, so you can't use c == GREEN, and you can't use the {} shortcut in assignment (only in initialisation), so c = {0, 0, 0, 0} would fail.
Also bear in mind that if it's an 8 bit AVR (as opposed to an AVR32 say), then you most likely won't see any performance benefit from either technique.
