Test big endian [duplicate] - c

Possible Duplicate:
Little vs Big Endianess: How to interpret the test
Is there an easy way to test code for big-endian behavior with gcc or an online compiler like ideone? I don't want to use qemu or virtual machines.
EDIT
Can someone explain the behavior of this piece of code on a system using big endian?
#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    int32_t i;
    unsigned char u[4] = {'a', 'b', 'c', 'd'};

    memcpy(&i, u, sizeof(u));
    printf("%d\n", i);
    memcpy(u, &i, sizeof(i));
    for (i = 0; i < 4; i++) {
        printf("%c", u[i]);
    }
    printf("\n");
    return 0;
}

As a program?
#include <stdio.h>
#include <stdint.h>

int main(int argc, char** argv) {
    union {
        uint32_t word;
        uint8_t bytes[4];
    } test_struct;

    test_struct.word = 0x1;
    if (test_struct.bytes[0] != 0)
        printf("little-endian\n");
    else
        printf("big-endian\n");
    return 0;
}
On a little-endian architecture, the least significant byte is stored first. On a big-endian architecture, the most significant byte is stored first. So by overlaying a uint32_t with a uint8_t[4], I can check which byte comes first. See: http://en.wikipedia.org/wiki/Big_endian
GCC in particular defines the __BYTE_ORDER__ macro as an extension. You can test against __ORDER_BIG_ENDIAN__, __ORDER_LITTLE_ENDIAN__, and __ORDER_PDP_ENDIAN__ (which I didn't know existed!) -- see http://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html
As for running code with an endianness that doesn't match your machine's native endianness: you'll have to compile and run it on an architecture that actually has that endianness. That means cross-compiling and running on an emulator or virtual machine.
edit: ah, I didn't see the first printf().
The first printf will print "1633837924", since a big-endian machine will interpret the 'a' character as the most significant byte in the int.
The second printf will just print "abcd", since the value of u has been copied byte-by-byte back and forth from i.

Related

Checking Endianness of RISC-V machine using C-code

Can someone please help me out with this? Below is a C program, which most of you are familiar with, that checks the endianness of a machine.
What would be the result if it ran on a RISC-V machine?
The code is as follows:
#include <cstdio>

int main()
{
    int x = 1;
    char* p = (char*)&x;
    printf("%d\n", (int)*p);
    return 0;
}
The program is valid regardless of the platform.
The output is 1 for a little-endian computer or a computer where sizeof (int) == sizeof (char). It will be 0 for all other platforms.
Since RISC-V is little-endian, the output should be 1.

How can I copy 4 letter ascii word to buffer in C?

I am trying to copy the value 0x0FF0 to a buffer but am unable to do so.
Here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <math.h>
#include <time.h>
#include <linux/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

void print_bits(unsigned int x);

int main(int argc, char *argv[])
{
    char buffer[512];
    unsigned int init = 0x0FF0;
    unsigned int * som = &init;

    printf("print bits of som now: \n");
    print_bits(init);
    printf("\n");
    memset(&buffer[0], 0, sizeof(buffer)); // reinitialize the buffer
    memcpy(buffer, som, 4); // copy word to the buffer
    printf("print bits of buffer[0] now: \n");
    print_bits(buffer[0]);
    printf("\n");
    return 0;
}

void print_bits(unsigned int x)
{
    int i;
    for (i = 8 * sizeof(x) - 17; i >= 0; i--) {
        (x & (1 << i)) ? putchar('1') : putchar('0');
    }
    printf("\n");
}
this is the result I get in the console:
Why am I getting different values from the bit printing if I am using memcpy?
I don't know if it has something to do with big/little endian, but I am losing 4 bits of 1's here, which shouldn't happen with either method.
When you call
print_bits(buffer[0]);
you're taking just one byte out of the buffer, converting it to unsigned int, and passing that to the function. The other bytes in buffer are ignored.
You are mixing up types and relying on specific settings of your architecture/platform. This already breaks your existing code, and it may become even more harmful once you compile with different settings.
Your buffer is of type char[512], while your init is of type unsigned int.
First, it depends on the compiler settings whether char is signed or unsigned. This is actually relevant, since it influences how a char value is promoted to an unsigned int value. See the following code, which demonstrates the difference using explicitly signed and unsigned chars:
signed char c = 0xF0;
unsigned char uc = c;
unsigned int ui_from_c = c;
unsigned int ui_from_uc = uc;
printf("Signed char c:%hhd; Unsigned char uc:%hhu; ui_from_c:%u ui_from_uc:%u\n",
       c, uc, ui_from_c, ui_from_uc);
// output: Signed char c:-16; Unsigned char uc:240; ui_from_c:4294967280 ui_from_uc:240
Second, int may be represented by 4 or by 8 bytes (which can hold a "word"), yet char will typically be 1 byte and can therefore not hold a 16-bit "word".
Third, architectures can be big endian or little endian, and this influences where a constant like 0x0FF0, which requires 2 bytes, would actually be located in a 4 or 8 byte integral representation.
So buffer[0] certainly selects just one byte of what you think it holds; that byte may be promoted the wrong way to an unsigned int, and it might even lie completely outside the bytes that store the 0x0FF0 literal.
I'd suggest using fixed-width integer types representing exactly a word throughout:
#include <stdio.h>
#include <stdint.h>
#include <string.h> // for memset/memcpy

void print_bits(uint16_t x);

int main(int argc, char *argv[])
{
    uint16_t buffer[512];
    uint16_t init = 0x0FF0;
    uint16_t * som = &init;

    printf("print bits of som now: \n");
    print_bits(init);
    printf("\n");
    memset(buffer, 0, sizeof(buffer)); // reinitialize the buffer
    memcpy(buffer, som, sizeof(*som)); // copy word to the buffer
    printf("print bits of buffer[0] now: \n");
    print_bits(buffer[0]);
    printf("\n");
    return 0;
}

void print_bits(uint16_t x)
{
    int i;
    for (i = 8 * sizeof(x) - 1; i >= 0; i--) { // start at bit 15, not 16
        (x & (1 << i)) ? putchar('1') : putchar('0');
    }
    printf("\n");
}
You are not writing the bytes "0F F0" to the buffer. You are writing whatever bytes your platform uses internally to store the number 0x0FF0. There is no reason these need to be the same.
When you write 0x0FF0 in C, that means, roughly, "whatever my implementation uses to encode the number four thousand eighty". That might be the byte string 0F, F0. But it might not be.
After all, unsigned int init = 0x0FF0; and unsigned int init = 4080; do exactly the same thing on every platform. But surely not all platforms store the number 4,080 using the byte string "0F F0".
For example, I might store the number ten as "10" or "ten" or any number of other ways. It's unreasonable for you to expect "ten", "10", or any other particular byte sequence to appear in memory just because you stored the number ten unless you do happen to specifically know how your platform stores the number ten. Given that you asked this question, you don't know that.
Also, you are only printing the value of buffer[0], which is a single character. So it couldn't possibly hold any version of 0x0FF0.

Casting uint8_t array into uint16_t value in C

I'm trying to convert a 2-byte array into a single 16-bit value. For some reason, when I cast the array as a 16-bit pointer and then dereference it, the byte ordering of the value gets swapped.
For example,
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t a[2] = {0x15, 0xaa};
    uint16_t b = *(uint16_t*)a;

    printf("%x\n", (unsigned int)b);
    return 0;
}
prints aa15 instead of 15aa (which is what I would expect).
What's the reason behind this, and is there an easy fix?
I'm aware that I can do something like uint16_t b = a[0] << 8 | a[1]; (which does work just fine), but I feel like this problem should be easily solvable with casting and I'm not sure what's causing the issue here.
As mentioned in the comments, this is due to endianness.
Your machine is little-endian, which (among other things) means that multi-byte integer values have the least significant byte first.
If you compiled and ran this code on a big-endian machine (ex. a Sun), you would get the result you expect.
Since your array is set up as big-endian, which also happens to be network byte order, you could get around this by using ntohs and htons. These functions convert a 16-bit value from network byte order (big endian) to the host's byte order and vice versa:
uint16_t b = ntohs(*(uint16_t*)a);
There are similar functions called ntohl and htonl that work on 32-bit values.
This is because of the endianess of your machine.
In order to make your code independent of the machine consider the following function:
#define LITTLE_ENDIAN 0
#define BIG_ENDIAN 1

int endian() {
    int i = 1;
    char *p = (char *)&i;

    if (p[0] == 1)
        return LITTLE_ENDIAN;
    else
        return BIG_ENDIAN;
}
So for each case you can choose which operation to apply.
You cannot do anything like *(uint16_t*)a because of the strict aliasing rule. Even if code appears to work for now, it may break later in a different compiler version.
A correct version of the code could be:
b = ((uint16_t)a[0] << CHAR_BIT) + a[1];
The version suggested in your question involving a[0] << 8 is incorrect because on a system with 16-bit int, this may cause signed integer overflow: a[0] promotes to int, and << 8 means * 256.
This might help to visualize things. When you create the array you have two bytes in order. When you print it you get the human readable hex value which is the opposite of the little endian way it was stored. The value 1 in little endian as a uint16_t type is stored as follows where a0 is a lower address than a1...
a0 a1
|10000000|00000000
Note that the least significant byte is stored first, but when we print the value in hex, the least significant byte appears on the right, which is what we normally expect on any machine.
This program prints a little endian and big endian 1 in binary starting from least significant byte...
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <arpa/inet.h>
void print_bin(uint64_t num, size_t bytes) {
    int i = 0;
    for (i = bytes * 8; i > 0; i--) {
        (i % 8 == 0) ? printf("|") : 1;
        (num & 1) ? printf("1") : printf("0");
        num >>= 1;
    }
    printf("\n");
}

int main(void) {
    uint8_t a[2] = {0x15, 0xaa};
    uint16_t b = *(uint16_t*)a;
    uint16_t le = 1;
    uint16_t be = htons(le);

    printf("Little Endian 1\n");
    print_bin(le, 2);
    printf("Big Endian 1 on little endian machine\n");
    print_bin(be, 2);
    printf("0xaa15 as little endian\n");
    print_bin(b, 2);
    return 0;
}
This is the output (least significant byte first):
Little Endian 1
|10000000|00000000
Big Endian 1 on little endian machine
|00000000|10000000
0xaa15 as little endian
|10101000|01010101

C program to check little vs. big endian [duplicate]

Possible Duplicate:
C Macro definition to determine big endian or little endian machine?
#include <stdio.h>

int main()
{
    int x = 1;
    char *y = (char*)&x;
    printf("%c\n", *y + 48);
}
If it's little endian it will print 1. If it's big endian it will print 0. Is that correct? Or will a char* set to the address of int x always point to the least significant byte, regardless of endianness?
In short, yes.
Suppose we are on a 32-bit machine.
If it is little endian, the x in the memory will be something like:
higher memory
----->
+----+----+----+----+
|0x01|0x00|0x00|0x00|
+----+----+----+----+
  ^
  |
  &x
so *(char*)(&x) == 1, and *y+48 == '1'. (48 is the ASCII code of '0')
If it is big endian, it will be:
+----+----+----+----+
|0x00|0x00|0x00|0x01|
+----+----+----+----+
  ^
  |
  &x
so this one will be '0'.
The following will do.
unsigned int x = 1;
printf ("%d", (int) (((char *)&x)[0]));
Casting &x to char * lets you access the individual bytes of the integer, and the ordering of those bytes depends on the endianness of the system.
This is big endian test from a configure script:
#include <inttypes.h>

int main(int argc, char ** argv){
    volatile uint32_t i = 0x01234567;
    // return 0 for big endian, 1 for little endian.
    return (*((uint8_t*)(&i))) == 0x67;
}
I thought I had read about this in the standard, but I can't find it, so I'll keep looking. This is an old answer that addresses the question's heading rather than its exact text:
The following program would determine that:
#include <stdio.h>
#include <stdint.h>

int is_big_endian(void)
{
    union {
        uint32_t i;
        char c[4];
    } e = { 0x01000000 };

    return e.c[0];
}

int main(void)
{
    printf("System is %s-endian.\n",
           is_big_endian() ? "big" : "little");
    return 0;
}
You also have this approach; from Quake II:
byte swaptest[2] = {1,0};
if ( *(short *)swaptest == 1) {
    bigendien = false;
And !is_big_endian() does not guarantee little-endian with 100% certainty, as the machine could also be mixed/middle-endian.
I believe this can be checked with the same approach, only changing the value from 0x01000000 to e.g. 0x01020304, giving:
switch (e.c[0]) {
case 0x01: /* big */
case 0x02: /* mixed */
default:   /* little */
}
But I'm not entirely sure about that one ...

C memcpy issues with unsigned char array

I have a question about memcpy that I hope someone can answer. Here's a short demonstrative program:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main (int argc, char **argv){
    unsigned char buffer[10];
    unsigned short checksum = 0x1234;
    int i;

    memset(buffer, 0x00, 10);
    memcpy(buffer, (const unsigned char*)&checksum, 2);
    for(i = 0; i < 10; i++){
        printf("%02x", buffer[i]);
    }
    printf("\n");
    return 0;
}
When I run this program, I get 34120000000000000000.
My question is why don't I get 12340000000000000000?
Thanks so much
You are getting 34120000000000000000 because you are on a little-endian system. You would get 12340000000000000000 on a big-endian system. The Wikipedia article on endianness gives a full discussion of big-endian vs. little-endian systems.
Little-endian vs. big-endian architecture? That would mean the 2 bytes of the checksum are swapped.
This is just a guess; if my answer is wrong, comment and I will delete it.
Intel's CPUs are little-endian; they store numbers least significant byte first.
This is apparently evidence that Intel don't do inhouse drug testing.
