Is there a safe, portable way to determine (during compile time) the endianness of the platform that my program is being compiled on? I'm writing in C.
[EDIT]
Thanks for the answers, I decided to stick with the runtime solution!
To answer the original question of a compile-time check, there's no standardized way to do it that will work across all existing and all future compilers, because none of the existing C, C++, and POSIX standards define macros for detecting endianness.
But, if you're willing to limit yourself to some known set of compilers, you can look up each of those compilers' documentation to find out which predefined macros (if any) they use to indicate endianness. This page lists several macros you can look for, so here's some code which would work for those:
#if defined(__BYTE_ORDER) && __BYTE_ORDER == __BIG_ENDIAN || \
defined(__BIG_ENDIAN__) || \
defined(__ARMEB__) || \
defined(__THUMBEB__) || \
defined(__AARCH64EB__) || \
defined(_MIPSEB) || defined(__MIPSEB) || defined(__MIPSEB__)
// It's a big-endian target architecture
#elif defined(__BYTE_ORDER) && __BYTE_ORDER == __LITTLE_ENDIAN || \
defined(__LITTLE_ENDIAN__) || \
defined(__ARMEL__) || \
defined(__THUMBEL__) || \
defined(__AARCH64EL__) || \
defined(_MIPSEL) || defined(__MIPSEL) || defined(__MIPSEL__)
// It's a little-endian target architecture
#else
#error "I don't know what architecture this is!"
#endif
If you can't find what predefined macros your compiler uses from its documentation, you can also try coercing it to spit out its full list of predefined macros and guess from there what will work (look for anything with ENDIAN, ORDER, or the processor architecture name in it). This page lists a number of methods for doing that in different compilers:
Compiler                     C macros                          C++ macros
Clang/LLVM                   clang -dM -E -x c /dev/null       clang++ -dM -E -x c++ /dev/null
GNU GCC/G++                  gcc -dM -E -x c /dev/null         g++ -dM -E -x c++ /dev/null
Hewlett-Packard C/aC++       cc -dM -E -x c /dev/null          aCC -dM -E -x c++ /dev/null
IBM XL C/C++                 xlc -qshowmacros -E /dev/null     xlc++ -qshowmacros -E /dev/null
Intel ICC/ICPC               icc -dM -E -x c /dev/null         icpc -dM -E -x c++ /dev/null
Microsoft Visual Studio      (none)                            (none)
Oracle Solaris Studio        cc -xdumpmacros -E /dev/null      CC -xdumpmacros -E /dev/null
Portland Group PGCC/PGCPP    pgcc -dM -E                       (none)
Finally, to round it out, the Microsoft Visual C/C++ compilers are the odd ones out and don't have any of the above. Fortunately, they have documented their predefined macros here, and you can use the target processor architecture to infer the endianness. While all of the currently supported processors in Windows are little-endian (_M_IX86, _M_X64, _M_IA64, and _M_ARM are little-endian), some historically supported processors like the PowerPC (_M_PPC) were big-endian. But more relevantly, the Xbox 360 is a big-endian PowerPC machine, so if you're writing a cross-platform library header, it can't hurt to check for _M_PPC.
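For illustration, here is a minimal sketch of that inference (TARGET_BIG_ENDIAN is a made-up name for this example):
#if defined(_MSC_VER)
#  if defined(_M_PPC)                /* e.g. Xbox 360: big-endian PowerPC */
#    define TARGET_BIG_ENDIAN 1
#  elif defined(_M_IX86) || defined(_M_X64) || defined(_M_IA64) || defined(_M_ARM)
#    define TARGET_BIG_ENDIAN 0      /* all current Windows targets are little-endian */
#  endif
#endif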
This is for compile-time checking
You could use information from the boost header file endian.hpp, which covers many platforms.
Edit: for runtime checking
bool isLittleEndian()
{
short int number = 0x1;
char *numPtr = (char*)&number;
return (numPtr[0] == 1);
}
Create an integer, and read its first byte (least significant byte). If that byte is 1, then the system is little endian, otherwise it's big endian.
Edit: Thinking about it
Yes, you could run into a potential issue on some platforms (I can't think of any) where sizeof(char) == sizeof(short int). You could use the fixed-width multi-byte integer types available in <stdint.h>, or if your platform doesn't have them, again you could adapt a boost header for your use: stdint.hpp
With C99, you can perform the check as:
#define I_AM_LITTLE (((union { unsigned x; unsigned char c; }){1}).c)
Conditionals like if (I_AM_LITTLE) will be evaluated at compile-time and allow the compiler to optimize out whole blocks.
I don't have the reference right off for whether this is strictly speaking a constant expression in C99 (which would allow it to be used in initializers for static-storage-duration data), but if not, it's the next best thing.
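For instance, here is a hedged sketch of how such a conditional can drive a byte-swap helper (to_le32 is a made-up name); any decent optimizer folds I_AM_LITTLE to a constant and eliminates the dead branch:
#include <stdint.h>

/* Hypothetical helper: convert a host-order value to little-endian. */
static uint32_t to_le32(uint32_t x)
{
    if (I_AM_LITTLE)
        return x;                                /* already little-endian */
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)  /* otherwise byte-swap   */
         | ((x & 0x0000FF00u) << 8) | (x << 24);
}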
Interesting read from the C FAQ:
You probably can't. The usual techniques for detecting endianness
involve pointers or arrays of char, or maybe unions, but preprocessor
arithmetic uses only long integers, and there is no concept of
addressing. Another tempting possibility is something like
#if 'ABCD' == 0x41424344
but this isn't reliable, either.
I would like to extend the answers by providing a constexpr function for C++
union Mix {
int sdat;
char cdat[4];
};
static constexpr Mix mix { 0x1 };
constexpr bool isLittleEndian() {
return mix.cdat[0] == 1;
}
Since mix is constexpr too, it is evaluated at compile time and can be used in constexpr bool isLittleEndian(). It should be safe to use.
Update
As #Cheersandhth pointed out below, this seems to be problematic.
The reason is that it does not conform to the C++11 standard, which forbids this kind of type punning: only one union member can be active at a time. With a standard-conforming compiler you will get an error.
So, don't use it in C++. It seems you can do it in C, though. I leave my answer in for educational purposes :-) and because the question is about C...
Update 2
This assumes that int is the size of 4 chars, which is not always the case, as #PetrVepřek correctly pointed out below. To make your code truly portable you have to be more clever here, though this should suffice for many cases. Note that sizeof(char) is always 1, by definition; the code above assumes sizeof(int) == 4.
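For reference, a minimal C sketch (not C++) that drops the fixed-size assumption by sizing the union's array with sizeof(int); the function name is made up:
#include <stdbool.h>

/* C only: reading a non-active union member is allowed in C.
   Works for any sizeof(int), unlike the 4-char assumption above. */
static bool isLittleEndianC(void)
{
    const union { int i; char c[sizeof(int)]; } mix = { 1 };
    return mix.c[0] == 1;
}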
Use CMake TestBigEndian as
INCLUDE(TestBigEndian)
TEST_BIG_ENDIAN(ENDIAN)
IF (ENDIAN)
# big endian
ELSE (ENDIAN)
# little endian
ENDIF (ENDIAN)
Not during compile time, but perhaps during runtime. Here's a C function I wrote to determine endianness:
/* Returns 1 if LITTLE-ENDIAN or 0 if BIG-ENDIAN */
#include <inttypes.h>
int endianness()
{
union { uint8_t c[4]; uint32_t i; } data;
data.i = 0x12345678;
return (data.c[0] == 0x78);
}
From Finally, one-line endianness detection in the C preprocessor:
#include <stdint.h>
#define IS_BIG_ENDIAN (*(uint16_t *)"\0\xff" < 0x100)
Any decent optimizer will resolve this at compile time. gcc does so at -O1.
Of course stdint.h is C99. For ANSI/C89 portability see Doug Gwyn's Instant C9x library.
I took it from the rapidjson library:
#define BYTEORDER_LITTLE_ENDIAN 0 // Little endian machine.
#define BYTEORDER_BIG_ENDIAN 1 // Big endian machine.
//#define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
#ifndef BYTEORDER_ENDIAN
// Detect with GCC 4.6's macro.
# if defined(__BYTE_ORDER__)
# if (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
# define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
# elif (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
# define BYTEORDER_ENDIAN BYTEORDER_BIG_ENDIAN
# else
# error "Unknown machine byteorder endianness detected. User needs to define BYTEORDER_ENDIAN."
# endif
// Detect with GLIBC's endian.h.
# elif defined(__GLIBC__)
# include <endian.h>
# if (__BYTE_ORDER == __LITTLE_ENDIAN)
# define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
# elif (__BYTE_ORDER == __BIG_ENDIAN)
# define BYTEORDER_ENDIAN BYTEORDER_BIG_ENDIAN
# else
# error "Unknown machine byteorder endianness detected. User needs to define BYTEORDER_ENDIAN."
# endif
// Detect with _LITTLE_ENDIAN and _BIG_ENDIAN macro.
# elif defined(_LITTLE_ENDIAN) && !defined(_BIG_ENDIAN)
# define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
# elif defined(_BIG_ENDIAN) && !defined(_LITTLE_ENDIAN)
# define BYTEORDER_ENDIAN BYTEORDER_BIG_ENDIAN
// Detect with architecture macros.
# elif defined(__sparc) || defined(__sparc__) || defined(_POWER) || defined(__powerpc__) || defined(__ppc__) || defined(__hpux) || defined(__hppa) || defined(_MIPSEB) || defined(__s390__)
# define BYTEORDER_ENDIAN BYTEORDER_BIG_ENDIAN
# elif defined(__i386__) || defined(__alpha__) || defined(__ia64) || defined(__ia64__) || defined(_M_IX86) || defined(_M_IA64) || defined(_M_ALPHA) || defined(__amd64) || defined(__amd64__) || defined(_M_AMD64) || defined(__x86_64) || defined(__x86_64__) || defined(_M_X64) || defined(__bfin__)
# define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
# elif defined(_MSC_VER) && (defined(_M_ARM) || defined(_M_ARM64))
# define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
# else
# error "Unknown machine byteorder endianness detected. User needs to define BYTEORDER_ENDIAN."
# endif
#endif
I once used a construct like this one:
uint16_t HI_BYTE = 0,
         LO_BYTE = 1;
uint16_t s = 1;

if (*(uint8_t *)&s == 1) {
    HI_BYTE = 1;
    LO_BYTE = 0;
}

/* pByte: a uint8_t pointer into a 2-byte buffer, declared elsewhere */
pByte[HI_BYTE] = 0x10;
pByte[LO_BYTE] = 0x20;
gcc with -O2 was able to resolve it completely at compile time. That means the HI_BYTE and LO_BYTE variables were replaced entirely, and even the pByte access was replaced in the assembly by the equivalent of *(uint16_t *)pByte = 0x1020;.
It's as compile time as it gets.
To my knowledge no, not during compile time.
At run-time, you can do trivial checks such as setting a multi-byte value to a known bit string and inspecting what bytes that results in. For instance, using a union,
typedef union {
uint32_t word;
uint8_t bytes[4];
} byte_check;
or casting,
uint32_t word;
uint8_t *bytes = (uint8_t *)&word;
Please note that for completely portable endianness checks, you need to take into account big-endian, little-endian, and mixed-endian systems.
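A rough runtime classifier covering those three cases might look like this (a sketch; mixed-endian detection here only means "neither pure order matched"):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const union { uint32_t word; uint8_t bytes[4]; } probe = { 0x01020304u };

    if (probe.bytes[0] == 0x04 && probe.bytes[3] == 0x01)
        puts("little-endian");
    else if (probe.bytes[0] == 0x01 && probe.bytes[3] == 0x04)
        puts("big-endian");
    else
        puts("mixed-endian");   /* e.g. PDP-11-style byte orders */
    return 0;
}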
For my part, I decided to use an intermediate approach: try the macros, and if they don't exist, or if we can't find them, then do it at runtime. Here is one that works with the GNU compiler:
#define II 0x4949 // arbitrary values != 1; examples are
#define MM 0x4D4D // taken from the TIFF standard
int
#if defined __BYTE_ORDER__ && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  const host_endian = II;
#elif defined __BYTE_ORDER__ && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  const host_endian = MM;
#else
# define _no_BYTE_ORDER
  host_endian = 1; // plain "int", not "int const" !
#endif
and then, in the actual code:
int main(int argc, char **argv) {
#ifdef _no_BYTE_ORDER
host_endian = * (char *) &host_endian ? II : MM;
#undef _no_BYTE_ORDER
#endif
// .... your code here, for instance:
printf("Endedness: %s\n", host_endian == II ? "little-endian"
: "big-endian");
return 0;
}
On the other hand, as the original poster noted, the overhead of a runtime check is so small (two lines of code and microseconds of time) that it's hardly worth the bother to try to do it in the preprocessor.
Related
I'm designing a custom network protocol and I need to send a uint64_t variable (representing a file's length in bytes) through a socket in a portable and POSIX-compliant manner.
Unfortunately, the manual says that integer types with width 64 are not guaranteed to exist:
If an implementation provides integer types with width 64 that meet these requirements, then the following types are required: int64_t uint64_t
What's more, there is no POSIX-compliant equivalent of htonl, htons, ntohl, ntohs (note that bswap_64 is not POSIX-compliant).
What is the best practice to send 64-bit variable through socket?
You can just apply htonl() twice, of course:
const uint64_t x = ...
const uint32_t upper_be = htonl(x >> 32);
const uint32_t lower_be = htonl((uint32_t) x);
This will give you two 32-bit variables containing big-endian versions of the upper and lower 32-bit halves of the 64-bit variable x.
If you are strict POSIX, you can't use uint64_t since it's not guaranteed to exist. Then you can do something like:
typedef struct {
uint32_t upper;
uint32_t lower;
} my_uint64;
And just htonl() those directly, of course.
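Putting the two halves on the wire then just means copying them into a buffer in order; a sketch (pack_u64_be is a made-up name):
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helper: serialize x into an 8-byte big-endian buffer
   using two htonl() calls, upper half first. */
void pack_u64_be(unsigned char buf[8], uint64_t x)
{
    const uint32_t upper_be = htonl((uint32_t)(x >> 32));
    const uint32_t lower_be = htonl((uint32_t)x);
    memcpy(buf,     &upper_be, 4);
    memcpy(buf + 4, &lower_be, 4);
}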
My personal favorite is a macro... mine looks similar to this and checks for local byte ordering before deciding how to handle the byte ordering:
// clang-format off
#if !defined(__BIG_ENDIAN__) && !defined(__LITTLE_ENDIAN__)
# if defined(__has_include)
# if __has_include(<endian.h>)
# include <endian.h>
# elif __has_include(<sys/endian.h>)
# include <sys/endian.h>
# endif
# endif
# if !defined(__LITTLE_ENDIAN__) && \
(defined(__BIG_ENDIAN__) || __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
# define __BIG_ENDIAN__
# define bswap64(i) (i) // do nothing
# else
# define __LITTLE_ENDIAN__
# define bswap64(i) ((((i)&0xFFULL) << 56) | (((i)&0xFF00ULL) << 40) | \
(((i)&0xFF0000ULL) << 24) | (((i)&0xFF000000ULL) << 8) | \
(((i)&0xFF00000000ULL) >> 8) | (((i)&0xFF0000000000ULL) >> 24) | \
(((i)&0xFF000000000000ULL) >> 40) | \
(((i)&0xFF00000000000000ULL) >> 56))
# endif
#endif
Assuming a POSIX platform with C99 or greater, {u,}int64_t are not required to exist but {u,}int_{least,fast}64_t are.
Additionally, POSIX requires {u,}int{8,16,32}_t.
So what you can do is:
#include <stdint.h>
//host-to-network (native endian to big endian)
void hton64(unsigned char *B, uint_least64_t X)
{
B[0]=X>>56&0xFF;
B[1]=X>>48&0xFF;
B[2]=X>>40&0xFF;
B[3]=X>>32&0xFF;
B[4]=X>>24&0xFF;
B[5]=X>>16&0xFF;
B[6]=X>>8&0xFF;
B[7]=X>>0&0xFF;
}
//network-to-host (big endian to native endian)
uint_least64_t ntoh64(unsigned char const *B)
{
return (uint_least64_t)B[0]<<56|
(uint_least64_t)B[1]<<48|
(uint_least64_t)B[2]<<40|
(uint_least64_t)B[3]<<32|
(uint_least64_t)B[4]<<24|
(uint_least64_t)B[5]<<16|
(uint_least64_t)B[6]<<8|
(uint_least64_t)B[7]<<0;
}
If the machine has uint64_t, then uint_least64_t will be (due to requirements imposed by the C standard) identical to uint64_t.
If it doesn't, then uint_least64_t might not be 2's-complement or it might have more value bits (I have no idea if there are such architectures), but regardless, the above routines will send or receive exactly the 64 low-order bits of it (to or from a buffer).
(Anyway, this solution should be good as a generic backend, but if you want to be slightly more optimal, you can try to detect your endianness first and do nothing if it's a big-endian platform; if it's little-endian and sizeof(uint_least64_t)*CHAR_BIT == 64, and you can detect that you have byteswap.h with bswap_64, then you should use that, as it's likely to compile down to a single instruction. If all else fails, I'd use something like the above.)
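A quick round-trip check of the routines above (just a usage sketch; it assumes hton64 and ntoh64 are in scope):
#include <assert.h>

int main(void)
{
    unsigned char buf[8];
    hton64(buf, 0x1122334455667788ull);            /* encode to big-endian */
    assert(ntoh64(buf) == 0x1122334455667788ull);  /* decode matches       */
    return 0;
}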
I wrote a dovecot plugin in C over the last few days. My source code itself seems to be quite fine, but I'm currently wondering how to compile it, or rather how to have a more dynamic Makefile.
The problem is that whenever I try to compile, I get the error Error: unknown type name: »uoff_t«
The problem is that this type is defined in one referenced library in this way:
#if defined (HAVE_UOFF_T)
/* native support */
#elif defined (UOFF_T_INT)
typedef unsigned int uoff_t;
#elif defined (UOFF_T_LONG)
typedef unsigned long uoff_t;
#elif defined (UOFF_T_LONG_LONG)
typedef unsigned long long uoff_t;
#else
# error uoff_t size not set
#endif
Within dovecot's Autoconf these variables are set based on another type:
AC_CHECK_TYPE(uoff_t, [
have_uoff_t=yes
AC_DEFINE(HAVE_UOFF_T,, Define if you have a native uoff_t type)
], [
have_uoff_t=no
])
AC_TYPEOF(off_t, long int long-long)
case "$typeof_off_t" in
int)
offt_max=INT_MAX
uofft_fmt="u"
if test "$have_uoff_t" != "yes"; then
AC_DEFINE(UOFF_T_INT,, Define if off_t is int)
fi
offt_bits=`expr 8 \* $ac_cv_sizeof_int`
;;
long)
offt_max=LONG_MAX
uofft_fmt="lu"
if test "$have_uoff_t" != "yes"; then
AC_DEFINE(UOFF_T_LONG,, Define if off_t is long)
fi
offt_bits=`expr 8 \* $ac_cv_sizeof_long`
;;
"long long")
offt_max=LLONG_MAX
uofft_fmt="llu"
if test "$have_uoff_t" != "yes"; then
AC_DEFINE(UOFF_T_LONG_LONG,, Define if off_t is long long)
fi
offt_bits=`expr 8 \* $ac_cv_sizeof_long_long`
;;
*)
AC_MSG_ERROR([Unsupported off_t type])
;;
esac
So after all my question is, whether I can have this stuff in an equivalent way in my Makefile without using Automake.
My goal is to check, whether uoff_t is defined already somewhere (for HAVE_UOFF_T) or how the type off_t is defined (for the other parameters).
Any ideas, or am I missing something?
Thanks in advance!
Obviously what I tried to do seems not to be possible.
I ended up digging myself into autoconf and reusing dovecot's generic definition.
Thanks anyway!
Cheers
My first idea is that if it is a Dovecot plugin (which I know nothing about) doesn't it have to include Dovecot include files?
Wouldn't those include files have already been through the Dovecot autoconf and have all the right values for the uoff_t type?
So the first thing that I would try is to just rely on the Dovecot definition.
I suppose the other thing to do is something that I have done on occasion. Reproduce the autoconf tests inside your Makefile. But I have to warn you it tends to look really ugly. Sort of like this:
TESTOPTTMP:=$(shell mkdir -p tmp; mktemp tmp/test_opt_XXXXXXXXXX)
CFLAGS += $(shell (echo '\#include <pthread.h>'; echo '__thread int global; int main() { global = 1; return 0; }') | gcc -x c -D_GNU_SOURCE -pthread -o $(TESTOPTTMP) - >>/dev/stderr 2>&1; $(TESTOPTTMP) && echo '-DHAVE_TLS'; rm -f $(TESTOPTTMP))
That was for testing compiler support for __thread. Ugly, isn't it.
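Applied to the uoff_t problem, the same idea could use a tiny C probe that the Makefile compiles and runs; a sketch (probe_offt.c is a made-up file that mirrors the AC_TYPEOF(off_t) logic above):
/* probe_offt.c: print which UOFF_T_* define matches off_t's size,
   so a Makefile rule can append it to CFLAGS. */
#include <stdio.h>
#include <sys/types.h>

int main(void)
{
    if (sizeof(off_t) == sizeof(int))
        puts("-DUOFF_T_INT");
    else if (sizeof(off_t) == sizeof(long))
        puts("-DUOFF_T_LONG");
    else
        puts("-DUOFF_T_LONG_LONG");
    return 0;
}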
For example, does MIN_N_THINGIES below compile to 2? Or will the division be recomputed every time I use the macro in code (e.g., recomputing the end condition of a for loop on each iteration)?
#define MAX_N_THINGIES (10)
#define MIN_N_THINGIES ((MAX_N_THINGIES) / 5)
uint8_t i;
for (i = 0; i < MIN_N_THINGIES; i++) {
printf("hi");
}
This question stems from the fact that I'm still learning about the build process. Thanks!
If you pass -E to gcc, it will show the output of the preprocessing stage.
gcc -E test.c | tail -n11
Outputs:
# 3 "test.c" 2
int main() {
uint8_t i;
for (i = 0; i < ((10) / 5); i++) {
printf("hi");
}
return 0;
}
Then if you pass the -S flag to gcc, you will see that the division was optimized out. If you also pass the -o flag you can set the output files and diff them to see that they generated the same code.
gcc -S test.c -o test-with-div.s
edit test.c to make MIN_N_THINGIES equal a const 2
gcc -S test.c -o test-constant.s
diff test-with-div.s test-constant.s
// for educational purposes you should look at the .s files generated.
Then as mentioned in another comment you can change the optimization flag by using -O...
gcc -S test.c -O2 -o test-unroll-loop.s
will unroll the for loop such that there isn't even a loop.
The preprocessor will replace MIN_N_THINGIES with ((10)/5); then it is up to the compiler to optimize (or not) the expression.
Maybe. The standard does not mandate that it is or is not optimized. On most compilers it will be once you pass optimization flags (for example, gcc with -O0 does not do it, while with -O2 it even unrolls the loop).
Modern compilers perform even much more complicated techniques (vectorization, loop skewing, blocking ...). However, unless you really care about performance, e.g. you program HPC or real-time systems, you probably should not care about the output of the compiler - unless you're just interested (and yes - compilers can be a fascinating subject).
No. The preprocessor does not calculate macros; they're handled by the compiler. The preprocessor can calculate arithmetic expressions (no floating-point values) in #if conditionals, though.
Macros are simply text substitutions.
Note that the expanded macros can still be calculated and optimized by the compiler, it's just that it's not done by the preprocessor.
The standard mandates that some expressions are evaluated at compile time. But note that the preprocessor does just text splicing (well, almost) when the macro is called, so if you do:
#define A(x) ((x) / (S))
#define S 5
A(10) /* Gives ((10) / (5)) == 2 */
#undef S
#define S 2
A(20) /* Gives ((20) / (2)) == 10 */
The parentheses are to avoid idiocies like:
#define square(x) x * x
square(a + b) /* Gets you a + b * a + b, not the expected square */
After preprocessing, the result is passed to the compiler proper, which does (most of) the computation in the source that the standard requests. Most compilers will do a lot of constant folding, i.e., computing (sub)expressions made of known constants, as this is simple to do.
To see the expansions, it is useful to write a *.c file of a few lines, just with the macros to check, and run it just through the preprocessor (typically something like cc -E file.c) and check the output.
Here's my code:
#include <stdio.h>
#include <CL/cl.h>
#include <CL/cl_platform.h>
int main(){
cl_float3 f3 = (cl_float3){1, 1, 1};
cl_float3 f31 = (cl_float3) {2, 2, 2};
cl_float3 f32 = (cl_float3) {2, 2, 2};
f3 = f31 + f32;
printf("%g %g %g \n", f3.x, f3.y, f3.z);
return 0;
}
When compiling with gcc 4.6, it produces the error
test.c:14:11: error: invalid operands to binary + (have ‘cl_float3’ and ‘cl_float3’)
Very strange to me, because the OpenCL Specification demonstrates in section 6.4 just that, an addition of two floatn values. Do I need to include any other headers?
But even more strange is that when compiling with -std=c99 I get errors like
test.c:16:26: error: ‘cl_float3’ has no member named ‘x’
..for all components (x, y and z)...
The reason for the compilation problem with structure subscripts can be seen in the implementation of the standard in the AMD SDK.
If you look at the <CL/cl_platform.h> header from AMD toolkit you could see how the structures are defined.
typedef cl_float4 cl_float3;
typedef union
{
cl_float CL_ALIGNED(16) s[4];
#if (defined( __GNUC__) || defined( __IBMC__ )) && ! defined( __STRICT_ANSI__ )
__extension__ struct{ cl_float x, y, z, w; };
....
#endif
}cl_float4;
The #if clause is ignored when gcc is invoked with --std=c99.
To make your code work with --std=c99 you could replace references to f3.x with f3.s[0] and so on.
OpenCL programs consist of two parts.
A program which runs on the host. This is normally written in C or C++, but it's nothing special except that it uses the API described in sections 4 & 5 of the OpenCL Specification.
A kernel which runs on the OpenCL device (normally a GPU). This is written in the language specified in section 6. This isn't C, but it's close. It adds things like vector operations (like you're trying to use). This is compiled by the host program passing a string which contains the kernel code to OpenCL via the API.
You've confused the two, and tried to use features of the kernel language in the host code.
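On the host side, a portable way to express the addition is to operate on the .s components explicitly; a sketch under the same headers:
#include <stdio.h>
#include <CL/cl_platform.h>

int main(void)
{
    cl_float4 a = {{1, 1, 1, 0}}, b = {{2, 2, 2, 0}}, c;

    /* Component-wise add through the portable .s array. */
    for (int i = 0; i < 4; ++i)
        c.s[i] = a.s[i] + b.s[i];

    printf("%g %g %g\n", c.s[0], c.s[1], c.s[2]);
    return 0;
}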
The cl_float4 member .v4 is another option:
#include <assert.h>
#include <stdlib.h> /* EXIT_SUCCESS */
#include <CL/cl.h>
int main(void) {
    cl_float4 f = {{1, 2, 3, 4}};
    cl_float4 g = {{5, 6, 7, 8}};
    cl_float4 h;
    h.v4 = f.v4 + g.v4; /* GCC vector extension under the hood */
    assert(h.s[0] == 6);
    assert(h.s[1] == 8);
    return EXIT_SUCCESS;
}
which can be run as:
gcc -std=c89 -Wall -Wextra tmp.c -lOpenCL && ./a.out
in Ubuntu 16.10, gcc 6.2.0.
v is defined in Linux GCC x86 via GCC vector extensions.
The file https://github.com/KhronosGroup/OpenCL-Headers/blob/bf0f43b76f4556c3d5717f8ba8a01216b27f4af7/cl_platform.h contains:
#if defined( __SSE__ )
[...]
#if defined( __GNUC__ )
typedef float __cl_float4 __attribute__((vector_size(16)));
[...]
#define __CL_FLOAT4__ 1
and then:
typedef union
{
[...]
#if defined( __CL_FLOAT4__)
__cl_float4 v4;
#endif
}cl_float4;
Not sure this barrage of ifdefs was a good move by Khronos, but it is what we have.
I recommend that you just always use .s[0], which is the most portable option. We shouldn't need to speed up the host SIMD if we are focusing on the GPU...
C11 anonymous structs
The error error: ‘cl_float3’ has no member named ‘x’ happens because of the lines mentioned at: https://stackoverflow.com/a/10981639/895245
More precisely, this feature is called "anonymous struct", and it is an extension that was standardized in C11.
So in theory it should also work with -std=c11, but it currently doesn't because the CL headers weren't updated to check for C11, see also: https://github.com/KhronosGroup/OpenCL-Headers/issues/18
I'm currently trying to create C source code which properly handles I/O whatever the endianness of the target system.
I've selected "little endian" as my I/O convention, which means that, for big-endian CPUs, I need to convert data while writing or reading.
Conversion is not the issue. The problem I face is detecting endianness, preferably at compile time (since CPUs do not change endianness in the middle of execution...).
Up to now, I've been using this :
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
...
#else
...
#endif
It's documented as a GCC pre-defined macro, and Visual seems to understand it too.
However, I've received reports that the check fails for some big-endian systems (PowerPC).
So, I'm looking for a foolproof solution which ensures that endianness is correctly detected, whatever the compiler and the target system. Well, most of them at least...
[Edit] : Most of the solutions proposed rely on "run-time tests". These tests may sometimes be properly evaluated by compilers during compilation, and therefore cost no real runtime performance.
However, branching with some kind of if (0) { ... } else { ... } is not enough. In the current code implementation, variable and function declarations depend on big-endian detection. These cannot be changed with an if statement.
Well, obviously, there is fall back plan, which is to rewrite the code...
I would prefer to avoid that, but, well, it looks like a diminishing hope...
[Edit 2] : I have tested "run-time tests", by deeply modifying the code. Although they do their job correctly, these tests also impact performance.
I was expecting that, since the tests have predictable output, the compiler could eliminate the bad branches. But unfortunately, it doesn't work all the time. MSVC is a good compiler and is successful in eliminating bad branches, but GCC has mixed results, depending on version and kind of test, and with greater impact on 64-bit than on 32-bit builds.
It's strange. And it also means that the run-time tests cannot be ensured to be dealt with by the compiler.
Edit 3: These days, I'm using a compile-time constant union, expecting the compiler to resolve it to a clear yes/no signal.
And it works pretty well:
https://godbolt.org/g/DAafKo
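The shape of that construct is roughly this (a sketch with made-up names; the godbolt link above has the exact version):
#include <stdint.h>

/* A static constant union the optimizer folds to 0 or 1. */
static const union { uint32_t u; uint8_t c[4]; } one = { 1 };
#define CPU_IS_LITTLE_ENDIAN (one.c[0])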
As stated earlier, the only "real" way to detect Big Endian is to use runtime tests.
However, sometimes, a macro might be preferred.
Unfortunately, I've not found a single "test" to detect this situation, rather a collection of them.
For example, GCC recommends __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__. However, this only works with recent versions, and earlier versions (and other compilers) will evaluate this test as true, because undefined macros expand to 0 in #if expressions, so the comparison becomes 0 == 0. Hence you need the more complete version: defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
OK, now this works for newest GCC, but what about other compilers ?
You may try __BIG_ENDIAN__ or __BIG_ENDIAN or _BIG_ENDIAN which are often defined on big endian compilers.
This will improve detection. But if you specifically target PowerPC platforms, you can add a few more tests to improve detection even more. Try _ARCH_PPC or __PPC__ or __PPC or PPC or __powerpc__ or __powerpc or even powerpc. Bind all these defines together (as sketched below), and you have a pretty fair chance of detecting big-endian systems, and PowerPC in particular, whatever the compiler and its version.
So, to summarize, there is no "standard pre-defined macro" which is guaranteed to detect big-endian CPUs on all platforms and compilers, but there are many such pre-defined macros which, collectively, give a high probability of correctly detecting big-endian under most circumstances.
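Collected into one test, the above might look like this (a heuristic sketch; PROBABLY_BIG_ENDIAN is a made-up name, and note that PowerPC can also run little-endian, so the architecture macros are only a fallback):
/* Any hit counts as big-endian. */
#if (defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)) || \
    defined(__BIG_ENDIAN__) || defined(__BIG_ENDIAN) || defined(_BIG_ENDIAN) || \
    defined(_ARCH_PPC) || defined(__PPC__) || defined(__PPC) || defined(PPC) || \
    defined(__powerpc__) || defined(__powerpc) || defined(powerpc)
#  define PROBABLY_BIG_ENDIAN 1
#else
#  define PROBABLY_BIG_ENDIAN 0
#endif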
At compile time in C you can't do much more than trusting preprocessor #defines, and there are no standard solutions because the C standard isn't concerned with endianness.
Still, you could add an assertion that is done at runtime at the start of the program to make sure that the assumption done when compiling was true:
inline int IsBigEndian()
{
int i=1;
return ! *((char *)&i);
}
/* ... */
#ifdef COMPILED_FOR_BIG_ENDIAN
assert(IsBigEndian());
#elif defined(COMPILED_FOR_LITTLE_ENDIAN)
assert(!IsBigEndian());
#else
#error "No endianness macro defined"
#endif
(where COMPILED_FOR_BIG_ENDIAN and COMPILED_FOR_LITTLE_ENDIAN are macros #defined previously according to your preprocessor endianness checks)
Instead of looking for a compile-time check, why not just use big-endian order (which is considered the "network order" by many) and use the htons/htonl/ntohs/ntohl functions provided by most UNIX-systems and Windows. They're already defined to do the job you're trying to do. Why reinvent the wheel?
Try something like:
if(*(char *)(int[]){1}) {
/* little endian code */
} else {
/* big endian code */
}
and see if your compiler resolves it at compile-time. If not, you might have better luck doing the same with a union. Actually I like defining macros using unions that evaluate to 0,1 or 1,0 (respectively) so that I can just do things like accessing buf[HI] and buf[LO] (see the sketch below).
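The HI/LO idea can be sketched like so (names made up; the probe constant-folds under optimization):
#include <stdint.h>

/* Index of the most-significant byte of a uint16_t on this machine. */
static const union { uint16_t u; uint8_t b[2]; } probe = { 0x0100 };
#define HI (probe.b[0] == 0x01 ? 0 : 1)
#define LO (1 - HI)

/* Usage: write 0x1234 into a native uint16_t byte by byte.   */
/* uint16_t word; uint8_t *buf = (uint8_t *)&word;            */
/* buf[HI] = 0x12; buf[LO] = 0x34;   -> word == 0x1234        */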
Notwithstanding compiler-defined macros, I don't think there's a compile-time way to detect this, since determining the endianness of an architecture involves analyzing the manner in which it stores data in memory.
Here's a function which does just that:
#include <stdbool.h>

bool IsLittleEndian () {
    int i = 1;
    return (int)*((unsigned char *)&i) == 1;
}
As others have pointed out, there isn't a portable way to check for endianness at compile-time. However, one option would be to use the autoconf tool as part of your build script to detect whether the system is big-endian or little-endian, then to use the AC_C_BIGENDIAN macro, which holds this information. In a sense, this builds a program that detects at runtime whether the system is big-endian or little-endian, then has that program output information that can then be used statically by the main source code.
Hope this helps!
This comes from p. 45 of Pointers in C:
#include <stdio.h>
#define BIG_ENDIAN 0
#define LITTLE_ENDIAN 1
int endian()
{
short int word = 0x0001;
char *byte = (char *) &word;
return (byte[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}
int main(int argc, char* argv[])
{
int value;
value = endian();
if (value == 1)
printf("The machine is Little Endian\n");
else
printf("The machine is Big Endian\n");
return 0;
}
The sockets API's ntohl function can be used for this purpose. Source
// Soner
#include <stdio.h>
#include <arpa/inet.h>
int main() {
if (ntohl(0x12345678) == 0x12345678) {
printf("big-endian\n");
} else if (ntohl(0x12345678) == 0x78563412) {
printf("little-endian\n");
} else {
printf("(stupid)-middle-endian\n");
}
return 0;
}
My GCC version is 9.3.0, configured to support the powerpc64 platform, and I've tested and verified that it supports the following macro logic:
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
......
#endif
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
.....
#endif
As of C++20, no more hacks or compiler extensions are necessary.
https://en.cppreference.com/w/cpp/types/endian
std::endian (Defined in header <bit>)
enum class endian
{
little = /*implementation-defined*/,
big = /*implementation-defined*/,
native = /*implementation-defined*/
};
If all scalar types are little-endian, std::endian::native equals std::endian::little
If all scalar types are big-endian, std::endian::native equals std::endian::big
You can't detect it at compile time to be portable across all compilers. Maybe you can change the code to do it at run-time - this is achievable.
It is not possible to detect endianness portably in C with preprocessor directives.
As of 2017-07-18, I use union { unsigned u; unsigned char c[4]; }
If sizeof (unsigned) != 4 your test may fail.
It may be better to use
union { unsigned u; unsigned char c[sizeof (unsigned)]; }
As most have mentioned, compile time is your best bet. Assuming you do not cross-compile and you use cmake (it will also work with other tools, such as a configure script, of course), then you can use a pre-test: a compiled .c or .cpp file that gives you the actual verified endianness of the processor you're running on.
With cmake you use the TestBigEndian macro. It sets a variable which you can then pass to your software. Something like this (untested):
include(TestBigEndian)
TEST_BIG_ENDIAN(IS_BIG_ENDIAN)
...
set(CFLAGS ${CFLAGS} -DIS_BIG_ENDIAN=${IS_BIG_ENDIAN})     # C
set(CXXFLAGS ${CXXFLAGS} -DIS_BIG_ENDIAN=${IS_BIG_ENDIAN}) # C++
Then in your C/C++ code you can check that IS_BIG_ENDIAN define:
#if IS_BIG_ENDIAN
...do big endian stuff here...
#else
...do little endian stuff here...
#endif
So the main problem with such a test is cross-compiling, since you may be compiling on a completely different CPU with a different endianness... but at least it gives you the endianness at the time of compiling the rest of your code, and will work for most projects.
I provide a general approach in C that uses no preprocessor tricks, only runtime code, and computes the endianness of every C type.
The output of this on my Linux x86_64 architecture is:
fabrizio#toshibaSeb:~/git/pegaso/scripts$ gcc -o sizeof_endianess sizeof_endianess.c
fabrizio#toshibaSeb:~/git/pegaso/scripts$ ./sizeof_endianess
INTEGER TYPE | signed | unsigned | 0x010203... | Endianess
--------------+---------+------------+-------------------------+--------------
int | 4 | 4 | 04 03 02 01 | little
char | 1 | 1 | - | -
short | 2 | 2 | 02 01 | little
long int | 8 | 8 | 08 07 06 05 04 03 02 01 | little
long long int | 8 | 8 | 08 07 06 05 04 03 02 01 | little
--------------+---------+------------+-------------------------+--------------
FLOATING POINT| size |
--------------+---------+
float | 4
double | 8
long double | 16
Get source at: https://github.com/bzimage-it/pegaso/blob/master/scripts/sizeof_endianess.c
This more general approach does not detect endianness at compile time (not possible) nor assume that one endianness excludes another. In fact, it is important to remark that endianness is not a concept of the architecture/processor but a property of each single type. As argued by #Christoph at https://stackoverflow.com/a/4712594/3280080, the PDP-11, for example, could have different endianness at the same time.
The approach consists of setting an integer to x = 0x010203... for as many bytes as the type is long, then printing its bytes one at a time through a char pointer, incrementing the address by one.
Could somebody please test it on a big-endian and/or mixed-endian machine?
I know I'm late to this party, but here is my take.
#include <stdint.h>

int is_big_endian() {
    return 1 & *(uint16_t *)"01";
}
This is based on the fact that '0' is 48 in decimal and '1' is 49, so '1' has the LSB set while '0' does not. I could make them '\x00' and '\x01', but I think my version makes it more readable.
#define BIG_ENDIAN ((1 >> 1 == 0) ? 0 : 1)