Implementing std::bit_cast equivalent in C

Is it possible to implement something similar to C++20's std::bit_cast in C? It would be a lot more convenient than using union or casting pointers to different types and dereferencing.
If you had a bit_cast, then implementing some floating point functions would be easier:
float Q_rsqrt( float number )
{
    int i = 0x5f3759df - ( bit_cast(int, number) >> 1 );
    float y = bit_cast(float, i);
    y = y * ( 1.5f - ( number * 0.5f * y * y ) );
    y = y * ( 1.5f - ( number * 0.5f * y * y ) );
    return y;
}
See also Fast inverse square root
The naive solution is:
#define bit_cast(T, ...) (*(T*) &(__VA_ARGS__))
But it has major problems:
it is undefined behavior because it violates strict aliasing
it doesn't work for bit-casting rvalues because we are taking the address of the second operand directly
it doesn't make sure that the operands have the same size
Can we implement a bit_cast without these issues?

It is possible in non-standard C, thanks to typeof. typeof has also been proposed for C23, so this may become possible in standard C23. One of the solutions below makes some sacrifices to allow C99 compliance.
Implementation Using union
Let's look at how the approach using union works first:
#define bit_cast(T, ...) \
((union{typeof(T) a; typeof(__VA_ARGS__) b;}) {.b=(__VA_ARGS__)}.a)
We are creating a compound literal from an anonymous union made of T and whatever type the given expression has. We initialize this literal to .b= ... using designated initializers and then access the .a member of type T.
The typeof(T) is necessary if we want to pun function pointers, arrays, etc., due to C's type syntax.
Implementation using memcpy
This implementation is slightly longer, but has the advantage that it can be reduced to plain C99 without typeof (see the variant further below):
#define bit_cast(T, ...) \
(*(typeof(T)*) memcpy(&(T){0}, &(typeof(__VA_ARGS__)) {(__VA_ARGS__)}, sizeof(T)))
We are copying from one compound literal to another and then accessing the destination's value:
the source literal is a copy of our input expression, which allows us to take its address, even for bit_cast(float, 123) where 123 is an rvalue
the destination is a zero-initialized literal of type T
memcpy returns the destination operand, so we can cast the result to typeof(T)* and then dereference that pointer.
We can completely eliminate typeof here and make this C99-compliant, but there are downsides:
#define bit_cast(T, ...) \
(*((T*) memcpy(&(T){0}, &(__VA_ARGS__), sizeof(T))))
We are now taking the address of the expression directly, so we can't use bit_cast on rvalues anymore. We are using T* without typeof, so we can no longer convert to function pointers, arrays, etc.
Implementing Size Checking (since C11)
As for the last issue, which is that we don't verify that both operands have the same size: We can use _Static_assert (since C11) to make sure of that. Unfortunately, _Static_assert is a declaration, not an expression, so we have to wrap it up:
#define static_assert_expr(...) \
((void) (struct{_Static_assert(__VA_ARGS__); int _;}) {0})
We are creating a compound literal that contains the assertion and discarding the expression.
We can easily integrate this in the previous two implementations using the comma operator:
#define bit_cast_memcpy(T, ...) ( \
static_assert_expr(sizeof(T) == sizeof(__VA_ARGS__), "operands must have the same size"), \
(*(typeof(T)*) memcpy(&(T){0}, &(typeof(__VA_ARGS__)) {(__VA_ARGS__)}, sizeof(T))) \
)
#define bit_cast_union(T, ...) ( \
static_assert_expr(sizeof(T) == sizeof(__VA_ARGS__), "operands must have the same size"), \
((union{typeof(T) a; typeof(__VA_ARGS__) b;}) {.b=(__VA_ARGS__)}.a) \
)
Known and Unfixable Issues
Because of how macros work, we can not use this if the punned type contains a comma:
bit_cast(int[0,1], x)
This doesn't work because macros ignore square brackets and the 1] would not be considered part of the type, but would go into __VA_ARGS__.

Related

_Static_assert in unused generic selection

It looks like the typeof operator is likely to be accepted into the next C standard, and I was looking to see if there was a way to leverage this to create a macro using portable ISO-C that can get the length of an array passed into it or fail to compile if a pointer is passed into it. Normally generic selection can be used to force a compiler error when using an unwanted type by leaving it out of the generic association list, but in this case, we need a default association to deal with arrays of any length, so instead I am trying to force a compiler error for the generic association for the type we don't want. Here's an example of what the macro could look like:
#define ARRAY_SIZE(X) _Generic(&(X), \
typeof(&X[0]) *: sizeof(struct{_Static_assert(0, "Trying to get the array length of a pointer"); int _a;}), \
default: (sizeof(X) / sizeof(X[0])) \
)
The problem is that _Static_assert is tripping even when the generic association selected is the default association. For the sake of simplicity, since the issue at hand is not related to anything being introduced in C23, we'll make a test program that works explicitly to reject a pointer to int:
#include <stdio.h>
#include <stdlib.h>
#define ARRAY_SIZE(X) _Generic(&(X), \
    int **: sizeof(struct{_Static_assert(0, "Trying to get the array length of a pointer"); int _a;}), \
    default: (sizeof(X) / sizeof(X[0])) \
    )
int main(void) {
    int x[100] = {0};
    int *y = x;
    int (*z)[100] = {&x};
    printf("length of x: %zu\n", ARRAY_SIZE(x));
    printf("length of y: %zu\n", ARRAY_SIZE(y));
    printf("length of z: %zu\n", ARRAY_SIZE(z));
    printf("length of *z: %zu\n", ARRAY_SIZE(*z));
    return EXIT_SUCCESS;
}
Building the above with -std=c11, I find _Static_assert tripping on all expansions of ARRAY_SIZE when I would expect to only have problems with the pointers that will use the int ** generic association.
According to 6.5.1.1 p3 of the C11 standard for Generic Selection,
None of the expressions from any other generic association of the generic selection is evaluated
Is this a bug in gcc and clang, or is there something I've missed in the standard that would cause the compile-time evaluation of this _Static_assert in the unused generic association?

It doesn't matter which generic selection is evaluated.
When the expression that is part of a _Static_assert has the value 0, this is considered a constraint violation and the compiler is required to generate a diagnostic.

You can't really mix _Static_assert with expressions that should return a value, such as a function-like macro. You could perhaps work around that with a "poor man's static assert", like one of the ugly tricks we used before C11:
#define POOR_STATIC_ASSERT(expr) (int[expr]){0}
#define CHECK(X) _Generic((&X), \
int **: 0,\
default: (sizeof(X) / sizeof(X[0])) \
)
#define ARRAY_SIZE(X) ( (void)POOR_STATIC_ASSERT(CHECK(X)), CHECK(X) )
Here the CHECK macro yields the array size for a valid type and zero otherwise. ARRAY_SIZE uses the comma operator to evaluate CHECK once inside the poor man's assert, discard that result, and then evaluate CHECK again to produce the value of the whole macro. For a rejected type this leads to a somewhat cryptic error from an ISO C compiler, such as "error: ISO C forbids zero-size array".
The next problem is that &(X) in _Generic is by no means guaranteed to boil down to an int**, so this macro isn't safe or reliable. Regarding array sizes though, there's a trick we can use. A pointer to an array of no size (incomplete type) is compatible with a pointer to any array of the same element type, no matter its size. The macro could be rewritten as:
#define POOR_STATIC_ASSERT(expr) (int[expr]){0}
#define CHECK(X) _Generic((&X), \
int (*)[]: sizeof(X) / sizeof(X[0]), \
default: 0)
#define ARRAY_SIZE(X) ( (void)POOR_STATIC_ASSERT(CHECK(X)), CHECK(X) )
This will work for any int array no matter size but fail for everything else.

Utilizing some of the suggestions from Lundin's answer, I have come up with the following solution to the simplified problem:
#define STATIC_ASSERT_EXPRESSION(X, ERROR_MESSAGE) (sizeof(struct {_Static_assert((X), ERROR_MESSAGE); int _a;}))
#define NO_POINTERS(X) _Generic(&(X), \
int (*)[]: 1, \
default: 0 \
)
#define ARRAY_SIZE(X) ( (void)STATIC_ASSERT_EXPRESSION(NO_POINTERS(X), "Cannot retrieve the number of array elements from a pointer"), (sizeof(X) / sizeof(X[0])) )
For the actual use-case to be type generic using typeof, which should be coming to the C23 standard, replace the NO_POINTERS macro using this:
#define NO_POINTERS(X) _Generic(&(X), \
typeof(*X) (*)[]: 1, \
default: 0 \
)
By moving the _Static_assert outside of the Generic Selection, it will only be evaluated with the value that actually returns from the selection, so it won't get fired off for existing in an unused selection. Additionally, the number of elements calculation was also removed from the generic selection so that expression of the generic selection could safely be used in a Static Assert even if your array was a Variable-Length array which requires its size to be calculated at run-time.
The Static Assert itself is placed inside of an anonymous struct that we take the sizeof so that it is a part of an expression. And then as in Lundin's example, we use the comma operator to have that expression evaluated, and then thrown out and use the results of the array size calculation.
With this, we reject pointers while getting the number of elements in both static arrays and VLAs, plus we get a nice compiler error message when trying to pass in a pointer.

Double evaluation within macro: a case of sizeof() to determine array's size passed as compound literal

C99 makes it possible to define arrays basically anywhere, as compound literals.
For example, given a trivial function sumf() that accepts an array of float as input, we would expect the prototype to be :
float sumf(const float* arrayf, size_t size);
This can then be used like that :
float total = sumf( (const float[]){ f1, f2, f3 }, 3 );
It's convenient because there is no need to declare a variable beforehand.
The syntax is slightly ugly, but this could be hidden behind a macro.
However, note the final 3. This is the size of the array. It is required so that sumf() knows where to stop. But as code ages and gets refactored, it's also an easy source of errors, because this second argument must be kept in sync with the first argument. For example, adding f4 requires updating this value to 4, otherwise the function returns a wrong result (and no warning flags the issue).
So it would be better to keep both in sync.
If it was an array which was declared through a variable, it would be easy.
We could have a macro that simplifies the expression like this : float total = sumf( ARRAY(array_f) ); with just #define ARRAY(a) (a) , sizeof(a) / sizeof(*(a)). But then, array_f must be defined before calling the function, so it's no longer a compound literal.
Since it's a compound literal, it has no name, so it can't be referenced. Hence I could not find any better way than to repeat the compound literal in both parameters.
#define LIST_F(...) (const float*)( __VA_ARGS__) , sizeof((const float*)( __VA_ARGS__)) / sizeof(float)
float total = sumf ( LIST_F( f1, f2, f3 ) );
and this would work. Adding an f4 into the list would automatically update the size argument to correct size.
However, this all works fine as long as all members are variables. But what about cases where it's a function ? Would the function be invoked twice ?
Say for example : float total = sumf ( LIST_F( v1, f2() ) );, will f2() be invoked twice? This is unclear to me as f2() is mentioned within sizeof(), so it could, in theory, know the return type size without actually invoking f2(). But I'm unsure what the standard says about that. Is there a guarantee ? Is it implementation dependent ?

will f2() be invoked twice?
No, the operand of sizeof is not evaluated (unless its type is a variable length array, which is not the case here).
what the standard says about that. Is there a guarantee ?
From C11 6.5.3.4p2:
The sizeof operator yields the size (in bytes) of its operand, [...] If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.
Is it implementation dependent ?
No, it should be always fine.
Note that your other macro uses (const float*)(__VA_ARGS__), which will not work; the compound literal syntax is (float[]){ stuff }. Anyway, I would just use one macro rather than two, to save typing:
#define SUMF_ARRAY(...) \
sumf( \
(const float[]){__VA_ARGS__}, \
sizeof((const float[]){__VA_ARGS__}) / sizeof(float))
float total = SUMF_ARRAY(f1(), f2(), f3());

Multiline macro function with "return" statement

I'm currently working on a project, and a particular part needs a multi-line macro function (a regular function won't work here as far as I know).
The goal is to make a stack manipulation macro, that pulls data of an arbitrary type off the stack (being the internal stack from a function call, not a high-level "stack" data type). If it were a function, it'd look like this:
type MY_MACRO_FUNC(void *ptr, type);
Where type is the type of data being pulled from the stack.
I currently have a working implementation of this for my platform (AVR):
#define MY_MACRO_FUNC(ptr, type) (*(type*)ptr); \
(ptr = /* Pointer arithmetic and other stuff here */)
This allows me to write something like:
int i = MY_MACRO_FUNC(ptr, int);
As you can see in the implementation, this works because the statement which assigns i is the first line in the macro: (*(type*)ptr).
However, what I'd really like is to be able to have a statement before this, to verify that ptr is a valid pointer before anything gets broken. But, this would cause the macro to be expanded with the int i = pointing to that pointer check. Is there any way to get around this issue in standard C? Thanks for any help!

As John Bollinger points out, macros expanding to multiple statements can have surprising results. A way to make several statements (and declarations!) a single statement is to wrap them into a block surrounded by do … while(0).
In this case, however, the macro should evaluate to something, so it must be an expression (and not a statement). Everything but declarations and iteration and jump statements (for, while, goto) can be transformed to an expression: Several expressions can be sequenced with the comma operator, if-else-clauses can be replaced by the conditional operator (?:).
Given that the original value of ptr can be recovered (I’ll assume "arithmetic and other stuff here" as adding 4 for the sake of having an example)
#define MY_MACRO_FUNC(ptr, type) \
( (ptr) && (uintptr_t)(ptr)%4 == 0 \
? (ptr) += 4 , *(type*)((ptr) - 4) \
: (abort() , (type){ 0 }) )
Note that I put parentheses around ptr and around the whole expression, so that the macro still expands correctly when ptr is itself an expression or when the result is used inside a larger expression.
The second and third operand of ?: must be of the same type, so I included (type){0} after the abort call. This expression is never evaluated. You just need some valid dummy object; here, type cannot be a function type.
If you use C89 and can’t use compound literals, you can use (type)0, but that wouldn’t allow for structure or union types.
Just as a note, GCC has an extension, Statements and Declarations in Expressions.

This is very nasty:
#define MY_MACRO_FUNC(ptr, type) (*(type*)ptr); \
(ptr = /* Pointer arithmetic and other stuff here */)
It may have unexpected results in certain innocuous-looking circumstances, such as
if (foo) bar = MY_MACRO_FUNC(ptr, int);
Consider: what happens then if foo is 0?
I think you would be better off implementing this in a form that assigns the popped value instead of 'returning' it:
#define MY_POP(stack, type, v) do { \
if (!stack) abort_abort_abort(); \
v = *((type *) stack); \
stack = (... compute new value ...); \
} while (0)

Is this function macro safe?

Can you tell me if anything and what can go wrong with this C "function macro"?
#define foo(P,I,X) do { (P)[I] = X; } while(0)
My goal is that foo behaves exactly like the following function foofunc for any POD data type T (i.e. int, float*, struct my_struct { int a,b,c; }):
static inline void foofunc(T* p, size_t i, T x) { p[i] = x; }
For example this is working correctly:
int i = 0;
float p;
foo(&p,i++,42.0f);
It can handle things like &p due to putting P in parentheses, it does increment i exactly once because I appears only once in the macro and it requires a semicolon at the end of the line due to do {} while(0).
Are there other situations that I am not aware of in which the macro foo would not behave like the function foofunc?
In C++ one could define foofunc as a template and would not need the macro. But I look for a solution which works in plain C (C99).

The fact that your macro works for arbitrary X arguments hinges on the details of operator precedence. I recommend using parentheses even if they happen not to be necessary here.
#define foo(P,I,X) do { (P)[I] = (X); } while(0)
This is a statement, not an expression, so it cannot be used everywhere foofunc(P,I,X) could be. Even if foofunc returns void, it can be used in comma expressions; foo can't. But you can easily define foo as an expression, with a cast to void if you don't want to risk using the result.
#define foo(P,I,X) ((void)((P)[I] = (X)))
With a macro instead of a function, all you lose is the error checking. For example, you can write foo(3, ptr, 42) instead of foo(ptr, 3, 42). In an implementation where size_t is smaller than ptrdiff_t, using the function may truncate I, but the macro's behavior is more intuitive. The type of X may be different from the type that P points to: an automatic conversion will take place, so in effect it is the type of P that determines which typed foofunc is equivalent.
In the important respects, the macro is safe. With appropriate parentheses, if you pass syntactically reasonable arguments, you get a well-formed expansion. Since each parameter is used exactly once, all side effects will take place. The order of evaluation between the parameters is unspecified either way.

The do { ... } while(0) construct protects your result from any harm, your inputs P and I are protected by () and [], respectively. What is not protected, is X. So the question is, whether protection is needed for X.
Looking at the operator precedence table (http://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B#Operator_precedence), we see that only two operators are listed as having lower precedence than = so that the assignment could steal their argument: the throw operator (this is C++ only) and the , operator.
Now, apart from being C++ only, the throw operator is uncritical because it does not have a left hand argument that could be stolen.
The , operator, on the other hand, would be a problem if X could contain it as a top level operator. But if you parse the statement
foo(array, index, x += y, y)
you see that the , operator would be interpreted to delimit a fourth argument, and
foo(array, index, (x += y, y))
already comes with the parentheses it requires.
To make a long story short:
Yes, your definition is safe.
However, your definition relies on the fact that stuff, more_stuff cannot be passed as a single macro argument without adding parentheses. I would prefer not to rely on such intricacies, and just write the obviously safe
#define foo(P, I, X) do { (P)[I] = (X); } while(0)

How to check if a parameter is an integral constant expression in a C preprocessor macro?

I'm currently cleaning up an existing C-library to publish it shamelessly.
A preprocessor macro NPOT is used to calculate the next greater power of two for a given integral constant expression at compile time. The macro is normally used in direct initialisations. For all other cases (e.g. using variable parameters), there is an inline function with the same function.
But if the user passes a variable, the algorithm expands to a huge piece of machine code. My question is:
What may I do to prevent a user from passing anything but an integral constant expression to my macro?
#define NPOT(x) complex_algorithm(x)
const int c=10;
int main(void) {
    int i=5;
    foo = NPOT(5); // works, and does everything it should
    foo = NPOT(c); // works also, but blows up the code extremely
    foo = NPOT(i); // blows up the code also
}
What I already tried:
Define the macro as #define NPOT(x) complex_algorithm(x ## u). It still works for literals and produces a compiler error (even if a hardly helpful one) for variable parameters, unless a variable such as iu happens to exist. Dirty, dangerous, don't want it.
Documentation, didn't work for most users.

You can use any expression that needs a constant integral expression and that will then be optimized out.
#define NPOT(X) \
(1 \
? complex_algorithm(X) \
: sizeof(struct { int needs_constant[1 ? 1 : (X)]; }) \
)
You may also want to cast the result of the sizeof to the appropriate integer type, so that the whole expression has the type you'd expect.
I am using an untagged struct here to
have a type so really no temporary is produced
have a unique type such that the expression can be repeated anywhere in the code without causing conflicts
trigger the use of a VLA, which is not allowed inside a struct as of C99:
A member of a structure or union may have any object type other than a
variably modified type.
I am using the ternary ?: with 1 as the selecting expression to ensure that the : is always evaluated for its type, but never evaluated as an expression.
Edit: It seems that gcc accepts VLA inside struct as an extension and doesn't even warn about it, even when I explicitly say -std=c99. This is really a bad idea of them.
For such a weird compiler :) you could use sizeof((int[X]){ 0 }), instead. This is "as forbidden" as the above version, but additionally even gcc complains about it.

#define INTEGRAL_CONST_EXPR(x) ((void) sizeof (struct {int a:(x);}), (x))
This will give a compile error if x is not an integral constant expression.
my_function(INTEGRAL_CONST_EXPR(1 + 2 + 3)); // OK
my_function(INTEGRAL_CONST_EXPR(1.0 + 2 + 3)); // compile error
Note that this solution does not work for initializing a static variable:
static int a = INTEGRAL_CONST_EXPR(2 + 3);
will trigger a compile error, because an expression containing the , operator is not a constant expression.
As @JensGustedt noted in a comment, an integral constant expression that evaluates to a negative number cannot be used with this solution, because a bit-field width cannot be negative.
