Efficient conversion from Indeterminate Value to Unspecified Value - c

Sometimes in C it is necessary to read a possibly-written item from a partially-written array, such that:
If the item has been written, the read will yield the value that was in fact written, and
If the item hasn't been written, the read will convert an Unspecified bit pattern to a value of the appropriate type, with no side-effects.
There are a variety of algorithms where finding a solution from scratch is expensive, but validating a proposed solution is cheap. If an array holds solutions for all cases where they have been found, and arbitrary bit patterns for other cases, reading the array, testing whether it holds a valid solution, and slowly computing the solution only if the one in the array isn't valid, may be a useful optimization.
If an attempt to read a non-written array element of a types like uint32_t could be guaranteed to always yield a value of the appropriate type, efficiently such an approach would be easy and straightforward. Even if that requirement only held for unsigned char, it might still be workable. Unfortunately, compilers sometimes behave as though reading an Indeterminate Value, even of type unsigned char, may yield something that doesn't behave consistently as a value of that type. Further, discussions in a Defect Report suggest that operations involving Indeterminate values yield Indeterminate results, so even given something like unsigned char x, *p=&x; unsigned y=*p & 255; unsigned z=(y < 256); it would be possible for z to receive the value 0.
From what I can tell, the function:
unsigned char solidify(unsigned char *p)
{
unsigned char result = 0;
unsigned char mask = 1;
do
{
if (*p & mask) result |= mask;
mask += (unsigned)mask; // Cast only needed for capricious type ranges
} while(mask);
return result;
}
would be guaranteed to always yield a value in the range of type unsigned char any time the storage identified can be accessed as that type, even if it happens to hold Indeterminate Value. Such an approach seems rather slow and clunky, however, given that the required machine code to obtain the desired effect should usually be equivalent to returning x.
Are there any better approaches that would be guaranteed by the Standard to always yield a value within the range of unsigned char, even if the source value is Indeterminate?
Addendum
The ability to solidify values is necessary, among other things, when performing I/O with partially-written arrays and structures, in cases where nothing will care about what bits get output for the parts that were never set. Whether or not the Standard would require that fwrite be usable with partially-writtten structures or arrays, I would regard I/O routines that can be used in such fashion (writing arbitrary values for portions that weren't set) to be of higher quality than those which might jump the rails in such cases.
My concern is largely with guarding against optimizations which are unlikely to be used in dangerous combinations, but which could nonetheless occur as compilers get more and more "clever".
A problem with something like:
unsigned char solidify_alt(unsigned char *p)
{ return *p; }
is that compilers may combine an optimization which could be troublesome but tolerable in isolation, with one that would be good in isolation but deadly in combination with the first:
If the function is passed the address of an unsigned char which has been optimized to e.g. a 32-bit register, a function like the above may blindly return the contents of that register without clipping it to the range 0-255. Requiring that callers manually clip the results of such functions would be annoying but survivable if that were the only problem. Unfortunately...
Since the above function function will "always" return a value 0-255, compilers may omit any "downstream" code that would try to mask the value into that range, check if it was outside, or otherwise do things that would be irrelevant for values outside the range 0-255.
Some I/O devices may require that code wishing to write an octet perform a 16-bit or 32-bit store to an I/O register, and may require that 8 bits contain the data to be written and other bits hold a certain pattern. They may malfunction badly if any of the other bits are set wrong. Consider the code:
void send_byte(unsigned char *p, unsigned int n)
{
while(n--)
OUTPUT_REG = solidify_alt(*p++) | 0x0200;
}
void send_string4(char *st)
{
unsigned char buff[5]; // Leave space for zero after 4-byte string
strcpy((char*)buff, st);
send_bytes(buff, 4);
}
with the indended semantics that send_string4("Ok"); should send out an 'O', a 'k', a zero byte, and an arbitrary value 0-255. Since the code uses solidify_alt rather than solidify, a compiler could legally turn that into:
void send_string4(char *st)
{
unsigned buff0, buff1, buff2, buff3;
buff0 = st[0]; if (!buff0) goto STRING_DONE;
buff1 = st[1]; if (!buff1) goto STRING_DONE;
buff2 = st[2]; if (!buff2) goto STRING_DONE;
buff3 = st[3];
STRING_DONE:
OUTPUT_REG = buff0 | 0x0200;
OUTPUT_REG = buff1 | 0x0200;
OUTPUT_REG = buff2 | 0x0200;
OUTPUT_REG = buff3 | 0x0200;
}
with the effect that OUTPUT_REG may receive values with bits set outside the proper range. Even if output expression were changed to ((unsigned char)solidify_alt(*p++) | 0x0200) & 0x02FF) a compiler could still simplify that to yield the code given above.
The authors of the Standard refrained from requiring compiler-generated initialization of automatic variables because it would have made code slower in cases where such initialization would be semantically unnecessary. I don't think they intended that programmers should have to manually initialize automatic variables in cases where all bit patterns would be equally acceptable.
Note, btw, that when dealing with short arrays, initializing all the values will be inexpensive and would often be a good idea, and when using large arrays a compiler would be unlikely to impose the above "optimization". Omitting the initialization in cases where the array is large enough that the cost matters, however, would make the program's correct operation reliant upon "hope".

This is not an answer, but an extended comment.
The immediate solution would be for the compiler to provide a built-in, for example assume_initialized(variable [, variable ... ]*), that generates no machine code, but simply makes the compiler treat the contents of the specified variable (either scalars or arrays) to be defined but unknown.
One can achieve a similar effect using a dummy function defined in another compilation unit, for example
void define_memory(void *ptr, size_t bytes)
{
/* Nothing! */
}
and calling that (e.g. define_memory(some_array, sizeof some_array)), to stop the compiler from treating the values in the array as indeterminate; this works because at compile time, the compiler cannot determine the values are unspecified or not, and therefore must consider them specified (defined but unknown).
Unfortunately, that has serious performance penalties. The call itself, even though the function body is empty, has a performance impact. However, worse yet is the effect on the code generation: because the array is accessed in a separate compilation unit, the data must actually reside in memory in array form, and thus typically generates extra memory accesses, plus restricts the optimization opportunities for the compiler. In particular, even a small array must then exist, and cannot be implicit or reside completely in machine registers.
I have experimented with a few architecture (x86-64) and compiler (GCC) -specific workarounds (using extended inline assembly to fool the compiler to believe that the values are defined but unknown (unspecified, as opposed to indeterminate), without generating actual machine code -- because this does not require any machine code, just a small adjustment to how the compiler treats the arrays/variables --, but with about zero success.
Now, to the underlying reason why I wrote this comment.
Years and years ago, working on numerical computation code and comparing performance to a similar implementation in Fortran 95, I discovered the lack of a memrepeat(ptr, first, bytes) function: the counterpart to memmove() with respect to memcpy(), that would repeat first bytes at ptr to ptr+first up to ptr+bytes-1. Like memmove(), it would work on the storage representation of the data, so even if the ptr to ptr+first contained a trap representation, no trap would actually trigger.
Main use case is to initialize arrays with floating-point data (one-dimensional, multidimensional, or structures with floating-point members), by initializing the first structure or group of values, and then simply repeating the storage pattern over the entire array. This is a very common pattern in numerical computation.
As an example, using
double nums[7] = { 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0 };
memrepeat(nums, 2 * sizeof nums[0], sizeof nums);
yields
double nums[7] = { 7.0, 6.0, 7.0, 6.0, 7.0, 6.0, 7.0 };
(It is possible that the compiler could optimize the operation even better, if it was defined as e.g. memsetall(data, size, count), where size is the size of the duplicated storage unit, and count the total number of storage units (so count-1 units are actually copied). In particular, this allows easy implementation that uses nontemporal stores for the copies, reading from the initial storage unit. On the other hand, memsetall() can only copy full storage units unlike memrepeat(), so memsetall(nums, 2 * sizeof nums[0], 3); would leave the 7th element in nums[] unchanged -- i.e., in the above example, it'd yield { 7.0, 6.0, 7.0, 6.0, 7.0, 6.0, 1.0 }.)
Although you can trivially implement memrepeat() or memsetall(), even optimize them for a specific architecture and compiler, it is difficult to write a portable optimized version.
In particular, loop-based implementations that use memcpy() (or memmove()) yield quite inefficient code when compiled by e.g. GCC, because the compiler cannot coalesce a pattern of function calls into a single operation.
Most compilers often inline memcpy() and memmove() with internal, target-and-use-case-optimized versions, and doing that for such a memrepeat() and/or memsetall() function would make it portable. In Linux on x86-64, GCC inlines known-size calls, but keeps the function calls where the size is only known at runtime.
I did try to push it upstream, with some private and some public discussions on various mailing lists. The response was cordial, but clear: there is no way to get such features included into compilers, unless it is standardized by someone first, or you pique the interest of one of the core developers enough so that they want to try it themselves.
Because the C standards committee is only concerned at fulfilling the commercial interests of its corporate sponsors, there is zero chance of getting anything like that standardized into ISO C. (If there were, we really should push for basic features from POSIX like getline(), regex, and iconv to be included first; they'd have a much bigger positive impact on code we can teach new C programmers.)
None of this piqued the interest of the core GCC developers either, so at that point, I lost my interest in trying to push it upstream.
If my experience is typical -- and discussing it with a few people it does seem like it is --, OP and others worrying about such things will better utilize their time to find compiler/architecture-specific workarounds, rather than point out the deficiencies in the standard: the standard is already lost, those people do not care.
Better spend your time and efforts in something you can actually accomplish without having to fight against windmills.

I think this is pretty clear. C11 3.19.2
indeterminate value
either an unspecified value or a trap
representation
Period. It cannot be anything else than the two cases above.
So code such as unsigned z=(y < 256) can never return 0, because x in your example cannot hold a value larger than 255. As per the representation of character types, 6.2.6, an unsigned char is not allowed to contain padding bits or trap representations.
Other types, on wildly exotic systems, could in theory hold values outside their range, padding bits and trap representations.
On real-world systems, that are extremely likely to use two's complement, trap representations do not exist. So the indeterminate value can only be unspecified. Unspecified, not undefined! There is a myth saying that "reading an indeterminate value is always undefined behavior". Save for trap representations and some other special cases, this is not true, see this. This is merely unspecified behavior.
Unspecified behavior does not mean that the compiler can run havoc and make weird assumptions, as it can when it encounters undefined behavior. It will have to assume that the variable values are in range. What the compiler cannot assume, is that the value is the same between reads - this was addressed by some DR.

Related

convert uint8_t* to uint64_t

What's more recommended or advisable way to convert array of uint8_t at offset i to uint64_t and why?
uint8_t * bytes = ...
uint64_t const v = ((uint64_t *)(bytes + i))[0];
or
uint64_t const v = ((uint64_t)(bytes[i+7]) << 56)
| ((uint64_t)(bytes[i+6]) << 48)
| ((uint64_t)(bytes[i+5]) << 40)
| ((uint64_t)(bytes[i+4]) << 32)
| ((uint64_t)(bytes[i+3]) << 24)
| ((uint64_t)(bytes[i+2]) << 16)
| ((uint64_t)(bytes[i+1]) << 8)
| ((uint64_t)(bytes[i]));
There are two primary differences.
One, the behavior of ((uint64_t *)(bytes + i))[0] is not defined by the C standard (unless certain prerequisites about what bytes point to are met). Generally, an array of bytes should not be accessed using a uint64_t type.
When memory defined as one type is accessed with another type, it is called aliasing, and the C standard only defines certain combinations of aliasing. Some compilers may support some aliasing beyond what the standard requires, but using it is not portable. Additionally, if bytes + i is not suitably aligned for a uint64_t, the access may cause an exception or otherwise malfunction.
Two, loading the bytes through aliasing, if it is defined (by the standard or by compiler extension), interprets the bytes using the memory ordering for the C implementation. Some C implementations store the bytes representing integers in memory from low address to high address for low-position-value bytes to high-position-value bytes, and some store them from high address to low address. (And they can be stored in non-consecutive orders too, although this is rare.) So loading the bytes this way will produce different values from the same bytes in memory based on what order the C implementation uses.
But loading the bytes and using shifts to combine them will always produce the same value from the same bytes in memory regardless of what order the C implementation uses.
The first method should be avoided, because there is no need for it. If one desires to interpret the bytes using the C implementation’s ordering, this can be done with:
uint64_t t;
memcpy(&t, bytes+i, sizeof t);
const uint64_t v = t;
Using memcpy provides a portable way of aliasing the uint64_t to store bytes into it. Good compilers recognize this idiom and will optimize the memcpy to a load from memory, if suitable for the target architecture (and if optimization is enabled).
If one desires to interpret the bytes using little-endian ordering, as shown in the code in the question, then the second method may be used. (Sometimes platforms will have routines that may provide more efficient code for this.)
You can also use memcpy
uint64_t n;
memcpy(&n, bytes + i, sizeof(uint64_t));
const uint64_t v = n;
The first option has two big problems that qualify as undefined behavior (anything can happen):
A uint8_t* or array of uint8_t is not necessarily aligned the same way as required by a larger type like uint64_t. Simply casting to uint64_t* leads to misaligned access. This can cause hardware exceptions, program crashes, slower code etc, all depending on the alignment requirements of the specific target.
It violates the internal type system of C, where each object in memory known by the compiler has an "effective type" that the compiler keeps track of. Based on this, the compiler is allowed to make certain assumptions regarding if a certain memory region have been accessed or not during optimization. If your code violates these type rules, as it would in this case, wrong machine code could get generated.
This is most commonly referred to as the strict aliasing rule and your cast followed by dereferencing would be a so-called "strict aliasing violation".
The second option is sound code, because:
When doing shifts or other forms of bitwise arithmetic, a large integer type should be used. That is, unsigned int or larger - depending on system. Using signed types or small integer types can lead to undefined behavior or unexpected results. See Implicit type promotion rules regarding problems with small integer types implicitly changing signedness in some expressions.
If not for the cast to uint64_t, then the bytes[i+7] << 56 shift would involve an implicit promotion of the left operand from uint8_t to int, which would be a bug. Because if the most significant bit (MSB) of the byte is set and we shift into/beyond the sign bit, we invoke undefined behavior - again, anything can happen.
And naturally we need to use a 64 bit type in this specific case or otherwise we wouldn't be able to shift as far as 56 bits. Shifting beyond the range of the type of the left operand is also undefined behavior.
Note that whether to pick the order of bytes[i+7] << 56 versus the alternative bytes[i+0] << 56 depends on the underlying CPU endianess. Bit shifts are nice since the actual shift ignores if the destination type is using big or little endian. But in this case you must know in advance which byte in the source array you want to correspond to the most significant. This code you have here will work if the array was built based on little endian formatting, since the last byte of the array is shifted to the highest address.
As for the uint64_t const v = , the const qualifier is a bit strange to have at local scope like that. It's harmless but confusing and doesn't really add anything of value inside a local scope. I would just drop it.

Using bit operations to "turn off" binary digits of a pointer

I was able to use bit operations to "turn off" binary digits of a number.
Ex:
x = x & ~(1<<0)
x = x & ~(1<<1)
(and repeat until desired number of digits starting from the right are changed to 0)
I would like to apply this technique to a pointer's address.
Unfortunately, the & operator cannot be used with pointers. Using the same lines of code as above, where x is a pointer, the compiler says "invalid operands to binary & (have int and int)."
I tried to typecast the pointers as ints, but that doesn't work as I assume the ints are too small (and I just realized I'm not allowed to cast).
(note: though this is part of a homework problem, I've already reasoned out why I need to turn off some digits after a good couple hours, so I'm fine in that regard. I'm simply trying to see if I can get a clever technique to do what I want to do here).
Restrictions: I cannot use loops, conditionals, any special functions, constants greater than 255, division, mod.
(edit: added restrictions to the bottom)
Use uintptr_t from <stdint.h>. You should always use unsigned types for bit twiddling, and (u)intptr_t is specifically chosen to be able to hold a pointer's value.
Note however that adjusting a pointer manually and dereferencing it is undefined behaviour, so watch your step. You shall be able to recover the exact original value of the pointer (or another valid pointer) before doing so.
Edit : from your comment I understand that you don't plan on dereferencing the twiddled pointer at all, so no undefined behaviour for you. Here is how you can check if your pointers share the same 64-byte block :
uintptr_t p1 = (uintptr_t)yourPointer1;
uintptr_t p2 = (uintptr_t)yourPointer2;
uintptr_t mask = ~(uintptr_t)63u; // Shave off 5 low-order bits
return (p1 & mask) == (p2 & mask);
C language standard library includes the (optional though) type intptr_t, for which there is guarantee that "any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer".
Of course if you perform bitwise operation on the integer than the result is undefined behaviour.
Edit:
How unfortunate haha. I need a function to show two pointers are in
the same 64-byte block of memory. This holds true so long as every
digit but the least significant 6 digits of their binary
representations are equal. By making sure the last 6 digits are all
the same (ex: 0), I can return true if both pointers are equal. Well,
at least I hope so.
You should be able to check if they're in the same 64 block of memory by something like this:
if ((char *)high_pointer - (char *)low_pointer < 64) {
// do stuff
}
Edit2: This is likely to be undefined behaviour as pointed out by chris.
Original post:
You're probably looking for intptr_t or uintptr_t. The standard says you can cast to and from these types to pointers and have the value equal to the original.
However, despite it being a standard type, it is optional so some library implementations may choose not to implement it. Some architectures might not even represent pointers as integers so such a type wouldn't make sense.
It is still better than casting to and from an int or a long since it is guaranteed to work on implementations that supply it. Otherwise, at least you'll know at compile time that your program will break on a certain implementation/architecture.
(Oh, and as other answers have stated, manually changing the pointer when casted to an integer type and dereferencing it is undefined behaviour)

How is {int i=999; char c=i;} different from {char c=999;}?

My friend says he read it on some page on SO that they are different,but how could the two be possibly different?
Case 1
int i=999;
char c=i;
Case 2
char c=999;
In first case,we are initializing the integer i to 999,then initializing c with i,which is in fact 999.In the second case, we initialize c directly with 999.The truncation and loss of information aside, how on earth are these two cases different?
EDIT
Here's the link that I was talking of
why no overflow warning when converting int to char
One member commenting there says --It's not the same thing. The first is an assignment, the second is an initialization
So isn't it a lot more than only a question of optimization by the compiler?
They have the same semantics.
The constant 999 is of type int.
int i=999;
char c=i;
i created as an object of type int and initialized with the int value 999, with the obvious semantics.
c is created as an object of type char, and initialized with the value of i, which happens to be 999. That value is implicitly converted from int to char.
The signedness of plain char is implementation-defined.
If plain char is an unsigned type, the result of the conversion is well defined. The value is reduced modulo CHAR_MAX+1. For a typical implementation with 8-bit bytes (CHAR_BIT==8), CHAR_MAX+1 will be 256, and the value stored will be 999 % 256, or 231.
If plain char is a signed type, and 999 exceeds CHAR_MAX, the conversion yields an implementation-defined result (or, starting with C99, raises an implementation-defined signal, but I know of no implementations that do that). Typically, for a 2's-complement system with CHAR_BIT==8, the result will be -25.
char c=999;
c is created as an object of type char. Its initial value is the int value 999 converted to char -- by exactly the same rules I described above.
If CHAR_MAX >= 999 (which can happen only if CHAR_BIT, the number of bits in a byte, is at least 10), then the conversion is trivial. There are C implementations for DSPs (digital signal processors) with CHAR_BIT set to, for example, 32. It's not something you're likely to run across on most systems.
You may be more likely to get a warning in the second case, since it's converting a constant expression; in the first case, the compiler might not keep track of the expected value of i. But a sufficiently clever compiler could warn about both, and a sufficiently naive (but still fully conforming) compiler could warn about neither.
As I said above, the result of converting a value to a signed type, when the source value doesn't fit in the target type, is implementation-defined. I suppose it's conceivable that an implementation could define different rules for constant and non-constant expressions. That would be a perverse choice, though; I'm not sure even the DS9K does that.
As for the referenced comment "The first is an assignment, the second is an initialization", that's incorrect. Both are initializations; there is no assignment in either code snippet. There is a difference in that one is an initialization with a constant value, and the other is not. Which implies, incidentally, that the second snippet could appear at file scope, outside any function, while the first could not.
Any optimizing compiler will just make the int i = 999 local variable disappear and assign the truncated value directly to c in both cases. (Assuming that you are not using i anywhere else)
It depends on your compiler and optimization settings. Take a look at the actual assembly listing to see how different they are. For GCC and reasonable optimizations, the two blocks of code are probably equivalent.
Aside from the fact that the first also defines an object iof type int, the semantics are identical.
i,which is in fact 999
No, i is a variable. Semantically, it doesn't have a value at the point of the initialization of c ... the value won't be known until runtime (even though we can clearly see what it will be, and so can an optimizing compiler). But in case 2 you're assigning 999 to a char, which doesn't fit, so the compiler issues a warning.

What are the common undefined/unspecified behavior for C that you run into? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
An example of unspecified behavior in the C language is the order of evaluation of arguments to a function. It might be left to right or right to left, you just don't know. This would affect how foo(c++, c) or foo(++c, c) gets evaluated.
What other unspecified behavior is there that can surprise the unaware programmer?
A language lawyer question. Hmkay.
My personal top3:
violating the strict aliasing rule
violating the strict aliasing rule
violating the strict aliasing rule
:-)
Edit Here is a little example that does it wrong twice:
(assume 32 bit ints and little endian)
float funky_float_abs (float a)
{
unsigned int temp = *(unsigned int *)&a;
temp &= 0x7fffffff;
return *(float *)&temp;
}
That code tries to get the absolute value of a float by bit-twiddling with the sign bit directly in the representation of a float.
However, the result of creating a pointer to an object by casting from one type to another is not valid C. The compiler may assume that pointers to different types don't point to the same chunk of memory. This is true for all kind of pointers except void* and char* (sign-ness does not matter).
In the case above I do that twice. Once to get an int-alias for the float a, and once to convert the value back to float.
There are three valid ways to do the same.
Use a char or void pointer during the cast. These always alias to anything, so they are safe.
float funky_float_abs (float a)
{
float temp_float = a;
// valid, because it's a char pointer. These are special.
unsigned char * temp = (unsigned char *)&temp_float;
temp[3] &= 0x7f;
return temp_float;
}
Use memcopy. Memcpy takes void pointers, so it will force aliasing as well.
float funky_float_abs (float a)
{
int i;
float result;
memcpy (&i, &a, sizeof (int));
i &= 0x7fffffff;
memcpy (&result, &i, sizeof (int));
return result;
}
The third valid way: use unions. This is explicitly not undefined since C99:
float funky_float_abs (float a)
{
union
{
unsigned int i;
float f;
} cast_helper;
cast_helper.f = a;
cast_helper.i &= 0x7fffffff;
return cast_helper.f;
}
My personal favourite undefined behaviour is that if a non-empty source file doesn't end in a newline, behaviour is undefined.
I suspect it's true though that no compiler I will ever see has treated a source file differently according to whether or not it is newline terminated, other than to emit a warning. So it's not really something that will surprise unaware programmers, other than that they might be surprised by the warning.
So for genuine portability issues (which mostly are implementation-dependent rather than unspecified or undefined, but I think that falls into the spirit of the question):
char is not necessarily (un)signed.
int can be any size from 16 bits.
floats are not necessarily IEEE-formatted or conformant.
integer types are not necessarily two's complement, and integer arithmetic overflow causes undefined behaviour (modern hardware won't crash, but some compiler optimizations will result in behavior different from wraparound even though that's what the hardware does. For example if (x+1 < x) may be optimized as always false when x has signed type: see -fstrict-overflow option in GCC).
"/", "." and ".." in a #include have no defined meaning and can be treated differently by different compilers (this does actually vary, and if it goes wrong it will ruin your day).
Really serious ones that can be surprising even on the platform you developed on, because behaviour is only partially undefined / unspecified:
POSIX threading and the ANSI memory model. Concurrent access to memory is not as well defined as novices think. volatile doesn't do what novices think. Order of memory accesses is not as well defined as novices think. Accesses can be moved across memory barriers in certain directions. Memory cache coherency is not required.
Profiling code is not as easy as you think. If your test loop has no effect, the compiler can remove part or all of it. inline has no defined effect.
And, as I think Nils mentioned in passing:
VIOLATING THE STRICT ALIASING RULE.
My favorite is this:
// what does this do?
x = x++;
To answer some comments, it is undefined behaviour according to the standard. Seeing this, the compiler is allowed to do anything up to and including format your hard drive.
See for example this comment here. The point is not that you can see there is a possible reasonable expectation of some behaviour. Because of the C++ standard and the way the sequence points are defined, this line of code is actually undefined behaviour.
For example, if we had x = 1 before the line above, then what would the valid result be afterwards? Someone commented that it should be
x is incremented by 1
so we should see x == 2 afterwards. However this is not actually true, you will find some compilers that have x == 1 afterwards, or maybe even x == 3. You would have to look closely at the generated assembly to see why this might be, but the differences are due to the underlying problem. Essentially, I think this is because the compiler is allowed to evaluate the two assignments statements in any order it likes, so it could do the x++ first, or the x = first.
Dividing something by a pointer to something. Just won't compile for some reason... :-)
result = x/*y;
Another issue I encountered (which is defined, but definitely unexpected).
char is evil.
signed or unsigned depending on what the compiler feels
not mandated as 8 bits
I can't count the number of times I've corrected printf format specifiers to match their argument. Any mismatch is undefined behavior.
No, you must not pass an int (or long) to %x - an unsigned int is required
No, you must not pass an unsigned int to %d - an int is required
No, you must not pass a size_t to %u or %d - use %zu
No, you must not print a pointer with %d or %x - use %p and cast to a void *
I've seen a lot of relatively inexperienced programmers bitten by multi-character constants.
This:
"x"
is a string literal (which is of type char[2] and decays to char* in most contexts).
This:
'x'
is an ordinary character constant (which, for historical reasons, is of type int).
This:
'xy'
is also a perfectly legal character constant, but its value (which is still of type int) is implementation-defined. It's a nearly useless language feature that serves mostly to cause confusion.
A compiler doesn't have to tell you that you're calling a function with the wrong number of parameters/wrong parameter types if the function prototype isn't available.
The clang developers posted some great examples a while back, in a post every C programmer should read. Some interesting ones not mentioned before:
Signed integer overflow - no it's not ok to wrap a signed variable past its max.
Dereferencing a NULL Pointer - yes this is undefined, and might be ignored, see part 2 of the link.
The EE's here just discovered that a>>-2 is a bit fraught.
I nodded and told them it was not natural.
Be sure to always initialize your variables before you use them! When I had just started with C, that caused me a number of headaches.

Safely punning char* to double in C

In an Open Source program I
wrote, I'm reading binary data (written by another program) from a file and outputting ints, doubles,
and other assorted data types. One of the challenges is that it needs to
run on 32-bit and 64-bit machines of both endiannesses, which means that I
end up having to do quite a bit of low-level bit-twiddling. I know a (very)
little bit about type punning and strict aliasing and want to make sure I'm
doing things the right way.
Basically, it's easy to convert from a char* to an int of various sizes:
int64_t snativeint64_t(const char *buf)
{
/* Interpret the first 8 bytes of buf as a 64-bit int */
return *(int64_t *) buf;
}
and I have a cast of support functions to swap byte orders as needed, such
as:
int64_t swappedint64_t(const int64_t wrongend)
{
/* Change the endianness of a 64-bit integer */
return (((wrongend & 0xff00000000000000LL) >> 56) |
((wrongend & 0x00ff000000000000LL) >> 40) |
((wrongend & 0x0000ff0000000000LL) >> 24) |
((wrongend & 0x000000ff00000000LL) >> 8) |
((wrongend & 0x00000000ff000000LL) << 8) |
((wrongend & 0x0000000000ff0000LL) << 24) |
((wrongend & 0x000000000000ff00LL) << 40) |
((wrongend & 0x00000000000000ffLL) << 56));
}
At runtime, the program detects the endianness of the machine and assigns
one of the above to a function pointer:
int64_t (*slittleint64_t)(const char *);
if(littleendian) {
slittleint64_t = snativeint64_t;
} else {
slittleint64_t = sswappedint64_t;
}
Now, the tricky part comes when I'm trying to cast a char* to a double. I'd
like to re-use the endian-swapping code like so:
union
{
double d;
int64_t i;
} int64todouble;
int64todouble.i = slittleint64_t(bufoffset);
printf("%lf", int64todouble.d);
However, some compilers could optimize away the "int64todouble.i" assignment
and break the program. Is there a safer way to do this, while considering
that this program must stay optimized for performance, and also that I'd
prefer not to write a parallel set of transformations to cast char* to
double directly? If the union method of punning is safe, should I be
re-writing my functions like snativeint64_t to use it?
I ended up using Steve Jessop's answer because the conversion functions re-written to use memcpy, like so:
int64_t snativeint64_t(const char *buf)
{
/* Interpret the first 8 bytes of buf as a 64-bit int */
int64_t output;
memcpy(&output, buf, 8);
return output;
}
compiled into the exact same assembler as my original code:
snativeint64_t:
movq (%rdi), %rax
ret
Of the two, the memcpy version more explicitly expresses what I'm trying to do and should work on even the most naive compilers.
Adam, your answer was also wonderful and I learned a lot from it. Thanks for posting!
I highly suggest you read Understanding Strict Aliasing. Specifically, see the sections labeled "Casting through a union". It has a number of very good examples. While the article is on a website about the Cell processor and uses PPC assembly examples, almost all of it is equally applicable to other architectures, including x86.
Since you seem to know enough about your implementation to be sure that int64_t and double are the same size, and have suitable storage representations, you might hazard a memcpy. Then you don't even have to think about aliasing.
Since you're using a function pointer for a function that might easily be inlined if you were willing to release multiple binaries, performance must not be a huge issue anyway, but you might like to know that some compilers can be quite fiendish optimising memcpy - for small integer sizes a set of loads and stores can be inlined, and you might even find the variables are optimised away entirely and the compiler does the "copy" simply be reassigning the stack slots it's using for the variables, just like a union.
int64_t i = slittleint64_t(buffoffset);
double d;
memcpy(&d,&i,8); /* might emit no code if you're lucky */
printf("%lf", d);
Examine the resulting code, or just profile it. Chances are even in the worst case it will not be slow.
In general, though, doing anything too clever with byteswapping results in portability issues. There exist ABIs with middle-endian doubles, where each word is little-endian, but the big word comes first.
Normally you could consider storing your doubles using sprintf and sscanf, but for your project the file formats aren't under your control. But if your application is just shovelling IEEE doubles from an input file in one format to an output file in another format (not sure if it is, since I don't know the database formats in question, but if so), then perhaps you can forget about the fact that it's a double, since you aren't using it for arithmetic anyway. Just treat it as an opaque char[8], requiring byteswapping only if the file formats differ.
The standard says that writing to one field of a union and reading from it immediately is undefined behaviour. So if you go by the rule book, the union based method won't work.
Macros are usually a bad idea, but this might be an exception to the rule. It should be possible to get template-like behaviour in C using a set of macros using the input and output types as parameters.
As a very small sub-suggestion, I suggest you investigate if you can swap the masking and the shifting, in the 64-bit case. Since the operation is swapping bytes, you should be able to always get away with a mask of just 0xff. This should lead to faster, more compact code, unless the compiler is smart enough to figure that one out itself.
In brief, changing this:
(((wrongend & 0xff00000000000000LL) >> 56)
into this:
((wrongend >> 56) & 0xff)
should generate the same result.
Edit:
Removed comments regarding how to effectively store data always big endian and swapping to machine endianess, as questioner hasn't mentioned another program writes his data (which is important information).Still if the data needs conversion from any endian to big and from big to host endian, ntohs/ntohl/htons/htonl are the best methods, most elegant and unbeatable in speed (as they will perform task in hardware if CPU supports that, you can't beat that).
Regarding double/float, just store them to ints by memory casting:
double d = 3.1234;
printf("Double %f\n", d);
int64_t i = *(int64_t *)&d;
// Now i contains the double value as int
double d2 = *(double *)&i;
printf("Double2 %f\n", d2);
Wrap it into a function
int64_t doubleToInt64(double d)
{
return *(int64_t *)&d;
}
double int64ToDouble(int64_t i)
{
return *(double *)&i;
}
Questioner provided this link:
http://cocoawithlove.com/2008/04/using-pointers-to-recast-in-c-is-bad.html
as a prove that casting is bad... unfortunately I can only strongly disagree with most of this page. Quotes and comments:
As common as casting through a pointer
is, it is actually bad practice and
potentially risky code. Casting
through a pointer has the potential to
create bugs because of type punning.
It is not risky at all and it is also not bad practice. It has only a potential to cause bugs if you do it incorrectly, just like programming in C has the potential to cause bugs if you do it incorrectly, so does any programming in any language. By that argument you must stop programming altogether.
Type punning A form of pointer
aliasing where two pointers and refer
to the same location in memory but
represent that location as different
types. The compiler will treat both
"puns" as unrelated pointers. Type
punning has the potential to cause
dependency problems for any data
accessed through both pointers.
This is true, but unfortunately totally unrelated to my code.
What he refers to is code like this:
int64_t * intPointer;
:
// Init intPointer somehow
:
double * doublePointer = (double *)intPointer;
Now doublePointer and intPointer both point to the same memory location, but treating this as the same type. This is the situation you should solve with a union indeed, anything else is pretty bad. Bad that is not what my code does!
My code copies by value, not by reference. I cast a double to int64 pointer (or the other way round) and immediately deference it. Once the functions return, there is no pointer held to anything. There is a int64 and a double and these are totally unrelated to the input parameter of the functions. I never copy any pointer to a pointer of a different type (if you saw this in my code sample, you strongly misread the C code I wrote), I just transfer the value to a variable of different type (in an own memory location). So the definition of type punning does not apply at all, as it says "refer to the same location in memory" and nothing here refers to the same memory location.
int64_t intValue = 12345;
double doubleValue = int64ToDouble(intValue);
// The statement below will not change the value of doubleValue!
// Both are not pointing to the same memory location, both have their
// own storage space on stack and are totally unreleated.
intValue = 5678;
My code is nothing more than a memory copy, just written in C without an external function.
int64_t doubleToInt64(double d)
{
return *(int64_t *)&d;
}
Could be written as
int64_t doubleToInt64(double d)
{
int64_t result;
memcpy(&result, &d, sizeof(d));
return result;
}
It's nothing more than that, so there is no type punning even in sight anywhere. And this operation is also totally safe, as safe as an operation can be in C. A double is defined to always be 64 Bit (unlike int it does not vary in size, it is fixed at 64 bit), hence it will always fit into a int64_t sized variable.

Resources