Fast floating point abs function - c

What is the fastest way to take the absolute value of a standard 32 bit float on x86-64 architectures in C99? The builtin functions fabsf and fabs are not fast enough. My current approach is bit twiddling:
unsigned int tmp = *((unsigned int *)&f) & 0x7fffffff;
float abs = *((float *)&tmp);
It works but is ugly. And I'm not sure it is optimal?
Please stop telling me about type-punned pointers because it's not what I'm asking about. I know the code can be phrased using unions but it doesn't matter because on all compilers (written in the last 10 years) it will emit exactly the same code.

Fewer standard violations:
#include <math.h>   /* fabsf */
#include <stdint.h> /* uint32_t */
/* use type punning through a union instead of pointer casts, so that alignment is handled properly */
static inline float float2absf(float f) {
    /* the optimizer will remove the `if` statement and the unused branch */
    if (sizeof(float) == sizeof(uint32_t)) {
        union {
            float f;
            uint32_t i;
        } u;
        u.f = f;
        u.i &= 0x7fffffff;
        return u.f;
    }
    return fabsf(f);
}
IMHO, it would be safer to use the library function. This will improve code portability, especially on platforms where you might encounter a non-IEEE float representation or where type sizes might differ.
In general, once compiled for your platform, the library function should provide the fastest solution.
Having said that, library calls require both stack management and code jumps unless optimized away, which, for a simple bit-altering function, could result in more than twice the number of operations as well as cache misses. In many cases this is avoidable by using compiler builtins, which the compiler may apply automatically (it can optimize library functions into inline instructions).
Your bit-approach is (in theory) correct and could optimize away the operations related to function calls, as well as improve code locality... although the same could be achieved using compiler builtins and optimizations.
Also, please note that your approach isn't standard compliant and it assumes that sizeof(int) == sizeof(float)... I think that type punning using a union will improve that a little bit.
In addition, using an inline function works much like a macro while making the code more readable. It also allows a fallback to the library function if the type sizes don't match.
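As a point of comparison, the same bit-clearing can also be written with memcpy, which avoids both the pointer cast and the union. This is only a sketch and not code from the posts above: the name abs_bits is hypothetical, and it assumes float is a 32-bit IEEE 754 value. Optimizing compilers generally turn the two memcpy calls into plain register moves, but that is worth confirming in the generated assembly for your own toolchain.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: absolute value by clearing the sign bit.
   Assumes float is a 32-bit IEEE 754 binary32 value. */
static inline float abs_bits(float f)
{
    uint32_t u;

    /* Constant condition; compilers fold it away when it holds. */
    assert(sizeof f == sizeof u);

    memcpy(&u, &f, sizeof u);    /* copy the object representation out */
    u &= UINT32_C(0x7fffffff);   /* clear bit 31, the sign bit */
    memcpy(&f, &u, sizeof f);    /* copy the modified bits back */
    return f;
}
```

If the disassembly shows the same instructions as fabsf, the library function is the simpler choice.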

Related

Efficient conversion from Indeterminate Value to Unspecified Value

Sometimes in C it is necessary to read a possibly-written item from a partially-written array, such that:
If the item has been written, the read will yield the value that was in fact written, and
If the item hasn't been written, the read will convert an Unspecified bit pattern to a value of the appropriate type, with no side-effects.
There are a variety of algorithms where finding a solution from scratch is expensive, but validating a proposed solution is cheap. If an array holds solutions for all cases where they have been found, and arbitrary bit patterns for other cases, reading the array, testing whether it holds a valid solution, and slowly computing the solution only if the one in the array isn't valid, may be a useful optimization.
If an attempt to read a non-written array element of a type like uint32_t could be guaranteed to always yield a value of the appropriate type efficiently, such an approach would be easy and straightforward. Even if that requirement only held for unsigned char, it might still be workable. Unfortunately, compilers sometimes behave as though reading an Indeterminate Value, even of type unsigned char, may yield something that doesn't behave consistently as a value of that type. Further, discussions in a Defect Report suggest that operations involving Indeterminate values yield Indeterminate results, so even given something like unsigned char x, *p=&x; unsigned y=*p & 255; unsigned z=(y < 256); it would be possible for z to receive the value 0.
From what I can tell, the function:
unsigned char solidify(unsigned char *p)
{
unsigned char result = 0;
unsigned char mask = 1;
do
{
if (*p & mask) result |= mask;
mask += (unsigned)mask; // Cast only needed for capricious type ranges
} while(mask);
return result;
}
would be guaranteed to always yield a value in the range of type unsigned char any time the storage identified can be accessed as that type, even if it happens to hold Indeterminate Value. Such an approach seems rather slow and clunky, however, given that the required machine code to obtain the desired effect should usually be equivalent to returning *p.
Are there any better approaches that would be guaranteed by the Standard to always yield a value within the range of unsigned char, even if the source value is Indeterminate?
Addendum
The ability to solidify values is necessary, among other things, when performing I/O with partially-written arrays and structures, in cases where nothing will care about what bits get output for the parts that were never set. Whether or not the Standard would require that fwrite be usable with partially-written structures or arrays, I would regard I/O routines that can be used in such fashion (writing arbitrary values for portions that weren't set) to be of higher quality than those which might jump the rails in such cases.
My concern is largely with guarding against optimizations which are unlikely to be used in dangerous combinations, but which could nonetheless occur as compilers get more and more "clever".
A problem with something like:
unsigned char solidify_alt(unsigned char *p)
{ return *p; }
is that compilers may combine an optimization which could be troublesome but tolerable in isolation, with one that would be good in isolation but deadly in combination with the first:
If the function is passed the address of an unsigned char which has been optimized to e.g. a 32-bit register, a function like the above may blindly return the contents of that register without clipping it to the range 0-255. Requiring that callers manually clip the results of such functions would be annoying but survivable if that were the only problem. Unfortunately...
Since the above function will "always" return a value 0-255, compilers may omit any "downstream" code that would try to mask the value into that range, check if it was outside, or otherwise do things that would be irrelevant for values outside the range 0-255.
Some I/O devices may require that code wishing to write an octet perform a 16-bit or 32-bit store to an I/O register, and may require that 8 bits contain the data to be written and other bits hold a certain pattern. They may malfunction badly if any of the other bits are set wrong. Consider the code:
void send_bytes(unsigned char *p, unsigned int n)
{
    while (n--)
        OUTPUT_REG = solidify_alt(p++) | 0x0200;
}
void send_string4(char *st)
{
    unsigned char buff[5]; // Leave space for zero after 4-byte string
    strcpy((char*)buff, st);
    send_bytes(buff, 4);
}
with the intended semantics that send_string4("Ok"); should send out an 'O', a 'k', a zero byte, and an arbitrary value 0-255. Since the code uses solidify_alt rather than solidify, a compiler could legally turn that into:
void send_string4(char *st)
{
unsigned buff0, buff1, buff2, buff3;
buff0 = st[0]; if (!buff0) goto STRING_DONE;
buff1 = st[1]; if (!buff1) goto STRING_DONE;
buff2 = st[2]; if (!buff2) goto STRING_DONE;
buff3 = st[3];
STRING_DONE:
OUTPUT_REG = buff0 | 0x0200;
OUTPUT_REG = buff1 | 0x0200;
OUTPUT_REG = buff2 | 0x0200;
OUTPUT_REG = buff3 | 0x0200;
}
with the effect that OUTPUT_REG may receive values with bits set outside the proper range. Even if the output expression were changed to (((unsigned char)solidify_alt(p++) | 0x0200) & 0x02FF), a compiler could still simplify that to yield the code given above.
The authors of the Standard refrained from requiring compiler-generated initialization of automatic variables because it would have made code slower in cases where such initialization would be semantically unnecessary. I don't think they intended that programmers should have to manually initialize automatic variables in cases where all bit patterns would be equally acceptable.
Note, btw, that when dealing with short arrays, initializing all the values will be inexpensive and would often be a good idea, and when using large arrays a compiler would be unlikely to impose the above "optimization". Omitting the initialization in cases where the array is large enough that the cost matters, however, would make the program's correct operation reliant upon "hope".
This is not an answer, but an extended comment.
The immediate solution would be for the compiler to provide a built-in, for example assume_initialized(variable [, variable ... ]*), that generates no machine code, but simply makes the compiler treat the contents of the specified variables (either scalars or arrays) as defined but unknown.
One can achieve a similar effect using a dummy function defined in another compilation unit, for example
void define_memory(void *ptr, size_t bytes)
{
/* Nothing! */
}
and calling that (e.g. define_memory(some_array, sizeof some_array)), to stop the compiler from treating the values in the array as indeterminate; this works because at compile time, the compiler cannot determine whether the values are unspecified or not, and therefore must consider them specified (defined but unknown).
Unfortunately, that has serious performance penalties. The call itself, even though the function body is empty, has a performance impact. However, worse yet is the effect on the code generation: because the array is accessed in a separate compilation unit, the data must actually reside in memory in array form, and thus typically generates extra memory accesses, plus restricts the optimization opportunities for the compiler. In particular, even a small array must then exist, and cannot be implicit or reside completely in machine registers.
I have experimented with a few architecture-specific (x86-64) and compiler-specific (GCC) workarounds, using extended inline assembly to fool the compiler into believing that the values are defined but unknown (unspecified, as opposed to indeterminate) without generating actual machine code, since this needs only a small adjustment to how the compiler treats the arrays or variables, but with about zero success.
Now, to the underlying reason why I wrote this comment.
Years and years ago, working on numerical computation code and comparing performance to a similar implementation in Fortran 95, I discovered the lack of a memrepeat(ptr, first, bytes) function: the counterpart to memmove() with respect to memcpy(), that would repeat first bytes at ptr to ptr+first up to ptr+bytes-1. Like memmove(), it would work on the storage representation of the data, so even if the ptr to ptr+first contained a trap representation, no trap would actually trigger.
Main use case is to initialize arrays with floating-point data (one-dimensional, multidimensional, or structures with floating-point members), by initializing the first structure or group of values, and then simply repeating the storage pattern over the entire array. This is a very common pattern in numerical computation.
As an example, using
double nums[7] = { 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0 };
memrepeat(nums, 2 * sizeof nums[0], sizeof nums);
yields
double nums[7] = { 7.0, 6.0, 7.0, 6.0, 7.0, 6.0, 7.0 };
(It is possible that the compiler could optimize the operation even better, if it was defined as e.g. memsetall(data, size, count), where size is the size of the duplicated storage unit, and count the total number of storage units (so count-1 units are actually copied). In particular, this allows easy implementation that uses nontemporal stores for the copies, reading from the initial storage unit. On the other hand, memsetall() can only copy full storage units unlike memrepeat(), so memsetall(nums, 2 * sizeof nums[0], 3); would leave the 7th element in nums[] unchanged -- i.e., in the above example, it'd yield { 7.0, 6.0, 7.0, 6.0, 7.0, 6.0, 1.0 }.)
Although you can trivially implement memrepeat() or memsetall(), even optimize them for a specific architecture and compiler, it is difficult to write a portable optimized version.
In particular, loop-based implementations that use memcpy() (or memmove()) yield quite inefficient code when compiled by e.g. GCC, because the compiler cannot coalesce a pattern of function calls into a single operation.
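For reference, a loop-based memrepeat() of the kind described might look like the sketch below. It is a plain portable version under my own assumptions about the semantics (the signature follows the memrepeat(ptr, first, bytes) description above), not an optimized or standardized implementation. It doubles the filled region on each pass so that the number of memcpy() calls grows only logarithmically with the array size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical memrepeat(): repeat the first `first` bytes at ptr
   until `bytes` bytes in total have been filled. */
static void memrepeat(void *ptr, size_t first, size_t bytes)
{
    unsigned char *p = ptr;
    size_t filled = first;

    assert(ptr != NULL);
    if (first == 0 || bytes <= first)
        return;                       /* nothing to repeat */

    while (filled < bytes) {
        size_t chunk = filled;        /* copy everything filled so far */
        if (chunk > bytes - filled)
            chunk = bytes - filled;   /* final, possibly partial, chunk */
        memcpy(p + filled, p, chunk); /* source and destination never overlap */
        filled += chunk;
    }
}
```

Because the filled length stays a multiple of first until the last chunk, the pattern stays aligned, and the nums example above ends up as { 7.0, 6.0, 7.0, 6.0, 7.0, 6.0, 7.0 }.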
Most compilers inline memcpy() and memmove() with internal versions optimized for the target and use case, and doing the same for such a memrepeat() and/or memsetall() function would make it portable. In Linux on x86-64, GCC inlines known-size calls, but keeps the function calls where the size is only known at runtime.
I did try to push it upstream, with some private and some public discussions on various mailing lists. The response was cordial, but clear: there is no way to get such features included into compilers, unless it is standardized by someone first, or you pique the interest of one of the core developers enough so that they want to try it themselves.
Because the C standards committee is only concerned with fulfilling the commercial interests of its corporate sponsors, there is zero chance of getting anything like that standardized into ISO C. (If there were, we really should push for basic features from POSIX like getline(), regex, and iconv to be included first; they'd have a much bigger positive impact on code we can teach new C programmers.)
None of this piqued the interest of the core GCC developers either, so at that point, I lost my interest in trying to push it upstream.
If my experience is typical -- and discussing it with a few people it does seem like it is --, OP and others worrying about such things will better utilize their time to find compiler/architecture-specific workarounds, rather than point out the deficiencies in the standard: the standard is already lost, those people do not care.
Better spend your time and efforts in something you can actually accomplish without having to fight against windmills.
I think this is pretty clear. C11 3.19.2
indeterminate value: either an unspecified value or a trap representation
Period. It cannot be anything else than the two cases above.
So code such as unsigned z=(y < 256) can never return 0, because x in your example cannot hold a value larger than 255. As per the representation of character types, 6.2.6, an unsigned char is not allowed to contain padding bits or trap representations.
Other types, on wildly exotic systems, could in theory hold values outside their range, padding bits and trap representations.
On real-world systems, that are extremely likely to use two's complement, trap representations do not exist. So the indeterminate value can only be unspecified. Unspecified, not undefined! There is a myth saying that "reading an indeterminate value is always undefined behavior". Save for trap representations and some other special cases, this is not true, see this. This is merely unspecified behavior.
Unspecified behavior does not mean that the compiler can wreak havoc and make weird assumptions, as it can when it encounters undefined behavior. It will have to assume that the variable values are in range. What the compiler cannot assume is that the value is the same between reads - this was addressed by some DR.

C long double in golang

I am porting an algorithm from C to Go. And I got a little bit confused. This is the C function:
void gauss_gen_cdf(uint64_t cdf[], long double sigma, int n)
{
int i;
long double s, d, e;
//Calculations ...
for (i = 1; i < n - 1; i++) {
cdf[i] = s;
}
}
In the for loop, the value "s" is assigned to element "i" of the array cdf. How is this possible? As far as I know, a long double is a float64 (in the Go context). So I shouldn't be able to compile the C code, because I am assigning a long double to an array which contains only uint64 elements. But the C code works fine.
So can someone please explain why this is working?
Thank you very much.
UPDATE:
The original C code of the function can be found here: https://github.com/mjosaarinen/hilabliss/blob/master/distribution.c#L22
The assignment cdf[i] = s performs an implicit conversion to uint64_t. It's hard to tell if this is intended without the calculations you omitted.
In practice, long double as a type has considerable variance across architectures. Whether Go's float64 is an appropriate replacement depends on the architecture you are porting from. For example, on x86, long double is an 80-bit extended precision type, but Windows systems are usually configured to compute results with only the 53-bit mantissa, which means that float64 could still be equivalent for your purposes.
EDIT In this particular case, the values computed by the sources appear to be static and independent of the input. I would just use float64 on the Go side and see if the computed values are identical to those of the C version, when run on an x86 machine under real GNU/Linux (virtualization should be okay), to work around the Windows FPU issues. The choice of x86 is just a guess because it is likely what the original author used. I do not understand the underlying cryptography, so I can't say whether a difference in the computed values impacts the security. (Also note that the C code does not seem to properly seed its PRNG.)
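Whether long double actually carries more precision than double on a given C toolchain can be checked with the macros in <float.h>. The helper names below are hypothetical and the sketch assumes a radix-2 floating-point format; it only reports what the compiler provides and says nothing about how the FPU is configured at runtime.

```c
#include <assert.h>
#include <float.h>

/* Number of significand bits in long double on this toolchain.
   LDBL_MANT_DIG counts base-FLT_RADIX digits, so radix 2 is assumed. */
static int long_double_mantissa_bits(void)
{
    assert(FLT_RADIX == 2);
    return LDBL_MANT_DIG;
}

/* Nonzero when long double has more precision than double,
   i.e. when a float64 port could lose bits. */
static int long_double_is_wider(void)
{
    return LDBL_MANT_DIG > DBL_MANT_DIG;
}
```

With GCC on x86-64 Linux this typically reports 64 bits. Where it reports 53, long double and double have the same precision and float64 on the Go side loses nothing.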
C long double in golang
The title suggests an interest in whether or not Go has an extended precision floating-point type similar to long double in C.
The answer is:
Not as a primitive, see Basic types.
But arbitrary precision is supported by the math/big library.
Why this is working?
long double s = some_calculation();
uint64_t a = s;
It compiles because, unlike Go, C allows for certain implicit type conversions. The integer portion of the floating-point value of s will be copied. Presumably the s value has been scaled such that it can be interpreted as a fixed-point value where, based on the linked library source, 0xFFFFFFFFFFFFFFFF (2^64-1) represents the value 1.0. In order to make the most of such assignments, it may be worthwhile to have used an extended floating-point type with 64 precision bits.
If I had to guess, I would say that the (crypto-related) library is using fixed-point here because they want to ensure deterministic results, see: How can floating point calculations be made deterministic?. And since the extended-precision floating point is only being used for initializing a lookup table, using the (presumably slow) math/big library would likely perform perfectly fine in this context.
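The implicit conversion and the fixed-point scaling described above can be illustrated with a small sketch. The function names are hypothetical and the scale factor of 2^64 is my assumption for illustration; the linked library may scale differently. Converting a value that does not fit in uint64_t is undefined behavior in C, hence the range check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map x in [0, 1) to 64-bit fixed point,
   where 2^64 stands for 1.0 (an assumption for this sketch). */
static uint64_t to_fixed64(long double x)
{
    assert(x >= 0.0L && x < 1.0L);   /* out-of-range conversion is undefined */
    return (uint64_t)(x * 0x1p64L);  /* scale by 2^64, then truncate toward zero */
}

/* Plain assignment performs the same conversion implicitly,
   as cdf[i] = s does in the question. */
static uint64_t truncate_implicitly(long double s)
{
    uint64_t out = s;   /* the fractional part is discarded */
    return out;
}
```

In Go the equivalent conversion has to be spelled out, for example as uint64(s), because Go has no implicit numeric conversions.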

What is better, macros or inline functions for square or sqrt of constants?

In algorithms I am using such constants:
sqrt(3)
pow(M_PI, 2)
Unfortunately the C preprocessor is not smart enough to pre-compute these constants. Is there any additional layer of preprocessing that can be used with GCC or any other C compiler?
I have currently implemented these two constants as:
#define SQRT3 1.7320508
#define PIPI (M_PI*M_PI)
But I feel using obscure names like PIPI (which also means pee in French) is not the best solution. I think it would be better to write:
inline float square(float x) {
return x * x;
}
However, this is not possible for the square root. At least I can get a sufficient approximation with this:
inline float sqrt_approx(float z)
{
int val_int = *(int*)&z;
val_int -= 1 << 23;
val_int >>= 1;
val_int += 1 << 29;
return *(float*)&val_int;
}
Unfortunately, and again, the compiler is not smart enough to interpret sqrt_approx(3) as 1.73
Is there a better way to deal with these C limitations?
We are in the year 2015, we have rovers that roam on Mars, and we are still dealing with C compilers that make us feel like we are in the 80s. Am I wrong?
Without -ffreestanding and the like, GCC computes them at compile time, at least with optimizations turned on. So there is unlikely to be a function call.
If -ffreestanding is used, I don't see a better way than defining the constants manually for cases like sqrt, if a self-made inline function turns out to be insufficiently fast. GCC's const attribute may help to avoid re-computations (but I guess GCC can infer that on its own if the definition is visible).
The question said they should be computed by the preprocessor. I don't see a reason for this, the only thing one can do with floating-point constants in the preprocessor, is to stringify or concatenate them. If this is really needed, they need to be hard-coded, too. An inline function also cannot be called by the preprocessor.
Those constants are true constants (not to be confused with const variables). You'd want your code to use them directly at runtime, and you certainly do not want a call to the sqrt or pow function that is basically useless because you already know the result at compile time.
If you want to be sure that there are no useless calls, you should use macros with the C preprocessor for this. And you are not wrong, C compilers can sometimes make us feel like we are in the 80s. There are many other, more modern programming languages available.
However, as compilers are getting better at optimizing programs, it is also possible that the compiler will inline functions and then pre-compute them at compile time. The only way to know whether it does is to test and look at the generated assembly. For instance, my test program:
#include <stdio.h>
static inline int twice(int x)
{
    return 2*x;
}
int main(void)
{
    int i = twice(2);
    i += twice(4);
    printf("Result is %d\n", twice(i));
    return 0;
}
Compiles with latest gcc and -Os turned on to :
main:
sub rsp, 40
.seh_stackalloc 40
.seh_endprologue
call __main
lea rcx, .LC0[rip]
mov edx, 24
call printf
xor eax, eax
add rsp, 40
ret
As you can see, the result, 24, is pre-computed in the assembly code. With the double type it's less obvious to prove, because floating-point numbers don't appear directly in the assembly, but I checked and the optimisation is also made. So it is no longer necessary to use the C preprocessor for constants. But if you want performance, always check the assembly code.
I would suggest using named constants, so that the compiler is aware of them.
And pick more explicit names for your constants.
const float SQRT_OF_3 = 1.7320508f;
const float PI_SQUARE = (float)(M_PI * M_PI); /* M_PI comes from <math.h> (POSIX, not ISO C) */
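The suggestions in this thread (named constants, a small inline helper, and checking the assembly) can be combined as in the sketch below. The names PI_VALUE, scale, and check_constants are hypothetical, and the literals are written out by hand rather than taken from <math.h>. Whether square(PI_VALUE) is folded into a single literal depends on the compiler and its optimization level, so the disassembly remains the final word.

```c
#include <assert.h>

/* Named constants with full double precision, written out by hand. */
static const double SQRT_OF_3 = 1.7320508075688772;
static const double PI_VALUE  = 3.14159265358979323846;

static inline double square(double x)
{
    return x * x;
}

/* Hypothetical user of the constants; an optimizing compiler can
   usually reduce square(PI_VALUE) to a constant at compile time. */
static double scale(double v)
{
    return v * SQRT_OF_3 + square(PI_VALUE);
}

/* Quick sanity check that the hand-written literal is plausible. */
static void check_constants(void)
{
    double s = square(SQRT_OF_3);
    assert(s > 2.999999 && s < 3.000001);
}
```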

How can I minimize the code size of this program?

I have some problems with memory. Is it possible to reduce memory of compiled program in this function?
It makes some calculations with time variables {hh,mm,ss.0} and returns time (in millis) that depends on current progress (_SHOOT_COUNT)
unsigned long hour_koef=3600000L;
unsigned long min_koef=60000;
unsigned long timeToMillis(int* time)
{
return (hour_koef*time[0]+min_koef*time[1]+1000*time[2]+100*time[3]);
}
float Func1(float x)
{
return (x*x)/(x*x+(1-x)*(1-x));
}
float EaseFunction(byte percent,byte type)
{
if(type==0)
return Func1(float(percent)/100);
}
unsigned long DelayEasyControl()
{
long dd=timeToMillis(D1);
long dINfrom=timeToMillis(Din);
long dOUTto=timeToMillis(Dout);
if(easyINmode==0 && easyOUTmode==0) return dd;
if(easyINmode==1 && easyOUTmode==0)
{
if(_SHOOT_COUNT<duration) return (dINfrom+(dd-dINfrom)*EaseFunction(_SHOOT_COUNT*100/duration,0));
else return dd;
}
if(easyOUTmode==1)
{
if(_SHOOT_COUNT>=_SHOOT_activation && _SHOOT_activation!=-1)
{
if((_SHOOT_COUNT-_SHOOT_activation)<current_settings.delay_easyOUT_duration) return (dOUTto-(dOUTto-dd)*(1-EaseFunction((_SHOOT_COUNT-_SHOOT_activation)*100/duration,0)));
else return dOUTto;
}
else
{
if(easyINmode==0) return dd;
else if(_SHOOT_COUNT<duration) return (dINfrom+(dd-dINfrom)*EaseFunction(_SHOOT_COUNT*90/duration,0));
else return dd;
}
}
}
You mention that it's code size you want to optimize, and that you're doing this on an Arduino clone (based on the ATmega32U4).
Those controllers don't have hardware support for floating-point, so it's all going to be emulated in software which takes up a lot of code.
Try re-writing it to do fixed-point arithmetic, you will save a lot of code space that way.
You might see minor gains by optimizing the other data types, i.e. uint16_t instead of long might suffice for some of the values, and marking functions as inline can save the instructions needed to do the jump. The compiler might already be inlining, of course.
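A fixed-point rewrite along the lines suggested above could look like the following sketch. It is an illustration under stated assumptions, not a drop-in replacement: the names ease_q16 and ease_between are hypothetical, the ease value is scaled by 65536, and the percent argument is assumed to be in the range 0 to 100 as in the question's calls.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-point version of Func1: returns
   x^2 / (x^2 + (1-x)^2) scaled by 65536, for x = percent / 100. */
static uint32_t ease_q16(uint8_t percent)
{
    assert(percent <= 100);
    uint32_t a = (uint32_t)percent * percent;                  /* at most 10000 */
    uint32_t b = (uint32_t)(100 - percent) * (100 - percent);  /* a + b is at least 5000 */
    return (a << 16) / (a + b);                                /* no overflow: 10000 * 65536 < 2^32 */
}

/* Interpolate between two millisecond values using the eased fraction. */
static int32_t ease_between(int32_t from, int32_t to, uint8_t percent)
{
    int64_t span = (int64_t)to - from;   /* may be negative */
    return from + (int32_t)(span * (int64_t)ease_q16(percent) / 65536);
}
```

The 64-bit intermediate guards against overflow for long delays. Whether it ends up smaller than the software floating-point routines on a given AVR toolchain would need to be measured.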
Most compilers have an option for optimizing for size, try it first. Then you may try a non-standard 24-bit float type available in some compilers for 8-bit MCUs like NXP's MRK III or MPLAB XC8
By default, the XC8 compiler uses a 24-bit floating-point format that is a truncated form of the 32-bit format and that has eight bits of exponent but only 16 bits of signed mantissa.
Understanding Floating-Point Values
That'll reduce the floating-point math library size a lot without any code changes, but it may still be too big for your MCU. In this case you'll need to rewrite the program. The most effective solution is to switch to fixed-point (a.k.a. scaled integers) like @unwind said, if you don't need very wide ranges. In fact that's a lot faster and takes much less ROM than a software floating-point solution. Microchip's document above also suggests that solution:
The larger IEEE formats allow precise numbers, covering a large range of values to be handled. However, these formats require more data memory to store values of this type and the library routines that process these values are very large and slow. Floating-point calculations are always much slower than integer calculations and should be avoided if at all possible, especially if you are using an 8-bit device. This page indicates one alternative you might consider.
Also, you can try storing duplicated expressions like x*x and 1-x in a variable instead of calculating them twice, as in (x*x)/(x*x+(1-x)*(1-x)), which helps a little if the compiler is too dumb. The same goes for easyINmode==0, easyOUTmode==1...
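As a small illustration of that idea, Func1 with each repeated subexpression stored once might look like this. The name func1_cached is hypothetical, and the range check reflects how the question calls the function (percent / 100), not a requirement of the formula.

```c
#include <assert.h>

/* Same formula as Func1, with each repeated subexpression computed once. */
static float func1_cached(float x)
{
    assert(x >= 0.0f && x <= 1.0f);   /* range used by the question's callers */
    float xx  = x * x;                /* x^2 */
    float inv = 1.0f - x;
    float ii  = inv * inv;            /* (1-x)^2 */
    return xx / (xx + ii);
}
```

A compiler that performs common subexpression elimination will usually produce the same code for both versions, so any saving is worth verifying in the size report.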
Some other things:
ALL_CAPS should be used for macros and constants only
Identifiers that begin with _ followed by a capital letter are reserved for the implementation. C may also use them for future features like _Bool or _Atomic. See What are the rules about using an underscore in a C++ identifier? (Arduino is probably C++)
Use functions instead of macros for things that are reused many times, because the inline expansion will eat some space each time it's used

Safely punning char* to double in C

In an Open Source program I wrote, I'm reading binary data (written by another program) from a file and outputting ints, doubles, and other assorted data types. One of the challenges is that it needs to run on 32-bit and 64-bit machines of both endiannesses, which means that I end up having to do quite a bit of low-level bit-twiddling. I know a (very) little bit about type punning and strict aliasing and want to make sure I'm doing things the right way.
Basically, it's easy to convert from a char* to an int of various sizes:
int64_t snativeint64_t(const char *buf)
{
/* Interpret the first 8 bytes of buf as a 64-bit int */
return *(int64_t *) buf;
}
and I have a cast of support functions to swap byte orders as needed, such as:
int64_t swappedint64_t(const int64_t wrongend)
{
/* Change the endianness of a 64-bit integer */
return (((wrongend & 0xff00000000000000LL) >> 56) |
((wrongend & 0x00ff000000000000LL) >> 40) |
((wrongend & 0x0000ff0000000000LL) >> 24) |
((wrongend & 0x000000ff00000000LL) >> 8) |
((wrongend & 0x00000000ff000000LL) << 8) |
((wrongend & 0x0000000000ff0000LL) << 24) |
((wrongend & 0x000000000000ff00LL) << 40) |
((wrongend & 0x00000000000000ffLL) << 56));
}
At runtime, the program detects the endianness of the machine and assigns one of the above to a function pointer:
int64_t (*slittleint64_t)(const char *);
if(littleendian) {
slittleint64_t = snativeint64_t;
} else {
slittleint64_t = sswappedint64_t;
}
Now, the tricky part comes when I'm trying to cast a char* to a double. I'd like to re-use the endian-swapping code like so:
union
{
double d;
int64_t i;
} int64todouble;
int64todouble.i = slittleint64_t(bufoffset);
printf("%lf", int64todouble.d);
However, some compilers could optimize away the "int64todouble.i" assignment and break the program. Is there a safer way to do this, while considering that this program must stay optimized for performance, and also that I'd prefer not to write a parallel set of transformations to cast char* to double directly? If the union method of punning is safe, should I be re-writing my functions like snativeint64_t to use it?
I ended up using Steve Jessop's answer because the conversion functions re-written to use memcpy, like so:
int64_t snativeint64_t(const char *buf)
{
/* Interpret the first 8 bytes of buf as a 64-bit int */
int64_t output;
memcpy(&output, buf, 8);
return output;
}
compiled into the exact same assembler as my original code:
snativeint64_t:
movq (%rdi), %rax
ret
Of the two, the memcpy version more explicitly expresses what I'm trying to do and should work on even the most naive compilers.
Adam, your answer was also wonderful and I learned a lot from it. Thanks for posting!
I highly suggest you read Understanding Strict Aliasing. Specifically, see the sections labeled "Casting through a union". It has a number of very good examples. While the article is on a website about the Cell processor and uses PPC assembly examples, almost all of it is equally applicable to other architectures, including x86.
Since you seem to know enough about your implementation to be sure that int64_t and double are the same size, and have suitable storage representations, you might hazard a memcpy. Then you don't even have to think about aliasing.
Since you're using a function pointer for a function that might easily be inlined if you were willing to release multiple binaries, performance must not be a huge issue anyway, but you might like to know that some compilers can be quite fiendish at optimising memcpy - for small integer sizes a set of loads and stores can be inlined, and you might even find the variables are optimised away entirely and the compiler does the "copy" simply by reassigning the stack slots it's using for the variables, just like a union.
int64_t i = slittleint64_t(bufoffset);
double d;
memcpy(&d,&i,8); /* might emit no code if you're lucky */
printf("%lf", d);
Examine the resulting code, or just profile it. Chances are even in the worst case it will not be slow.
In general, though, doing anything too clever with byteswapping results in portability issues. There exist ABIs with middle-endian doubles, where each word is little-endian, but the big word comes first.
Normally you could consider storing your doubles using sprintf and sscanf, but for your project the file formats aren't under your control. But if your application is just shovelling IEEE doubles from an input file in one format to an output file in another format (not sure if it is, since I don't know the database formats in question, but if so), then perhaps you can forget about the fact that it's a double, since you aren't using it for arithmetic anyway. Just treat it as an opaque char[8], requiring byteswapping only if the file formats differ.
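One way to combine the memcpy suggestion with endianness handling is to assemble the 64-bit integer from individual bytes and only then copy it into a double. This is a different technique from the function-pointer dispatch in the question, shown here as a sketch: the name read_le_double is hypothetical, it takes unsigned char rather than char, and it assumes that double is an IEEE 754 binary64 value stored with the same byte order as 64-bit integers on the host (which excludes the middle-endian ABIs mentioned above).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a little-endian IEEE 754 double from a buffer.
   Works the same on little- and big-endian hosts, because the integer is
   built with shifts instead of being loaded from memory. */
static double read_le_double(const unsigned char *buf)
{
    uint64_t bits = 0;
    double d;

    assert(sizeof d == sizeof bits);   /* constant condition, folded away */
    for (int i = 7; i >= 0; i--)
        bits = (bits << 8) | buf[i];   /* buf[0] ends up in the low byte */
    memcpy(&d, &bits, sizeof d);       /* no aliasing concerns */
    return d;
}
```

A big-endian reader is the same loop run from index 0 to 7, so no runtime endianness detection is needed.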
The standard says that writing to one field of a union and reading from another immediately afterwards is undefined behaviour. So if you go by the rule book, the union-based method won't work.
Macros are usually a bad idea, but this might be an exception to the rule. It should be possible to get template-like behaviour in C using a set of macros using the input and output types as parameters.
As a very small sub-suggestion, I suggest you investigate if you can swap the masking and the shifting, in the 64-bit case. Since the operation is swapping bytes, you should be able to always get away with a mask of just 0xff. This should lead to faster, more compact code, unless the compiler is smart enough to figure that one out itself.
In brief, changing this:
(((wrongend & 0xff00000000000000LL) >> 56)
into this:
((wrongend >> 56) & 0xff)
should generate the same result.
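Written out in full, the shift-then-mask form might look like the sketch below. It is an illustration rather than the questioner's code: the names swap64 and swap64_check are hypothetical, and the function takes uint64_t instead of int64_t because right-shifting a negative signed value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical byte swap that shifts first and then masks with 0xff only. */
static uint64_t swap64(uint64_t v)
{
    return  ((v >> 56) & 0xff)
         | (((v >> 48) & 0xff) <<  8)
         | (((v >> 40) & 0xff) << 16)
         | (((v >> 32) & 0xff) << 24)
         | (((v >> 24) & 0xff) << 32)
         | (((v >> 16) & 0xff) << 40)
         | (((v >>  8) & 0xff) << 48)
         |  ((v        & 0xff) << 56);
}

/* Swapping twice must give back the input; handy as a quick sanity check. */
static void swap64_check(uint64_t v)
{
    assert(swap64(swap64(v)) == v);
}
```

For example, swap64(0x0102030405060708) gives 0x0807060504030201.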
Edit:
Removed comments regarding how to effectively store data always big endian and swap to machine endianness, as the questioner hadn't mentioned that another program writes his data (which is important information). Still, if the data needs conversion from any endian to big and from big to host endian, ntohs/ntohl/htons/htonl are the best methods, most elegant and unbeatable in speed (as they will perform the task in hardware if the CPU supports that, you can't beat that).
Regarding double/float, just store them to ints by memory casting:
double d = 3.1234;
printf("Double %f\n", d);
int64_t i = *(int64_t *)&d;
// Now i contains the double value as int
double d2 = *(double *)&i;
printf("Double2 %f\n", d2);
Wrap it into a function
int64_t doubleToInt64(double d)
{
return *(int64_t *)&d;
}
double int64ToDouble(int64_t i)
{
return *(double *)&i;
}
Questioner provided this link:
http://cocoawithlove.com/2008/04/using-pointers-to-recast-in-c-is-bad.html
as proof that casting is bad... unfortunately I can only strongly disagree with most of this page. Quotes and comments:
As common as casting through a pointer is, it is actually bad practice and potentially risky code. Casting through a pointer has the potential to create bugs because of type punning.
It is not risky at all and it is also not bad practice. It has only a potential to cause bugs if you do it incorrectly, just like programming in C has the potential to cause bugs if you do it incorrectly, so does any programming in any language. By that argument you must stop programming altogether.
Type punning: a form of pointer aliasing where two pointers refer to the same location in memory but represent that location as different types. The compiler will treat both "puns" as unrelated pointers. Type punning has the potential to cause dependency problems for any data accessed through both pointers.
This is true, but unfortunately totally unrelated to my code.
What he refers to is code like this:
int64_t * intPointer;
:
// Init intPointer somehow
:
double * doublePointer = (double *)intPointer;
Now doublePointer and intPointer both point to the same memory location, but treat it as different types. This is the situation you should indeed solve with a union; anything else is pretty bad. But that is not what my code does!
My code copies by value, not by reference. I cast a double pointer to an int64 pointer (or the other way round) and immediately dereference it. Once the functions return, there is no pointer held to anything. There is an int64 and a double and these are totally unrelated to the input parameter of the functions. I never copy any pointer to a pointer of a different type (if you saw this in my code sample, you strongly misread the C code I wrote), I just transfer the value to a variable of a different type (in its own memory location). So the definition of type punning does not apply at all, as it says "refer to the same location in memory" and nothing here refers to the same memory location.
int64_t intValue = 12345;
double doubleValue = int64ToDouble(intValue);
// The statement below will not change the value of doubleValue!
// Both are not pointing to the same memory location, both have their
// own storage space on stack and are totally unrelated.
intValue = 5678;
My code is nothing more than a memory copy, just written in C without an external function.
int64_t doubleToInt64(double d)
{
return *(int64_t *)&d;
}
Could be written as
int64_t doubleToInt64(double d)
{
int64_t result;
memcpy(&result, &d, sizeof(d));
return result;
}
It's nothing more than that, so there is no type punning even in sight anywhere. And this operation is also totally safe, as safe as an operation can be in C. A double is defined to always be 64 Bit (unlike int it does not vary in size, it is fixed at 64 bit), hence it will always fit into a int64_t sized variable.

Resources