I'm trying to round an input double using a specified rounding mode in inline assembly in C. To do so, I need to grab the FPU control word using fstcw and then change the bits in the word. Unfortunately I'm encountering an error on the very first line:
double roundD(double n, RoundingMode roundingMode) {
asm("fstcw %%ax \n"
::: "ax"); //clobbers
return n;
}
The assembler error I receive is:
Error: operand type mismatch for 'fstcw'.
I'm under the impression this code snippet should store the FPU control word, which is 16 bits in length, in the AX register, which is also 16 bits in length. Just to be sure, I also tested the above code with the EAX register instead of AX, and received the same error.
What might I be missing here? Please let me know if any further information is needed.
fstcw (control word) only works with a memory destination operand, not register.
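With GCC inline asm, that means giving it a memory operand such as a local variable. A minimal sketch of a corrected version of the function in the question (keeping the asker's RoundingMode type):

double roundD(double n, RoundingMode roundingMode) {
    unsigned short cw;              /* 16-bit spill slot in memory */
    asm("fnstcw %0" : "=m"(cw));    /* fnstcw is the no-wait form of fstcw; both need a memory destination */
    /* ... modify cw and fldcw it back here ... */
    return n;
}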
Perhaps you're getting mixed up with fstsw (status word), which has a separate form (a separate opcode) where the destination is AX instead of being specified by an addressing mode.
That form was useful for efficiently branching based on an FP compare result (before fcomi, which compares straight into EFLAGS, existed), which is needed far more often than anything involving the control word. That's why there's an AX-destination version of fnstsw but not of fnstcw.
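For reference, the classic pre-fcomi sequence that the AX form enables looks something like this (AT&T syntax sketch):

fcom %st(1)       # compare %st(0) with %st(1); sets C0..C3 in the FPU status word
fnstsw %ax        # the AX-destination form: status word straight into AX
sahf              # copy AH into EFLAGS
jae not_less      # now ordinary conditional branches test the FP compare result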
And BTW, you can set the rounding mode using C. #include <fenv.h>
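A minimal sketch using that standard interface (adapting the signature to take the FE_* macros; on mainstream x86 implementations fesetround adjusts both the x87 control word and MXCSR):

#include <fenv.h>
#include <math.h>

double roundD(double n, int mode) {  /* mode: FE_TONEAREST, FE_UPWARD, FE_DOWNWARD, FE_TOWARDZERO */
    int old = fegetround();
    fesetround(mode);                /* set the requested rounding mode */
    double r = rint(n);              /* round to integer in the current mode */
    fesetround(old);                 /* restore the caller's mode */
    return r;
}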
Or far better, if SSE4.1 is available, use roundsd (or the intrinsic) to do one rounding with a custom rounding mode, without setting / restoring the SSE rounding mode (in MXCSR, totally separate from the x87 rounding mode).
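For example, a sketch of a one-off round toward +infinity with the intrinsic (compile with -msse4.1; the rounding argument must be a compile-time constant):

#include <smmintrin.h>   /* SSE4.1 */

double round_up(double n) {
    __m128d v = _mm_set_sd(n);
    v = _mm_round_sd(v, v, _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC);
    return _mm_cvtsd_f64(v);         /* extract the rounded low element */
}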
Related
How can I set the two-bit rounding control (RC) field of the FPU control word to 10 (round towards +infinity) using GAS in C?
It must use the format
asm(
);
and should be only about 7 lines of code altogether.
First push the control word onto the stack, then flip the bits so that the RC field (bits 11 and 10) holds 10 binary.
I'm unsure how to write this for the x87 FPU using the correct syntax.
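For what it's worth, a minimal sketch along those lines, assuming GCC inline asm. Note that RC = 10 binary means setting bit 11 and clearing bit 10, not turning both bits on:

unsigned short cw;
asm volatile(
    "fstcw %0          \n\t"   /* store the current control word to memory */
    "orw   $0x0800, %0 \n\t"   /* set bit 11 */
    "andw  $0xFBFF, %0 \n\t"   /* clear bit 10 -> RC = 10 (round toward +inf) */
    "fldcw %0          \n\t"   /* load the modified control word */
    : "+m"(cw));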
As I understand, floating points are stored in XMM registers and not the general purpose registers such as eax, so I did an experiment:
float a = 5;
in this case, a is stored as 1084227584 in the XMM register.
Here is an assembly version:
.text
.global _start
.LCO:
.long 1084227584
_start:
mov .LCO, %eax
movss .LCO, %xmm0
Executing the above assembly and debugging it using gdb shows that the value in eax will be 1084227584, however the value in ymm0 is 5.
Here are my questions:
1- What's so special about the XMM registers? Besides the SIMD instructions, are they the only type of registers that can store floating points? Why can't I set the same bits in a regular register?
2- Are float and double values always stored as floating point? Can we never store them as fixed point in C or assembly?
however the value in ymm0 is 5.
The bit-pattern in ymm0 is 1084227584. The float interpretation of that number is 5.0.
But you can print /x $xmm0.v4_int32 to see a hex representation of the bits in xmm0.
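To see the same correspondence from C, a small demonstration (memcpy is the well-defined way to reinterpret the bits):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void) {
    float f = 5.0f;
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);              /* reinterpret the 32 bits as an integer */
    printf("%f = %u = 0x%08X\n", f, bits, bits); /* prints 5.000000 = 1084227584 = 0x40A00000 */
    return 0;
}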
What's so special about the XMM registers? beside the SIMD instructions, are they the only type of registers to store floating points?
No, in asm everything is just bytes.
Some compilers will use an integer register to copy a float or double from one memory location to another, if not doing any computation on it. (Integer instructions are often smaller.) e.g. clang will do this: https://godbolt.org/z/76EWMY
void copy(float *d, float *s) { *d = *s; }
# clang8.0 -O3 targeting x86-64 System V
copy: # #copy
mov eax, dword ptr [rsi]
mov dword ptr [rdi], eax
ret
XMM/YMM/ZMM registers are special because they're the only registers that FP ALU instructions exist for (ignoring x87, which is only used for 80-bit long double in x86-64).
addsd xmm0, xmm1 (add scalar double) has no equivalent for integer registers.
Usually FP and integer data don't mingle very much, so providing a whole separate set of architectural registers allows more space for more data to be in registers. (Given the same instruction-encoding constraints, it's a choice between 16 FP + 16 GP integer vs. 16 unified registers, not vs. 32 unified registers).
Plus, a major microarchitectural benefit of a separate register file is that it can be physically close to the FP ALUs, while the integer register file can be physically close to the integer ALUs. For more, see Is there any architecture that uses the same register space for scalar integer and floating point operations?
are float and double values always stored as a floating point? can we never store them as a fixed point in C or assembly?
x86 compilers use float = IEEE754 binary32 https://en.wikipedia.org/wiki/Single-precision_floating-point_format. (And double = IEEE754 binary64). This is specified as part of the ABI.
Internally the as-if rule allows the compiler to do whatever it wants, as long as the final result is identical. (Or with -ffast-math, to pretend that FP math is associative, and assume NaN/Inf aren't possible.)
Compilers can't just randomly choose a different object representation for some float that other separately-compiled functions might look at.
There might be rare cases for locals that are never visible to other functions where a "human compiler" (hand-writing asm to implement C) could prove that fixed-point was safe. Or more likely, that the float values were exact integers small enough that double wouldn't round them, so your fixed-point could degenerate to integer (except maybe for a final step).
But it would be rare to know this much about possible values without just being able to do constant propagation and optimize everything away. That's why I say a human would have to be involved, to prove things the compiler wouldn't know to look for.
I think in theory you could have a C implementation that did use a fixed-point float or double. ISO C puts very little restrictions on what float and double actually are.
But float.h constants like FLT_RADIX and DBL_MAX_EXP have interactions that might not make sense for a fixed-point format, which has a constant distance between representable values, instead of values being much closer together near 0 and much farther apart for large numbers. (A rounding error of 0.5 ulp is relative to the magnitude, instead of absolute.)
Still, most programs don't actually do things that would break if the "mantissa" and exponent limits didn't correspond to what you'd expect for DBL_MIN and DBL_MAX.
Another interesting possibility is to make float and double based on the Posit format (similar to traditional floating-point, but with a variable-length exponent encoding): https://www.johndcook.com/blog/2018/04/11/anatomy-of-a-posit-number/ and https://posithub.org/index.
Modern hardware, especially Intel CPUs, has very good support for IEEE float/double, so fixed-point is often not a win. There are some nice SIMD instructions for 16-bit fixed-point, though, like high-half-only multiply, and even pmulhrsw which does fixed-point rounding.
But general 32-bit integer multiply has worse throughput than packed-float multiply. (Because the SIMD ALUs optimized for float/double only need 24x24-bit significand multipliers per 32 bits of vector element. Modern Intel CPUs run integer multiply and shift on the FMA execution units, with 2 uops per clock throughput.)
are they the only type of registers to store floating points?
No. There are the 80-bit floating-point registers (st(0) through st(7)) in the 8087-compatible FPU, which should still be present in most modern CPUs.
Most 32-bit programs use these registers; the common 32-bit calling conventions even return floating-point values in st(0).
Can we store a floating point in a regular [integer] register?
Yes. 30 years ago many PCs contained a CPU without an 80x87 FPU, so there were no st(0)-st(7) registers. CPUs with XMM registers came even later.
We find a similar situation in mobile devices today.
What's so special about the XMM registers?
Using the 80x87 FPU seems to be more complicated than using XMM registers. Furthermore, I'm not sure if using the 80x87 is allowed in 64-bit programs in every operating system.
If you store a floating-point value in an integer register (such as eax), you don't have any instructions performing arithmetic: On x86 CPUs, there is no instruction for doing a multiplication or addition of floating-point values that are stored in integer registers.
In the case of CPUs without FPU, you have to do floating-point emulation. This means you have to perform one floating-point operation by doing multiple integer operations - just like you would do it with paper and pencil.
However, if you only want to store a floating-point value, you can of course also use an integer register. The same is true for copying a value or checking if two values are equal and similar operations.
Can we never store them as a fixed point in C or assembly?
Fixed point is used a lot when using CPUs that do not have an FPU.
For example, when using 8- or 16-bit CPUs, which are still used in the automotive industry, consumer devices, and PC peripheral devices.
However, I doubt that there are C compilers that automatically translate the keyword "float" to fixed point.
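For illustration, a minimal Q16.16 fixed-point sketch of the kind such FPU-less targets use (the names are illustrative, not a standard API):

#include <stdint.h>

typedef int32_t q16_16;                 /* 16 integer bits, 16 fraction bits */
#define Q16_ONE (1 << 16)

static inline q16_16 q16_mul(q16_16 a, q16_16 b) {
    return (q16_16)(((int64_t)a * b) >> 16);   /* widen so the full product fits */
}

/* 1.5 * 2.25 = 3.375: q16_mul(0x00018000, 0x00024000) == 0x00036000 */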
Since processors follow the convention of representing numbers in two's complement, how do they know whether a number that resulted from adding two positive numbers is still positive and not negative?
For example, if I add two 32-bit numbers:
Let r2 contain the value 0x50192E32.
Sample code:
ldr r4, =0x6F06410C    @ constant is too wide to encode as an ARM immediate
adds r1, r2, r4        @ adds (with the S suffix) updates the flags; plain add does not
str r1, [r3]
Here the overflow (V) flag is set, because adding two positive numbers produced a result with the MSB set (0x50192E32 + 0x6F06410C = 0xBF1F6F3E).
Now suppose I want to use the stored result from memory in later instructions (somewhere in the code, by which point other instructions have changed the processor's CPSR), as shown below:
ldr r5, [r3]
add r7, r5
As the result of the first add instruction has a 1 in its MSB, i.e. r5 now has a 1 in its MSB, how does the processor interpret the value? The correct result of adding two positive numbers is positive. Does it interpret the value as a negative number just because the MSB is 1? In that case we get results different from the expected ones.
Take for example a 4-bit machine:
two's complement: 4 = 0100 and 5 = 0101; -4 = 1100 and -5 = 1011.
Now 4 + 5 = 9, which is stored in a register/memory as 1001. If that value is later accessed by another instruction, and the processor, which stores numbers in two's complement format, checks the MSB, it will think the value is negative 7.
If it all depends upon the programmer, then how does one store the correct results in registers/memory? Is there anything we can do in our code to store the correct results?
If you care about overflow conditions, then you'd need to check the overflow flag before the status register is overwritten by some other operation - depending on the language involved, this may result in an exception being generated, or the operation being retried using a longer integer type. However, many languages (C, for example) DON'T care about overflow conditions - if the result is out of range of the type, you simply get an incorrect result. If a program written in such a language needs to detect overflow, it would have to implement the check itself - for example, in the case of addition, if the operands have the same sign, but the result is different, there was an overflow.
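A sketch of that same-sign check in C, using unsigned arithmetic so the wrap-around itself is well defined:

#include <stdint.h>
#include <stdbool.h>

bool add_overflows(int32_t a, int32_t b) {
    uint32_t sum = (uint32_t)a + (uint32_t)b;   /* unsigned wrap is well defined */
    /* overflow iff a and b have the same sign and the sum's sign differs */
    return (((uint32_t)a ^ sum) & ((uint32_t)b ^ sum)) >> 31;
}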
I know I have covered this many times as have others.
The carry flag can be considered the unsigned overflow flag for addition; it is also the borrow (or not-borrow) flag for subtraction, depending on your architecture. The V flag is the signed overflow flag for addition (and subtraction). YOU are the only one who knows or cares whether the operands are signed or unsigned, because for addition/subtraction it doesn't matter to the hardware.
It doesn't matter what flag it is, or what architecture: YOU have to make sure that if you care about a result (be it the value or a flag), you preserve that information for as long as you have to, until you need to use it. It is not the processor's job to do that, nor the instruction set's, nor the architecture's in general. This goes for the answers in the registers as it does for the flags; it is all on you, the programmer. Just preserve the state if you care. This question is like asking how you solve this:
if(a==b)
{
}
stuff;
stuff;
I want to do the if a == b thing now.
It is all on you, the programmer, to make that work: either do the compare at the time you need to use it instead of at some other time, or save the result of the compare when it happens and then check that saved condition at the time you need it.
Edit: See the end of the question for an update on the answer.
I have spent several weeks tracking down a very odd bug in a piece of software I
maintain. Long story short, there is an old piece of software that is in
distribution, and a new piece of software that needs to match the output of the
old. The two rely (in theory) on a common library.[1] However, I cannot
duplicate the results being generated by the original version of the library,
even though the source for the two versions of the library matches. The actual
code in question is very simple. The original version looked like this (the
"voodoo" commented isn't mine):[2]
// float rstr[101] declared and initialized elsewhere as a global
void my_function() {
// I have elided several declarations not used until later in the function
double tt, p1, p2, t2;
char *ptr;
ptr = NULL;
p2 = 0.0;
t2 = 0.0; /* voooooodoooooooooo */
tt = (double) rstr[20];
p1 = (double) rstr[8];
// The code goes on and does lots of other things ...
}
The last statement I have included is where different behavior crops up. In the
original program, rstr[8] has the value 101325., and after casting it to
double[3] and assigning it, p1 has the value 101324.65625. Similarly, tt
ends up with the value 373.149999999996. I have confirmed these values with
both debug prints and examining the values in the debugger (including checking
the hex values). This is not surprising in any sense, it is as expected with
floating point values.
In a test wrapper around the same version of the library (as well as in any call
to a refactored version of the library), the first assignment (to tt)
produces the same results. However, p1 ends up as 101325.0, matching the original
value in rstr[8]. This difference, while small, sometimes produces substantial
variations in calculations that depend on the value of p1.
My test wrapper was simple, and matched the inclusion pattern of the original
exactly, but eliminated all other context:
#include "the_header.h"
float rstr[101];
int main() {
rstr[8] = 101325.;
rstr[20] = 373.15;
my_function();
}
Out of desperation, I have even gone to the trouble of looking at the
disassembly generated by VC6.
4550: tt = (double) rstr[20];
0042973F fld dword ptr [rstr+50h (006390a8)]
00429745 fstp qword ptr [ebp-0Ch]
4551: p1 = (double) rstr[8];
00429748 fld dword ptr [rstr+20h (00639078)]
0042974E fstp qword ptr [ebp-14h]
The version generated by VC6 for the same library function when called by the
test code wrapper (which matches the version generated by VC6 for my refactored
version of the library):
60: tt = (double) rstr[20];
00408BC8 fld dword ptr [_rstr+50h (0045bc88)]
00408BCE fstp qword ptr [ebp-0Ch]
61: p1 = (double) rstr[8];
00408BD1 fld dword ptr [_rstr+20h (0045bc58)]
00408BD7 fstp qword ptr [ebp-14h]
The only difference I can see, besides where in memory the array is stored and
how far along through the program this is occurring, is the leading _ on the
reference to rstr in the second. In general, VC6 uses a leading underscore for
name-mangling with functions, but I cannot find any documentation of it doing
name-mangling with array pointers. Nor can I see why these would produce
different results in any case, unless that name-mangling is involved with
reading the data accessed from the pointers in a different way.
The only other difference I can identify between the two (apart from calling
context) is that the original is an MFC-based Win32 application, while the
latter is a non-MFC console application. The two are otherwise configured the
same way, and they are built with identical compilation flags and against the
same C runtime.
Any suggestions would be much appreciated.
Edit: the solution, as several answers very helpfully pointed out, was to examine the binary/hex values and compare them to make sure the things I thought were exactly the same in fact were the same. This proved not to be the case—my strong protestations to the contrary notwithstanding.
Here I get to eat some humble pie and admit that while I thought I had checked those values, I had in fact checked some other, closely related values—a point I discovered only when I went back to look at the data again. As it turned out, the values being set in rstr[8] were very slightly different, and so the conversion to double highlighted the very slight differences, and these differences then propagated throughout the program in just the way I noted.
The discrepancy with the initialization I can explain based on the way the two programs work. Specifically, in one case rstr[8] is specified based on a user input to a GUI (and is in this case also the product of a conversion calculation), whereas in another, it is read in from a file where it has been stored with some loss of precision. Interestingly, in neither case was it actually exactly 101325.0, even the case in which it was read from a file where it had been stored as 1.01325e5.
This will teach me to double check my double checking of these sorts of things. Many thanks to Eric Postpischil and unwind for prompting me to check it again and for the prompt feedback. It was very helpful.
Footnotes
[1] In actuality, the original "library" was a header file with all the
implementations done inline. The header was pulled in via #include and the
functions referenced via extern statements. I have fixed this in a
refactored version of the library that is actually a library, but see the
rest of the question.
[2] Note that the variable names aren't mine, and are terrible. Likewise with the
use of global variables, which is rampant in this piece of software. I left
in the /* voooooodoooooooooo */ comment because it illustrates the…
unusual… programming practices of my predecessor. I think that element is
present because this was originally translated from Fortran and the developer
had used it as a means of dealing with some sort of memory bug. The line has
no effect whatsoever on the actual behavior of the code.
[3] I am well aware that there doesn't actually need to be a cast here, but this
is how the original library worked, and I cannot modify it.
This:
In the original program, rstr[8] has the value 101325., and after casting it to double[3] and assigning it, p1 has the value 101324.65625
implies that the float value is not, in fact, exactly 101325.0, so when you convert to double you see more of the precision. I would (highly) suspect the method by which you inspect the float value, automatic (implicit and silent) rounding when printing is very common with floats. Inspect the bit pattern and decode it using the known format of the float on your system, to make sure you're not being tricked.
The possibilities are:
1. Despite the reported observations, rstr[8] has the value 101324.65625 in the original program immediately before the assignment to p1, not the reported 101325.
2. Despite the reported observations, p1 does not have the value 101324.65625 immediately after the assignment.
3. The program is not performing the assignment (including the conversion to double) correctly.
To test 1, carefully inspect the value of rstr[8] immediately before the assignment. I suggest:
printing or logging the value to 20 significant digits, and
printing or logging the bytes that comprise rstr[8], then interpreting the bytes in IEEE-754 32-bit binary format (rstr is an array of float), or
using a debugger to do both of the above.
Additionally, I suggest testing whether floating-point values are displayed sufficiently well by injecting the value 101324.65625 into rstr[8] (by assignment or debugger) and displaying it in the same way as used above.
To test 2, carefully inspect the value of p1 immediately after the assignment. I suggest the above, applied to p1 instead of rstr[8].
The disassembly code shown in the question would appear to disprove 3. However, I would consider these tests:
Test whether these instructions are actually executed, perhaps by setting a breakpoint on them in the debugger.
Examine the instructions in the debugger immediately before they are executed.
Examine the memory to be loaded, the floating-point register after the load instruction, and the memory after it is stored.
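A sketch of that kind of inspection in C (the helper name is illustrative):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

static void inspect_float(const char *name, float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);                      /* raw object representation */
    printf("%s = %.20g (bits 0x%08X)\n", name, f, bits); /* 20 significant digits plus the bit pattern */
}

/* e.g. inspect_float("rstr[8]", rstr[8]); immediately before the assignment to p1 */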
What you need to do (debugging wise) is get the binary value of rstr[20] and rstr[8] between the old and refactored version. The binary values of tt and p1 wouldn't hurt either. That will prove that the arrays are initialized the same. Assigning a double to a float array and then converting it back to a double is not loss-less.
The only odd case I can think of is the FPU's rounding mode being set differently between the old and refactored program. Check the source code for "_controlfp(", "fesetround(" or "fenv.h".
The first rule of floating point is that results are approximations and should never be assumed to be exact.
Both the compiler and the CPU are able to do plenty of optimisations, and minor differences in optimisations (including the lack of optimisations) can lead to minor differences in the resulting "approximations". This includes all sorts of things, like the order in which operations are performed (e.g. don't assume that "(x + y) + z" is the same as "x + (y + z)"), whether anything is pre-computed by the compiler (e.g. constant folding), whether something is inlined or not, etc.
For example, (internally) 80x86 uses 80-bit "extended precision" floating point, which is more precise than double; so simply storing a result as double and loading it again produces different results from re-using the (higher-precision) value already in the FPU's register.
Mostly what I'm saying is that if the exact value you're getting matters so much, then you shouldn't have been using floating point at all (consider "big rationals" or something).
Hello everyone
I am working on writing an assembly program, and before I start I would like to acquire some knowledge about how AT&T and Intel syntax look when addressing XMM and FP operands. I know that with regular instructions, a push operating on a byte is "pushb" in AT&T while "push byte" in Intel. Can anyone provide a similar comparison for XMM or FP use? In sum, I want to know how XMM operands are addressed.
Thanks in advance
I'm not an AT&T fan/user, but the first place to start for Intel would be the Intel developer manuals (volumes 2A and 2B contain the instruction references). These list the sizes each instruction operates on, which almost all Intel-syntax assemblers will try to deduce if not specified (push will try to narrow the variable or align it, depending on settings); otherwise you'll generally be using qword/dword for FP (for the likes of fld) and dword/qword/dqword for MMX/SSE ops.
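As a rough side-by-side sketch, here is the same scalar-float load and add in both syntaxes (label and register choices are illustrative):

# AT&T (GAS): source before destination, %-prefixed registers,
# operand size implied by the mnemonic (movss = scalar single)
movss value(%rip), %xmm0
addss %xmm1, %xmm0           # xmm0 += xmm1
flds  value(%rip)            # x87: the s suffix selects a 32-bit load

; Intel (NASM-style): destination first, size keywords on memory operands
movss xmm0, [rel value]
addss xmm0, xmm1             ; xmm0 += xmm1
fld   dword [rel value]      ; x87: the dword keyword selects the 32-bit load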