Convert a float literal to int representation in x86 assembly? - c

The following C code:
int main()
{
    float f;
    f = 3.0;
}
Is converted to the following assembly instructions:
main:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
flds .LC0
fstps -4(%ebp)
movl $0, %eax
leave
ret
.LC0:
.long 1077936128
What is the correct way to calculate the .long/int representation of the float literal?
e.g. 1077936128 generated from 3.0 for the example shown above
For this example, gcc is used with the -m32 -S -O0 -fno-stack-protector -fno-asynchronous-unwind-tables flags to generate the assembly output (shown above in AT&T syntax).
References:
Compiler Explorer Link with compilation flags and other settings

x86 FPU hardware uses IEEE754 binary32 / binary64 representations for float / double.
Determining the IEEE 754 representation of a floating point number is not trivial for humans. In handwritten assembly code, it's usually a good idea to use the .float or .double directives instead:
.float 3.0 # generates 3.0 as a 32 bit float
.double 3.0 # generates 3.0 as a 64 bit float
If you really want to compute this manually, refer to the explanations on Wikipedia. It might be interesting to do so as an exercise, but for actual programming it's tedious and mostly useless.
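As a worked example for 3.0: 3.0 = 1.1 (binary) x 2^1, so the sign bit is 0, the biased exponent is 1 + 127 = 128 = 10000000 (binary), and the stored mantissa is 10000000000000000000000 (the leading 1 is implicit). Concatenating sign, exponent, and mantissa gives 0 10000000 10000000000000000000000 = 0x40400000 = 1077936128, exactly the .long value in the question.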
Compilers do the conversion (with rounding to the nearest representable FP value) internally, because FP values often don't come directly from a literal in the source; they can come from constant folding. e.g. 1.23 * 4.56 is evaluated at compile time, so the compiler already ends up with FP values in float or double binary representation. Printing them back to decimal for the assembler to parse and re-convert to binary would be slower and might require a lot of decimal places.
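A minimal sketch of that point (the function name is just illustrative):
double product(void)
{
    /* the compiler folds this at compile time: the object file already
       contains the binary64 encoding of 5.6088 (to the nearest double) */
    return 1.23 * 4.56;
}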
To compute the representation of a 32 bit float as a 32 bit integer, you can use an online IEEE754 converter, or a program like this:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
int main(int argc, char *argv[])
{
    /* type-punning union: store the float member, read back the raw bits */
    union { uint32_t u32; float f32; } intfloat;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s some-number\n", argv[0]);
        return EXIT_FAILURE;
    }

    intfloat.f32 = atof(argv[1]);              /* parse, then narrow to float */
    printf("0x%08" PRIx32 "\n", intfloat.u32); /* print the IEEE754 bit pattern */
    return EXIT_SUCCESS;
}
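Running it on the value from the question reproduces the constant (assuming you compiled the program as float2int; the name is just an illustration):
$ ./float2int 3.0
0x40400000
and 0x40400000 is 1077936128 in decimal, matching the .long directive above.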

Related

Definition floating-point numbers in X86 Assembly - C Translation

Currently studying C. When I define, for example, an array such as:
float var1[2023] = {-53.3125};
what would the corresponding x86 assembly translation look like? I'm looking for the exact portion of code where the variable is defined, where the ".type", ".size", and alignment values are mentioned.
I've seen on the internet that when dealing with a floating-point number, the x86 assembly output will simply use ".long". However, I'm not sure to what extent that is correct.
One easy way to find out is to ask the compiler to show you:
// float.c
float var1[2023] = { -53.3125 };
then compile it:
$ gcc -S float.c
and then study the output:
.file "float.c"
.globl var1
.data
.align 32
.type var1, #object
.size var1, 8092
var1:
.long 3260366848
.zero 8088
.ident "GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-39)"
.section .note.GNU-stack,"",#progbits
Note that this is just GCC's implementation; clang does it differently:
.file "float.c"
.type var1,#object # #var1
.data
.globl var1
.align 16
var1:
.long 3260366848 # float -5.331250e+01
.long 0 # float 0.000000e+00
.long 0 # float 0.000000e+00
// thousands of these
.size var1, 8092
.ident "clang version 3.4.2 (tags/RELEASE_34/dot2-final)"
.section ".note.GNU-stack","",#progbits
EDIT - To answer the comment below: the use of .long simply lays down a specific bit pattern that encodes the compiler's idea of the floating-point format.
The value 3260366848 is the same as hex 0xC2554000, which is 11000010010101010100000000000000 in binary, and that binary value is what the CPU cares about. If you care to, you can get out your IEEE floating-point spec and decode this: there's the sign, that's the exponent, and so on. But all the details of the floating-point encoding were handled by the compiler, not the assembler.
I'm no kind of compiler expert, but decades ago I was tracking down a bug in a C compiler's floating-point support, and though I don't remember the details, it strikes me that having the compiler emit the pattern this way would have been helpful: it would have saved me from using a disassembler to find out what bit pattern was actually encoded.
Surely others will weigh in here.
EDIT2 - Bits are bits, and this little C program (which relies on int and float being the same size) demonstrates this:
// float2.c
#include <stdio.h>
#include <string.h> /* for memcpy */

int main()
{
    float f = -53.3125;
    unsigned int i;
    printf("sizeof int = %zu\n", sizeof(i));
    printf("sizeof flt = %zu\n", sizeof(f));
    memcpy(&i, &f, sizeof i); // copy the float's bits into an int
    printf("float = %f\n", f);
    printf("i = 0x%08x\n", i);
    printf("i = %u\n", i);
    return 0;
}
Running it shows that bits are bits:
sizeof int = 4
sizeof flt = 4
float = -53.312500
i = 0xc2554000
i = 3260366848 <-- there ya go
These are just different display notations for the same 32 bits, depending on how you look at them.
Now, to answer how you would determine 3260366848 on your own from the floating-point value: you'd need to get out your IEEE standard and work out all the bits manually (strong coffee recommended), then read those 32 bits as an integer.
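As a worked example going the other way, 0xC2554000 splits into sign = 1, exponent = 10000100 (binary) = 132, so the unbiased exponent is 132 - 127 = 5, and the mantissa with its implicit leading 1 is 1.101010101 (binary) = 1.666015625. That gives -1.666015625 x 2^5 = -53.3125, the original value.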

What is happening here in pow function?

I have seen various answers here that depict strange behavior of the pow function in C.
But I have something different to ask here.
In the code below I have initialized int x = pow(10,2) and int y = pow(10,n) (with int n = 2).
In the first case, when I print the result, it shows 100; in the other case it comes out to be 99.
I know that pow returns double and it gets truncated on storing in int, but I want to ask why the two outputs differ.
CODE1
#include <stdio.h>
#include <math.h>

int main()
{
    int n = 2;
    int x;
    int y;
    x = pow(10,2); // printing gives output 100
    y = pow(10,n); // printing gives output 99
    printf("%d %d" , x , y);
}
Output : 100 99
Why does the output come out different?
My gcc version is 4.9.2
Update :
Code 2
#include <stdio.h>
#include <math.h>

int main()
{
    int n = 2;
    int x;
    int y;
    x = pow(10,2); // printing gives output 100
    y = pow(10,n); // printing gives output 99
    double k = pow(10,2);
    double l = pow(10,n);
    printf("%d %d\n" , x , y);
    printf("%f %f\n" , k , l);
}
Output : 100 99
100.000000 100.000000
Update 2: Assembly instructions for CODE1
Generated assembly instructions, GCC 4.9.2, using gcc -S -masm=intel:
.LC1:
.ascii "%d %d\0"
.text
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
push ebp
mov ebp, esp
and esp, -16
sub esp, 48
call ___main
mov DWORD PTR [esp+44], 2
mov DWORD PTR [esp+40], 100 //Concerned Line
fild DWORD PTR [esp+44]
fstp QWORD PTR [esp+8]
fld QWORD PTR LC0
fstp QWORD PTR [esp]
call _pow //Concerned Line
fnstcw WORD PTR [esp+30]
movzx eax, WORD PTR [esp+30]
mov ah, 12
mov WORD PTR [esp+28], ax
fldcw WORD PTR [esp+28]
fistp DWORD PTR [esp+36]
fldcw WORD PTR [esp+30]
mov eax, DWORD PTR [esp+36]
mov DWORD PTR [esp+8], eax
mov eax, DWORD PTR [esp+40]
mov DWORD PTR [esp+4], eax
mov DWORD PTR [esp], OFFSET FLAT:LC1
call _printf
leave
ret
.section .rdata,"dr"
.align 8
LC0:
.long 0
.long 1076101120
.ident "GCC: (tdm-1) 4.9.2"
.def _pow; .scl 2; .type 32; .endef
.def _printf; .scl 2; .type 32; .endef
I know that pow returns double and it gets truncated on storing in int, but I want to ask why the two outputs differ.
You must first, if you haven't already, divest yourself of the idea that floating-point numbers are in any way sensible or predictable. double only approximates real numbers and almost anything you do with a double is likely to be an approximation to the actual result.
That said, as you have realized, pow(10, n) resulted in a value like 99.99999999999997, which is an approximation accurate to 15 significant figures. And then you told it to truncate to the largest integer less than that, so it threw away most of those.
(Aside: there is rarely a good reason to convert a double to an int. Usually you should either format it for display with something like sprintf("%.0f", x), which does rounding correctly, or use the floor function, which can handle floating-point numbers that may be out of the range of an int. If neither of those suit your purpose, like in currency or date calculations, possibly you should not be using floating point numbers at all.)
There are two weird things going on here. First, why is pow(10, n) inaccurate? 10, 2, and 100 are all precisely representable as double. The best answer I can offer is that the C standard library you are using has a bug. (The compiler and the standard library, which I assume are gcc and glibc, are developed on different release schedules and by different teams. If pow is returning inaccurate results, that is probably a bug in glibc, not gcc.)
In the comments on your question, amdn found a glibc bug to do with FP rounding that might be related and another Q&A that goes into more detail about why this happens and how it's not a violation of the C standard. chux's answer also addresses this. (C doesn't require implementation of IEEE 754, but even if it did, pow isn't required to use correct rounding.) I will still call this a glibc bug, because it's an undesirable property.
(It's also conceivable, though unlikely, that your processor's FPU is wrong.)
Second, why is pow(10, n) different from pow(10, 2)? This one is far easier. gcc optimizes away function calls for which the result can be calculated at compile time, so pow(10, 2) is almost certainly being optimized to 100.0. If you look at the generated assembly code, you will find only one call to pow.
The GCC manual, section 6.59 describes which standard library functions may be treated in this way (follow the link for the full list):
The remaining functions are provided for optimization purposes.
With the exception of built-ins that have library equivalents such as the standard C library functions discussed below, or that expand to library calls, GCC built-in functions are always expanded inline and thus do not have corresponding entry points and their address cannot be obtained. Attempting to use them in an expression other than a function call results in a compile-time error.
[...]
The ISO C90 functions abort, abs, acos, asin, atan2, atan, calloc, ceil, cosh, cos, exit, exp, fabs, floor, fmod, fprintf, fputs, frexp, fscanf, isalnum, isalpha, iscntrl, isdigit, isgraph, islower, isprint, ispunct, isspace, isupper, isxdigit, tolower, toupper, labs, ldexp, log10, log, malloc, memchr, memcmp, memcpy, memset, modf, pow, printf, putchar, puts, scanf, sinh, sin, snprintf, sprintf, sqrt, sscanf, strcat, strchr, strcmp, strcpy, strcspn, strlen, strncat, strncmp, strncpy, strpbrk, strrchr, strspn, strstr, tanh, tan, vfprintf, vprintf and vsprintf are all recognized as built-in functions unless -fno-builtin is specified (or -fno-builtin-function is specified for an individual function).
So it would seem you can disable this behavior with -fno-builtin-pow.
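For example (hypothetical invocation; the flag follows the -fno-builtin-function pattern quoted above):
gcc -fno-builtin-pow code1.c -lm
With the built-in disabled, both calls go through the library pow, so both results should come out the same, even if both are slightly inaccurate.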
Why does the output come out different? (in the updated appended code)
We do not actually know that the two values differ.
When comparing the textual output of an int and a double, be sure to print the double with sufficient precision to tell whether it is exactly 100.000000 or merely near 100.000000, or print it in hex to remove all doubt.
printf("%d %d\n" , x , y);
// printf("%f %f\n" , k , l);
// Is it the FP number just less than 100?
printf("%.17e %.17e\n" , k , l); // maybe 9.99999999999999858e+01
printf("%a %a\n" , k , l); // maybe 0x1.8ffffffffffff0000p+6
Why does the output come out different? (in the original code)
C does not specify the accuracy of most <math.h> functions. The following are all compliant results.
// Higher quality functions return 100.0
pow(10,2) --> 100.0
// Lower quality and/or faster one may return nearby results
pow(10,2) --> 100.0000000000000142...
pow(10,2) --> 99.9999999999999857...
Assigning a floating-point (FP) number to an int simply drops the fraction, regardless of how close the fraction is to 1.0.
When converting FP to an integer, better to control the conversion and round to cope with minor computational differences.
// long int lround(double x);
long i = lround(pow(10.0,2.0));
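That way, even if pow returns something like 99.999999999999986, lround brings it back to exactly 100 before any truncation can occur.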
You're not the first to find this. Here's a discussion from 2013:
pow() cast to integer, unexpected result
I'm speculating that the assembly code produced by the tcc guys is causing the second value to be rounded down after calculating a result that is really close to 100.
Like mikijov said in that historic post, it looks like the bug has been fixed.
As others have mentioned, the 99 comes from floating-point truncation. The reason pow(10,2) returns the correct answer while pow(10,n) does not is a compiler optimization: GCC evaluates the built-in pow at compile time when both arguments are constants.
When the power is a small positive constant integer, it is more efficient to perform the operation as repeated multiplication, and that simpler path removes the roundoff. Since this happens at compile time, you don't see a function call being made for it.
With a variable exponent, the library pow treats the inputs as real numbers and gives an approximate answer, which happens to be slightly under 100, e.g. 99.999999..., and is then truncated to 99.
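If you need exact results for small integer powers, a simple alternative (a sketch, not taken from the answers above) is to avoid pow entirely and multiply in integer arithmetic:
#include <stdio.h>

/* hypothetical helper: exact integer exponentiation by repeated
   multiplication; no floating point, so no roundoff */
static long ipow(long base, unsigned exp)
{
    long result = 1;
    while (exp-- > 0)
        result *= base;
    return result;
}

int main(void)
{
    printf("%ld\n", ipow(10, 2)); /* always prints 100 */
    return 0;
}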

Does optimization change the behavior of casts?

I'm working with a small UART device and frequently need to switch the baud rate at which it operates.
Essentially the whole set-up boils down to
#define FOSC 2000000
#define BAUD 9600
uint8_t rate = (uint8_t) ((FOSC / (16.0 * BAUD)) - 1 + 0.5);
(Where +0.5 is used to round the result.)
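(With these values, FOSC / (16.0 * BAUD) is about 13.0208, so the full expression evaluates to about 12.52 and rate ends up as 12.)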
I'm currently compiling with gcc 4.8.1, -O1.
Does the compiler fold the whole expression (including the cast) into a constant, or am I left with a runtime conversion? Would this differ with different -O# levels (besides -O0)? What about -Os (which I might have to compile with eventually)?
If it matters, I'm developing for the Atmel AT90USB647 (or the datasheet [pdf]).
It is extremely likely that any sane compiler will convert that entire expression (including the cast) into a constant when compiling with optimizations enabled.
However, to be sure, you'll need to look at the assembly output of your compiler.
But what about GCC 4.8.1 in particular?
Code
#include <stdint.h>
#include <stdio.h>
#define FOSC 2000000
#define BAUD 9600
int main() {
    uint8_t rate = (uint8_t) (FOSC / (16.0 * BAUD)) - 1 + 0.5;
    printf("%u", rate);
}
Portion of the generated assembly with gcc -O1 red.c
main:
.LFB11:
.cfi_startproc
subq $8, %rsp
.cfi_def_cfa_offset 16
movl $12, %esi
movl $.LC0, %edi
movl $0, %eax
call printf
We can see clearly that gcc has precomputed the value of 12 for rate.
Atmel AVR toolchains ship the newlib C library, which is a simple ANSI C library, math library, and collection of board support packages. You can refer to the ANSI C specification to find out; see specifically the section on conversions.
I would make sure that rate is being written to a volatile variable or through a volatile pointer. Maybe the rate value is calculated fine, but if the peripheral destination it is written to lacks the volatile qualifier, the optimizer may not perform the write at all.
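A minimal sketch of what that means (the macro name and register address are assumptions for illustration, not taken from the AT90USB647 datasheet):
#include <stdint.h>

#define BAUD_REG (*(volatile uint8_t *)0x00C4) /* hypothetical baud-rate register */

void set_baud(void)
{
    uint8_t rate = (uint8_t) ((2000000 / (16.0 * 9600)) - 1 + 0.5);
    BAUD_REG = rate; /* volatile store: the optimizer must keep this write */
}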

Adding doubles in x86_64 assembly problems

Hello, I am trying to learn assembly and how to work with floating-point numbers in x86_64. From what I understand, arguments are passed in xmm0, xmm1, xmm2, and so on, and the result is returned in xmm0. So I am trying to make a simple assembly function that adds two doubles together. Here is the function:
.text
.global floatadd
.type floatadd,#function
floatadd:
addsd %xmm1,%xmm0
ret
And here is the C code I am using as well.
#include<stdio.h>
int main(){
    double a = 1.5;
    double b = 1.2;
    double c = floatadd(a,b);
    printf("result = %f\n",c);
}
I have been trying to follow what is happening in gdb. When I set a breakpoint in my function, I can see that xmm0 holds 1.5 and xmm1 holds 1.2, and adding them gives 2.7. In gdb, print $xmm0 gives v2_double = {2.7000000000000002, 0}. However, when my function returns to main, main executes
cvtsi2sd %eax,%xmm0
and print $xmm0 becomes v2_double = {2, 0}. I am not sure why gcc emits that instruction, or why it uses the 32-bit register instead of the 64-bit register. I have tried the conversion specifiers %lf and %f, and both of them do the same thing.
What is happening?
The problem is that you failed to declare floatadd before calling it. So the compiler assumes it returns an int in %eax and converts that int to a double. Add the declaration:
double floatadd(double, double);
before main.
Using -Wall or whatever equivalent your compiler uses to enable warnings would probably have told you about this problem...
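With the declaration added, the calling code becomes (same program as in the question, plus the prototype):
#include <stdio.h>

double floatadd(double, double); /* now the compiler expects the result in %xmm0 */

int main(void)
{
    double a = 1.5;
    double b = 1.2;
    double c = floatadd(a, b);
    printf("result = %f\n", c);
    return 0;
}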

absolute Value of double

I am trying to write a function named absD that returns the absolute value of its argument.
I do not want to use any predefined functions. Right now I am getting a parse error when I try to compile it.
I would imagine all I would have to do to get the absolute value of a double is clear the sign bit? This is what I have:
#include <stdio.h>
#include <stdlib.h>
#define PRECISION 3
double absD (double n)
{
    asm(" fld %eax \n"
        " movl $0x7FFFFFFFFFFFFFFF, %eax \n"
        " pop %eax \n"
    );
    return n;
}
int main (int argc, char **argv)
{
    double n = 0.0;
    printf("Absolute value\n");
    if (argc > 1)
        n = atof(argv[1]);
    printf("abs(%.*f) = %.*f\n", PRECISION, n, PRECISION, absD(n));
    return 0;
}
I fixed the curly brace..
the error i am getting is
~ $ gc a02
gcc -Wall -g a02.c -o a02
/tmp/ccl2H7rf.s: Assembler messages:
/tmp/ccl2H7rf.s:228: Error: suffix or operands invalid for `fld'
/tmp/ccl2H7rf.s:229: Error: missing or invalid immediate expression `0x7FFFFFFFFFFFFFFF'
~ $
Do you need to do it in assembly? Is this a homework requirement, or are you looking for very high performance?
This doesn't use any predefined functions:
double absD(double n)
{
    if (n < 0.0)
        n = -n;
    return n;
}
I'm no expert, but it looks like you're using ( to open the assembly block and } to close it. You should probably use matching delimiters, not a mix of the two:
asm(" fld %eax \n"
" movl $0x7FFFFFFFFFFFFFFF, %eax \n"
" pop %eax \n"
};
Notice the curly bracket before the semicolon.
Depending on how you want to treat -0.0, you can use C99 / POSIX (2004)'s signbit() function.
#include <math.h>
#include <stdint.h>
#include <string.h>

double absD (double x)
{
    if ( signbit(x) ) {
#ifdef NAIVE
        return 0.0 - x;
#else
        /* you can't apply & to a double directly, so copy its bits
           into an integer, clear the sign bit, and copy them back */
        uint64_t bits;
        memcpy(&bits, &x, sizeof bits);
        bits &= 0x7FFFFFFFFFFFFFFFull;
        memcpy(&x, &bits, sizeof x);
        return x;
#endif
    } else {
        return x;
    }
}
But frankly, if you're already using the standard C library (atof and printf), I don't see why avoiding fabs() is desirable, since you can also do normal bit-twiddling in C, as above.
Of course, if you're writing assembly anyway, why not use the fchs instruction (or fabs, which clears the sign bit directly)?
You have errors in your assembly code, which the assembler gives you perfectly reasonable error messages about.
you can't load a floating point value directly from %eax -- the operand needs to be an address to load from
you can't have constant literals that don't fit in 32 bits.
The sign of a floating-point number is just the high bit, so all you need to do is clear the most significant bit.
If you must do this in assembly, then it seems to me that you would be better off using integer rather than floating-point instructions, since you can't do bitwise operations on x87 floating-point registers.
Also, there isn't any need to load the entire 8-byte value into any register; you could just as easily operate only on the high byte (or int) if your processor doesn't have 8-byte integer registers.
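If inline assembly is truly required, a minimal sketch using GCC extended asm (assuming an x87-capable target; the "t" constraint places the operand at the top of the x87 register stack):
double absD(double n)
{
    asm("fabs" : "+t"(n)); /* x87 fabs clears the sign bit of st(0) */
    return n;
}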
/tmp/ccl2H7rf.s:228: Error: suffix or operands invalid for `fld'
fld needs its operand to be in memory. Put the value in memory, i.e. on the stack, and supply the address.
/tmp/ccl2H7rf.s:229: Error: missing or invalid immediate expression `0x7FFFFFFFFFFFFFFF'
EAX does not hold more than 32 bits. If you meant this to be a floating-point value, put it on the floating-point stack with a load instruction, i.e. fld.
