Defining floating-point numbers in x86 Assembly - C translation

Currently studying C. When I define, for example, an array (vector), such as:
float var1[2023] = { -53.3125 };
what would the corresponding x86 assembly translation look like? I'm looking for the exact portion of code where the variable is defined, where the ".type", ".size", and alignment values appear.
I've seen on the internet that when dealing with a floating-point number, the x86 assembly conversion will simply be ".long". However, I'm not sure to what extent that is correct.

One easy way to find out is to ask the compiler to show you:
// float.c
float var1[2023] = { -53.3125 };
then compile it:
$ gcc -S float.c
and then study the output:
.file "float.c"
.globl var1
.data
.align 32
.type var1, #object
.size var1, 8092
var1:
.long 3260366848
.zero 8088
.ident "GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-39)"
.section .note.GNU-stack,"",#progbits
Note that this is just GCC's implementation; clang does it differently:
.file "float.c"
.type var1,#object # #var1
.data
.globl var1
.align 16
var1:
.long 3260366848 # float -5.331250e+01
.long 0 # float 0.000000e+00
.long 0 # float 0.000000e+00
// thousands of these
.size var1, 8092
.ident "clang version 3.4.2 (tags/RELEASE_34/dot2-final)"
.section ".note.GNU-stack","",#progbits
EDIT - To answer the comment below: the use of .long simply lays down a specific bit pattern that encodes the compiler's idea of the floating-point format.
The value 3260366848 is the same as hex 0xC2554000, which is 11000010010101010100000000000000 in binary, and it's that binary value the CPU cares about. If you care to, you can get out your IEEE floating-point spec and decode it (there's the sign, that's the exponent, etc.), but all the details of the floating-point encoding were handled by the compiler, not the assembler.
I'm no kind of compiler expert, but decades ago I was tracking down a bug in a C compiler's floating-point support, and though I don't remember the details, it strikes me that having the compiler emit the encoded pattern this way would have been helpful: it would have saved me from using a disassembler to find out what bit pattern was actually being encoded.
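If you want to see that decoding spelled out, here is a minimal sketch (my own illustration, not part of the original answer) that pulls the three IEEE 754 fields out of this exact bit pattern and reassembles the value:

#include <stdio.h>
#include <stdint.h>
#include <math.h>   /* for ldexp */

int main(void)
{
    uint32_t bits = 0xC2554000u;              /* the pattern GCC emitted as .long */
    uint32_t sign     = bits >> 31;           /* 1 bit:  1 means negative */
    uint32_t exponent = (bits >> 23) & 0xFFu; /* 8 bits: biased by 127 */
    uint32_t mantissa = bits & 0x7FFFFFu;     /* 23 bits: implicit leading 1 not stored */

    /* value = (-1)^sign * (1 + mantissa/2^23) * 2^(exponent-127) */
    double value = (sign ? -1.0 : 1.0)
                 * ldexp(1.0 + mantissa / 8388608.0, (int)exponent - 127);

    printf("sign=%u exp=%u (unbiased %d) mantissa=0x%06x -> %f\n",
           (unsigned)sign, (unsigned)exponent, (int)exponent - 127,
           (unsigned)mantissa, value);
    /* prints: sign=1 exp=132 (unbiased 5) mantissa=0x554000 -> -53.312500 */
    return 0;
}

(Compile with -lm on Linux, since it calls ldexp.)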
Surely others will weigh in here.
EDIT2 - Bits are bits, and this little C program (which relies on int and float being the same size) demonstrates this:
// float2.c
#include <stdio.h>
#include <string.h> /* memcpy is declared here */

int main()
{
    float f = -53.3125;
    unsigned int i;

    printf("sizeof int = %zu\n", sizeof(i));
    printf("sizeof flt = %zu\n", sizeof(f));

    memcpy(&i, &f, sizeof i); /* copy the float's bits into an int */

    printf("float = %f\n", f);
    printf("i = 0x%08x\n", i);
    printf("i = %u\n", i);
    return 0;
}
Running it shows that bits are bits:
sizeof int = 4
sizeof flt = 4
float = -53.312500
i = 0xc2554000
i = 3260366848 <-- there ya go
These are just different display notations for the same 32 bits, depending on how you look at them.
Now, to answer the question of how you would determine 3260366848 on your own from the floating-point value: you'd need to get out your IEEE standard and draw out all the bits manually (strong coffee recommended), then read those 32 bits as an integer.
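For this particular value, the manual derivation goes like this (worked out here as a check; it matches what the compiler emitted above):

53.3125 = 110101.0101 in binary, so -53.3125 = -1.101010101 x 2^5
sign     = 1 (negative)
exponent = 5 + 127 = 132 = 10000100 in binary
mantissa = 10101010100000000000000 (23 bits; the implicit leading 1 is dropped)

Concatenated: 1 10000100 10101010100000000000000 = 0xC2554000 = 3260366848.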

Related

Convert a float literal to int representation in x86 assembly?

The following C code:
int main()
{
    float f;
    f = 3.0;
}
Is converted to the following assembly instructions:
main:
	pushl	%ebp
	movl	%esp, %ebp
	subl	$16, %esp
	flds	.LC0
	fstps	-4(%ebp)
	movl	$0, %eax
	leave
	ret
.LC0:
	.long	1077936128
What is the correct way to calculate the .long/int representation of the float literal?
e.g. 1077936128 generated from 3.0 for the example shown above
For this example, gcc is used with the -m32 -S -O0 -fno-stack-protector -fno-asynchronous-unwind-tables flags to generate the assembly output (shown here in the default AT&T syntax).
References:
Compiler Explorer Link with compilation flags and other settings
x86 FPU hardware uses IEEE754 binary32 / binary64 representations for float / double.
Determining the IEEE 754 representation of a floating point number is not trivial for humans. In handwritten assembly code, it's usually a good idea to use the .float or .double directives instead:
.float 3.0 # generates 3.0 as a 32 bit float
.double 3.0 # generates 3.0 as a 64 bit float
If you really want to compute this manually, refer to the explanations on Wikipedia. It might be interesting to do so as an exercise, but for actual programming it's tedious and mostly useless.
Compilers do the conversion (with rounding to the nearest representable FP value) internally, because FP values often don't come directly from a literal in the source; they can come from constant folding. e.g. 1.23 * 4.56 is evaluated at compile time, so the compiler already ends up with FP values in float or double binary representation. Printing them back to decimal for the assembler to parse and re-convert to binary would be slower and might require a lot of decimal places.
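As a tiny demonstration of that folding (the file name fold.c is mine, purely illustrative), compiling this one-liner with gcc -S shows the product already baked into the data section; no multiply instruction and no runtime decimal parsing remain:

/* fold.c - the compiler folds 1.23 * 4.56 at compile time; the .s output
   contains only the binary64 encoding of the result (5.6088, rounded to
   the nearest representable double). */
double d = 1.23 * 4.56;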
To compute the representation of a 32 bit float as a 32 bit integer, you can use an online IEEE754 converter, or a program like this:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(int argc, char *argv[])
{
    union { uint32_t u32; float f32; } intfloat;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s some-number\n", argv[0]);
        return EXIT_FAILURE;
    }

    intfloat.f32 = atof(argv[1]);
    printf("0x%08" PRIx32 "\n", intfloat.u32);
    return EXIT_SUCCESS;
}
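For example (the binary name is mine), running it on the literal from the question reproduces the constant from the assembly listing; 0x40400000 is decimal 1077936128, the value in the .long directive above:

$ gcc -o float2int float2int.c
$ ./float2int 3.0
0x40400000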

How to specify default global variable alignment for gcc?

How do I get rid of alignment (.align 4 below) for all global variables by default with GCC, without having to specify __attribute__((aligned(1))) for each variable?
I know that what I ask for is a bad idea to apply universally, because on some architectures an alignment of 1 wouldn't work, e.g. because the CPU is not able to dereference an unaligned pointer. But in my case I'm writing an i386 bootloader, and unaligned pointers are fine (just slower) there.
Source code (a.c):
__attribute__((aligned(1))) int answer0 = 41;
int answer = 42;
Compiled with: gcc -m32 -Os -S a.c
Assembly output (a.s):
.file "a.c"
.globl answer
.data
.align 4
.type answer, #object
.size answer, 4
answer:
.long 42
.globl answer0
.type answer0, #object
.size answer0, 4
answer0:
.long 41
.ident "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4"
.section .note.GNU-stack,"",#progbits
The flag gcc -fpack-struct=1 changes the alignment of all struct members and structs to 1. For example, with that flag
struct x { char a; int b; };
struct y { int v : sizeof(char) + sizeof(int) == sizeof(struct x); };
struct z { int b; };
struct x x = { 1, 1 };
int i = 42;
struct z z = { 2 };
compiles to no alignment for the variables x and z, but it still has an .align 4 for the variable i (of type int). I need a solution which also makes int i = 42; unaligned, without having to specify something extra for each such variable.
IMO, packing variables to save space using a packed struct is the easiest and safest way. For example:
#include <stdio.h>
#include <stdint.h>

#define _packed __attribute__((packed))

_packed struct
{
    uint8_t x1;
    _packed int x2;
    _packed uint8_t x3[2];
    _packed int x4;
} byte_int;

int main(void)
{
    printf("%p %p %p %p\n", &byte_int.x1, &byte_int.x2, &byte_int.x3, &byte_int.x4);
    /* I know casting the addresses to unsigned int is UB; it's just to show
       them in decimal, where the odd and even addresses are easier to spot. */
    printf("%u %u %u %u\n", (unsigned int)&byte_int.x1, (unsigned int)&byte_int.x2,
           (unsigned int)&byte_int.x3, (unsigned int)&byte_int.x4);
    return 0;
}
https://ideone.com/bY1soH
Most probably gcc doesn't have such a flag which can change the default alignment of global variables.
gcc -fpack-struct=1 can be a workaround, but only for global variables which happen to be of struct type.
Also, post-processing the .s output of gcc and removing (some of) the .align lines could work as a workaround, for example as sketched below.
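A sketch of that approach (the sed pattern is mine and deliberately blunt: it deletes every .align directive, so check the resulting object file carefully):

$ gcc -m32 -Os -S a.c
$ sed -i '/^\s*\.align/d' a.s
$ gcc -m32 -c a.s -o a.o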

Unable to access correct global label data of assembly from C in linux

I have an assembly file (hello1.s) where the global label A_Td is defined, and I want to access all the long data values defined under the global label A_Td from inside a C program.
.file "hello1.s"
.globl A_Td
.text
.align 64
A_Td:
.long 1353184337,1353184337
.long 1399144830,1399144830
.long 3282310938,3282310938
.long 2522752826,2522752826
.long 3412831035,3412831035
.long 4047871263,4047871263
.long 2874735276,2874735276
.long 2466505547,2466505547
As A_Td is defined in the .text section, it is placed with the code, and only one copy is loaded into memory.
Using yasm, I have generated the hello1.o file:
yasm -p gas -f elf32 hello1.s
Now, to access all the long data using the global label A_Td, I have written the following C code (test_glob.c), taking a clue from this answer on global labels.
//test_glob.c
#include <stdio.h>

extern int A_Td; /* only the address of the label is used below */

int main()
{
    long *p;
    int i;

    p = (long *)(&A_Td);
    for (i = 0; i < 16; i++)
    {
        printf("p+%d %p %ld\n", i, (void *)(p + i), *(p + i));
    }
    return 0;
}
Using the following commands, I compiled the C program and then ran it:
gcc hello1.o test_glob.c
./a.out
I am getting the following output:
p+0 0x8048400 1353184337
p+1 0x8048404 1353184337
p+2 0x8048408 1399144830
p+3 0x804840c 1399144830 -----> correct till this place
p+4 0x8048410 -1012656358 -----> incorrect value retrieved from this place
p+5 0x8048414 -1012656358
p+6 0x8048418 -1772214470
p+7 0x804841c -1772214470
p+8 0x8048420 -882136261
p+9 0x8048424 -882136261
p+10 0x8048428 -247096033
p+11 0x804842c -247096033
p+12 0x8048430 -1420232020
p+13 0x8048434 -1420232020
p+14 0x8048438 -1828461749
p+15 0x804843c -1828461749
ONLY the first 4 long values are correctly accessed from the C program. Why is this happening?
What needs to be done inside the C program to access the rest of the data correctly?
I am using Linux. Any help to resolve this issue, or any relevant link, would be a great help. Thanks in advance.
How many bytes does a long have on this system?
It seems to me that printf interprets the numbers as four-byte signed integers: the value 3282310938 has the hex value 0xC3A4171A, which is above 0x7FFFFFFF (decimal 2147483647), the largest positive four-byte signed number, and is hence printed as the negative value -1012656358.
I assume that the assembler simply interprets these four-byte numbers as unsigned.
If you use %lu instead of %ld, printf will interpret the numbers as unsigned and should show what you expected.
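Concretely, in the loop above that would be (my adaptation of the question's printf line):

printf("p+%d %p %lu\n", i, (void *)(p + i), (unsigned long)*(p + i));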

What is happening here in pow function?

I have seen various answers here that depict the strange behavior of the pow function in C.
But I have something different to ask.
In the code below I have initialized int x = pow(10,2) and int y = pow(10,n) (with int n = 2).
In the first case, when I print the result, it shows 100; in the other case it comes out as 99.
I know that pow returns a double which gets truncated on storing into an int, but I want to ask why the two outputs come out different.
CODE1
#include <stdio.h>
#include <math.h>

int main()
{
    int n = 2;
    int x;
    int y;

    x = pow(10, 2); // printing gives output 100
    y = pow(10, n); // printing gives output 99
    printf("%d %d", x, y);
}
Output : 100 99
Why is the output coming out different?
My gcc version is 4.9.2
Update :
Code 2
#include <stdio.h>
#include <math.h>

int main()
{
    int n = 2;
    int x;
    int y;

    x = pow(10, 2); // printing gives output 100
    y = pow(10, n); // printing gives output 99

    double k = pow(10, 2);
    double l = pow(10, n);

    printf("%d %d\n", x, y);
    printf("%f %f\n", k, l);
}
Output : 100 99
100.000000 100.000000
Update 2: assembly instructions for CODE1.
Generated assembly instructions, GCC 4.9.2, using gcc -S -masm=intel:
.LC1:
	.ascii "%d %d\0"
	.text
	.globl	_main
	.def	_main; .scl 2; .type 32; .endef
_main:
	push	ebp
	mov	ebp, esp
	and	esp, -16
	sub	esp, 48
	call	___main
	mov	DWORD PTR [esp+44], 2
	mov	DWORD PTR [esp+40], 100   // concerned line
	fild	DWORD PTR [esp+44]
	fstp	QWORD PTR [esp+8]
	fld	QWORD PTR LC0
	fstp	QWORD PTR [esp]
	call	_pow                      // concerned line
	fnstcw	WORD PTR [esp+30]
	movzx	eax, WORD PTR [esp+30]
	mov	ah, 12
	mov	WORD PTR [esp+28], ax
	fldcw	WORD PTR [esp+28]
	fistp	DWORD PTR [esp+36]
	fldcw	WORD PTR [esp+30]
	mov	eax, DWORD PTR [esp+36]
	mov	DWORD PTR [esp+8], eax
	mov	eax, DWORD PTR [esp+40]
	mov	DWORD PTR [esp+4], eax
	mov	DWORD PTR [esp], OFFSET FLAT:LC1
	call	_printf
	leave
	ret
	.section .rdata,"dr"
	.align 8
LC0:
	.long	0
	.long	1076101120
	.ident	"GCC: (tdm-1) 4.9.2"
	.def	_pow; .scl 2; .type 32; .endef
	.def	_printf; .scl 2; .type 32; .endef
I know that pow returns a double which gets truncated on storing into an int, but I want to ask why the two outputs come out different.
You must first, if you haven't already, divest yourself of the idea that floating-point numbers are in any way sensible or predictable. double only approximates real numbers and almost anything you do with a double is likely to be an approximation to the actual result.
That said, as you have realized, pow(10, n) resulted in a value like 99.99999999999997, which is an approximation accurate to 15 significant figures. And then you told it to truncate to the largest integer less than that, so it threw away most of those.
(Aside: there is rarely a good reason to convert a double to an int. Usually you should either format it for display with something like sprintf("%.0f", x), which does rounding correctly, or use the floor function, which can handle floating-point numbers that may be out of the range of an int. If neither of those suit your purpose, like in currency or date calculations, possibly you should not be using floating point numbers at all.)
There are two weird things going on here. First, why is pow(10, n) inaccurate? 10, 2, and 100 are all precisely representable as double. The best answer I can offer is that the C standard library you are using has a bug. (The compiler and the standard library, which I assume are gcc and glibc, are developed on different release schedules and by different teams. If pow is returning inaccurate results, that is probably a bug in glibc, not gcc.)
In the comments on your question, amdn found a glibc bug to do with FP rounding that might be related and another Q&A that goes into more detail about why this happens and how it's not a violation of the C standard. chux's answer also addresses this. (C doesn't require implementation of IEEE 754, but even if it did, pow isn't required to use correct rounding.) I will still call this a glibc bug, because it's an undesirable property.
(It's also conceivable, though unlikely, that your processor's FPU is wrong.)
Second, why is pow(10, n) different from pow(10, 2)? This one is far easier. gcc optimizes away function calls for which the result can be calculated at compile time, so pow(10, 2) is almost certainly being optimized to 100.0. If you look at the generated assembly code, you will find only one call to pow.
The GCC manual, section 6.59 describes which standard library functions may be treated in this way (follow the link for the full list):
The remaining functions are provided for optimization purposes.
With the exception of built-ins that have library equivalents such as the standard C library functions discussed below, or that expand to library calls, GCC built-in functions are always expanded inline and thus do not have corresponding entry points and their address cannot be obtained. Attempting to use them in an expression other than a function call results in a compile-time error.
[...]
The ISO C90 functions abort, abs, acos, asin, atan2, atan, calloc, ceil, cosh, cos, exit, exp, fabs, floor, fmod, fprintf, fputs, frexp, fscanf, isalnum, isalpha, iscntrl, isdigit, isgraph, islower, isprint, ispunct, isspace, isupper, isxdigit, tolower, toupper, labs, ldexp, log10, log, malloc, memchr, memcmp, memcpy, memset, modf, pow, printf, putchar, puts, scanf, sinh, sin, snprintf, sprintf, sqrt, sscanf, strcat, strchr, strcmp, strcpy, strcspn, strlen, strncat, strncmp, strncpy, strpbrk, strrchr, strspn, strstr, tanh, tan, vfprintf, vprintf and vsprintf are all recognized as built-in functions unless -fno-builtin is specified (or -fno-builtin-function is specified for an individual function).
So it would seem you can disable this behavior with -fno-builtin-pow.
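As a quick sanity check (the file name code1.c is mine; the exact values will depend on your libm), recompiling Code 1 with the builtin disabled forces both calls through the library, so the two results should then agree with each other:

$ gcc -fno-builtin-pow code1.c -lm -o code1
$ ./code1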
Why is the output coming out different? (in the updated, appended code)
We do not actually know from that output whether the two values differ.
When comparing the textual output of an int and a double, be sure to print the double with sufficient precision to see whether it is exactly 100.000000 or just near 100.000000, or print it in hex to remove all doubt.
printf("%d %d\n" , x , y);
// printf("%f %f\n" , k , l);
// Is it the FP number just less than 100?
printf("%.17e %.17e\n" , k , l); // maybe 9.99999999999999858e+01
printf("%a %a\n" , k , l); // maybe 0x1.8ffffffffffff0000p+6
Why is the output coming out different? (in the original code)
C does not specify the accuracy of most <math.h> functions. The following are all compliant results.
// Higher quality functions return 100.0
pow(10,2) --> 100.0
// Lower quality and/or faster one may return nearby results
pow(10,2) --> 100.0000000000000142...
pow(10,2) --> 99.9999999999999857...
Assigning a floating-point (FP) number to an int simply drops the fraction, regardless of how close the fraction is to 1.0.
When converting FP to an integer, it is better to control the conversion and round, to cope with minor computational differences:
// long int lround(double x);
long i = lround(pow(10.0,2.0));
You're not the first to find this. Here's a discussion from 2013:
pow() cast to integer, unexpected result
I'm speculating that the assembly code produced by the tcc guys is causing the second value to be rounded down after calculating a result that is REALLY close to 100.
As mikijov said in that historic post, it looks like the bug has been fixed.
As others have mentioned, Code 2 returns 99 due to floating-point truncation. The reason Code 1 returns a different, correct answer is a compile-time optimization.
When the power is a small positive integer, it is more efficient to perform the operation as repeated multiplication, and that simpler path avoids the round-off. Since this is inlined, you don't see any function call being made.
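The idea, roughly (a sketch of the concept only, not the actual code gcc emits):

/* Illustration: exponentiation by repeated multiplication for small
   integer exponents. 10.0 * 10.0 is exactly 100.0 in binary64, so this
   path introduces no round-off at all. */
static double pow_small_int(double base, unsigned n)
{
    double result = 1.0;
    while (n-- > 0)
        result *= base;
    return result;
}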
You've fooled it into thinking that the inputs are real (floating-point) values, and so it gives an approximate answer, which happens to be slightly under 100, e.g. 99.999999, and is then truncated to 99.

C compiler optimize loop by running it

Can a C compiler ever optimize a loop by running it?
For example:
int num[] = {1, 2, 3, 4, 5}, i;

for (i = 0; i < sizeof(num) / sizeof(num[0]); i++) {
    if (num[i] > 6) {
        printf("Error in data\n");
        exit(1);
    }
}
Instead of running this each time the program is executed, can the compiler simply run this and optimize it away?
Let's have a look… (This really is the only way to tell.)
First, I've converted your snippet into something we can actually compile and run, and saved it in a file named main.c.
#include <stdio.h>

static int
f()
{
    const int num[] = {1, 2, 3, 4, 5};
    int i;
    for (i = 0; i < sizeof(num) / sizeof(num[0]); i++)
    {
        if (num[i] > 6)
        {
            printf("Error in data\n");
            return 1;
        }
    }
    return 0;
}

int
main()
{
    return f();
}
Running gcc -S -O3 main.c produces the following assembly file (in main.s).
.file "main.c"
.section .text.unlikely,"ax",#progbits
.LCOLDB0:
.section .text.startup,"ax",#progbits
.LHOTB0:
.p2align 4,,15
.globl main
.type main, #function
main:
.LFB22:
.cfi_startproc
xorl %eax, %eax
ret
.cfi_endproc
.LFE22:
.size main, .-main
.section .text.unlikely
.LCOLDE0:
.section .text.startup
.LHOTE0:
.ident "GCC: (GNU) 5.1.0"
.section .note.GNU-stack,"",#progbits
Even if you don't know assembly, you'll notice that the string "Error in data\n" is not present in the file so, apparently, some kind of optimization must have taken place.
If we look closer at the machine instructions generated for the main function,
	xorl	%eax, %eax
	ret
we can see that all it does is XOR the EAX register with itself (which always results in zero) and return; EAX is the register used to hold the return value. In other words, the f function, and the entire loop with it, was completely optimized away.
Yes. The compiler can unroll loops automatically with options such as -O3 (or -Otime on compilers that support it).
You didn't specify the compiler, but using gcc with -O3, and with the size calculation taken outside the for, maybe it could make a little adjustment like that.
Compilers can do even better than that. Not only can compilers examine the effect of running code "forward", but the Standard even allows them to work code logic in reverse in situations involving potential Undefined Behavior. For example, given:
#include <stdio.h>

int main(void)
{
    int ch = getchar();
    int q;
    if (ch == 'Z')
        q = 5;
    printf("You typed %c and the magic value is %d", ch, q);
    return 0;
}
a compiler would be entitled to assume that the program will never receive any input which would cause the printf to be reached without q having received a value; since the only input character which would cause q to receive a value would be 'Z', a compiler could thus legitimately replace the code with:
int main(void)
{
    getchar();
    printf("You typed Z and the magic value is 5");
}
If the user types Z, the behavior of the original program will be well-defined, and the behavior of the latter will match it. If the user types anything else, the original program will invoke Undefined Behavior and, as a consequence, the Standard will impose no requirements on what the compiler may do. A compiler will be entitled to do anything it likes, including producing the same result as would be produced by typing Z.
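For contrast, here is a minimal variant (mine, not from the original answer) that removes the compiler's latitude: once q is initialized on every path, there is no Undefined Behavior, and the compiler must preserve both outcomes:

#include <stdio.h>

int main(void)
{
    int ch = getchar();
    int q = 0; /* initialized: every execution path is now well-defined */
    if (ch == 'Z')
        q = 5;
    /* The compiler can no longer assume ch == 'Z'; the branch must survive. */
    printf("You typed %c and the magic value is %d", ch, q);
    return 0;
}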
