Converting a signed int32 to an unsigned int64


int main(){
unsigned long a = 5;
int b = -6;
long c = a + b;
return 0;
}
I wanted to follow the rules explained in this link and confirm my understanding
of how the compiler emits code for a + b:
https://en.cppreference.com/w/c/language/conversion
1- b is first converted to an unsigned long:
If the unsigned type has conversion rank greater than or equal to the rank of the signed type, then the operand with the signed type is implicitly converted to the unsigned type.
So the compiler essentially does this:
unsigned long implicit_conversion_of_b = (unsigned long) b;
2- The above implicit conversion itself is covered with this rule under Integer conversions:
if the target type is unsigned, the value 2^b, where b is the number of bits in the target type, is repeatedly subtracted or added to the source value until the result fits in the target type.
3- We finally end up with these 64-bit values in a register before addition takes place:
a = 0x5
b = 0xfffffffffffffffa
Is the above a correct mapping to the rules?
Edit:
4- The final result is an unsigned long, which needs to be converted to long, as needed by c, using this rule:
otherwise, if the target type is signed, the behavior is implementation-defined (which may include raising a signal)

Is the above a correct mapping to the rules?
Yes.
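The mapping can also be checked directly in C. This is only a minimal sketch: it uses uint64_t and int32_t in place of unsigned long and int so that the widths from the question hold on any platform, and the helper names convert_b and add_mixed are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the implicit conversion applied to b in a + b.
   Converting to an unsigned type is defined as reduction modulo 2^64. */
uint64_t convert_b(int32_t b)
{
    return (uint64_t)b;
}

/* Hypothetical helper: a + b with the operand types from the question. */
uint64_t add_mixed(uint64_t a, int32_t b)
{
    uint64_t sum = a + b; /* b is converted to uint64_t before the add */

    /* The implicit conversion and the explicit cast must agree. */
    assert(sum == a + convert_b(b));
    return sum;
}
```

With a = 5 and b = -6, convert_b(b) is 0xfffffffffffffffa and the sum wraps to 0xffffffffffffffff. Storing that in a signed 64-bit c is the implementation-defined step from point 4; on two's complement implementations it typically yields -1.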

I'm certainly no assembly expert, but it's interesting to see what's happening here. I'm not sure which optimizations you're talking about; the listing below was compiled with -O0, and -O2 and -O3 obviously look a lot different:
main:
// setup stack
push rbp
mov rbp, rsp
// move 5 into 8-byte offset from the stack frame.
// rbp is the stack frame pointer, offset of 8 shows long is 8
// bytes on this architecture
mov QWORD PTR [rbp-8], 5
// move -6 to 12 bytes offset from rbp (or 4 byte offset from
// last value). This tells you int is 4 bytes on this architecture.
mov DWORD PTR [rbp-12], -6
// move the int into eax register. This is a 32-bit general
// purpose register
mov eax, DWORD PTR [rbp-12]
// movsx is a "move with sign extension" instruction. rdx is a
// 64-bit register, so this is your conversion from 32 to 64
// bits, preserving the sign
movsx rdx, eax
// moves 5 to the 64-bit rax register
mov rax, QWORD PTR [rbp-8]
// performs the 64-bit add
add rax, rdx
// store the sum in c at [rbp-24]; the result is not used
// afterwards, so clean up and prepare to return 0 from the function
mov QWORD PTR [rbp-24], rax
mov eax, 0
pop rbp
ret
This assembly was generated with gcc 11.2 on x86-64.

Related

Double cast to unsigned int on Win32 is truncating to 2,147,483,648

Compiling the following code:
#include <limits.h>
#include <stdio.h>
double getDouble()
{
double value = 2147483649.0;
return value;
}
int main()
{
printf("INT_MAX: %u\n", INT_MAX);
printf("UINT_MAX: %u\n", UINT_MAX);
printf("Double value: %f\n", getDouble());
printf("Direct cast value: %u\n", (unsigned int) getDouble());
double d = getDouble();
printf("Indirect cast value: %u\n", (unsigned int) d);
return 0;
}
Outputs (MSVC x86):
INT_MAX: 2147483647
UINT_MAX: 4294967295
Double value: 2147483649.000000
Direct cast value: 2147483648
Indirect cast value: 2147483649
Outputs (MSVC x64):
INT_MAX: 2147483647
UINT_MAX: 4294967295
Double value: 2147483649.000000
Direct cast value: 2147483649
Indirect cast value: 2147483649
In the Microsoft documentation there is no mention of the signed integer max value in conversions from double to unsigned int.
All values above INT_MAX are being truncated to 2147483648 when the value is the return value of a function.
I'm using Visual Studio 2019 to build the program. This doesn't happen on gcc.
Am I doing something wrong? Is there a safe way to convert double to unsigned int?

A compiler bug...
From the assembly provided by @anastaciu, the direct cast code calls __ftol2_sse, which seems to convert the number to a signed long. The routine name is ftol2_sse because this is an SSE-enabled machine, but the float is in an x87 floating point register.
; Line 17
call _getDouble
call __ftol2_sse
push eax
push OFFSET ??_C@_0BH@GDLBDFEH@Direct?5cast?5value?3?5?$CFu?6@
call _printf
add esp, 8
The indirect cast on the other hand does
; Line 18
call _getDouble
fstp QWORD PTR _d$[ebp]
; Line 19
movsd xmm0, QWORD PTR _d$[ebp]
call __dtoui3
push eax
push OFFSET ??_C@_0BJ@HCKMOBHF@Indirect?5cast?5value?3?5?$CFu?6@
call _printf
add esp, 8
which pops and stores the double value to the local variable, then loads it into a SSE register and calls __dtoui3 which is a double to unsigned int conversion routine...
The behaviour of the direct cast does not conform to C89; nor does it conform to any later revision - even C89 explicitly says that:
The remaindering operation done when a value of integral type is converted to unsigned type need not be done when a value of floating type is converted to unsigned type. Thus the range of portable values is [0, Utype_MAX + 1).
I believe the problem might be a continuation of this from 2005 - there used to be a conversion function called __ftol2 which probably would have worked for this code, i.e. it would have converted the value to a signed number -2147483647, which would have produced the correct result when interpreted as an unsigned number.
Unfortunately __ftol2_sse is not a drop-in replacement for __ftol2, as it would - instead of just taking the least-significant value bits as-is - signal the out-of-range error by returning LONG_MIN / 0x80000000, which, interpreted as unsigned long here, is not at all what was expected. The behaviour of __ftol2_sse would be valid for signed long, as conversion of a double value > LONG_MAX to signed long would have undefined behaviour.

Following @AnttiHaapala's answer, I tested the code using optimization /Ox and found that this will remove the bug as __ftol2_sse is no longer used:
//; 17 : printf("Direct cast value: %u\n", (unsigned int)getDouble());
push -2147483647 //; 80000001H
push OFFSET $SG10116
call _printf
//; 18 : double d = getDouble();
//; 19 : printf("Indirect cast value: %u\n", (unsigned int)d);
push -2147483647 //; 80000001H
push OFFSET $SG10117
call _printf
add esp, 28 //; 0000001cH
The optimizations inlined getDouble() and added constant expression evaluation, thus removing the need for a conversion at runtime and making the bug go away.
Just out of curiosity, I made some more tests, namely changing the code to force float-to-int conversion at runtime. In this case the result is still correct; with optimization, the compiler uses __dtoui3 in both conversions:
//; 19 : printf("Direct cast value: %u\n", (unsigned int)getDouble(d));
movsd xmm0, QWORD PTR _d$[esp+24]
add esp, 12 //; 0000000cH
call __dtoui3
push eax
push OFFSET $SG9261
call _printf
//; 20 : double db = getDouble(d);
//; 21 : printf("Indirect cast value: %u\n", (unsigned int)db);
movsd xmm0, QWORD PTR _d$[esp+20]
add esp, 8
call __dtoui3
push eax
push OFFSET $SG9262
call _printf
However, preventing inlining with __declspec(noinline) double getDouble(){...} will bring the bug back:
//; 17 : printf("Direct cast value: %u\n", (unsigned int)getDouble(d));
movsd xmm0, QWORD PTR _d$[esp+76]
add esp, 4
movsd QWORD PTR [esp], xmm0
call _getDouble
call __ftol2_sse
push eax
push OFFSET $SG9261
call _printf
//; 18 : double db = getDouble(d);
movsd xmm0, QWORD PTR _d$[esp+80]
add esp, 8
movsd QWORD PTR [esp], xmm0
call _getDouble
//; 19 : printf("Indirect cast value: %u\n", (unsigned int)db);
call __ftol2_sse
push eax
push OFFSET $SG9262
call _printf
__ftol2_sse is called in both conversions, making the output 2147483648 in both situations; @zwol's suspicions were correct.
Compilation details:
Using command line:
cl /permissive- /GS /analyze- /W3 /Gm- /Ox /sdl /D "WIN32" program.c
In Visual Studio:
Disabling RTC in Project -> Properties -> Code Generation and setting Basic Runtime Checks to default.
Enabling optimization in Project -> Properties -> Optimization and setting Optimization to /Ox.
With debugger in x86 mode.

Nobody has looked at the asm for MS's __ftol2_sse.
From the result, we can infer that it probably converted from x87 to signed int / long (both 32-bit types on Windows), instead of safely to uint32_t.
x86 FP -> integer instructions that overflow the integer result don't just wrap / truncate: they produce what Intel calls the "integer indefinite" when the exact value is not representable in the destination: high bit set, other bits clear. i.e. 0x80000000.
(Or if the FP invalid exception isn't masked, it fires and no value is stored. But in the default FP environment, all FP exceptions are masked. That's why for FP calculations you can get a NaN instead of a fault.)
That includes both x87 instructions like fistp (using the current rounding mode) and SSE2 instructions like cvttsd2si eax, xmm0 (using truncation toward 0, that's what the extra t means).
So it's a bug to compile double->unsigned conversion into a call to __ftol2_sse.
Side-note / tangent:
On x86-64, FP -> uint32_t can be compiled to cvttsd2si rax, xmm0, converting to a 64-bit signed destination, producing the uint32_t you want in the low half (EAX) of the integer destination.
It's C and C++ UB if the result is outside the 0..2^32-1 range so it's ok that huge positive or negative values will leave the low half of RAX (EAX) zero from the integer indefinite bit-pattern. (Unlike integer->integer conversions, modulo reduction of the value is not guaranteed; see the related question "Is the behaviour of casting a negative double to unsigned int defined in the C standard? Different behaviour on ARM vs. x86".) To be clear, nothing in the question is undefined or even implementation-defined behaviour. I'm just pointing out that if you have FP->int64_t, you can use it to efficiently implement FP->uint32_t. That includes x87 fistp, which can write a 64-bit integer destination even in 32-bit and 16-bit mode, unlike SSE2 instructions, which can only directly handle 64-bit integers in 64-bit mode.
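The FP -> int64_t -> uint32_t idea from the side-note can also be written in portable C. This is a minimal sketch with a hypothetical helper name; the range check is an added assumption so that the function never reaches the undefined case. Whether a given MSVC version avoids __ftol2_sse for this form has not been checked here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: converts through a 64-bit signed integer. */
uint32_t double_to_u32_via_i64(double d)
{
    /* The truncated value must fit in uint32_t; this also rejects NaN. */
    assert(d > -1.0 && d < 4294967296.0);

    int64_t wide = (int64_t)d; /* in range for int64_t, so well-defined */
    return (uint32_t)wide;     /* value is in 0..2^32-1, so this is exact */
}
```

For the value from the question, double_to_u32_via_i64(2147483649.0) returns 2147483649.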

Dividing in Assembly

I am trying to write a calculator in C based on the Linux command dc. The structure of the program is not so important; all you need to know is that I get two numbers and want to divide them when / is typed. To do that, I send these two numbers to an assembly function that performs the division (see code below). But this works for positive numbers only.
When typing 999 3 / it returns 333, which is correct, but when typing -999 3 / I get the strange number 1431655432. When both numbers are negative, as in -999 -3 /, I get 0 every time, for any two negative numbers.
The code in assembly is:
section .text
global _div
_div:
push rbp ; Save caller state
mov rbp, rsp
mov rax, rdi ; Copy function args to registers: leftmost...
mov rbx, rsi ; Next argument...
cqo
idiv rbx ; divide 2 arguments
mov [rbp-8], rax
pop rbp ; Restore caller state

Your comments say you are passing integers to _div. If you are using int, those are 32-bit values:
extern int _div (int a, int b);
When passed to the function, a will be in the bottom 32 bits of RDI and b will be in the bottom 32 bits of RSI. The upper 32 bits of the argument registers can be garbage; often they are zero, but that doesn't have to be the case.
If you use a 64-bit register as a divisor with IDIV then the division is RDX:RAX / 64-bit divisor (in your case RBX). The problem here is that you are using the full 64-bit registers to do 32-bit division. If we assume for argument's sake that the upper bits of RDI and RSI were originally 0, then RDI would be 0x00000000FFFFFC19 (copied to RAX) and RSI would be 0x0000000000000003 (copied to RBX). CQO sign extends RAX into RDX. The uppermost bit of RAX is zero, so RDX would be zero. The division would look like:
0x000000000000000000000000FFFFFC19 / 0x0000000000000003 = 0x55555408
0x55555408 happens to be 1431655432 (decimal), which is the result you were seeing. One fix for this is to use 32-bit registers for the division. To sign extend EAX (lower 32 bits of RAX) into EDX you can use CDQ instead of CQO. You can then divide EDX:EAX by EBX. This should get you the 32-bit signed division you are looking for. The code would look like:
cdq
idiv ebx ; divide 2 arguments EDX:EAX by EBX
Be aware that RBX, RBP, and R12 to R15 all need to be preserved by your function if you modify them (they are non-volatile registers in the AMD 64-bit ABI). If you modify RBX you need to make sure you save and restore it like you do with RBP. A better alternative is to use one of the volatile registers like RCX instead of RBX.
You don't need the intermediate register to place the divisor into. You could have used RSI (or ESI in the fixed version) directly instead of moving it to a register like RBX.
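The same effect can be reproduced in C, which may help when checking the numbers. This is a minimal sketch with hypothetical helper names: div_zero_extended models what the original code computes if the upper halves of RDI and RSI happen to be zero, and div_sign_extended models the intended signed division.

```c
#include <assert.h>
#include <stdint.h>

/* Treats the 32-bit patterns as 64-bit values with zero upper halves,
   as the original code does when RDI/RSI have zero upper bits. */
int64_t div_zero_extended(int32_t a, int32_t b)
{
    assert(b != 0);
    return (int64_t)(uint32_t)a / (int64_t)(uint32_t)b;
}

/* Sign-extends both operands to 64 bits before dividing. */
int64_t div_sign_extended(int32_t a, int32_t b)
{
    assert(b != 0);
    return (int64_t)a / (int64_t)b;
}
```

div_zero_extended(-999, 3) gives 1431655432, matching the question. div_zero_extended(-999, -3) gives 0, because both operands then look like large positive numbers just below 2^32 and the dividend is the smaller of the two; that would explain the 0 seen for -999 -3 /. div_sign_extended(-999, 3) gives -333.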

Your issue has to do with how arguments are passed to _div.
Assuming your _div's prototype is:
int64_t _div(int32_t, int32_t);
Then, the arguments are passed in edi and esi (i.e., 32-bit signed integers), the upper halves of the registers rdi and rsi are undefined.
Sign extension is needed when assigning edi and esi to rax and rbx for performing a 64-bit signed division (for performing a 64-bit unsigned division zero extension would be needed instead).
That is, instead of:
mov rax, rdi
mov rbx, rsi
use movsxd, the form of movsx that sign extends a 32-bit source into a 64-bit register, on edi and esi:
movsxd rax, edi
movsxd rbx, esi
Using true 64-bit operands for the 64-bit division
The previous approach consists of performing a 64-bit division on "fake" 64-bit operands (i.e., sign-extended 32-bit operands). Mixing 64-bit instructions with "32-bit operands" is usually not a very good idea because it may result in worse performance and larger code size.
A better approach would be to simply change the C prototype of your _div function to accept actual 64-bit arguments, i.e.:
int64_t _div(int64_t, int64_t);
This way, the arguments will be passed in rdi and rsi (i.e., already 64-bit signed integers) and a 64-bit division will be performed on true 64-bit integers.
Using a 32-bit division instead
You may also want to consider using the 32-bit idiv if it suits your needs, since it performs faster than a 64-bit division and the resulting code size is smaller (no REX prefix):
...
mov eax, edi
mov ebx, esi
cdq
idiv ebx
...
_div's prototype would be:
int32_t _div(int32_t, int32_t);

Move variable to cl and perform shr using inline assembly

So I am trying to translate the following assignment from C to inline assembly
resp = (0x1F)&(letter >> (3 - numB));
Assuming that the declaration of the variables are the following
unsigned char resp;
unsigned char letter;
int numB;
So I have tried the following:
_asm {
mov ebx, 01fh
movzx edx, letter
mov cl,3
sub cl, numB // Line 5
shr edx, cl
and ebx, edx
mov resp, ebx
}
or the following
_asm {
mov ebx, 01fh
movzx edx, letter
mov ecx,3
sub ecx, numB
mov cl, ecx // Line 5
shr edx, cl
and ebx, edx
mov resp, ebx
}
In both cases I get size operand error in Line 5.
How can I achieve the right shift?

The E*X registers are 32 bits, while the *L registers are 8 bits. Similarly, on Windows, the int type is 32 bits wide, while the char type is 8 bits wide. You cannot arbitrarily mix these sizes within a single instruction.
So, in your first piece of code:
sub cl, numB // Line 5
this is wrong because the cl register stores an 8-bit value, whereas the numB variable is of type int, which stores a 32-bit value. You cannot subtract a 32-bit value from an 8-bit value; both operands to the SUB instruction must be the same size.
Similarly, in your second piece of code:
mov cl, ecx // Line 5
you are trying to move the 32-bit value in ECX into the 8-bit CL register. That can't happen without some kind of truncation, so you have to indicate it explicitly. The MOV instruction requires that both of its operands have the same size.
(MOVZX and MOVSX are obvious exceptions to this rule that the operand types must match for a single instruction. These instructions zero-extend or sign-extend, respectively, a smaller value so that it can be stored into a larger-sized register.)
However, in this case, you don't even need the MOV instruction. Remember that CL is just the lower 8 bits of the full 32-bit ECX register. Therefore, setting ECX also implicitly sets CL. If you only need the lower 8 bits, you can just use CL in a subsequent instruction. Thus, your code becomes:
mov ebx, 01fh ; move constant into 32-bit EBX
movzx edx, BYTE PTR letter ; zero-extended move of 8-bit variable into 32-bit EDX
mov ecx, 3 ; move constant into ECX
sub ecx, DWORD PTR numB ; subtract 32-bit variable from ECX
shr edx, cl ; shift EDX right by the lower 8 bits of ECX
and ebx, edx ; bitwise AND of EDX and EBX, leaving result in EBX
mov BYTE PTR resp, bl ; move lower 8 bits of EBX into 8-bit variable
For the same operand-size matching issue discussed above, I've also had to change the final MOV instruction. You cannot move the value stored in a 32-bit register directly into an 8-bit variable. You will have to move either the lower 8 bits or the upper 8 bits, allowing you to use either the BL or BH registers, which are 8 bits and therefore match the size of resp. In the above code, I assumed that you want only the lower 8 bits, so I've used BL.
Also note that I've used the BYTE PTR and DWORD PTR specifications. These are not strictly necessary in MASM (or Visual Studio's inline assembler), since it can deduce the sizes of the types from the types of the variables. However, I think it increases readability, and is generally a recommended practice. DWORD means 32 bit; it is the same size as int and a 32-bit register (E*X). WORD means 16 bit; it is the same size as short and a 16-bit register (*X). BYTE means 8 bits; it is the same size as char and an 8-bit register (*L or *H).
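For comparing against the assembly, the original C expression can be kept as a small reference function. This is a sketch with a hypothetical function name; the assertion is an added assumption, since in C a shift count that is negative or not less than the width of the promoted operand is undefined, whereas x86 shr with a 32-bit operand only uses the low 5 bits of CL.

```c
#include <assert.h>

/* Hypothetical reference version of: resp = (0x1F)&(letter >> (3 - numB)); */
unsigned char extract_bits(unsigned char letter, int numB)
{
    int count = 3 - numB;

    /* letter is promoted to int before shifting, so the count must be
       in 0..(width of int - 1); a 32-bit int is assumed here. */
    assert(count >= 0 && count < 32);

    return (unsigned char)(0x1F & (letter >> count));
}
```

For example, extract_bits(0xA5, 1) shifts 0xA5 right by 2 to get 0x29 and masks it down to 0x09.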

Assembly: How to translate the IMUL opcode (with only one operand) to C code

Say I got
EDX = 0xA28
EAX = 0x0A280105
I run this ASM code
IMUL EDX
which, to my understanding, only uses EAX if one operand is specified.
So in C code it should be like
EAX *= EDX;
correct?
After looking in debugger.. I found out EDX got altered too.
0x0A280105 * 0xA28 = 0x67264A5AC8
in debugger
EAX = 264A5AC8
EDX = 00000067
now if you take the answer 0x67264A5AC8 and split off first hex pair, 0x67 264A5AC8
you can clearly see why the EDX and EAX are the way they are.
Okay, so an overflow happens, as it cannot store such a huge number in 32 bits, so it starts using an extra 8 bits in EDX.
But my question is how would I do this in C code now to get same results?
I'm guessing it would be like
EAX *= EDX;
EDX = 0xFFFFFFFF - EAX; //blah not good with math manipulation like this.

The IMUL instruction actually produces a result twice the size of the operand (unless you use one of the newer versions that can specify a destination). So:
imul 8bit -> result = ax, 16bits
imul 16bit -> result = dx:ax, 32bits
imul 32bit -> result = edx:eax, 64bits
To do this in C will be dependent on the compiler, but some will work doing this:
long result = (long) eax * (long) edx;
eax = result & 0xffffffff;
edx = result >> 32;
This assumes a long is 64 bits. If the compiler has no 64 bit data type then calculating the result becomes much harder, you need to do long multiplication.
You could always inline the imul instruction.
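With a fixed-width 64-bit type, the same idea does not depend on the size of long. This is a minimal sketch, assuming <stdint.h> is available; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct edx_eax {
    uint32_t edx; /* high 32 bits of the product */
    uint32_t eax; /* low 32 bits of the product */
};

/* Models one-operand IMUL with a 32-bit operand:
   signed EAX * operand -> EDX:EAX. */
struct edx_eax imul32(int32_t eax, int32_t operand)
{
    int64_t product = (int64_t)eax * operand; /* cannot overflow 64 bits */
    uint64_t bits = (uint64_t)product;        /* modulo 2^64, well-defined */

    struct edx_eax r;
    r.eax = (uint32_t)bits;
    r.edx = (uint32_t)(bits >> 32);

    /* Recombining the halves must give back the full product. */
    assert((((uint64_t)r.edx << 32) | r.eax) == bits);
    return r;
}
```

Calling imul32(0x0A280105, 0xA28) gives eax = 0x264A5AC8 and edx = 0x67, the values seen in the debugger.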

GCC hotpatching?

When I compile this piece of code
unsigned char A[] = {1, 2, 3, 4};
unsigned int
f (unsigned int x)
{
return A[x];
}
gcc outputs
mov edi, edi
movzx eax, BYTE PTR A[rdi]
ret
on a x86_64 machine.
The question is: why is a nop instruction (mov edi, edi) there?
I am using gcc-4.4.4.

In 64-bit mode, mov edi, edi is not a no-op. What it does is set the top 32 bits of rdi to 0.
This is a special case of the general fact that all 32-bit operations clear the top 32 bits of the destination register in 64-bit mode. (This allows a more efficient CPU than leaving them unchanged and is perhaps more useful as well.)
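In C terms, the instruction corresponds to widening the unsigned int argument to 64 bits before it is used as an index. This is a sketch with a hypothetical function name that models the effect on the register value:

```c
#include <assert.h>
#include <stdint.h>

/* Models mov edi, edi in 64-bit mode: keep the low 32 bits of rdi,
   clear the upper 32 bits. */
uint64_t mov_edi_edi(uint64_t rdi)
{
    uint64_t result = (uint64_t)(uint32_t)rdi;

    assert((result >> 32) == 0); /* upper half is always zero afterwards */
    return result;
}
```

The compiler presumably emits it because x is a 32-bit unsigned int, so the upper half of rdi is not guaranteed to be zero on entry, while A[rdi] uses the full 64-bit register as the index.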
