Why does MSVC not inline variable names in the generated assembly code? - c

When compiling the program beneath to assembly using MSVC:
int main() {int var1 = 1; int var2 = 2;}
Then the generated assembly code will look like this:
var1$ = 0
var2$ = 4
main PROC
sub rsp, 24
mov DWORD PTR var1$[rsp], 1
mov DWORD PTR var2$[rsp], 2
xor eax, eax
add rsp, 24
ret 0
main ENDP
I'm wondering why the var1$ = 0 and var2$ = 4 is necessary since the it could be avoided by inlining it in the mov as such: mov DWORD PTR [rsp], 1 and mov DWORD PTR [rsp + 4], 2. This results in a smaller assembly, and maybe also a smaller .obj and .exe file.
I'm compiling using CL Test.c /Fa

Related

returning Assembly Function In C Code x86 Compilation on 64 bit linux

C code
#include <stdio.h>
int fibonacci(int);
int main()
{
int x = fibonacci(3);
printf("Fibonacci is : %d",x);
return 0;
}
Assembly
section .text
global fibonacci
fibonacci:
push ebp;
mov ebp, esp;
; initialize
mov dword [prev], 0x00000000;
mov dword [cur], 0x00000001;
mov byte [it], 0x01;
mov eax, dword [ebp + 8]; // n = 3
mov byte [n], al;
getfib:
xor edx,edx;
mov dl, byte [n];
cmp byte [it] , dl;
jg loopend;
mov eax,dword [prev];
add eax, dword [cur];
mov ebx, dword [cur];
mov dword [prev], ebx;
mov dword [cur] , eax;
inc byte [it];
jmp getfib;
loopend:
mov eax, dword [cur];
pop ebp;
ret;
section .bss
it resb 1
prev resd 1
cur resd 1
n resb 1
I was trying to run this assembly function in C code and on debugging , i saw that value in variable x in C code is right but there is some error coming when i use the printf function
Need Help on it
Command to compile:
nasm -f elf32 asmcode.asm -o a.o
gcc -ggdb -no-pie -m32 a.o ccode.c -o a.out
Click Below Pictures if they seem blurred
Below is debug before printf execute
Below is after printf execute
Your code does not preserve the ebx register which is a callee-preserved register. The main function apparently tries to do some rip-relative addressing to obtain the address of the format string for printf using ebx as a base register. This fails because your code overwrote ebx.
To fix this issue, make sure to save all callee-saved registers before you use them and then restore their value on return. For example, you can do
fibonacci:
push ebp
mov ebp, esp
push ebx ; <---
...
pop ebx ; <---
pop ebp
ret

clang: compiling with non-flat x86 stack model

If I understand clang assumes that the stack segment for x86 is flat (has 0 base). E.g., when compiling using the following command line:
clang -cc1 -S -mllvm --x86-asm-syntax=intel -o - -triple i986-unknown-unknown -mrelocation-model static xxx.c
xxx.c:
void f()
{
int a = 5;
int *ap = &a;
int b = *ap;
}
the following assembly is produced:
f:
sub ESP, 12
lea EAX, DWORD PTR [ESP + 8]
mov DWORD PTR [ESP + 8], 5
mov DWORD PTR [ESP + 4], EAX
mov EAX, DWORD PTR [ESP + 4]
mov EAX, DWORD PTR [EAX]
mov DWORD PTR [ESP], EAX
add ESP, 12
ret
This may only be correct if the stack is flat, because EAX contains the offset from SS base.
Is it possible to compile the C code for SS with an arbitrary base?
When executing in protected mode, segment registers do not contain offsets you can use. Rather, each segment is independent of the others. As far as I know, no operating system in use today uses an ABI where pointers contain segment information and as you can take the address of any local variable (placed on the stack) this would be necessary. I don't think clang (or gcc for that matter) can compile code for such a model.

Annotated disassembly of executable

When compiling a C program to an object file, it's easy to get the Microsoft compiler to give you an annotated disassembly (with names of functions and variables, source line numbers etc.) using cl /Fa.
I'm trying to get something similar from the final linked executable (assuming the program was compiled with appropriate debug information), which seems to be trickier; dumpbin and objdump seem to only provide non-annotated disassembly.
What's the best way to obtain this?
if you have the program compiled with debuginfo windbg should provide disassembly of a function with line numbers
sample code compiled with debug info and an assembly file generated with /Fa
C:\codesnips\comparesrc\debug>cl /Zi /Fa comparesrc.cpp /link /Debug
comparesrc.cpp
/out:comparesrc.exe
/debug
/Debug
comparesrc.obj
the source for the above compilation
C:\codesnips\comparesrc\debug>type comparesrc.cpp
#include <stdio.h> // standard include file
int main (void)
{ // this line will become prolog
printf("hello my dear source compare\n"); // see str in .data section
puts("c"); // will put a char* with line break to console
puts("om");
puts("pare");
int a,b,c,d;
a = 2; b =3 ; c = 4;
d = a+b-c; // 2+3 -4 = 1
printf("%d\n",d); // should print 1
d = (a*b)/c; // 2*3 /4 = 6 /4 numerator = 1
printf("%d\n",d); // should printf 1
d = (a*b)%c; // 2 * 3 % 4 denominator = 2
printf("%d\n",d); // should print 2
return 0; // lets generate a cod file and see the assembly
} // this line will get converted to epilog
the assembly file created by /Fa switch
C:\codesnips\comparesrc\debug>type comparesrc.asm
; Listing generated by Microsoft (R) Optimizing Compiler Version 16.00.30319.01
TITLE C:\codesnips\comparesrc\debug\comparesrc.cpp
.686P
.XMM
include listing.inc
.model flat
INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES
CONST SEGMENT
$SG3850 DB 'hello my dear source compare', 0aH, 00H
ORG $+2
$SG3851 DB 'c', 00H
ORG $+2
$SG3852 DB 'om', 00H
ORG $+1
$SG3853 DB 'pare', 00H
ORG $+3
$SG3858 DB '%d', 0aH, 00H
$SG3859 DB '%d', 0aH, 00H
$SG3860 DB '%d', 0aH, 00H
CONST ENDS
PUBLIC _main
EXTRN _puts:PROC
EXTRN _printf:PROC
; Function compile flags: /Odtp
_TEXT SEGMENT
_c$ = -16 ; size = 4
_d$ = -12 ; size = 4
_b$ = -8 ; size = 4
_a$ = -4 ; size = 4
_main PROC
; File c:\codesnips\comparesrc\debug\comparesrc.cpp
; Line 3
push ebp
mov ebp, esp
sub esp, 16 ; 00000010H
; Line 4
push OFFSET $SG3850
call _printf
add esp, 4
; Line 5
push OFFSET $SG3851
call _puts
add esp, 4
; Line 6
push OFFSET $SG3852
call _puts
add esp, 4
; Line 7
push OFFSET $SG3853
call _puts
add esp, 4
; Line 9
mov DWORD PTR _a$[ebp], 2
mov DWORD PTR _b$[ebp], 3
mov DWORD PTR _c$[ebp], 4
; Line 10
mov eax, DWORD PTR _a$[ebp]
add eax, DWORD PTR _b$[ebp]
sub eax, DWORD PTR _c$[ebp]
mov DWORD PTR _d$[ebp], eax
; Line 11
mov ecx, DWORD PTR _d$[ebp]
push ecx
push OFFSET $SG3858
call _printf
add esp, 8
; Line 12
mov eax, DWORD PTR _a$[ebp]
imul eax, DWORD PTR _b$[ebp]
cdq
idiv DWORD PTR _c$[ebp]
mov DWORD PTR _d$[ebp], eax
; Line 13
mov edx, DWORD PTR _d$[ebp]
push edx
push OFFSET $SG3859
call _printf
add esp, 8
; Line 14
mov eax, DWORD PTR _a$[ebp]
imul eax, DWORD PTR _b$[ebp]
cdq
idiv DWORD PTR _c$[ebp]
mov DWORD PTR _d$[ebp], edx
; Line 15
mov eax, DWORD PTR _d$[ebp]
push eax
push OFFSET $SG3860
call _printf
add esp, 8
; Line 16
xor eax, eax
; Line 17
mov esp, ebp
pop ebp
ret 0
_main ENDP
_TEXT ENDS
END
and finally disassembly of the complete main function using cdb (console version of windbg)
cdb -c ".lines;g main;uf #eip;q;" comparesrc.exe
Microsoft (R) Windows Debugger Version 6.12.0002.633 X86
CommandLine: comparesrc.exe
0:000> cdb: Reading initial command '.lines;g main;uf #eip;q;'
Line number information will be loaded
comparesrc!main [c:\codesnips\comparesrc\debug\comparesrc.cpp # 3]:
3 00401010 55 push ebp
3 00401011 8bec mov ebp,esp
3 00401013 83ec10 sub esp,10h
4 00401016 685c8c4100 push offset comparesrc!__xt_z+0x120 (00418c5c)
4 0040101b e81b020000 call comparesrc!printf (0040123b)
4 00401020 83c404 add esp,4
5 00401023 687c8c4100 push offset comparesrc!__xt_z+0x140 (00418c7c)
5 00401028 e8bf000000 call comparesrc!puts (004010ec)
5 0040102d 83c404 add esp,4
6 00401030 68808c4100 push offset comparesrc!__xt_z+0x144 (00418c80)
6 00401035 e8b2000000 call comparesrc!puts (004010ec)
6 0040103a 83c404 add esp,4
7 0040103d 68848c4100 push offset comparesrc!__xt_z+0x148 (00418c84)
7 00401042 e8a5000000 call comparesrc!puts (004010ec)
7 00401047 83c404 add esp,4
9 0040104a c745fc02000000 mov dword ptr [ebp-4],2
9 00401051 c745f803000000 mov dword ptr [ebp-8],3
9 00401058 c745f004000000 mov dword ptr [ebp-10h],4
10 0040105f 8b45fc mov eax,dword ptr [ebp-4]
10 00401062 0345f8 add eax,dword ptr [ebp-8]
10 00401065 2b45f0 sub eax,dword ptr [ebp-10h]
10 00401068 8945f4 mov dword ptr [ebp-0Ch],eax
11 0040106b 8b4df4 mov ecx,dword ptr [ebp-0Ch]
11 0040106e 51 push ecx
11 0040106f 688c8c4100 push offset comparesrc!__xt_z+0x150 (00418c8c)
11 00401074 e8c2010000 call comparesrc!printf (0040123b)
11 00401079 83c408 add esp,8
12 0040107c 8b45fc mov eax,dword ptr [ebp-4]
12 0040107f 0faf45f8 imul eax,dword ptr [ebp-8]
12 00401083 99 cdq
12 00401084 f77df0 idiv eax,dword ptr [ebp-10h]
12 00401087 8945f4 mov dword ptr [ebp-0Ch],eax
13 0040108a 8b55f4 mov edx,dword ptr [ebp-0Ch]
13 0040108d 52 push edx
13 0040108e 68908c4100 push offset comparesrc!__xt_z+0x154 (00418c90)
13 00401093 e8a3010000 call comparesrc!printf (0040123b)
13 00401098 83c408 add esp,8
14 0040109b 8b45fc mov eax,dword ptr [ebp-4]
14 0040109e 0faf45f8 imul eax,dword ptr [ebp-8]
14 004010a2 99 cdq
14 004010a3 f77df0 idiv eax,dword ptr [ebp-10h]
14 004010a6 8955f4 mov dword ptr [ebp-0Ch],edx
15 004010a9 8b45f4 mov eax,dword ptr [ebp-0Ch]
15 004010ac 50 push eax
15 004010ad 68948c4100 push offset comparesrc!__xt_z+0x158 (00418c94)
15 004010b2 e884010000 call comparesrc!printf (0040123b)
15 004010b7 83c408 add esp,8
16 004010ba 33c0 xor eax,eax
17 004010bc 8be5 mov esp,ebp
17 004010be 5d pop ebp
17 004010bf c3 ret
You can use
Windbg -z <any image>
to perform disassembly or any inspection of that image (works with cdb \ kd as well).
You can see source lines, symbols, types - without having to actually run the program.
This is useful for looking at DLLs, but really necessary when you want to look at code compiled for another architecture or a device driver where you can't run in on your machine.
For example
cdb -z ntoskrnl.exe
will let you inspect the code of the windows kernel.
This is more powerful than a crashdump because you don't just see the code that is paged in - you can see all the code that is in the .exe

Linking C with Assembly in Visual Studio

I'm trying to link main.c program with procedure.asm. I have the following C program and Assembly program.
main.c
#include <Windows.h>
#include <iostream>
using namespace std;
extern "C" {
void ClearUsingIndex(int[], int);
}
static int Array[10] ={1, 2, 3, 4, 5, 6, 7, 8, 9, -1};
int main()
{
int size = 10;
ClearUsingIndex( Array, size);
system("pause");
return 0;
}
procedure.asm
; Listing generated by Microsoft (R) Optimizing Compiler Version 17.00.50727.1
TITLE c:\Users\GS\documents\visual studio 2012\Projects\ClearIndex\ClearIndex\procedure.cpp
.686P
.XMM
include listing.inc
.model flat
INCLUDELIB MSVCRTD
INCLUDELIB OLDNAMES
PUBLIC ?ClearUsingIndex##YAXQAHH#Z ; ClearUsingIndex
EXTRN __RTC_InitBase:PROC
EXTRN __RTC_Shutdown:PROC
; COMDAT rtc$TMZ
rtc$TMZ SEGMENT
__RTC_Shutdown.rtc$TMZ DD FLAT:__RTC_Shutdown
rtc$TMZ ENDS
; COMDAT rtc$IMZ
rtc$IMZ SEGMENT
__RTC_InitBase.rtc$IMZ DD FLAT:__RTC_InitBase
rtc$IMZ ENDS
; Function compile flags: /Odtp /RTCsu /ZI
; COMDAT ?ClearUsingIndex##YAXQAHH#Z
_TEXT SEGMENT
_i$ = -8 ; size = 4
_Array$ = 8 ; size = 4
_size$ = 12 ; size = 4
?ClearUsingIndex##YAXQAHH#Z PROC ; ClearUsingIndex, COMDAT
; File c:\users\gs\documents\visual studio 2012\projects\clearindex\clearindex\procedure.cpp
; Line 2
push ebp
mov ebp, esp
sub esp, 204 ; 000000ccH
push ebx
push esi
push edi
lea edi, DWORD PTR [ebp-204]
mov ecx, 51 ; 00000033H
mov eax, -858993460 ; ccccccccH
rep stosd
; Line 4
mov DWORD PTR _i$[ebp], 0
jmp SHORT $LN3#ClearUsing
$LN2#ClearUsing:
mov eax, DWORD PTR _i$[ebp]
add eax, 1
mov DWORD PTR _i$[ebp], eax
$LN3#ClearUsing:
mov eax, DWORD PTR _i$[ebp]
cmp eax, DWORD PTR _size$[ebp]
jge SHORT $LN4#ClearUsing
; Line 5
mov eax, DWORD PTR _i$[ebp]
mov ecx, DWORD PTR _Array$[ebp]
mov DWORD PTR [ecx+eax*4], 0
jmp SHORT $LN2#ClearUsing
$LN4#ClearUsing:
; Line 6
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
ret 0
?ClearUsingIndex##YAXQAHH#Z ENDP ; ClearUsingIndex
_TEXT ENDS
END
I did some research and I found the following source and followed the instructions: Guide to Using Assembly in Visual Studio .NET
But I get the following error when I compile the assembly file.
error A1017: missing source filename
I google the error but all the methods I tried don't seem to work.

How do I best use the const keyword in C?

I am trying to get a sense of how I should use const in C code. First I didn't really bother using it, but then I saw a quite a few examples of const being used throughout. Should I make an effort and go back and religiously make suitable variables const? Or will I just be waisting my time?
I suppose it makes it easier to read which variables that are expected to change, especially in function calls, both for humans and the compiler. Am I missing any other important points?
const is typed, #define macros are not.
const is scoped by C block, #define applies to a file (or more strictly, a compilation unit).
const is most useful with parameter passing. If you see const used on a prototype with pointers, you know it is safe to pass your array or struct because the function will not alter it. No const and it can.
Look at the definition for such as strcpy() and you will see what I mean. Apply "const-ness" to function prototypes at the outset. Retro-fitting const is not so much difficult as "a lot of work" (but OK if you get paid by the hour).
Also consider:
const char *s = "Hello World";
char *s = "Hello World";
which is correct, and why?
How do I best use the const keyword in C?
Use const when you want to make it "read-only". It's that simple :)
Using const is not only a good practice but improves the readability and comprehensibility of the code as well as helps prevent some common errors. Definitely do use const where appropriate.
Apart from producing a compiler error when attempting to modify the constant and passing the constant as a non-const parameter, therefore acting as a compiler guard, it also enables the compiler to perform certain optimisations knowing that the value will not change and therefore it can cache the value and not have to read it fresh from memory, because it won't have changed, and it allows it to be immediately substituted in the code.
C const
const and register are basically the opposite of volatile and using volatile will override the const optimisations at file and block scope and the register optimisations at block-scope. const register and register will produce identical outputs because const does nothing on C at block-scope on gcc C -O0, and is redundant on -O1 and onwards, so only the register optimisations apply at -O0, and are redundant from -O1 onwards.
#include<stdio.h>
int main() {
const int i = 1;
printf("%d", i);
}
.LC0:
.string "%d"
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 1
mov eax, DWORD PTR [rbp-4] //load from stack isn't eliminated for block-scope consts on gcc C unlike on gcc C++ and clang C, even though value will be the same
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
mov eax, 0
leave
ret
In this instance, with -O0, const, volatile and auto all produce the same code, with only register differing c.f.
#include<stdio.h>
const int i = 1;
int main() {
printf("%d", i);
}
i:
.long 1
.LC0:
.string "%d"
main:
push rbp
mov rbp, rsp
mov eax, DWORD PTR i[rip] //load from memory
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
mov eax, 0
pop rbp
ret
with const int i = 1; instead:
i:
.long 1
.LC0:
.string "%d"
main:
push rbp
mov rbp, rsp
mov eax, 1 //saves load from memory, now immediate
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
mov eax, 0
pop rbp
ret
C++ const
#include <iostream>
int main() {
int i = 1;
std::cout << i;
}
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 1 //stores on stack
mov eax, DWORD PTR [rbp-4] //loads the value stored on the stack
mov esi, eax
mov edi, OFFSET FLAT:_ZSt4cout
call std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
mov eax, 0
leave
ret
#include <iostream>
int main() {
const int i = 1;
std::cout << i;
}
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 1 //stores it on the stack
mov esi, 1 //but saves a load from memory here, unlike on C
//'register' would skip this store on the stack altogether
mov edi, OFFSET FLAT:_ZSt4cout
call std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
mov eax, 0
leave
ret
#include <iostream>
int i = 1;
int main() {
std::cout << i;
}
i:
.long 1
main:
push rbp
mov rbp, rsp
mov eax, DWORD PTR i[rip] //load from memory
mov esi, eax
mov edi, OFFSET FLAT:_ZSt4cout
call std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
mov eax, 0
pop rbp
ret
#include <iostream>
const int i = 1;
int main() {
std::cout << i;
}
main:
push rbp
mov rbp, rsp
mov esi, 1 //eliminated load from memory, now immediate
mov edi, OFFSET FLAT:_ZSt4cout
call std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
mov eax, 0
pop rbp
ret
C++ has the extra restriction of producing a compiler error if a const is not initialised (both at file-scope and block-scope). const also has internal linkage as a default on C++. volatile still overrides const and register but const register combines both optimisations on C++.
Even though all the above code is compiled using the default implicit -O0, when compiled with -Ofast, const surprisingly still isn't redundant on C or C++ on clang or gcc for file-scoped consts. The load from memory isn't optimised out unless const is used, even if the file-scope variable isn't modified in the code. https://godbolt.org/z/PhDdxk.

Resources