I just want to be sure that this C code:
while(flag==true)
{
}
foo();
does the same as this:
while(flag==true);
foo();
; alone is a null statement in C.
In your case, {} or ; are syntactically needed, but they do the same: nothing
Related: Use of null statement in C
In addition to the other answers: It's the same thing.
But I prefer this:
while (condition)
{
}
foo();
over this:
while (condition);
foo();
because if you forget the semicolon after the while, your code will compile fine but it won't do what you expect:
while(condition) // ; forgotten here
foo();
is actually equivalent to:
while(condition)
{
foo();
}
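A minimal sketch of that pitfall, assuming a counter increment stands in for foo() (the function names here are made up for illustration):

```c
/* Semicolon forgotten: the statement after the while becomes the loop body. */
int calls_without_semicolon(int n)
{
    int calls = 0;
    while (n-- > 0)   /* no ';' here */
        calls++;      /* stands in for foo(); runs once per iteration */
    return calls;
}

/* Empty block: the loop does nothing, and the next statement runs once. */
int calls_with_empty_block(int n)
{
    int calls = 0;
    while (n-- > 0)
    {
    }
    calls++;          /* stands in for foo(); runs once, after the loop */
    return calls;
}
```

With n = 3 the first version runs the "foo" statement three times and the second runs it exactly once, which is the behavioral difference a forgotten semicolon would introduce.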
Yes, a loop with an empty body is equivalent to just while(<some condition>);
Yes. A ; following a control structure (e.g., while, for, etc.) that can be followed with a block is treated as if it was followed by an empty block.
Yes. A semicolon placed directly after the while condition denotes an empty loop body; once the condition becomes false, execution continues with the statement immediately after the loop.
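As a small illustration (the helper names are hypothetical), here are two string-length routines whose loops do all their work in the loop header, one written with a null statement and one with an empty block:

```c
#include <stddef.h>

/* while loop whose body is a null statement */
size_t length_with_semicolon(const char *s)
{
    const char *p = s;
    while (*p++ != '\0')
        ;                        /* null statement: nothing left to do */
    return (size_t)(p - s) - 1;  /* p ended one past the terminator */
}

/* for loop whose body is an empty block */
size_t length_with_block(const char *s)
{
    size_t n;
    for (n = 0; s[n] != '\0'; n++) {
    }
    return n;
}
```

Both forms give the same result for the same input; which one to use is purely a matter of style.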
Yes, they are the same.
You can generate the assembly for the code and see for yourself that both versions produce the same output (using gcc filename.c -S -masm=intel -o outputfilename):
#include<stdio.h>
int foo(void);
int main(){
int flag;
scanf("%d" , &flag);
while(flag==1);
foo();
}
int foo(void){
int x = 2;
return x*x;
}
.LC0:
.ascii "%d\0"
.text
.globl main
.def main; .scl 2; .type 32; .endef
.seh_proc main
main:
push rbp
.seh_pushreg rbp
mov rbp, rsp
.seh_setframe rbp, 0
sub rsp, 48
.seh_stackalloc 48
.seh_endprologue
call __main
lea rax, -4[rbp]
mov rdx, rax
lea rcx, .LC0[rip]
call scanf
nop
.L2:
mov eax, DWORD PTR -4[rbp]
cmp eax, 1
je .L2
call foo
mov eax, 0
add rsp, 48
pop rbp
ret
.seh_endproc
.globl foo
.def foo; .scl 2; .type 32; .endef
.seh_proc foo
foo:
push rbp
.seh_pushreg rbp
mov rbp, rsp
.seh_setframe rbp, 0
sub rsp, 16
.seh_stackalloc 16
.seh_endprologue
mov DWORD PTR -4[rbp], 2
mov eax, DWORD PTR -4[rbp]
imul eax, DWORD PTR -4[rbp]
add rsp, 16
pop rbp
ret
.seh_endproc
.ident "GCC: (x86_64-posix-seh-rev1, Built by MinGW-W64 project) 6.3.0"
.def scanf; .scl 2; .type 32; .endef
And when I changed while(flag==1); to while(flag==1){}, the generated assembly code is:
.LC0:
.ascii "%d\0"
.text
.globl main
.def main; .scl 2; .type 32; .endef
.seh_proc main
main:
push rbp
.seh_pushreg rbp
mov rbp, rsp
.seh_setframe rbp, 0
sub rsp, 48
.seh_stackalloc 48
.seh_endprologue
call __main
lea rax, -4[rbp]
mov rdx, rax
lea rcx, .LC0[rip]
call scanf
nop
.L2:
mov eax, DWORD PTR -4[rbp]
cmp eax, 1
je .L2
call foo
mov eax, 0
add rsp, 48
pop rbp
ret
.seh_endproc
.globl foo
.def foo; .scl 2; .type 32; .endef
.seh_proc foo
foo:
push rbp
.seh_pushreg rbp
mov rbp, rsp
.seh_setframe rbp, 0
sub rsp, 16
.seh_stackalloc 16
.seh_endprologue
mov DWORD PTR -4[rbp], 2
mov eax, DWORD PTR -4[rbp]
imul eax, DWORD PTR -4[rbp]
add rsp, 16
pop rbp
ret
.seh_endproc
.ident "GCC: (x86_64-posix-seh-rev1, Built by MinGW-W64 project) 6.3.0"
.def scanf; .scl 2; .type 32; .endef
You can see that the relevant portion is the same in both cases.
// The portion below is the same in both cases.
.L2:
mov eax, DWORD PTR -4[rbp]
cmp eax, 1
je .L2
call foo
mov eax, 0
add rsp, 48
pop rbp
ret
.seh_endproc
.globl foo
.def foo; .scl 2; .type 32; .endef
.seh_proc foo
Please take a look at the following code snippet:
#define HF_ND_SZ sizeof(struct huffman_node)
#define TSIZE_MAX 256
struct huffman_node * build_decomp_huffman_tree(uint64_t *table, int size) {
static struct huffman_node huffman_node_list2[TSIZE_MAX * 3];
int i = 0, j = 0;
int k = TSIZE_MAX * 2; // this is the case point 1
//...//
for (i = 0; i < size - 1; i++) {
huffman_node_list2[k + i] = huffman_node_list2[i + 1]; // point 2
huffman_node_list2[TSIZE_MAX + i].right = &huffman_node_list2[k+ i];
// ... //
}
return &huffman_node_list2[size - 1];
}
For simplicity I reduced the code and marked the locations I want to highlight; please do not think about the algorithm and structure too deeply.
What I want to know is this: if we define point 1 as const int k = TSIZE_MAX * 2;, does any optimization happen at point 2 or 3, where the assignment to contiguous data (an array) takes place: huffman_node_list2[k + i] = huffman_node_list2[i + 1]; ?
(Please bear with me and correct my assumption if it is wrong. I thought that when we declare a const in local or global scope, it is created as an immutable memory allocation. If we use that immutable memory in a math operation, as at point 2 or 3 ([k + i]), inside a loop, then at runtime the program has to load the immutable memory on every iteration and store the result in a temporary memory location. What happens if that immutable memory is a large chunk? I hope you can grasp my idea. Am I correct?)
const can be slower if the compiler puts it in the read only .text section far enough away that causes a cache miss.
This can happen with global consts, or when the compiler hoists the constant out of a function rather than building it with instructions (a fairly common optimization for structs or arrays). This can reduce code size if multiple functions use the same constant, but it also increases the distance from the code and thus the likelihood of a cache miss.
Since you aren't using any aggregate types, there should be no difference with a decent optimizing compiler.
There is a good article on how different data gets laid out here
Using Visual C, I compiled both versions of your code: with const int k and without const. The /FA flag produces the machine code in an .asm file readable by (some) humans. No optimization flags were used.
The result: there is no optimization and no difference. The machine code produced is strictly the same:
; Listing generated by Microsoft (R) Optimizing Compiler Version 19.00.24231.0
TITLE opt_const.c
.686P
.XMM
include listing.inc
.model flat
INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES
PUBLIC _main
_BSS SEGMENT
?huffman_node_list2@?1??main@@9@9 DB 01fd4H DUP (?) ; `main'::`2'::huffman_node_list2
_BSS ENDS
; Function compile flags: /Odtp
; File c:\joël\tests\opt_const.c
_TEXT SEGMENT
_j$ = -16 ; size = 4
_size$ = -12 ; size = 4
_k$ = -8 ; size = 4
_i$ = -4 ; size = 4
_argc$ = 8 ; size = 4
_argv$ = 12 ; size = 4
_main PROC
; 10 : {
push ebp
mov ebp, esp
sub esp, 16 ; 00000010H
push esi
push edi
; 11 : static struct huffman_node huffman_node_list2[TSIZE_MAX * 3];
; 12 : int i = 0, j = 0, size = 17;
mov DWORD PTR _i$[ebp], 0
mov DWORD PTR _j$[ebp], 0
mov DWORD PTR _size$[ebp], 17 ; 00000011H
; 13 : int k = TSIZE_MAX * 2; // this is the case point 1
mov DWORD PTR _k$[ebp], 194 ; 000000c2H
; 14 : //...//
; 15 : for (i = 0; i < size - 1; i++) {
mov DWORD PTR _i$[ebp], 0
jmp SHORT $LN4@main
$LN2@main:
mov eax, DWORD PTR _i$[ebp]
add eax, 1
mov DWORD PTR _i$[ebp], eax
$LN4@main:
mov ecx, DWORD PTR _size$[ebp]
sub ecx, 1
cmp DWORD PTR _i$[ebp], ecx
jge SHORT $LN3@main
; 16 : huffman_node_list2[k + i] = huffman_node_list2[i + 1]; // point 2
mov edx, DWORD PTR _i$[ebp]
add edx, 1
imul esi, edx, 28
add esi, OFFSET ?huffman_node_list2@?1??main@@9@9
mov eax, DWORD PTR _k$[ebp]
add eax, DWORD PTR _i$[ebp]
imul edi, eax, 28
add edi, OFFSET ?huffman_node_list2@?1??main@@9@9
mov ecx, 7
rep movsd
; 17 : huffman_node_list2[TSIZE_MAX + i].right = &huffman_node_list2[k+ i];
mov ecx, DWORD PTR _k$[ebp]
add ecx, DWORD PTR _i$[ebp]
imul edx, ecx, 28
add edx, OFFSET ?huffman_node_list2@?1??main@@9@9
mov eax, DWORD PTR _i$[ebp]
add eax, 97 ; 00000061H
imul ecx, eax, 28
mov DWORD PTR ?huffman_node_list2@?1??main@@9@9[ecx], edx
; 18 : // ... //
; 19 : }
jmp SHORT $LN2@main
$LN3@main:
; 20 : return 0;
xor eax, eax
; 21 : }
pop edi
pop esi
mov esp, ebp
pop ebp
ret 0
_main ENDP
_TEXT ENDS
END
EDIT: I did the same test with gcc and the -O3 optimization flag.
And... same result: the generated assembly code is again strictly the same with and without the const keyword.
.file "opt_const.c"
.section .text.unlikely,"ax",@progbits
.LCOLDB0:
.section .text.startup,"ax",@progbits
.LHOTB0:
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB23:
.cfi_startproc
movl $huffman_node_list2.2488+16384, %eax
.p2align 4,,10
.p2align 3
.L2:
movq -16352(%rax), %rdx
movq %rax, -8192(%rax)
addq $32, %rax
movq %rdx, -32(%rax)
movq -16376(%rax), %rdx
movq %rdx, -24(%rax)
movq -16368(%rax), %rdx
movq %rdx, -16(%rax)
movq -16360(%rax), %rdx
movq %rdx, -8(%rax)
cmpq $huffman_node_list2.2488+17088, %rax
jne .L2
xorl %eax, %eax
ret
.cfi_endproc
.LFE23:
.size main, .-main
.section .text.unlikely
.LCOLDE0:
.section .text.startup
.LHOTE0:
.local huffman_node_list2.2488
.comm huffman_node_list2.2488,24576,32
.ident "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609"
.section .note.GNU-stack,"",@progbits
const doesn't necessarily create a memory location at all, unless you take its address. They can just disappear into the instruction stream as immediate-mode constants, or be added into addresses at compile or link time.
For example, huffman_node_list2[k + i] = huffman_node_list2[i + 1] is almost certainly compiled as huffman_node_list2[TSIZE_MAX * 2 + i] = huffman_node_list2[i + 1], where not only is TSIZE_MAX * 2 evaluated at compile time but huffman_node_list2+TSIZE_MAX*2 is evaluated at link time.
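To make the comparison concrete, here is a hedged sketch of the two variants side by side. The struct node type and the helper names are hypothetical stand-ins (the question does not show struct huffman_node), and the array is passed in as a parameter rather than being a function-local static:

```c
#define TSIZE_MAX 256

/* Hypothetical stand-in for the question's struct huffman_node. */
struct node { int value; struct node *right; };

/* Loop from the question with a plain local k. */
void shift_plain(struct node *list, int size)
{
    int k = TSIZE_MAX * 2;
    for (int i = 0; i < size - 1; i++) {
        list[k + i] = list[i + 1];
        list[TSIZE_MAX + i].right = &list[k + i];
    }
}

/* Identical loop, with k declared const. */
void shift_const(struct node *list, int size)
{
    const int k = TSIZE_MAX * 2;
    for (int i = 0; i < size - 1; i++) {
        list[k + i] = list[i + 1];
        list[TSIZE_MAX + i].right = &list[k + i];
    }
}

/* Runs both variants on separate arrays; returns 1 if the results agree.
   size must not exceed TSIZE_MAX. */
int same_result(int size)
{
    static struct node a[TSIZE_MAX * 3], b[TSIZE_MAX * 3];
    for (int i = 0; i < size; i++)
        a[i].value = b[i].value = i * 10;
    shift_plain(a, size);
    shift_const(b, size);
    for (int i = 0; i < size - 1; i++) {
        int k = TSIZE_MAX * 2 + i;
        if (a[k].value != (i + 1) * 10 || b[k].value != a[k].value) return 0;
        if (a[TSIZE_MAX + i].right != &a[k]) return 0;
        if (b[TSIZE_MAX + i].right != &b[k]) return 0;
    }
    return 1;
}
```

This only checks that the observable behavior matches; to compare the generated code itself, compile both functions with gcc -S and diff the two function bodies.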
I am just beginning to learn Intel's x86 assembly and compiled this simple "hello world" C program (without the cfi additions for simplicity):
#include <stdio.h>
int main(int argc, char* argv[]) {
printf("hello world!");
return 0;
}
The following x86 code came out:
.file "helloworld.c"
.intel_syntax noprefix
.section .rodata
.LC0:
.string "hello world!"
.text
.globl main
.type main, @function
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR -4[rbp], edi
mov QWORD PTR -16[rbp], rsi
lea rdi, .LC0[rip]
mov eax, 0
call printf@PLT
mov eax, 0
leave
ret
.size main, .-main
.ident "GCC: (Debian 7.2.0-19) 7.2.0"
.section .note.GNU-stack,"",@progbits
The question: Why are those 16 bytes for local variables reserved on the stack but aren't used in any way? The program even does the same, without those lines, so for which reason were they created?
I was studying one of my courses when I ran into a specific exercise that I cannot seem to resolve... It is pretty basic because I am VERY new to assembly. So let's begin.
I have a C function
unsigned int func(int *ptr, unsigned int j) {
unsigned int res = j;
int i = ptr[j+1];
for(; i<8; ++i) {
res >>= 1;
}
return res;
}
I translated it with gcc to assembly
.file "func.c"
.intel_syntax noprefix
.text
.globl func
.type func, @function
func:
.LFB0:
.cfi_startproc
push rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
mov rbp, rsp
.cfi_def_cfa_register 6
mov QWORD PTR [rbp-24], rdi
mov DWORD PTR [rbp-28], esi
mov eax, DWORD PTR [rbp-28]
mov DWORD PTR [rbp-8], eax
mov eax, DWORD PTR [rbp-28]
add eax, 1
mov eax, eax
lea rdx, [0+rax*4]
mov rax, QWORD PTR [rbp-24]
add rax, rdx
mov eax, DWORD PTR [rax]
mov DWORD PTR [rbp-4], eax
jmp .L2
.L3:
shr DWORD PTR [rbp-8]
add DWORD PTR [rbp-4], 1
.L2:
cmp DWORD PTR [rbp-4], 7
jle .L3
mov eax, DWORD PTR [rbp-8]
pop rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size func, .-func
.ident "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4"
.section .note.GNU-stack,"",@progbits
The question is as follows: which instruction places j (the variable in the C function) on top of the stack?
I sincerely cannot figure it out; please enlighten me XD.
The variable j is the second parameter of func; under the x86-64 System V ABI calling convention it is passed in the register esi. The instruction mov DWORD PTR [rbp-28], esi stores j on the stack.
You can see it very clearly by writing a simple function that calls "func" and compiling it with -O0 (or with -O2 and marking it as noinline, or only providing a prototype so there's nothing for the compiler to inline).
unsigned int func(int *ptr, unsigned int j) {
unsigned int res = j;
int i = ptr[j+1];
for(; i<8; ++i) {
res >>= 1;
}
return res;
}
int main()
{
int a = 1;
int array[10];
func (array, a);
return 0;
}
Using the Godbolt compiler explorer, we can easily get gcc -O0 -fverbose-asm assembly output.
Please focus on the following instructions:
# in main:
...
mov DWORD PTR [rbp-4], 1
mov edx, DWORD PTR [rbp-4]
...
mov esi, edx
...
func(int*, unsigned int):
...
mov DWORD PTR [rbp-28], esi # j, j
...
j, j is a comment added by gcc -fverbose-asm to tell you that the source and destination operands of that instruction are both the C variable j.
The full assembly instructions:
func(int*, unsigned int):
push rbp
mov rbp, rsp
mov QWORD PTR [rbp-24], rdi
mov DWORD PTR [rbp-28], esi
mov eax, DWORD PTR [rbp-28]
mov DWORD PTR [rbp-4], eax
mov eax, DWORD PTR [rbp-28]
add eax, 1
mov eax, eax
lea rdx, [0+rax*4]
mov rax, QWORD PTR [rbp-24]
add rax, rdx
mov eax, DWORD PTR [rax]
mov DWORD PTR [rbp-8], eax
jmp .L2
.L3:
shr DWORD PTR [rbp-4]
add DWORD PTR [rbp-8], 1
.L2:
cmp DWORD PTR [rbp-8], 7
jle .L3
mov eax, DWORD PTR [rbp-4]
pop rbp
ret
main:
push rbp
mov rbp, rsp
sub rsp, 48
mov DWORD PTR [rbp-4], 1
mov edx, DWORD PTR [rbp-4]
lea rax, [rbp-48]
mov esi, edx
mov rdi, rax
call func(int*, unsigned int)
mov eax, 0
leave
ret
Taking into account these instructions
mov eax, DWORD PTR [rbp-28]
add eax, 1
it seems that j is stored at address rbp-28, while ptr is stored at address rbp-24.
These are the instructions that store the values on the stack:
mov QWORD PTR [rbp-24], rdi
mov DWORD PTR [rbp-28], esi
It seems the arguments are passed to the function using registers rdi and esi.
Compilers can use registers instead of the stack to pass small arguments to functions (on x86-64 this is what the calling convention prescribes). Within a function they can use the stack to temporarily store arguments that were passed in registers.
Just a suggestion for further explorations on your own. Use gcc -O0 -g2 f.c -Wa,-adhln. It will turn off optimizations and generate assembly code intermixed with the source. It might give you better ideas about what it does.
As an alternative, you can run objdump -Sd f.o on the output .o file or executable. Just make sure that you add debugging info and turn off optimizations at compilation.
I'm learning assembly by writing C programs and viewing the assembly output. I've included the C program at the bottom of the page to make it easier. I'm struggling to understand one line of assembly:
cdqe
movzx eax, BYTE PTR [rbp-32+rax] <--- what is this doing?
movsx eax, al
So I think cdqe extends eax into rax (64 bits). It's clear that each character of the string I want to print fits into the al register, but I don't understand what is happening deep down with rbp-32+rax. Can someone explain it to me?
.file "string_manip.c"
.intel_syntax noprefix
.section .rodata
.LC0:
.string "Hello"
.string ""
.zero 3
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
push rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
mov rbp, rsp
.cfi_def_cfa_register 6
sub rsp, 48
mov rax, QWORD PTR fs:40
mov QWORD PTR [rbp-8], rax
xor eax, eax
mov DWORD PTR [rbp-36], 0
mov eax, DWORD PTR .LC0[rip]
mov DWORD PTR [rbp-32], eax
movzx eax, WORD PTR .LC0[rip+4]
mov WORD PTR [rbp-28], ax
movzx eax, BYTE PTR .LC0[rip+6]
mov BYTE PTR [rbp-26], al
mov WORD PTR [rbp-25], 0
mov BYTE PTR [rbp-23], 0
mov DWORD PTR [rbp-36], 0
jmp .L2
.L3:
mov eax, DWORD PTR [rbp-36]
cdqe
movzx eax, BYTE PTR [rbp-32+rax] <--- what is this doing?
movsx eax, al
mov edi, eax
call putchar
add DWORD PTR [rbp-36], 1
.L2:
cmp DWORD PTR [rbp-36], 5
jle .L3
mov edi, 10
call putchar
mov eax, 0
mov rdx, QWORD PTR [rbp-8]
xor rdx, QWORD PTR fs:40
je .L5
call __stack_chk_fail
.L5:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4"
.section .note.GNU-stack,"",@progbits
#include <string.h>
#include <stdio.h>
int main()
{
int i = 0;
char array[10] = "Hello\0";
for(i=0; i<6; i++)
printf("%c", array[i]);
printf("\n");
return 0;
}
It's just calculating the address of one of the characters.
Presumably your string starts at rbp-32 and then the instruction does the C equivalent of ch = string[rax].
I guess this is unoptimized code, so the compiler performs a few extra sign and zero extensions that are not really needed.
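A rough C rendering of those two instructions, as a sketch only: char_at and its parameters are made-up names, base plays the role of rbp-32, and index plays the role of rax. It assumes char is signed, as it is with gcc on x86-64.

```c
/* Mirrors: movzx eax, BYTE PTR [base+index]  followed by  movsx eax, al */
int char_at(const char *base, long index)
{
    /* Load one byte from base + index and zero-extend it (movzx). */
    unsigned char byte = *(const unsigned char *)(base + index);

    /* Reinterpret the low byte as signed and widen it to int (movsx). */
    return (signed char)byte;
}
```

For the array in the question, char_at(array, i) yields the same value as array[i] promoted to int, which is what gets passed on to putchar.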
A friend helped me come up with the following code to use inline assembly in GCC on a 64-bit Windows machine:
int main() {
char* str = "Hello World";
int ret;
asm volatile(
"call puts"
: "=a" (ret), "+c" (str)
:
: "rdx", "rdi", "rsi", "r8", "r9", "r10", "r11");
return 0;
}
After compiling with -S -masm=intel (I prefer Intel syntax), I get this assembly code:
.file "hello.c"
.intel_syntax noprefix
.def __main; .scl 2; .type 32; .endef
.section .rdata,"dr"
.LC0:
.ascii "Hello World\0"
.text
.globl main
.def main; .scl 2; .type 32; .endef
.seh_proc main
main:
push rbp
.seh_pushreg rbp
push rdi
.seh_pushreg rdi
push rsi
.seh_pushreg rsi
mov rbp, rsp
.seh_setframe rbp, 0
sub rsp, 48
.seh_stackalloc 48
.seh_endprologue
call __main
lea rax, .LC0[rip]
mov QWORD PTR -8[rbp], rax
mov rax, QWORD PTR -8[rbp]
mov rcx, rax
/APP
# 7 "hello.c" 1
call puts
# 0 "" 2
/NO_APP
mov DWORD PTR -12[rbp], eax
mov QWORD PTR -8[rbp], rcx
mov eax, 0
add rsp, 48
pop rsi
pop rdi
pop rbp
ret
.seh_endproc
.ident "GCC: (x86_64-posix-seh-rev1, Built by MinGW-W64 project) 4.9.2"
It works, but it sure looks messy with what appears to be superfluous code. Then again, my last experience with assembly was with the 65816 back in the 80s, and it wasn't inline. Anyway, I cleaned up the code, and the following accomplishes the exact same thing, as far as I can tell:
.intel_syntax noprefix
.data:
.ascii "Hello World\0"
.text
.globl main
main:
sub rsp, 48
lea rax, .data[rip]
mov rcx, rax
call puts
mov eax, 0
add rsp, 48
ret
Much simpler. What's all that extra stuff GCC added?
Edit: Not a duplicate because in addition to the structured exception handling, I'm also asking about the callee-saved registers, the call to __main, the explicit size directives, and the APP/NO_APP section.