What is the difference between these two segments of code:
1
scanf("%d%d", p1, p2);
2
scanf("%d", p1);
scanf("%d", p2);
Since you're not checking the return values, there is no difference in the behavior. If you were checking the return values, the second option potentially gives you a little more detail (two return values to check) for a little more work. If the input is a single number followed by an EOF, the second will return 1,EOF. If the input is a single number followed by a non-number, it will return 1,0. The first option will return 1 in either of the above cases, so you can't tell the difference without another call (if you care).
I am posting this because it may help you use scanf in a different way.
Suppose the input looks like the following, where the values have to be read into different variables and arrays:
8 5
2 3 1 2 3 2 3 3
0 2
0 1
6 7
3 5
0 7
How should scanf be used here?
If you want to read space-separated input from the console, for example 1 2 3 4 5 5 6,
then you can use scanf() as below:
for(i=0; i < no_you_want;i++)
{
// the single space before %d skips leading whitespace
// (%d already skips it on its own, so the space is harmless but optional)
scanf(" %d",&a[i]);
}
First, if p1 and p2 are plain int variables rather than pointers, you need to pass their addresses with '&' ;-) As Chris mentioned, the two versions behave the same unless you check the return value. I compared the assembly output; as expected, the two-call version has a few extra instructions, as shown below.
scanf("%d%d", &p1, &p2);
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movl $.LC0, %eax
leaq -4(%rbp), %rdx
leaq -8(%rbp), %rcx
movq %rcx, %rsi
movq %rax, %rdi
movl $0, %eax
call __isoc99_scanf
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
scanf("%d", &p1);
scanf("%d", &p2);
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movl $.LC0, %eax
leaq -8(%rbp), %rdx
movq %rdx, %rsi
movq %rax, %rdi
movl $0, %eax
call __isoc99_scanf
movl $.LC0, %eax
leaq -4(%rbp), %rdx
movq %rdx, %rsi
movq %rax, %rdi
movl $0, %eax
call __isoc99_scanf
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
An expected difference between scanf("%d%d", p1, p2); and scanf("%d", p1); scanf("%d", p2); occurs when the input text is "-+2\n".
With scanf("%d%d", p1, p2);, scanf() reads "-+" and, seeing that it is not a valid int sequence, ungets the + for subsequent IO. *p1 and *p2 remain unchanged.
With scanf("%d", p1);, scanf() reads "-+" and, seeing that it is not a valid int sequence (*p1 remains unchanged), ungets the + for subsequent IO, which is scanf("%d", p2);. That call scans "+2\n", saves 2 into *p2 and ungets the \n for subsequent IO.
C11dr §7.21.6.2 9 footnote about fscanf(), which applies to scanf(): "fscanf pushes back at most one input character onto the input stream. Therefore, some sequences that are acceptable to strtod, strtol, etc., are unacceptable to fscanf."
But I can not make it differ on my machine.
My cygwin Kepler gcc unexpectedly pushes back both "-+" causing both codes to operate the same. Your mileage may vary. My gcc appears to not comply with the C spec.
The GCC compiler on CSLab translates the following C function:
int func(int x) {
return 13 + x;
}
Into the following Assembly code:
func:
pushq %rbp
movq %rsp, %rbp
movl %edi, -4(%rbp)
movl -4(%rbp), %eax
addl $13, %eax
popq %rbp
ret
I have completed this code and was then asked the following question:
In the Assembly code for func shown in the previous question, suppose %rsp has the value
0x7fffffffe3e0
What is the address corresponding to the parameter (local variable) x? Include the 0x prefix.
(Note that the address has 12 significant hex digits, or 6 bytes. The value for the top two hex digits is 0. Omit the 0s to the left just as shown above.)
I answered 0xd and it was incorrect.
Taking the given value for %rsp as 0x7fffffffe3e0 (and assuming that this is its value once the pushq %rbp has already executed), we have
movq %rsp, %rbp
which copies the value of %rsp to %rbp, then we have
movl -4(%rbp), %eax
addl $13, %eax
We copy something to %eax and add 13 to it, so that something must be x. That something is -4(%rbp), which translates to "an object 4 bytes below the address value stored in %rbp".
Thus, the address of x must be 0x7fffffffe3e0 - 4, or 0x7fffffffe3dc.
Read up on your x86 assembly addressing modes.
How does gcc decide how much memory to allocate for the stack, and why does it no longer decrement %rsp when I remove printf() (or any function call) from my main?
1. I noticed when I played around with a code sample: https://godbolt.org/z/fQqkNE that the 6th line in gcc assembly viewer subq $48, %rsp gets removed if I remove printf() from my C code on line 22. It looks like when I don’t make any function calls from within my main, then the %rsp does not get decremented, but data still gets allocated based on %rbp and offsets. I thought %rsp changes only when stack grows. My theory is that since it won’t make any other function calls, it knows that it won’t need to keep stack for other nonexistent functions. But shouldn’t %rsp still grow as data is getting saved?
2. When adding variables to my rect struct, I also noticed that it sometimes allocates memory in steps greater than what the added data type size was. What is the convention it follows when deciding how much memory to allocate to stack?
3. Is there an online tool that would take assembly code as input, and then draw an image of stack and tell me state of every register at any point of execution? Godbolt.org is a very good tool, I just wish it had these 2 extra features.
I'll paste the code below in case the link to godbolt stops working in the future:
#include <stdio.h>
#include <stdint.h>
struct rect {
int a;
int b;
int* c;
int d[2];
uint8_t f;
};
int main() {
int arr[2] = {2, 3};
struct rect Rect;
Rect.a = 10;
Rect.b = 20;
Rect.c = arr;
Rect.d[0] = Rect.a;
Rect.d[1] = Rect.b;
Rect.f =255;
printf("%d and %d", Rect.a, Rect.b);
return 0;
}
.LC0:
.string "%d and %d"
main:
pushq %rbp
movq %rsp, %rbp
subq $48, %rsp
movl $2, -8(%rbp)
movl $3, -4(%rbp)
movl $10, -48(%rbp)
movl $20, -44(%rbp)
leaq -8(%rbp), %rax
movq %rax, -40(%rbp)
movl -48(%rbp), %eax
movl %eax, -32(%rbp)
movl -44(%rbp), %eax
movl %eax, -28(%rbp)
movb $-1, -24(%rbp)
movl -44(%rbp), %edx
movl -48(%rbp), %eax
movl %eax, %esi
movl $.LC0, %edi
movl $0, %eax
call printf
movl $0, %eax
leave
ret
P.S.: The book I follow uses AT&T syntax for teaching x86. Which is weird because it makes finding online tutorials much harder.
I'm reading Computer Systems: A Programmer's Perspective 3rd edition and the assembly in 3.10.5 Supporting Variable-Size Stack Frames, Figure 3.43 confuses me.
The part of the book is trying to explain how a variable-size stack frame is generated and it gives a C code and its assembly version as an example.
Here is the C code and its assembly (Figure 3.43 of the book):
I don't know what lines 8-10 in the assembly are for. Why not just use movq %rsp, %r8 after line 7?
(a) C code
long vframe(long n, long idx, long *q) {
long i;
long *p[n];
p[0] = &i;
for (i = 1; i < n; i++)
p[i] = q;
return *p[idx];
}
(b) Portions of generated assembly code
vframe:
2: pushq %rbp
3: movq %rsp, %rbp
4: subq $16, %rsp
5: leaq 22(, %rdi, 8), %rax
6: andq $-16, %rax
7: subq %rax, %rsp
8: leaq 7(%rsp), %rax
9: shrq $3, %rax
10: leaq 0(, %rax, 8), %r8
11: movq %r8, %rcx
................................
12: L3:
13: movq %rdx, (%rcx, %rax, 8)
14: addq $1, %rax
15: movq %rax, -8(%rbp)
16: L2:
17: movq -8(%rbp), %rax
18: cmpq %rdi, %rax
19: jl L3
20: leave
21: ret
Here is what I think:
After line 7, %rsp should be a multiple of 16. Because of stack frame alignment, %rsp is a multiple of 16 before vframe is called. The call subtracts 8 from %rsp to hold the return address, the pushq instruction in line 2 subtracts another 8, and line 4 subtracts 16, so at the start of line 7 %rsp is a multiple of 16. Line 7 subtracts %rax from %rsp, and since line 6 makes %rax a multiple of 16, the result of line 7 is again a multiple of 16. That means the lower 4 bits of %rsp are all zeros.
Then in line 8, %rsp+7 is stored in %rax, and in line 9 %rax is shifted right logically by 3 bits, and in line 10, %rax*8 is stored in %r8.
After line 7, the lower 4 bits of %rsp are all zeros. In line 8 %rsp+7 just makes the lower 3 bits all ones, and line 9 truncates these 3 ones, and in line 10 %rax*8 makes the result shift left by 3 bits. So the final result should just be the original %rsp (the result of line 7).
So I wonder whether lines 8-10 are useless.
Why not just use movq %rsp, %r8 after line 7 and remove the original lines 8-10?
I thought that a useful exploratory program would be to reduce your generated code to:
.globl _vframe
_vframe:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
leaq 22(, %rdi, 8), %rax
andq $-16, %rax
subq %rax, %rsp
leaq 7(%rsp), %rax
shrq $3, %rax
leaq 0(, %rax, 8), %r8
mov %r8, %rax
sub %rsp, %rax
leave
ret
Note that I just eliminated the code that did anything useful, and returned the difference between %r8 and %rsp.
Then wrote a driver:
extern void *vframe(unsigned long n);
#include <stdio.h>
int main(void) {
int i;
for (i = 0; i < (1<<18); i++) {
void *p = vframe(i);
if (p) {
printf("%d %p\n", i, p);
}
}
return 0;
}
to check it out. They were always the same. So, why? It may be that this is standard code emission when the compiler is confronted with a given construct (a variable-length array). The compiler has to maintain certain standards, such as traceable call frames and alignment, so it might just emit this code as the known solution to that. Variable-length arrays are generally considered a mistake in the language; a tribute to C++, adding a half-working, half-thought-out mechanism to C; so compiler implementors might not give too much attention to the code generated on their behalf.
So I do have some assembly code which I wrote on my linux VM (Manjaro, x86_64). It looks like this:
.section .rodata
.LC0:
.string "The value of a is: %d, of b: %d"
.text
.globl main
.type main, @function
main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl $15, -4(%rbp)
movl $20, -8(%rbp)
movl -8(%rbp), %edx
movl -4(%rbp), %eax
movl %eax, %esi
movl $.LC0, %edi
movl $0, %eax
call printf
movl $0, %eax
leave
ret
Basically I want to put 2 values in registers, then somehow print them (formatted as in .LC0). Well, I got stuck, so I just wrote a C program and used gcc -S to see how it looks. It gave me something similar to the code above. I don't understand two things:
If I store 20 in %edx and 15 in %eax, then why does passing only %eax to %esi cause printf to print the values from both %eax and %edx?
Why do I have to put a zero constant every time before and after printf (as gcc does)?
Why do I have to put a zero constant everytime before and after printf
These are two different issues.
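The zero before printf follows the x86-64 (a.k.a. AMD64) SysV ABI: for a call to a variadic function, %al must hold an upper bound on the number of vector (XMMn, YMMn...) registers used to pass arguments. No floating-point arguments are passed here, so it is 0.
The zero after printf is the return value of main (most likely the return 0 at its end).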
why passing only %eax to %esi causes printf to print the values both from %eax and %edx?
It does not.
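The same ABI specifies: the first argument (the printf format string pointer) goes in %rdi, the second argument (the first variable argument) in %rsi, the third in %rdx, and so on. The value 20 is already in %edx, which is where printf expects its third argument, so only 15 still has to be copied from %eax to %esi. The extra move of arguments seems to be an artifact of non-optimized (-O0) gcc output. If you add any optimization (even -Og), you'll see these senseless moves wiped out.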
I have a main function in C that calls code written in assembly. I just want to compute a simple sum:
main.c
#include <stdio.h>
extern int addByAssembly(int first_number, int second_number);
int main (int argc, char **argv)
{
int sum=0;
sum = addByAssembly(5,4);
printf ("%d\n",sum);
return 0;
}
addByAssembly.s
.data
SYSREAD = 0
SYSWRITE = 1
SYSEXIT = 60
STDOUT = 1
STDIN = 0
EXIT_SUCCESS = 0
.text
#.global main
#main:
#call write
#movq $SYSEXIT, %rax
#movq $EXIT_SUCCESS, %rdi
#syscall
#********
.globl addByAssembly
addByAssembly:
pushq %rbp
movq %rsp, %rbp
movq 16(%rsp), %rax
addq 24(%rsp), %rax
movq %rbp, %rsp
popq %rbp
ret
But I get garbage in my sum. It looks like I am passing the arguments incorrectly, because if I do this:
movq $123, %rax
the return value is 123. I've tried many ways, but cannot find out how to compute the sum properly.
Thanks to 'Jester' for so much effort and time spent explaining this to me!
To sum up: passing parameters from C to assembly (and likewise from assembly to C) follows an ABI calling convention.
Under that convention, parameters are passed in this order:
1) rdi
2) rsi
3) rdx
... and so on...
If you have more parameters than the convention has registers for, the remaining ones are pushed onto the stack.
So in my case:
.globl addByAssembly
addByAssembly:
pushq %rbp
movq %rsp, %rbp
# removed: movq 16(%rsp), %rax   (this was wrong: my params are not on the stack,
# removed: addq 24(%rsp), %rax    the first is in %rdi, the second in %rsi)
lea (%rdi, %rsi), %rax   # added: in my case this line will do
# %rdi+%rsi -> %rax (learn lea, a useful instruction)
# REMEMBER: an integer return value is always in %rax!
movq %rbp, %rsp
popq %rbp
ret