I came across a minimal HTTP server that is written without libc: https://github.com/Francesco149/nolibc-httpd
I can see that basic string handling functions are defined, leading to the write syscall:
#define fprint(fd, s) write(fd, s, strlen(s))
#define fprintn(fd, s, n) write(fd, s, n)
#define fprintl(fd, s) fprintn(fd, s, sizeof(s) - 1)
#define fprintln(fd, s) fprintl(fd, s "\n")
#define print(s) fprint(1, s)
#define printn(s, n) fprintn(1, s, n)
#define printl(s) fprintl(1, s)
#define println(s) fprintln(1, s)
And the basic syscalls are declared in the C file:
size_t read(int fd, void *buf, size_t nbyte);
ssize_t write(int fd, const void *buf, size_t nbyte);
int open(const char *path, int flags);
int close(int fd);
int socket(int domain, int type, int protocol);
int accept(int socket, sockaddr_in_t *restrict address,
socklen_t *restrict address_len);
int shutdown(int socket, int how);
int bind(int socket, const sockaddr_in_t *address, socklen_t address_len);
int listen(int socket, int backlog);
int setsockopt(int socket, int level, int option_name, const void *option_value,
socklen_t option_len);
int fork();
void exit(int status);
So I guess the magic happens in start.S, which contains _start and a special way of encoding syscalls by creating global labels which fall through and accumulating values in r9 to save bytes:
.intel_syntax noprefix
/* functions: rdi, rsi, rdx, rcx, r8, r9 */
/* syscalls: rdi, rsi, rdx, r10, r8, r9 */
/* ^^^ */
/* stack grows from a high address to a low address */
#define c(x, n) \
.global x; \
x:; \
add r9,n
c(exit, 3) /* 60 */
c(fork, 3) /* 57 */
c(setsockopt, 4) /* 54 */
c(listen, 1) /* 50 */
c(bind, 1) /* 49 */
c(shutdown, 5) /* 48 */
c(accept, 2) /* 43 */
c(socket, 38) /* 41 */
c(close, 1) /* 03 */
c(open, 1) /* 02 */
c(write, 1) /* 01 */
.global read /* 00 */
read:
mov r10,rcx
mov rax,r9
xor r9,r9
syscall
ret
.global _start
_start:
xor rbp,rbp
xor r9,r9
pop rdi /* argc */
mov rsi,rsp /* argv */
call main
call exit
Is this understanding correct? GCC uses the symbols defined in start.S for the syscalls, and the program starts in _start and calls main from the C file?
Also how does the separate httpd.asm custom binary work? Just hand-optimized assembly combining the C source and start assembly?
(I cloned the repo and tweaked the .c and .S to compile better with clang -Oz: 992 bytes, down from the original 1208 with gcc. See the WIP-clang-tuning branch in my fork, until I get around to cleaning that up and sending a pull request. With clang, inline asm for the syscalls does save size overall, especially once main has no calls and no rets. IDK if I want to hand-golf the whole .asm after regenerating from compiler output; there are certainly chunks of it where significant savings are possible, e.g. using lodsb in loops.)
It looks like they need r9 to be 0 before a call to any of these labels, either with a register global var or maybe gcc -ffixed-r9 to tell GCC to keep its hands off that register permanently. Otherwise GCC would have left whatever garbage in r9, just like other registers.
Their functions are declared with normal prototypes, not 6 args with dummy 0 args to get every call site to actually zero r9, so that's not how they're doing it.
special way of encoding syscalls
I wouldn't describe that as "encoding syscalls". Maybe "defining syscall wrapper functions". They're defining their own wrapper function for each syscall, in an optimized way that falls through into one common handler at the bottom. In the C compiler's asm output, you'll still see call write.
(It might have been more compact for the final binary to use inline asm to let the compiler inline a syscall instruction with the args in the right registers, instead of making it look like a normal function that clobbers all the call-clobbered registers. Especially if compiled with clang -Oz which would use 3-byte push 2 / pop rax instead of 5-byte mov eax, 2 to set up the call number. push imm8/pop/syscall is the same size as call rel32.)
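The inline-asm alternative mentioned in that parenthetical could look roughly like the sketch below. It is only an illustration: raw_write is a hypothetical name, not something from the repo, and the block is written as an ordinary hosted program (with an assert and a libc fallback for other platforms) so that it can be tried quickly. The x86-64 Linux path relies on the kernel convention of the call number in rax, arguments in rdi, rsi, rdx, and rcx/r11 being clobbered by the syscall instruction.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__linux__) && defined(__x86_64__)
/* Hypothetical wrapper: write(2) through an inlined `syscall` instruction.
   Call number in rax, args in rdi, rsi, rdx; the kernel clobbers rcx and r11. */
static long raw_write(int fd, const void *buf, size_t len)
{
    long ret;
    assert(buf != NULL || len == 0);   /* drop this in a -nostdlib build */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(1L), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;   /* byte count, or a negative errno value */
}
#else
/* Portability fallback so the sketch still builds elsewhere; this path uses libc. */
#include <unistd.h>
static long raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    return (long)write(fd, buf, len);
}
#endif
```

Because the compiler sees the exact register constraints, it can place the arguments directly and knows that only rcx, r11 and memory are affected, instead of assuming every call-clobbered register is lost as it must for a call to an opaque wrapper.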
Yes, you can define functions in hand-written asm with .global foo / foo:. You could look at this as one large function with multiple entry points for different syscalls. In asm, execution always passes to the next instruction, regardless of labels, unless you use a jump/call/ret instruction. The CPU doesn't know about labels.
So it's just like a C switch(){} statement without break; between case: labels, or like C labels you can jump to with goto. Except of course in asm you can do this at global scope, while in C you can only goto within a function. And in asm you can call instead of just goto (jmp).
static long callnum = 0; // r9 = 0 before a call to any of these
...
socket:
callnum += 38;
close:
callnum++; // can use inc instead of add 1
open: // missed optimization in their asm
callnum++;
write:
callnum++;
read:
tmp=callnum;
callnum=0;
retval = syscall(tmp, args);
Or if you recast this as a chain of tailcalls, where we can omit even the jmp foo and instead just fall through: C like this truly could compile to the hand-written asm, if you had a smart enough compiler. (And you could solve the arg-type
register long callnum asm("r9"); // GCC extension
long open(args...) {
callnum++;
return write(args...);
}
long write(args...) {
callnum++;
return read(args...); // tailcall
}
long read(args...){
tmp=callnum;
callnum=0; // reset callnum for next call
return syscall(tmp, args...);
}
args... are the arg-passing registers (RDI, RSI, RDX, RCX, R8) which they simply leave unmodified. R9 is the last arg-passing register for x86-64 System V, but they didn't use any syscalls that take 6 args. setsockopt takes 5 args so they couldn't skip the mov r10, rcx. But they were able to use r9 for something else, instead of needing it to pass the 6th arg.
That's amusing that they're trying so hard to save bytes at the expense of performance, but still use xor rbp,rbp instead of xor ebp,ebp. Unless they build with gcc -Wa,-Os start.S, GAS won't optimize away the REX prefix for you. (Does GCC optimize assembly source file?)
They could save another byte with xchg rax, r9 (2 bytes including REX) instead of mov rax, r9 (REX + opcode + modrm). (Code golf.SE tips for x86 machine code)
I'd also have used xchg eax, r9d because I know Linux system call numbers fit in 32 bits, although it wouldn't save code size because a REX prefix is still needed to encode the r9d register number. Also, in the cases where they only need to add 1, inc r9d is only 3 bytes, vs. add r9d, 1 being 4 bytes (REX + opcode + modrm + imm8). (The no-modrm short-form encoding of inc is only available in 32-bit mode; in 64-bit mode it's repurposed as a REX prefix.)
mov rsi,rsp could also save a byte as push rsp / pop rsi (1 byte each) instead of 3-byte REX + mov. That would make room for returning main's return value with xchg edi, eax before call exit.
But since they're not using libc, they could inline that exit, or put the syscalls below _start so they can just fall into it, because exit happens to be the highest-numbered syscall! Or at least jmp exit since they don't need stack alignment, and jmp rel8 is more compact than call rel32.
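To make the byte counting above concrete, here is a small sketch that stores the machine-code bytes of each instruction pair and adds up the savings. The byte values are my own hand encodings of the instructions named in the text (worth double-checking with an assembler), and start_s_bytes_saved is a hypothetical helper name.

```c
#include <assert.h>

/* x86-64 encodings of the alternatives discussed above. */
const unsigned char xor_rbp_rbp[]      = { 0x48, 0x31, 0xED };       /* xor rbp,rbp */
const unsigned char xor_ebp_ebp[]      = { 0x31, 0xED };             /* xor ebp,ebp */
const unsigned char mov_rax_r9[]       = { 0x4C, 0x89, 0xC8 };       /* mov rax,r9  */
const unsigned char xchg_rax_r9[]      = { 0x49, 0x91 };             /* xchg rax,r9 */
const unsigned char mov_rsi_rsp[]      = { 0x48, 0x89, 0xE6 };       /* mov rsi,rsp */
const unsigned char push_rsp_pop_rsi[] = { 0x54, 0x5E };             /* push rsp / pop rsi */
const unsigned char add_r9d_1[]        = { 0x41, 0x83, 0xC1, 0x01 }; /* add r9d,1   */
const unsigned char inc_r9d[]          = { 0x41, 0xFF, 0xC1 };       /* inc r9d     */

#define BYTES_SAVED(longer, shorter) ((int)sizeof(longer) - (int)sizeof(shorter))

/* Total bytes saved if start.S used the shorter form everywhere it applies. */
static int start_s_bytes_saved(void)
{
    int saved = 0;
    saved += BYTES_SAVED(xor_rbp_rbp, xor_ebp_ebp);
    saved += BYTES_SAVED(mov_rax_r9, xchg_rax_r9);
    saved += BYTES_SAVED(mov_rsi_rsp, push_rsp_pop_rsi);
    saved += 5 * BYTES_SAVED(add_r9d_1, inc_r9d);  /* listen, bind, close, open, write */
    assert(saved > 0);
    return saved;
}
```

Each substitution saves a single byte, so the total is small in absolute terms, but for a binary of around a kilobyte that is the scale at which this kind of golfing operates.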
Also how does the separate httpd.asm custom binary work? Just hand-optimized assembly combining the C source and start assembly?
No, that's fully stand-alone incorporating the start.S code (at the ?_017: label), and maybe hand-tweaked compiler output. Perhaps from hand-tweaking disassembly of a linked executable, hence not having nice label names even for the part from the hand-written asm. (Specifically, from Agner Fog's objconv, which uses that format for labels in its NASM-syntax disassembly.)
(Ruslan also pointed out stuff like jnz after cmp, instead of jne which has the more appropriate semantic meaning for humans, so another sign of it being compiler output, not hand-written.)
I don't know how they arranged to get the compiler not to touch r9. It seems to be just luck. The readme indicates that just compiling the .c and .S works for them, with their GCC version.
As far as the ELF headers, see the comment at the top of the file, which links A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux - you'd assemble this with nasm -fbin and the output is a complete ELF binary, ready to run. Not a .o that you need to link + strip, so you get to account for every single byte in the file.
You're pretty much correct about what's going on. Very interesting, I've never seen something like this before. As you said, execution enters at one of the labels and falls through every label below it, so r9 keeps accumulating until it reaches read, whose syscall number is 0. This is why the ordering is pretty clever. Assuming r9 is 0 on entry (the code at the read label copies r9 into rax and then zeroes r9 again, ready for the next call), entering at read needs no addition because r9 already holds the correct syscall number. write's syscall number is 1, so its label only adds 1 to 0, as shown in the macro call. open's syscall number is 2, so 1 is added at the open label, 1 more at the write label, and then the result is moved into rax at the read label. And so on. Parameter registers like rdi, rsi, rdx, etc. are not touched (apart from copying rcx into r10, as the syscall convention requires), so each label basically acts like a normal function.
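The accumulation described above can be modelled in plain C with a switch whose cases deliberately fall through. This is only a sketch for checking the arithmetic: the enum and syscall_number are hypothetical names, and the increments are copied from the c(x, n) lines in start.S.

```c
#include <assert.h>

/* Entry points in the same top-to-bottom order as the labels in start.S. */
enum entry {
    ENTRY_EXIT, ENTRY_FORK, ENTRY_SETSOCKOPT, ENTRY_LISTEN, ENTRY_BIND,
    ENTRY_SHUTDOWN, ENTRY_ACCEPT, ENTRY_SOCKET, ENTRY_CLOSE, ENTRY_OPEN,
    ENTRY_WRITE, ENTRY_READ
};

/* Value r9 holds on reaching `read`, assuming r9 == 0 at the entry label. */
static long syscall_number(enum entry e)
{
    long r9 = 0;
    assert((int)e >= ENTRY_EXIT && (int)e <= ENTRY_READ);
    switch (e) {
    case ENTRY_EXIT:       r9 += 3;   /* fall through */
    case ENTRY_FORK:       r9 += 3;   /* fall through */
    case ENTRY_SETSOCKOPT: r9 += 4;   /* fall through */
    case ENTRY_LISTEN:     r9 += 1;   /* fall through */
    case ENTRY_BIND:       r9 += 1;   /* fall through */
    case ENTRY_SHUTDOWN:   r9 += 5;   /* fall through */
    case ENTRY_ACCEPT:     r9 += 2;   /* fall through */
    case ENTRY_SOCKET:     r9 += 38;  /* fall through */
    case ENTRY_CLOSE:      r9 += 1;   /* fall through */
    case ENTRY_OPEN:       r9 += 1;   /* fall through */
    case ENTRY_WRITE:      r9 += 1;   /* fall through */
    case ENTRY_READ:       break;     /* read is syscall 0: nothing to add */
    }
    return r9;
}
```

For example, syscall_number(ENTRY_SOCKET) adds 38 + 1 + 1 + 1 and gives 41, and syscall_number(ENTRY_EXIT) sums every increment and gives 60, matching the numbers in the comments of start.S.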
Also how does the separate httpd.asm custom binary work? Just hand-optimized assembly combining the C source and start assembly?
I'm assuming you're talking about this file. Not sure exactly what's going on here, but it looks like an ELF file is manually being created, probably to reduce size further.
How do I get the argument values using inline assembly in C without glibc?
I need this code for Linux on the x86_64 and i386 architectures.
If you know how to do this on Mac OS X or Windows, please share that as well.
void exit(int code)
{
// This function is not important!
//...
}
void _start()
{
// How do I get the argument values using inline assembly
// in C without glibc?
//argc
//argv
exit(0);
}
Update
https://gist.github.com/apsun/deccca33244471c1849d29cc6bb5c78e
and
#define ReadRdi(To) asm("movq %%rdi,%0" : "=r"(To));
#define ReadRsi(To) asm("movq %%rsi,%0" : "=r"(To));
long argcL;
long argvL;
ReadRdi(argcL);
ReadRsi(argvL);
int argc = (int) argcL;
//char **argv = (char **) argvL;
exit(argc);
But it still returns 0.
So this code is wrong.
Please help.
As specified in the comment, argc and argv are provided on the stack, so you cannot use a regular C function to get them, even with inline assembly, because the compiler will adjust the stack pointer to allocate local variables, set up the stack frame, and so on. Hence, _start must be written in assembly, as is done in glibc (x86; x86_64). A small stub can just grab the values and forward them to your "real" C entry point according to the regular calling convention.
Here is a minimal example of a program (for both x86 and x86_64) that reads argc and argv, prints all the values in argv on stdout (separated by newlines) and exits using argc as the status code; it can be compiled with the usual gcc -nostdlib (and -static to make sure ld.so isn't involved; not that it does any harm here).
#ifdef __x86_64__
asm(
".global _start\n"
"_start:\n"
" xorl %ebp,%ebp\n" // mark outermost stack frame
" movq 0(%rsp),%rdi\n" // get argc
" lea 8(%rsp),%rsi\n" // the argument pointers sit just above argc, so argv = %rsp + 8
" call bare_main\n" // call our bare_main
" movq %rax,%rdi\n" // take the main return code and use it as first argument for...
" movl $60,%eax\n" // ... the exit syscall
" syscall\n"
" int3\n"); // just in case
asm(
"bare_write:\n" // write syscall wrapper; the calling convention is pretty much ok as is
" movq $1,%rax\n" // 1 = write syscall on x86_64
" syscall\n"
" ret\n");
#endif
#ifdef __i386__
asm(
".global _start\n"
"_start:\n"
" xorl %ebp,%ebp\n" // mark outermost stack frame
" movl 0(%esp),%edi\n" // argc is on the top of the stack
" lea 4(%esp),%esi\n" // as above, but with 4-byte pointers
" sub $8,%esp\n" // the stack starts 16-byte aligned, we have to push 2*4 bytes; "waste" 8 bytes
" pushl %esi\n" // to keep it aligned after pushing our arguments
" pushl %edi\n"
" call bare_main\n" // call our bare_main
" add $8,%esp\n" // fix the stack after call (actually useless here)
" movl %eax,%ebx\n" // take the main return code and use it as first argument for...
" movl $1,%eax\n" // ... the exit syscall
" int $0x80\n"
" int3\n"); // just in case
asm(
"bare_write:\n" // write syscall wrapper; convert the user-mode calling convention to the syscall convention
" pushl %ebx\n" // ebx is callee-preserved
" movl 8(%esp),%ebx\n" // just move stuff from the stack to the correct registers
" movl 12(%esp),%ecx\n"
" movl 16(%esp),%edx\n"
" mov $4,%eax\n" // 4 = write syscall on i386
" int $0x80\n"
" popl %ebx\n" // restore ebx
" ret\n"); // notice: the return value is already ok in %eax
#endif
int bare_write(int fd, const void *buf, unsigned count);
unsigned my_strlen(const char *ch) {
const char *ptr;
for(ptr = ch; *ptr; ++ptr);
return ptr-ch;
}
int bare_main(int argc, char *argv[]) {
for(int i = 0; i < argc; ++i) {
int len = my_strlen(argv[i]);
bare_write(1, argv[i], len);
bare_write(1, "\n", 1);
}
return argc;
}
Notice that here several subtleties are ignored - in particular, the atexit bit. All the documentation about the machine-specific startup state has been extracted from the comments in the two glibc files linked above.
This answer is for x86-64, 64-bit Linux ABI, only. All the other OSes and ABIs mentioned will be broadly similar, but different enough in the fine details that you will need to write your custom _start once for each.
You are looking for the specification of the initial process state in the "x86-64 psABI", or, to give it its full title, "System V Application Binary Interface, AMD64 Architecture Processor Supplement (With LP64 and ILP32 Programming Models)". I will reproduce figure 3.9, "Initial Process Stack", here:
Purpose                            Start Address    Length
------------------------------------------------------------------------
Information block, including                        varies
argument strings, environment
strings, auxiliary information
...
------------------------------------------------------------------------
Null auxiliary vector entry                         1 eightbyte
Auxiliary vector entries...                         2 eightbytes each
0                                                   eightbyte
Environment pointers...                             1 eightbyte each
0                                  8+8*argc+%rsp    eightbyte
Argument pointers...               8+%rsp           argc eightbytes
Argument count                     %rsp             eightbyte
It goes on to say that the initial registers are unspecified except for %rsp, which is of course the stack pointer, and %rdx, which may contain "a function pointer to register with atexit".
So all the information you are looking for is already present in memory, but it hasn't been laid out according to the normal calling convention, which means you must write _start in assembly language. It is _start's responsibility to set everything up to call main with, based on the above. A minimal _start would look something like this:
_start:
xorl %ebp, %ebp # mark the deepest stack frame
# Current Linux doesn't pass an atexit function,
# so you could leave out this part of what the ABI doc says you should do
# You can't just keep the function pointer in a call-preserved register
# and call it manually, even if you know the program won't call exit
# directly, because atexit functions must be called in reverse order
# of registration; this one, if it exists, is meant to be called last.
testq %rdx, %rdx # is there "a function pointer to
je skip_atexit # register with atexit"?
movq %rdx, %rdi # if so, do it
call atexit
skip_atexit:
movq (%rsp), %rdi # load argc
leaq 8(%rsp), %rsi # calc argv (pointer to the array on the stack)
leaq 16(%rsp,%rdi,8), %rdx # calc envp (starts after the NULL terminator for argv[], which is at 8+8*argc+%rsp)
call main
movl %eax, %edi # pass return value of main to exit
call exit
hlt # should never get here
(Completely untested.)
(In case you're wondering why there's no adjustment to maintain stack pointer alignment, this is because upon a normal procedure call, 8(%rsp) is 16-byte aligned, but when _start is called, %rsp itself is 16-byte aligned. Each call instruction displaces %rsp down by eight, producing the alignment situation expected by normal compiled functions.)
A more thorough _start would do more things, such as clearing all the other registers, arranging for greater stack pointer alignment than the default if desired, calling into the C library's own initialization functions, setting up environ, initializing the state used by thread-local storage, doing something constructive with the auxiliary vector, etc.
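The layout in figure 3.9 can also be walked from C once a pointer to the initial stack is available (for example, passed in by an assembly stub). A minimal sketch, assuming pointer-sized slots as on x86-64; parse_initial_stack and struct start_info are hypothetical names, not part of any ABI or library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct start_info {
    long   argc;
    char **argv;
    char **envp;
};

/* `sp` is the value the stack pointer had on entry to _start
   (or a pointer to a copy of that memory). */
static struct start_info parse_initial_stack(char **sp)
{
    struct start_info info;
    assert(sp != NULL);
    info.argc = (long)(intptr_t)sp[0];      /* argument count at 0(%rsp)       */
    info.argv = sp + 1;                     /* argument pointers at 8(%rsp)    */
    info.envp = info.argv + info.argc + 1;  /* skip argv[] and its NULL entry  */
    return info;
}
```

The auxiliary vector would follow in the same way, starting one slot past the NULL that terminates envp.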
You should also be aware that if there is a dynamic linker (a PT_INTERP program header in the executable), it receives control before _start does. Glibc's ld.so cannot be used with any C library other than glibc itself; if you are writing your own C library, and you want to support dynamic linkage, you will also need to write your own ld.so. (Yes, this is unfortunate; ideally, the dynamic linker would be a separate development project and its complete interface would be specified.)
As a quick and dirty hack, you can make an executable with a compiled C function as the ELF entry point. Just make sure you use exit or _exit instead of returning.
(Link with gcc -nostartfiles to omit CRT but still link other libraries, and write a _start() in C. Beware of ABI violations like stack alignment, e.g. use -mincoming-stack-boundary=2 or an __attribute__ on _start, as in Compiling without libc)
If it's dynamically linked, you can still use glibc functions on Linux (because the dynamic linker runs glibc's init functions). Not all systems are like this, e.g. on cygwin you definitely can't call libc functions if you (or the CRT start code) haven't called the libc init functions in the correct order. I'm not sure it's even guaranteed that this works on Linux, so don't depend on it except for experimentation on your own system.
I have used a C _start(void){ ... } + calling _exit() for making a static executable to microbenchmark some compiler-generated code with less startup overhead for perf stat ./a.out.
Glibc's _exit() works even if glibc wasn't initialized (gcc -O3 -static), or use inline asm to run xor %edi,%edi / mov $60, %eax / syscall (sys_exit(0) on Linux) so you don't have to even statically link libc. (gcc -O3 -nostdlib)
With even more dirty hacking and UB, you can access argc and argv by knowing the x86-64 System V ABI that you're compiling for (see @zwol's answer for a quote from the ABI doc), and how the process startup state differs from the function calling convention:
argc is where the return address would be for a normal function (pointed to by RSP). GNU C has a builtin for accessing the return address of the current function (or for walking up the stack.)
argv[0] is where the 7th integer/pointer arg should be (the first stack arg, just above the return address). It happens to / seems to work to take its address and use that as an array!
// Works only for the x86-64 SystemV ABI; only tested on Linux.
// DO NOT USE THIS EXCEPT FOR EXPERIMENTS ON YOUR OWN COMPUTER.
#include <stdio.h>
#include <stdlib.h>
// tell gcc *this* function is called with a misaligned RSP
__attribute__((force_align_arg_pointer))
void _start(int dummy1, int dummy2, int dummy3, int dummy4, int dummy5, int dummy6, // register args
char *argv0) {
int argc = (int)(long)__builtin_return_address(0); // load (%rsp), casts to silence gcc warnings.
char **argv = &argv0;
printf("argc = %d, argv[argc-1] = %s\n", argc, argv[argc-1]);
printf("%f\n", 1.234); // segfaults if RSP is misaligned
exit(0);
//_exit(0); // without flushing stdio buffers!
}
# with a version without the FP printf
peter@volta:~/src/SO$ gcc -nostartfiles _start.c -o bare_start
peter@volta:~/src/SO$ ./bare_start
argc = 1, argv[argc-1] = ./bare_start
peter@volta:~/src/SO$ ./bare_start abc def hij
argc = 4, argv[argc-1] = hij
peter@volta:~/src/SO$ file bare_start
bare_start: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=af27c8416b31bb74628ef9eec51a8fc84e49550c, not stripped
# I could have used -fno-pie -no-pie to make a non-PIE executable
This works with or without optimization, with gcc7.3. I was worried that without optimization, the address of argv0 would be below rbp where it copies the arg, rather than its original location. But apparently it works.
gcc -nostartfiles links glibc but not the CRT start files.
gcc -nostdlib omits both libraries and CRT startup files.
Very little of this is guaranteed to work, but it does in practice work with current gcc on current x86-64 Linux, and has worked in the past for years. If it breaks, you get to keep both pieces. IDK what C features are broken by omitting the CRT startup code and just relying on the dynamic linker to run glibc init functions. Also, taking the address of an arg and accessing pointers above it is UB, so you could maybe get broken code-gen. gcc7.3 happens to do what you'd expect in this case.
Things that definitely break
atexit() cleanup, e.g. flushing stdio buffers.
static destructors for static objects in dynamically-linked libraries. (On entry to _start, RDX is a function pointer you should register with atexit for this reason. In a dynamically linked executable, the dynamic linker runs before your _start and sets RDX before jumping to your _start. Statically linked executables have RDX=0 under Linux.)
gcc -mincoming-stack-boundary=3 (i.e. 2^3 = 8 bytes) is another way to get gcc to realign the stack, because the -mpreferred-stack-boundary=4 default of 2^4 = 16 is still in place. But that makes gcc assume under-aligned RSP for all functions, not just for _start, which is why I looked in the docs and found an attribute that was intended for 32-bit when the ABI transitioned from only requiring 4-byte stack alignment to the current requirement of 16-byte alignment for ESP in 32-bit mode.
The SysV ABI requirement for 64-bit mode has always been 16-byte alignment, but gcc options let you make code that doesn't follow the ABI.
// test call to a function the compiler can't inline
// to see if gcc emits extra code to re-align the stack
// like it would if we'd used -mincoming-stack-boundary=3 to assume *all* functions
// have only 8-byte (2^3) aligned RSP on entry, with the default -mpreferred-stack-boundary=4
void foo() {
int i = 0;
atoi(NULL);
}
With -mincoming-stack-boundary=3, we get stack-realignment code there, where we don't need it. gcc's stack-realignment code is pretty clunky, so we'd like to avoid that. (Not that you'd really ever use this to compile a significant program where you care about efficiency, please only use this stupid computer trick as a learning experiment.)
But anyway, see the code on the Godbolt compiler explorer with and without -mincoming-stack-boundary=3.
I have an assembly application for Linux x64 where I pass arguments to the functions via registers, thus I'm using a certain calling convention, in this case fastcall. Now I want to call a C function from the assembly application which, say, expects 10 arguments. Do I have to switch to cdecl for that and pass the arguments via the stack, regardless of the fact that everywhere else in my application I'm passing them via registers? Is it allowed to mix calling conventions in one application?
I assume that by fastcall, you mean the amd64 calling convention used by the SysV ABI (i.e. what Linux uses) where the first few arguments are passed in rdi, rsi, and rdx.
The ABI is slightly complicated, the following is a simplification. You might want to read the specification for details.
Generally speaking, the first few (leftmost) integer or pointer arguments are placed into the registers rdi, rsi, rdx, rcx, r8, and r9. Floating point arguments are passed in xmm0 to xmm7. If the register space is exhausted, additional arguments are passed through the stack from right to left. For example, to call a function with 10 integer arguments:
foo(a, b, c, d, e, f, g, h, i, k);
you would need code like this:
mov $a,%edi
mov $b,%esi
mov $c,%edx
mov $d,%ecx
mov $e,%r8d
mov $f,%r9d
push $k
push $i
push $h
push $g
call foo
add $32,%rsp
For your concrete example, of getnameinfo:
int getnameinfo(
const struct sockaddr *sa,
socklen_t salen,
char *host,
size_t hostlen,
char *serv,
size_t servlen,
int flags);
You would pass sa in rdi, salen in rsi, host in rdx, hostlen in rcx, serv in r8, servlen in r9 and flags on the stack.
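The rule of thumb above (six integer registers, then the stack) can be written down as a small lookup. This is a simplified sketch that only covers integer and pointer arguments and assumes RSP is 16-byte aligned before the caller starts pushing; int_arg_location and stack_arg_bytes are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_INT_ARG_REGS 6

/* Registers used, in order, for integer/pointer args in the x86-64 System V ABI. */
static const char *const int_arg_regs[NUM_INT_ARG_REGS] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

/* Location of the n-th (0-based) integer/pointer argument. */
static const char *int_arg_location(size_t index)
{
    return index < NUM_INT_ARG_REGS ? int_arg_regs[index] : "stack";
}

/* Stack bytes the caller uses for `nargs` integer args, rounded up so that
   RSP is still 16-byte aligned at the call instruction. */
static size_t stack_arg_bytes(size_t nargs)
{
    size_t on_stack = nargs > NUM_INT_ARG_REGS ? nargs - NUM_INT_ARG_REGS : 0;
    assert(nargs <= 255);   /* arbitrary sanity limit for this sketch */
    return (on_stack * 8 + 15) & ~(size_t)15;
}
```

For the 10-argument foo, stack_arg_bytes(10) is 32, matching the add $32,%rsp in the example. For getnameinfo with its single stack argument, the result is 16 rather than 8, because 8 bytes of padding are needed to keep the stack aligned.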
Yes of course. Calling convention is applied on per-function basis. This is a perfectly valid application:
int __stdcall func1()
{
return(1);
}
int __fastcall func2()
{
return(2);
}
int __cdecl main(void)
{
func1();
func2();
return(0);
}
You can, but you don't need to.
__attribute__((fastcall)) only asks for the first two parameters to be passed in registers; everything else will automatically be passed on the stack, just like with cdecl. This is done so that choosing a certain calling convention does not limit the number of parameters that can be given to a function.
In your example with 10 parameters for a function that is called with the fastcall calling convention, the first two parameters will be passed in registers, the remaining 8 automatically on the stack, just like with standard calling convention.
As you have chosen to use fastcall for all your other functions, I do not see a reason why you'd want to change this for one specific function.
I have a variadic function in C to write in a log file, but as soon as it is invoked, it gives a segmentation fault in the header.
In the main process, the call has this format:
mqbLog("LOG_INFORMATION",0,0,"Connect",0,"","Parameter received");
and the function is defined this way:
void mqbLog(char *type,
int numContext,
double sec,
char *service,
int sizeData,
char *data,
char *fmt,
...
)
{
//write the log in the archive
}
It compiles OK. When I debug the process, the call to the mqbLog function is done, and it gives me the segmentation fault in the open bracket of the function, so I can ask about the function values:
(gdb) p type
$1 = 0x40205e "LOG_INFORMATION"
(gdb) p numContext
$2 = 0
(gdb) p sec
$3 = 0
(gdb) p service
$4 = 0x0
(gdb) p sizeData
$5 = 4202649
(gdb) p data
$6 = 0x0
Any ideas will be gratefully received.
Based on the gdb output, it looks like the caller didn't have a prototype for the function it was calling. As @JonathanLeffler noticed, you wrote 0 instead of 0.0, so it's passing an integer where the callee is expecting a double.
Judging from the pointer value, this is probably on x86-64 Linux with the System V calling convention, where the register assigned for an arg is determined by it being e.g. the third integer arg. (See the x86 wiki for ABI/calling convention docs).
So if the caller and callee disagree about the function signature, they will disagree about which arg goes in which register, which I think explains why gdb is showing args that don't match the caller.
In this case, the caller puts "Connect" (the address) in RCX, because it's the 4th integer/pointer arg with that implicit declaration.
The callee looks for the value of service in RDX, because it's the callee's 3rd integer/pointer arg.
sec is 0.0 in the callee apparently by chance. It's just using whatever was sitting in XMM0. Or maybe possibly uninitialized stack space, since the caller would have set AL=0 to indicate that no FP args were passed in registers (necessary for variadic functions only). Note al = number of fp register args includes the fixed non-variadic args when the prototype is available. Compiling your call with the prototype available includes a mov eax, 1 before the call. See the source+asm for compiling with/without the prototype on the Godbolt compiler explorer.
In a different calling convention (e.g. -m32 with stack args), things would break at least as badly because those args would be passed on the stack, but int and double are different sizes.
Writing 0.0 for the FP args would make the implicit declaration match the definition. But don't do this, it's still a terrible idea to call undeclared functions. Use -Wall to have the compiler tell you when your code does bad things.
Your function might still crash; who knows what other bugs you have in code that's not shown?
When your code crashes, you should look at the asm instruction it crashed on to figure out which pointer was bad — e.g. run disas in gdb. Even if you don't understand it yourself, including that in a debugging-help question (along with register values) can help a lot.
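As an illustration of the prototype point, here is a cut-down sketch where the declaration is visible before the call, so an int 0 passed for sec is converted to double by the compiler. mqbLogFormat is a hypothetical variant of the asker's function that formats into a caller-supplied buffer instead of a log file, and the const qualifiers are my addition.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Prototype visible before any call: the compiler now knows `sec` is a double. */
int mqbLogFormat(char *out, size_t outSize, const char *type, int numContext,
                 double sec, const char *service, int sizeData,
                 const char *data, const char *fmt, ...);

int mqbLogFormat(char *out, size_t outSize, const char *type, int numContext,
                 double sec, const char *service, int sizeData,
                 const char *data, const char *fmt, ...)
{
    va_list ap;
    int used, more;

    assert(out != NULL && outSize > 0 && fmt != NULL);
    (void)numContext; (void)sizeData; (void)data;   /* unused in this sketch */

    /* Fixed part of the record, then the caller's printf-style message. */
    used = snprintf(out, outSize, "%s %.1f %s: ", type, sec, service);
    if (used < 0 || (size_t)used >= outSize)
        return used;
    va_start(ap, fmt);
    more = vsnprintf(out + used, outSize - (size_t)used, fmt, ap);
    va_end(ap);
    return more < 0 ? more : used + more;
}
```

With this declaration in scope, a call such as mqbLogFormat(buf, sizeof buf, "LOG_INFORMATION", 0, 0, "Connect", 0, "", "Parameter received") passes sec in an XMM register and service in RDX, exactly where the callee looks for them.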
I am very new to assembly language, and fairly new to C. I have looked at an example in which the C code calls a function, and the assembly code defines that function, which does the calculation and returns the value. (This is an assignment.)
C code:
#include <stdio.h>
int Func(int);
int main()
{
int Arg;
Arg = 5;
printf("Value returned is %d when %d sent\n",Func(Arg), Arg);
}
Assembly Code:
.global Func
Func: save %sp,-800, %sp
add %i0, -45 , %l0
mov %l0, %i0
ret
restore
It takes the value from the C code, adds the value to the number in the assembly code, and outputs the new number. I understand this instance for the most part. Our assignment (modifying the code): "Write a C source file that calls Func1 with 2 parameters A and B, and an assembly source file which contains two methods, Func1 and Func2. Have Func1 call Func2 as though it were Func2(Q). Func2 should double its input argument and send that doubled value back to Func1. Func1 should return to the C main the value 2*A + 2*B." I have attempted this, and came up with this solution (please forgive me, I am new to this as of today):
#include <stdio.h>
int Func1(int, int);
void Func2(int, int);
int main()
{
int Arg1 = 20;
int Arg2 = 4;
printf("Value returned is %d ",Func1(Arg1,Arg2));
}
Assembly:
.global Func1
Func1: save %sp,-800, %sp
mov %l0, %i0
mov %l1, %i1
call Func2
nop
ret
restore
Func2: save %sp,-800, %sp
umul %i0, 2 , %l0
umul %i1, 2 , %l1
call Func1
nop
It is not working, and I'm not surprised one bit. I'm sure there are many things wrong with this code, but a thorough explanation of what is going on here or what I am doing wrong would really help.
Do I see this correctly:
In Func1, you call Func2
which calls Func1 again
which calls Func2 again
which calls Func1 again
which calls Func2 again
...
Stack overflow, resulting in bad memory access and segmentation fault
Obviously, don't do that :). What do you want to do, exactly? Return result of multiplication from Func2? Then return it, just like you return result of addition from Func1.
Then the assignment clearly says:
call Func2 as though it were Func2(Q). Func2 should double its input
argument and send that doubled value back
So why do you give Func2 two arguments? If we assume the assignment is valid, then you can work on it in small pieces, like the piece I quoted. It says Func2 needs 1 argument, so trust that and write Func2 with one argument, and you have one piece of the assignment done (if it turns out the assignment is invalid or tries to trick you, you will need to come back to it, of course, but the above is pretty clear).
But to help you, you have working code, right?
.global Func
Func: save %sp,-800, %sp
add %i0, -45 , %l0
mov %l0, %i0
ret
restore
And for Func2, you need to change that code so it multiplies by two, instead of adding -45? Have you tried changing the add instruction to:
smul %i0, 2 , %l0
(or umul, but in your C code you specify int and not unsigned int, so I presume it is signed...).
I'm not going to write your Func1 for you, but you see how you get your inputs, which I assume is right. Then you need to produce result in %i0 before returning. Work in small steps: first make Func1 which returns just %i0 + %i1 without calling Func2 at all. Then try 2 * %i0 + %i1, calling Func2 once. Then finally write requested version of 2 * %i0 + 2 * %i1 calling Func2 twice (or for less and simpler code, extract the common factor so you still need to call Func2 just once).
To pass the value back to Func1, Func2 shouldn't call Func1 again. Have Func2 return a value to Func1. On SPARC, a function that uses save/restore leaves its return value in %i0, which the caller sees as %o0 after the restore.
main calls Func1 with the value
Func1 reads the argument from %i0
Func1 calls Func2 with the argument in %o0 (Func2 sees it as %i0 after its save)
Func2 doubles the argument and saves the result in %l0
Func2 moves the value back to %i0 and returns
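The intended control flow (call, return a value, use it) may be easier to see in a C model first. This is only a reference for what the assembly is supposed to compute, not the assembly itself; the overflow check is my addition.

```c
#include <assert.h>
#include <limits.h>

/* Func2(Q): doubles its single argument and *returns* the result to its caller. */
static int Func2(int q)
{
    assert(q >= INT_MIN / 2 && q <= INT_MAX / 2);  /* doubling must not overflow */
    return 2 * q;
}

/* Func1(A, B): calls Func2 for each argument and adds the returned values. */
static int Func1(int a, int b)
{
    return Func2(a) + Func2(b);   /* 2*A + 2*B */
}
```

With the values from the question, Func1(20, 4) yields 48. Note that Func2 never calls Func1; it simply returns, which in the SPARC version corresponds to ret followed by restore.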