Combining C and assembler code

This is my C code:
#include <stdio.h>
void sum();
int newAlphabet;
int main(void)
{
sum();
printf("%d\n",newAlphabet);
}
And this is my assembler code:
.globl _sum
_sum:
movq $1, %rax
movq %rax, _newAlphabet
ret
I'm trying to call the sum function, from my main function, to set newAlphabet equal to 1, but when I compile it (gcc -o test assembler.c assembler.s, compiled on a 64-bit OSX laptop) I get the following errors:
32-bit absolute addressing is not supported for x86-64
cannot do signed 4 byte relocation
both caused by the line "movq %rax, _newAlphabet"
I'm sure I'm making a very basic mistake. Can anyone help? Thanks in advance.
EDIT:
Here are the relevant portions of the C code once it has been translated to assembler:
.comm _newAlphabet,4,2
...
movq _newAlphabet@GOTPCREL(%rip), %rax

Mac OS X uses position-independent executables by default, which means your code can't use constant global addresses for variables. Instead you'll need to access globals in an IP-relative way. Just change:
movq %rax, _newAlphabet
to:
mov %eax, _newAlphabet(%rip)
and you'll be set. (I changed from 64-bit to 32-bit registers to match sizeof(int) on Mac OS X.) Note that you also need a .globl _newAlphabet in there somewhere. Here's an example I just made based on your code (note that I initialized newAlphabet to prove it works):
example.c:
#include <stdio.h>
void sum(void);
int newAlphabet = 2;
int main(void)
{
    printf("%d\n", newAlphabet);
    sum();
    printf("%d\n", newAlphabet);
    return 0;
}
assembly.s:
.globl _sum
.globl _newAlphabet
_sum:
    movl $1, _newAlphabet(%rip)
    ret
Build & run:
$ cc -c -o example.o example.c
$ cc -c -o assembly.o assembly.s
$ cc -o example example.o assembly.o
$ ./example
2
1
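If the goal is only to store to a C global from a hand-written instruction, one alternative is to let the compiler pick the addressing mode. The following is a minimal sketch under stated assumptions, not the method used in the answer above: set_new_alphabet is a hypothetical name, and the inline-assembly branch assumes GCC or Clang targeting x86; any other compiler or target takes the plain C fallback.

```c
int newAlphabet = 2;

/* Hypothetical helper: stores 1 into newAlphabet. */
void set_new_alphabet(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* "=m" requests a memory operand. The compiler substitutes an
       addressing mode that is valid for the current code model
       (for example RIP-relative, or through a register loaded from
       the GOT), so no absolute 32-bit address is hard-coded here. */
    __asm__ ("movl $1, %0" : "=m" (newAlphabet));
#else
    newAlphabet = 1;  /* portable fallback for other compilers/targets */
#endif
}
```

Calling set_new_alphabet() between the two printf calls in example.c should behave like the call to sum() there.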

Related

How to run a mixed language program

I want to write a mixed-language program where part of the code is written in C and part in assembly. I was given sample code, so I know what my work should look like.
.globl _addArrayinA
_addArrayinA:
pushl %ebp
movl %esp,%ebp
subl $8,%esp
movl 8(%ebp), %ebx
xorl %esi,%esi
xor %eax,%eax
bak:
addl (%ebx),%eax
addl $4,%ebx
incl %esi
cmpl $10, %esi
jne bak
movl %ebp, %esp
popl %ebp
ret
# Return value is in %eax
Above is the assembly part.
int addArrayinC(int *myArray, int num)
{
int c;
int i;
c = 0;
for (i=0; i<num; i++)
{c += *myArray;
myArray++;
}
return (c);
}
This is the second function written in C.
And below is the main file, which is supposed to use two functions above.
#include <stdio.h>
#include <stdlib.h>
extern int addArrayinC(int *numbers,int count);
extern int addArrayinA(int *numbers, int count);
int main(void) {
int mynumbers[10]={1,2,3,4,5,6,7,8,9,0};
int sum;
sum = addArrayinC(mynumbers, 10);
printf("\nThe sum of array computed in C is : %d ",sum);
sum = addArrayinA(mynumbers, 10);
printf("\nThe sum of array computed in assembly is : %d ",sum);
return EXIT_SUCCESS;
}
I tried to open these three files in Code::Blocks, but could not get them to run. I have no idea how to run a mixed-language program. Generally, I use Cloud9 to compile code. How can I run code like this?
No problem here. Note that the assembler source file must have the extension .s or .S (upper case if you want the file to be preprocessed, e.g. for #define).
fun.c
unsigned int fun ( unsigned int x )
{
return(x+1);
}
build and examine
gcc -c -O2 fun.c -o fun.o
objdump -D fun.o
producing
0000000000000000 <fun>:
0: 8d 47 01 lea 0x1(%rdi),%eax
3: c3 retq
So we can make fun.s
.globl fun
fun:
lea 0x1(%rdi),%eax
retq
as fun.s -o fun.o
objdump -D fun.o
0000000000000000 <fun>:
0: 8d 47 01 lea 0x1(%rdi),%eax
3: c3 retq
C code so.c
#include <stdio.h>
unsigned int fun ( unsigned int x );
int main ( void )
{
printf("%u\n",fun(1));
printf("%u\n",fun(2));
printf("%u\n",fun(3));
return(0);
}
gcc lets you feed it assembly language
gcc so.c fun.s -o so
./so
2
3
4
as well as objects
gcc so.c fun.o -o so
./so
2
3
4
so you don't have to invoke the linker directly.
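The same compile-and-inspect approach can be applied to the array-sum question: a plain C version can be compiled with gcc -S (or disassembled with objdump as above) and used as a reference for the hand-written routine. This is only an illustrative sketch; add_array_reference is a hypothetical name, and unlike the sample assembly, which always loops 10 times, it honours the count argument.

```c
/* Hypothetical reference implementation: sums `count` ints
   starting at `numbers`. Returns 0 when count is 0 or negative. */
int add_array_reference(const int *numbers, int count)
{
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += numbers[i];
    }
    return total;
}
```

Comparing the compiler's output for this function with the hand-written loop is one way to check register usage and the calling convention before linking the two together.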

Mixing C and Assembly. `Hello World` on 64-bit Linux

Based on this tutorial, I am trying to write Hello World to the console on 64 bit Linux. Compilation raises no errors, but I get no text on console either. I don't know what is wrong.
write.s:
.data
SYSREAD = 0
SYSWRITE = 1
SYSEXIT = 60
STDOUT = 1
STDIN = 0
EXIT_SUCCESS = 0
message: .ascii "Hello, world!\n"
message_len = .-message
.text
.globl _write
_write:
pushq %rbp
movq %rsp, %rbp
movq $SYSWRITE, %rax
movq $STDOUT, %rdi
movq $message, %rsi
movq $message_len, %rdx
syscall
popq %rbp
ret
main.c:
extern void write(void);
int main (int argc, char **argv)
{
write();
return 0;
}
Compiling:
as write.s -o write.o
gcc main.c -c -o main.o
gcc main.o write.o -o program
./program
Okay, so my code had two mistakes:
1) I named my assembly function 'write', which is already a common C library name, so I needed to rename it.
2) The function name shouldn't have a leading underscore.
Proper code:
writehello.s
.data
SYSREAD = 0
SYSWRITE = 1
SYSEXIT = 60
STDOUT = 1
STDIN = 0
EXIT_SUCCESS = 0
message: .ascii "Hello, world!\n"
message_len = .-message
.text
#.global main
#main:
#call write
#movq $SYSEXIT, %rax
#movq $EXIT_SUCCESS, %rdi
#syscall
#********
.global writehello
writehello:
pushq %rbp
movq %rsp, %rbp
movq $SYSWRITE, %rax
movq $STDOUT, %rdi
movq $message, %rsi
movq $message_len, %rdx
syscall
popq %rbp
ret
main.c
extern void writehello(void);
int main (int argc, char **argv)
{
writehello();
return 0;
}
Compilation stays as is :) Thanks to everyone that helped!
The tutorial you're reading is not quite right. There have been two differing conventions for global symbols in ELF (Executable and Linkable Format) executables. One convention says that all global C symbols should be prefixed with _; the other does not prefix C symbols. On GNU/Linux, and in the x86-64 ABI in particular, global symbols are not prefixed with _. However, the tutorial that you linked might be right for some other compiler for Linux/ELF that didn't use the GNU libc.
Now, what happens in your original code is that your assembler function would be visible as _write in C code, not write. Instead, the write symbol is found in the libc (the wrapper for write(2) system call):
ssize_t write(int fd, const void *buf, size_t count);
Now you declared this write as a function void write(void);, which leads to undefined behaviour as such when you call it. You can use strace ./program to find out what system calls it makes:
% strace ./program
...
write(1, "\246^P\313\374\177\0\0\0\0\0\0\0\0"..., 140723719521144) = -1 EFAULT (Bad address)
...
So it called the write system call not with your intended arguments, but with whatever garbage was in the registers passed to the glibc write wrapper. (Actually the "garbage" is known here: the first argument is argc, the second is the value of argv, and the third is the value of char **environ.) Because the kernel noticed that a buffer starting at (void*)argv and 140723719521144 bytes long wasn't completely contained within the mapped address space, it returned EFAULT from that system call. Result: no crash, no message.
write is not a reserved word as such in C. It is a function, and possibly a macro, in POSIX. You could override it, but linking order matters: if your program defines write, other code would be linked against this definition instead of the one found in glibc. This would mean that other code calling write would end up calling your incompatible function instead.
Thus the solution is to not use a name that is a function in the GNU libc or in any other libraries that you've linked against. Thus in assembler you can use:
.global writehello
writehello:
and then
extern void writehello(void);
as you yourself have found out.
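For comparison, here is a minimal sketch of what a correct call to the libc write wrapper looks like from C, using the prototype quoted above. It assumes a POSIX system; write_hello is a hypothetical name chosen so that it does not collide with anything in libc.

```c
#include <unistd.h>   /* write, STDOUT_FILENO, ssize_t */

/* Hypothetical C counterpart of the assembly writehello routine.
   Returns the number of bytes written, or -1 on error. */
ssize_t write_hello(void)
{
    static const char message[] = "Hello, world!\n";
    /* sizeof counts the terminating NUL byte, so subtract 1. */
    return write(STDOUT_FILENO, message, sizeof message - 1);
}
```

With the real prototype visible, the compiler passes the file descriptor, buffer and length in the expected registers, which is exactly what the mismatched void write(void) declaration failed to do.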

Problems compiling assembly file - Error: undefined reference to `function name'

I am trying to take a look at a test program my professor gave us, but I am having trouble compiling it. I am on Ubuntu 14.04. I am compiling it with
gcc -Wall test.c AssemblyFunction.S -m32 -o test
I was having problems running the code on a 64-bit machine and read that adding -Wall and -m32 will allow it to work. Doing that fixed the first problem I had, but now I am getting the error: undefined reference to `addnumbersinAssembly'.
Here is the C file
#include <stdio.h>
#include <stdlib.h>
extern int addnumbersinAssembly(int, int);
int main(void)
{
int a, b;
int res;
a = 5;
b = 6;
// Call the assembly function to add the numbers
res = addnumbersinAssembly(a,b);
printf("\nThe sum as computed in assembly is : %d", res);
return(0);
}
And here is the assembly file
.global _addnumbersinAssembly
_addnumbersinAssembly:
pushl %ebp
movl %esp,%ebp
movl 8(%ebp), %eax
addl 12(%ebp), %eax # Add the args
movl %ebp,%esp
popl %ebp
ret
Thank you for your time. I have been trying to figure this out for hours, so I appreciate any help.
I believe that with GCC you are going to want to remove the _ in your assembler file. So these lines:
.global _addnumbersinAssembly
_addnumbersinAssembly:
Should be:
.global addnumbersinAssembly
addnumbersinAssembly:
More information on this issue can be found in this StackOverflow question/answer.
The -m32 option is needed because this assembly is 32-bit code: it uses 32-bit stack operations (pushl/popl on %ebp) and would have to be rewritten to work as 64-bit code. -Wall isn't needed to compile, but it does turn on many more warnings.
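If the assembly file cannot be edited, a possible alternative is to tell the C compiler which symbol name to use. The sketch below relies on the GNU C asm-label extension (supported by GCC and Clang), so it is an assumption that your compiler accepts it; ASM_NAME is a hypothetical helper macro, and the C definition is only a stand-in so that the example links without the .S file.

```c
/* ASM_NAME is a hypothetical helper macro for this sketch. */
#if defined(__GNUC__)
#define ASM_NAME(name) __asm__(name)
#else
#define ASM_NAME(name)  /* no asm labels: symbol name left unchanged */
#endif

/* Declaration with an asm label: C code calls addnumbersinAssembly,
   but the linker-level symbol is _addnumbersinAssembly. */
int addnumbersinAssembly(int a, int b) ASM_NAME("_addnumbersinAssembly");

/* Stand-in definition so the sketch is self-contained. With the real
   assembly file linked in, this definition would be removed. */
int addnumbersinAssembly(int a, int b)
{
    return a + b;
}
```

Renaming the label in the assembly file, as suggested above, remains the simpler fix when you control that file.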

thread local storage in assembly

I want to increment a TLS variable in assembly, but it gives a segmentation fault in the assembly code. I don't want to let the compiler change any other register or memory. Is there a way to do this without using gcc input and output syntax?
__thread unsigned val;
int main() {
val = 0;
asm("incl %gs:val");
return 0;
}
If you really really need to be able to do this for some reason, you should access a thread-local variable from assembly language by preloading its address in C, like this:
__thread unsigned val;
void incval(void)
{
unsigned *vp = &val;
asm ("incl\t%0" : "+m" (*vp));
}
This is because the code sequence required to access a thread-local variable is different for just about every OS and CPU combination supported by GCC, and also varies if you're compiling for a shared library rather than an executable (i.e. with -fPIC). The above construct allows the compiler to emit the correct code sequence for you. In cases where it is possible to access the thread-local variable without any extra instructions, the address generation will be folded into the assembly operation. By way of illustration, here is how gcc 4.7 for x86/Linux compiles the above in several different possible modes (I've stripped out a bunch of assembler directives in all cases, for clarity)...
# -S -O2 -m32 -fomit-frame-pointer
incval:
incl %gs:val@ntpoff
ret
# -S -O2 -m64
incval:
incl %fs:val@tpoff
ret
# -S -O2 -m32 -fomit-frame-pointer -fpic
incval:
pushl %ebx
call __x86.get_pc_thunk.bx
addl $_GLOBAL_OFFSET_TABLE_, %ebx
leal val@tlsgd(,%ebx,1), %eax
call ___tls_get_addr@PLT
incl (%eax)
popl %ebx
ret
# -S -O2 -m64 -fpic
incval:
.byte 0x66
leaq val@tlsgd(%rip), %rdi
.value 0x6666
rex64
call __tls_get_addr@PLT
incl (%rax)
ret
Do realize that all four examples would be different if I'd compiled for x86/OSX, and different yet again for x86/Windows.
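As a variation on the construct above, the thread-local variable itself can be given as the memory operand, again leaving the access sequence to the compiler. This is a sketch under assumptions: the inline-assembly branch is only taken for GCC or Clang on x86, other targets fall back to plain C, and THREAD_LOCAL and getval are hypothetical names added for illustration.

```c
/* THREAD_LOCAL is a hypothetical macro: GNU spelling where
   available, otherwise the C11 keyword. */
#if defined(__GNUC__)
#define THREAD_LOCAL __thread
#else
#define THREAD_LOCAL _Thread_local
#endif

THREAD_LOCAL unsigned val;

void incval(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* "+m" marks val as a read-write memory operand; the compiler
       emits whatever TLS access sequence the target requires. */
    __asm__ ("incl %0" : "+m" (val));
#else
    val++;  /* portable fallback */
#endif
}

/* Hypothetical accessor for the calling thread's copy of val. */
unsigned getval(void)
{
    return val;
}
```

Each thread that calls incval() increments only its own copy of val.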

Unable to printf floating point numbers from executable shared library

I'm developing a shared library which can be executed independently to print its own version number.
I've defined a custom entry point as:
const char my_interp[] __attribute__((section(".interp"))) = "/lib64/ld-linux-x86-64.so.2";
void my_main() {
printf("VERSION: %d\n", 0);
_exit(0);
}
and I compile with
gcc -o list.os -c -g -Wall -fPIC list.c
gcc -o liblist.so -g -Wl,-e,my_main -shared list.os -lc
This code compiles and runs perfectly.
My issue is when I change the parameter of the printf to be a float or double (%f or %lf). The library will then compile but segfault when run.
Anyone have any ideas?
edit1:
Here is the code that segfaults:
const char my_interp[] __attribute__((section(".interp"))) = "/lib64/ld-linux-x86-64.so.2";
void my_main() {
printf("VERSION: %f\n", 0.1f);
_exit(0);
}
edit2:
Additional environmental details:
uname -a
Linux mjolnir.site 3.1.10-1.16-desktop #1 SMP PREEMPT Wed Jun 27 05:21:40 UTC 2012 (d016078) x86_64 x86_64 x86_64 GNU/Linux
gcc --version
gcc (SUSE Linux) 4.6.2
/lib64/libc.so.6
Configured for x86_64-suse-linux.
Compiled by GNU CC version 4.6.2.
Compiled on a Linux 3.1.0 system on 2012-03-30.
edit 3:
Output in /var/log/messages upon segfault:
Aug 11 08:27:45 mjolnir kernel: [10560.068741] liblist.so[11222] general protection ip:7fc2b3cb2314 sp:7fff4f5c7de8 error:0 in libc-2.14.1.so[7fc2b3c63000+187000]
Figured it out. :)
The floating-point operations on x86_64 use the xmm vector registers. Access to these must be aligned on 16-byte boundaries. This explains why 32-bit platforms were unaffected and why integer and character printing worked.
I've compiled my code to assembly with:
gcc -W list.c -o list.S -shared -Wl,-e,my_main -S -fPIC
then altered the "my_main" function to have more stack space.
Before:
my_main:
.LFB6:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $.LC0, %eax
movsd .LC1(%rip), %xmm0
movq %rax, %rdi
movl $1, %eax
call printf
movl $0, %edi
call _exit
.cfi_endproc
After:
my_main:
.LFB6:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
subq $8, %rsp # ADDED THIS LINE
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $.LC0, %eax
movsd .LC1(%rip), %xmm0
movq %rax, %rdi
movl $1, %eax
call printf
movl $0, %edi
call _exit
.cfi_endproc
Then I compiled this .S file by:
gcc list.S -o liblist.so -Wl,-e,my_main -shared
This fixes the issue, but I will forward this thread to the GCC and GLIBC mailing lists, as it looks like a bug.
edit1:
According to noshadow in gcc irc, this is a non standard way to do this. He said if one is to use gcc -e option, either initialize the C runtime manually, or don't use libc functions. Makes sense.
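A C-level alternative to hand-editing the generated assembly might be GCC's force_align_arg_pointer function attribute, which makes the compiler realign the stack in the function prologue on x86 targets. The sketch below only illustrates the attribute on an ordinary function; REALIGN_STACK and format_version are hypothetical names, and this addresses only the alignment symptom, not the caveat above about using libc without an initialized C runtime.

```c
#include <stdio.h>   /* snprintf, size_t */

/* REALIGN_STACK is a hypothetical macro: it expands to the x86-only
   attribute on GCC/Clang and to nothing elsewhere. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* Formats the version line into buf. The attribute asks the compiler
   to realign the stack on entry, so the floating-point call below does
   not depend on how the caller left the stack pointer.
   Returns the length snprintf would have written. */
REALIGN_STACK
int format_version(char *buf, size_t size, double version)
{
    return snprintf(buf, size, "VERSION: %f\n", version);
}
```

Applied to an entry point such as my_main, the attribute would be placed on that function's definition in the same way; whether that is sufficient for a custom entry point in a shared library is an assumption that has not been verified here.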
