Linkage error when compiling C file and Assembly file - c

I have a .s file (x86 assembly, AT&T syntax), a .h header file with a struct definition and declarations of functions that are implemented in the assembly file, and a main.c file that calls a function from the .s file.
When I try to compile it all together, I get the following error:
main.o: In function `main':
/home/user/workspace/Assembly/main.c:7: undefined reference to `pstrlen'
collect2: error: ld returned 1 exit status
make: *** [a.out] Error 1
pstring.h:
typedef struct {
char len;
char str[255];
} Pstring;
char pstrlen(Pstring* pstr);
main.c:
#include <stdio.h>
#include "pstring.h"
int main() {
Pstring a;
a.len = 4;
printf("Length: %d", pstrlen(&a));
return 0;
}
pstring.s:
.file "pstring.s"
.section .rodata
invalid_input: .string "invalid input!\n"
.text
.type pstrlen, @function
pstrlen:
pushl %ebp
movl %esp, %ebp # set up the frame pointer (AT&T order: source, destination)
movl 8(%ebp), %eax # load the Pstring pointer argument into eax
movzbl (%eax), %ecx # zero-extend the first byte (the length) into ecx
movl %ecx, %eax # return the length in eax
popl %ebp # restore the caller's frame pointer
ret
.type pstrcpy, @function
makefile:
a.out: main.o pstring.o
gcc -m32 -g -o a.out main.o pstring.o
main.o: main.c pstring.h
gcc -m32 -g -c -o main.o main.c
pstring.o: pstring.s
gcc -m32 -g -c -o pstring.o pstring.s
clean:
rm -f *.o a.out
Thank you.

I resolved the problem by declaring pstrlen as global as follows:
.text
.globl pstrlen
.type pstrlen, @function
pstrlen:
pushl %ebp
movl %esp, %ebp # set up the frame pointer (AT&T order: source, destination)
movl 8(%ebp), %eax # load the Pstring pointer argument into eax
movzbl (%eax), %ecx # zero-extend the first byte (the length) into ecx
movl %ecx, %eax # return the length in eax
popl %ebp # restore the caller's frame pointer
ret

Related

How can I get this .c file to read a .s file and run in VS code?

I have two files, one being main.c and the other prog2.s that contains assembly code. The commands I am running are:
gcc -Wall -g -m32 -c main.c
gcc -Wall -g -m32 -c prog2.s
gcc -Wall -g -m32 -o xtest main.o prog2.o
but on that last command, I am getting an error "undefined reference to 'prog2'" twice because in main.c, I try to call prog2 twice. I tried running the main.c file without the terminal but it produced the same error.
I'm not sure code is needed but so you get an idea of what I am trying to do, here are two code samples from each of the files up until the end of the first task which is to return j-i+2.
main.c
#include <stdio.h>
#include <assert.h>
int prog2(int i, int j, int *k, int a[5], int *l);
int main() {
int k = 6;
int l = 0, res;
int a[5] = {7, 0, 8, 0, 3};
res = prog2(6,9,&k,a,&l);
if(res != 9-6+2) {
printf("return value should be=%d; got=%d\n", 9-6+2, res);
assert(0);
}
prog2.s
.globl prog2
prog2:
#Setup Code
pushl %ebp
movl %esp, %ebp
pushl %ebx
# j - i + 2
movl 12(%ebp), %eax
movl 8(%ebp), %ecx
subl %ecx, %eax
addl $2, %eax

C to ASM increment causes Segfault

I have the following C program:
#include <stdio.h>
int main() {
int i = 0;
int N = 10;
while(i < N) {
printf("counting to %d: %d", N, i);
//i = i + 1;
}
return 0;
}
I would like to compile this first to assembly, then to binary for instructional purposes. So, I issue the following commands:
$ gcc -S count.c -o count.s
$ as -o count.o count.s
$ ld -o count -e main -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/x86_64-linux-gnu/libc.so count.o -lc
These compile the C to assembly, assemble the assembly to binary, and then link the library containing the printf function, respectively.
This works. Output:
counting to 10: 0counting to 10: 0counting to 10: 0counting to 10: 0counting to 10: 0counting to 10: 0counting to 10: 0counting to 10: 0counting to 10: 0counting to 10: 0counting to 10: 0counting to 10: 0counting to 10: 0
etc until I ctrl-c the program.
However, when I uncomment the i = i + 1 line:
Segmentation fault (core dumped)
What is going wrong here?
UPDATE: Here is count.s (with the i = i + 1 line included)
.file "count.c"
.text
.section .rodata
.LC0:
.string "counting to %d: %d"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movl $0, -8(%rbp)
movl $10, -4(%rbp)
jmp .L2
.L3:
movl -8(%rbp), %edx
movl -4(%rbp), %eax
movl %eax, %esi
leaq .LC0(%rip), %rdi
movl $0, %eax
call printf@PLT
addl $1, -8(%rbp)
.L2:
movl -8(%rbp), %eax
cmpl -4(%rbp), %eax
jl .L3
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
.section .note.GNU-stack,"",@progbits
The below works perfectly fine for me on Ubuntu 20 (taken from Ciro Santilli's answer at Linking a C program directly with ld fails with undefined reference to `__libc_csu_fini`).
gcc -S count.c -o count.s
as -o count.o count.s
ld -o count -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o -L/usr/lib/gcc/x86_64-linux-gnu/4.8/ -lc count.o /usr/lib/x86_64-linux-gnu/crtn.o
If you are on 64-bit Linux, add this at the end of the main function (shown in Intel syntax):
mov eax, 60
xor edi, edi
syscall
On 32-bit Linux:
mov eax, 1
xor ebx, ebx
int 0x80

Can't run compiled file in Ubuntu

I have a problem I can't fix with a simple exercise my teacher assigned us.
I have this main.c that calls a simple assembly function, and I compile it with a makefile. When I run make run, I get the following error:
"make: execvp: ./main: invalid argument make:***
[makefile:12:run] Error 127"
This is my make file:
main: main.o asm.o
gcc main.o asm.o -o main
main.o: main.c asm.h
gcc -Wall -g -c main.c -o main.o
asm.o: asm.s
gcc -Wall -g -c asm.s -o asm.o
run: main
./main
clean:
rm *.o main
My main.c file:
#include <stdio.h>
#include "asm.h"
int op1 = 0, op2 = 0, res = 0;
int main()
{
printf("Valor op1:");
scanf("%d", &op1);
printf("Valor op2:");
scanf("%d", &op2);
sum();
printf("sum = %d:0x%x\n", res, res);
return 0;
}
My asm.s:
.section .data
.global op1
.global op2
.global res
.section .text
.global sum # void sum(void)
sum:
movl op1(%rip), %ecx #place op1 in ecx
movl op2(%rip), %eax #place op2 in eax
addl %ecx, %eax #add ecx to eax. Result is in eax
movl %eax, res(%rip) # copy the result to res
ret
my asm.h:
#ifndef ASM_H
#define ASM_H
void sum();
#endif

relocation R_X86_64_32 against `.data' can not be used when making a shared object;

I wrote the assembly code below, and it builds fine when I run as and ld directly.
as cpuid.s -o cpuid.o
ld cpuid.o -o cpuid
But when I use gcc for the whole procedure, I get the error below.
$ gcc cpuid.s -o cpuid
/tmp/cctNMsIU.o: In function `_start':
(.text+0x0): multiple definition of `_start'
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o:(.text+0x0): first defined here
/usr/bin/ld: /tmp/cctNMsIU.o: relocation R_X86_64_32 against `.data' can not be used when making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
/usr/bin/ld: final link failed: Invalid operation
collect2: error: ld returned 1 exit status
Then I changed _start to main and also added -fPIC to the gcc parameters, but that did not fix the ld error. The error message changed to the one below.
$ gcc cpuid.s -o cpuid
/usr/bin/ld: /tmp/ccYCG80T.o: relocation R_X86_64_32 against `.data' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
I don't understand what this means, because I am not making a shared object. I just want to build an executable binary.
.section .data
output:
.ascii "The processor Vendor ID is 'xxxxxxxxxxxx'\n"
.section .text
.global _start
_start:
movl $0, %eax
cpuid
movl $output, %edi
movl %ebx, 28(%edi)
movl %edx, 32(%edi)
movl %ecx, 36(%edi)
movl $4, %eax
movl $1, %ebx
movl $output, %ecx
movl $42, %edx
int $0x80
movl $1, %eax
movl $0, %ebx
int $0x80
If I modify the code above as shown below, is it correct, or does it have side effects for 64-bit asm programming?
.section .data
output:
.ascii "The processor Vendor ID is 'xxxxxxxxxxxx'\n"
.section .text
.global main
main:
movq $0, %rax
cpuid
lea output(%rip), %rdi
movl %ebx, 28(%rdi)
movl %edx, 32(%rdi)
movl %ecx, 36(%rdi)
movq %rdi, %r10
movq $1, %rax
movq $1, %rdi
movq %r10, %rsi
movq $42, %rdx
syscall
As comments have noted, you could work around this by linking your program as non-PIE, but it would be better to fix your asm to be position-independent. If it's 32-bit x86 code that's a bit ugly. This instruction:
movl $output, %edi
would become:
call 1f
1: pop %edi
add $output-1b, %edi
For 64-bit it's much cleaner. Instead of:
movq $output, %rdi
you'd write:
lea output(%rip), %rdi
With NASM I fixed this by putting the line "DEFAULT REL" in the source file (check nasmdoc.pdf p.76).

Problems with compiled gcc .s Code when linking

First time here. I'm running 64-bit Kali Linux, and I'm a Linux rookie and new to ASM as well. I pulled some C code, which works perfectly fine. Here is the code:
#include<stdio.h>
#include<string.h> //strlen
#include<sys/socket.h>
#include<arpa/inet.h> //inet_addr
int main(int argc , char *argv[])
{
int socket_desc;
struct sockaddr_in server;
char *message , server_reply[2000];
//Create socket
socket_desc = socket(AF_INET , SOCK_STREAM , 0);
if (socket_desc == -1)
{
printf("Could not create socket");
}
server.sin_addr.s_addr = inet_addr("127.0.0.1");
server.sin_family = AF_INET;
server.sin_port = htons( 2000 );
//Connect to remote server
if (connect(socket_desc , (struct sockaddr *)&server , sizeof(server)) <0)
{
puts("connect error");
return 1;
}
puts("Connected\n");
//Send some data
message = "Hola!!!!\n\r\n";
if( send(socket_desc , message , strlen(message) , 0) < 0)
{
puts("Send failed");
return 1;
}
puts("Data Send\n");
//Receive a reply from the server
if( recv(socket_desc, server_reply , 2000 , 0) < 0)
{
puts("recv failed");
}
puts("Reply received\n");
puts(server_reply);
return 0;
}
So I use gcc -S -o example.s example.c to get the ASM code, which is:
.file "test.c"
.section .rodata
.LC0:
.string "Could not create socket"
.LC1:
.string "127.0.0.1"
.LC2:
.string "connect error"
.LC3:
.string "Connected\n"
.align 8
.LC4:
.string "Hola!! , \n\r\n"
.LC5:
.string "Send failed"
.LC6:
.string "Data Send\n"
.LC7:
.string "recv failed"
.LC8:
.string "Reply received\n"
.text
.globl main
.type main, @function
main:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $2048, %rsp
movl %edi, -2036(%rbp)
movq %rsi, -2048(%rbp)
movl $0, %edx
movl $1, %esi
movl $2, %edi
call socket
movl %eax, -4(%rbp)
cmpl $-1, -4(%rbp)
jne .L2
movl $.LC0, %edi
movl $0, %eax
call printf
.L2:
movl $.LC1, %edi
call inet_addr
movl %eax, -28(%rbp)
movw $2, -32(%rbp)
movl $2000, %edi
call htons
movw %ax, -30(%rbp)
leaq -32(%rbp), %rcx
movl -4(%rbp), %eax
movl $16, %edx
movq %rcx, %rsi
movl %eax, %edi
call connect
testl %eax, %eax
jns .L3
movl $.LC2, %edi
call puts
movl $1, %eax
jmp .L7
.L3:
movl $.LC3, %edi
call puts
movq $.LC4, -16(%rbp)
movq -16(%rbp), %rax
movq %rax, %rdi
call strlen
movq %rax, %rdx
movq -16(%rbp), %rsi
movl -4(%rbp), %eax
movl $0, %ecx
movl %eax, %edi
call send
testq %rax, %rax
jns .L5
movl $.LC5, %edi
call puts
movl $1, %eax
jmp .L7
.L5:
movl $.LC6, %edi
call puts
leaq -2032(%rbp), %rsi
movl -4(%rbp), %eax
movl $0, %ecx
movl $2000, %edx
movl %eax, %edi
call recv
testq %rax, %rax
jns .L6
movl $.LC7, %edi
call puts
.L6:
movl $.LC8, %edi
call puts
leaq -2032(%rbp), %rax
movq %rax, %rdi
call puts
movl $0, %eax
.L7:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE2:
.size main, .-main
.ident "GCC: (Debian 4.9.2-10) 4.9.2"
So after using as example.s -o example.o, I use ld example.o -o example, and that's where I get the following errors:
ld: warning: cannot find entry symbol _start; defaulting to 00000000004000b0
test.o: In function `main':
test.c:(.text+0x28): undefined reference to `socket'
test.c:(.text+0x40): undefined reference to `printf'
test.c:(.text+0x4a): undefined reference to `inet_addr'
test.c:(.text+0x5d): undefined reference to `htons'
test.c:(.text+0x77): undefined reference to `connect'
test.c:(.text+0x85): undefined reference to `puts'
test.c:(.text+0x99): undefined reference to `puts'
test.c:(.text+0xad): undefined reference to `strlen'
test.c:(.text+0xc3): undefined reference to `send'
test.c:(.text+0xd2): undefined reference to `puts'
test.c:(.text+0xe3): undefined reference to `puts'
test.c:(.text+0xfe): undefined reference to `recv'
test.c:(.text+0x10d): undefined reference to `puts'
test.c:(.text+0x117): undefined reference to `puts'
test.c:(.text+0x126): undefined reference to `puts'
It seems to me that gcc is not correctly using _start, .globl main, etc., but to be honest I wouldn't know how to fix it. If this is correct, then why?
Any help will be appreciated.
Thank you.
The problem is that ld example.o -o example tries to link just example.o and nothing else. To get missing symbols you need to link much more (e.g. startup code, standard library, C runtime, etc). Try gcc -v example.c to see how the linker should be invoked.
The commands given in Harry's answer are the good ones:
gcc -Wall -O -fverbose-asm -S example.c
gcc -c example.s -o example.o
gcc example.o -o example
Basically, you should be aware that GCC links your code with:
startup code like crt0 (actually, that is several object files today)
the C standard library (libc.so) (which will do some system calls)
the libgcc providing a few low level, processor specific, functions (e.g. 64 bits arithmetic on 32 bits machine); it has a permissive but ad-hoc license.
and you often need some dynamic linker like ld-linux(8)
the kernel would provide vdso(7)
How all this is linked together is known by the gcc command, which will start some ld. Replace gcc with gcc -v in your compilation commands to understand what exactly is happening. If you want to issue your own ld command you should add the options providing what I have listed above. The errors you are getting are notably because of the lack of crt0 & libc
BTW on Linux most C standard libraries (e.g. GNU libc or musl-libc) are free software (and so is GCC), so you can study their source code.
Try also gcc -dumpspecs which describes what gcc knows about issuing various commands (notice that gcc is only a driving program; the real C compiler is some cc1). Read also the wikipage on GCC. Some slides and references on the documentation of GCC MELT gives a lot more information. See also this and the picture there.
I strongly recommend to also use gcc to assemble (some assembler code of yours) and to link stuff (because you don't want to handle all the gory details mentioned above, plus some other ones I did not mention).
Try this
gcc -Wall -O -fverbose-asm -S example.c
gcc -c example.s -o example.o
gcc example.o -o example
This is an important part:
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crt1.o
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/4.9/crtbegin.o
-lgcc
--as-needed -lgcc_s
--no-as-needed -lc -lgcc
--as-needed -lgcc_s
--no-as-needed /usr/lib/gcc/x86_64-linux-gnu/4.9/crtend.o
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crtn.o
crt1, crti and crtbegin supply the startup code, where the _start entry point is actually defined (control is later passed to your main), stdio is initialized, etc. Similarly, crtend and crtn handle the cleanup after main returns. -lc supplies the standard library (puts and the other missing symbols). -lgcc and -lgcc_s provide the gcc-specific runtime support.
The bottom line is that you need all of that to be linked in.
