Problems with compiled gcc .s Code when linking

Problems with compiled gcc .s Code when linking - c

First time here, Im running Kali linux 64bits ,Im a linux rookie and a new to ASM aswell.... So I pulled a code in C ,the wich works perfectly fine..... here is the code:
#include<stdio.h>
#include<string.h> //strlen
#include<sys/socket.h>
#include<arpa/inet.h> //inet_addr
int main(int argc , char *argv[])
{
int socket_desc;
struct sockaddr_in server;
char *message , server_reply[2000];
//Create socket
socket_desc = socket(AF_INET , SOCK_STREAM , 0);
if (socket_desc == -1)
{
printf("Could not create socket");
}
server.sin_addr.s_addr = inet_addr("127.0.0.1");
server.sin_family = AF_INET;
server.sin_port = htons( 2000 );
//Connect to remote server
if (connect(socket_desc , (struct sockaddr *)&server , sizeof(server)) <0)
{
puts("connect error");
return 1;
}
puts("Connected\n");
//Send some data
message = "Hola!!!!\n\r\n";
if( send(socket_desc , message , strlen(message) , 0) < 0)
{
puts("Send failed");
return 1;
}
puts("Data Send\n");
//Receive a reply from the server
if( recv(socket_desc, server_reply , 2000 , 0) < 0)
{
puts("recv failed");
}
puts("Reply received\n");
puts(server_reply);
return 0;
}
So ... I use gcc -S -o example.s example.c , to get the ASM code... wich is:
.file "test.c"
.section .rodata
.LC0:
.string "Could not create socket"
.LC1:
.string "127.0.0.1"
.LC2:
.string "connect error"
.LC3:
.string "Connected\n"
.align 8
.LC4:
.string "Hola!! , \n\r\n"
.LC5:
.string "Send failed"
.LC6:
.string "Data Send\n"
.LC7:
.string "recv failed"
.LC8:
.string "Reply received\n"
.text
.globl main
.type main, #function
main:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $2048, %rsp
movl %edi, -2036(%rbp)
movq %rsi, -2048(%rbp)
movl $0, %edx
movl $1, %esi
movl $2, %edi
call socket
movl %eax, -4(%rbp)
cmpl $-1, -4(%rbp)
jne .L2
movl $.LC0, %edi
movl $0, %eax
call printf
.L2:
movl $.LC1, %edi
call inet_addr
movl %eax, -28(%rbp)
movw $2, -32(%rbp)
movl $2000, %edi
call htons
movw %ax, -30(%rbp)
leaq -32(%rbp), %rcx
movl -4(%rbp), %eax
movl $16, %edx
movq %rcx, %rsi
movl %eax, %edi
call connect
testl %eax, %eax
jns .L3
movl $.LC2, %edi
call puts
movl $1, %eax
jmp .L7
.L3:
movl $.LC3, %edi
call puts
movq $.LC4, -16(%rbp)
movq -16(%rbp), %rax
movq %rax, %rdi
call strlen
movq %rax, %rdx
movq -16(%rbp), %rsi
movl -4(%rbp), %eax
movl $0, %ecx
movl %eax, %edi
call send
testq %rax, %rax
jns .L5
movl $.LC5, %edi
call puts
movl $1, %eax
jmp .L7
.L5:
movl $.LC6, %edi
call puts
leaq -2032(%rbp), %rsi
movl -4(%rbp), %eax
movl $0, %ecx
movl $2000, %edx
movl %eax, %edi
call recv
testq %rax, %rax
jns .L6
movl $.LC7, %edi
call puts
.L6:
movl $.LC8, %edi
call puts
leaq -2032(%rbp), %rax
movq %rax, %rdi
call puts
movl $0, %eax
.L7:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE2:
.size main, .-main
.ident "GCC: (Debian 4.9.2-10) 4.9.2"
So after using as example.s -o example.o, I use ld example.o -o example, and thats where I get these following errors:
ld: warning: cannot find entry symbol _start; defaulting to 00000000004000b0
test.o: In function main':
test.c:(.text+0x28): undefined reference tosocket'
test.c:(.text+0x40): undefined reference to printf'
test.c:(.text+0x4a): undefined reference toinet_addr'
test.c:(.text+0x5d): undefined reference to htons'
test.c:(.text+0x77): undefined reference toconnect'
test.c:(.text+0x85): undefined reference to puts'
test.c:(.text+0x99): undefined reference toputs'
test.c:(.text+0xad): undefined reference to strlen'
test.c:(.text+0xc3): undefined reference tosend'
test.c:(.text+0xd2): undefined reference to puts'
test.c:(.text+0xe3): undefined reference toputs'
test.c:(.text+0xfe): undefined reference to recv'
test.c:(.text+0x10d): undefined reference toputs'
test.c:(.text+0x117): undefined reference to puts'
test.c:(.text+0x126): undefined reference toputs'
it seems to me that gcc is not usingn correctly .start, global main, etc. but to be honest I wouldnt know how to fix it., if this is correct then why?
Any help Will be appreciate.
Thank you.

The problem is that ld example.o -o example tries to link just example.o and nothing else. To get missing symbols you need to link much more (e.g. startup code, standard library, C runtime, etc). Try gcc -v example.c to see how the linker should be invoked.

The commands given in Harry's answer are the good ones:
gcc -Wall -O -fverbose-asm -S example.c
gcc -c example.s -o example.o
gcc example.o -o example
Basically, you should be aware that GCC would link your code with :
startup code like crt0 (actually, that is several object files today)
the C standard library (libc.so) (which will do some system calls)
the libgcc providing a few low level, processor specific, functions (e.g. 64 bits arithmetic on 32 bits machine); it has a permissive but ad-hoc license.
and you often need some dynamic linker like ld-linux(8)
the kernel would provide vdso(7)
How all this is linked together is known by the gcc command, which will start some ld. Replace gcc with gcc -v in your compilation commands to understand what exactly is happening. If you want to issue your own ld command you should add the options providing what I have listed above. The errors you are getting are notably because of the lack of crt0 & libc
BTW on Linux most C standard libraries (e.g. GNU libc or musl-libc) are free software (and so is GCC), so you can study their source code.
Try also gcc -dumpspecs which describes what gcc knows about issuing various commands (notice that gcc is only a driving program; the real C compiler is some cc1). Read also the wikipage on GCC. Some slides and references on the documentation of GCC MELT gives a lot more information. See also this and the picture there.
I strongly recommend to also use gcc to assemble (some assembler code of yours) and to link stuff (because you don't want to handle all the gory details mentioned above, plus some other ones I did not mention).

Try this
gcc -Wall -O -fverbose-asm -S example.c
gcc -c example.s -o example.o
gcc example.o -o example

This is an important part:
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crt1.o
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/4.9/crtbegin.o
-lgcc
--as-needed -lgcc_s
--no-as-needed -lc -lgcc
--as-needed -lgcc_s
--no-as-needed /usr/lib/gcc/x86_64-linux-gnu/4.9/crtend.o
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crtn.o
crt1, crti, crtbegin supply the startup code where the _start entry point is actually defined (later on the control is passed to your main), stdio is initialized, etc. Similarly strand and crtn handle the cleanup after main return. lc supplies the standard library (like puts and other missing symbols). lgcc and lgcc_s have the gcc-specific runtime support.
The bottomline is, you need all that to be linked in.

Related

How is struct organized in assembly?

I am trying to figure out, how does compiler pad space between each struct members. In this example:
struct s{
int a,b,c;
};
struct s get(int a){
struct s foo = {.a=a,.b=a+1,.c=a+2};
return foo;
}
is compiled with cc -S a.c:
.file "a.c"
.text
.globl get
.type get, #function
get:
.LFB0:
pushq %rbp
movq %rsp, %rbp
movl %edi, -36(%rbp)
movl -36(%rbp), %eax
movl %eax, -24(%rbp)
movl -36(%rbp), %eax
addl $1, %eax
movl %eax, -20(%rbp)
movl -36(%rbp), %eax
addl $2, %eax
movl %eax, -16(%rbp)
movq -24(%rbp), %rax
movq %rax, -12(%rbp)
movl -16(%rbp), %eax
movl %eax, -4(%rbp)
movq -12(%rbp), %rax
movl -4(%rbp), %ecx
movq %rcx, %rdx
popq %rbp
ret
.LFE0:
.size get, .-get
.ident "GCC: (Debian 8.3.0-6) 8.3.0"
.section .note.GNU-stack,"",#progbits
No optimization is used. The question is why is there -36(%rbp) used as first member "reference", when they are arranged sequentially in
.a == -24(%rbp)
.b == -20(%rbp)
.c == -16(%rbp)
There is no need to make room with -36(%rbp) which compiler uses here. Is it intentionally (as a room or compiler uses the -36(%rbp) as a "reference" to the first member)?
Also, at the end,
movq -24(%rbp), %rax #take first member
movq %rax, -12(%rbp) #place it randomly
movl -16(%rbp), %eax #take third member
movl %eax, -4(%rbp) #place it randomly
Does not make sense, it is not sequential with the initial struct, and the first and third member of the struct are copied randomly in the space the function get had allocated.
What is the convention for structs?

The code you observe is a jumble of three different things: the actual layout of a struct s, the ABI specification of how to return structs from functions, and the anti-optimizations inserted by many compilers in their default mode (equivalent to -O0) to ensure that unsophisticated debuggers can find and change the values of variables while stopped at any breakpoint (see Why does clang produce inefficient asm with -O0 (for this simple floating point sum)? for more about this).
You can cut out the second of these factors by having get write into a struct s * argument, instead of returning a struct by value, and the third by compiling with gcc -O2 -S instead of just gcc -S. (Also try -Og and -O1; the complex optimizations applied at -O2 can be confusing, too.) For instance:
$ cat test.c
struct s {
int a,b,c;
};
void get(int a, struct s *s)
{
s->a = a;
s->b = a+1;
s->c = a+2;
}
$ gcc -O2 -S test.c
$ cat test.s
.file "test.c"
.text
.p2align 4
.globl get
.type get, #function
get:
.LFB0:
.cfi_startproc
leal 1(%rdi), %eax
movl %edi, (%rsi)
addl $2, %edi
movl %eax, 4(%rsi)
movl %edi, 8(%rsi)
ret
.cfi_endproc
.LFE0:
.size get, .-get
.ident "GCC: (Debian 9.3.0-13) 9.3.0"
.section .note.GNU-stack,"",#progbits
From this assembly language it should be clearer that a is at offset 0 within struct s, b is at offset 4, and c at offset 8.
Struct layout is specified by the "psABI" (processor-specific application binary interface) for each CPU architecture. You can read the psABI specs for x86 at https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI. These also explain how structs are returned from functions. It's also important to know that the layout of a stack frame is only partially specified by the psABI. Some of the "random" offsets in your assembly dump are, in fact, arbitrarily chosen by the compiler.

Unexpected behaviour in simple pointer arithmetics in kernel space C code [duplicate]

I am currently following this workbook on build an operating system.
My intention is to write a 64-bit kernel. I have got as far as loading the "kernel" code and writing individual characters to the frame buffer while in text mode.
My problem appears when I add a level of indirection to writing a single character to the frame buffer by wrapping the code in a function. It would appear that the char value passed into the function is being corrupted in some way.
I have three files:
bootloader.asm
; bootloader.asm
[org 0x7c00]
KERNEL_OFFSET equ 0x1000
mov bp, 0x9000
mov sp, bp
; load the kernel from boot disk
mov bx, KERNEL_OFFSET
mov dl, dl ; boot drive is set to dl
mov ah, 0x02 ; bios read sector
mov al, 15 ; read 15 sectors
mov ch, 0x00 ; cylinder 0
mov cl, 0x02 ; read from 2nd sector
mov dh, 0x00 ; select head 0
int 0x13
; THERE COULD BE ERRORS HERE BUT FOR NOW ASSUME IT WORKS
; switch to protected mode
cli
lgdt [gdt.descriptor]
mov eax, cr0
or eax, 1
mov cr0, eax
jmp CODE_SEGMENT:start_protected_mode
[bits 32]
start_protected_mode:
mov ax, DATA_SEGMENT
mov ds, ax
mov ss, ax
mov es, ax
mov fs, ax
mov gs, ax
mov ebp, 0x90000
mov esp, ebp
call KERNEL_OFFSET
jmp $
[bits 16]
gdt: ; Super Simple Global Descriptor Table
.start:
.null:
dd 0x0
dd 0x0
.code:
dw 0xffff
dw 0x0
db 0x0
db 10011010b
db 11001111b
db 0x0
.data:
dw 0xffff
dw 0x0
db 0x0
db 10010010b
db 11001111b
db 0x0
.end:
.descriptor:
dw .end - .start
dd .start
CODE_SEGMENT equ gdt.code - gdt.start
DATA_SEGMENT equ gdt.data - gdt.start
times 510-($-$$) db 0
dw 0xaa55
bootkernel.asm
[bits 32]
[extern main]
[global _start]
_start:
call main
jmp $
kernel.c
// LEGACY MODE VIDEO DRIVER
#define FRAME_BUFFER_ADDRESS 0xb8002
#define GREY_ON_BLACK 0x07
#define WHITE_ON_BLACK 0x0f
void write_memory(unsigned long address, unsigned int index, unsigned char value)
{
unsigned char * memory = (unsigned char *) address;
memory[index] = value;
}
unsigned int frame_buffer_offset(unsigned int col, unsigned int row)
{
return 2 * ((row * 80u) + col);
}
void write_frame_buffer_cell(unsigned char c, unsigned char a, unsigned int col, unsigned int row)
{
unsigned int offset = frame_buffer_offset(col, row);
write_memory(FRAME_BUFFER_ADDRESS, offset, c);
write_memory(FRAME_BUFFER_ADDRESS, offset + 1, a);
}
void main()
{
unsigned int offset = frame_buffer_offset(0, 1);
write_memory(FRAME_BUFFER_ADDRESS, offset, 'A');
write_memory(FRAME_BUFFER_ADDRESS, offset + 1, GREY_ON_BLACK);
write_frame_buffer_cell('B', GREY_ON_BLACK, 0, 1);
}
The .text section is linked to start from 0x1000 which is where the bootloader expects the kernel to start.
The linker.ld script is
SECTIONS
{
. = 0x1000;
.text : { *(.text) } /* Kernel is expected at 0x1000 */
}
The Make file that puts this all together is:
bootloader.bin: bootloader.asm
nasm -f bin bootloader.asm -o bootloader.bin
bootkernel.o: bootkernel.asm
nasm -f elf64 bootkernel.asm -o bootkernel.o
kernel.o: kernel.c
gcc-6 -Wextra -Wall -ffreestanding -c kernel.c -o kernel.o
kernel.bin: bootkernel.o kernel.o linker.ld
ld -o kernel.bin -T linker.ld bootkernel.o kernel.o --oformat binary
os-image: bootloader.bin kernel.bin
cat bootloader.bin kernel.bin > os-image
qemu: os-image
qemu-system-x86_64 -d guest_errors -fda os-image -boot a
I've taken a screen shot of the output that I am getting. I expect 'A' to appear in the 0th column of the 1st row and for 'B' to appear on the 1st column of the 0th row. For some reason I am getting another character.
Output of gcc-6 -S kernel.c
.file "kernel.c"
.text
.globl write_memory
.type write_memory, #function
write_memory:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movq %rdi, -24(%rbp)
movl %esi, -28(%rbp)
movl %edx, %eax
movb %al, -32(%rbp)
movq -24(%rbp), %rax
movq %rax, -8(%rbp)
movl -28(%rbp), %edx
movq -8(%rbp), %rax
addq %rax, %rdx
movzbl -32(%rbp), %eax
movb %al, (%rdx)
nop
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size write_memory, .-write_memory
.globl frame_buffer_offset
.type frame_buffer_offset, #function
frame_buffer_offset:
.LFB1:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movl %esi, -8(%rbp)
movl -8(%rbp), %edx
movl %edx, %eax
sall $2, %eax
addl %edx, %eax
sall $4, %eax
movl %eax, %edx
movl -4(%rbp), %eax
addl %edx, %eax
addl %eax, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size frame_buffer_offset, .-frame_buffer_offset
.globl write_frame_buffer_cell
.type write_frame_buffer_cell, #function
write_frame_buffer_cell:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movl %esi, %eax
movl %edx, -28(%rbp)
movl %ecx, -32(%rbp)
movb %dil, -20(%rbp)
movb %al, -24(%rbp)
movl -32(%rbp), %edx
movl -28(%rbp), %eax
movl %edx, %esi
movl %eax, %edi
call frame_buffer_offset
movl %eax, -4(%rbp)
movzbl -20(%rbp), %edx
movl -4(%rbp), %eax
movl %eax, %esi
movl $753666, %edi
call write_memory
movzbl -24(%rbp), %eax
movl -4(%rbp), %edx
leal 1(%rdx), %ecx
movl %eax, %edx
movl %ecx, %esi
movl $753666, %edi
call write_memory
nop
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE2:
.size write_frame_buffer_cell, .-write_frame_buffer_cell
.globl main
.type main, #function
main:
.LFB3:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movl $1, %esi
movl $0, %edi
call frame_buffer_offset
movl %eax, -4(%rbp)
movl -4(%rbp), %eax
movl $65, %edx
movl %eax, %esi
movl $753666, %edi
call write_memory
movl -4(%rbp), %eax
addl $1, %eax
movl $7, %edx
movl %eax, %esi
movl $753666, %edi
call write_memory
movl $0, %ecx
movl $1, %edx
movl $7, %esi
movl $66, %edi
call write_frame_buffer_cell
nop
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE3:
.size main, .-main
.ident "GCC: (Ubuntu 6.2.0-3ubuntu11~16.04) 6.2.0 20160901"
.section .note.GNU-stack,"",#progbits

I can reproduce your exact output if the code is modified to be:
unsigned int offset = frame_buffer_offset(0, 1);
write_memory(FRAME_BUFFER_ADDRESS, offset, 'A');
write_memory(FRAME_BUFFER_ADDRESS, offset + 1, GREY_ON_BLACK);
write_frame_buffer_cell('B', GREY_ON_BLACK, 1, 0);
The difference being in the last line ('B', GREY_ON_BLACK, 1, 0);. Originally you had ('B', GREY_ON_BLACK, 0, 1); . This is in line with what you described you were trying to do when you said:
I've taken a screen shot of the output that I am getting. I expect 'A' to appear in the 0th column of the 1st row and for 'B' to appear on the 1st column of the 0th row.
I gather you may have posted the wrong code in this question. This is the output I get:
It seems you are new to OS development. Your bootloader code only places the CPU into 32-bit protected mode, but to run a 64-bit kernel you need to be in 64-bit longmode. If you are just getting started I'd suggest falling back to writing a 32-bit kernel for purposes of learning at this early stage. At the bottom I have a 64-bit long mode section with a link to a longmode tutorial that could be used to modify your bootloader to run 64-bit code.
Primary Issue Causing Unusual Behaviour
You are experiencing an issue primarily related to the fact that you are generating 64-bit code with GCC but you are running it in 32-bit protected mode according to your bootloader code. 64-bit code generation running in 32-bit protected mode may appear to execute, but it will do it incorrectly. In simple OSes where you are simply displaying to the video display you may often see unexpected output as a side effect. Your program could triple fault the machine, but you got unlucky that the side effect seemed to display something on the video display. You may have been under the false impression that things were working as they should when they really weren't.
This question is somewhat similar to another Stackoverflow question. After the original poster of that question made available a complete example it became clear that it was his issue. Part of my answer to him to resolve the issue was as follows:
Likely Cause of Undefined Behavior
After all the code and the make file were made available in EDIT 2 it became clear that one significant problem was that most of the code was compiled and linked to 64-bit objects and executables. That code won't work in 32-bit protected mode.
In the make file make these adjustments:
When compiling with GCC you need to add -m32 option
When assembling with GNU Assembler (as) targeting 32-bit objects you need to use --32
When linking with LD you need to add the -melf_i386 option
When assembling with NASM targeting 32-bit objects you need to change -f elf64 to -f elf32
With that in mind you can alter your Makefile to generate 32-bit code. It could look like:
bootloader.bin: bootloader.asm
nasm -f bin bootloader.asm -o bootloader.bin
bootkernel.o: bootkernel.asm
nasm -f elf32 bootkernel.asm -o bootkernel.o
kernel.o: kernel.c
gcc-6 -m32 -Wextra -Wall -ffreestanding -c kernel.c -o kernel.o
kernel.bin: bootkernel.o kernel.o linker.ld
ld -melf_i386 -o kernel.bin -T linker.ld bootkernel.o kernel.o --oformat binary
os-image: bootloader.bin kernel.bin
cat bootloader.bin kernel.bin > os-image
qemu: os-image
qemu-system-x86_64 -d guest_errors -fda os-image -boot a
I gather when you started having issues with your code you ended up trying 0xb8002 as the address for your video memory. It should be 0xb8000. You'll need to modify:
#define FRAME_BUFFER_ADDRESS 0xb8002
To be:
#define FRAME_BUFFER_ADDRESS 0xb8000
Making all these changes should resolve your issues. This is what the output I got looked like after the changes mentioned above:
Other observations
In write_memory you use:
unsigned char * memory = (unsigned char *) address;
Since you are using 0xb8000 that is memory mapped to the video display you should mark it as volatile since a compiler could optimize things away not knowing that there is a side effect to writing to that memory (namely displaying characters on a display). You might wish to use:
volatile unsigned char * memory = (unsigned char *) address;
In your bootloader.asm You really should explicitly set the A20 line on. You can find information about doing that in this OSDev Wiki article. The status of the A20 line at the point a bootloader starts executing may vary between emulators. Failure to set it on could cause issues if you try to access memory areas on an odd numbered megabyte boundary (like 0x100000 to 0x1fffff, 0x300000 to 0x1fffff etc). Accesses to the odd numbered megabyte memory regions will actually read data from the even numbered memory region just below it. This is usually not behaviour you want.
64-bit long mode
If you want to run 64-bit code you will need to place the processor into 64-bit long mode. This is a bit more involved than entering 32-bit protected mode. Information on 64-bit longmode can be found in the OSDev wiki. Once properly in 64-bit longmode you can use 64-bit instructions generated by GCC.

relocation R_X86_64_32 against `.data' can not be used when making a shared object;

I write the below assembler code, and it can build pass by as and ld directly.
as cpuid.s -o cpuid.o
ld cpuid.o -o cpuid
But when I used gcc to do the whole procedure. I meet the below error.
$ gcc cpuid.s -o cpuid
/tmp/cctNMsIU.o: In function `_start':
(.text+0x0): multiple definition of `_start'
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o:(.text+0x0): first defined here
/usr/bin/ld: /tmp/cctNMsIU.o: relocation R_X86_64_32 against `.data' can not be used when making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
/usr/bin/ld: final link failed: Invalid operation
collect2: error: ld returned 1 exit status
Then I modify _start to main, and also add -fPIC to gcc parameter. But it doesn't fix my ld error. the error msg is changed to below.
$ gcc cpuid.s -o cpuid
/usr/bin/ld: /tmp/ccYCG80T.o: relocation R_X86_64_32 against `.data' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
I don't understand the meaning for that due to I don't make a shared object. I just want to make an executable binary.
.section .data
output:
.ascii "The processor Vendor ID is 'xxxxxxxxxxxx'\n"
.section .text
.global _start
_start:
movl $0, %eax
cpuid
movl $output, %edi
movl %ebx, 28(%edi)
movl %edx, 32(%edi)
movl %ecx, 36(%edi)
movl $4, %eax
movl $1, %ebx
movl $output, %ecx
movl $42, %edx
int $0x80
movl $1, %eax
movl $0, %ebx
int $0x80
If i modify the above code to below, whether it is correct or having some side effect on 64bit asm programming ?
.section .data
output:
.ascii "The processor Vendor ID is 'xxxxxxxxxxxx'\n"
.section .text
.global main
main:
movq $0, %rax
cpuid
lea output(%rip), %rdi
movl %ebx, 28(%rdi)
movl %edx, 32(%rdi)
movl %ecx, 36(%rdi)
movq %rdi, %r10
movq $1, %rax
movq $1, %rdi
movq %r10, %rsi
movq $42, %rdx
syscall

As comments have noted, you could work around this by linking your program as non-PIE, but it would be better to fix your asm to be position-independent. If it's 32-bit x86 code that's a bit ugly. This instruction:
movl $output, %edi
would become:
call 1f
1: pop %edi
add $output-1b, %edi
for 64-bit it's much cleaner. Instead of:
movq $output, %rdi
you'd write:
lea output(%rip), %rdi

With NASM I fixed this by putting the line "DEFAULT REL" in the source file (check nasmdoc.pdf p.76).

In x86, why do I have the same instruction two times, with reversed operands?

I am doing several experiments with x86 asm trying to see how common language constructs map into assembly. In my current experiment, I am trying to see specifically how C language pointers map to register-indirect addressing. I have written a fairly hello-world like pointer program:
#include <stdio.h>
int
main (void)
{
int value = 5;
int *int_val = &value;
printf ("The value we have is %d\n", *int_val);
return 0;
}
and compiled it to the following asm using: gcc -o pointer.s -fno-asynchronous-unwind-tables pointer.c:[1][2]
.file "pointer.c"
.section .rodata
.LC0:
.string "The value we have is %d\n"
.text
.globl main
.type main, #function
main:
;------- function prologue
pushq %rbp
movq %rsp, %rbp
;---------------------------------
subq $32, %rsp
movq %fs:40, %rax
movq %rax, -8(%rbp)
xorl %eax, %eax
;----------------------------------
movl $5, -20(%rbp) ; This is where the value 5 is stored in `value` (automatic allocation)
;----------------------------------
leaq -20(%rbp), %rax ;; (GUESS) If I have understood correctly, this is where the address of `value` is
;; extracted, and stored into %rax
;----------------------------------
movq %rax, -16(%rbp) ;;
movq -16(%rbp), %rax ;; Why do I have two times the same instructions, with reversed operands???
;----------------------------------
movl (%rax), %eax
movl %eax, %esi
movl $.LC0, %edi
movl $0, %eax
call printf
;----------------------------------
movl $0, %eax
movq -8(%rbp), %rdx
xorq %fs:40, %rdx
je .L3
call __stack_chk_fail
.L3:
leave
ret
.size main, .-main
.ident "GCC: (Ubuntu 4.9.1-16ubuntu6) 4.9.1"
.section .note.GNU-stack,"",#progbits
My issue is that I don't understand why it contains the instruction movq two times, with reversed operands. Could someone explain it to me?
[1]: I want to avoid having my asm code interspersed with cfi directives when I don't need them at all.
[2]: My environment is Ubuntu 14.10, gcc 4.9.1 (modified by ubuntu), and Gnu assembler (GNU Binutils for Ubuntu) 2.24.90.20141014, configured to target x86_64-linux-gnu

Maybe it will be clearer if you reorganize your blocks:
;----------------------------------
leaq -20(%rbp), %rax ; &value
movq %rax, -16(%rbp) ; int_val
;----------------------------------
movq -16(%rbp), %rax ; int_val
movl (%rax), %eax ; *int_val
movl %eax, %esi ; printf-argument
movl $.LC0, %edi ; printf-argument (format-string)
movl $0, %eax ; no floating-point numbers
call printf
;----------------------------------
The first block performs int *int_val = &value;, the second block performs printf .... Without optimization, the blocks are independent.

Since you're not doing any optimization, gcc creates very simple-minded code that does each statement in the program one at a time without looking at any other statement. So in your example, it stores a value into the variable int_val, and then the very next instruction reads that variable again as part of the next statement. In both cases, it is using %rax as the temporary to hold value, as that's the first register generally used for things.

Creating a directory in linux assembly language

I am trying to create a small assembly program to create a folder. I looked up the system call for creating a directory on this page. It says that it is identified by 27h. How would I go about implementing the mkdir somename in assembly?
I am aware that the program should move 27 into eax but I am unsure where to go next. I have googled quite a bit and no one seems to have posted anthing about this online.
This is my current code (I don't know in which register to put filename and so on):
section .data
section .text
global _start
mov eax, 27
mov ????????
....
int 80h
Thanks

One way of finding out, is using GCC to translate the following C code:
#include <stdio.h>
#include <sys/stat.h>
int main()
{
if (mkdir("testdir", 0777) != 0)
{
return -1;
}
return 0;
}
to assembly, with: gcc mkdir.c -S
.file "mkdir.c"
.section .rodata
.LC0:
.string "testdir"
.text
.globl main
.type main, #function
main:
.LFB0:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
andl $-16, %esp
subl $16, %esp
movl $511, 4(%esp)
movl $.LC0, (%esp)
call mkdir ; interesting call
testl %eax, %eax
setne %al
testb %al, %al
je .L2
movl $-1, %eax
jmp .L3
.L2:
movl $0, %eax
.L3:
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (GNU) 4.5.1 20100924 (Red Hat 4.5.1-4)"
.section .note.GNU-stack,"",#progbits
Anyway, ProgrammingGroundUp page 272 lists important syscalls, including mkdir:
%eax Name %ebx %ecx %edx Notes
------------------------------------------------------------------
39 mkdir NULL terminated Permission Creates the given
directory name directory. Assumes all
directories leading up
to it already exist.

You could also do like the Assembly Howto is suggesting. But indeed, calling mkdir from Libc is more portable. You need to look into asm/unistd.h to get the syscall number.