How to call std library functions in lldb

How does one print out an expression with a std library function in lldb? For example, suppose I want to use std::string::c_str() in a print expression. I can see the symbol and disassemble it just fine, but cannot seem to use it in an expression call:
(lldb) image lookup -v -r -n "c_str\("
2 matches found in /usr/lib/libc++.1.dylib:
Address: libc++.1.dylib[0x0000000000041da6] (libc++.1.dylib.__TEXT.__text + 264214)
Summary: libc++.1.dylib`std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::c_str() const
Module: file = "/usr/lib/libc++.1.dylib", arch = "x86_64"
Symbol: id = {0x000002ec}, range = [0x00007fff8ec8cda6-0x00007fff8ec8cdbe), name="std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::c_str() const", mangled="_ZNKSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE5c_strEv"
(lldb) dis -n "std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::c_str()"
libc++.1.dylib`std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::c_str() const:
0x7fff8ec8cda6: pushq %rbp
0x7fff8ec8cda7: movq %rsp, %rbp
0x7fff8ec8cdaa: testb $0x1, (%rdi)
0x7fff8ec8cdad: je 0x7fff8ec8cdb5 ; std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::c_str() const + 15
0x7fff8ec8cdaf: movq 0x10(%rdi), %rdi
0x7fff8ec8cdb3: jmp 0x7fff8ec8cdb8 ; std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::c_str() const + 18
0x7fff8ec8cdb5: incq %rdi
0x7fff8ec8cdb8: movq %rdi, %rax
0x7fff8ec8cdbb: popq %rbp
0x7fff8ec8cdbc: retq
0x7fff8ec8cdbd: nop
But I cannot seem to use it in an expression. lldb always seems to choke on the std namespace identifier:
(lldb) expr std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::c_str($rax)
error: use of undeclared identifier 'std'
error: expected unqualified-id
error: 2 errors parsing expression

That's an instance method, you have to call it on an object. In your case, it looks like you want to do something like:
(lldb) expr ((std::__1::string *) $rax)->c_str()
(const value_type *) $1 = 0x00007fff5fbff661 "some string here"
You really shouldn't have to explicitly name the __1, but lldb doesn't support "inline namespaces" yet (and the clang debug information doesn't say __1 IS an inline namespace), so for now you do.

Related

How to call a C function from assembly file?

I have the source code of a static library. I'm trying to compile it into a dynamic library. The source code has .c files and one .S file. While compiling, I'm getting a relocation error from the assembly code. Going through the assembly code, I found that this error is generated while calling a function from one of the C files. The assembly code segment is:
.extern dune_syscall_handler // I added it
__dune_syscall:
testq $1, %gs:IN_USERMODE
jnz 1f
pushq %r11
popfq
vmcall
jmp *%rcx
1:
/* first switch to the kernel stack */
movq %rsp, %gs:TMP
movq %gs:TRAP_STACK, %rsp
/* now push the trap frame onto the stack */
subq $TF_END, %rsp
movq %rcx, RIP(%rsp)
movq %r11, RFLAGS(%rsp)
movq %r10, RCX(%rsp) /* fixup to standard 64-bit calling ABI */
SAVE_REGS 0, 1
movq %gs:TMP, %rax
movq %rax, RSP(%rsp)
SET_G0_FS_BASE
/* re-enable interrupts and jump to the handler */
sti
movq %rsp, %rdi /* argument 0 */
lea dune_syscall_handler, %rax <-------------------- Causing error
call *%rax
SET_G3_FS_BASE
RESTORE_REGS 0, 1
movq RCX(%rsp), %r10
movq RFLAGS(%rsp), %r11
movq RIP(%rsp), %rcx
/* switch to the user stack and return to ring 3 */
movq RSP(%rsp), %rsp
sysretq
Causing the following error,
ld: dune.o: relocation R_X86_64_32S against `dune_syscall_handler' can not be used when
making a shared object; recompile with -fPIC
dune.o: error adding symbols: Bad value
dune_syscall_handler is defined in a separate C file. I use -fPIC flag while compiling. readelf shows the following,
$readelf -r dune.o
Relocation section '.rela.text' at offset 0x18c8 contains 2 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000219 00120000000b R_X86_64_32S 0000000000000000 dune_syscall_handler + 0
I guess Sym. Value is supposed to be filled in by the linker at runtime, and that's why it is empty? Listing symbols from the object shows:
$nm dune.o
0000000000000070 t __dune_retry
00000000000002e5 T dune_jump_to_user
000000000000016c T __dune_syscall
000000000000028e T __dune_syscall_end
U dune_syscall_handler
0000000000000100 a I
I'm new to assembly. My understanding is that, while trying to load the effective address of the C function dune_syscall_handler, the linker could not find it. I just can't figure out why. Can someone please tell me how I can call a C function from an assembly file? There is a similar post with the same title, How to call a C function from Assembly code, but I guess my problem is different.

Operand type mismatch for `push' [duplicate]

This question already has answers here:
Reading program counter directly
(7 answers)
Closed 4 years ago.
I need help with C code that uses assembly parts. GCC fails to assemble the assembly file with this error:
$ make
gcc -Wall -g -std=c99 -pedantic -c -o sthread.o sthread.c
sthread.c: In function ‘sthread_create’:
sthread.c:159:57: warning: pointer of type ‘void *’ used in arithmetic [-Wpointer-arith]
t->context = __sthread_initialize_context(t->memory + DEFAULT_STACKSIZE, f, arg);
^
gcc -Wall -g -std=c99 -pedantic -c -o queue.o queue.c
as -g -o glue.o glue.s
glue.s: Assembler messages:
glue.s:32: Error: operand type mismatch for `push'
<builtin>: recipe for target 'glue.o' failed
make: *** [glue.o] Error 1
Code in question:
__sthread_switch:
# preserve CPU state on the stack, with exception of stack pointer, instruction pointer first, reverse order
pushq %rip #line 32
pushf
pushq %rdi
pushq %rsi
pushq %rbp
pushq %rbx
pushq %rdx
pushq %rcx
pushq %rax
# Call the high-level scheduler with the current context as an argument
movq %rsp, %rdi
movq scheduler_context, %rsp
call __sthread_scheduler
On x86-64 you cannot push %rip; in fact, you cannot use it as an ordinary register operand at all.
If you still need its value, you can do
leaq 0(%rip), %rax # Or any other GPR that is free
pushq %rax
OR
callq . + 5 # no label, hard-code instruction length
# or
callq 1f ; 1: # with a local numbered label
Although I am not sure why you would want to stash the %rip, if you restore it from here, the execution will continue from the push instruction. Is there any value to that? You need to rethink your thread switching logic.
To push RIP, simply execute call with a zero displacement.
call next_insn
next_insn:
So the jump part of call is a no-op, so you just get the effect of pushing a return address (i.e. the current RIP).
Fun fact: call rel32=0 is a special case and doesn't unbalance the return address predictor stack on CPUs more recent than PPro. So call next_insn / pop eax is useful in 32-bit mode as an equivalent to lea (%rip), %rax.
It's still a branch instruction, and still decodes to multiple uops (unlike a push of a GPR which is 1 micro-fused uop), so lea (%rip), %rax ; push %rax may be more efficient.
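As a rough illustration of the lea approach described in the answers above, the following C sketch wraps the instruction in GCC-style inline assembly. The helper name current_ip and the fallback for non-x86-64 targets are assumptions made for this example only, and the code assumes GCC or Clang.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original answers): returns an
   instruction address captured while this function is executing. */
__attribute__((noinline))
static uintptr_t current_ip(void)
{
    uintptr_t ip;
#if defined(__x86_64__)
    /* RIP-relative lea with displacement 0 yields the address of the
       instruction that follows the lea itself. */
    __asm__ volatile ("leaq 0(%%rip), %0" : "=r"(ip));
#else
    /* Assumption for other targets: fall back to the caller's return
       address, which is a different (but still valid) code address. */
    ip = (uintptr_t)__builtin_return_address(0);
#endif
    return ip;
}
```

On x86-64 the returned value lies inside current_ip itself, which is the same value that pushing the result of leaq 0(%rip) would place on the stack.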

Passing 128 bit register to C function from Assembly [duplicate]

This question already has an answer here:
Printing floating point numbers from x86-64 seems to require %rbp to be saved
(1 answer)
Closed 5 years ago.
I am attempting to test passing a floating point value to a C function from assembly on 64-bit Linux. The C file containing my C function looks like this:
#include <stdio.h>
extern void printer(double k){
printf("%f\n",k);
}
Its expected behavior is to simply print the floating point number passed to it. I am trying to accomplish this from an AT&T-syntax assembly file. If I am not mistaken, in 64-bit linux, the calling convention is to pass floating point arguments on the XMM registers. My .s file is the following:
.extern printer
.data
var:
.double 120.1
.global main
main:
movups (var),%xmm0
call printer
mov $60,%rax
syscall
What I'm hoping this could do is have a variable (var) with value 120.1. This is then moved to the xmm0 register, which I expect is what is used to pass the argument k. This understanding of the calling convention is also backed up by the assembly code generated from the C file, a portion of which is below:
printer:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movsd %xmm0, -8(%rbp)
movq -8(%rbp), %rax
movq %rax, -16(%rbp)
movsd -16(%rbp), %xmm0
movl $.LC0, %edi
movl $1, %eax
call printf
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
My .s file assembles to an executable, but running it only gives a segmentation fault, and doesn't print the floating point value. I can only assume this is because I'm not properly moving the value to xmm0 and/or using the register to pass it to the function. Can somebody explain how I should pass the value to the function?
You have defined main in the data section, which makes it non-executable. Add a .text directive before main.

gcc - structures as label for debugging

Context:
Linux 64.
I would like a way to tell gcc to keep structures as they are when generating assembly with gcc -O0 -S -g myprog.c
By that, I mean: instead of referencing a structure by address, I would like it to be referenced by label. That would make the output easier to parse without reading the source code again.
So, for example:
struct mystruct{
int32_t a;
char * b;
};
would become something like:
label_mystruct:
-4(label_mystruct)
-12(label_mystruct)
and for example, referenced by:
add $56, -4(label_mystruct)
Currently, it is referenced like
.globl _main
_main:
LFB13:
LM157:
pushq %rbp #
LCFI27:
movq %rsp, %rbp#,
LCFI28:
subq $80, %rsp#,
movl %edi,-68(%rbp) # argc, argc,
movq %rsi,-80(%rbp) # argv, argv
Next line is the culprit:
movq -56(%rbp), %rdx # list, D.3781
movq -16(%rbp), %rax # arr, D.3780
movq %rdx, %rsi # D.3781,
movq %rax, %rdi # D.3780,
call _myaddhu #
I would like it to be
label_mystruct:
-4(label_mystruct)
-12(label_mystruct)
.globl _main
_main:
LFB13:
LM157:
pushq %rbp #
LCFI27:
movq %rsp, %rbp#,
LCFI28:
subq $80, %rsp#,
movl %edi,-68(%rbp) # argc, argc,
movq %rsi,-80(%rbp) # argv, argv
Now it is fine:
movq label_mystruct, %rdx # list, D.3781
movq -16(%rbp), %rax # arr, D.3780
movq %rdx, %rsi # D.3781,
movq %rax, %rdi # D.3780,
call _myaddhu #
Question:
Is that possible with gcc and without using external tools?
I think it's not possible, and that is a consequence of how GCC works.
The problem is that the struct is stored on the stack, and you cannot really have a label referring to something on the stack. If the struct were not on the stack (for example, if it were a global variable), it would have a label referring to it.
What GCC does generate is debugging info that records where each piece of data is placed while specific code is running. In your example it would in essence say "when executing this code, -56(%rbp) refers to mystruct".
On the other hand, if you wrote the assembly code by hand, you could certainly have symbolic references to a variable. You could for example do:
#define MYSTRUCT -56(%rbp)
...
movq MYSTRUCT, %rdx
However, MYSTRUCT is expanded by the preprocessor, so the symbol is lost before the code is assembled. It would be of little help if GCC did this (except that the assembly generated by -S might be more readable), and GCC does not run its generated assembly through the preprocessor anyway.
You get that if you put your struct into static storage. This of course alters the meaning of the code. For example, this code
struct {
int a, b;
} test;
void settest(int a, int b) {
test.a = a;
test.b = b;
}
compiles to (cleaned up):
settest:
movl %edi, test(%rip)
movl %esi, test+4(%rip)
ret
.comm test,8,4
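To connect the test and test+4 operands in the listing above to the C source, here is a small sketch that computes the same offset from member addresses. The type name pair and the helper offset_of_b are hypothetical names introduced for this example.

```c
#include <stddef.h>

/* Same shape as the answer's example: two ints in static storage.
   The tag name `pair` is an assumption added for illustration. */
struct pair {
    int a, b;
};

static struct pair test;

/* Stores both members, like settest in the answer. */
static void settest(int a, int b)
{
    test.a = a;
    test.b = b;
}

/* Byte distance from the start of `test` to its member b, i.e. the
   constant the assembler adds to the symbol in `test+4(%rip)`. */
static size_t offset_of_b(void)
{
    return (size_t)((char *)&test.b - (char *)&test);
}
```

Because test has static storage, its address is a link-time symbol, so the compiler can emit symbol-plus-offset operands instead of offsets from %rbp.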
You could also try to pass the option -fverbose-asm to gcc which instructs gcc to add some annotations that might make the assembly easier to read.

Why no corresponding subroutine call in machine code dump when calling strcpy()?

I am working through the book 'Hacking: The Art of Exploitation' by Jon Erickson.
In one part of the book he gives C code and then walks through the corresponding assembly using gdb, explaining the instructions and memory activity.
I am working along in Mac OS X, so things are a bit different than he presents in the book (he is using Linux).
Anyway, I have this C program:
1 #include <stdio.h>
2 #include <string.h>
3
4 int main()
5 {
6 char str_a[20];
7
8 strcpy(str_a, "Hello, world!\n");
9 printf(str_a);
10 }
Here is the corresponding otool object dump (I have just included main):
_main:
0000000100000ea0 pushq %rbp
0000000100000ea1 movq %rsp,%rbp
0000000100000ea4 subq $0x30,%rsp
0000000100000ea8 movq 0x00000189(%rip),%rax
0000000100000eaf movq (%rax),%rax
0000000100000eb2 movq %rax,0xf8(%rbp)
0000000100000eb6 leaq 0xd4(%rbp),%rax
0000000100000eba movq %rax,%rcx
0000000100000ebd movq $0x77202c6f6c6c6548,%rdx
0000000100000ec7 movq %rdx,(%rcx)
0000000100000eca movb $0x00,0x0e(%rcx)
0000000100000ece movw $0x0a21,0x0c(%rcx)
0000000100000ed4 movl $0x646c726f,0x08(%rcx)
0000000100000edb movq %rcx,0xe8(%rbp)
0000000100000edf xorb %cl,%cl
0000000100000ee1 movq %rax,%rdi
0000000100000ee4 movb %cl,%al
0000000100000ee6 callq 0x100000f1e ; symbol stub for: _printf
0000000100000eeb movl 0xf4(%rbp),%eax
0000000100000eee movq 0x00000143(%rip),%rcx
0000000100000ef5 movq (%rcx),%rcx
0000000100000ef8 movq 0xf8(%rbp),%rdx
0000000100000efc cmpq %rdx,%rcx
0000000100000eff movl %eax,0xd0(%rbp)
0000000100000f02 jne 0x100000f0d
0000000100000f04 movl 0xd0(%rbp),%eax
0000000100000f07 addq $0x30,%rsp
0000000100000f0b popq %rbp
0000000100000f0c ret
0000000100000f0d callq 0x100000f12 ; symbol stub for: ___stack_chk_fail
OK. You will notice the subroutine call to printf():
0000000100000ee6 callq 0x100000f1e ; symbol stub for: _printf
But where is the call to strcpy()?
There are two further anomalies. First of all, if I set a breakpoint in gdb for strcpy():
break strcpy
The program zips through its execution without stopping. It seems as if strcpy() isn't actually getting called.
Second, when I compiled the code:
gcc -g -o char_array2 char_array2.c
I got a warning:
char_array2.c: In function ‘main’:
char_array2.c:9: warning: format not a string literal and no format arguments
char_array2.c:9: warning: format not a string literal and no format arguments
I'm not sure if that is related to the missing subroutine call, but I thought I would include it as a data point anyway.
It almost seems to me as if the compiler has decided strcpy() is not necessary and has optimised the code to work without it. The program does work as expected, printing 'Hello, world!' to the standard output, but this missing call to strcpy() has me wondering exactly what is happening.
In Erickson's example in the book there is a call to strcpy() so perhaps there is a difference in how his compiler and my compiler are working. I am on LLVM:
i686-apple-darwin11-llvm-gcc-4.2
Any ideas would be gratefully received!
Thanks in advance, and I hope you find this one interesting.
Tom
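The compiler has inlined strcpy(): because the source is a string literal of known length, it stores the bytes directly into str_a instead of calling the function. It's right here: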
0000000100000ebd movq $0x77202c6f6c6c6548,%rdx
0000000100000ec7 movq %rdx,(%rcx)
0000000100000eca movb $0x00,0x0e(%rcx)
0000000100000ece movw $0x0a21,0x0c(%rcx)
0000000100000ed4 movl $0x646c726f,0x08(%rcx)
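To check that these stores really spell out the string, the sketch below replays them in C, writing each immediate least significant byte first as x86-64 does. The helper names store_le and replay_inlined_strcpy are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Write the low `len` bytes of `value` to dst, least significant byte
   first, mimicking how an x86-64 mov stores an immediate to memory. */
static void store_le(char *dst, uint64_t value, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = (char)((value >> (8 * i)) & 0xff);
}

/* Replay the stores from the disassembly, in the same order.
   buf must have room for at least 15 bytes. */
static void replay_inlined_strcpy(char *buf)
{
    store_le(buf + 0x00, 0x77202c6f6c6c6548ULL, 8); /* movq: "Hello, w" */
    store_le(buf + 0x0e, 0x00, 1);                  /* movb: terminating NUL */
    store_le(buf + 0x0c, 0x0a21, 2);                /* movw: "!\n" */
    store_le(buf + 0x08, 0x646c726f, 4);            /* movl: "orld" */
}
```

After the call, buf holds "Hello, world!\n" followed by the NUL terminator, which is exactly what strcpy() would have produced.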
