Why is LLDB generating EXC_BAD_INSTRUCTION with user compiled library on MacOS?

Why is LLDB generating EXC_BAD_INSTRUCTION with user compiled library on MacOS? - c

I want to debug OpenSSL on MacOS to see how it creates an elliptic curve point. So, I compiled OpenSSL with debug symbols and no optimizations. However, when I run with lldb, it doesn't work
$ cat ec.c
#include <crypto/ec.h>
#include <stdio.h>
int main() {
EC_GROUP *group = EC_GROUP_new_by_curve_name(NID_secp384r1);
EC_POINT *point = EC_POINT_new(group);
printf("done!\n");
return 0;
}
Here is how I compiled and ran the program:
$ gcc ec.c -o ec -I../openssl/include ../openssl/libcrypto.dylib -ggdb3 -O0
$ DYLD_INSERT_LIBRARIES=../openssl/libcrypto.dylib ./ec
done!
Here is what happens when I run lldb and try to break at main:
$ lldb ./ec
(lldb) process launch --environment DYLD_INSERT_LIBRARIES=../openssl/libcrypto.dylib ./ec
Process 3948 launched: '/Users/seanthomas/repos/ec/ec' (arm64)
Process 3948 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x4a03000)
frame #0: 0x00000001009651a8 libcrypto.3.dylib`_armv8_sve_probe
libcrypto.3.dylib`:
-> 0x1009651a8 <+0>: eor z0.d, z0.d, z0.d
0x1009651ac <+4>: ret
libcrypto.3.dylib`:
0x1009651b0 <+0>: xar z0.d, z0.d, z0.d, #0x20
0x1009651b4 <+4>: ret
Target 0: (ec) stopped.
(lldb)
Does anyone know how to fix this?

I'm not sure whether this can help you or not. But perhaps, there're bugs in libcrypto.3.dylib on Arm arch.
I met this problem also. It works when I run the program by cmd line in shell. But it meets this problem when I want to debug this program in VSCode using lldb.
How ever, when I delete the libcrypto.3.dylib and libssl.3.dylib, build the openssl using tag OpenSSL_1_1_1m, and rebuild the program. It works!

Yes, confirm the issue on Mac M1/M2. Reverted back to the old version of openssl and the problem got fixed. I use openssl lib in MacOs app of mine and on app launch in Debug mode it gets crashing right away.

Related

Macbook M1 assembly lldb displays only 3 source lines then switches to object code display

First attempt at ARM64 (apple M1) assembly coding. Have basic 'hello world' code which assembles and runs correctly but when I run it in lldb, only the first three lines are displayed in full source code format like this:
Abenaki:hello jiml$ ~/llvm/clang+llvm-15.0.2-arm64-apple-darwin21.0/bin/lldb hello
(lldb) target create "hello"
Current executable set to '/Users/jiml/Projects/GitRepos/ARM/hello/hello/hello/hello' (arm64).
(lldb) b main
Breakpoint 1: where = hello`main + 4, address = 0x0000000100003f7c
(lldb) r
Process 5017 launched: '/Users/jiml/Projects/GitRepos/ARM/hello/hello/hello/hello' (arm64)
Process 5017 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100003f7c hello`main at hello.s:19
16
17 _main:
18 mov x0, #0x0 // stdout
-> 19 adrp x1, msg#PAGE // pointer to string
20 add x1, x1, msg#PAGEOFF
21 ldr x2, =msg_len // bytes to output
22 mov x16, #0x04 // sys_write
warning: This version of LLDB has no plugin for the language "assembler". Inspection of frame variables will be limited.
(lldb)
After three steps, the display reverts to bare object code like this:
(lldb) s
Process 5017 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x0000000100003f88 hello`main + 16
hello`main:
-> 0x100003f88 <+16>: mov x16, #0x4
0x100003f8c <+20>: svc #0x80
0x100003f90 <+24>: adrp x1, 1
0x100003f94 <+28>: mov x2, #0x0
dwarfdump -a shows that all source lines are present in the .o; same behavior for .dSYM assembly. Using the 'list' command in lldb however displays all source lines correctly.
Is this a known issue for LLVM (clang, lldb) development? Any help appreciated...
I have tried LLVM version 14 and 15, same behavior, searched for similar issues but no help.
I did find this https://stackoverflow.com/questions/73778648/why-is-it-that-assembling-linking-in-one-step-loses-debug-info-for-my-assembly-s but it did not solve my issue.

So I think I have this resolved but not sure if it is actual compiler bug.
I wrote hello world in C, compiled and confirmed complete source display in lldb. I then reran clang with -S to generate the assembler source.
I then assembled that source...
clang -g -c -o hello.o hello.s
clang -o hello hello.o -lSystem -arch arm64
and confirmed it also runs in lldb with complete source display. Then I moved my hand written code line-by-line in order to figure out where the problem occurs. Seems my string data and length calculation is problematic. In the data section I originally had:
msg: ascii "Hello ARM64\n"
msg_len = . - msg
Coming from Intel world this seems perfectly natural ;-) Adding that length calculation caused some sort of corruption of the debug data. However, the executable has a proper OSO statement pointing at hello.o (nm -ap hello) and further the object file has references for all source statements in the source file (dwarfdump --debug-line hello.o) but still doesn't display source code after the third step. Curious that 'source info -f hello.s' within lldb only listed four lines.
I found three work-arounds. First adding a label between the two statements seems to allow correct behavior:
msg: ascii "Hello ARM64\n"
nothing:
msg_len = . - msg
Second, using equate:
msg: ascii "Hello ARM64\n"
.equ msg_len, . - msg
Third, using two labels:
msg: ascii "Hello ARM64\n"
msg_end:
msg_len = msg_end - msg
Will file report with llvm and see what they say.

Aarch64 baremetal Hello World program on QEMU

I have compiled and ran a simple hello world program on ARM Foundation Platform. The code is given below.
#include <stdio.h>
int main(int argc, char *argv[])
{
printf("Hello world!\n");
return 0;
}
This program is compiled and linked using ARM GNU tool chain as given below.
$aarch64-none-elf-gcc -c -march=armv8-a -g hello.c hello.o
aarch64-none-elf-gcc: warning: hello.o: linker input file unused because linking not done
$aarch64-none-elf-gcc -specs=aem-ve.specs -Wl,-Map=linkmap.txt hello.o -o hello.axf
I could successfully execute this program on ARM Foundation Platform (I think, foundation platform is similar to ARM Fixed Virtual Platform) and it prints "Hello world!"
The contents of 'aem-ve.specs' file is given below:
cat ./aarch64-none-elf/lib/aem-ve.specs
# aem-ve.specs
#
# Spec file for AArch64 baremetal newlib, libgloss on VE platform with version 2
# of AngelAPI semi-hosting.
#
# This Spec file is also appropriate for the foundation model.
%rename link old_link
*link:
-Ttext-segment 0x80000000 %(old_link)
%rename lib libc
*libgloss:
-lrdimon
*lib:
cpu-init/rdimon-aem-el3.o%s --start-group %(libc) %(libgloss) --end-group
*startfile:
crti%O%s crtbegin%O%s %{!pg:rdimon-crt0%O%s} %{pg:rdimon-crt0%O%s}
Is it possible to execute the same binary on QEMU? If so could you please share the sample command. If not, could you please share the right and best way to execute it on QEMU?
I have traced instruction execution of Foundation platform using the "trace" option and analysed using 'tarmac-profile' tool and 'Tarmac-calltree' tool. The outputs are given below:
$ ./tarmac-profile hello.trace --image hello.axf
Address Count Time Function name
0x80000000 1 120001
0x80001030 1 30001 register_fini
0x80001050 1 60001 deregister_tm_clones
0x800010c0 1 210001 __do_global_dtors_aux
0x80001148 1 23500001 frame_dummy
0x800012c0 1 10480001 main
0x80002000 1 40001 main
0x80003784 1 360001 main
0x80003818 1 460001 _cpu_init_hook
0x80003870 1 590001 _write_r
0x800038d0 1 1090001 __call_exitprocs
...
...
./tarmac-calltree hello.trace --image hello.axf
o t:10000 l:2288 pc:0x80001148 - t:23510000 l:7819 pc:0x80008060 : frame_dummy
- t:240000 l:2338 pc:0x800011a8 - t:720000 l:2443 pc:0x800011ac
o t:250000 l:2340 pc:0x80003818 - t:710000 l:2442 pc:0x80003828 : _cpu_init_hook
- t:260000 l:2343 pc:0x8000381c - t:320000 l:2354 pc:0x80003820
o t:270000 l:2345 pc:0x80002000 - t:310000 l:2353 pc:0x80002010 : main
- t:320000 l:2354 pc:0x80003820 - t:700000 l:2436 pc:0x80003824
o t:330000 l:2356 pc:0x80003784 - t:690000 l:2435 pc:0x80003814 : main
- t:760000 l:2453 pc:0x800011bc - t:2010000 l:2970 pc:0x800011c0
o t:770000 l:2455 pc:0x80004200 - t:2000000 l:2969 pc:0x800042c0 : memset
- t:2010000 l:2970 pc:0x800011c0 - t:4870000 l:3587 pc:0x800011c4
o t:2020000 l:2972 pc:0x80007970 - t:4860000 l:3586 pc:0x80007b04 : initialise_monitor_handles
- t:2960000 l:3165 pc:0x80007b24 - t:4340000 l:3465 pc:0x80007b28
I have tried the following method to execute on QEMU, without any success.
I have started the QEMU with gdb based debugging through TCP Port
$ qemu-system-aarch64 -semihosting -m 128M -nographic -monitor none -serial stdio -machine virt,gic-version=2,secure=on,virtualization=on -cpu cortex-a53 -kernel hello.axf -S -gdb tcp::9000
The result of debug session is given below:
gdb) target remote localhost:9000
Remote debugging using localhost:9000
_start () at /data/jenkins/workspace/GNU-toolchain/arm-11/src/newlib-cygwin/libgloss/aarch64/crt0.S:90
90 /data/jenkins/workspace/GNU-toolchain/arm-11/src/newlib-cygwin/libgloss/aarch64/crt0.S: No such file or directory.
(gdb) si
<The system hangs here>
I have tried to disassemble the code using gdb and its output is given below. It looks like the code is not loaded correctly
(gdb) disas frame_dummy
Dump of assembler code for function frame_dummy:
0x0000000080001110 <+0>: udf #0
0x0000000080001114 <+4>: udf #0
0x0000000080001118 <+8>: udf #0
0x000000008000111c <+12>: udf #0
0x0000000080001120 <+16>: udf #0
0x0000000080001124 <+20>: udf #0
0x0000000080001128 <+24>: udf #0
0x000000008000112c <+28>: udf #0
0x0000000080001130 <+32>: udf #0
0x0000000080001134 <+36>: udf #0
0x0000000080001138 <+40>: udf #0
0x000000008000113c <+44>: udf #0
0x0000000080001140 <+48>: udf #0
0x0000000080001144 <+52>: udf #0
End of assembler dump.
Could you please shed some light on this. Any hint given is appreciated very much.

The Foundation Model and the QEMU 'virt' board are not the same machine type. They have different devices at different physical addresses, and in particular they do not have RAM at the same address. To run bare metal code on the 'virt' machine type you will need to adjust your code. This is normal for bare metal -- the whole point is that you are running directly on some piece of (emulated) hardware and need to match the details of it.
Specifically, the minimum change you need to make here is that the RAM on the 'virt' board starts at 0x4000_0000, not the 0x8000_0000 that the Foundation Model uses. There may be others as well, but that is the immediate cause of what you're seeing.

Unable to execute assembly code located in the program's data segment in WSL

This program overrides the return address of the main function to point to a character array of encoded x86-64 assembly instructions. The instructions simply encode a NOP followed by the syscall exit(1).
char shellcode[] = "\x90\x48\x31\xc0\x48\x89\xc7\x48\xff\xc7\x04\x3c\x0f\x05";
void main() {
long int *ret;
ret = (long int *)&ret + 2;
(*ret) = (long int)shellcode;
}
It is compiled with
gcc program.c -o program.exe -fno-stack-protector -z execstack
on a WSL. (The Ubuntu app from the Microsoft Store)
The -z execstack flag should cause the stack, as well as all other writable portions of the program, to be executable.
Since the global character array resides in the .data of the assembly code it should be executable. This is true when running the program on a Linux VM, but apparently not on the WSL? On WSL, control is still properly redirected to the NOP, but it segfaults if you try to execute that instruction (presumably on code-fetch from a non-executable page). Running gdb shows this after returning from main:
=> 0x8004010 <shellcode>: nop
0x8004011 <shellcode+1>: xor %rax,%rax
(gdb) si
Program received signal SIGSEGV, Segmentation fault.
What could be causing this inconsistency between WSL and real Linux? Does WSL enforce strict W^X, overriding even -z execstack?
Are other Windows security features at play? I do not know if it is a WSL 1 or 2, but the uname -a output is:
Linux LAPTOP-M91FQN9V 4.4.0-18362-Microsoft #836-Microsoft Mon May 05 16:04:00 PST 2020 x86_64 x86_64 x86_64 GNU/Linux
What could be preventing the instructions from running, when the same does not occur on a Linux VM?

Cannot set breakpoints with GDB and OpenOCD for STM32F4 with ST-Link

I'm trying to use OpenOCD with GDB to debug the STM32F4 Cortex-M4 on my STM32F4Discovery board.
Setup:
Ubuntu 16.04
OpenOCD 0.9.0 (also tested with 0.10-dev)
arm-none-eabi-gdb 7.10
STM32F4Discovery with ST-Link v2 (V2J28S0)
Project code generated with STM32CubeMX
I made sure debug wire is enabled in STM32CubeMX (this keeps the debug wire pins in default state)
GCC flags are:
-mcpu=cortex-m4 -mthumb -mthumb-interwork -mfloat-abi=hard -mfpu=fpv4-sp-d16 -ffunction-sections -fdata-sections -g -fno-common -fmessage-length=0
I added simple blinking LED code to the main loop, to test the debugging.
I start OpenOCD with openocd -f board/stm32f4discovery.cfg -c "program build/discovery_simple_test.elf verify reset". OpenOCD flashes the chip and resets it. (Output of OpenOCD can be found here)
Now I connect to it with GDB:
(gdb) target remote localhost:3333
Remote debugging using localhost:3333
0x08001450 in ?? ()
(gdb) set verbose on
(gdb) file "/abs_path/build/discovery_simple_test.elf"
A program is being debugged already.
Are you sure you want to change the file? (y or n) y
Load new symbol table from "/abs_path/build/discovery_simple_test.elf"? (y or n) y
Reading symbols from /abs_path/build/discovery_simple_test.elf...done.
Reading in symbols for /abs_path/Src/main.c...done.
(gdb) monitor reset
(gdb) break main
Breakpoint 1 at 0x8000232: file /abs_path/Src/main.c, line 71.
(gdb) break /abs_path/Src/main.c:93
Breakpoint 2 at 0x8000258: file /abs_path/Src/main.c, line 93.
The program should break at line 93, but it doesn't.
When I halt the execution and try to continue it, it doesn't continue:
(gdb) monitor halt
target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0x21000000 pc: 0x080006d8 msp: 0x2001ffe8
(gdb) monitor continue
//Program doesn't continue
What is going on and how can I fix it?

To fix this issue i add the following command in openocd :
-c gdb_breakpoint_override hard

May be another GDB-Instance is Running ?
"A program is being debugged already."
Look for a "arm-none-eabi-gdb" Process and kill it.

I guess the command to use is just continue and not monitor continue since you have to tell GDB that the underlying software is running.
monitor continue tells the embedded system to continue but GDB does not know about it (it does not interpret the meaning of the monitor command, openocd does).

No Debug Symbols cross compile ARM on BusyBox

i'm trying to debug a C program, which runs on an ARM926EJ-S rev 5 (v5l). The software was cross-compiled (and is statically linked) with the std. arm-linux-gnueabi compiler (intalled via synaptic). I run Ubuntu 13.04 64bit. On the device is a Busybox v1.18.2. I successfully compiled gdbserver (with host=arm-linux-gnueabi) and gdb (with target=arm-linux-gnueabi) and can start my program on the embedded device via the locally running gdb...
My problem now is, that i don't have a proper backtrace output.
Message of gdb:
Remote debugging using 192.168.21.127:2345
0x0000a79c in ?? ()
(gdb) run
The "remote" target does not support "run". Try "help target" or "continue".
(gdb) continue
Continuing.
Cannot access memory at address 0x0
Program received signal SIGINT, Interrupt.
0x00026628 in ?? ()
(gdb) backtrace
#0 0x00026628 in ?? ()
#1 0x00036204 in ?? ()
#2 0x00036204 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
I try to compile the software with -g, -g3 -gdwarf-2, -ggdb, -ggdb3 without any difference.
Has anybody an idea what i am missing here?
Is this a problem maybe with the BusyBox or do i need additional libs on my host system?
I also tried the function backtrace_symbols from execinfo.h with nearly the same output...
Thanks in advance for any reply.

Another way for debugging is use gdb inside board follow below steps.
1)Run gdb process and attach your process to gdb using attach <pid> command
2)Continue your process using c command in gdb
Whenever you find any SIGINT or SIGSEGV then refer stack of your process using bt command in gdb.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Why is LLDB generating EXC_BAD_INSTRUCTION with user compiled library on MacOS? - c

Yes, confirm the issue on Mac M1/M2. Reverted back to the old version of openssl and the problem got fixed. I use openssl lib in MacOs app of mine and on app launch in Debug mode it gets crashing right away.

Related

Macbook M1 assembly lldb displays only 3 source lines then switches to object code display

Aarch64 baremetal Hello World program on QEMU

Unable to execute assembly code located in the program's data segment in WSL

Cannot set breakpoints with GDB and OpenOCD for STM32F4 with ST-Link

No Debug Symbols cross compile ARM on BusyBox

Categories

Resources