I have written a simple arm64 bare-metal program that switches between EL2 and EL1. In both EL2 and EL1 I call the sprintf function, as shown below.
void EL2Handler()
{
char buffer[100];
sprintf(buffer, "Current Exception Level is 2");
...
}
void EL1Handler()
{
char buffer[100];
sprintf(buffer, "Current Exception Level is 1");
...
}
When sprintf is called from EL1, it raises a synchronous exception. What could be the reason? Could it be related to memory access permissions? I am emulating this code on QEMU using the virt machine.
The code is compiled with
aarch64-none-elf-gcc -I. -nostartfiles -ffreestanding --specs=rdimon.specs -L. -Wl,-T,qemu-virt-aarch64.ld -o test.elf startup.s test.c
The QEMU command line is
qemu-system-aarch64 -semihosting -m 1024M -nographic -machine virt,gic-version=2,virtualization=on -cpu max -kernel test.elf -S -gdb tcp::9000
This instruction causes the exception:
0x400012cc <sprintf+60> ldp q16, q17, [x1]
(gdb) p /x $x1
$2 = 0x47effdf8
(gdb) p /x $sp
$3 = 0x47effdc0
(gdb) p /x $ESR_EL1
$5 = 0x1fe00000
The exception class in ESR_EL1 is 0b000111, "Access to SVE, Advanced SIMD or floating-point functionality trapped by CPACR_EL1.FPEN, CPTR_EL2.FPEN, CPTR_EL2.TFP, or CPTR_EL3.TFP control......". Could it be that floating-point/SVE/SIMD access is disabled or not implemented at EL1?
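Quite possibly, yes: if CPACR_EL1.FPEN leaves FP/SIMD accesses trapped at EL1, the first Advanced SIMD instruction (here the ldp q16, q17 inside sprintf) raises exactly this exception class. A minimal sketch of the enable sequence, assuming the usual register layout from the Arm ARM (untested; run it before dropping to EL1):

```
// CPACR_EL1.FPEN (bits 21:20) = 0b11: no trapping of FP/SIMD at EL1/EL0
mov x0, #(0x3 << 20)
msr cpacr_el1, x0
// If EL2 traps are also in play, clear CPTR_EL2.TFP (bit 10 when HCR_EL2.E2H == 0)
mrs x0, cptr_el2
bic x0, x0, #(1 << 10)
msr cptr_el2, x0
isb
```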
gcc 7 can't compile code that uses float types after I changed my Ubuntu 18.04 machine's hardware (CPU and motherboard), even though I reinstalled gcc 7, build-essential, and related packages.
Cross builds such as arm-linux-gnueabihf-gcc fail too,
but clang works fine.
Before:
CPU: i7-8700k,
Code Name: Coffee Lake
Motherboard: ASUS prime z370A https://www.asus.com/Motherboards-Components/Motherboards/PRIME/PRIME-Z370-A/
After:
CPU: i7-4770K
Code Name: haswell
Motherboard: ASUS Z87-DELUXE
https://www.asus.com/SupportOnly/Z87DELUXE/HelpDesk_Manual/
more details:
cat a.c
int main()
{
float a=10000.0;
return a/111;
}
gcc -c a.c
a.c: In function ‘main’:
a.c:29:5: internal compiler error: Illegal instruction
float a=10000.0;
^~~~~
I tried to debug it:
>gcc -O0 -c a.c -wrapper gdb,--args
GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1
...
Reading symbols from /usr/lib/gcc/x86_64-linux-gnu/7/cc1...(no debugging symbols found)...done.
(gdb) r
Starting program: /usr/lib/gcc/x86_64-linux-gnu/7/cc1 -quiet -imultiarch x86_64-linux-gnu a.c -quiet -dumpbase a.c -mtune=generic -march=x86-64 -auxbase a -O0 -fstack-protector-strong -Wformat -Wformat-security -o /tmp/ccWQLBBi.s
Program received signal SIGILL, Illegal instruction.
0x00007ffff736cd42 in __gmpn_mul_basecase () from /usr/local/lib/libgmp.so.10
(gdb) bt
#0 0x00007ffff736cd42 in __gmpn_mul_basecase () from /usr/local/lib/libgmp.so.10
#1 0xffffffffffffff01 in ?? ()
#2 0x00007fffffffcc70 in ?? ()
#3 0x00007fffffffcc60 in ?? ()
#4 0x0000000000000004 in ?? ()
#5 0x00007ffff736c7a6 in __gmpn_mul_n () from /usr/local/lib/libgmp.so.10
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
It seems to fail inside libgmp.so.10, so I reinstalled the libgmp10 package, but it still fails with the same issue.
No gcc version works (gcc 7, 8 and 9, and the cross versions for mips/powerpc), but gcc works fine when run in docker.
I have now upgraded the system to 20.04.4, and it still doesn't work.
I guess libgmp detects the wrong CPU type when called.
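A diagnosis worth testing (an assumption based on the /usr/local/lib path in the backtrace): GMP selects CPU-specific code paths when it is configured, so a libgmp built locally on the Coffee Lake box can contain instructions (e.g. ADX) that the older Haswell CPU lacks, and a copy in /usr/local/lib shadows the distro package in /usr/lib, which would explain why reinstalling libgmp10 changes nothing. A quick check:

```shell
# List every libgmp on the system; the dynamic loader prefers /usr/local/lib
# over /usr/lib, so a stale, locally built copy shadows the distro package.
find /usr/local/lib /usr/lib -name 'libgmp.so*' 2>/dev/null || true
# Confirm which copy cc1 actually maps (cc1 path taken from the question):
ldd /usr/lib/gcc/x86_64-linux-gnu/7/cc1 2>/dev/null | grep gmp || true
# If the /usr/local copy is the culprit, move it aside and rerun ldconfig.
```

This also fits the observation that gcc works inside docker: the container has no /usr/local/lib/libgmp.so.10.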
Remote-debugging code running in QEMU with GDB, based on an os-dev tutorial.
My version is here. The problem only happens when remote-debugging code inside QEMU, not when building a normal executable and running it directly under GDB on the host OS.
Code looks something like this:
#define BUFSIZE 255
static char buf[BUFSIZE];
void foo() {
// Making sure it's all zero.
for (int i = 0; i < BUFSIZE; i++) buf[i] = 0;
// Setting first char:
buf[0] = 'a';
// >> insert breakpoint right after setting the char <<
// Prints 'a'.
printf("%s", buf);
}
If I place a breakpoint at the marked spot and print the buffer with p buf, I get random values from random places, seemingly from my code section. If I get the address with p &buf, I get something that does not look correct, for two reasons:
If I do char* p_buf = buf and check the address with p p_buf, it gives me a totally different address, which is stable across executions (the other was not). If I then inspect that memory with x /255b 0x____, I can see the a followed by zeros (97 0 0 0 ... 0).
The next command (printf("%s", buf);) actually does print a.
This leads me to believe GDB might not know the correct location when I only inspect the static variable.
Where should I start debugging this?
Details about the compile conditions:
Compile flags: -g -Wall -Wextra -pedantic -nostdlib -nostdinc -fno-builtin -fno-stack-protector -nostartfiles -nodefaultlibs -m32
qemu-system-i386
Gcc: i386 elf target
Example output from GDB:
(gdb) p buf
$1 = "dfghjkl;'`\000\\zxcvbnm,./\000*\000 ", '\000' <repeats 198 times>...
(gdb) p p_buf
$2 = 0x40c0 <buf+224> "a"
(gdb) p &buf
$3 = (char (*)[255]) 0x3fe0 <buf>
(gdb) info address buf
Symbol "buf" is static storage at address 0x3fe0.
Update 2:
Disassembled a version of the code that shows the discrepancy:
; void foo
0x19f1 <foo> push %ebp
0x19f2 <foo+1> mov %esp,%ebp
0x19f4 <foo+3> sub $0x10,%esp
; char* p_buf = char_buf; --> `p &char_buf` is 0x4040 (incorrect) but `p p_buf` is 0x4100
0x19f7 <foo+6> movl $0x4100,-0x4(%ebp)
; void* p_p_buf = (void*)p_buf; --> `p p_p_buf` gives 0x4100
0x19fe <foo+13> mov -0x4(%ebp),%eax
0x1a01 <foo+16> mov %eax,-0x8(%ebp)
; void* p_char_buf = (void*)&char_buf; --> `p p_char_buf` gives 0x4100
0x1a04 <foo+19> movl $0x4100,-0xc(%ebp)
; char_buf[0] = 'a'; --> correct address
0x1a0b <foo+26> movb $0x61,0x4100
; char_buf[1] = 'b'; --> correct address (asking `p &char_buf` here is still incorrectly 0x4040)
0x1a12 <foo+33> movb $0x62,0x4101
; void foo return
0x1a19 <foo+40> nop
0x1a1a <foo+41> leave
0x1a1b <foo+42> ret
My Makefile for building the project looks like:
C_SOURCES = $(wildcard kernel/*.c drivers/*.c)
C_HEADERS = $(wildcard kernel/*.h drivers/*.h)
OBJ = ${C_SOURCES:.c=.o} kernel/interrupt_table.o
CC = /home/itarato/code/os/i386elfgcc/bin/i386-elf-gcc
# GDB = /home/itarato/code/os/i386elfgcc/bin/i386-elf-gdb
GDB = /usr/bin/gdb
CFLAGS = -g -Wall -Wextra -ffreestanding -fno-exceptions -pedantic -fno-builtin -fno-stack-protector -nostartfiles -nodefaultlibs -m32
QEMU = qemu-system-i386
os-image.bin: boot/boot.bin kernel.bin
cat $^ > $@
kernel.bin: boot/kernel_entry.o ${OBJ}
i386-elf-ld -o $@ -Ttext 0x1000 $^ --oformat binary
kernel.elf: boot/kernel_entry.o ${OBJ}
i386-elf-ld -o $@ -Ttext 0x1000 $^
kernel.dis: kernel.bin
ndisasm -b 32 $< > $@
run: os-image.bin
${QEMU} -drive format=raw,media=disk,file=$<,index=0,if=floppy
debug: os-image.bin kernel.elf
${QEMU} -s -S -drive format=raw,media=disk,file=$<,index=0,if=floppy &
${GDB} -ex "target remote localhost:1234" -ex "symbol-file kernel.elf" -ex "tui enable" -ex "layout split" -ex "focus cmd"
%.o: %.c ${C_HEADERS}
${CC} ${CFLAGS} -c $< -o $@
%.o: %.asm
nasm $< -f elf -o $@
%.bin: %.asm
nasm $< -f bin -o $@
build: os-image.bin
echo Pass
clean:
rm -rf *.bin *.o *.dis *.elf
rm -rf kernel/*.o boot/*.bin boot/*.o
For me, this doesn't seem to happen:
Breakpoint 1, main () at test65.c:16
16 printf("%s", buf);
(gdb) p buf
$2 = "a", '\000' <repeats 253 times>
Where should I start debugging this?
It seems like there are two things that might go wrong:
1. GDB might be reading from the wrong location
I'm not sure what could cause this, but it is easy enough to verify. Check what address p &buf gives you. Then compare it to what you get from p_buf and also to what info address buf shows you.
Note that due to address space layout randomization the address of static variables will change at the point when you start the process. So before run command the address could be e.g. 0x4040 and then change to 0x555555558040 once the code is running:
(gdb) info address buf
Symbol "buf" is static storage at address 0x4040.
(gdb) run
....
Breakpoint 1, main () at test65.c:16
16 printf("%s", buf);
(gdb) p &buf
$1 = (char (*)[255]) 0x555555558040 <buf>
(gdb) info address buf
Symbol "buf" is static storage at address 0x555555558040.
2. GDB is reading correct place, but data is not there yet
It sounds like a typical debugging problem caused by compiler optimizations. For example, the compiler might move the setting of buf[0] = 'a' past the point where your breakpoint lands, though it must set it before printf() gets called. You could try compiling with -O0 to see if it changes anything.
You can also check the disassembly with disas command, to see what has executed up to that point:
(gdb) disas
Dump of assembler code for function main:
0x000055555555517b <+50>: movb $0x61,0x2ebe(%rip) # 0x555555558040 <buf>
=> 0x0000555555555182 <+57>: lea 0x2eb7(%rip),%rsi # 0x555555558040 <buf>
0x0000555555555189 <+64>: lea 0xe74(%rip),%rdi # 0x555555556004
0x0000555555555190 <+71>: mov $0x0,%eax
0x0000555555555195 <+76>: callq 0x555555555050 <printf@plt>
For me the breakpoint lands at the point right after movb sets 0x61 (letter a) to buf.
If you use stepi command until you are at callq printf instruction, you can be sure you see the buffer exactly like printf would see it.
This is an interesting problem. It comes down to the fact that the code LD (the linker) generates for the ELF executable kernel.elf differs from the code it generates for kernel.bin with the --oformat binary option. While one would expect these to be the same, they are not.
More simply put, these Makefile rules do not produce the same code, contrary to what you might expect:
kernel.elf: boot/kernel_entry.o ${OBJ}
i386-elf-ld -o $@ -Ttext 0x1000 $^
and
kernel.bin: boot/kernel_entry.o ${OBJ}
i386-elf-ld -o $@ -Ttext 0x1000 $^ --oformat binary
It appears the difference is in how the linker aligns the sections with and without --oformat binary. The ELF file (and the symbols used for debugging) place code and data in one location, while the binary file actually running in QEMU has them at different offsets.
I hadn't ever observed this issue because I use my own linker scripts and I always generate the binary file from the ELF executable with OBJCOPY rather than using LD to link twice. OBJCOPY can take an ELF executable and convert it to a binary file. The Makefile rules could be amended to look like:
kernel.bin: kernel.elf
i386-elf-objcopy -O binary $^ $@
kernel.elf: boot/kernel_entry.o ${OBJ}
i386-elf-ld -o $@ -Ttext 0x1000 $^
Doing it this way will ensure the binary file that is generated matches what was produced for the ELF executable.
I'm currently teaching myself bare-metal ARM kernel development. I've settled on the Raspberry Pi 2 as a target platform on the basis of it being well documented. I'm currently emulating the device using QEMU.
In a function called by my kernel I'm required to divide a numerical constant by a function argument and store the result as a floating point number for future calculations.
Calling this function causes qemu to go off the rails. Here's the function itself ( setting PL011 baud rate ):
void pl011_set_baud_rate(pl011_uart_t *uart, uint32_t baud_rate) {
float divider = PL011_UART_CLOCK / (16.0f * baud_rate);
uint16_t integer_divider = (uint16_t)divider;
uint8_t fractional_divider = ((divider - integer_divider) * 64) + 0.5;
mmio_write(uart->IBRD, integer_divider); // Integer baud rate divider
mmio_write(uart->FBRD, fractional_divider); // Fractional baud rate divider
};
I'd post a minimal verifiable example, but just about anything will trigger the issue. If you even use:
void test(uint32_t test_var) {
float test_div = test_var / 16;
(void)test_div; // squash [-Wunused-variable] warnings
// goes off the rails here
};
You'll get the same result.
Stepping through the function in gdb, stepping past float divider... causes qemu to jump out of the function and head straight to the halt loop in my bootloader code (for when the kernel main returns).
Checking info args in gdb shows the correct arguments. Checking info locals will show the correct value for float divider. Checking info stack shows the correct stack trace and arguments.
Initially I suspected sp might be in the wrong place, but that doesn't check out since the stack trace looks normal. ( for bare-metal )
(gdb) info stack
#0 pl011_set_baud_rate (uart=0x3f201000, baud_rate=115200) at kernel/uart/pl011.c:23
#1 0x0000837c in pl011_init (uart=0x3f201000) at kernel/uart/pl011.c:49
#2 0x0000806c in uart_init () at kernel/uart/uart.c:12
#3 0x00008030 in kernel_init (r0=0, r1=0, atags=0) at kernel/boot/start.c:10
#4 0x00008008 in _start () at kernel/boot/boot.S:6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
Here's the register dump from right before the line that causes the unpredictable behavior:
r0 0x3f201000 1059065856
r1 0x1c200 115200
r2 0x7ff 2047
r3 0x0 0
r4 0x0 0
r5 0x0 0
r6 0x0 0
r7 0x0 0
r8 0x0 0
r9 0x0 0
r10 0x0 0
r11 0x7fcc 32716
r12 0x0 0
sp 0x7fb0 0x7fb0
lr 0x837c 33660
pc 0x8248 0x8248 <pl011_set_baud_rate+20>
cpsr 0x600001d3 1610613203
My Makefile is:
INCLUDES=include
INCLUDE_PARAMS=$(foreach d, $(INCLUDES), -I$d)
CC=arm-none-eabi-gcc
C_SOURCES:=kernel/boot/start.c kernel/uart/uart.c kernel/uart/pl011.c
AS_SOURCES:=kernel/boot/boot.S
SOURCES=$(C_SOURCES)
SOURCES+=$(AS_SOURCES)
OBJECTS=
OBJECTS+=$(C_SOURCES:.c=.o)
OBJECTS+=$(AS_SOURCES:.S=.o)
CFLAGS=-std=gnu99 -Wall -Wextra -fpic -ffreestanding -mcpu=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard
LDFLAGS=-ffreestanding -nostdlib
LIBS=-lgcc
DEBUG_FLAGS=
BINARY=kernel.bin
.PHONY: all clean debug
all: $(BINARY)
debug: DEBUG_FLAGS += -ggdb
debug: $(BINARY)
$(BINARY): $(OBJECTS)
$(CC) -T linker.ld $(LDFLAGS) $(LIBS) $(OBJECTS) -o $(BINARY)
%.o: %.c
$(CC) $(INCLUDE_PARAMS) $(CFLAGS) $(DEBUG_FLAGS) -c $< -o $@
%.o: %.S
$(CC) $(INCLUDE_PARAMS) $(CFLAGS) $(DEBUG_FLAGS) -c $< -o $@
clean:
rm $(BINARY) $(OBJECTS)
As you can see I'm linking against libgcc (-lgcc) and using -mfpu=neon-vfpv4 -mfloat-abi=hard, so at the very least GCC should supply its own floating-point support routines from libgcc.
Can anyone point me in the right direction for debugging this issue?
I suspect I'm either using the incorrect compiler arguments and not loading the correct function for floating-point division, or there's some issue with the stack.
Can anyone shed any insight here?
Did you check to see that the fpu coprocessor(s) were enabled?
On the original pi1/pi-zero I use this
;@ enable fpu
mrc p15, 0, r0, c1, c0, 2
orr r0,r0,#0x300000 ;@ single precision
orr r0,r0,#0xC00000 ;@ double precision
mcr p15, 0, r0, c1, c0, 2
mov r0,#0x40000000
fmxr fpexc,r0
the last couple of lines were probably there to intentionally crash if it didn't work.
you may have an ARMv7 or an ARMv8 core in the pi2, unfortunately, as there are two variations. Either way, I suspect the specific registers and instructions will differ from the ARMv6-based Raspberry Pi sequence above.
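For the Cortex-A7 in the Pi 2 (ARMv7-A with VFPv4/NEON), the sequence is similar in spirit: grant full access to coprocessors CP10/CP11 in CPACR, then set FPEXC.EN. A sketch under those assumptions (untested):

```
@ grant full (privileged + user) access to CP10/CP11, the VFP/NEON coprocessors
mrc p15, 0, r0, c1, c0, 2
orr r0, r0, #(0xF << 20)
mcr p15, 0, r0, c1, c0, 2
isb
@ FPEXC.EN (bit 30) actually turns the FPU on
mov r0, #(1 << 30)
vmsr fpexc, r0
```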
By going off the rails, do you mean your exception handlers are invoked?
If so, qemu has debug options which can help find the exception which is raised. Check qemu-system-arm -M raspi2 -d help.
We can start by enabling int,cpu_reset,unimp,guest_errors.
I have followed some tutorials on the web and created my own kernel. It boots with GRUB under QEMU successfully. But I have the problem described in this SO question, and I cannot solve it. I could use the workaround described there, but I also need global variables; they would make the job easier. I do not understand what I should change in the linker script to properly use global variables and inline string literals.
main.c
struct grub_signature {
unsigned int magic;
unsigned int flags;
unsigned int checksum;
};
#define GRUB_MAGIC 0x1BADB002
#define GRUB_FLAGS 0x0
#define GRUB_CHECKSUM (-1 * (GRUB_MAGIC + GRUB_FLAGS))
struct grub_signature gs __attribute__ ((section (".grub_sig"))) =
{ GRUB_MAGIC, GRUB_FLAGS, GRUB_CHECKSUM };
void putc(unsigned int pos, char c){
char* video = (char*)0xB8000;
video[2 * pos ] = c;
video[2 * pos + 1] = 0x3F;
}
void puts(char* str){
int i = 0;
while(*str){
putc(i++, *(str++));
}
}
void main (void)
{
char txt[] = "MyOS";
puts("where is this text"); // does not work, puts(txt) works.
while(1){};
}
Makefile:
CC = gcc
LD = ld
CFLAGS = -Wall -nostdlib -ffreestanding -m32 -g
LDFLAGS = -T linker.ld -nostdlib -n -melf_i386
SRC = main.c
OBJ = ${SRC:.c=.o}
all: kernel
.c.o:
@echo CC $<
@${CC} -c ${CFLAGS} $<
kernel: ${OBJ} linker.ld
@echo CC -c -o $@
@${LD} ${LDFLAGS} -o kernel ${OBJ}
clean:
@echo cleaning
@rm -f ${OBJ} kernel
.PHONY: all
linker.ld
OUTPUT_FORMAT("elf32-i386")
ENTRY(main)
SECTIONS
{
.grub_sig 0xC0100000 : AT(0x100000)
{
*(.grub_sig)
}
.text :
{
*(.text)
}
.data :
{
*(.data)void main (void)
}
.bss :
{
*(.bss)
}
/DISCARD/ :
{
*(.comment)
*(.eh_frame)
}
}
What works:
void main (void)
{
char txt[] = "MyOS";
puts(txt);
while(1) {}
}
What does not work:
1)
char txt[] = "MyOS";
void main (void)
{
puts(txt);
while(1) {}
}
2)
void main (void)
{
puts("MyOS");
while(1) {}
}
Output of assembly: (external link, because it is a little long) http://hastebin.com/gidebefuga.pl
If you look at objdump -h output, you'll see that virtual and linear addresses do not match for any of the sections. If you look at objdump -d output, you'll see that the addresses are all in the 0xC0100000 range.
However, you do not provide any addressing information in the multiboot header structure; you only provide the minimum three fields. Instead, the boot loader will pick a good address (1M on x86, i.e. 0x00100000, for both virtual and linear addresses), and load the code there.
One might think that that kind of discrepancy should cause the kernel to not run at all, but it just happens that the code generated by the above main.c does not use the addresses for anything except read-only constants. In particular, GCC generates jumps and calls that use relative addresses (signed offsets relative to the address of the next instruction on x86), so the code still runs.
There are two solutions, first one trivial.
Most bootloaders on x86 load the image at the smallest allowed virtual and linear address, 1M (= 0x00100000 = 1048576). Therefore, if you tell your linker script to use both virtual and linear addresses starting at 0x00100000, i.e.
.grub_sig 0x00100000 : AT(0x100000)
{
*(.grub_sig)
}
your kernel will Just Work. I have verified this fixes the issue you are having, after removing the extra void main (void) from your linker script, of course. To be specific, I constructed a 33 MB virtual disk containing one ext2 partition, installed grub2 (1.99-21ubuntu3.10) and the above kernel on it, and ran the image successfully under qemu-kvm 1.0 (1.0+noroms-0ubuntu14.11).
The second option is to set bit 16 in the multiboot flags and supply the five additional words that tell the bootloader where the code expects to be resident. However, 0xC0100000 will not work (at least grub2 will just freak out and reboot), whereas something like 0x00200000 works fine. This is because multiboot is really designed for virtual == linear addresses, and there may be other stuff already present at the highest addresses (similar to why addresses below 1M are avoided).
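For reference on this second option, the header with flags bit 16 set gains five address words. A hypothetical sketch in nasm syntax following the Multiboot 0.6.96 field order; the 0x00200000 base and the entry offset are illustrative assumptions, not values from the question:

```
MAGIC equ 0x1BADB002
FLAGS equ 1 << 16                 ; bit 16: the address fields below are valid
      dd MAGIC, FLAGS, -(MAGIC + FLAGS)
      dd 0x00200000               ; header_addr: where this header will reside
      dd 0x00200000               ; load_addr: start of the text segment
      dd 0                        ; load_end_addr: 0 = load the whole image
      dd 0                        ; bss_end_addr: 0 = no bss to clear
      dd 0x00200020               ; entry_addr: entry point (right after header)
```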
Note that the boot loader does not provide you with a stack, so it's a bit of a surprise the code works at all.
I personally recommend you use a simple assembler file to construct the signature and reserve some stack space. For example, start.asm, simplified from here:
BITS 32
EXTERN main
GLOBAL start
SECTION .grub_sig
signature:
MAGIC equ 0x1BADB002
FLAGS equ 0
dd MAGIC, FLAGS, -(MAGIC+FLAGS)
SECTION .text
start:
mov esp, _sys_stack ; End of stack area
call main
jmp $ ; Infinite loop
SECTION .bss
resb 16384 ; reserve 16384 bytes for stack
_sys_stack: ; end of stack
compile using
nasm -f elf start.asm -o start.o
and modify your linker script to use start instead of main as the entry point,
ENTRY(start)
Remove the multiboot stuff from your main.c, then compile and link to kernel using e.g.
gcc -Wall -nostdlib -ffreestanding -fno-stack-protector -O3 -fomit-frame-pointer -m32 -c main.c -o main.o
ld -T linker.ld -nostdlib -n -melf_i386 start.o main.o -o kernel
and you have a good start to work on your own kernel.
Questions? Comments?
I'm trying to use dladdr. It correctly locates the library, but it does not find the function name. I can call objdump, do a little math, and get the address of the function that I pass to dladdr. If objdump can see it, why can't dladdr?
Here is my function:
const char *FuncName(const void *pFunc)
{
Dl_info DlInfo;
int nRet;
// Lookup the name of the function given the function pointer
if ((nRet = dladdr(pFunc, &DlInfo)) != 0)
return DlInfo.dli_sname;
return NULL;
}
Here is a gdb transcript showing what I get.
Program received signal SIGINT, Interrupt.
[Switching to Thread 0xf7f4c6c0 (LWP 28365)]
0xffffe410 in __kernel_vsyscall ()
(gdb) p MatchRec8Cmp
$2 = {void (TCmp *, TWork *, TThread *)} 0xf1b62e73 <MatchRec8Cmp>
(gdb) call FuncName(MatchRec8Cmp)
$3 = 0x0
(gdb) call FuncName(0xf1b62e73)
$4 = 0x0
(gdb) b FuncName
Breakpoint 1 at 0xf44bdddb: file threads.c, line 3420.
(gdb) call FuncName(MatchRec8Cmp)
Breakpoint 1, FuncName (pFunc=0xf1b62e73) at threads.c:3420
3420 {
The program being debugged stopped while in a function called from GDB.
When the function (FuncName) is done executing, GDB will silently
stop (instead of continuing to evaluate the expression containing
the function call).
(gdb) s
3426 if ((nRet = dladdr(pFunc, &DlInfo)) != 0)
(gdb)
3427 return DlInfo.dli_sname;
(gdb) p DlInfo
$5 = {dli_fname = 0x8302e08 "/xxx/libdata.so", dli_fbase = 0xf1a43000, dli_sname = 0x0, dli_saddr = 0x0}
(gdb) p nRet
$6 = 1
(gdb) p MatchRec8Cmp - 0xf1a43000
$7 = (void (*)(TCmp *, TWork *, TThread *)) 0x11fe73
(gdb) q
The program is running. Exit anyway? (y or n) y
Here is what I get from objdump:
$ objdump --syms /xxx/libdata.so | grep MatchRec8Cmp
0011fe73 l F .text 00000a98 MatchRec8Cmp
Sure enough, 0011fe73 = MatchRec8Cmp - 0xf1a43000. Does anyone know why dladdr can't return dli_sname = "MatchRec8Cmp"?
I'm running Red Hat Enterprise Linux Server release 5.4 (Tikanga). I have seen this work before. Maybe it's my compile switches:
CFLAGS = -m32 -march=i686 -msse3 -ggdb3 -pipe -fno-common -fomit-frame-pointer \
-Ispio -fms-extensions -Wmissing-declarations -Wstrict-prototypes -Wunused -Wall \
-Wno-multichar -Wdisabled-optimization -Wmissing-prototypes -Wnested-externs \
-Wpointer-arith -Wextra -Wno-sign-compare -Wno-sequence-point \
-I../../../include -I/usr/local/include -fPIC \
-D$(Uname) -D_REENTRANT -D_GNU_SOURCE
I have tried it with -g instead of -ggdb3, although I don't think debugging symbols have anything to do with the ELF dynamic symbol table.
If objdump can see it, why can't dladdr
dladdr can only see functions exported in the dynamic symbol table. Most likely
nm -D /xxx/libdata.so | grep MatchRec8Cmp
shows nothing. Indeed, your objdump output marks the symbol as local (the l flag), which confirms this is the cause.
The symbol is local either because it has a hidden visibility, is static, or because you hide it in some other way (e.g. with a linker script).
Update:
Those marked with the 'U' work with dladdr. They get "exported" automatically somehow.
They work because they are exported from some other shared library. The U stands for unresolved, i.e. defined elsewhere.
I added -rdynamic to my LDFLAGS.
man gcc says:
-rdynamic
Pass the flag -export-dynamic to the ELF linker, on targets that support it. This instructs the linker to add all symbols, not only used ones, to the dynamic symbol table. This option is needed for some uses of "dlopen" or to allow obtaining backtraces from within a program.
Adding the gcc option "-export-dynamic" solved this for me.
hinesmr's solution worked for me. The exact option I passed to gcc was "-Wl,--export-dynamic", and all the functions became visible to dladdr.