I need to recover my source code from the executable - c

It's the middle of the night, and I've accidentally overwritten all my work by typing
gcc source.c -o source.c
I still have the original binary and my only hope is to disassemble it, but I don't know how, or which tool gives the most readable result. I know this probably isn't the right place to post but I'm stressing out. Can someone help me out please?

Thanks for uploading the file. As I suspected, it was unstripped so the function names remained. Besides standard boilerplate code I could identify functions main, register_broker, connect_exchange (unused and empty) and handle_requests.
I spent a bit of time in IDA Pro and it wasn't too hard to recover the main() function. First, here's the original, unmodified listing of main() from IDA: http://pastebin.com/sBxhRJMM
To proceed, you need to familiarize yourself with the AMD64 (System V) calling convention. To summarize, the first six integer or pointer arguments are passed in RDI(EDI), RSI(ESI), RDX(EDX), RCX(ECX), R8 and R9. The rest are passed on the stack, but all calls in main() use at most four arguments, so we only need the first four registers here.
IDA has helpfully labeled arguments of the standard C functions and even renamed some local variables. However, it can be improved and commented further. For example, since we're in main(), we know that argc (first argument) comes from EDI (since it's an int meaning 32-bit, it uses only the low half of RDI) and argv comes from RSI (it's a pointer so it uses the full 8 bytes of the register). So, we can rename the local variables into which EDI and RSI are copied:
mov [rbp+argc], edi
mov [rbp+argv], rsi
Next is a simple conditional block:
cmp [rbp+argc], 2
jz short loc_400EB3
mov rax, cs:stderr@@GLIBC_2_2_5
mov rdx, rax
mov eax, offset aUsage ; "Usage"
mov rcx, rdx ; s
mov edx, 5 ; n
mov esi, 1 ; size
mov rdi, rax ; ptr
call _fwrite
mov edi, 1 ; status
call _exit
Here we compare argc with 2, and if it is equal, we jump further in the code. If it is not equal, we call fwrite(). Its first argument is in rdi, and rdi is loaded from rax, which holds the address of the constant string "Usage". The second argument is in esi and is 1, the third is in edx and is 5, and the fourth is in rcx, which is loaded from rdx, which holds the value of stderr@@GLIBC_2_2_5 (basically a fancy reference to the stderr variable from libc). Putting it all together, we get:
fwrite("Usage", 1, 5, stderr);
From my experience, this is most likely an fprintf call that the compiler replaced with fwrite, since 5 is exactly the length of the string. I.e. the original code probably was:
fprintf(stderr, "Usage");
Next call is a simple exit(1);. Combining both with the comparison, we get:
if ( argc != 2 )
{
fprintf(stderr, "Usage");
exit(1);
}
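To make the fwrite/fprintf equivalence concrete, here is a minimal sketch that writes "Usage" both ways into a memory-backed stream so the results can be compared. The helper name emit_usage and the use of fmemopen() are my own choices for illustration; they are not part of the recovered program.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>

/* Writes "Usage" into buf through a memory-backed stream, either with
   fprintf() or with the fwrite() call seen in the disassembly.
   Returns the number of bytes written, or -1 on error.
   (emit_usage is a hypothetical helper, not from the original binary.) */
static int emit_usage(char *buf, size_t size, int use_fwrite)
{
    memset(buf, 0, size);
    FILE *out = fmemopen(buf, size, "w");
    if (out == NULL)
        return -1;

    int n;
    if (use_fwrite)
        n = (int)fwrite("Usage", 1, 5, out);  /* what the binary calls */
    else
        n = fprintf(out, "Usage");            /* what the source likely said */

    if (fclose(out) != 0)
        return -1;
    return n;
}
```

Both paths should leave the same five bytes in the buffer, which is why the compiler is free to pick the cheaper fwrite() when the format string contains no conversions.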
Continuing in this vein, we can identify other calls and variables they use. It's somewhat tedious to describe it all, so I uploaded a commented version of the disassembly, where I tried to show the equivalent C code for each call. You can see it here: http://pastebin.com/p5sRSwgQ
From that commented version it's not very hard to imagine a possible version of main():
int main(int argc, char **argv)
{
if ( argc != 2 )
{
fprintf(stderr, "Usage");
exit(1);
}
char name[256];
gethostname(name, sizeof(name));
struct hostent* _hostent = gethostbyname(name);
struct in_addr *_addr0 = (struct in_addr *)(_hostent->h_addr_list[0]);
struct sockaddr_in addr;
addr.sin_family = AF_INET;
addr.sin_port = htons(0);
addr.sin_addr.s_addr = _addr0->s_addr;
char *tmp = (char *)malloc(6);
sprintf(tmp, "%d", addr.sin_port);
char *ip_str = inet_ntoa(*_addr0);
char *newbuf = (char *)malloc(strlen(argv[1]) + strlen(ip_str) + strlen(tmp) + 5);
strcpy(newbuf, "r");
strcat(newbuf, " ");
strcat(newbuf, argv[1]);
strcat(newbuf, " ");
strcat(newbuf, ip_str);
strcat(newbuf, " ");
strcat(newbuf, tmp);
register_broker(newbuf);
int fd = socket(PF_INET, SOCK_STREAM, 0);
if ( fd < 0 )
{
perror("Error creating socket");
exit(1);
}
if ( bind(fd, (struct sockaddr*)&addr, sizeof(addr)) != 0 )
{
perror("Error binding socket");
exit(1);
}
if ( listen(fd, 0x80) != 0 )
{
perror("Error listening on socket");
exit(1);
}
handle_requests(fd);
}
Recovering the other two functions is left as an exercise for the reader :)

There are several tools (you can search with Google), but I would suggest re-coding it. The time you will spend cleaning up what a disassembler returns is probably greater than the time needed to re-code.
I know it seems obvious, but the correct answer would be: restore from a backup (which you should have).

There is unfortunately really no good way to go from the binary back to the source. You can try Boomerang, but I really don't expect good results.

Firstly, look for a backup source file. Most editors create files named .bak or filename.c~ with each file save. On a Windows machine, a forensic software tool might be able to retrieve the last source file(s). The tool I wrote, getfile used to be offered by NTI, but was acquired by Armor Holdings a few years ago—no idea if it is still available.
If the code is runnable, running it under the strace utility (a standard component of Linux distributions) can often help with some aspects of decoding the program, especially if it is I/O oriented. Alas, if the program is mostly internal data manipulation, this is not of much use. strace creates a log of the system calls made by the program and the parameters passed to them; it is an invaluable tool at times for understanding how a program behaves. For example, strace date produces (in part; I've omitted the runtime library startup):
clock_gettime(CLOCK_REALTIME, {1315760058, 681379835}) = 0
open("/etc/localtime", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb78b5000
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0\0"..., 4096) = 2819
_llseek(3, -24, [2795], SEEK_CUR) = 0
read(3, "\nPST8PDT,M3.2.0,M11.1.0\n", 4096) = 24
_llseek(3, 2818, [2818], SEEK_SET) = 0
close(3) = 0
munmap(0xb78b5000, 4096) = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb78b5000
write(1, "Sun Sep 11 09:54:18 PDT 2011\n", 29Sun Sep 11 09:54:18 PDT 2011) = 29
close(1) = 0
munmap(0xb78b5000, 4096) = 0
close(2) = 0
As soon as you have anything worth saving:
Add some sort of source control (git, svn, cvs, ...) maybe more than one
Use an automated build tool, like make to avoid silly mistakes
Make backups once in a while. Even when I am at a stone-knives-and-bear-skins client, I can still email myself source files for a last-ditch backup mechanism.

You can use dcc. But, next time, you should use Git ;)

You can try disassembling with objdump -d <filename>.
You can also look at the symbol names with the nm utility to jog your memory and help recode the source.
The commercial IDA Pro disassembler/debugger is popular in software reverse engineering. Unfortunately, reverse engineering a binary is slow and difficult work.

seccomp --- how to EXIT_SUCCESS?

How do I exit with EXIT_SUCCESS after strict-mode seccomp is set? Is it correct practice to call syscall(SYS_exit, EXIT_SUCCESS); at the end of main?
#include <stdlib.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <linux/seccomp.h>
#include <sys/syscall.h>
int main(int argc, char **argv) {
prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);
//return EXIT_SUCCESS; // does not work
//_exit(EXIT_SUCCESS); // does not work
// syscall(__NR_exit, EXIT_SUCCESS); // (EDIT) This works! Is this the ultimate answer and the right way to exit success from seccomp-ed programs?
syscall(SYS_exit, EXIT_SUCCESS); // (EDIT) works; SYS_exit equals __NR_exit
}
// gcc seccomp.c -o seccomp && ./seccomp; echo "${?}" # I want 0
As explained in eigenstate.org and in SECCOMP (2):
The only system calls that the calling thread is permitted to
make are read(2), write(2), _exit(2) (but not exit_group(2)),
and sigreturn(2). Other system calls result in the delivery
of a SIGKILL signal.
As a result, one would expect _exit() to work, but it's a wrapper function that invokes exit_group(2) which is not allowed in strict mode ([1], [2]), thus the process gets killed.
It's even reported in exit(2) - Linux man page:
In glibc up to version 2.3, the _exit() wrapper function invoked the kernel system call of the same name. Since glibc 2.3, the wrapper function invokes exit_group(2), in order to terminate all of the threads in a process.
The same happens with the return statement: returning from main() makes the C runtime call exit(), which also ends in exit_group(2), so the process is killed in much the same way as with _exit().
Stracing the process will provide further confirmation (to allow this to show up, you have to not set PR_SET_SECCOMP; just comment prctl()) and I got similar output for both non-working cases:
linux12:/home/users/grad1459>gcc seccomp.c -o seccomp
linux12:/home/users/grad1459>strace ./seccomp
execve("./seccomp", ["./seccomp"], [/* 24 vars */]) = 0
brk(0) = 0x8784000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb775f000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=97472, ...}) = 0
mmap2(NULL, 97472, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7747000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220\226\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
mmap2(NULL, 1739484, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xdd0000
mmap2(0xf73000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xf73000
mmap2(0xf76000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf76000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7746000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7746900, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xf73000, 8192, PROT_READ) = 0
mprotect(0x8049000, 4096, PROT_READ) = 0
mprotect(0x16e000, 4096, PROT_READ) = 0
munmap(0xb7747000, 97472) = 0
exit_group(0) = ?
linux12:/home/users/grad1459>
As you can see, exit_group() is called, explaining everything!
Now as you correctly stated, "SYS_exit equals __NR_exit"; for example it's defined in mit.syscall.h:
#define SYS_exit __NR_exit
so the last two calls are equivalent, i.e. you can use the one you like, and the output should be this:
linux12:/home/users/grad1459>gcc seccomp.c -o seccomp && ./seccomp ; echo "${?}"
0
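As a rough, self-checking variant of the same experiment, the sketch below forks a child, lets it enter strict mode and leave through the raw exit syscall, and reports the exit status to the parent. The function name is made up for illustration, and the prctl() result is deliberately ignored because strict mode may be refused in environments that already run under a seccomp filter.

```c
#define _GNU_SOURCE
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that enters strict seccomp mode and leaves via the raw
   exit syscall. Returns the child's exit status, or -1 if the fork failed
   or the child was killed by a signal.
   (Hypothetical helper name, used only for this sketch.) */
static int exit_status_under_strict_seccomp(int status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Result deliberately ignored: strict mode can be refused, for
           example when a sandbox has already installed a filter. The raw
           exit below behaves the same either way. */
        (void)prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);
        syscall(SYS_exit, status);  /* exit(2), not exit_group(2) */
    }

    int wstatus;
    if (waitpid(pid, &wstatus, 0) != pid)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

If the child had used _exit() or returned from main instead, the parent would see it killed by a signal under strict mode, and this helper would return -1.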
PS
You could of course define a filter yourself and use:
prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, filter);
as explained in the eigenstate link, to allow _exit() (or, strictly speaking, exit_group(2)), but do that only if you really need to and know what you are doing.
The problem occurs, because the GNU C library uses the exit_group syscall, if it is available, in Linux instead of exit, for the _exit() function (see sysdeps/unix/sysv/linux/_exit.c for verification), and as documented in the man 2 prctl, the exit_group syscall is not allowed by the strict seccomp filter.
Because the _exit() function call occurs inside the C library, we cannot interpose it with our own version (that would just do the exit syscall). (The normal process cleanup is done elsewhere; in Linux, the _exit() function only does the final syscall that terminates the process.)
We could ask the GNU C library developers to use the exit_group syscall in Linux only when there is more than one thread in the current process, but unfortunately, it would not be easy, and even if added right now, it would take quite some time for the feature to be available on most Linux distributions.
Fortunately, we can ditch the default strict filter, and instead define our own. There is a small difference in behaviour: the apparent signal that kills the process will change from SIGKILL to SIGSYS. (The signal is not actually delivered, as the kernel does kill the process; only the apparent signal number that caused the process to die changes.)
Furthermore, this is not even that difficult. I did waste a bit of time looking into some GCC macro trickery that would make it trivial to manage the allowed syscalls' list, but I decided it would not be a good approach: the list of allowed syscalls should be carefully considered -- we only add exit_group() compared to the strict filter, here! -- so making it a bit difficult is okay.
The following code, say example.c, has been verified to work on a 4.4 kernel (should work on kernels 3.5 or later) on x86-64 (for both x86 and x86-64, i.e. 32-bit and 64-bit binaries). It should work on all Linux architectures, however, and it does not require or use the libseccomp library.
#define _GNU_SOURCE
#include <stdlib.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/seccomp.h>
#include <linux/filter.h>
#include <stdio.h>
static const struct sock_filter strict_filter[] = {
BPF_STMT(BPF_LD | BPF_W | BPF_ABS, (offsetof (struct seccomp_data, nr))),
BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_rt_sigreturn, 5, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_read, 4, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_write, 3, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_exit, 2, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_exit_group, 1, 0),
BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW)
};
static const struct sock_fprog strict = {
.len = (unsigned short)( sizeof strict_filter / sizeof strict_filter[0] ),
.filter = (struct sock_filter *)strict_filter
};
int main(void)
{
/* To be able to set a custom filter, we need to set the "no new privs" flag.
The Documentation/prctl/no_new_privs.txt file in the Linux kernel
recommends this exact form: */
if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
fprintf(stderr, "Cannot set no_new_privs: %m.\n");
return EXIT_FAILURE;
}
if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &strict)) {
fprintf(stderr, "Cannot install seccomp filter: %m.\n");
return EXIT_FAILURE;
}
/* The seccomp filter is now active.
It differs from SECCOMP_SET_MODE_STRICT in two ways:
1. exit_group syscall is allowed; it just terminates the
process
2. Parent/reaper sees SIGSYS as the killing signal instead of
SIGKILL, if the process tries to do a syscall not in the
explicitly allowed list
*/
return EXIT_SUCCESS;
}
Compile using e.g.
gcc -Wall -O2 example.c -o example
and run using
./example
or under strace to see the system calls it makes:
strace ./example
The strict_filter BPF program is really trivial. The first opcode loads the syscall number into the accumulator. The next five opcodes compare it to an acceptable syscall number, and if found, jump to the final opcode that allows the syscall. Otherwise the second-to-last opcode kills the process.
Note that although the documentation refers to sigreturn being the allowed syscall, the actual name of the syscall in Linux is rt_sigreturn. (sigreturn was deprecated in favour of rt_sigreturn ages ago.)
Furthermore, when the filter is installed, the opcodes are copied to kernel memory (see kernel/seccomp.c in the Linux kernel sources), so it does not affect the filter in any way if the data is modified later. Having the structures static const has zero security impact, in other words.
I used static since there is no need for the symbols to be visible outside this compilation unit (or in a stripped binary), and const to put the data into the read-only data section of the ELF binary.
The form of a BPF_JUMP(BPF_JMP | BPF_JEQ, nr, equals, differs) is simple: the accumulator (the syscall number) is compared to nr. If they are equal, then the next equals opcodes are skipped. Otherwise, the next differs opcodes are skipped.
Since the equals cases jump to the very final opcode, you can add new opcodes at the top (that is, just after the initial opcode), incrementing the equals skip count for each one.
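The skip-count arithmetic can be checked without installing anything. The sketch below extends the filter with one extra syscall at the top (getpid, picked only as an example) and runs it through a tiny user-space interpreter that understands just the three opcodes used here. The interpreter is my own illustration of the jump semantics, not the kernel's implementation.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <linux/seccomp.h>
#include <linux/filter.h>

/* The strict_filter layout with one extra entry (getpid, chosen only for
   illustration) inserted right after the load. Its "equals" skip is 6,
   one more than the entry that follows it. */
static const struct sock_filter extended_filter[] = {
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, (offsetof(struct seccomp_data, nr))),
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_getpid,       6, 0),
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_rt_sigreturn, 5, 0),
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_read,         4, 0),
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_write,        3, 0),
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_exit,         2, 0),
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_exit_group,   1, 0),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW)
};

/* Toy interpreter for the three opcodes above. Returns the filter's
   verdict for syscall number nr. Anything it does not understand is
   treated as a kill, which is an assumption of this sketch. */
static unsigned int filter_verdict(unsigned int nr)
{
    const size_t len = sizeof extended_filter / sizeof extended_filter[0];
    unsigned int acc = 0;

    for (size_t pc = 0; pc < len; pc++) {
        const struct sock_filter *op = &extended_filter[pc];
        switch (op->code) {
        case BPF_LD | BPF_W | BPF_ABS:
            acc = nr;  /* only the load of seccomp_data.nr is modelled */
            break;
        case BPF_JMP | BPF_JEQ | BPF_K:
            /* Skip jt opcodes on a match, jf otherwise; the loop's own
               pc++ then moves on to the opcode after the skipped ones. */
            pc += (acc == op->k) ? op->jt : op->jf;
            break;
        case BPF_RET | BPF_K:
            return op->k;
        default:
            return SECCOMP_RET_KILL;
        }
    }
    return SECCOMP_RET_KILL;
}
```

Calling filter_verdict(SYS_getpid) walks from the second opcode straight to the final allow, while a syscall that matches nothing falls through every comparison into the kill.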
Note that printf() will not work after the seccomp filter is installed, because internally, the C library wants to do a fstat syscall (on standard output), and a brk syscall to allocate some memory for a buffer.

Doing an ASM call / ret in C

I'm trying to do a simple call / ret sequence in assembly (from C code compiled with GCC), by manually writing the ret opcode and making a call to its address:
void *addr;
addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0);
// Writing the ret op code
((char*)addr)[0] = 0xC3;
// Going to addr with the ret
asm volatile("call *%0" : : "r" (addr));
But I get a segmentation fault. Does anyone know why, and how to fix it?
In order to be able to execute instructions on a memory page, read and write privileges are not enough; it also needs to be marked executable (PROT_EXEC).
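A minimal sketch of the fix, under a couple of assumptions of mine: it writes the opcode while the page is writable and then switches the page to read+execute with mprotect() (rather than mapping it writable and executable at once), and it calls the stub through a function pointer instead of the inline asm from the question. The function name call_ret_stub is made up for illustration.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Returns 0 if the stub was mapped (and, on x86, called) successfully,
   -1 on any mmap/mprotect/munmap failure. */
static int call_ret_stub(void)
{
    const size_t len = 4096;
    unsigned char *code = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (code == MAP_FAILED)
        return -1;

    code[0] = 0xC3;  /* x86 "ret" opcode */

    /* Flip the page from writable to executable before jumping into it. */
    if (mprotect(code, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(code, len);
        return -1;
    }

#if defined(__x86_64__) || defined(__i386__)
    /* 0xC3 only means "ret" on x86, so only call it there. */
    void (*stub)(void) = (void (*)(void))(uintptr_t)code;
    stub();
#endif

    return munmap(code, len);
}
```

Adding PROT_EXEC directly to the original mmap() call should also work on systems that allow pages to be writable and executable at the same time.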

getaddrinfo hangs forever when linked with sqlite3

I have a program which requires a DNS query and a sqlite3 DB connection.
I have determined that it hangs indefinitely at a getaddrinfo() call. So I created a test program (from busybox's nslookup.c) with only this call. When I do not link the libsqlite3 it works as expected. The code segment is as follows:
#include <stdio.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <resolv.h>
#include <string.h>
#include <signal.h>
static int sockaddr_to_dotted(struct sockaddr *saddr, char *buf, int buflen)
{
if (buflen <= 0) return -1;
buf[0] = '\0';
if (saddr->sa_family == AF_INET)
{
inet_ntop(AF_INET, &((struct sockaddr_in*)saddr)->sin_addr, buf, buflen);
return 0;
}
if (saddr->sa_family == AF_INET6)
{
inet_ntop(AF_INET6, &((struct sockaddr_in6*)saddr)->sin6_addr, buf, buflen);
return 0;
}
return -1;
}
static int print_host(const char *hostname, const char *header)
{
char str[128]; /* IPv6 address will fit, hostnames hopefully too */
struct addrinfo *result = NULL;
int rc;
struct addrinfo hint;
memset(&hint, 0, sizeof(hint));
/* hint.ai_family = AF_UNSPEC; - zero anyway */
/* Needed. Or else we will get each address thrice (or more)
* for each possible socket type (tcp,udp,raw...): */
hint.ai_socktype = SOCK_STREAM;
// hint.ai_flags = AI_CANONNAME;
printf("BEFORE GETADDRINFO\n");
rc = getaddrinfo(hostname, NULL /*service*/, &hint, &result);
printf("AFTER GETADDRINFO\n");
if (!rc)
{
struct addrinfo *cur = result;
// printf("%s\n", cur->ai_canonname); ?
while (cur)
{
sockaddr_to_dotted(cur->ai_addr, str, sizeof(str));
printf("%s %s\nAddress: %s\n", header, hostname, str);
str[0] = ' ';
if (getnameinfo(cur->ai_addr, cur->ai_addrlen, str + 1,
sizeof(str) - 1, NULL, 0, NI_NAMEREQD))
str[0] = '\0';
puts(str);
cur = cur->ai_next;
}
}
else
{
printf("getaddrinfo('%s') failed: %s", hostname, gai_strerror(rc));
}
freeaddrinfo(result);
return (rc != 0);
}
int main(int argc, char **argv)
{
if (argc != 2)
return -1;
res_init();
return print_host(argv[1], "Name: ");
}
I can only see "BEFORE GETADDRINFO" on the output.
I also tried to strace the program. (My dns server is 192.168.11.11, and queried "www.google.com") This is where it suspends:
socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.11.11")}, 16) = 0
send(3, "\0\2\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\1\0\1", 32, 0) = 32
pselect6(4, [3], NULL, NULL, {10, 0}, 0) = 1 (in [3], left {9, 988000000})
recv(3, "\0\2\201\200\0\1\0\5\0\0\0\0\3www\6google\3com\0\0\1\0"..., 512, 0) = 112
close(3) = 0
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend([]
My compiler is bfin-linux-uclibc-gcc (gcc version 4.1.2)
I cross compiled sqlite3 for bfin-linux-uclibc (version 3.6.23)
I appreciate any comment, help, debug procedure suggestion.
output of strace -e trace=file mybinary:
stat("/etc/ld.so.cache", {st_mode=S_IFREG|0644, st_size=1073, ...}) = 0
open("/etc/ld.so.cache", O_RDONLY) = 3
open("/lib/libsqlite3.so.0", O_RDONLY) = 3
open("/lib/libstdc++.so.6", O_RDONLY) = 3
open("/lib/libm.so.0", O_RDONLY) = 3
open("/lib/libgcc_s.so.1", O_RDONLY) = 3
open("/lib/libc.so.0", O_RDONLY) = 3
open("/lib/libdl.so.0", O_RDONLY) = 3
open("/lib/libpthread.so.0", O_RDONLY) = 3
open("/lib/libgcc_s.so.1", O_RDONLY) = 3
open("/lib/libc.so.0", O_RDONLY) = 3
open("/lib/libm.so.0", O_RDONLY) = 3
open("/lib/libgcc_s.so.1", O_RDONLY) = 3
open("/lib/libc.so.0", O_RDONLY) = 3
open("/lib/libc.so.0", O_RDONLY) = 3
open("/lib/libc.so.0", O_RDONLY) = 3
open("/lib/libc.so.0", O_RDONLY) = 3
open("/lib/libc.so.0", O_RDONLY) = 3
stat("/lib/ld-uClibc.so.0", {st_mode=S_IFREG|0755, st_size=29824, ...}) = 0
open("/etc/resolv.conf", O_RDONLY) = 3
open("/etc/hosts", O_RDONLY) = 3
Output of bfin-linux-uclibc-nm -g mybinary
00004fc4 A ___bss_start
w ___deregister_frame_info@@GCC_3.0
00004f10 D ___dso_handle
00004fc4 A __edata
00004fe0 A __end
00000d60 T __fini
U _freeaddrinfo
U _gai_strerror
U _getaddrinfo
U _getnameinfo
U _inet_ntop
00000534 T __init
w __Jv_RegisterClasses
00000aa4 T _main
U _printf
U _puts
w ___register_frame_info@@GCC_3.0
U ___res_init
00000e18 R __ROFIXUP_END__
00000de0 R __ROFIXUP_LIST__
00000670 T ___self_reloc
00020000 A __stacksize
0000060c T __start
U ___uClibc_main
Updated information shows libpthread being loaded, so the likely scenario is that SQLite was built with pthread support enabled (the default on most platforms), and your binary was not.
The clue is the presence of libpthread and the hang at rt_sigsuspend(). This is an explicit wait for a signal, and is very likely one thread waiting for another thread to exit, which of course never happens.
The background to this is that since C and the standard library/libc pre-date contemporary threading, there are many cases where the standard library or API is either not re-entrant or not thread-safe, or both. Back when dragons roamed the land it was common for the programmer to have to explicitly call alternate versions of such functions (names suffixed with "_r") or use alternate libraries (again usually with an "_r" suffix) to ensure that code behaved correctly. pthreads changed the programming interface for the better, but since thread-safety comes at a cost (performance, sometimes substantial, and code size) it's not enabled unless you ask for it.
When you use -pthread at least two things usually happen:
_REENTRANT is defined as a preprocessor macro, this may change compile time behaviour
libpthread is linked in (equivalent to -lpthread), this will change run-time behaviour
It would take some non-trivial debugging to be certain, but what probably happened is that your binary ended up mixing the stub pthread functions in uClibc with a handful of the real pthread functions. This is because libpthread was not loaded explicitly, only the pthread symbols referenced by libsqlite were imported.
uClibc contains (as does glibc) dummy pthread functions (run nm on libc.so to see them). These are defined as "weak" symbols; when the real libpthread is loaded explicitly, it takes over all entry points with its "strong" symbols. (These stubs exist so that thread-aware libraries can work with non-threaded programs without changes.)
Building your binary with an explicit -pthread eliminates this mismatch, and resolves the issue.
For debugging:
Run nm -g and ldd (the uClibc version) against your compiled binary, and check which symbols are in which library, and see if you can spot a mismatch. Setting LD_DEBUG=all when running your program should be useful too (you'll probably want to redirect stderr for that, there will be a lot of output).
The SQLite library has a .init section, but as far as I can tell it's a stub that doesn't call any internal functions, so simply linking shouldn't cause SQLite code to execute.
Since SQLite uses threads, make sure you built thread-safe, and are using the .so dynamic library.
When you link against your build of SQLite, make sure you use both -L (compile-time) and -R (run-time) library paths, usually something like this before compile & link will do the trick (amend the path as needed):
export CFLAGS=-L/usr/local/sqlite3/lib
export LDFLAGS=-R/usr/local/sqlite3/lib
Test program:
#include<stdio.h>
#include<sqlite3.h>
int main(int argc,char *argv[]) {
printf("SQLite version (compile): %s\n",SQLITE_VERSION);
printf("SQLite version (API): %s\n",sqlite3_libversion());
}
If you run this and get different versions, then something is definitely wrong with your build environment.
These guesses don't directly solve this problem, but I'll leave them here for the record:
Normally my first guess would usually be an NSS library run-time/compile-time library mismatch: as you're using the system getaddrinfo() NSS (name service switch) is involved. This will dlopen() various libraries to support various user/group/host databases, depending on /etc/nsswitch.conf, including local file, DNS, LDAP, Berkeley and quite possibly SQLite. Since uClibc doesn't support this (glibc style libnss_xxx.so), that's one thing ruled out...
There's another possibility: PAM does something similar, and may load an incompatible library (BerkeleyDB or possibly SQLite, as used by pam_userdb or pam-sqlite). Neither uClibc nor SQLite uses PAM though, and it's improbable that it's being linked by accident.
Since dlopen() is used you won't see such libraries (NSS or PAM) with ldd, running under strace -e trace=file should help to confirm what libraries are being used, without the usual volume of output.

mmap substitute for malloc

I need to find a way to use mmap instead of malloc. How is this possible? (I am not using libc, only syscalls.) And yes, brk() is possible. I used sbrk() but realized it's not a syscall... (x86 inline assembly)
I've been looking around and saw this: How to use mmap to allocate a memory in heap? But it didn't help me, because I got a segfault.
Basically, all I want to do is create 3 slabs of memory for storing characters.
Say,
char * x = malloc(1000);
char * y = malloc(2000);
char * z = malloc(3000);
How is this possible with mmap and how to free it later with munmap?
Did you carefully read the mmap(2) man page? I recommend reading it several times.
Notice that you can only ask the kernel (through mmap etc.) to manage memory that is page-aligned and a multiple of the page size, sysconf(_SC_PAGE_SIZE), which is often 4096 bytes (and I am assuming that in my answer).
Then you might do:
size_t page_size = sysconf(_SC_PAGE_SIZE);
assert (page_size == 4096); // otherwise this code is wrong
// 1000 bytes fit into 1*4096
char *x = mmap (NULL, page_size, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, (off_t)0);
if (x == MAP_FAILED) perror("mmap x"), exit (EXIT_FAILURE);
// 2000 bytes fit into 1*4096
char *y = mmap (NULL, page_size, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, (off_t)0);
if (y == MAP_FAILED) perror("mmap y"), exit (EXIT_FAILURE);
later to free the memory, use
if (munmap(x, page_size))
perror("munmap x"), exit(EXIT_FAILURE);
etc
If you want to allocate 5Kbytes, you'll need two pages (because 5Kbytes < 2*4096 and 5Kbytes > 1*4096) i.e. mmap(NULL, 2*page_size, ...
Actually, all of your x, y, z take only 6000 bytes and could fit into two, not three, pages... But then you could only munmap that memory together.
Be aware that mmap is a system call which might be quite expensive. malloc implementations take care to avoid calling it too often; that is why they keep previously free-d zones and reuse them later (in further malloc-s) without any syscall. In practice, most malloc implementations handle big allocations (e.g. more than a megabyte) differently: these are often mmap-ed at malloc time and munmap-ed at free time. You could study the source code of some malloc. The one from MUSL Libc might be easier to read than the Glibc malloc.
BTW, the file /proc/1234/maps is showing you the memory map of process of pid 1234. Try also cat /proc/self/maps in a terminal, it shows the memory map of that cat process.
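The page-rounding rule described above can be wrapped in a pair of small helpers. This is only a sketch: the names round_up_to_page, page_alloc and page_free are hypothetical, it goes through the libc mmap()/munmap() wrappers rather than raw syscalls, and the caller has to remember the requested size for the later free.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Rounds n up to a whole number of pages of size page. */
static size_t round_up_to_page(size_t n, size_t page)
{
    return (n + page - 1) / page * page;
}

/* Maps enough anonymous pages to hold n bytes; returns NULL on failure. */
static void *page_alloc(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGE_SIZE);
    void *p = mmap(NULL, round_up_to_page(n, page), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Unmaps a block obtained from page_alloc(n); returns 0 on success. */
static int page_free(void *p, size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGE_SIZE);
    return munmap(p, round_up_to_page(n, page));
}
```

With these, char *x = page_alloc(1000); and page_free(x, 1000); play the roles of malloc and free from the question, at the cost of at least one whole page per allocation.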
You can call mmap to make an anonymous mapping in x86 asm with something like:
mov eax, 192 ; mmap2 (offset is given in 4096-byte units)
xor ebx, ebx ; addr = NULL
mov ecx, 4096 ; len = 4096
mov edx, 0x7 ; prot = PROT_READ|PROT_WRITE|PROT_EXEC
mov esi, 0x22 ; flags = MAP_PRIVATE|MAP_ANONYMOUS
mov edi, -1 ; fd = -1 (Ignored for MAP_ANONYMOUS)
xor ebp, ebp ; offset = 0 (4096*0) (Ignored for MAP_ANONYMOUS)
int 0x80 ; make call (There are other ways to do this too)

Debugging a system call from FUSE

I'm writing a FUSE filesystem that does some mapping through sqlite, then passes the calls through to the underlying filesystem (somewhat of an expansion on bbfs). It started giving me trouble when I tried to start making files. When I call mknod, it returns with ERANGE. Here's the tail of an strace (filesystem is mounted on test/):
$ ./p4fs test/
$ strace touch test/kilo 2> logs
$ cat logs
...
fstat(3, {st_mode=S_IFREG|0644, st_size=56467024, ...}) = 0
mmap(NULL, 56467024, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fbf006bf000
close(3) = 0
close(0) = 0
open("test/kilo", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = -1 ERANGE (Numerical result out of range)
futimesat(AT_FDCWD, "test/kilo", NULL) = 0
close(1) = 0
exit_group(0) = ?
and here's the relevant section from my internal logging:
getattr: database opened
getattr: requesting attr for /kilo
db_getrowid: statement executed: finding rowid of /kilo
db_getrowid: mapped /kilo to rowid 0
getattr: does not exist: /kilo
mknod: database opened
mknod: statement executed: checking for existing path
mknod: calling db_mkdentry(db, /kilo, 100644, 0, 0)
db_mkdentry: parent is /
db_getrowid: statement executed: finding rowid of /
db_getrowid: mapped / to rowid 1
db_mkdentry: statement executed: creating dentry /kilo
db_getrowid: statement executed: finding rowid of /kilo
db_getrowid: mapped /kilo to rowid 3
p4fs: calling system mknod(3, 100644, 0)
p4fs: got errno 13
I'm looking for (1) the solution to this immediate problem and (2) a good way to debug FUSE in general. I have a sneaking suspicion that the ERANGE is coming from strtol(), but I don't know how to check. I wish I could make gdb pop up when it hits the callback...
Thanks!
EDIT: Oh, here's the source for my mknod() function:
static int p4_mknod(const char *path, mode_t mode, dev_t dev) {
sqlite3 *db;
sqlite3_stmt *statement;
char query[MAX_QUERY_LENGTH];
int rc;
int return_value;
int path_exists = -1;
OPEN_LOG("mknod")
OPEN_DB(db_path, db)
/* check for existing filename */
sprintf(query,
"SELECT COUNT(*) FROM dentry "
"WHERE name = '%s'",
path);
sqlite3_prepare(db,
query,
-1,
&statement,
NULL);
rc = sqlite3_step(statement);
SQLITE3_ERRCHK("checking for existing path")
path_exists = sqlite3_column_int(statement, 0);
sqlite3_finalize(statement);
if (path_exists <= 0) {
int physical_rowid;
char physical_name[MAX_QUERY_LENGTH];
/* path is not already in db */
syslog(LOG_DEBUG, "calling db_mkdentry(db, %s, %o, 0, 0)",
path, mode);
db_mkdentry(db, (char *) path, mode, 0, 0);
/* make the actual file */
physical_rowid = db_getrowid(db, (char *) path);
sprintf(physical_name, "%i", physical_rowid);
syslog(LOG_DEBUG, "calling system mknod(%s, %o, %li)",
physical_name, mode, dev);
return_value = mknod(physical_name, mode, dev);
} else {
syslog(LOG_INFO, "called on existing path");
return_value = -EEXIST;
}
syslog(LOG_DEBUG, "errno %i", errno);
return errno;
}
A few pieces of advice:
Do not use sprintf and friends to build SQL statements for sqlite3. It is recommended that you use host parameters in your statement and bind values to them using the sqlite3_bind functions.
Always prefer snprintf instead of sprintf and check its output. It will save you from lots of trouble.
Make sure you run your FUSE filesystem process in the foreground - it makes debugging easier.
Have you tried breakpoints in gdb? Or a bunch of perror() calls in your code to locate where the errno value that you mentioned comes from?
BTW there is no strtol() call in the code snippet you provided and there are a few macros without their definitions. Also IIRC 13 is the error code for EACCES.
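As an illustration of the snprintf advice, here is a small sketch of how the row-id formatting could be checked. The helper name format_physical_name and the choice of error codes are mine, not taken from the question's code.

```c
#include <errno.h>
#include <stdio.h>

/* Formats a row id as a file name. Returns 0 on success, -ENAMETOOLONG
   if the result did not fit in buf, or -EINVAL if snprintf itself failed.
   (Hypothetical helper; the error codes are an assumption of this sketch.) */
static int format_physical_name(char *buf, size_t size, int rowid)
{
    int n = snprintf(buf, size, "%i", rowid);
    if (n < 0)
        return -EINVAL;
    if ((size_t)n >= size)
        return -ENAMETOOLONG;  /* output was truncated */
    return 0;
}
```

The key point is that snprintf returns the length the full output would have had, so a return value greater than or equal to the buffer size means truncation.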
EDIT:
Something that you may have missed from the FUSE API:
A major exception is that instead of
returning an error in 'errno', the
operation should return the negated
error value (-errno) directly.
You seem to be returning errno as-is.
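A sketch of the return convention, kept free of FUSE headers so it stands alone: the helper below does what the tail of a mknod callback would do, returning 0 on success and the negated errno on failure. The name passthrough_mknod is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Creates the backing node and reports the result the way a FUSE
   operation is expected to: 0 on success, -errno on failure.
   (Hypothetical helper, not part of the FUSE API.) */
static int passthrough_mknod(const char *physical_name, mode_t mode, dev_t dev)
{
    if (mknod(physical_name, mode, dev) == -1)
        return -errno;  /* read errno right away, before another call can change it */
    return 0;
}
```

Capturing errno immediately after the failing call also avoids the situation in the question, where errno is read after several syslog() calls and only consulted at the very end.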
It's not really an answer, but I got around the issue by running as root. I suspect it has something to do with FUSE, because I had a similar issue when trying to get sshfs to run as a normal user a few months ago.
