So I've got a bunch of worker threads doing simple curl calls, and each worker thread has its own curl easy handle. They only do HEAD requests against random web sites. The locking functions needed for multi-threaded SSL are also in place, as documented here. Everything works except for two sites, ilsole24ore.com (seen in the example below) and ninemsn.com.au/, which sometimes produce a segfault with the trace output shown here:
#0 *__GI___libc_res_nquery (statp=0xb4d12df4, name=0x849e9bd "ilsole24ore.com", class=1, type=1, answer=0xb4d0ca10 "", anslen=1024, answerp=0xb4d0d234,
answerp2=0x0, nanswerp2=0x0, resplen2=0x0) at res_query.c:182
#1 0x00434e8b in __libc_res_nquerydomain (statp=0xb4d12df4, name=0xb4d0ca10 "", domain=0x0, class=1, type=1, answer=0xb4d0ca10 "", anslen=1024,
answerp=0xb4d0d234, answerp2=0x0, nanswerp2=0x0, resplen2=0x0) at res_query.c:576
#2 0x004352b5 in *__GI___libc_res_nsearch (statp=0xb4d12df4, name=0x849e9bd "ilsole24ore.com", class=1, type=1, answer=0xb4d0ca10 "", anslen=1024,
answerp=0xb4d0d234, answerp2=0x0, nanswerp2=0x0, resplen2=0x0) at res_query.c:377
#3 0x009c0bd6 in *__GI__nss_dns_gethostbyname3_r (name=0x849e9bd "ilsole24ore.com", af=2, result=0xb4d0d5fc, buffer=0xb4d0d300 "\177", buflen=512,
errnop=0xb4d12b30, h_errnop=0xb4d0d614, ttlp=0x0, canonp=0x0) at nss_dns/dns-host.c:197
#4 0x009c0f2b in _nss_dns_gethostbyname2_r (name=0x849e9bd "ilsole24ore.com", af=2, result=0xb4d0d5fc, buffer=0xb4d0d300 "\177", buflen=512,
errnop=0xb4d12b30, h_errnop=0xb4d0d614) at nss_dns/dns-host.c:251
#5 0x0079eacd in __gethostbyname2_r (name=0x849e9bd "ilsole24ore.com", af=2, resbuf=0xb4d0d5fc, buffer=0xb4d0d300 "\177", buflen=512, result=0xb4d0d618,
h_errnop=0xb4d0d614) at ../nss/getXXbyYY_r.c:253
#6 0x00760010 in gaih_inet (name=<value optimized out>, service=<value optimized out>, req=0xb4d0f83c, pai=0xb4d0d764, naddrs=0xb4d0d754)
at ../sysdeps/posix/getaddrinfo.c:531
#7 0x00761a65 in *__GI_getaddrinfo (name=0x849e9bd "ilsole24ore.com", service=0x0, hints=0xb4d0f83c, pai=0xb4d0f860) at ../sysdeps/posix/getaddrinfo.c:2160
#8 0x00917f9a in ?? () from /usr/lib/libkrb5support.so.0
#9 0x003b2f45 in krb5_sname_to_principal () from /usr/lib/libkrb5.so.3
#10 0x0028a278 in ?? () from /usr/lib/libgssapi_krb5.so.2
#11 0x0027eff2 in ?? () from /usr/lib/libgssapi_krb5.so.2
#12 0x0027fb00 in gss_init_sec_context () from /usr/lib/libgssapi_krb5.so.2
#13 0x00d8770e in ?? () from /usr/lib/libcurl.so.4
#14 0x00d62c27 in ?? () from /usr/lib/libcurl.so.4
#15 0x00d7e25b in ?? () from /usr/lib/libcurl.so.4
#16 0x00d7e597 in ?? () from /usr/lib/libcurl.so.4
#17 0x00d7f133 in curl_easy_perform () from /usr/lib/libcurl.so.4
My function looks something like this
int do_http_check(taskinfo *info, standardResult *data)
{
    standardResultInit(data);
    char errorBuffer[CURL_ERROR_SIZE];
    CURL *curl;
    CURLcode result;
    curl = curl_easy_init();
    if (curl)
    {
        // required options first
        curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, errorBuffer);
        curl_easy_setopt(curl, CURLOPT_URL, info->address.c_str());
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writer);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &data->body);
        curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, writer);
        curl_easy_setopt(curl, CURLOPT_WRITEHEADER, &data->head);
        curl_easy_setopt(curl, CURLOPT_DNS_USE_GLOBAL_CACHE, 0L);
        curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 30L);
        curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1L);
        curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);
        curl_easy_setopt(curl, CURLOPT_TIMEOUT, 240L);
        // optional options
        if (info->options.follow)
        {
            curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
            curl_easy_setopt(curl, CURLOPT_MAXREDIRS, info->options.redirects);
        }
        result = curl_easy_perform(curl);
        if (result == CURLE_OK)
        {
            data->success = true;
            curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &data->httpMsg);
            curl_easy_getinfo(curl, CURLINFO_REDIRECT_COUNT, &data->numRedirects);
            data->msg = "OK";
        }
        else
        {
            // ... handle error
        }
        curl_easy_cleanup(curl);
    }
    return 1;
}
Now, when I call the function without any threads, just from main(), it never breaks, so I was thinking it is connected to threading, or maybe to how the result structure is returned. But from what I see in the trace, the fault is generated inside the curl_easy_perform() call, which confuses me.
So if someone has any idea where I should look next, it would be most helpful. Thanks.
There is a whole section in the libcurl documentation dedicated to multi-threading:
The first basic rule is that you must never share a libcurl handle (be it easy or multi or whatever) between multiple threads. Only use one handle in one thread at a time.

libcurl is completely thread safe, except for two issues: signals and SSL/TLS handlers. Signals are used for timing out name resolves (during DNS lookup) - when built without c-ares support and not on Windows.

If you are accessing HTTPS or FTPS URLs in a multi-threaded manner, you are then of course using the underlying SSL library multi-threaded and those libs might have their own requirements on this issue. Basically, you need to provide one or two functions to allow it to function properly. For all details, see this:

OpenSSL
http://www.openssl.org/docs/crypto/threads.html#DESCRIPTION

GnuTLS
http://www.gnu.org/software/gnutls/manual/html_node/Multi_002dthreaded-applications.html

NSS
is claimed to be thread-safe already without anything required.

yassl
Required actions unknown.

When using multiple threads you should set the CURLOPT_NOSIGNAL option to 1 for all handles. Everything will or might work fine except that timeouts are not honored during the DNS lookup - which you can work around by building libcurl with c-ares support. c-ares is a library that provides asynchronous name resolves. On some platforms, libcurl simply will not function properly multi-threaded unless this option is set.

Also, note that CURLOPT_DNS_USE_GLOBAL_CACHE is not thread-safe.
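For the OpenSSL case, providing those "one or two functions" means installing the locking and thread-id callbacks before any threads start. A minimal sketch, assuming a pre-1.1.0 OpenSSL where these callbacks are still required (1.1.0 and later handle locking internally):

#include <openssl/crypto.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t *ssl_locks;

static void ssl_lock_cb(int mode, int n, const char *file, int line)
{
    (void)file; (void)line;
    if (mode & CRYPTO_LOCK)
        pthread_mutex_lock(&ssl_locks[n]);
    else
        pthread_mutex_unlock(&ssl_locks[n]);
}

static unsigned long ssl_id_cb(void)
{
    return (unsigned long)pthread_self();
}

/* call once, before any worker thread performs an HTTPS transfer */
void init_ssl_locking(void)
{
    int i, n = CRYPTO_num_locks();
    ssl_locks = malloc(n * sizeof(pthread_mutex_t));
    for (i = 0; i < n; i++)
        pthread_mutex_init(&ssl_locks[i], NULL);
    CRYPTO_set_id_callback(ssl_id_cb);
    CRYPTO_set_locking_callback(ssl_lock_cb);
}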
As mentioned in "error: longjmp causes uninitialized stack frame", the more recent libcurl versions (>= 7.32.0) in the Debian/Ubuntu repositories contain a new threaded resolver that solves these problems. The c-ares support is not a good solution:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=570436#74
"The real problem is that c-ares is not yet a full replacement for gethostby* functions (e.g. it does not support multicast DNS) and enabling it in stock libcurl packages may not be a good move (note that these are words of the upstream author of both curl and c-ares, not mine)."-
As mentioned in the title, gdb behaves weirdly when I try to set a breakpoint at Linux socket functions such as send. I've read through similar threads where it's been suggested to use the debug argument, but I can't set it as I'm just messing around with different Linux programs/video games, and I've noticed the same behaviour everywhere: for the most part, send can't be backtraced. Only the familiar "<memory address> in ??" frames are shown, and the addresses themselves don't point to anything (they can't be resolved). At the same time, the message buffer in send (or sendto, etc.) is stuck at one value and never updates, while all the other values, such as len, update in real time. I suppose these are simply limitations of gdb, but I'd appreciate it if someone more knowledgeable could shed some light on the issue.
EDIT:
First I'll list my steps:
As an example, I'm trying to backtrace the sendto in OpenArena (a free Linux Quake-based game). OpenArena in particular is not ELF-readable, but I get the same results with other ELF-readable binaries. Because it isn't ELF-readable, I can only attach to a running process, so I type gdb /usr/games/openarena -p <pid>, though I'm pretty sure the binary path is redundant in this case (it still says "0x7ffc8a81ebe0s": not in executable format: file format not recognized, but I'm able to list functions and everything anyway).
As a side note, attaching produces this bug: https://forum.manjaro.org/t/critical-bug-gdb-broken-with-last-stable-update/53155
"Error while reading shared library symbols for /lib/x86_64-linux-gnu/libpthread.so.0:"
In my case, however, I'm still able to attach, but the program eventually crashes after complaining about not being able to find a thread. This also often happens when joining a server from a lobby. It's only a side note, though, as I've also tested programs by running them directly from gdb, which doesn't produce the error but still leads to this weird socket behaviour.
So after attaching to the process, I type source script, the script containing:
break sendto
commands 1
backtrace -raw-frame-arguments on
continue
end
I then resume the program and it fires backtraces in real time. This is sample output after joining a server:
Thread 1 "ioquake3" hit Breakpoint 1, __libc_sendto (fd=43, buf=0x7ffefc5028f0, len=32, flags=0, addr=..., addrlen=16) at ../sysdeps/unix/sysv/linux/sendto.c:25
25 in ../sysdeps/unix/sysv/linux/sendto.c
#0 __libc_sendto (fd=43, buf=0x7ffefc5028f0, len=32, flags=0, addr=..., addrlen=16) at ../sysdeps/unix/sysv/linux/sendto.c:25
#1 0x00005642f88bc5fc in ?? ()
#2 0x00005642f88baf94 in ?? ()
#3 0x00005642f888b1ee in ?? ()
#4 0x00005642f88779cf in ?? ()
#5 0x00005642f8886572 in ?? ()
#6 0x00005642f88a58bf in ?? ()
#7 0x00005642f886e3f5 in main ()
Thread 1 "ioquake3" hit Breakpoint 1, __libc_sendto (fd=43, buf=0x7ffefc5028f0, len=34, flags=0, addr=..., addrlen=16) at ../sysdeps/unix/sysv/linux/sendto.c:25
25 in ../sysdeps/unix/sysv/linux/sendto.c
#0 __libc_sendto (fd=43, buf=0x7ffefc5028f0, len=34, flags=0, addr=..., addrlen=16) at ../sysdeps/unix/sysv/linux/sendto.c:25
#1 0x00005642f88bc5fc in ?? ()
#2 0x00005642f88baf94 in ?? ()
#3 0x00005642f888b1ee in ?? ()
#4 0x00005642f88779cf in ?? ()
#5 0x00005642f8886572 in ?? ()
#6 0x00005642f88a58bf in ?? ()
#7 0x00005642f886e3f5 in main ()
As you can see, there is nothing between main and the send, the socket buffer is stuck at the same message, while len is updating correctly. I can perform any kind of action, jump, shoot, and the output still stays the same. As I mentioned, I get pretty much the same output with other applications: there's some main function/loop, then nothing, and then just the send function.
As for my system specs, I'm on Kubuntu 21.04,
GDB version is: GNU gdb (Ubuntu 10.1-2ubuntu2) 10.1.90.20210411-git
Glibc: Ubuntu GLIBC 2.33-0ubuntu5
I've recently migrated from an earlier LTS release; the upgrade might not have been entirely clean, I suppose...
As you can see, there is nothing between main and the send, the socket buffer is stuck at the same message, while len is updating correctly.
A few points:
There are no function names (which isn't the same as "nothing"). That is expected IF there is no symbol table. GDB uses symbol table(s) from loaded binaries to translate addresses into function names.
If the binary is fully stripped, or if the code is generated into memory directly, or if the binary is decompressed or decrypted into memory, then you would need to teach GDB where it can get the symbol table from (if a symbol table exists at all, which isn't a given).
The socket buffer being "stuck" is not necessarily unexpected either: the program is very likely to be doing repeated sendto calls using the same stack buffer. Like this:
while (!error) {
    char buf[4096];
    int n = copy_to(buf);               // fill buf[] with data
    if (sendto(fd, buf, n, ...) != n)
        break;                          // handle error
}
Update:
I still don't quite understand why I can't see buffer values change.
You are not looking at the buffer contents, you are looking at the buffer address (i.e. &buf[0] given the code above).
If you want to look at the buffer contents, you need to print / examine it. E.g. to examine the first 8 bytes being sent, add this to your breakpoint command: x/8xb buf. But also note that it is common to have a fixed prefix on all the packets being sent, and it's not guaranteed that the 8 leading bytes will change on every packet either.
I have used cURL in my app. It works fine (no errors) in a Debug build. However, if I switch to a Release build, the app starts crashing. I am using VC 2013.
My code:
data_downloads.curl = curl_easy_init();
data_downloads.curlData = (CURL_DOWNLOADED_DATA *)malloc(sizeof(CURL_DOWNLOADED_DATA));
data_downloads.curlData->data = (char *)malloc(sizeof(char));
data_downloads.curlData->data[0] = '\0';
curl_easy_setopt(data_downloads.curl, CURLOPT_WRITEFUNCTION, &my_curl_writeCallback);
curl_easy_setopt(data_downloads.curl, CURLOPT_WRITEDATA, data_downloads.curlData);
curl_easy_setopt(data_downloads.curl, CURLOPT_VERBOSE, 1L); //tell curl to output its progress
curl_easy_setopt(data_downloads.curl, CURLOPT_URL, USER_INFO_URL);
curl_easy_setopt(data_downloads.curl, CURLOPT_COOKIEFILE, "cookie.txt");
curl_easy_perform(data_downloads.curl); //-- it crashes here
I noticed that in Debug mode, VC adds some padding to the stack in every one of your functions. If a function briefly goes out of bounds and overwrites a few bytes of the stack, this will go unnoticed... until you compile for Release.
You should also check that all libraries used for the Release configuration are the proper ones. VC has many library variants for many models (DLL/static, multi-threaded or not, ...). Check them against the libraries used in your Debug configuration.
These are the issues I came across. There may be other issues.
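One more thing worth double-checking, beyond the points above, is the WRITEFUNCTION/WRITEDATA contract, since my_curl_writeCallback isn't shown in the post. A realloc-based callback compatible with this setup usually looks roughly like the sketch below; the size member and the exact layout of CURL_DOWNLOADED_DATA are assumptions, not taken from the original code.

#include <stdlib.h>
#include <string.h>
#include <curl/curl.h>

/* assumed layout; the original post only shows the 'data' member */
typedef struct {
    char *data;
    size_t size;
} CURL_DOWNLOADED_DATA;

static size_t my_curl_writeCallback(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    CURL_DOWNLOADED_DATA *d = (CURL_DOWNLOADED_DATA *)userdata;
    size_t bytes = size * nmemb;

    char *grown = (char *)realloc(d->data, d->size + bytes + 1);
    if (!grown)
        return 0;                      /* tells libcurl to abort the transfer */
    d->data = grown;

    memcpy(d->data + d->size, ptr, bytes);
    d->size += bytes;
    d->data[d->size] = '\0';
    return bytes;                      /* must return the number of bytes handled */
}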
I'm developing a system that tracks objects with a P(an)T(ilt)Z(oom) camera which can be controlled via HTTP requests. The C application I develop is supposed to receive position data of the tracked object and to send commands to the camera to control the pan and tilt angle. In addition to these commands the camera has to receive a session refresh command every 5 seconds. HTTP Digest Authorization has to be used for the connection.
I'm sending the HTTP requests with libcurl. I already figured out in this Stack Overflow post that for digest auth one needs to use one and the same curl handle for all requests.
To send the session refresh command periodically, I tried to use a thread that just does this:
while(1)
{
    usleep(5000000);
    sessionContinue(g_Config.cam_ip);
}
With sessionContinue looking like this:
CURLcode sessionContinue(char* url)
{
    CURLcode res;
    char requestURL[40];
    char referer[47];
    struct curl_slist *headers = NULL;

    strcpy(requestURL, url);
    strcat(requestURL, CAM_SESSION_CONTINUE);

    strcpy(referer, "Referer: http://");
    strcat(referer, url);
    strcat(referer, CAM_MONITOR);

    headers = curl_slist_append(headers, "Connection:keep-alive");
    headers = curl_slist_append(headers, camCookie);

    // In windows, this will init the winsock stuff
    curl_global_init(CURL_GLOBAL_ALL);

    curl_easy_reset(curl);
    if(curl)
    {
        // First set the URL that is about to receive our POST. This URL can
        // just as well be a https:// URL if that is what should receive the
        // data.
        curl_easy_setopt( curl , CURLOPT_URL , requestURL );
        curl_easy_setopt( curl , CURLOPT_HTTPHEADER , headers );
        curl_easy_setopt( curl , CURLOPT_HTTPGET , 1 );
        curl_easy_setopt( curl , CURLOPT_USERNAME , "root" );
        curl_easy_setopt( curl , CURLOPT_PASSWORD , "password" );
        curl_easy_setopt( curl , CURLOPT_HTTPAUTH , CURLAUTH_BASIC | CURLAUTH_DIGEST );

        // Perform the request, res will get the return code
        res = curl_easy_perform(curl);
        // Check for errors
        if(res != CURLE_OK)
            fprintf(stderr, "curl_easy_perform() failed: %s (%s:%d)\n",
                    curl_easy_strerror(res), __FILE__, __LINE__);
    }
    return res;
}
The application always crashed with a segmentation fault after executing curl_easy_perform(curl). So I read the libcurl tutorial again and now I know that using one curl handle in multiple threads is a no-go.
What I tried then was to use a timer with SIGALRM to implement the periodic session refresh. This didn't change anything about the crash at curl_easy_perform(curl). The strange thing is that the application doesn't crash when sending the normal command to control the pan and tilt position, which uses the same curl handle. The only difference between the session refresh and the pan/tilt command is that the session refresh uses GET and pan/tilt uses POST.
Are there any other ways to send pan/tilt commands continuously, with a short pause every 5 seconds to send the session refresh?
You have a whole range of problems in one small program. Here are a few:
You might overflow one of those small fixed-size buffers with the dangerous unbounded C functions you use. Quite likely one of them is the reason for the segfault.
curl_global_init() is documented to be called once; you call it over and over again, and that without calling curl_global_cleanup() in between. You obviously call curl_easy_init() somewhere outside this function, and you should move the global init there.
'referer' gets filled with data but is never used otherwise
Another piece of advice is to use CURLOPT_ERRORBUFFER to get error messages rather than curl_easy_strerror(), as you may get some extra details that way. And of course set CURLOPT_VERBOSE while debugging the request, to see that things look the way you want.
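A minimal sketch of that debugging pattern (the surrounding handle setup is assumed, not taken from the code above):

#include <stdio.h>
#include <curl/curl.h>

/* sketch: collect detailed errors from a transfer via CURLOPT_ERRORBUFFER */
static CURLcode perform_with_errors(CURL *curl)
{
    char errbuf[CURL_ERROR_SIZE];
    errbuf[0] = '\0';                            /* libcurl may leave it untouched on some errors */

    curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, errbuf);
    curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L); /* protocol chatter on stderr while debugging */

    CURLcode res = curl_easy_perform(curl);
    if (res != CURLE_OK)
        fprintf(stderr, "transfer failed: %s\n",
                errbuf[0] ? errbuf : curl_easy_strerror(res));
    return res;
}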
Thanks for your comment, Daniel Stenberg. I'm now calling curl_global_init() just once, where the handle is set up. referer wasn't really needed here; I had just forgotten to remove it before pasting the code.
The reason for the segmentation fault was that the session refresh command and the pan/tilt commands tried to use one and the same curl handle at the same time, which obviously can't work. So the solution with the timer and SIGALRM wasn't the problem. The segmentation faults were solved by adding a mutex lock to avoid concurrent access to the curl handle.
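For what it's worth, that fix, serializing every use of the shared handle behind a mutex, can be sketched roughly like this (the handle and function names here are illustrative, not from the original code):

#include <pthread.h>
#include <curl/curl.h>

/* one shared easy handle, guarded so only one thread uses it at a time */
static CURL *curl;
static pthread_mutex_t curl_lock = PTHREAD_MUTEX_INITIALIZER;

CURLcode send_command(const char *url)
{
    CURLcode res;

    pthread_mutex_lock(&curl_lock);
    curl_easy_reset(curl);                       /* drop options from the previous request */
    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_HTTPGET, 1L);
    res = curl_easy_perform(curl);
    pthread_mutex_unlock(&curl_lock);

    return res;
}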
Hello, what I am trying to do is send a POST request twice; however, when I send it the second time, the information from the first request is also included, and I do not want that.
To illustrate what I mean, this is the code that sends using the POST method (the handle curl was already created):
void process(char* transferBuffer) {
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost/cpp.php");
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, transferBuffer);
    res = curl_easy_perform(curl);
    if (res != CURLE_OK)
        fprintf(stderr, "curl_easy_perform() failed: %s\n",
                curl_easy_strerror(res));
}
If I do something like:
process("name=John"); - webserver receives name=John
process("name=El"); - webserver receives name=John AND name=El
What I want to do is somehow clean previously used data;
the curl handle was already created ... What I want to do is somehow clean previously used data
All I can say is that if you want to reuse your curl handle (which is a best practice), you should reset it with curl_easy_reset before setting your options again and re-performing the transfer.
Note that without the complete sample code (including the creation of your curl handle, etc) it is quite hard to provide a detailed answer.
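Assuming that setup (a global curl handle created elsewhere, with res alongside it), process() could be reworked along these lines:

#include <stdio.h>
#include <curl/curl.h>

extern CURL *curl;        /* created elsewhere with curl_easy_init() */
extern CURLcode res;

void process(char *transferBuffer) {
    curl_easy_reset(curl);   /* wipe options left over from the previous transfer */
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost/cpp.php");
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, transferBuffer);
    res = curl_easy_perform(curl);
    if (res != CURLE_OK)
        fprintf(stderr, "curl_easy_perform() failed: %s\n",
                curl_easy_strerror(res));
}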
I have a project that aims to run php-cgi chrooted for mass virtual hosting (more than 10k virtual hosts), with each virtual host having its own chroot, under Ubuntu Lucid x86_64.
I would like to avoid creating the necessary environment inside each chroot for things like /dev/null, /dev/zero, locales, icons... and whatever else could be needed by PHP modules that think they run outside a chroot.
The goal is to make php-cgi run inside a chroot while allowing it access to files outside the chroot, as long as those files are (for the most part) opened read-only and are on an allowed list (/dev/log, /dev/zero, /dev/null, the path to the locales...).
The obvious way seems to be to create (or use, if one exists) a kernel module that could hook and redirect trusted open() paths outside the chroot.
But I don't think it's the easiest way:
I've never written a kernel module, so I can't correctly estimate the difficulty.
There seem to be multiple syscalls to hook for file "open" (open, connect, mmap...), but I guess there is a common kernel function for everything related to file opening.
I want to minimize the number of patches to PHP or its modules, to minimize the amount of work needed each time I update our platform to the latest stable PHP release (so that we can follow upstream PHP releases more often and more quickly). I therefore find it better to patch the behavior of PHP from the outside (we have a particular setup, so patching PHP and proposing the patch upstream is not relevant).
Instead, I'm currently trying a userland solution: hooking libc functions with LD_PRELOAD, which works well in most cases and is really quick to implement, but I've encountered a problem which I'm unable to resolve alone.
(The idea is to talk to a daemon running outside the chroot and get file descriptors from it using ioctl SENDFD and RECVFD.)
When I call syslog() (without openlog() first), syslog() calls connect() to open the log socket.
Example:
folays#phenix:~/ldpreload$ strace logger test 2>&1 | grep connect
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(1, {sa_family=AF_FILE, path="/dev/log"}, 110) = 0
So far so good. I've tried to hook the connect() function of libc, without success.
I've also tried passing various flags to dlopen() inside the init constructor of my preload library, to test whether some of them could make this work, again without success.
Here is the relevant code of my preload library:
void __attribute__((constructor)) my_init(void)
{
    printf("INIT preloadz %s\n", __progname);
    dlopen(getenv("LD_PRELOAD"), RTLD_NOLOAD | RTLD_DEEPBIND | RTLD_GLOBAL | RTLD_NOW);
}

int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen)
{
    printf("HOOKED connect\n");
    int (*f)() = dlsym(RTLD_NEXT, "connect");
    int ret = f(sockfd, addr, addrlen);
    return ret;
}

int __connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen)
{
    printf("HOOKED __connect\n");
    int (*f)() = dlsym(RTLD_NEXT, "connect");
    int ret = f(sockfd, addr, addrlen);
    return ret;
}
But the connect() function of the libc still takes precedence over mine:
folays#phenix:~/ldpreload$ LD_PRELOAD=./lib-preload.so logger test
INIT preloadz logger
[...] no lines with "HOOKED connect..." [...]
folays#phenix:~/ldpreload$
Looking at the code of syslog() (apt-get source libc6, glibc-2.13/misc/syslog.c), it seems to call openlog_internal, which in turn calls __connect(), at misc/syslog.c line 386:
if (LogFile != -1 && !connected)
{
int old_errno = errno;
if (__connect(LogFile, &SyslogAddr, sizeof(SyslogAddr))
== -1)
{
Well, objdump shows me connect and __connect in the dynamic symbol table of libc:
folays#phenix:~/ldpreload$ objdump -T /lib/x86_64-linux-gnu/libc.so.6 |grep -i connec
00000000000e6d00 w DF .text 000000000000005e GLIBC_2.2.5 connect
00000000000e6d00 w DF .text 000000000000005e GLIBC_2.2.5 __connect
But there is no connect symbol in the dynamic relocation entries, so I guess that explains why I cannot successfully override the connect() used by openlog_internal(): it probably does not use dynamic symbol relocation, and probably has the address of the __connect() function hard-coded (a relative PIC offset?).
folays#phenix:~/ldpreload$ objdump -R /lib/x86_64-linux-gnu/libc.so.6 |grep -i connec
folays#phenix:~/ldpreload$
connect is a weak alias to __connect:
eglibc-2.13/socket/connect.c:weak_alias (__connect, connect)
gdb is still able to set a breakpoint on the connect symbol of libc:
folays#phenix:~/ldpreload$ gdb logger
(gdb) b connect
Breakpoint 1 at 0x400dc8
(gdb) r test
Starting program: /usr/bin/logger
Breakpoint 1, connect () at ../sysdeps/unix/syscall-template.S:82
82 ../sysdeps/unix/syscall-template.S: No such file or directory.
in ../sysdeps/unix/syscall-template.S
(gdb) c 2
Will ignore next crossing of breakpoint 1. Continuing.
Breakpoint 1, connect () at ../sysdeps/unix/syscall-template.S:82
82 in ../sysdeps/unix/syscall-template.S
(gdb) bt
#0 connect () at ../sysdeps/unix/syscall-template.S:82
#1 0x00007ffff7b28974 in openlog_internal (ident=<value optimized out>, logstat=<value optimized out>, logfac=<value optimized out>) at ../misc/syslog.c:386
#2 0x00007ffff7b29187 in __vsyslog_chk (pri=<value optimized out>, flag=1, fmt=0x40198e "%s", ap=0x7fffffffdd40) at ../misc/syslog.c:274
#3 0x00007ffff7b293af in __syslog_chk (pri=<value optimized out>, flag=<value optimized out>, fmt=<value optimized out>) at ../misc/syslog.c:131
Of course, I could completely skip this particular problem by doing an openlog() myself, but I guess I will encounter the same type of problem with some other functions.
I don't really understand why openlog_internal does not use dynamic symbol relocation to call __connect(), or whether it's even possible to hook this __connect() call using the simple LD_PRELOAD mechanism.
The other ways I see it could be done:
Load libc.so from an LD_PRELOAD library with dlopen(), get the address of libc's __connect with dlsym(), and then patch the function (assembly-wise) to get the hook working. That seems really overkill and error-prone.
Use a modified custom libc for PHP to fix those problems directly at the source (the open / connect / mmap functions...).
Write an LKM to redirect file access where I want. Pros: no need for ioctl(SENDFD) and no daemon outside the chroot.
I would really appreciate learning, if it is at all possible, how I could still hook the call to __connect() issued by openlog_internal, as well as suggestions or links to kernel documentation related to syscall hooking and redirection.
My Google searches related to "hook syscalls" found a lot of references to LSM, but it seems to only allow ACL-style "yes" or "no" answers, not redirection of open() paths.
Thanks for reading.
It's definitely not possible with LD_PRELOAD without building your own heavily-modified libc, in which case you might as well just put the redirection hacks directly inside. There are not necessarily calls to open, connect, etc. whatsoever. Instead there may be calls to a similar hidden function bound at library-creation time (not dynamically rebindable) or even inline syscalls, and this can of course change unpredictably with the version.
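To illustrate why interposition fails in that case, here is a small sketch; it is not glibc's actual source, just the same pattern glibc uses via its hidden_def machinery: internal callers bind to a hidden alias at link time, so LD_PRELOAD never gets a say.

/* libdemo.c -- sketch of a library whose internal call cannot be interposed.
   Build: gcc -shared -fPIC -o libdemo.so libdemo.c */

int do_thing(void)
{
    return 42;
}

/* hidden alias: resolved inside the library at link time, never via the PLT */
extern __typeof(do_thing) do_thing_internal
    __attribute__((alias("do_thing"), visibility("hidden")));

int public_entry(void)
{
    /* an LD_PRELOAD override of do_thing() will NOT be called here */
    return do_thing_internal();
}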
Your options are either a kernel module, or perhaps using ptrace on everything inside the "chroot" and modifying the arguments to syscalls whenever the tracing process encounters one that needs patching up. Neither sounds easy...
Or you could just accept that you need a minimal set of critical device nodes and files to exist inside a chroot for it to work. Using a different libc in place of glibc, if possible, would help you minimize the number of additional files needed.