I have an issue where LeakSanitizer backtraces that pass through dynamically loaded libraries report <unknown module> for every frame inside those libraries.
Direct leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x4e3e36 in malloc (/usr/sbin/radiusd+0x4e3e36)
#1 0x7fb406e95f69 (<unknown module>)
#2 0x7fb406eafc36 (<unknown module>)
#3 0x7fb406eafd40 (<unknown module>)
#4 0x7fb406ea3364 (<unknown module>)
#5 0x7fb4063de7d4 (<unknown module>)
#6 0x7fb4063c61c4 (<unknown module>)
#7 0x7fb406617863 (<unknown module>)
#8 0x7fb415620681 in dl_load_func /usr/src/debug/freeradius-server-4.0.0/src/main/dl.c:194:34
#9 0x7fb41561edab in dl_symbol_init_walk /usr/src/debug/freeradius-server-4.0.0/src/main/dl.c:301:7
#10 0x7fb41561df1e in dl_module /usr/src/debug/freeradius-server-4.0.0/src/main/dl.c:748:6
#11 0x7fb41561f3db in dl_instance /usr/src/debug/freeradius-server-4.0.0/src/main/dl.c:853:20
#12 0x7fb41564f4ab in module_bootstrap /usr/src/debug/freeradius-server-4.0.0/src/main/module.c:827:6
#13 0x7fb41564ed56 in modules_bootstrap /usr/src/debug/freeradius-server-4.0.0/src/main/module.c:1070:14
#14 0x5352bb in main /usr/src/debug/freeradius-server-4.0.0/src/main/radiusd.c:561:6
#15 0x7fb41282ab34 in __libc_start_main (/lib64/libc.so.6+0x21b34)
#16 0x4204ab in _start (/usr/sbin/radiusd+0x4204ab)
I've had an almost identical issue with valgrind before, and I know it's caused by the libraries being unloaded with dlclose on exit, so their symbols are unavailable when the symbolizer runs.
With valgrind the fix is simple:
/*
* Only dlclose() handle if we're *NOT* running under valgrind
* as it unloads the symbols valgrind needs.
*/
if (!RUNNING_ON_VALGRIND) dlclose(module->handle); /* ignore any errors */
RUNNING_ON_VALGRIND is a macro provided by valgrind for detecting whether the program is running under valgrind.
I can't see anything in the LSAN docs for a similar feature for when ASAN_OPTIONS=detect_leaks=1 is set.
Does anyone know if it's possible to perform a runtime check for running under LSAN?
The LSAN interface headers allow the user to define a callback __lsan_is_turned_off to allow the program to disable the leak checker. This callback is only executed if LSAN is enabled.
#include <stdbool.h>
#include <sanitizer/lsan_interface.h>
static bool running_under_lsan = false;
int __attribute__((used)) __lsan_is_turned_off(void)
{
running_under_lsan = true;
return 0;
}
EDIT: It's actually more complicated than that. As @yugr commented, it appears __lsan_is_turned_off is only executed when a process or child process exits.
There is however a solution!
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sanitizer/common_interface_defs.h>
static int from_child[2] = {-1, -1};
static int pid;
int __attribute__((used)) __lsan_is_turned_off(void)
{
uint8_t ret = 1;
/* Parent */
if (pid != 0) return 0;
/* Child */
if (write(from_child[1], &ret, sizeof(ret)) < 0) {
fprintf(stderr, "Writing LSAN status failed: %s", strerror(errno));
}
close(from_child[1]);
return 0;
}
int main(int argc, char **argv)
{
uint8_t ret = 0;
if (pipe(from_child) < 0) {
fprintf(stderr, "Failed opening internal pipe: %s", strerror(errno));
exit(EXIT_FAILURE);
}
pid = fork();
if (pid == -1) {
fprintf(stderr, "Error forking: %s", strerror(errno));
exit(EXIT_FAILURE);
}
/* Child */
if (pid == 0) {
close(from_child[0]); /* Close parent's side */
exit(EXIT_SUCCESS);
}
/* Parent */
close(from_child[1]); /* Close child's side */
while ((read(from_child[0], &ret, sizeof(ret)) < 0) && (errno == EINTR));
close(from_child[0]); /* Close our side (so we don't leak FDs) */
/* Collect child */
waitpid(pid, NULL, 0);
if (ret) {
printf("Running under LSAN\n");
} else {
printf("Not running under LSAN\n");
}
exit(EXIT_SUCCESS);
}
Example:
clang -g3 -fsanitize=address foo.c
ASAN_OPTIONS='detect_leaks=1' ./a.out
Running under LSAN
ASAN_OPTIONS='detect_leaks=0' ./a.out
Not running under LSAN
First of all, missing (or incorrect) stack traces after dlclose are a known issue in all sanitizers, not just LSan.
Secondly, as of now there is no API to detect at runtime that LeakSanitizer is enabled, so your best bet is to check manually that the program is linked against LSan and that detect_leaks=0 isn't set in the environment:
/* needs <dlfcn.h>, <stdlib.h> and <string.h> */
int (*lsan_is_turned_off)(void) = (int (*)(void))dlsym(RTLD_DEFAULT, "__lsan_is_turned_off");
const char *lsan_opts = getenv("LSAN_OPTIONS");
const char *asan_opts = getenv("ASAN_OPTIONS");
int disable_dlclose = lsan_is_turned_off != NULL && !lsan_is_turned_off()
&& !(lsan_opts && (strstr(lsan_opts, "detect_leaks=0") || strstr(lsan_opts, "detect_leaks=false")))
&& !(asan_opts && (strstr(asan_opts, "detect_leaks=0") || strstr(asan_opts, "detect_leaks=false")));
(__lsan_is_turned_off is defined in sanitizer/lsan_interface.h).
If you enable LSan via -fsanitize=address, you can replace __lsan_is_turned_off check with #ifdef __SANITIZE_ADDRESS__.
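If you go the compile-time route, the check can be wrapped in a single macro. This is only a minimal sketch: BUILT_WITH_ASAN and keep_modules_loaded() are names made up for illustration, not part of any sanitizer header. GCC defines __SANITIZE_ADDRESS__, while Clang exposes the same information through __has_feature(address_sanitizer). Note that this only tells you ASan was compiled in; it does not look at detect_leaks.

```c
#include <stdbool.h>

/* BUILT_WITH_ASAN is a hypothetical helper macro, not part of any sanitizer header. */
#if defined(__SANITIZE_ADDRESS__)          /* GCC */
#  define BUILT_WITH_ASAN 1
#elif defined(__has_feature)               /* Clang */
#  if __has_feature(address_sanitizer)
#    define BUILT_WITH_ASAN 1
#  endif
#endif
#ifndef BUILT_WITH_ASAN
#  define BUILT_WITH_ASAN 0
#endif

/* Hypothetical helper: true when modules should stay loaded so the
 * symbolizer can still resolve their frames at exit. */
static bool keep_modules_loaded(void)
{
    return BUILT_WITH_ASAN != 0;
}
```

The unloader from the question would then read `if (!keep_modules_loaded()) dlclose(module->handle);`.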
I'm making a C web server on my raspberry pi and I've come across a problem where when I constantly reload the webpage, the web server gives me a segmentation fault. It also sometimes refuses to run the threads after a while of constant traffic. I used gdb to debug the segfault and this is what I found:
Thread 145 "webServer" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x6f59b440 (LWP 21885)]
memcpy () at ../sysdeps/arm/memcpy.S:196
196 ../sysdeps/arm/memcpy.S: No such file or directory.
(gdb) where
#0 memcpy () at ../sysdeps/arm/memcpy.S:196
#1 0xb6e94b48 in __GI__IO_file_xsgetn (fp=0x52719f08, data=<optimized out>,
n=1) at fileops.c:1303
#2 0xb6e87df8 in __GI__IO_fread (buf=0x2343c <fileLine>, size=1, count=1,
fp=0x52719f08) at iofread.c:38
#3 0x000114c4 in handleClient (pClientSock=0x23830 <clientSock>)
at webServer.c:242
#4 0x000116a0 in giveThreadWork () at webServer.c:298
#5 0xb6f7f494 in start_thread (arg=0x6f59b440) at pthread_create.c:486
#6 0xb6f02568 in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:73
from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
I couldn't find anything about the threads refusing to run, though.
The backtrace points at lines 242 and 298 of webServer.c. This is the code around them:
Lines 242-266
============
while ((freadErr = fread(fileLine, sizeof(fileLine), 1, fpointer)) != 0) {
if (freadErr < 0) {
printf("fread error\n");
perror("fread");
close(acceptSock);
memset_all();
return NULL;
}
if ((send(acceptSock, httpResponse, strlen(httpResponse), MSG_NOSIGNAL)) == -1) {
printf("write error1\n");
perror("we1");
close(acceptSock);
memset_all();
return NULL;
}
if ((send(acceptSock, fileLine, sizeof(fileLine), MSG_NOSIGNAL)) == -1) {
printf("write error2\n");
perror("we2");
close(acceptSock);
memset_all();
return NULL;
}
memset(fileLine, 0, 1);
memset(httpResponse, 0, 1000);
Lines 285-304 (the error is at line 298)
==================
void *giveThreadWork() {
while (1) {
int *pclient;
pthread_mutex_lock(&mutex);
if ((pclient = dequeue()) == NULL) {
pthread_cond_wait(&condition_var, &mutex);
pclient = dequeue();
}
pthread_mutex_unlock(&mutex);
if (pclient != NULL) {
handleClient(pclient);
} else {
printf("\n\n\n\n\n no more listener \n\n\n\n\n");
close(*pclient);
}
}
}
I can't get any more info from gdb, but maybe someone else knows how. I've tried changing the strcpy() calls and checked all the string manipulation functions (or I'm pretty sure I have), and I've found nothing.
Here is all the code if anyone needs it: https://www.toptal.com/developers/hastebin/xokipuvidu.c
Hopefully someone can help or point me in the right direction
Summary
When my code calls BIO_do_connect it jumps back to the start of the function that called it and then segfaults.
What I Tried
Debugger, Valgrind, changing code
// compiled with:
// gcc -g -O0 -Wall -Wextra -o sslex sslex_main.c -fstack-protector -lssl -lcrypto
#include <stdio.h>
#include <string.h>
#include <openssl/ssl.h>
#include <openssl/bio.h>
#include <openssl/err.h>
// BIO handles communication including files and sockets.
static BIO* g_bio = NULL;
// holds SSL connection information
static SSL_CTX* g_sslContext = NULL;
char* g_trustedStore = "certs/trusted.pem";
void initialize() {
SSL_load_error_strings();
ERR_load_BIO_strings();
OpenSSL_add_all_algorithms();
}
int connect(char* hostnamePort) {
SSL* sslp = NULL;
BIO* out = NULL;
printf("Connect called\n");
printf("Connecting... to %s\n", hostnamePort);
g_sslContext = SSL_CTX_new(TLS_client_method());
// load trusted certificate store
if (! SSL_CTX_load_verify_locations(g_sslContext, g_trustedStore, NULL)) {
fprintf(stderr, "Failure loading certificates from trusted store %s!\n", g_trustedStore);
fprintf(stderr, "Error: %s\n", ERR_reason_error_string(ERR_get_error()));
return -1;
}
g_bio = BIO_new_ssl_connect(g_sslContext);
if (g_bio == NULL) {
fprintf(stderr, "Error cannot get BSD Input/Output\n");
return -1;
}
// retrieve ssl pointer of the BIO
BIO_get_ssl(g_bio, &sslp);
if (sslp == NULL) {
fprintf(stderr, "Could not locate SSL pointer\n");
fprintf(stderr, "Error: %s\n", ERR_reason_error_string(ERR_get_error()));
return -1;
}
// if server wants a new handshake, handle that in the background
SSL_set_mode(sslp, SSL_MODE_AUTO_RETRY);
// attempt to connect
BIO_set_conn_hostname(g_bio, hostnamePort);
out = BIO_new_fp(stdout, BIO_NOCLOSE);
printf("Connecting to: %s\n", BIO_get_conn_hostname(g_bio));
// THIS LINE CAUSES STACK SMASH
if (BIO_do_connect(g_bio) <= 0) { // BUGGY LINE
fprintf(stderr, "Error cannot connect to %s\n", hostnamePort);
fprintf(stderr, "Error: %s\n", ERR_reason_error_string(ERR_get_error()));
BIO_free_all(g_bio);
SSL_CTX_free(g_sslContext);
return -1;
}
return -1;
}
void close_connection() {
BIO_free_all(g_bio);
SSL_CTX_free(g_sslContext);
}
int main(int argc, char* argv[]) {
char* hostnamePort = argv[1];
initialize();
if (connect(hostnamePort) != 0)
return 0;
printf("Done connecting. doing operation\n");
close_connection();
return 0;
}
Expected Result:
"Connect called" should be displayed only once.
Program should not Segmentation fault.
Actual Output:
./sslex 192.168.11.141
Connect called
Connecting... to 192.168.11.141
Connecting to: 192.168.11.141
Connect called
Segmentation fault (core dumped)
Debugger Output and Backtrace:
Starting program: sslex 192.168.11.141
warning: Probes-based dynamic linker interface failed.
Reverting to original interface.
Connect called
Connecting... to 192.168.11.141
Connecting to: 192.168.11.141
Breakpoint 3, connect (hostnamePort=0x7fffffffe9db "192.168.11.141") at sslex_main.c:57
57 if (BIO_do_connect(g_bio) <= 0) { // BUGGY LINE
(gdb) bt
#0 connect (hostnamePort=0x7fffffffe9db "192.168.11.141") at sslex_main.c:57
#1 0x000055555555503a in main (argc=2, argv=0x7fffffffe698) at sslex_main.c:75
(gdb) s
Connect called
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff733d646 in ?? ()
(gdb) bt
#0 0x00007ffff733d646 in ?? ()
#1 0x00007ffff72e94d3 in ?? ()
#2 0x0000000000000000 in ?? ()
Your function connect() hides the standard networking library function of the same name. OpenSSL calls connect() to make the actual TCP connection, but it ends up calling yours instead of the library's.
Rename your function (say, to do_connect()) so it won't clash with the one from the library.
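To make the effect of the rename concrete: once the helper has its own name (and, ideally, internal linkage via static), calls to connect() resolve to the socket API again. The sketch below is a stand-alone illustration without OpenSSL. do_connect() is only a stub and libc_connect_reachable() is a hypothetical name; the invalid descriptor is just a cheap way to show that the real system call is the one being reached.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical replacement for the question's connect(): a distinct name
 * plus 'static' keeps it out of the global symbol table. */
static int do_connect(const char *hostname_port)
{
    /* The BIO setup from the question would go here. */
    return (hostname_port != NULL) ? 0 : -1;
}

/* Returns 1 if connect() resolves to the socket API: on an invalid
 * descriptor the real call fails with EBADF. */
static int libc_connect_reachable(void)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;

    errno = 0;
    if (connect(-1, (struct sockaddr *)&addr, sizeof(addr)) == -1 && errno == EBADF) {
        return 1;
    }
    return 0;
}
```

With the original naming, the same call from inside OpenSSL would have landed in the program's own connect(char *), which is why "Connect called" was printed a second time before the crash.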
I'm having issues while implementing a multithreaded program.
The program seems to work fine with a single thread (when I set NTHREADS to 1), but for NTHREADS > 1 I get one of the following errors:
Segmentation fault (core dumped)
or
double free or corruption (!prev)
or
free(): invalid size: 0xb6b00a10 ***
0Aborted (core dumped)
As you can see, the error varies a lot and I'm getting confused.
The program I'm executing is the following:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#include <pthread.h>
#define NTHREADS 5
typedef struct data_t
{
int num;
FILE *fp;
pthread_mutex_t mutex;
int thread_id;
}data_t;
void writefp(int num1, FILE *fp){
if(fp!=NULL){
int i;
int nume = 1;
int long_var=log10(nume);
for(i=long_var;i>=0;i--){
nume=(num1 / (int) round(pow(10, i)) % 10);
char d=nume+'0';
fwrite(&d, 1, 1, fp);
printf("%c", d);
}
}
fclose(fp);
}
void *thread_writefp(void* args)
{
data_t *data = (data_t *)args;
printf(" Thread id %d\n", data->thread_id);
pthread_mutex_lock(&(data->mutex));
writefp(data->num, data->fp);
pthread_mutex_unlock(&(data->mutex));
pthread_exit(NULL);
}
int randomf(){
int num,i;
for(i = 0; i<2; i++) {
num = rand()%100000+1;
}
return num;
}
int prime(int num1){
int is_prime=1;
int i = 2;
printf("Number: ");
while( i<=num1/2 && is_prime==1 ) {
printf("%i ", i);
if(i%30==0){
printf("\n");
}
if( num1 % i == 0 ) {
is_prime = 0;
}
i++;
}
printf("\n");
if(is_prime){
printf("%i is number prime\n", num1);
}else{
printf("NO is prime %i\n",num1);
}
return 0;
}
int main(void){
int i;
//int num1=randomf();
srand(time(NULL));
FILE *fp = fopen("fich.txt", "w+b");
data_t data;
pthread_t consumers_thread[NTHREADS];
data.mutex = (pthread_mutex_t) PTHREAD_MUTEX_INITIALIZER;
data.fp = fp;
//writefp( num1, fp);
for(i = 0; i < NTHREADS; i++)
{
data.num = randomf();
data.thread_id = i;
printf("Number prime is %i\n", prime(data.num));
if(pthread_create(&consumers_thread[i], NULL,
thread_writefp, (void*) &data) != 0)
{
fprintf(stderr, "%s\n", "Error creating thread!");
return EXIT_FAILURE;
}
}
// wait for all consumers thread to finish
for(i = 0; i < NTHREADS; ++i)
{
pthread_join(consumers_thread[i], NULL);
}
return EXIT_SUCCESS;
}
I compile the program as follows:
$ gcc -pthread -Wall -o consummer consummer.c -lm
Here are, for example, three errors I got when I ran it under gdb three successive times without changing anything in the code:
1
Thread 2 "consummer" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7cc1b40 (LWP 18122)]
tcache_thread_freeres () at malloc.c:3003
3003 malloc.c: No such file or directory.
(gdb) bt
#0 tcache_thread_freeres () at malloc.c:3003
#1 0xb7e258c2 in __libc_thread_freeres () at thread-freeres.c:29
#2 0xb7ea03ad in start_thread (arg=0xb7cc1b40) at pthread_create.c:478
#3 0xb7dbb0a6 in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:108
(gdb)
2
Thread 3 "consummer" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb72ffb40 (LWP 18131)]
0xb7d2af2b in __GI__IO_fwrite (buf=0xb72ff30f, size=1, count=1, fp=0x404160) at iofwrite.c:37
37 iofwrite.c: No such file or directory.
(gdb) run
3
Thread 3 "consummer" received signal SIGABRT, Aborted.
[Switching to Thread 0xb74c0b40 (LWP 18143)]
0xb7fd7cf9 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7fd7cf9 in __kernel_vsyscall ()
#1 0xb7cf17e2 in __libc_signal_restore_set (set=0xb74bfe9c) at ../sysdeps/unix/sysv/linux/nptl-signals.h:80
#2 __GI_raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:48
#3 0xb7cf2f51 in __GI_abort () at abort.c:90
#4 0xb7d340cc in __libc_message (action=(do_abort | do_backtrace), fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:181
#5 0xb7d3af5d in malloc_printerr (action=<optimized out>, str=0xb7e418d8 "double free or corruption (!prev)", ptr=<optimized out>,
ar_ptr=0xb7e967a0 <main_arena>) at malloc.c:5425
#6 0xb7d3bb3b in _int_free (av=0xb7e967a0 <main_arena>, p=<optimized out>, have_lock=have_lock@entry=0) at malloc.c:4174
#7 0xb7d3fcb0 in __GI___libc_free (mem=0x404160) at malloc.c:3144
#8 0xb7e2587d in tcache_thread_freeres () at malloc.c:3004
#9 0xb7e258c2 in __libc_thread_freeres () at thread-freeres.c:29
#10 0xb7ea03ad in start_thread (arg=0xb74c0b40) at pthread_create.c:478
#11 0xb7dbb0a6 in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:108
(gdb)
I'd appreciate your help in finding out what I did wrong. Thanks in advance.
Per this answer (but see Edit 2), multiple threads cannot safely access the same FILE *fp. As @IlyaBursov pointed out, you only have one data_t data shared across all threads and, therefore, only one FILE * data.fp.
Thanks for your comment noting that you moved the fopen into the thread function. That way each thread independently opens and closes the file, so there is no FILE * sharing between threads.
This seems to be implementation-dependent: I was not able to reproduce the issue on Cygwin x64 with gcc 6.4.0. I suspect the effect of the mutex may also vary by implementation. It may also depend on compiler options; see this example.
Edit As @MichaelDorgan pointed out, calling fclose on a FILE * that other threads are using is also a bad idea.
Edit 2 As @JohnBollinger points out, individual stream operations are thread-safe these days. That would suggest that the problem is the fclose happening before another thread tries to access the file. However, I wonder if perhaps the OP's stdio implementation is non-conformant in some way. I would think a compliant fwrite would simply return an error on an access to a closed file, rather than crashing. See further comments below.
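One way to put these points together, as a sketch rather than the poster's actual fix: give every thread its own argument struct, keep the mutex around the writes, and close the stream exactly once after all threads have been joined. job_t, write_number() and write_all() are illustrative names, and the file path is supplied by the caller.

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 5

typedef struct {
    int              num;     /* value this thread writes */
    FILE            *fp;      /* shared stream, closed only by write_all() */
    pthread_mutex_t *mutex;   /* serializes writes to fp */
} job_t;

static void *write_number(void *arg)
{
    job_t *job = arg;

    pthread_mutex_lock(job->mutex);
    fprintf(job->fp, "%d\n", job->num);   /* no fclose() in the thread */
    pthread_mutex_unlock(job->mutex);
    return NULL;
}

/* Writes nums[0..NTHREADS-1] to path, one thread per number.
 * Returns 0 on success, -1 on failure. */
static int write_all(const char *path, const int *nums)
{
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_t       threads[NTHREADS];
    job_t           jobs[NTHREADS];      /* one struct per thread, not one shared */
    int             started = 0, rc = 0;

    FILE *fp = fopen(path, "w");
    if (fp == NULL) return -1;

    for (int i = 0; i < NTHREADS; i++) {
        jobs[i] = (job_t){ .num = nums[i], .fp = fp, .mutex = &mutex };
        if (pthread_create(&threads[i], NULL, write_number, &jobs[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) pthread_join(threads[i], NULL);

    if (fclose(fp) != 0) rc = -1;        /* single close, after every join */
    pthread_mutex_destroy(&mutex);
    return rc;
}
```

Because each thread reads only its own job_t, the main loop can no longer overwrite num or thread_id while an earlier thread is still starting up, which the single shared data_t in the question allowed.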
I have a program that uses a library called "wjelement". Whenever I try to use this library with FastCGI, I get a segfault. I have made a simplified test case below. If I compile the code without fcgi_stdio.h and do not link against the FastCGI library, it works fine. If I add the FastCGI header and link against the library, I get a segfault, even if I don't make any FastCGI calls.
In my FastCGI code the opposite is also true: if I remove the WJElement code, the rest of the program works fine.
I'm not sure if I need to blame my program, the FastCGI Library, or the WJElement library...
#include <stdio.h>
#include <fcgi_stdio.h>
#include <wjreader.h>
int main (int argc, char *argv[]) {
FILE *my_schema_file;
my_schema_file = fopen("test_schema.json", "rb");
if (my_schema_file == NULL) {
printf("Failed to open test schema file\n");
return 1;
} else {
printf("Opened test schema file\n");
}
WJReader my_schema_reader;
my_schema_reader = WJROpenFILEDocument(my_schema_file, NULL, 0);
if (my_schema_reader == NULL) {
printf("Failed to open test schema reader\n");
return 1;
} else {
printf("Opened test schema reader\n");
}
return 0;
}
GDB Backtrace:
Program received signal SIGSEGV, Segmentation fault.
0x0000003e19e6c85f in __GI__IO_fread (buf=0x6023c4, size=1, count=2731, fp=0x602250) at iofread.c:41
41 _IO_acquire_lock (fp);
(gdb) backtrace
#0 0x0000003e19e6c85f in __GI__IO_fread (buf=0x6023c4, size=1, count=2731, fp=0x602250) at iofread.c:41
#1 0x00007ffff7dde5d9 in WJRFileCallback () from /lib/libwjreader.so.0
#2 0x00007ffff7dde037 in WJRFillBuffer () from /lib/libwjreader.so.0
#3 0x00007ffff7dde4e9 in _WJROpenDocument () from /lib/libwjreader.so.0
#4 0x000000000040081f in main (argc=1, argv=0x7fffffffdeb8) at test.c:20
Found the answer here: http://www.fastcgi.com/devkit/doc/fcgi-devel-kit.htm
If your application passes FILE * to functions implemented in libraries for which you do not have source code, then you'll need to include the headers for these libraries before you include fcgi_stdio.h
I then had to convert from FCGI_FILE * to FILE * with FCGI_ToFILE(FCGI_FILE *);
#include <stdio.h>
#include <wjreader.h>
#include <fcgi_stdio.h>
int main (int argc, char *argv[]) {
FILE *my_schema_file;
my_schema_file = fopen("test_schema.json", "rb");
if (my_schema_file == NULL) {
printf("Failed to open test schema file\n");
return 1;
} else {
printf("Opened test schema file\n");
}
WJReader my_schema_reader;
my_schema_reader = WJROpenFILEDocument(FCGI_ToFILE(my_schema_file), NULL, 0);
if (my_schema_reader == NULL) {
printf("Failed to open test schema reader\n");
return 1;
} else {
printf("Opened test schema reader\n");
}
return 0;
}
The standard way would be the following:
if (ptrace(PTRACE_TRACEME, 0, NULL, 0) == -1)
printf("traced!\n");
In this case, ptrace returns an error if the current process is traced (e.g., running it with GDB or attaching to it).
But there is a serious problem with this: if the call returns successfully, GDB may not attach to it later. Which is a problem since I'm not trying to implement anti-debug stuff. My purpose is to emit an 'int 3' when a condition is met (e.g., an assert fails) and GDB is running (otherwise I get a SIGTRAP which stops the application).
Disabling SIGTRAP and emitting an 'int 3' every time is not a good solution because the application I'm testing might be using SIGTRAP for some other purpose (in which case I'm still screwed, so it wouldn't matter, but it's the principle of the thing :))
On Windows there is an API, IsDebuggerPresent, to check whether the process is being debugged. On Linux, we can check this another way (not as efficient).
Check "/proc/self/status" for "TracerPid" attribute.
Example code:
#include <sys/stat.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <ctype.h>
bool debuggerIsAttached()
{
char buf[4096];
const int status_fd = open("/proc/self/status", O_RDONLY);
if (status_fd == -1)
return false;
const ssize_t num_read = read(status_fd, buf, sizeof(buf) - 1);
close(status_fd);
if (num_read <= 0)
return false;
buf[num_read] = '\0';
constexpr char tracerPidString[] = "TracerPid:";
const auto tracer_pid_ptr = strstr(buf, tracerPidString);
if (!tracer_pid_ptr)
return false;
for (const char* characterPtr = tracer_pid_ptr + sizeof(tracerPidString) - 1; characterPtr <= buf + num_read; ++characterPtr)
{
if (isspace(*characterPtr))
continue;
else
return isdigit(*characterPtr) != 0 && *characterPtr != '0';
}
return false;
}
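For a plain C build, the same TracerPid check can be tied to the original goal of breaking only when something is there to catch the trap. This is a minimal sketch assuming Linux with /proc mounted; tracer_pid_from_status(), debugger_is_attached() and BREAK_IF() are hypothetical names. Keep in mind that any ptrace-based tracer (strace, for example) also shows up in TracerPid, not just GDB.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parses the TracerPid field out of the text of /proc/<pid>/status.
 * Returns the tracer's PID (0 means "not traced"), or -1 if the field is missing. */
static long tracer_pid_from_status(const char *status)
{
    static const char field[] = "TracerPid:";
    const char *p = strstr(status, field);

    if (p == NULL) return -1;
    return strtol(p + sizeof(field) - 1, NULL, 10);   /* strtol skips the tab */
}

/* Returns 1 if a tracer is attached, 0 if not or if we can't tell. */
static int debugger_is_attached(void)
{
    char buf[4096];
    size_t n;
    FILE *f = fopen("/proc/self/status", "r");

    if (f == NULL) return 0;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    return tracer_pid_from_status(buf) > 0;
}

/* Break into the debugger only when one is there to catch the SIGTRAP. */
#define BREAK_IF(cond) \
    do { if ((cond) && debugger_is_attached()) raise(SIGTRAP); } while (0)
```

Reading the file on every failed assertion is cheap compared with forking, and it does not stop GDB from attaching later.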
The code I ended up using was the following:
int
gdb_check()
{
int pid = fork();
int status;
int res;
if (pid == -1)
{
perror("fork");
return -1;
}
if (pid == 0)
{
int ppid = getppid();
/* Child */
if (ptrace(PTRACE_ATTACH, ppid, NULL, NULL) == 0)
{
/* Wait for the parent to stop and continue it */
waitpid(ppid, NULL, 0);
ptrace(PTRACE_CONT, ppid, NULL, NULL);
/* Detach */
ptrace(PTRACE_DETACH, getppid(), NULL, NULL);
/* We were the tracers, so gdb is not present */
res = 0;
}
else
{
/* Trace failed so GDB is present */
res = 1;
}
exit(res);
}
else
{
waitpid(pid, &status, 0);
res = WEXITSTATUS(status);
}
return res;
}
A few things:
When ptrace(PTRACE_ATTACH, ...) is successful, the traced process will stop and has to be continued.
This also works when GDB is attaching later.
A drawback is that when used frequently, it will cause a serious slowdown.
Also, this solution is only confirmed to work on Linux. As the comments mentioned, it won't work on BSD.
You could fork a child which would try to PTRACE_ATTACH its parent (and then detach if necessary) and communicates the result back. It does seem a bit inelegant though.
As you mention, this is quite costly. I guess it's not too bad if assertions fail irregularly. Perhaps it'd be worthwhile keeping a single long-running child around to do this - share two pipes between the parent and the child, child does its check when it reads a byte and then sends a byte back with the status.
I had a similar need, and came up with the following alternatives:
static int _debugger_present = -1;
static void _sigtrap_handler(int signum)
{
_debugger_present = 0;
signal(SIGTRAP, SIG_DFL);
}
void debug_break(void)
{
if (-1 == _debugger_present) {
_debugger_present = 1;
signal(SIGTRAP, _sigtrap_handler);
raise(SIGTRAP);
}
}
If called, the debug_break function will only interrupt if a debugger is attached.
If you are running on x86 and want a breakpoint which interrupts in the caller (not in raise), just include the following header, and use the debug_break macro:
#ifndef BREAK_H
#define BREAK_H
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
static int _debugger_present = -1; /* static: the header may be included from several files */
static void _sigtrap_handler(int signum)
{
_debugger_present = 0;
signal(SIGTRAP, SIG_DFL);
}
#define debug_break() \
do { \
if (-1 == _debugger_present) { \
_debugger_present = 1; \
signal(SIGTRAP, _sigtrap_handler); \
__asm__("int3"); \
} \
} while(0)
#endif
I found that a modified version of the file descriptor "hack" described by Silviocesare and blogged by xorl worked well for me.
This is the modified code I use:
#include <stdio.h>
#include <unistd.h>
// gdb apparently opens FD(s) 3,4,5 (whereas a typical prog uses only stdin=0, stdout=1,stderr=2)
int detect_gdb(void)
{
int rc = 0;
FILE *fd = fopen("/tmp", "r");
if (fd == NULL)
{
return 0; /* can't tell, so assume no debugger */
}
if (fileno(fd) > 5)
{
rc = 1;
}
fclose(fd);
return rc;
}
If you just want to know whether the application is running under GDB for debugging purposes, the simplest solution on Linux is to readlink("/proc/<ppid>/exe"), and search the result for "gdb".
This is similar to terminus' answer, but uses pipes for communication:
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#if !defined(PTRACE_ATTACH) && defined(PT_ATTACH)
# define PTRACE_ATTACH PT_ATTACH
#endif
#if !defined(PTRACE_DETACH) && defined(PT_DETACH)
# define PTRACE_DETACH PT_DETACH
#endif
#ifdef __linux__
# define _PTRACE(_x, _y) ptrace(_x, _y, NULL, NULL)
#else
# define _PTRACE(_x, _y) ptrace(_x, _y, NULL, 0)
#endif
/** Determine if we're running under a debugger by attempting to attach using pattach
*
* @return 0 if we're not, 1 if we are, -1 if we can't tell.
*/
static int debugger_attached(void)
{
int pid;
int from_child[2] = {-1, -1};
if (pipe(from_child) < 0) {
fprintf(stderr, "Debugger check failed: Error opening internal pipe: %s", strerror(errno));
return -1;
}
pid = fork();
if (pid == -1) {
fprintf(stderr, "Debugger check failed: Error forking: %s", strerror(errno));
return -1;
}
/* Child */
if (pid == 0) {
uint8_t ret = 0;
int ppid = getppid();
/* Close parent's side */
close(from_child[0]);
if (_PTRACE(PTRACE_ATTACH, ppid) == 0) {
/* Wait for the parent to stop */
waitpid(ppid, NULL, 0);
/* Tell the parent what happened */
write(from_child[1], &ret, sizeof(ret));
/* Detach */
_PTRACE(PTRACE_DETACH, ppid);
exit(0);
}
ret = 1;
/* Tell the parent what happened */
write(from_child[1], &ret, sizeof(ret));
exit(0);
/* Parent */
} else {
int8_t ret = -1; /* signed, so the "not updated" check below can actually trigger */
/*
* The child writes a 1 if pattach failed else 0.
*
* This read may be interrupted by pattach,
* which is why we need the loop.
*/
while ((read(from_child[0], &ret, sizeof(ret)) < 0) && (errno == EINTR));
/* Ret not updated */
if (ret < 0) {
fprintf(stderr, "Debugger check failed: Error getting status from child: %s", strerror(errno));
}
/* Close the pipes here, to avoid races with pattach (if we did it above) */
close(from_child[1]);
close(from_child[0]);
/* Collect the status of the child */
waitpid(pid, NULL, 0);
return ret;
}
}
Trying the original code under OS X, I found waitpid (in the parent) would always return -1 with an EINTR (System call interrupted). This was caused by pattach, attaching to the parent and interrupting the call.
It wasn't clear whether it was safe to just call waitpid again (that seemed like it might behave incorrectly in some situations), so I just used a pipe to do the communication instead. It's a bit of extra code, but will probably work reliably across more platforms.
This code has been tested on OS X v10.9.3 (Mavericks), Ubuntu 14.04 (Trusty Tahr) (3.13.0-24-generic) and FreeBSD 10.0.
For Linux, which implements process capabilities, this method will only work if the process has the CAP_SYS_PTRACE capability, which is typically set when the process is run as root.
Other utilities (gdb and lldb) also have this capability set as part of their filesystem metadata.
You can detect whether the process has effective CAP_SYS_PTRACE by linking against -lcap:
#include <sys/capability.h>
cap_flag_value_t value;
cap_t current;
/*
* If we're running under Linux, we first need to check if we have
* permission to ptrace. We do that using the capabilities
* functions.
*/
current = cap_get_proc();
if (!current) {
fprintf(stderr, "Failed getting process capabilities: %s\n", strerror(errno));
return -1;
}
if (cap_get_flag(current, CAP_SYS_PTRACE, CAP_PERMITTED, &value) < 0) {
fprintf(stderr, "Failed getting permitted ptrace capability state: %s\n", strerror(errno));
cap_free(current);
return -1;
}
if ((value == CAP_SET) && (cap_get_flag(current, CAP_SYS_PTRACE, CAP_EFFECTIVE, &value) < 0)) {
fprintf(stderr, "Failed getting effective ptrace capability state: %s\n", strerror(errno));
cap_free(current);
return -1;
}
C++ version of Sam Liao's answer (Linux only):
#include <fstream>
#include <string>
// Detect if the application is running inside a debugger.
bool being_traced()
{
std::ifstream sf("/proc/self/status");
std::string s;
while (sf >> s)
{
if (s == "TracerPid:")
{
int pid;
sf >> pid;
return pid != 0;
}
std::getline(sf, s);
}
return false;
}