Related
I am maintaining/developing an existing code-base. We have a Raspberry Pi controlling a bunch of hardware, some of which is modular. The code, written in C (might be C++), communicates using IPv4 standards over a socket (using socket.h) with a gui on Windows. I'm pretty sure we don't have multithreading implemented in the raspberry code, or this would be much easier.
Some of the modular hardware is not interacting with the code. Without the extra hardware, the raspberry sits there waiting until the GUI sends something. That's how we want it behaving.
Problem is, when we have this extra hardware connected, the raspberry needs to also run some code in response to one of its IO pins tied to a button on the hardware.
I tried adding (various) conditions to a loop in main(), which is where I thought the code was idling. I always got either the hardware OR GUI control working, but not both.
I eventually figured out that read(), which is called near the end of that loop, is blocking.
Now I'm trying to figure out how to execute one chunk of code (for the hardware control) if read() is blocking, and another when it stops.
Something like:
(Pseudocode)
read socket
while(blocking)
{
check for hardware signal, if found, runTest();
}
{use result of read()}
The loop in main:
while(true)
{
initPort();
while(listening)
{
openPort();
if (netOpen)
{
//puts("// Tester is connected /////////////////////////////////");
loadCalFactors();
loadCalResistors();
loadExtConf(true);
inOn(false);
outOn(false);
// Loop until client terminates connection.
initPins();
while(netOpen)
{
/*
* Moving the block to work().
*/
// oldI = i;
// i = digitalRead(iTSW);
//
// if((i ^ oldI) == 0 && oldI == 1) // TSW has been held on
// testing = true;
//
// else if((i ^ oldI) == 1 && oldI == 0)
// testing = false;
//
// if(testLoaded && usingTermCtrl && i && !testing)
// {
// RunTp(false);
// }
// else
work();
}
}
}
}
Here's the code where it's blocking:
void work()
{
char bs[1];
int n;
//socket Sockfd is blocking here, so we only briefly return to the
//loop in main right after an action
n = read(Sockfd, bs, 1);
if(n > 0)
doAct(bs[0]);
else
checkForSOT();
}
I tried putting all my hardware checks in checkForSOT() (which was previously empty) already, but it didn't do any better than the loop in main.
And where the socket is being setup:
void initPort()
{
struct sockaddr_in serv_addr;
printf("Initializing Port %d\n", HOST_PORT);
listening = false;
sockListen = socket(AF_INET, SOCK_STREAM, 0);
printf("sockListen=%d\n",sockListen);
if(sockListen < 0)
{
puts("Error opening socket");
delay(1000);
}
else
{
bzero((char *)&serv_addr, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
serv_addr.sin_addr.s_addr = INADDR_ANY;
serv_addr.sin_port = htons(HOST_PORT);
if(bind(sockListen, (struct sockaddr *) &serv_addr,
sizeof(serv_addr)) < 0)
{
puts("Error on binding");
delay(1000);
}
else
{
listen(sockListen, 1);
listening = true;
}
}
}
void openPort()
{
int clilen;
int newsockfd;
printf("\nTester waiting for connection on socket %d\n", sockListen);
clilen = sizeof(cli_addr);
// infinite wait on a connection
if((newsockfd = accept(sockListen,
(struct sockaddr *) &cli_addr, (socklen_t*) &clilen) ) < 0 )
puts("Error on accept");
else
{
if (DEBUG) debugMode = true;
netOpen = true;
Sockfd = newsockfd;
sendVersion();
netWrite("Connected\n");
sendExtConfMessage();
sendExtConf();
}
}
Most of the posts I found on sockets blocking mention poll() and select() (along with their p- and e- upgrades(?)), but don't go into clear-enough detail on how they work for me to figure out if they are what I need.
I'm also not sure what it would take to change the socket to non-blocking while maintaining the same behavior.
Note: I am still trying to read through and wrap my head around Beej's guide to network programming, so if there's a specific section of that that'll help me, please be specific.
Also, if anyone knows of (or could write) a good guide to setting up remote debugging the raspberry through NetBeans 12 (on Windows 10), that would be a HUGE help!
Poll isn't very hard. Make a loop that runs at least every 5 msec.
while (running)
Initialize, then wait for something to happen
struct pollfd fds = {fd, POLLIN, 0};
int rc = poll(&fds, 1, 5); // wait for fds or 5 msec
if (rc < 0) {
perror("poll");
exit(1);
} else if (rc > 0) {
if (fds.revents & POLLIN)
recv(fd); // fds is ready
}
// check button here
}
Obviously there should be more checking and cleanup on error conditions.
I used select for synchronous I/O multiplexing.It will check for any data for 1 second.After 1 second if no data it will display a output (puts("Waited for 1 sec no data");) then it will check again for data.But this is working only at first time then it enters endless loops.
Is there any alternative solution for this.
//..............................
//.............................
//Creating listener socket and other sort of things
struct timeval tv;
tv.tv_sec=1;
tv.tv_usec=0;
while(1)
{
FD_ZERO(master);
FD_SET(listener,master);
fdmax = listener;
int retval=select(fdmax+1,master, NULL, NULL,&tv);
printf("retval is %d\n",retval);
if(retval == -1)
{
perror("Server-select() error");
}else if(retval)
{
puts("Data available");
//If there is no data do some work and checkagain.
}else
{
puts("Waited for 1 sec no data");
//If there is no data do some work and checkagain.
}
}
From man select:
On Linux, select() modifies timeout to reflect the amount of time not slept; most other implementations do not do this. (POSIX.1-2001 permits either behavior.) This causes problems both when Linux code which reads timeout is ported to other operating systems, and when code is ported to Linux that reuses a struct timeval for multiple select()s in a loop without reinitializing it. Consider timeout to be undefined after select() returns.
So like master, you will have to set tv before each select call.
In my codes, I often have something like:
FD_ZERO(master);
FD_SET(listener,master);
fdmax = listener;
while (1)
{
struct timeval tv = {1, 0};
int retval=select(fdmax+1,master, NULL, NULL,&tv);
printf("retval is %d\n",retval);
if(retval == -1) {
perror("Server-select() error");
break; // <-- notice the break here
} else if(retval) {
puts("Data available");
} else {
puts("Waited for 1 sec no data");
}
}
In addition to Mathieu answer, it seems that before each call to select FD_ZERO must be called to empty readable file handle set and then calls the FD_SET.
I was struggling with select() returning always 0 after reaching the timeout regardless the fact that there are data to read (I read inputs from the keyboard, I checked with 'cat' command that data are sent at each Key press).
The code should be (Credit to previous answers):
while (1)
{
struct timeval tv = {1, 0};
FD_ZERO(master);
FD_SET(listener,master);
fdmax = listener;
int retval=select(fdmax+1,master, NULL, NULL,&tv);
printf("retval is %d\n",retval);
if(retval == -1) {
perror("Server-select() error");
break; // <-- notice the break here
} else if(retval) {
puts("Data available");
} else {
puts("Waited for 1 sec no data");
}
}
I'm initializing a daemon in C in a Debian:
/**
* Initializes the daemon so that mcu.serial would listen in the background
*/
void init_daemon()
{
pid_t process_id = 0;
pid_t sid = 0;
// Create child process
process_id = fork();
// Indication of fork() failure
if (process_id < 0) {
printf("Fork failed!\n");
logger("Fork failed", LOG_LEVEL_ERROR);
exit(1);
}
// PARENT PROCESS. Need to kill it.
if (process_id > 0) {
printf("process_id of child process %i\n", process_id);
exit(0);
}
//unmask the file mode
umask(0);
//set new session
sid = setsid();
if(sid < 0) {
printf("could not set new session");
logger("could not set new session", LOG_LEVEL_ERROR);
exit(1);
}
// Close stdin. stdout and stderr
close(STDIN_FILENO);
close(STDOUT_FILENO);
close(STDERR_FILENO);
}
The main daemon runs in the background and monitors a serial port to communicate with a microcontroller - it reads peripherals (such as button presses) and passes information to it. The main functional loop is
int main(int argc, char *argv[])
{
// We need the port to listen to commands writing
if (argc < 2) {
fprintf(stderr,"ERROR, no port provided\n");
logger("ERROR, no port provided", LOG_LEVEL_ERROR);
exit(1);
}
int portno = atoi(argv[1]);
// Initialize serial port
init_serial();
// Initialize server for listening to socket
init_server(portno);
// Initialize daemon and run the process in the background
init_daemon();
// Timeout for reading socket
fd_set setSerial, setSocket;
struct timeval timeout;
timeout.tv_sec = 0;
timeout.tv_usec = 10000;
char bufferWrite[BUFFER_WRITE_SIZE];
char bufferRead[BUFFER_READ_SIZE];
int n;
int sleep;
int newsockfd;
while (1)
{
// Reset parameters
bzero(bufferWrite, BUFFER_WRITE_SIZE);
bzero(bufferRead, BUFFER_WRITE_SIZE);
FD_ZERO(&setSerial);
FD_SET(fserial, &setSerial);
FD_ZERO(&setSocket);
FD_SET(sockfd, &setSocket);
// Start listening to socket for commands
listen(sockfd,5);
clilen = sizeof(cli_addr);
// Wait for command but timeout
n = select(sockfd + 1, &setSocket, NULL, NULL, &timeout);
if (n == -1) {
// Error. Handled below
}
// This is for READING button
else if (n == 0) {
// This timeout is okay
// This allows us to read the button press as well
// Now read the response, but timeout if nothing returned
n = select(fserial + 1, &setSerial, NULL, NULL, &timeout);
if (n == -1) {
// Error. Handled below
} else if (n == 0) {
// timeout
// This is an okay tiemout; i.e. nothing has happened
} else {
n = read(fserial, bufferRead, sizeof bufferRead);
if (n > 0) {
logger(bufferRead, LOG_LEVEL_INFO);
if (strcmp(stripNewLine(bufferRead), "ev b2") == 0) {
//logger("Shutting down now", LOG_LEVEL_INFO);
system("shutdown -h now");
}
} else {
logger("Could not read button press", LOG_LEVEL_WARN);
}
}
}
// This is for WRITING COMMANDS
else {
// Now read the command
newsockfd = accept(sockfd, (struct sockaddr *) &cli_addr, &clilen);
if (newsockfd < 0 || n < 0) logger("Could not accept socket port", LOG_LEVEL_ERROR);
// Now read the command
n = read(newsockfd, bufferWrite, BUFFER_WRITE_SIZE);
if (n < 0) {
logger("Could not read command from socket port", LOG_LEVEL_ERROR);
} else {
//logger(bufferWrite, LOG_LEVEL_INFO);
}
// Write the command to the serial
write(fserial, bufferWrite, strlen(bufferWrite));
sleep = 200 * strlen(bufferWrite) - timeout.tv_usec; // Sleep 200uS/byte
if (sleep > 0) usleep(sleep);
// Now read the response, but timeout if nothing returned
n = select(fserial + 1, &setSerial, NULL, NULL, &timeout);
if (n == -1) {
// Error. Handled below
} else if (n == 0) {
// timeout
sprintf(bufferRead, "err\r\n");
logger("Did not receive response from MCU", LOG_LEVEL_WARN);
} else {
n = read(fserial, bufferRead, sizeof bufferRead);
}
// Error reading from the socket
if (n < 0) {
logger("Could not read response from serial port", LOG_LEVEL_ERROR);
} else {
//logger(bufferRead, LOG_LEVEL_INFO);
}
// Send MCU response to client
n = write(newsockfd, bufferRead, strlen(bufferRead));
if (n < 0) logger("Could not write confirmation to socket port", LOG_LEVEL_ERROR);
}
close(newsockfd);
}
close(sockfd);
return 0;
}
But the CPU usages is always at 100%. Why is that? What can I do?
EDIT
I commented out the entire while loop and made the main function as simple as:
int main(int argc, char *argv[])
{
init_daemon();
while(1) {
// All commented out
}
return 0;
}
And I'm still getting 100% cpu usage
You need to set timeout to the wanted value on every iteration, the struct gets modified on Linux so I think your loop is not pausing except for the first time, i.e. select() is only blocking the very first time.
Try to print tv_sec and tv_usec after select() and see, it's modified to reflect how much time was left before select() returned.
Move this part
timeout.tv_sec = 0;
timeout.tv_usec = 10000;
inside the loop before the select() call and it should work as you expect it to, you can move many delcarations inside the loop too, that would make your code easier to maintan, you could for example move the loop content to a function in the future and that might help.
This is from the linux manual page select(2)
On Linux, select() modifies timeout to reflect the amount of time not slept; most other implementations do not do this. (POSIX.1-2001 permits either behavior.) This causes problems both when Linux code which reads timeout is ported to other operating systems, and when code is ported to Linux that reuses a struct timeval for multiple select()s in a loop without reinitializing it. Consider timeout to be undefined after select() returns.
I think the bold part in the qoute is the important one.
I am doing a university project, we have to use the Unix system call.
Right now, I'm really struggling to understand if, in my project, there really is a mistake. This is because, while in terminal it compiles and it starts and finishes without error, on xcode I get several errors.
In particular, I get errors when using semaphores.
I'll try to explain what errors, I receive, but since I'm not native English speakers forgive me in advance if I make some mistakes.
First, the program creates a number of child processes with a fork (). It does depending on how many clientei.txt located (i = iterator).
Immediately I block parent with a semaphore, I run the child up to a certain point, then I block it with a semaphore and I restart the parent.
At this point, the parent should read a message sent by his son, call a function to print the content inside a log.txt and restart the son.
Then the child does other things (including erase the message) and it block.
The parent restart, and everything is repeated for subsequent children.
While in terminal synchronization is perfect (everything happens at the right time without error) this both Linux and Mac, about XCode I had several errors:
semop: Resource temporarily unavailable (if I created more than 5 txt)
semop: File too large (if I created more than 2)
with 2 instead gave me two errors:
semop 1: Interrupted system call (this stops after running both processes)
semop 3: Identifier removed (with this in restarting the second process)
is not so much time that I do C then I do not know what to do. I would like first of all to know if I have to worry (so there is an error), or I have to be quiet because it is a bug in xcode.
If there was a mistake I kindly ask you not to ask me to change the code a lot.
This is mainly because they are close to expiring and I can not afford to do it all again.
I also ask you, if you can, to be as clear as possible. I understand enough English, but not as a mother-tongue, I can not always follow the responses here on StackOverflow.
The code is here:
https://www.dropbox.com/s/2utsb6r5d7kzzqj/xcode%2Bterminal.zip?dl=0
this zip contain a small part of the project that has this problem.
the terminal version works. there is a makefile in this version to simplify the compilation.
xcode version does not work. It contains the Debug folder. Indeed xcode, txt files, it does not read from the root folder where the codes are contained in the folder where it creates the compiled. There is a readme in each case with the procedure in detail.
I tried to minimize, I commented all in English.
I removed the code that was not needed, but I added the file with all the include and functions that use.
here the code:
main.c
key_t key, key_2;
int semid, semid_2;
union semun arg;
union semun arg_2;
struct sembuf sb_2 = {0, -1, 0};
char* nome_file;
nome_file = (char*) malloc(sizeof(char*));
int numero_clienti;
//semaphore for all the child
struct sembuf sb[numero_clienti];
int i_c;
for (i_c = 0; i_c < numero_clienti; i_c++) {
sb[i_c].sem_num = i_c;
sb[i_c].sem_op = -1;
sb[i_c].sem_flg = 0;
}
//cretion of first SEMAPHORE
{
//key creation
if ((key = ftok("cliente0.txt", 'J')) == -1)
{
perror("ftok");
exit(EXIT_FAILURE);
}
//creation of the semaphore
if ((semid = semget(key, numero_clienti, 0666 | IPC_CREAT | IPC_EXCL)) == -1)
{
perror("semget");
exit(EXIT_FAILURE);
}
//set value of all child semaphore
for (i_c = 0; i_c < numero_clienti; i_c++) {
arg.val = 0;
if (semctl(semid, i_c, SETVAL, arg) == -1)
{
perror("semctl");
exit(EXIT_FAILURE);
}
}
}
//cretion of second SEMAPHORE
{
//key creation
if ((key_2 = ftok("cliente1.txt", 'J')) == -1)
{
perror("ftok");
exit(EXIT_FAILURE);
}
//creation of the semaphore
if ((semid_2 = semget(key_2, 1, 0666 | IPC_CREAT | IPC_EXCL)) == -1)
{
perror("semget");
exit(EXIT_FAILURE);
}
//set value of parent semaphore
arg_2.val = 0;
if (semctl(semid_2, 0, SETVAL, arg_2) == -1)
{
perror("semctl");
exit(EXIT_FAILURE);
}
}
while(fd > 0 && pid > 0){
j++;
close(fd);
pid = fork();
if(pid != 0)
{
i++;
sprintf(nome_file, "./cliente%d.txt", i);
fd = open(nome_file, O_RDONLY);
}
switch(pid)
{
//error case
case -1:
{
perror("Error during fork.");
exit(EXIT_FAILURE);
break;
}
//child case
case 0:
{
puts("Child: I'm a child");
messaggio(numero_clienti, j);
puts("Child: I have to do something");
//Start parent
sb_2.sem_op = 1;
if (semop(semid_2, &sb_2, 1) == -1)
{
perror("semop");
exit(1);
}
//, stop itself
sb[j].sem_op = -1;
if (semop(semid, &sb[j], 1) == -1)
{
perror("semop");
exit(1);
}
printf("Child: I have to do something else %d\n", getpid());
_exit(EXIT_SUCCESS);
break;
}
//parent case
default:
{
puts("Parent: I'm a parent");
//Stop itself
sb_2.sem_op = -1;
if (semop(semid_2, &sb_2, 1) == -1)
{
perror("semop padre");
exit(1);
}
puts("Parent: now I can send the message, my child is blocked");
//restart child
sb[j].sem_op = 1;
if (semop(semid, &sb[j], 1) == -1)
{
perror("semop");
exit(1);
}
//stop itself
sb_2.sem_op = -1;
if (semop(semid_2, &sb_2, 1) == -1)
{
perror("semop");
exit(1);
}
puts("Parent: end of while");
break;
}
}
}
puts("Parent: I can restart all my child");
for (i_c = 0; i_c < numero_clienti; i_c++) {
sb[i_c].sem_op = 1;
if (semop(semid, &sb[i_c], 1) == -1)
{
perror("semop");
exit(1);
}
}
puts("I wait the end of my child...");
while (wait(NULL) != -1);
puts("All child end");
//remove semaphore I create
if (semctl(semid, 0, IPC_RMID, arg) == -1)
{
perror("semctl");
exit(1);
}
if (semctl(semid_2, 0, IPC_RMID, arg_2) == -1)
{
perror("semctl");
exit(1);
}
puts("FINE");
return 0;
}
cliente.c
#include "cliente.h"
/**
inside this function child do some thing.
1. at this point it give control to parent after it create a message
2. at this point it remove the message
*/
void messaggio(int numero_clienti, int num_j){
key_t key, key_2;
int semid, semid_2;
struct sembuf sb[numero_clienti];
int i_c;
for (i_c = 0; i_c < numero_clienti; i_c++) {
sb[i_c].sem_num = i_c;
sb[i_c].sem_op = -1;
sb[i_c].sem_flg = 0;
}
struct sembuf sb_2 = {0, -1, 0};
if ((key = ftok("cliente0.txt", 'J')) == -1) {
perror("ftok");
exit(1);
}
if ((semid = semget(key, 1, 0)) == -1) {
perror("semget");
exit(1);
}
if ((key_2 = ftok("cliente1.txt", 'J')) == -1) {
perror("ftok");
exit(1);
}
if ((semid_2 = semget(key_2, 1, 0)) == -1) {
perror("semget");
exit(1);
}
//creation of a message
//1. Restart parent
sb_2.sem_op = 1;
if (semop(semid_2, &sb_2, 1) == -1)
{
perror("semop");
exit(1);
}
puts("cambio sem");
//stop itself
sb[num_j].sem_op = -1;
if (semop(semid, &sb[num_j], 1) == -1)
{
perror("semop");
exit(1);
}
//here it can move again
puts("remove message");
puts("Figlio: sono tornato attivo, mio padre aspetta");
}
1st you do
nome_file = (char*) malloc(sizeof(char*));
which allocates 4 or 8 bytes (depending on the platform you compile on: 32 or 64bit).
Then you do
sprintf(nome_file, "./cliente%d.txt", i);
The latter writes to invalid memory, as "./cliente%d.txt" is 14+1 characters long plus the potenial number of digits from i if i>9 or and addtional sign if i<0.
To fix this allocate what is needed:
nome_file = malloc(13 + 10 + 1 + 1); /* 13 for the filename,
10 for the digits,
1 for a potential sign,
1 the C-"strings" 0-terminator. */
This is a really ugly bug, which is expected to be the main issue in your code.
Also in the sources (you linked) in function read_line() you allocate memory, which you do not properly initialise, but later depend on its content.
main.c:20
char* myb2 = (char*) malloc(sizeof(char*));
malloc() does not initialise the memory it allocates, so either do:
char * myb2 = calloc(1, sizeof(char*));
of add and addtional call to
memset(mb2, 0, sizeof(char*));
after the call to malloc().
This bug is nasty either.
Also^2 you should build using gcc's options -std=c99 -D_XOPEN_SOURCE.
That is because:
You are using C constructs available from C99 on only. Typically VLAs, so tell the compiler to treat the code as being C99 code by explcitly stating -std=c99
To #define _XOPEN_SOURCE is issued by gcc, for some header you include in your project.
Also^3 you seem to be not necessarily count the correct number of client(file)s, at least not if your files a distributed as per the archive you linked:
main.c:82
system("ls cliente* | wc -l");
Change this to be:
system("ls cliente*.txt | wc -l");
If the bug described above should return more files then there actually are the following code fails as well from a certain value of i on:
main.c:176
fd = open(nome_file, O_RDONLY);
The result of the above operation is NOT tested. The possible invalid fd is used and the infamous undefined behaviour is taking over. Everything can happen.
As a final note: It's mostly never a bug in the tools we are using.
The standard way would be the following:
if (ptrace(PTRACE_TRACEME, 0, NULL, 0) == -1)
printf("traced!\n");
In this case, ptrace returns an error if the current process is traced (e.g., running it with GDB or attaching to it).
But there is a serious problem with this: if the call returns successfully, GDB may not attach to it later. Which is a problem since I'm not trying to implement anti-debug stuff. My purpose is to emit an 'int 3' when a condition is met (e.g., an assert fails) and GDB is running (otherwise I get a SIGTRAP which stops the application).
Disabling SIGTRAP and emitting an 'int 3' every time is not a good solution because the application I'm testing might be using SIGTRAP for some other purpose (in which case I'm still screwed, so it wouldn't matter, but it's the principle of the thing :))
On Windows there is an API, IsDebuggerPresent, to check if process is under debugging. At Linux, we can check this with another way (not so efficient).
Check "/proc/self/status" for "TracerPid" attribute.
Example code:
#include <sys/stat.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <ctype.h>
bool debuggerIsAttached()
{
char buf[4096];
const int status_fd = open("/proc/self/status", O_RDONLY);
if (status_fd == -1)
return false;
const ssize_t num_read = read(status_fd, buf, sizeof(buf) - 1);
close(status_fd);
if (num_read <= 0)
return false;
buf[num_read] = '\0';
constexpr char tracerPidString[] = "TracerPid:";
const auto tracer_pid_ptr = strstr(buf, tracerPidString);
if (!tracer_pid_ptr)
return false;
for (const char* characterPtr = tracer_pid_ptr + sizeof(tracerPidString) - 1; characterPtr <= buf + num_read; ++characterPtr)
{
if (isspace(*characterPtr))
continue;
else
return isdigit(*characterPtr) != 0 && *characterPtr != '0';
}
return false;
}
The code I ended up using was the following:
int
gdb_check()
{
int pid = fork();
int status;
int res;
if (pid == -1)
{
perror("fork");
return -1;
}
if (pid == 0)
{
int ppid = getppid();
/* Child */
if (ptrace(PTRACE_ATTACH, ppid, NULL, NULL) == 0)
{
/* Wait for the parent to stop and continue it */
waitpid(ppid, NULL, 0);
ptrace(PTRACE_CONT, NULL, NULL);
/* Detach */
ptrace(PTRACE_DETACH, getppid(), NULL, NULL);
/* We were the tracers, so gdb is not present */
res = 0;
}
else
{
/* Trace failed so GDB is present */
res = 1;
}
exit(res);
}
else
{
waitpid(pid, &status, 0);
res = WEXITSTATUS(status);
}
return res;
}
A few things:
When ptrace(PTRACE_ATTACH, ...) is successful, the traced process will stop and has to be continued.
This also works when GDB is attaching later.
A drawback is that when used frequently, it will cause a serious slowdown.
Also, this solution is only confirmed to work on Linux. As the comments mentioned, it won't work on BSD.
You could fork a child which would try to PTRACE_ATTACH its parent (and then detach if necessary) and communicates the result back. It does seem a bit inelegant though.
As you mention, this is quite costly. I guess it's not too bad if assertions fail irregularly. Perhaps it'd be worthwhile keeping a single long-running child around to do this - share two pipes between the parent and the child, child does its check when it reads a byte and then sends a byte back with the status.
I had a similar need, and came up with the following alternatives
static int _debugger_present = -1;
static void _sigtrap_handler(int signum)
{
_debugger_present = 0;
signal(SIGTRAP, SIG_DFL);
}
void debug_break(void)
{
if (-1 == _debugger_present) {
_debugger_present = 1;
signal(SIGTRAP, _sigtrap_handler);
raise(SIGTRAP);
}
}
If called, the debug_break function will only interrupt if a debugger is attached.
If you are running on x86 and want a breakpoint which interrupts in the caller (not in raise), just include the following header, and use the debug_break macro:
#ifndef BREAK_H
#define BREAK_H
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
int _debugger_present = -1;
static void _sigtrap_handler(int signum)
{
_debugger_present = 0;
signal(SIGTRAP, SIG_DFL);
}
#define debug_break() \
do { \
if (-1 == _debugger_present) { \
_debugger_present = 1; \
signal(SIGTRAP, _sigtrap_handler); \
__asm__("int3"); \
} \
} while(0)
#endif
I found that a modified version of the file descriptor "hack" described by Silviocesare and blogged by xorl worked well for me.
This is the modified code I use:
#include <stdio.h>
#include <unistd.h>
// gdb apparently opens FD(s) 3,4,5 (whereas a typical prog uses only stdin=0, stdout=1,stderr=2)
int detect_gdb(void)
{
int rc = 0;
FILE *fd = fopen("/tmp", "r");
if (fileno(fd) > 5)
{
rc = 1;
}
fclose(fd);
return rc;
}
If you just want to know whether the application is running under GDB for debugging purposes, the simplest solution on Linux is to readlink("/proc/<ppid>/exe"), and search the result for "gdb".
This is similar to terminus' answer, but uses pipes for communication:
#include <unistd.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#if !defined(PTRACE_ATTACH) && defined(PT_ATTACH)
# define PTRACE_ATTACH PT_ATTACH
#endif
#if !defined(PTRACE_DETACH) && defined(PT_DETACH)
# define PTRACE_DETACH PT_DETACH
#endif
#ifdef __linux__
# define _PTRACE(_x, _y) ptrace(_x, _y, NULL, NULL)
#else
# define _PTRACE(_x, _y) ptrace(_x, _y, NULL, 0)
#endif
/** Determine if we're running under a debugger by attempting to attach using pattach
*
* #return 0 if we're not, 1 if we are, -1 if we can't tell.
*/
static int debugger_attached(void)
{
int pid;
int from_child[2] = {-1, -1};
if (pipe(from_child) < 0) {
fprintf(stderr, "Debugger check failed: Error opening internal pipe: %s", syserror(errno));
return -1;
}
pid = fork();
if (pid == -1) {
fprintf(stderr, "Debugger check failed: Error forking: %s", syserror(errno));
return -1;
}
/* Child */
if (pid == 0) {
uint8_t ret = 0;
int ppid = getppid();
/* Close parent's side */
close(from_child[0]);
if (_PTRACE(PTRACE_ATTACH, ppid) == 0) {
/* Wait for the parent to stop */
waitpid(ppid, NULL, 0);
/* Tell the parent what happened */
write(from_child[1], &ret, sizeof(ret));
/* Detach */
_PTRACE(PTRACE_DETACH, ppid);
exit(0);
}
ret = 1;
/* Tell the parent what happened */
write(from_child[1], &ret, sizeof(ret));
exit(0);
/* Parent */
} else {
uint8_t ret = -1;
/*
* The child writes a 1 if pattach failed else 0.
*
* This read may be interrupted by pattach,
* which is why we need the loop.
*/
while ((read(from_child[0], &ret, sizeof(ret)) < 0) && (errno == EINTR));
/* Ret not updated */
if (ret < 0) {
fprintf(stderr, "Debugger check failed: Error getting status from child: %s", syserror(errno));
}
/* Close the pipes here, to avoid races with pattach (if we did it above) */
close(from_child[1]);
close(from_child[0]);
/* Collect the status of the child */
waitpid(pid, NULL, 0);
return ret;
}
}
Trying the original code under OS X, I found waitpid (in the parent) would always return -1 with an EINTR (System call interrupted). This was caused by pattach, attaching to the parent and interrupting the call.
It wasn't clear whether it was safe to just call waitpid again (that seemed like it might behave incorrectly in some situations), so I just used a pipe to do the communication instead. It's a bit of extra code, but will probably work reliably across more platforms.
This code has been tested on OS X v10.9.3 (Mavericks), Ubuntu 14.04 (Trusty Tahr) (3.13.0-24-generic) and FreeBSD 10.0.
For Linux, which implements process capabilities, this method will only work if the process has the CAP_SYS_PTRACE capability, which is typically set when the process is run as root.
Other utilities (gdb and lldb) also have this capability set as part of their filesystem metadata.
You can detect whether the process has effective CAP_SYS_PTRACE by linking against -lcap,
#include <sys/capability.h>
cap_flag_value_t value;
cap_t current;
/*
* If we're running under Linux, we first need to check if we have
* permission to to ptrace. We do that using the capabilities
* functions.
*/
current = cap_get_proc();
if (!current) {
fprintf(stderr, "Failed getting process capabilities: %s\n", syserror(errno));
return -1;
}
if (cap_get_flag(current, CAP_SYS_PTRACE, CAP_PERMITTED, &value) < 0) {
fprintf(stderr, "Failed getting permitted ptrace capability state: %s\n", syserror(errno));
cap_free(current);
return -1;
}
if ((value == CAP_SET) && (cap_get_flag(current, CAP_SYS_PTRACE, CAP_EFFECTIVE, &value) < 0)) {
fprintf(stderr, "Failed getting effective ptrace capability state: %s\n", syserror(errno));
cap_free(current);
return -1;
}
C++ version of Sam Liao's answer (Linux only):
// Detect if the application is running inside a debugger.
bool being_traced()
{
std::ifstream sf("/proc/self/status");
std::string s;
while (sf >> s)
{
if (s == "TracerPid:")
{
int pid;
sf >> pid;
return pid != 0;
}
std::getline(sf, s);
}
return false;
}