I am running two separate threads in C, each doing some operations inside an infinite loop. After I run the program a couple of times, I always get the following memory error:
*** glibc detected *** ./a.out: free(): invalid next size (normal): 0x08652510 ***
======= Backtrace: =========
/lib/libc.so.6(+0x6c501)[0x3d1501]
...
I believe it is a memory error. Whenever I need to stop the program (because I'm still testing it), I terminate it with Ctrl+C, so I believe I always miss the free() calls, which then causes the error.
Can you tell me how to avoid this situation, so that I can free() memory even when I terminate the program?
Another thing that comes to mind: when I wait a couple of minutes and then run the program again, it runs perfectly.
Thanks for any hints.
void *lineOne(void *dataO)
{
    struct IPlist *IPlist = dataO;
    static struct ARP_entryI ARP_tableI[ARP_TABLE_SIZE];
    int neigh = 0;  // number of neighbours
    int neigh1 = 0; // neighbour count before the last update
    int i;
    getAddress();
    while (1)
    {
        while (neigh == neigh1)
        {
            neigh = rcvBroad(IPlist, neigh);
        }
        neigh1 = neigh;
        for (i = neigh; i < neigh + 1; i++)
        {
            main_client(ARP_tableI, IPlist[i-1].IPaddr); // sends TCP, receives ARP
            getAddress();
        }
    }
}
// listens as the server, replies to ARP
void *lineTwo(void *unused)
{
    static struct ARP_entryO ARP_tableO[ARP_TABLE_SIZE];
    int line = from_local_arp(ARP_tableO);
    main_server(ARP_tableO, line); // listens for TCP, sends ARP
    return NULL;
}
int main(void)
{
    static struct IPlist *IPlist[ARP_TABLE_SIZE];
    pthread_t thread1, thread2;
    int iret1, iret2;
    /* Create independent threads, each of which will execute its function */
    iret1 = pthread_create(&thread1, NULL, lineOne, (void *)IPlist);
    iret2 = pthread_create(&thread2, NULL, lineTwo, NULL);
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);
    return 0;
}
You could handle SIGINT, but it would not help: your code has already corrupted memory by the time you would make that extra free() call.
To find the problem, compile with -g and run the program under valgrind.
Try running your program through Valgrind and see if it can tell you where the memory allocation structure gets corrupted. From the error, it looks like you are doing something invalid with allocated memory somewhere, which corrupts the allocator's internal data structures.
Use a signal handler to catch the Ctrl-C event (Ctrl-C generates a SIGINT signal), and set a flag in the handler. Modify the infinite loop so that it stops looping when it sees the flag, and put the cleanup code after the loop. Your program will then end "normally".
The signal handling functions are part of the GNU C library (and, I think, of any POSIX system); see the signal handling chapter of the GNU C Library documentation.
You have corrupted your heap space. Perhaps you are writing off the end (or before the beginning) of a chunk of allocated memory. The free is detecting the corruption and producing the error.
When the program is terminated all memory will be freed automatically.
I have a small program that contains a variable that I need to malloc:
char **v;
v = (char**)malloc(sizeof(char *) * MAX_EVENTS);
for (int i = 0; i < MAX_EVENTS; i++)
v[i] = (char *)malloc(MAX_NAME_SIZE);
In order to make Valgrind happy, to avoid any memory leaks, I set up handlers for termination signals. This handler will simply free that allocation before exiting, as well as terminating child processes.
static void term_handler() {
if (v != NULL) {
for (int i = 0; i < MAX_EVENTS; i++) {
if (v[i] != NULL)
free(v[i]);
}
free(v);
}
for (int i = 0; i < MAX_PROCS; i++)
if (children[i])
kill(children[i], SIGTERM);
exit(EXIT_SUCCESS);
}
To access v from the handler, I put it as a global variable. children is a static array pid_t children[MAX_PROCS]; but could potentially be malloced as well.
What is the cleanest way to access those allocations from the handler? Having global variables is not recommended, but neither are memory leaks or improperly terminated programs.
Should I keep an array of pointers to my allocations as a global variable? Or should I just avoid handling unexpected signals?
Signal handlers are tricky, in that they are called asynchronously, and therefore there are only a small set of function calls that are safe to call from within a signal handler. In particular, allocating or freeing memory from within a signal-handler is a no-no (as is calling exit()!), so don't do it.
If you want to make sure the memory gets freed(*), however, you can do so by having your signal handler "tell" your program's main thread that it is time for it to exit. The main thread can then break out of its event loop, free the memory, and do any other cleanup work it would normally do before exiting.
So then the question becomes, how can a signal handler safely tell the main thread to perform a controlled/graceful exit?
If the main thread is running an event loop that executes on a fixed schedule (e.g. every so-many milliseconds), it may be as easy as declaring a global variable (e.g. volatile sig_atomic_t pleaseQuitNow = 0;) that the main thread tests on each iteration of its event loop, and having the signal handler set that variable to a different value. The main thread will then see the changed variable on its next iteration and respond by breaking out of the event loop.
If the main thread's event loop is event-based, on the other hand (e.g. it is blocked inside select() or poll() or similar and the call won't return for some indefinite amount of time), then an alternate way to wake up the main thread would be to create a pipe() or socketpair() at program startup, and have the main thread watch one of the two file descriptors for read-ready status. Then when the signal handler runs, it can write() a byte to the other file descriptor (write() works on both pipes and sockets, whereas send() only works on sockets), which will cause the first file descriptor to indicate ready-for-read status. The main thread can respond to that ready-for-read status by breaking out of its event loop and exiting gracefully.
In addition to avoiding async-signal-unsafe calls, the benefit of doing it this way is that you have only one shutdown/cleanup-path to test/debug/maintain, instead of two.
(*) Of course on any modern OS the memory will get freed anyway, by the OS's process-cleanup routines; but valgrind will complain about memory leaks, so it's better to free the memory manually if possible, if only so that you can use valgrind to find "real" memory leaks without having to sort through a bunch of false-positives every time.
I need help finding the problem in this code: it is the main function of my program, which simulates a multiprocessor system. I use the pthread library to build the RAM entity and all the CPUs. It compiles without problems and most executions work well, but sometimes I launch the executable and, after one or two prints, get a segmentation fault.
So I tried to find it using gdb (without any result) and valgrind. The only thing Valgrind told me is that one block is possibly lost (the message is: 272 bytes in 1 blocks are possibly lost in loss record 1 of 1).
P.S. I have an #include for each library function I use.
int main(int argc, char *argv[])
{
if(argc!=3)
syserr("Utilizzo: simulazione <numCpu> <ramDim>\n");
pthread_t ram;
ram_dim=atoi(argv[2]);
int num_cpu=atoi(argv[1]);
pthread_t cpu[num_cpu];
command *cpu_info=(command *) malloc(sizeof(command)*num_cpu);
request *buffer=(request *) malloc(sizeof(request));
int curs, status;
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
pthread_mutex_init(&ram_lock, NULL);
pthread_mutex_lock(&ram_lock);
if((status=pthread_create(&ram, &attr, ram_job, (void *) buffer))!=0)
syserr("Creazione thread Ram fallita.\n");
pthread_mutex_init(&cpu_lock, NULL);
pthread_mutex_init(&rw_lock, NULL);
pthread_mutex_lock(&rw_lock);
for(curs=0;curs<num_cpu;curs++)
{
cpu_info[curs].istr=buffer;
cpu_info[curs].num_cpu=curs+1;
if((status=pthread_create(&cpu[curs], &attr, cpu_job, (void *) &cpu_info[curs]))!=0)
syserr("Creazione thread Cpu fallita.\n");
}
pthread_attr_destroy(&attr);
for(curs=0;curs<num_cpu;curs++)
pthread_join(cpu[curs], (void **) 0);
free(buffer);
free(cpu_info);
pthread_mutex_destroy(&rw_lock);
pthread_mutex_destroy(&cpu_lock);
pthread_mutex_destroy(&ram_lock);
return 0;
}
Because if there's an error, syserr presumably never returns: it will do something close to "printf(...); exit(...);".
The rest of the code, which is supposed to free all the variables, then never runs.
Use "strerror(errno)" instead ;)
The join for the ram thread is missing. Add:
pthread_join(ram, 0);
somewhere at the end, before you destroy the resources used by that thread.
But that probably won't fix the crash, because you first need to stop that thread gracefully somehow.
I am trying to implement a user level thread library and need to schedule threads in a round robin fashion. I am currently trying to make switching work for 2 threads that I have created using makecontext, getcontext and swapcontext. setitimer with ITIMER_PROF value is used and sigaction is assigned a handler to schedule a new thread whenever the SIGPROF signal is generated.
However, the signal handler is not invoked and the threads therefore never get scheduled. What could be the reason? Here are some snippets of the code:
void userthread_init(long period){
/*long time_period = period;
//Includes all the code like initializing the timer and attaching the signal
// handler function "schedule()" to the signal SIGPROF.
// create a linked list of threads - each thread's context gets added to the list/updated in the list
// in userthread_create*/
struct itimerval it;
struct sigaction act;
act.sa_flags = SA_SIGINFO;
act.sa_sigaction = &schedule;
sigemptyset(&act.sa_mask);
sigaction(SIGPROF,&act,NULL);
time_period = period;
it.it_interval.tv_sec = 4;
it.it_interval.tv_usec = period;
it.it_value.tv_sec = 1;
it.it_value.tv_usec = 100000;
setitimer(ITIMER_PROF, &it,NULL);
//for(;;);
}
The above code initializes a timer and attaches the handler schedule() to the SIGPROF signal. I am assuming that SIGPROF will be delivered after the above function has run, which will invoke schedule(). The schedule() function is given below:
void schedule(int sig, siginfo_t *siginf, ucontext_t* context1){
printf("\nIn schedule");
ucontext_t *ucp = NULL;
ucp = malloc(sizeof(ucontext_t));
getcontext(ucp);
//ucp = &sched->context;
sched->context = *context1;
if(sched->next != NULL){
sched = sched->next;
}
else{
sched = first;
}
setcontext(&sched->context);
}
I have a queue of ready threads in which their respective contexts are stored. Each thread should get scheduled whenever the setcontext call is executed. However, schedule() is never invoked! Can anyone please point out my mistake?
Completely revising this answer after looking at the code. There are a few issues:
There are several compiler warnings
You are never initializing your thread IDs, either outside or inside your thread creation method, so I'm surprised the code works at all!
You are reading from uninitialized memory in your gtthread_create() function. I tested on both OSX and Linux: on OSX it crashes; on Linux, by some miracle, the memory happens to be initialized.
In some places you call malloc(), and overwrite it with a pointer to something else - leaking memory
Your threads don't remove themselves from the linked list after they've finished, so weird things are happening after the routines finish.
When I add in the while(1) loop, I do see schedule() being called and output from thread 2, but thread 1 vanishes into thin air (probably because of the uninitialized thread ID). I think you need a thorough code cleanup.
Here's what I'd suggest:
Fix ALL of your compiler warnings — even if you think they don't matter, the noise may lead to you missing things (such as incompatible pointer types, etc). You're compiling with -Wall & -pedantic; that's a good thing - so now take the next step & fix them.
Put \n at the END of your printf statements, not the start — The two threads ARE outputting to stdout, but it's not getting flushed so you can't see it. Change your printf("\nMessage"); calls to printf("Message\n");
Use Valgrind to detect memory issues — valgrind is the single most amazing tool you will ever use for C/C++ development. It's available through apt-get & yum. Instead of running ./test1, run valgrind ./test1 and it will highlight memory corruption, memory leaks, uninitialized reads, etc. I can't stress this enough; Valgrind is amazing.
If a system call returns a value, check it — in your code, check the return values to all of getcontext, swapcontext, sigaction, setitimer
Only call async-signal-safe methods from your scheduler (or any signal handler) — so far you've fixed malloc() and printf() from inside your scheduler. Check out the signal(7) man page - see "Async-signal-safe functions"
Modularize your code — your linked list implementation could be tidier, and if it was separated out, then 1) your scheduler would have less code & be simpler, and 2) you can isolate issues in your linked list without having to debug scheduler code.
You're almost there, so keep at it - but keep in mind these three simple rules:
Clean as you go
Keep the compiler warnings fixed
When weird things are happening, use valgrind
Good luck!
Old answer:
You should check the return value of any system call. Whether or not it helps you find the answer, you should do it anyway :)
Check the return value of sigaction(), if it's -1, check errno. sigaction() can fail for a few reasons. If your signal handler isn't getting fired, it's possible it hasn't been set up.
Edit: and make sure you check the return of setitimer() too!
Edit 2: Just a thought, can you try getting rid of the malloc()? malloc is not signal safe. eg: like this:
void schedule(int sig, siginfo_t *siginf, ucontext_t* context1){
printf("In schedule\n");
getcontext(&sched->context);
if(sched->next != NULL){
sched = sched->next;
}
else{
sched = first;
}
setcontext(&sched->context);
}
Edit 3: According to this discussion, you can't use printf() inside a signal handler. You can try replacing it with a call to write(), which is async-signal safe:
// printf("In schedule\n");
const char message[] = "In schedule\n";
write( 1, message, sizeof( message ) - 1 ); /* -1: do not write the trailing '\0' */
I wrote a simple thread program:
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <stdint.h>
#define THREADS 5
void* HelloWorld(void *t)
{
printf("Thread ID #%lu: (%lu) Hello World !!\n", (unsigned long)pthread_self(), (unsigned long)t);
return NULL;
}
int main()
{
pthread_t thread[THREADS];
uint32_t i;
int err;
for(i = 0; i < THREADS; ++i)
{
err = pthread_create(&thread[i], NULL, &HelloWorld, (void*)(unsigned long long)i);
if(err != 0)
{
printf("Error %d: Thread %u Creation Unsuccessful !!\n", err, (unsigned)i);
}
printf("Thread %lu in main()\n", (unsigned long)pthread_self());
}
/*
for(i = 0; i < THREADS; ++i)
{
pthread_join(thread[i], NULL); // Error checking implemented
}
*/
return 0;
}
But on using valgrind as:
valgrind --tool=memcheck --leak-check=full --show-reachable=yes ./hello
It shows same output for memory usage/leaks whether pthread_join() is used or not used in the program.
Please explain this behaviour as I read here that:
The pthread_join() or pthread_detach() function should eventually be called for every thread that is created with the detachstate attribute set to PTHREAD_CREATE_JOINABLE so that storage associated with the thread may be reclaimed.
How is the storage reclaimed if I do not call pthread_join()?
There are two questions raised from what I understand. One is why valgrind reports the same memory leaks with or without calls to pthread_join(), and the other is how does calling pthread_join() reclaim storage if it is not actually freeing any memory.
One possible explanation for both issues is that your thread library does not actually free any memory after a call to pthread_join(), but instead places the resources that were allocated into an "available if I end up creating another thread in the future" container. Let's call that container a pool. The next call to pthread_create() can re-use any resources that are in the pool. If the pool is empty, new memory is allocated.
Without calling pthread_join(), any resources associated with the exited thread would not be returned to the pool. Thus, those resources would remain unusable, the pool would remain empty, and so a new pthread_create() would allocate more resources for the thread creation request.
This means a pthread_join() does not necessarily free any memory at all. It can be simply placing the reaped resources into a pool maintained by the thread library. So, with or without calls to pthread_join(), valgrind would show the same amount of "leaked" memory. But, the memory is reclaimed by pthread_join(), since it is placed in a pool for a future call to pthread_create().
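Whatever the library does internally, the calling side looks the same: every joinable thread is either joined or detached. A small sketch of both options follows; square(), run_and_join() and spawn_detached() are hypothetical helpers, not part of the question's program.

```c
#include <pthread.h>
#include <stdint.h>

/* Thread body: returns the square of its argument through the exit value. */
void *square(void *arg)
{
    uintptr_t n = (uintptr_t)arg;
    return (void *)(n * n);
}

/* Creates `count` joinable threads (at most 16) and joins every one that
   was started. Returns the sum of their results, or -1 on any failure. */
long run_and_join(unsigned count)
{
    pthread_t tids[16];
    unsigned created = 0, i;
    long sum = 0;
    int failed = 0;

    if (count > 16)
        return -1;
    for (i = 0; i < count; i++) {
        if (pthread_create(&tids[i], NULL, square, (void *)(uintptr_t)i) != 0) {
            failed = 1;
            break;
        }
        created++;
    }
    for (i = 0; i < created; i++) {
        void *result;
        /* The join lets the library reclaim (or recycle) the thread's resources. */
        if (pthread_join(tids[i], &result) != 0)
            failed = 1;
        else
            sum += (long)(uintptr_t)result;
    }
    return failed ? -1 : sum;
}

/* Alternative: a detached thread releases its resources on its own when it
   finishes, but it can no longer be joined and its return value is lost. */
int spawn_detached(void)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, square, (void *)(uintptr_t)3) != 0)
        return -1;
    return pthread_detach(tid) == 0 ? 0 : -1;
}
```

Separately from resource reclamation, joining in main() also matters for correctness in the question's program: returning from main() ends the whole process, so threads that have not run yet may never print anything.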
Is it possible to restore the normal execution flow of a C program, after the Segmentation Fault error?
struct A {
int x;
};
struct A *a = 0;
a->x = 123; // this is where segmentation violation occurs
// after handling the error I want to get back here:
printf("normal execution");
// the rest of my source code....
I want a mechanism similar to NullPointerException that is present in Java, C# etc.
Note: Please don't tell me that there is an exception handling mechanism in C++, because I know that, and don't tell me I should check every pointer before assignment, etc.
What I really want to achieve is to get back to the normal execution flow, as in the example above. I know some actions can be undertaken using POSIX signals. What should it look like? Any other ideas?
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <signal.h>
#include <stdlib.h>
#include <ucontext.h>
void safe_func(void)
{
puts("Safe now ?");
exit(0); //can't return to main, it's where the segfault occurred.
}
void
handler (int cause, siginfo_t * info, void *uap)
{
/* For testing only. Never ever call stdio functions in a signal handler otherwise. */
printf ("SIGSEGV raised at address %p\n", info->si_addr);
ucontext_t *context = uap;
/*On my particular system, compiled with gcc -O2, the offending instruction
generated for "*f = 16;" is 6 bytes. Let's try to set the instruction
pointer to the next instruction (general register 14 is EIP, on Linux x86) */
context->uc_mcontext.gregs[14] += 6;
//alternatively, try to jump to a "safe place"
//context->uc_mcontext.gregs[14] = (unsigned int)safe_func;
}
int
main (int argc, char *argv[])
{
struct sigaction sa;
sa.sa_sigaction = handler;
int *f = NULL;
sigemptyset (&sa.sa_mask);
sa.sa_flags = SA_SIGINFO;
if (sigaction (SIGSEGV, &sa, 0)) {
perror ("sigaction");
exit(1);
}
//cause a segfault
*f = 16;
puts("Still Alive");
return 0;
}
$ ./a.out
SIGSEGV raised at address (nil)
Still Alive
I would beat someone with a bat if I saw something like this in production code, though; it's an ugly, for-fun hack. You'll have no idea whether the segfault has corrupted some of your data, you'll have no sane way of recovering and knowing that everything is OK now, and there's no portable way of doing this. The only mildly sane thing you could do is try to log an error (use write() directly, not any of the stdio functions, which are not signal safe) and perhaps restart the program. For those cases you're much better off writing a supervisor process that monitors a child process's exit, logs it and starts a new child process.
You can catch segmentation faults using a signal handler, and decide to continue the execution of the program (at your own risk).
The signal name is SIGSEGV.
You will have to use the sigaction() function, from the signal.h header.
Basically, it works the following way:
struct sigaction sa1;
struct sigaction sa2;
sa1.sa_handler = your_handler_func;
sa1.sa_flags = 0;
sigemptyset( &sa1.sa_mask );
sigaction( SIGSEGV, &sa1, &sa2 );
Here's the prototype of the handler function:
void your_handler_func( int id );
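As you can see, the handler does not need to return a value. Be aware, though, that if the handler simply returns after a genuine SIGSEGV, execution resumes at the faulting instruction, which will normally fault again; to actually continue, the handler has to transfer control elsewhere, or you can stop the program yourself from the handler.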
"All things are permissible, but not all are beneficial" - typically a segfault is game over for a good reason... A better idea than picking up where it was would be to keep your data persisted (database, or at least a file system) and enable it to pick up where it left off that way. This will give you much better data reliability all around.
See R.'s comment to MacMade answer.
Expanding on what he said (after handling SIGSEGV or, for that matter, SIGFPE, the CPU and OS can return you to the offending instruction), here is a test I have for division-by-zero handling:
#include <stdio.h>
#include <limits.h>
#include <string.h>
#include <signal.h>
#include <setjmp.h>
static jmp_buf context;
static void sig_handler(int signo)
{
/* XXX: don't do this, not reentrant */
printf("Got SIGFPE\n");
/* avoid infinite loop */
longjmp(context, 1);
}
int main()
{
int a;
struct sigaction sa;
memset(&sa, 0, sizeof(struct sigaction));
sa.sa_handler = sig_handler;
sa.sa_flags = SA_RESTART;
sigaction(SIGFPE, &sa, NULL);
if (setjmp(context)) {
/* If this one was on setjmp's block,
* it would need to be volatile, to
* make sure the compiler reloads it.
*/
sigset_t ss;
/* Make sure to unblock SIGFPE, according to POSIX it
* gets blocked when calling its signal handler.
* sigsetjmp()/siglongjmp would make this unnecessary.
*/
sigemptyset(&ss);
sigaddset(&ss, SIGFPE);
sigprocmask(SIG_UNBLOCK, &ss, NULL);
goto skip;
}
a = 10 / 0;
skip:
printf("Exiting\n");
return 0;
}
No, it's not possible, in any logical sense, to restore normal execution following a segmentation fault. Your program just tried to dereference a null pointer. How are you going to carry on as normal if something your program expects to be there isn't? It's a programming bug, the only safe thing to do is to exit.
Consider some of the possible causes of a segmentation fault:
you forgot to assign a legitimate value to a pointer
a pointer has been overwritten possibly because you are accessing heap memory you have freed
a bug has corrupted the heap
a bug has corrupted the stack
a malicious third party is attempting a buffer overflow exploit
malloc returned null because you have run out of memory
Only in the first case is there any kind of reasonable expectation that you might be able to carry on.
If you have a pointer that you want to dereference but it might legitimately be null, you must test it before attempting the dereference. I know you don't want me to tell you that, but it's the right answer, so tough.
Edit: here's an example to show why you definitely do not want to carry on with the next instruction after dereferencing a null pointer:
void foobarMyProcess(struct SomeStruct* structPtr)
{
char* aBuffer = structPtr->aBigBufferWithLotsOfSpace; // if structPtr is NULL, will SIGSEGV
//
// if you SIGSEGV and come back to here, at this point aBuffer contains whatever garbage was in memory at the point
// where the stack frame was created
//
strcpy(aBuffer, "Some longish string"); // You've just written the string to some random location in your address space
// good luck with that!
}
Register this handler, and when a segfault occurs, your code will execute segv_handler and then go back to where it was.
void segv_handler(int signo)
{
// Do what you want here
}
signal(SIGSEGV, segv_handler);
There is no meaningful way to recover from a SIGSEGV unless you know EXACTLY what caused it, and there's no way to do that in standard C. It may be possible (conceivably) in an instrumented environment, like a C-VM (?). The same is true for all program error signals; if you try to block/ignore them, or establish handlers that return normally, your program will probably break horribly when they happen unless perhaps they're generated by raise or kill.
Just do yourself a favour and take error cases into account.
In POSIX, your process will get sent SIGSEGV when you do that. The default handler just crashes your program. You can add your own handler using the signal() call. You can implement whatever behaviour you like by handling the signal yourself.
You can use the SetUnhandledExceptionFilter() function (on Windows), but even to be able to skip the "illegal" instruction you will need to be able to decode some assembler opcodes. And, as glowcoder said, even if it "commented out" at runtime the instructions that generate segfaults, what would be left of the original program logic (if it may be called that)?
Everything is possible, but it doesn't mean that it has to be done.
Unfortunately, you can't in this case. The buggy function has undefined behavior and could have corrupted your program's state.
What you CAN do is run the functions in a new process. If this process dies with a return code that indicates SIGSEGV, you know it has failed.
You could also rewrite the functions yourself.
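Running the suspect function in a separate process, as suggested above, could be sketched like this. run_isolated(), crashes() and works() are illustrative names, and raise(SIGSEGV) simulates the crash; the parent learns about the failure from waitpid() instead of trying to recover inside the damaged process.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs fn() in a child process.
   Returns 0 if the child exited normally with status 0, the signal number
   if a signal killed it (e.g. SIGSEGV), and -1 for any other outcome. */
int run_isolated(void (*fn)(void))
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);   /* make sure a crash really kills the child */
        fn();
        _exit(0);                   /* skip the parent's atexit handlers */
    }
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}

/* Stand-in for a buggy function: terminates the child with SIGSEGV. */
void crashes(void)
{
    raise(SIGSEGV);
}

void works(void)
{
}
```

The price of this isolation is that results have to be passed back explicitly (exit status, pipe, shared memory), since the child's memory is gone once it dies.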
I can see a case for recovering from a segmentation violation: if you are handling events in a loop and one of these events causes a segmentation violation, you may want to skip only that event and continue processing the remaining ones. In my eyes, segmentation violations are much the same as NullPointerExceptions in Java. Yes, the state will be inconsistent and unknown after either of these; however, in some cases you would like to handle the situation and carry on. For instance, in algorithmic trading you would pause the execution of an order and allow a trader to take over manually, without crashing the entire system and ruining all other orders.
The best solution is to wrap each unsafe access this way:
#include <iostream>
#include <signal.h>
#include <setjmp.h>
static jmp_buf buf;
int counter = 0;
void signal_handler(int)
{
longjmp(buf,0);
}
int main()
{
signal(SIGSEGV,signal_handler);
setjmp(buf);
if(counter++ == 0){ // if we didn't try before
*(int*)(0x1215) = 10; // write to an address that is most likely not mapped in this process
}
std::cout<<"i am alive !!"<<std::endl; // we will get into here in any case
system("pause");
return 0;
}
With this, your program will not crash on almost any OS.
This glibc manual gives you a clear picture of how to write signal handlers.
A signal handler is just a function that you compile together with the rest
of the program. Instead of directly invoking the function, you use signal
or sigaction to tell the operating system to call it when a signal arrives.
This is known as establishing the handler.
In your case you will have to wait for the SIGSEGV indicating a segmentation fault. The list of other signals can be found here.
Signal handlers are broadly classified into two categories:
You can have the handler function note that the signal arrived by tweaking some
global data structures, and then return normally.
You can have the handler function terminate the program or transfer
control to a point where it can recover from the situation that caused the signal.
SIGSEGV falls under the program error signals.