C Runtime Library issue MD/MDd - c

We have a C library that we want to distribute, alongside C example code.
The library is of course built in release mode.
The example code project is in cmake so that it can be easily run on both Linux and Windows.
On Linux (debug and release) as well as on Windows (release), we have no issue.
However, on Windows (debug), we have an issue when leaving main: the program triggers an assertion:
Invalid address specified to RtlValidateHeap
Expression: _CrtIsValidHeapPointer(block)
Then when continuing the process, it raises the following exception:
Unhandled exception at [...] (ntdll.dll)
0xC000000D STATUS_INVALID_PARAMETER
As this seemed related to the runtime library, we tried changing it from MDd (multithreaded DLL debug) to MD (multithreaded DLL) [ more on these here ] and it solved the issue.
However, this seems like a workaround rather than a fix: the release library (built with MD) should be usable in a debug program using MDd, right?
As we understand it, runtime library conflicts only appear when memory is allocated in the caller and deallocated in the callee, or vice versa.
So we tracked all allocations to check them and everything seems ok.
We ran leak detections on the example code in both Linux (Valgrind) and Windows (CrtDbg) but they didn't find any leak, everything seems fine.
Is it right to expect a release library built with MD to run in an MDd program?
If not, it seems strange: libraries are always distributed in release, yet used in debug solutions while developing...
If yes, what can cause the issue?

It sounds more like heap corruption than a leak. It implies something is overwriting the heap (writing past the end of its allocated memory). Finding it can be a pain in the butt.
First, check your example code. Strip it down to a bare-minimum "hello world" and then build it back up until the problem happens again. If the example code is not at fault, check which library functions were called and code-review those.
As an aid, you can use the MS heap check functions. Place them at function entry and function exit, or maintain global versions that you check regularly. The following is an example:
#include <crtdbg.h>
void example(void)
{
    _CrtMemState memStateStart, memStateEnd, memStateDelta;
    // Make a checkpoint of the heap's state so we can later check the heap is still OK
    _CrtMemCheckpoint( &memStateStart );
    //
    // do your things
    //
    // Check the heap
    _CrtMemCheckpoint( &memStateEnd );
    _CrtSetReportMode( _CRT_WARN, _CRTDBG_MODE_WNDW );
    _CrtSetReportMode( _CRT_ERROR, _CRTDBG_MODE_WNDW );
    _CrtSetReportMode( _CRT_ASSERT, _CRTDBG_MODE_WNDW );
    if (_CrtMemDifference( &memStateDelta, &memStateStart, &memStateEnd ))
        _CrtMemDumpStatistics( &memStateDelta );
    _CrtDumpMemoryLeaks();
}
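Separately from the corruption hunt, the ownership rule described in the question (memory must be freed by the same runtime that allocated it) is usually enforced by having the library export both a create and a destroy function. The following is only a minimal sketch of that pattern; the widget type and the widget_* function names are hypothetical and are not taken from the library in the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical library type; callers only ever hold a pointer to it. */
typedef struct widget {
    char *name;
} widget;

/* Allocates with the library's own runtime. Returns NULL on failure. */
widget *widget_create(const char *name)
{
    assert(name != NULL);
    widget *w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    size_t len = strlen(name) + 1;
    w->name = malloc(len);
    if (w->name == NULL) {
        free(w);
        return NULL;
    }
    memcpy(w->name, name, len);
    return w;
}

/* Read-only accessor, so the caller never needs to copy or free the string. */
const char *widget_name(const widget *w)
{
    return w->name;
}

/* Frees with the same runtime that allocated; the caller never calls free() itself. */
void widget_destroy(widget *w)
{
    if (w == NULL)
        return;
    free(w->name);
    free(w);
}
```

With this shape, an MDd program only ever passes the pointer back to widget_destroy, so the allocation never crosses into the other runtime's heap.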

Related

Why might malloc'd memory from a shared library be inaccessible to the application?

I maintain a library written in C, which is being accessed by a user on Linux, directly from Python using a module which loads the shared library and calls functions. The module is very commonly used, as is this version of the shared library, by people doing a popular tutorial.
The user is getting a segmentation fault. Running his Python script under gdb, he sees that it is in the shared library, within a function that mallocs memory for a struct and returns the pointer. He is getting a pointer back, but when he attempts to use it in subsequent calls to the shared library, the segmentation fault occurs as the memory is inaccessible.
If he runs the Python script as root, the problem does not occur. Nor does it occur in an alternate Linux installation.
So to recap:
His Python code loads the shared library.
It then calls a function which returns a pointer to memory allocated within the shared library.
Then he calls another function in the shared library, passing in the pointer it returned to him, and the shared library chokes on its own pointer.
It only occurs when he runs it as a normal user on "4.0.7-2-ARCH x86_64 GNU/Linux". It does not occur on that OS, when he switches to root and runs it.
It does not occur when he attempts to reproduce the problem on a Ubuntu machine.
What gives? Is this some ARCH bug? Or are there programming nuances to this which can be cleared up?
You can read the minutiae here which includes enough detail to reproduce the problem, if the problem is not self-evident to users with more Linux programming experience than I.
Quick links to the shared library functions:
Source code for TCOD_map_new.
Source code for TCOD_map_set_properties.
Excerpt of his Python code for posterity and ease of access:
#!/usr/bin/env python2
import curses
import libtcodpy as libtcod
def main(stdscr):
    curses.start_color()
    curses.use_default_colors()
    map = libtcod.map_new(10, 10) # any numbers work
    libtcod.map_set_properties(map, 0, 0, True, True) # any in bounds integer coordinates fail
    stdscr.getch()
curses.wrapper(main)
I ran into the same problem. My solution was to allocate the string (with malloc()) in the caller function, then pass it by reference to the callee function, which fills in the content.
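The caller-allocates approach from this answer could look roughly like the sketch below. The function names fill_greeting and greeting_length are hypothetical, chosen only for illustration; the point is that the callee writes into memory it does not own and never returns a freshly allocated pointer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callee: writes into a buffer owned by the caller.
   Returns 0 on success, -1 if the arguments are invalid or the buffer is too small. */
int fill_greeting(char *buf, size_t buf_size, const char *who)
{
    if (buf == NULL || who == NULL || buf_size == 0)
        return -1;
    int n = snprintf(buf, buf_size, "hello, %s", who);
    if (n < 0 || (size_t)n >= buf_size)
        return -1;  /* output would have been truncated */
    return 0;
}

/* Hypothetical caller: allocates the buffer, passes it in, and frees it itself.
   Returns the length of the greeting, or -1 on failure. */
int greeting_length(const char *who)
{
    assert(who != NULL);
    size_t size = 64;
    char *buf = malloc(size);
    if (buf == NULL)
        return -1;
    int len = -1;
    if (fill_greeting(buf, size, who) == 0)
        len = (int)strlen(buf);
    free(buf);
    return len;
}
```

Because the buffer size travels with the pointer, the callee can refuse to write past the end instead of corrupting the caller's memory.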

MALLOCDEBUG showing random output when using xlc_r

I have a program, compiled using xlc_r, that spawns off multiple threads, and I am trying to trace it to see if there are any memory leaks. I've gone through this article detailing how I can use the MALLOCDEBUG feature that's built into AIX, but after running format_mallocdebug_op.sh, it shows memory leaks all over the place for random pthread and file methods such as pthread_attr_init, _pth_init, fopen, fwrite, etc.
I then made a smaller test program that purposefully doesn't free a char * and compiled it with xlc_r and almost the exact same output appeared. I then compiled the test program again but with xlc and it worked correctly, showing the one char * memory leak and that was it. It seems that the MALLOCDEBUG feature doesn't work well with multi-threaded compiled applications. Is there a setting to tell it to be aware of this?

Why does the same function behave differently after linking other object files?

I am tracking down a bug in an R extension which only occurs on Debian systems.
The SSL_CTX_new function produces a "stack smashing detected" error at runtime, which might indicate a segfault.
To understand the bug, I wrote a standalone test function:
#include <Rcpp.h>
#include <openssl/ssl.h>
RcppExport SEXP test() {
    BEGIN_RCPP
    SSL_library_init();
    SSL_CTX_new(SSLv23_client_method());
    END_RCPP
}
This function runs normally on its own.
However, after linking my existing project with the test function, it produces a "stack smashing detected" error.
Why does the same function behave differently after linking other object files? Could anyone give me some hints? Thanks!
Here is my project: https://github.com/wush978/RMessenger. It crashes on Debian so far.
R handles its own memory management. The Valgrind memory profiler / debugger has been used successfully before, and there are some posts on the web.
If I understand your posts correctly, then the SSL routine may be doing something that upsets R. You will have to debug that. What you have posted here does not constitute a reproducible bug report.
You may also find the feedback you could get on the rcpp-devel list helpful.

dlmalloc crash on Win7

For some time now I've been happily using dlmalloc for a cross-platform project (Windows, Mac OS X, Ubuntu). Recently, however, it seems that using dlmalloc leads to a crash-on-exit on Windows 7.
To make sure that it wasn't something goofy in my project, I created a super-minimal test program-- it doesn't do anything but return from main. One version ("malloctest") links to dlmalloc and the other ("regulartest") doesn't. On WinXP, both run fine. On Windows 7, malloctest crashes. You can see screencasts of the tests here.
My question is: why is this happening? Is it a bug in dlmalloc? Or has the loader in Windows 7 changed? Is there a workaround?
fyi, here is the test code (test.cpp):
#include <stdio.h>
int main() {
    return 0;
}
and here is the nmake makefile:
all: regulartest.exe malloctest.exe
malloctest.exe: malloc.obj test.obj
    link /out:$@ $**
regulartest.exe: test.obj
    link /out:$@ $**
clean:
    del *.exe *.obj
For brevity, I won't include the dlmalloc source in this post, but you can get it (v2.8.4) here.
Edit: See these other relevant SO posts:
Is there a way to redefine malloc at link time on Windows?
Globally override malloc in visual c++
Looks like a bug in the C runtime. Using Visual Studio 2008 on Windows 7, I reproduced the same problem. After some quick debugging by putting breakpoints in dlmalloc and dlfree, I saw that dlfree was getting called with an address that it never returned earlier from dlmalloc, and then it was hitting an access violation shortly thereafter.
Thankfully, the C runtime's source code is distributed along with VS, so I could see that this call to free was coming from the __endstdio function in _file.c. The corresponding allocation was in __initstdio, and it was calling _calloc_crt to allocate its memory. _calloc_crt calls _calloc_impl, which calls HeapAlloc to get memory. _malloc_crt (used elsewhere in the C runtime, such as to allocate memory for the environment and for argv), on the other hand, calls straight to malloc, and _free_crt calls straight to free.
So, for the memory that gets allocated with _malloc_crt and freed with _free_crt, everything is fine and dandy. But for the memory that gets allocated with _calloc_crt and freed with _free_crt, bad things happen.
I don't know if replacing malloc like this is supported -- if it is, then this is a bug with the CRT. If not, I'd suggest looking into a different C runtime (e.g. MinGW or Cygwin GCC).
Using dlmalloc in cross-platform code is an oxymoron. Replacing any standard C functions (especially malloc and family) results in undefined behavior. The closest thing to a portable way to replace malloc is using search-and-replace (not #define; that's also UB) on the source files to call (for example) my_malloc instead of malloc. Note that internal C library functions will still use their standard malloc, so if the two conflict, things will still blow up. Basically, trying to replace malloc is just really misguided. If your system really has a broken malloc implementation (too slow, too much fragmentation, etc.) then you need to do your replacement in an implementation-specific way, and disable the replacement on all systems except ones where you've carefully checked that your implementation-specific replacement works correctly.
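The "call my_malloc instead of malloc" approach could be sketched as follows. This is an illustration under assumptions: the MYPROJ_USE_DLMALLOC macro and the live-block counter are hypothetical, and the dlmalloc/dlfree names assume dlmalloc was built with USE_DL_PREFIX. Project code calls the wrappers, while the C runtime keeps using its own malloc internally, so the two allocators never free each other's blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#if defined(MYPROJ_USE_DLMALLOC)
/* Assumption: dlmalloc compiled with USE_DL_PREFIX, enabled only on
   platforms where the replacement has been verified to work. */
void *dlmalloc(size_t size);
void dlfree(void *p);
#define BACKEND_MALLOC dlmalloc
#define BACKEND_FREE   dlfree
#else
#define BACKEND_MALLOC malloc
#define BACKEND_FREE   free
#endif

/* Number of blocks handed out by my_malloc and not yet returned. */
static size_t live_blocks = 0;

/* Project-wide allocation entry point; never overrides the standard malloc. */
void *my_malloc(size_t size)
{
    void *p = BACKEND_MALLOC(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

/* Must only receive pointers obtained from my_malloc. */
void my_free(void *p)
{
    if (p == NULL)
        return;
    assert(live_blocks > 0);
    live_blocks--;
    BACKEND_FREE(p);
}

size_t my_live_blocks(void)
{
    return live_blocks;
}
```

The counter is only there to make mismatched calls easier to spot during development; it is not thread-safe as written.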

Is it possible to do hot code swapping in C?

This page
en.wikipedia.org/wiki/Hot_swapping#cite_note-1
says that VS can do it with the help of its debugger. Does gdb provide similar functionality?
This is the closest I could find, but it doesn't seem to be ready for use:
http://www.aitdspace.gr/xmlui/handle/123456789/219
dlopen/dlsym/dlclose are also close, but will not work for libraries referenced with -lmylib (the reference count never gets to 0).
Alternatives I've considered:
1) using -Wl,-wrap,foo and, in __wrap_foo() { func = dlopen(); func(); }
2) making libfoo.so a shared library and, when we need to hotswap, calling dlopen(RTLD_GLOBAL) to load the new code and provide updated symbols to the next call to foo();
1) doesn't work very well because it requires me to enumerate all the functions I want to hotswap, which is all of them.
2) doesn't work very well because when foo() is called, the new code is loaded, but foo forever keeps the reference to that symbol. Calling dlopen multiple times makes foo get re-evaluated.
You may be interested in Ksplice. It's a technology that came out of MIT that allows software patches to be applied to the Linux kernel without rebooting. This is most relevant for applying security updates:
http://www.ksplice.com/paper
You could certainly hack yourself a system where you store a list of function pointers and can change these pointers to point to whatever library you have dlopen()'d at the time.
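A minimal sketch of the function-pointer idea, kept in-process so it stays self-contained: all names here (op_fn, op_v1, op_v2, call_op, swap_op) are hypothetical, and in a real setup the replacement pointer would presumably come from dlsym() on a freshly dlopen()'d library rather than from a function compiled into the same program.

```c
#include <assert.h>
#include <stddef.h>

/* Signature shared by every version of the swappable routine. */
typedef int (*op_fn)(int);

/* Two stand-in implementations; imagine op_v2 living in a newly loaded library. */
int op_v1(int x) { return x + 1; }
int op_v2(int x) { return x * 2; }

/* The single indirection point that all callers go through. */
static op_fn current_op = op_v1;

/* Callers use this instead of calling an implementation directly. */
int call_op(int x)
{
    return current_op(x);
}

/* Installs a new implementation and returns the previous one,
   so the caller can roll back or release the old library. */
op_fn swap_op(op_fn replacement)
{
    assert(replacement != NULL);
    op_fn old = current_op;
    current_op = replacement;
    return old;
}
```

This sketch is not thread-safe: a multithreaded program would need an atomic pointer or a lock around the swap, and would still have to wait until no thread is executing the old code before unloading it.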
You're right, there isn't any easy way to intercept calls to routines with fixed linkage. You can always clobber the start of the routine with an assembly jump to another routine, but that can be dangerous (and isn't C).
Maybe a symbol which is weak in your code and strong in a dlopen()'d library would work?
In any of these cases, you have to deal with the situation where the old code is currently running. That isn't easy either, unless you have points in your program where you know no thread is in the library you want to swap.
The closest I have found is Solaris dbx, which comes with Oracle Developer Studio. However, although Developer Studio uses dbx on both Linux and Solaris, only the Solaris version supports "edit-and-continue" or "hot code swap".
