Should we check if memory allocations fail? - c

I've seen a lot of code that checks for NULL pointers whenever an allocation is made. This makes the code verbose, and if it's not done consistently, only when the programmer felt like it, it doesn't even ensure that the program won't crash when the address space runs out. Besides, if the program can't make more allocations, it wouldn't be able to do its job anyway, right?
So my question is, isn't it better for most programs not to check at all and just let the program crash if memory runs out? At least the code is more readable that way.
Note
I'm talking about desktop apps that run on modern computers (at least 2 GB address space), and that most definitely don't operate space shuttles, life support systems, or BP's oil platforms. Most importantly I'm talking about programs that use malloc but never really go above 5 MB of memory usage.

Always check the return value, but for clarity, it's common to wrap malloc() in a function which never returns NULL:
#include <stdio.h>
#include <stdlib.h>

void *emalloc(size_t amt)
{
    void *v = malloc(amt);
    if (!v) {
        fprintf(stderr, "out of mem\n");
        exit(EXIT_FAILURE);
    }
    return v;
}
Then, later you can use
char *foo = emalloc(56);
foo[12] = 'A';
With no guilty conscience.

Yes, you should check for a null return value from malloc. Even if you can't recover from the failure of memory allocation you should explicitly exit. Carrying on as though memory allocation had succeeded leaves your application in an inconsistent state and is likely to cause "undefined behavior" which should be avoided.
For example, you may end up writing inconsistent data to external storage which may hinder the ability of the next run of the application to recover. It's much safer to exit swiftly in a more controlled fashion.
Many applications that want to exit on allocation failure wrap malloc in a function that checks the return value and explicitly aborts on failure.
Arguably, this is one advantage of C++'s default new, which throws an exception on allocation failure: it requires no effort to exit on memory allocation failure.

Similar to Dave's approach above, but adds a macro that automatically passes
the file name and line number to our allocation routine so that we can report
that information in the event of a failure.
#include <stdio.h>
#include <stdlib.h>

#define ZMALLOC(theSize) zmalloc(__FILE__, __LINE__, theSize)

static void *zmalloc(const char *file, int line, int size)
{
    void *ptr = malloc(size);
    if (!ptr)
    {
        printf("Could not allocate: %d bytes (%s:%d)\n", size, file, line);
        exit(1);
    }
    return ptr;
}

int main()
{
    /* -- Set 'forceFailure' to a non-zero value in order to observe
          how 'zmalloc' behaves when it cannot allocate the
          requested memory -- */
    int bytes = 10 * sizeof(int);
    int forceFailure = 0;
    int *anArray = NULL;

    if (forceFailure)
        bytes = -1;

    anArray = ZMALLOC(bytes);
    free(anArray);
    return 0;
}

But it is much more difficult to troubleshoot if you don't log where the malloc failed: "failed to allocate memory at line XX" is preferable to just crashing.

You should definitely check the return value of malloc. It helps in debugging, and the code becomes more robust.
Always check malloc'ed memory?

In a hosted environment, error checking the return of malloc doesn't make much sense nowadays. Most machines have a 64-bit virtual address space; you'd need a lot of time to exhaust that. Your program will most likely fail at a completely different place, namely when your physical memory plus swap is exhausted. It will have shown ridiculous performance long before that, because it was constantly swapping, and the user will have hit Ctrl-C long before you ever get there.
Segfaulting "nicely" on a null pointer dereference would give a clear point to see where things fail in a debugger. But in my experience I have never seen a failed malloc as the cause.
When programming for embedded systems, the picture changes completely. There you definitely should check for a failed malloc.
Edit: to clarify after the edit of the question: the kind of programs/systems described there are clearly not "embedded". I have never seen malloc fail under the circumstances described.

I'd like to add that edge cases should always be checked, even if you think they are safe or cannot lead to anything other than a crash. A null pointer dereference can potentially be exploited (http://uninformed.org/?v=4&a=5&t=sumry).


Freeing a null pointer returned from realloc?

When I have a pointer which should repeatedly be used as an argument to realloc, saving its return value, I understand that realloc won't touch the old object if no allocation could take place, returning NULL. Should I still be worried about the old object in a construction like:
int *p = (int *)malloc(sizeof(int *));
if (p = (int *)realloc(p, 2 * sizeof(int *)))
etc...
Now if realloc succeeds, I need to free(p) when I'm done with it.
When realloc fails, I have assigned its return value, NULL, to p, and free(p) doesn't do anything (as it is free(NULL)). At the same time (according to the standard) the old object is not deallocated.
So should I have a copy of p (e.g. int *last_p = p;) for keeping track of the old object, so that I can free(last_p) when realloc fails?
So should I have a copy of p (e.g. int *last_p = p;) for keeping track of the old object, so that I can free(last_p) when realloc fails?
Basically: yes, of course.
It is generally not suggested to use the pattern you showed:
p = (int *)realloc(p, ...)
This is considered bad practice for the reason you found out.
Use this instead:
void *new = realloc(p, ...);
if (new != NULL)
    p = new;
else
    ...
It is also considered bad practice to cast the return value of malloc, realloc etc. in C. That is not required.
Should you save a temporary pointer?
It can be a good thing in certain situations, as has been pointed out in previous answers, provided that you have a plan for how to continue execution after the failure.
The most important thing is not how you handle the error. The important thing is that you do something and not just assume there's no error. Exiting is a perfectly valid way of handling an error.
Don't do it, unless you plan a sensible recovery
However, do note that in most situations, a failure from realloc is pretty hard to recover from. Often exiting is the only sensible option. If you cannot acquire enough memory for your task, what are you going to do? I have encountered a situation where recovering was sensible only once. I had an algorithm for a problem, and I realized that I could make a significant improvement to performance if I allocated a few gigabytes of RAM. It worked fine with just a few kilobytes, but it got noticeably faster with the extra RAM usage. So that code was basically like this:
int *huge_buffer = malloc(1000*1000*1000 * sizeof *huge_buffer);
if (!huge_buffer)
    slow_version();
else
    fast_version();
In those cases, just do this:
p = realloc(p, 2 * sizeof *p);
if (!p) {
    fprintf(stderr, "Error allocating memory\n");
    exit(EXIT_FAILURE);
}
Do note both changes to the call. I removed casting and changed the sizeof. Read more about that here: Do I cast the result of malloc?
Or even better, if you don't care in general, write a wrapper:
void *my_realloc(void *p, size_t size) {
    void *tmp = realloc(p, size);
    if (tmp) return tmp;
    fprintf(stderr, "Error reallocating\n");
    free(p);
    exit(EXIT_FAILURE);
    return NULL; // Will never be executed, but avoids warnings
}
Note that this might contradict what I'm writing below, where I'm writing that it's not always necessary to free before exiting. The reason is that since proper error handling is so easy to do when I have abstracted it all out to a single function, I might as well do it right. It only took one extra line in this case. For the whole program.
Related: What REALLY happens when you don't free after malloc?
About backwards compatibility in general
Some would say that it's good practice to free before exiting, just because it actually does matter in certain circumstances. My opinion is that these circumstances are quite specialized, like coding for embedded systems with an OS that does not free memory automatically when programs terminate. If you're coding in such an environment, you should know that. And if you're coding for an OS that does this for you, then why not use it to keep down code complexity?
In general, I think some C coders focus too much on backwards compatibility with ancient computers from the 1970s that you only encounter in museums today. In most cases, it's pretty fair to assume ASCII, two's complement, 8-bit chars and such things.
A comparison would be to still code web pages so that they can be viewed in Netscape Navigator, Mosaic and Lynx. Only spend time on that if there really is a need.
Even if you skip backwards compatibility, use some guards
However, whenever you make assumptions, it can be a good thing to include some meta code that makes the compilation fail on the wrong target. For instance, if your program relies on 8-bit chars:
_Static_assert(CHAR_BIT == 8, "Char bits");   /* CHAR_BIT is in <limits.h> */
That will make your program fail to compile on the wrong target. If you're doing cross compiling, this might possibly be more complicated. I don't know how to do it properly then.
So should I have a copy of p (e.g. int *last_p = p;) for keeping track of the old object, so that I can free(last_p) when realloc fails?
Yes, exactly. Usually I see the new pointer is assigned:
void *pnew = realloc(p, 2 * sizeof(int *));
if (pnew == NULL) {
    free(p);
    /* handle error */
    return ... ;
}
p = pnew;

How to handle memory access violations? [duplicate]

Is there any way to determine (programmatically, of course) whether a given pointer is "valid"? Checking for NULL is easy, but what about things like 0x00001234? When trying to dereference this kind of pointer, an exception/crash occurs.
A cross-platform method is preferred, but platform-specific (for Windows and Linux) is also ok.
Update for clarification:
The problem is not with stale/freed/uninitialized pointers; instead, I'm implementing an API that takes pointers from the caller (like a pointer to a string, a file handle, etc.). The caller can send (on purpose or by mistake) an invalid value as the pointer. How do I prevent a crash?
You can't make that check. There is simply no way to check whether a pointer is "valid". You have to trust that when people use a function that takes a pointer, those people know what they are doing. If they pass you 0x4211 as a pointer value, then you have to trust it points to address 0x4211. And if they "accidentally" hit an object, then even if you used some scary operating-system function (IsValidPtr or whatever), you would still slip into a bug and not fail fast.
Start using null pointers for signaling this kind of thing and tell the user of your library that they should not use pointers if they tend to accidentally pass invalid pointers, seriously :)
Here are three easy ways for a C program under Linux to get introspective about the status of the memory in which it is running, and why the question has appropriate sophisticated answers in some contexts.

1. After calling getpagesize() and rounding the pointer to a page boundary, you can call mincore() to find out if a page is valid and if it happens to be part of the process working set. Note that this requires some kernel resources, so you should benchmark it and determine if calling this function is really appropriate in your API. If your API is going to be handling interrupts, or reading from serial ports into memory, it is appropriate to call this to avoid unpredictable behaviors. (A sketch of this approach follows the answer.)

2. After calling stat() to determine if there is a /proc/self directory available, you can fopen and read through /proc/self/maps to find information about the region in which a pointer resides. Study the man page for proc, the process information pseudo-filesystem. Obviously this is relatively expensive, but you might be able to get away with caching the result of the parse into an array you can efficiently look up using a binary search. Also consider /proc/self/smaps. If your API is for high-performance computing, then the program will want to know about /proc/self/numa, which is documented under the man page for numa, the non-uniform memory architecture.

3. The get_mempolicy(MPOL_F_ADDR) call is appropriate for high-performance computing API work where there are multiple threads of execution and you are managing your work to have affinity for non-uniform memory as it relates to the CPU cores and socket resources. Such an API will of course also tell you if a pointer is valid.
Under Microsoft Windows there is the function QueryWorkingSetEx that is documented under the Process Status API (also in the NUMA API).
As a corollary to sophisticated NUMA API programming this function will also let you do simple "testing pointers for validity (C/C++)" work, as such it is unlikely to be deprecated for at least 15 years.
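A minimal sketch of the mincore() approach from point 1 above, assuming Linux; the helper name is illustrative, and ENOMEM from mincore() is taken to mean "page not mapped", per its man page:

#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>

/* Returns true if the page containing p is mapped into this process.
   An unmapped address makes mincore() fail with ENOMEM. */
bool page_is_mapped(const void *p)
{
    long page_size = sysconf(_SC_PAGESIZE);
    uintptr_t base = (uintptr_t)p & ~((uintptr_t)page_size - 1);
    unsigned char vec;

    if (mincore((void *)base, (size_t)page_size, &vec) == 0)
        return true;              /* mapped; vec says whether it is resident */
    return errno != ENOMEM;       /* ENOMEM: not mapped; other errors: assume mapped */
}

Note this says nothing about what the memory contains, only that the page is mapped.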
Preventing a crash caused by the caller sending in an invalid pointer is a good way to make silent bugs that are hard to find.
Isn't it better for the programmer using your API to get a clear message that his code is bogus by crashing it rather than hiding it?
On Win32/64 there is a way to do this. Attempt to read the pointer and catch the resulting SEH exception that will be thrown on failure. If it doesn't throw, then it's a valid pointer.
The problem with this method though is that it just returns whether or not you can read data from the pointer. It makes no guarantee about type safety or any number of other invariants. In general this method is good for little else other than to say "yes, I can read that particular place in memory at a time that has now passed".
In short, Don't do this ;)
Raymond Chen has a blog post on this subject: http://blogs.msdn.com/oldnewthing/archive/2007/06/25/3507294.aspx
AFAIK there is no way. You should try to avoid this situation by always setting pointers to NULL after freeing memory.
On Unix you should be able to utilize a kernel syscall that does pointer checking and returns EFAULT, such as:
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <stdbool.h>
bool isPointerBad( void * p )
{
    int fh = open( p, 0, 0 );
    int e = errno;

    if ( -1 == fh && e == EFAULT )
    {
        printf( "bad pointer: %p\n", p );
        return true;
    }
    else if ( fh != -1 )
    {
        close( fh );
    }

    printf( "good pointer: %p\n", p );
    return false;
}

int main()
{
    int good = 4;

    isPointerBad( (void *)3 );
    isPointerBad( &good );
    isPointerBad( "/tmp/blah" );

    return 0;
}
returning:
bad pointer: 0x3
good pointer: 0x7fff375fd49c
good pointer: 0x400793
There's probably a better syscall to use than open() [perhaps access()], since open() could go down an actual file-creation code path and require a subsequent close().
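For illustration, here is a variant of the probe above using access(), as suggested; same Linux assumptions, and the function name is made up. access() never creates a file, and EFAULT still identifies a path pointer the kernel cannot read:

#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

bool isPointerBadAccess( const void * p )
{
    errno = 0;
    /* EFAULT means p lies outside our accessible address space */
    return access( (const char *)p, F_OK ) == -1 && errno == EFAULT;
}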
Regarding the answer a bit up in this thread:
IsBadReadPtr(), IsBadWritePtr(), IsBadCodePtr(), IsBadStringPtr() for Windows.
My advice is to stay away from them, someone has already posted this one:
http://blogs.msdn.com/oldnewthing/archive/2007/06/25/3507294.aspx
Another post on the same topic and by the same author (I think) is this one:
http://blogs.msdn.com/oldnewthing/archive/2006/09/27/773741.aspx ("IsBadXxxPtr should really be called CrashProgramRandomly").
If the users of your API send in bad data, let it crash. If the problem is that the data passed isn't used until later (which makes it harder to find the cause), add a debug mode where the strings etc. are logged at entry. If they are bad it will be obvious (and probably crash). If it is happening way too often, it might be worth moving your API out of process and letting them crash the API process instead of the main process.
Firstly, I don't see any point in trying to protect yourself from the caller deliberately trying to cause a crash. They could easily do this by trying to access through an invalid pointer themselves. There are many other ways - they could just overwrite your memory or the stack. If you need to protect against this sort of thing then you need to be running in a separate process using sockets or some other IPC for communication.
We write quite a lot of software that allows partners/customers/users to extend functionality. Inevitably any bug gets reported to us first so it is useful to be able to easily show that the problem is in the plug-in code. Additionally there are security concerns and some users are more trusted than others.
We use a number of different methods depending on performance/throughput requirements and trustworthiness. From most preferred:
separate processes using sockets (often passing data as text).
separate processes using shared memory (if large amounts of data to pass).
same process separate threads via message queue (if frequent short messages).
same process separate threads all passed data allocated from a memory pool.
same process via direct procedure call - all passed data allocated from a memory pool.
We try never to resort to what you are trying to do when dealing with third party software - especially when we are given the plug-ins/library as binary rather than source code.
Use of a memory pool is quite easy in most circumstances and needn't be inefficient. If YOU allocate the data in the first place then it is trivial to check the pointers against the values you allocated. You could also store the length allocated and add "magic" values before and after the data to check for the valid data type and data overruns (see the sketch below).
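A minimal sketch of that "magic values" idea, assuming you control all allocation; every name here is illustrative, not an existing API. Note that the check still dereferences the pointer, so it catches wrong provenance and overruns, not unmapped addresses:

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAGIC_HEAD 0xDEADBEEFu
#define MAGIC_TAIL 0xFEEDFACEu

typedef struct {
    uint32_t magic;   /* MAGIC_HEAD if this block came from pool_alloc */
    size_t   len;     /* user-visible length, locates the tail canary */
} block_header;

void *pool_alloc(size_t len)
{
    block_header *h = malloc(sizeof *h + len + sizeof(uint32_t));
    if (!h) return NULL;
    h->magic = MAGIC_HEAD;
    h->len = len;
    uint32_t tail = MAGIC_TAIL;
    memcpy((char *)(h + 1) + len, &tail, sizeof tail);  /* memcpy: tail may be unaligned */
    return h + 1;
}

/* Returns true if p was produced by pool_alloc and both canaries are intact. */
bool pool_check(const void *p)
{
    if (!p) return false;
    const block_header *h = (const block_header *)p - 1;
    if (h->magic != MAGIC_HEAD) return false;
    uint32_t tail;
    memcpy(&tail, (const char *)p + h->len, sizeof tail);
    return tail == MAGIC_TAIL;
}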
I've got a lot of sympathy with your question, as I'm in an almost identical position myself. I appreciate what a lot of the replies are saying, and they are correct - the routine supplying the pointer should be providing a valid pointer. In my case, it is almost inconceivable that they could have corrupted the pointer - but if they had managed, it would be MY software that crashes, and ME that would get the blame :-(
My requirement isn't that I continue after a segmentation fault - that would be dangerous - I just want to report what happened to the customer before terminating so that they can fix their code rather than blaming me!
This is how I've found to do it (on Windows): http://www.cplusplus.com/reference/clibrary/csignal/signal/
To give a synopsis:
#include <signal.h>
#include <cstdlib>
#include <iostream>

using namespace std;

void terminate(int param)
/// Function executed if a segmentation fault is encountered during the cast to an instance.
{
    cerr << "\nThe function received a corrupted reference - please check the user-supplied dll.\n";
    cerr << "Terminating program...\n";
    exit(1);
}

...

void MyFunction()
{
    void (*previous_sigsegv_function)(int);
    previous_sigsegv_function = signal(SIGSEGV, terminate);

    // <-- insert risky stuff here -->

    signal(SIGSEGV, previous_sigsegv_function);
}
Now this appears to behave as I would hope (it prints the error message, then terminates the program) - but if someone can spot a flaw, please let me know!
There are no provisions in C++ to test for the validity of a pointer as a general case. One can obviously assume that NULL (0x00000000) is bad, and various compilers and libraries like to use "special values" here and there to make debugging easier (for example, if I ever see a pointer show up as 0xCECECECE in Visual Studio I know I did something wrong), but the truth is that since a pointer is just an index into memory, it's near impossible to tell just by looking at the pointer whether it's the "right" index.
There are various tricks that you can do with dynamic_cast and RTTI to ensure that the object pointed to is of the type that you want, but they all require that you are pointing to something valid in the first place.
If you want to ensure that your program can detect "invalid" pointers then my advice is this: set every pointer you declare either to NULL or a valid address immediately upon creation, and set it to NULL immediately after freeing the memory that it points to. If you are diligent about this practice, then checking for NULL is all you ever need.
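One way to make that discipline mechanical is a small helper macro; this is just an illustrative sketch, not something from the answer:

#include <stdlib.h>

/* Free a pointer and immediately reset it, so no stale value can linger. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)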
Setting the pointer to NULL before and after using is a good technique. This is easy to do in C++ if you manage pointers within a class for example (a string):
#include <cstdlib>
#include <cstring>

class SomeClass
{
public:
    SomeClass();
    ~SomeClass();
    void SetText( const char *text );
    char *GetText() const { return MyText; }
    void Clear();
private:
    char * MyText;
};

SomeClass::SomeClass()
{
    MyText = NULL;
}

SomeClass::~SomeClass()
{
    Clear();
}

void SomeClass::Clear()
{
    if (MyText)
        free( MyText );
    MyText = NULL;
}

void SomeClass::SetText( const char *text )
{
    Clear();
    MyText = (char *) malloc( strlen(text) + 1 );  // +1 for the terminating '\0'
    if (MyText)
        strcpy( MyText, text );
}
Indeed, something can be done on specific occasions: for example, if you want to check whether a string pointer is valid, using the write(fd, buf, size) syscall can do the magic: let fd be a file descriptor of a temporary file you create for the test, and buf point to the string you are testing. If the pointer is invalid, write() will return -1 with errno set to EFAULT, indicating that buf is outside your accessible address space.
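A rough sketch of that write()-based probe; the helper name is made up, and the scratch descriptor is assumed to come from something like open("/dev/null", O_WRONLY), which avoids creating a temporary file:

#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Returns true if the kernel could read size bytes starting at buf. */
bool buffer_is_readable(int scratch_fd, const void *buf, size_t size)
{
    errno = 0;
    if (write(scratch_fd, buf, size) >= 0)
        return true;
    return errno != EFAULT;   /* EFAULT: buf is outside our address space */
}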
Peeter Joos's answer is pretty good. Here is an "official" way to do it:
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>
#include <sys/mman.h>

bool is_pointer_valid(void *p) {
    /* get the page size */
    size_t page_size = sysconf(_SC_PAGESIZE);
    /* find the address of the page that contains p */
    void *base = (void *)((((size_t)p) / page_size) * page_size);
    /* msync succeeds for mapped pages; ENOMEM means the page is unmapped */
    int ret = msync(base, page_size, MS_ASYNC) != -1;
    return ret ? ret : errno != ENOMEM;
}
There isn't any portable way of doing this, and doing it for specific platforms can be anywhere between hard and impossible. In any case, you should never write code that depends on such a check - don't let the pointers take on invalid values in the first place.
As others have said, you can't reliably detect an invalid pointer. Consider some of the forms an invalid pointer might take:
You could have a null pointer. That's one you could easily check for and do something about.
You could have a pointer to somewhere outside of valid memory. What constitutes valid memory varies depending on how the run-time environment of your system sets up the address space. On Unix systems, it is usually a virtual address space starting at 0 and going to some large number of megabytes. On embedded systems, it could be quite small. It might not start at 0, in any case. If your app happens to be running in supervisor mode or the equivalent, then your pointer might reference a real address, which may or may not be backed up with real memory.
You could have a pointer to somewhere inside your valid memory, even inside your data segment, bss, stack or heap, but not pointing at a valid object. A variant of this is a pointer that used to point to a valid object, before something bad happened to the object. Bad things in this context include deallocation, memory corruption, or pointer corruption.
You could have a flat-out illegal pointer, such as a pointer with illegal alignment for the thing being referenced.
The problem gets even worse when you consider segment/offset based architectures and other odd pointer implementations. This sort of thing is normally hidden from the developer by good compilers and judicious use of types, but if you want to pierce the veil and try to outsmart the operating system and compiler developers, well, you can, but there is not one generic way to do it that will handle all of the issues you might run into.
The best thing you can do is allow the crash and put out some good diagnostic information.
In general, it's impossible to do. Here's one particularly nasty case:
#include <cstdio>

struct Point2d {
    int x;
    int y;
};

struct Point3d {
    int x;
    int y;
    int z;
};

void dump(Point3d *p)
{
    printf("[%d %d %d]\n", p->x, p->y, p->z);
}

Point2d points[2] = { {0, 1}, {2, 3} };
Point3d *p3 = reinterpret_cast<Point3d *>(&points[0]);
dump(p3);
On many platforms, this will print out:
[0 1 2]
You're forcing the runtime system to incorrectly interpret bits of memory, but in this case it's not going to crash, because the bits all make sense. This is part of the design of the language (look at C-style polymorphism with struct sockaddr, sockaddr_in, sockaddr_in6), so you can't reliably protect against it on any platform.
It's unbelievable how much misleading information you can read in the articles above...
Even Microsoft's MSDN documentation claims that IsBadPtr is banned. Oh well - I prefer a working application over a crashing one, even if "working" might mean working incorrectly (as long as the end user can continue with the application).
By googling I haven't found any useful example for Windows. I found a solution for 32-bit apps,
http://www.codeproject.com/script/Content/ViewAssociatedFile.aspx?rzp=%2FKB%2Fsystem%2Fdetect-driver%2F%2FDetectDriverSrc.zip&zep=DetectDriverSrc%2FDetectDriver%2Fsrc%2FdrvCppLib%2Frtti.cpp&obid=58895&obtid=2&ovid=2
but I also need to support 64-bit apps, so this solution did not work for me.
But I've harvested Wine's source code, and managed to cook up similar code that works for 64-bit apps as well - attaching the code here:
#include <typeinfo.h>

typedef void (*v_table_ptr)();

typedef struct _cpp_object
{
    v_table_ptr* vtable;
} cpp_object;

#ifndef _WIN64
typedef struct _rtti_object_locator
{
    unsigned int signature;
    int base_class_offset;
    unsigned int flags;
    const type_info *type_descriptor;
    //const rtti_object_hierarchy *type_hierarchy;
} rtti_object_locator;
#else
typedef struct
{
    unsigned int signature;
    int base_class_offset;
    unsigned int flags;
    unsigned int type_descriptor;
    unsigned int type_hierarchy;
    unsigned int object_locator;
} rtti_object_locator;
#endif

/* Get type info from an object (internal) */
static const rtti_object_locator* RTTI_GetObjectLocator(void* inptr)
{
    cpp_object* cppobj = (cpp_object*) inptr;
    const rtti_object_locator* obj_locator = 0;

    if (!IsBadReadPtr(cppobj, sizeof(void*)) &&
        !IsBadReadPtr(cppobj->vtable - 1, sizeof(void*)) &&
        !IsBadReadPtr((void*)cppobj->vtable[-1], sizeof(rtti_object_locator)))
    {
        obj_locator = (rtti_object_locator*) cppobj->vtable[-1];
    }

    return obj_locator;
}
And the following code can detect whether the pointer is valid or not; you probably need to add some NULL checking:
CTest* t = new CTest();
//t = (CTest*) 0;
//t = (CTest*) 0x12345678;

const rtti_object_locator* ptr = RTTI_GetObjectLocator(t);

#ifdef _WIN64
char *base = ptr->signature == 0 ? (char*)RtlPcToFileHeader((void*)ptr, (void**)&base) : (char*)ptr - ptr->object_locator;
const type_info *td = (const type_info*)(base + ptr->type_descriptor);
#else
const type_info *td = ptr->type_descriptor;
#endif

const char* n = td->name();
This gets the class name from the pointer - I think it should be enough for your needs.
One thing I'm still afraid of is the performance of pointer checking - in the code snippet above there are already 3-4 API calls being made, which might be overkill for time-critical applications.
It would be good if someone could measure the overhead of pointer checking compared, for example, to C#/managed C++ calls.
It is not a very good policy to accept arbitrary pointers as input parameters in a public API. It's better to have "plain data" types like an integer, a string or a struct (I mean a classical struct with plain data inside, of course; officially anything can be a struct).
Why? Well because as others say there is no standard way to know whether you've been given a valid pointer or one that points to junk.
But sometimes you don't have the choice - your API must accept a pointer.
In these cases, it is the duty of the caller to pass a good pointer. NULL may be accepted as a value, but not a pointer to junk.
Can you double-check in any way? Well, what I did in a case like that was to define an invariant for the type the pointer points to, and call it when you get the pointer (in debug mode). At least if the invariant fails (or crashes) you know that you were passed a bad value.
// API that does not allow NULL
void PublicApiFunction1(Person* in_person)
{
    assert(in_person != NULL);
    assert(in_person->Invariant());
    // Actual code...
}

// API that allows NULL
void PublicApiFunction2(Person* in_person)
{
    assert(in_person == NULL || in_person->Invariant());
    // Actual code (must keep in mind that in_person may be NULL)
}
The following does work on Windows (somebody suggested it before):
static void copy(void * target, const void* source, int size)
{
    __try
    {
        CopyMemory(target, source, size);
    }
    __except(EXCEPTION_EXECUTE_HANDLER)
    {
        doSomething(--whatever--);
    }
}
The function has to be static, standalone, or a static method of some class.
To test for read access, copy the data into a local buffer.
To test for write access without modifying the contents, write the same contents back over them.
You can test the first/last addresses only.
If the pointer is invalid, control will be passed to 'doSomething', and then outside the brackets.
Just do not use anything requiring destructors, like CString.
On Windows I use this code:
void * G_pPointer = NULL;
const char * G_szPointerName = NULL;

void CheckPointerIternal()
{
    volatile char cTest = *((char *)G_pPointer);  /* volatile, so the read is not optimized away */
    (void)cTest;
}

bool CheckPointerIternalExt()
{
    bool bRet = false;

    __try
    {
        CheckPointerIternal();
        bRet = true;
    }
    __except (EXCEPTION_EXECUTE_HANDLER)
    {
    }

    return bRet;
}

void CheckPointer(void * A_pPointer, const char * A_szPointerName)
{
    G_pPointer = A_pPointer;
    G_szPointerName = A_szPointerName;

    if (!CheckPointerIternalExt())
        throw std::runtime_error("Invalid pointer " + std::string(G_szPointerName) + "!");
}
Usage:
unsigned long * pTest = (unsigned long *) 0x12345;
CheckPointer(pTest, "pTest"); //throws exception
On macOS, you can do this with mach_vm_region, which as well as telling you if a pointer is valid, also lets you validate what access you have to the memory to which the pointer points (read/write/execute). I provided sample code to do this in my answer to another question:
#include <mach/mach.h>
#include <mach/mach_vm.h>
#include <stdio.h>
#include <stdbool.h>
bool ptr_is_valid(void *ptr, vm_prot_t needs_access) {
    vm_map_t task = mach_task_self();
    mach_vm_address_t address = (mach_vm_address_t)ptr;
    mach_vm_size_t size = 0;
    vm_region_basic_info_data_64_t info;
    mach_msg_type_number_t count = VM_REGION_BASIC_INFO_COUNT_64;
    mach_port_t object_name;

    kern_return_t ret = mach_vm_region(task, &address, &size, VM_REGION_BASIC_INFO_64,
                                       (vm_region_info_t)&info, &count, &object_name);
    if (ret != KERN_SUCCESS) return false;

    return ((mach_vm_address_t)ptr) >= address && ((info.protection & needs_access) == needs_access);
}

#define TEST(ptr,acc) printf("ptr_is_valid(%p,access=%d)=%d\n", (void*)(ptr), (acc), ptr_is_valid((void*)(ptr),(acc)))

int main(int argc, char**argv) {
    TEST(0,0);
    TEST(0,VM_PROT_READ);
    TEST(123456789,VM_PROT_READ);
    TEST(main,0);
    TEST(main,VM_PROT_READ);
    TEST(main,VM_PROT_READ|VM_PROT_EXECUTE);
    TEST(main,VM_PROT_EXECUTE);
    TEST(main,VM_PROT_WRITE);
    TEST((void*)(-1),0);
    return 0;
}
The SEI CERT C Coding Standard recommendation MEM10-C. Define and use a pointer validation function says it is possible to do a check to some degree, especially under Linux OS.
The method described in the link is to keep track of the highest memory address returned by malloc and add a function that tests whether someone tries to use a pointer greater than that value. It is probably of limited use; a rough sketch follows.
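A loose sketch of that idea, assuming every allocation goes through your own wrapper; the names are illustrative, and the check is deliberately coarse (a pointer that passes it can still be invalid):

#include <stdbool.h>
#include <stdlib.h>

static const void *mem_high_water;   /* highest byte we have ever handed out */

void *checked_malloc(size_t size)
{
    char *p = malloc(size);
    if (p && (const void *)(p + size) > mem_high_water)
        mem_high_water = p + size;
    return p;
}

/* Range test in the spirit of MEM10-C: non-null and not past the high-water mark. */
bool ptr_in_range(const void *p)
{
    return p != NULL && p <= mem_high_water;
}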
IsBadReadPtr(), IsBadWritePtr(), IsBadCodePtr(), IsBadStringPtr() for Windows.
These take time proportional to the length of the block, so for a sanity check I just check the starting address.
I have seen various libraries use some method to check for unreferenced memory and such. I believe they simply "override" the memory allocation and deallocation methods (malloc/free) with logic that keeps track of the pointers. I suppose this is overkill for your use case, but it would be one way to do it.
Technically you can override operator new (and delete) and collect information about all allocated memory, so you can have a method to check if heap memory is valid. But:

1. you still need a way to check whether the pointer is allocated on the stack
2. you will need to define what a 'valid' pointer is:
   a) memory at that address is allocated
   b) memory at that address is the start address of an object (e.g. the address is not in the middle of a huge array)
   c) memory at that address is the start address of an object of the expected type

Bottom line: the approach in the question is not the C++ way; you need to define rules which ensure that your function receives valid pointers.
There is no way to make that check in C++. What should you do if other code passes you an invalid pointer? You should crash. Why? Check out this link: http://blogs.msdn.com/oldnewthing/archive/2006/09/27/773741.aspx
Addendum to the accepted answer(s):
Assume that your pointer could hold only three values - 0, 1 and -1, where 1 signifies a valid pointer, -1 an invalid one and 0 another invalid one. What is the probability that your pointer is NULL, all values being equally likely? 1/3. Now, take the valid case out, so for every invalid case, you have a 50:50 ratio to catch all errors. Looks good, right? Scale this up to a 4-byte pointer. There are 2^32, or 4294967296, possible values. Of these, only ONE value is correct, one is NULL, and you are still left with 4294967294 other invalid cases. Recalculate: you have a test for 1 out of (4294967294 + 1) invalid cases. A probability of about 2.3e-10, or 0 for most practical purposes. Such is the futility of the NULL check.
You know, a new driver (at least on Linux) that is capable of this probably wouldn't be that hard to write.
On the other hand, it would be folly to build your programs like this. Unless you have some really specific and single use for such a thing, I wouldn't recommend it. If you built a large application loaded with constant pointer validity checks it would likely be horrendously slow.
you should avoid these methods because they do not work. blogs.msdn.com/oldnewthing/archive/2006/09/27/773741.aspx – JaredPar Feb 15 '09 at 16:02
If they don't work, won't the next Windows update fix them?
If they didn't work at a conceptual level, the functions would probably be removed from the Windows API completely.
The MSDN documentation claims that they are banned, and the reason for this is probably flaws they encourage in application design (generally you should not eat invalid pointers silently - if you're in charge of the design of the whole application, of course), plus the performance/time cost of pointer checking.
But you should not claim that they don't work because of some blog.
In my test application I've verified that they do work.
These links may be helpful:
_CrtIsValidPointer - verifies that a specified memory range is valid for reading and writing (debug version only).
http://msdn.microsoft.com/en-us/library/0w1ekd5e.aspx
_CrtCheckMemory - confirms the integrity of the memory blocks allocated in the debug heap (debug version only).
http://msdn.microsoft.com/en-us/library/e73x0s4b.aspx

Why is memset called after calloc?

I was researching the code of a library and noticed that calls to calloc there are followed by memset for the block allocated by calloc.
I found this question, with a quite comprehensive answer on the differences between calloc and malloc + memset, and on calling memset for just-allocated storage:
Why malloc+memset is slower than calloc?
What I still can't understand is why one would want to do this.
What are the benefits of this operation?
The code example from the library mentioned above:
light_pcapng_file_info *light_create_default_file_info()
{
    light_pcapng_file_info *default_file_info = calloc(1, sizeof(light_pcapng_file_info));
    memset(default_file_info, 0, sizeof(light_pcapng_file_info));
    default_file_info->major_version = 1;
    return default_file_info;
}
The definition of the allocated structure (each array contains 32 elements):
typedef struct _light_pcapng_file_info {
    uint16_t major_version;
    uint16_t minor_version;
    char *file_comment;
    size_t file_comment_size;
    char *hardware_desc;
    size_t hardware_desc_size;
    char *os_desc;
    size_t os_desc_size;
    char *user_app_desc;
    size_t user_app_desc_size;
    size_t interface_block_count;
    uint16_t link_types[MAX_SUPPORTED_INTERFACE_BLOCKS];
    double timestamp_resolution[MAX_SUPPORTED_INTERFACE_BLOCKS];
} light_pcapng_file_info;
EDIT:
In addition to the accepted answer, I'd like to provide some info my colleague pointed me to. There was a bug in glibc which, sometimes, prevented calloc from zeroing out memory. Here's the link:
https://bugzilla.redhat.com/show_bug.cgi?id=1293976
Actual bug report text, in case the link gets moved:
glibc: calloc() returns non-zero'ed memory
Description of problem:
At Facebook we had an app that started hanging and crashing weirdly when going from glibc-2.12-1.149.el6.x86_64 to glibc-2.12-1.163.el6.x86_64. Turns out this patch
glibc-rh1066724.patch
Introduced the problem.
You added the following bit to _int_malloc()
/* There are no usable arenas. Fall back to sysmalloc to get a chunk from
   mmap. */
if (__glibc_unlikely (av == NULL))
{
    void *p = sYSMALLOc (nb, av);
    if (p != NULL)
        alloc_perturb (p, bytes);
    return p;
}
But this isn't ok: alloc_perturb unconditionally memsets the front byte to 0xf, unlike upstream where it checks to see if perturb_byte is set. This needs to be changed to
if (p != NULL && __builtin_expect (perturb_byte, 0))
    alloc_perturb (p, bytes);
return p;
The patch I've attached fixes the problem for me.
This problem is exacerbated by the fact that any sort of lock contention on the arenas results in us falling back on mmap()'ing a new chunk. This is because we check to see if the uncontended arena we check is corrupt, and if it is we loop through, and if we loop to the beginning we know we didn't find anything. Except if our initial arena isn't actually corrupt we'll still return NULL, so we fall back on this mmap() thing more often, which really makes things unstable.
Please get this fixed as soon as possible, I'd even go so far as to call it a possible security issue.
Calling memset() ensures that the OS actually does the virtual memory mapping. As noted in the answers on the question you linked, calloc() can be optimized so that the actual memory mapping is deferred.
The application might have reasons not to defer the actual creation of the virtual memory mapping - such as using the buffer to read from a very high-speed device - although in the case of using memset() to zero out the memory, using calloc() instead of malloc() does seem redundant.
Nobody is perfect, that's all. Yes, memset to zero following a calloc is extravagant. If you want an explicit memset in order to guarantee you are in possession of the memory that you've asked for, then it should follow a malloc instead.
New code has about 20 bugs per 1,000 lines and the laws of probability imply that not all of them are weeded out. Plus this is not really a bug, as there's no ill-behaviour to be observed.
I would call it a bug, since it's doing pointless work; calloc() is specified to return already-zeroed memory, so why clear it again? Perhaps a failed refactoring, when someone switched from malloc().
If the code is open source and/or in a repository you have access to, I would check the commit history for those lines and see what has been going on. With a bit of luck, there's a commit message that tells the motivation ...

How to deal with assert() in a function, when you have dynamically allocated memory in main?

I have the following C function:
void mySwap(void * p1, void * p2, int elementSize)
{
    void * temp = (void*) malloc(elementSize);
    assert(temp != NULL);

    memcpy(temp, p1, elementSize);
    memcpy(p1, p2, elementSize);
    memcpy(p2, temp, elementSize);

    free(temp);
}
that I want to use in a generic sorting function. Let's suppose that I use it to sort a dynamically allocated array owned by main(). Now let's suppose that at some point temp in mySwap() is actually NULL and the whole program is aborted without freeing the dynamically allocated array in main(). I thought that both mySwap() and the sorting function could return a bool value indicating whether the allocation was successful or not, and by using if statements I could free the array in main() and exit(EXIT_FAILURE), but it doesn't seem like a very elegant solution. What would be a good way to prevent a memory leak in such an instance?
assert is typically used during debugging to identify problems/errors that should never occur.
Out of memory is something that can occur, and so either should not be handled by assert, or, if you do use assert, beware that it will abort the program. Once the program aborts, all memory used by the program is deallocated, so don't worry about that.
Note: If you don't want to have unwieldy if statements everywhere just to handle errors that hardly ever occur, you can use setjmp/longjmp to return to a recoverable state; see the sketch below.
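A small sketch of that setjmp/longjmp idea, adapted to the question's mySwap(); the names xmalloc and out_of_memory are illustrative:

#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static jmp_buf out_of_memory;

static void *xmalloc(size_t n)
{
    void *p = malloc(n);
    if (!p)
        longjmp(out_of_memory, 1);   /* unwind back to the setjmp in main */
    return p;
}

static void my_swap(void *p1, void *p2, size_t element_size)
{
    void *temp = xmalloc(element_size);
    memcpy(temp, p1, element_size);
    memcpy(p1, p2, element_size);
    memcpy(p2, temp, element_size);
    free(temp);
}

int main(void)
{
    int *array = malloc(100 * sizeof *array);   /* owned by main */
    if (!array)
        return EXIT_FAILURE;

    if (setjmp(out_of_memory) != 0) {
        /* recovery point: any failed xmalloc lands here */
        free(array);
        fprintf(stderr, "out of memory\n");
        return EXIT_FAILURE;
    }

    array[0] = 1;
    array[1] = 2;
    my_swap(&array[0], &array[1], sizeof array[0]);

    free(array);
    return 0;
}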
You have to realize that the reason malloc fails is that your computer has run out of memory. From that point onwards, there's nothing meaningful your program can do, except terminate as gracefully as it can.
The OS will free the memory upon program termination, so that's not something you need to worry about.
Still, in the normal case, it is of course good practice to free() yourself, manually. Not so much for the sake of "making the memory available again" - the OS will ensure that - but to verify that your program has not gone terribly wrong along the way and created heap corruption, leaks or other bugs. If you have such bugs in your program, it will crash during the free() call, which is a good thing, as the bugs will surface.
assert should preferably not be used in production code. Build your own error handling if needed, that's something better than just violently terminating your own program in the middle of execution.
Avoid the problem by not using malloc.
Instead of allocating a block of memory for every swap, do the swap one byte at a time:
for (int i = 0; i < elementSize; ++i) {
    char tmp = ((char*)p1)[i];
    ((char*)p1)[i] = ((char*)p2)[i];
    ((char*)p2)[i] = tmp;
}
Only use assert() to catch programmer error during development; in release builds it doesn't do anything. If you need to test other things, use proper error handling, whether that means abort(), return codes or emulating exceptions using setjmp()/longjmp().
As an aside, do not cast the result of malloc().

Why doesn't the memory footprint of this program increase?

I am in a Scratchbox cross-compile environment and have this:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main()
{
    int * ptr;
    int i = 0;

    while (1)
    {
        ptr = (int*)malloc( 10485760 * sizeof(int) );
        if (ptr == NULL)
        {
            printf("Could not malloc\n");
            exit(1);
        }
        else
        {
            printf("Malloc done\n");
            for (i = 0; i <= 10485759; i++)
            {
                ptr[i] = i;
            }
            sleep(5);
            continue;
        }
    }
}
When I run the binary and do
ps -p pid -o cmd,rss,%mem
I do not see any increase in the memory footprint of the process. Why is that?
You probably built with heavy optimization.
On most modern systems gcc knows that malloc returns a non-aliased pointer. That is to say, it will never return the same pointer twice, and it will never return a pointer you have saved 'live' somewhere else.
I find this very, very hard to imagine, but it is possible that malloc is being called once and its return value used over and over. The reasoning:
It knows your memory is a dead store, i.e. you write to it, but it never gets read. The pointer is known not to be aliased, so it hasn't escaped to be read somewhere else, and it isn't marked volatile. Your for loop itself /could/ get thrown away.
At that point it could just use the same memory over and over.
Now here is why I find it hard to believe: just how much does gcc know about malloc? malloc could have any kind of side effect, from incrementing a global 'number of times called' to painting my room a random shade of blue. It seems really weird that it would drop the call and assume it to be side-effect-free. Hell, malloc could be implemented to return NULL every 100th call (maybe not quite to spec, but who is to say).
What it is NOT doing is freeing the memory on your behalf. That goes beyond what it 'could' know and into the territory of 'doing things it's just not allowed to'. You're allowed to leak memory, lame though it may be.
Two things would be useful here:
1) Compile environment: which OS, compiler, and command-line flags.
2) Disassembly of the final binary (objdump or from the compiler).
rss and %mem are both in terms of the physical memory being used by the process at the moment. The OS has plenty of opportunity to page stuff out. Try adding vsz; I bet that grows as you expect.
Your compiler is helping you out by freeing the allocated memory (assuming that the optimized version of your code even gets around to doing the malloc) when it realizes that you aren't using it. You might try printing out the value of the pointer (printf("%p\n", (void *)ptr);) - I suspect you'll get repeating values. A more reliable check would write a known bit string into memory, having first looked to see if the allocated memory already contains that string. In other words, instead of writing i, write 0xdeadbeef0cabba6e over and over again, after checking to see if that bit pattern is already in the space you have allocated.
