Security in returning arrays - c

Suppose the following function:
float *dosomething(const float *src, const int N)
{
float *dst = (float *)malloc(sizeof(float) * N);
if(!dst)
{
printf("Cannot allocate memory\n");
exit(EXIT_FAILURE);
}
for(int i = 0; i < N; i++)
dst[i] = src[i] * 2;
return dst;
}
In this case we don't need to allocate memory beforehand if we want to use it, right?
Now, just another case:
void dosomething(float *dst, const float *src, const int N)
{
for(int i = 0; i < N; i++)
dst[i] = src[i] * 2;
}
In the last case we need to allocate memory beforehand. So I share it, and I'm wondering which is the best method for returning an array. Which of them provides more security to a user of the library or class? Which method is most recommended, and why?

What's better practice or a better idea depends on what you're actually trying to do.
A function like char *strdup(const char *s) (POSIX) is implemented like the first case, it takes a string as an argument, allocates memory for another of the same length and then copies the source to the new piece of memory. It's convenient and saves you from manually doing the common action of allocating a buffer for the copy of the string. You could assume this is simply like a call to malloc and then strcpy/memcpy.
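As a rough illustration of that malloc-then-copy idea, a strdup-like function could be sketched as follows. The name my_strdup is hypothetical (chosen so it doesn't clash with the real strdup), and this is not a claim about how any particular C library implements it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): allocate, then copy.
   Returns NULL if malloc() fails; the caller must free() the result. */
char *my_strdup(const char *s)
{
    assert(s != NULL);              /* precondition: s is a valid C string */
    size_t len = strlen(s) + 1;     /* +1 for the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

The caller gets convenience, but also the obligation to free() whatever comes back.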
Then you've got a function like char *strcpy(char *dest, const char *src), which is like the second case, where you have control of where the string is going to be copied to. This way you're not forced into having the string copied into a dynamically allocated, not of your choice, piece of memory.
The first way might come in handy if you needed to create and initialise some sort of dynamic structure (list, tree, etc), but then again the second way also suffices and gives you control of what piece of memory is being used; you can use dynamically allocated memory on the heap, or local variables on the stack, etc.
Personally, I would usually go the second way, because I have more control of what variable's being initialised, and I'm not forced into having to use a newly malloc'd piece of memory (what if I wanted my local variable to be initialised?). You could always then write a wrapper function that makes a call to malloc and then to your function using the newly allocated memory as the destination.
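The wrapper idea mentioned above might look roughly like this. dosomething is the caller-supplies-the-buffer version from the question; dosomething_alloc is a hypothetical name for the convenience wrapper, and returning NULL on failure (instead of calling exit) is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Second style from the question: the caller supplies the destination. */
void dosomething(float *dst, const float *src, int n)
{
    assert(dst != NULL && src != NULL);
    for (int i = 0; i < n; i++)
        dst[i] = src[i] * 2;
}

/* Hypothetical convenience wrapper: allocates the destination, then
   delegates. Returns NULL on failure; the caller owns the result and
   must free() it. */
float *dosomething_alloc(const float *src, int n)
{
    if (n <= 0)
        return NULL;
    float *dst = malloc(sizeof *dst * (size_t)n);
    if (dst != NULL)
        dosomething(dst, src, n);
    return dst;
}
```

A caller who wants the result in a local array calls dosomething directly; one who wants a fresh heap block calls the wrapper. Neither style is forced on anyone.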
It's really up to you and your design and what you're trying to achieve, there are no right and wrong ways and as long as you remember the allocated memory you shouldn't have any problems. I wouldn't say either of the two is more "secure."

There is no RIGHT answer.
C is inherently insecure: you can only make data secure by making a copy and returning the copy, thus hiding the real location of the original from the caller.
What matters more is how deallocation of shared data is handled; that usually dictates which approach is more correct.
In the example you cite the only data being accessed is the data the caller has already passed (and already owns). So the fact you allocate memory, do something with the data and return the allocated memory to the caller is just fine. Just document that is how the function works (like strdup() works on C strings, the caller is responsible for using free() on any returned non-NULL pointer).
FWIW you don't "share" the data. The caller invokes the function to do work on the data on its behalf; once the function returns, no more access occurs. If the function retained a memory pointer (or other data), it would be correct to describe the situation as sharing data, since at some point in the future that retained pointer (or other data) may be used in some way.

There is no definite "this is better than the other". I never actually think about these things, and just do whatever comes to mind. Which is likely to be the more "natural" solution for the problem at hand. And if it turns out to be "bad" along the way... well, luckily we are not programming by engraving on stone tablets.
In your case, without knowing anything about the software at all, nothing "feels" better. That's actually quite common; almost everything you do in programming can be done in different ways, and often there's no actual difference other than personal preference or just random "that's what I came up with first".
For example, your second solution lets the caller copy to existing memory, which might be part of a larger object. On the other hand, he has to provide the destination memory every time. Although this could also mean saving allocations by using just one memory block for multiple calls. The first solution seems slightly more convenient for the simple case, but 'locks' the user in that case: there's always a fresh memory block allocated.

Related

gsoap allocating mem for dynamic array of structures.

gSoap usefully creates stubs to help with memory management etc. One of these
commands is soap_malloc but there doesn't seem to be a corresponding soap_realloc.
Before I start to write my own push and pop methods I just want to ensure I'm not missing anything obvious.
//example.h generated with wsdl2h
struct ns1__Customer
{
int __sizeProduct;
ns1__Product *Product;
int customerid;
};
struct ns1__Product
{
int productid;
};
I am currently using soap_malloc and then realloc for dynamically growing the array.
//I could use soap_new_ns1__Product(&soap,XXX) and allocate mem for XXX
//number of ns1__Product structures but this is wasteful and doesn't solve
//anything
struct ns1__Customer cust;
soap_default_ns1__Customer(soap, &cust);
struct ns1__Product *prod_array = NULL;
//allocate mem for 1 product
prod_array = soap_new_ns1__Product(soap,1) ;
soap_default_ns1__Product(soap, &prod_array[0]);
prod_array[0].productid=111;
//Need to add product therefore need to realloc mem.
//IS THIS THE BEST WAY IN gsoap?
prod_array = realloc( prod_array, 2 * sizeof(struct ns1__Product)) ;
soap_default_ns1__Product(soap, &prod_array[1]);
prod_array[1].productid=222;
//assigning array ptr to Customer
cust.Product=prod_array;
// Remember to adjust sizeProduct
cust.__sizeProduct=2;
This seems wrong and clumsy, does gsoap suggest a better way? I can't find a clear example in the documentation or by searching online.
Before I start to write my own push and pop methods I just want to ensure I'm not missing anything obvious.
I suspect you're missing that soap_malloc() allocates memory that is automatically freed under at least some circumstances. As such, using realloc() to resize the allocated memory is begging for trouble. There's a fair chance that the reallocation as such will succeed, but you're likely at minimum to end up with a nasty mess when gSOAP's automatic freeing tries to kick in in soap_end().
On the other hand, I don't think you're overlooking any reallocation function. The docs indeed do not seem to describe any. You can always implement your own reallocation wrapper that allocates fresh memory with soap_malloc(), copies the contents of the original space (whose size you'll need to know somehow), and releases the original space with soap_dealloc().
The bottom line appears to be that soap_malloc() is not intended to be a general-purpose allocator, and it is not particularly well suited to your use case. Its primary objective appears to be internal, to relieve library users of any need to manually free the temporary objects that the library allocates. I take exposing it to library users for their direct use to be intended as a convenience.
If you want the ability to reallocate blocks, then I suggest you obtain them in the first place via regular malloc(). You'll want to read the docs carefully if you're going to mix malloc()ed data with soap_malloc()ed data, but it is likely possible. Alternatively, consider approaches that do not require reallocation, such as storing your data in a linked list instead of a dynamic array.
What you're doing there is indeed wrong. After using soap_new_T() (where T in your case is ns1__Product), the soap context now manages that memory by holding onto the ns1__Product* pointer internally. Later, when you call soap_destroy() to free all soap_new_T()-allocated objects managed by the soap context, the context will be trying to free a pointer that no longer points to valid memory since you called realloc().
As John Bollinger pointed out, there's no built-in way in gSOAP to do something similar to a realloc. You'd instead just need to do the reallocation manually, e.g.:
// allocate memory for 1 product (as you already do above)
prod_array = soap_new_ns1__Product(soap, 1);
// ... do some stuff, realize you need to "realloc" to add another product ...
// allocate a new array, managed by the soap context
struct ns1__Product* new_array = soap_new_ns1__Product(soap, 2);
// copy the old array into the new one (assuming old_size is however many elements are in prod_array)
for(std::size_t i = 0; i < old_size; ++i)
{
new_array[i] = prod_array[i];
}
// tell the soap context to destroy the old array
soap_dealloc(soap, prod_array);
// from here on, use the new array
prod_array = new_array;
Aside:
It seems like it should be possible to use an std::vector<ns1__Product> rather than an array, which would solve your problem in an arguably better manner, but that question was already asked here to no avail. Unfortunately I don't know the answer to that at this time.

Binary resources embedded in .exe and memory management when loaded

I'm working on a small C program and I need to embed binary data into an exe file. The method I'm using is converting that binary data into a char[] array, but I'm not including that array directly as a global variable; instead, I define it inside a function (LoadResource) that dynamically allocates an array on the heap and copies the original data into it. This is what I mean:
char *dataPntr;
void LoadResource()
{
char data[2048] = {/*my binary data */};
dataPntr = malloc(2048);
for (int i = 0; i < 2048; i++) dataPntr [i] = data[i];
}
That way, if my understanding is correct, when calling LoadResource() data[] will be placed in stack, copied to heap and finally data[] will be automatically deallocated from stack; heap copy should be manually deallocated with free().
I'm doing it this way because the resource is only used in some situations, not always... and I prefer to avoid a large global variable.
My questions:
When running the program, is data[] array placed somewhere in memory? text segment maybe? or is it just loaded into stack when calling LoadResource()?
Is my solution the proper one (in terms of memory management) or would it be better to just declare a global data array?
Thanks for your answers!
Generally it is a good idea to avoid global variables. I won't say you never need them, but they can be a pain to debug. The problem is that it can be difficult to follow who changed it last. And if you ever do any multi-threading then you will never want to see a global again!
I include your char *dataPntr in those comments - why is that global? It might be better to return the pointer instead.
Not sure why you are using an array on the stack (data), my guess is so that you can use the {...} initialisation syntax. Can you avoid that? It might not be a big deal, 2k is not a large overhead, but maybe it might grow?
Personally I would copy the data using memcpy()
You have a couple of "magic numbers" in your code, 2048 and 2018. Maybe one is a typo? To avoid this kind of issue, most will use a pre-processor macro. For example:
#include <stdlib.h> /* for malloc() */
#include <string.h> /* for memcpy() */
#define DATA_SIZE 2048
char * LoadResource(void)
{
char data[DATA_SIZE] = {/*my binary data */};
char * dataPntr = malloc(DATA_SIZE);
if (dataPntr)
memcpy(dataPntr, data, DATA_SIZE);
return dataPntr;
}
By the way, notice the prototype for LoadResource as void. In C (not C++) an empty parameter list means no parameter checking, not no parameters. Also note that I check the returned value from malloc. This means that the function will return NULL on error.
Another strategy might be to make the data array static instead; however, exactly when that gets initialised is compiler dependent, and you might find that you incur the memory overhead even if you don't use it.
While I agree with @cdarke in general, it sounds to me as if you are creating a constant array (i.e. one that is never modified at run-time). If this is true, I would not hesitate to make it a global const array. Most compilers will simply place the array in text memory at link time and there will not be any run-time overhead for initialization.
If, on the other hand, you need to modify the data at run-time, I'd follow @cdarke's example, except to make your data array static const. This way, again, most compilers will place the preinitialized array in the text segment and you will avoid the run-time overhead for initializing the data array.
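A minimal sketch of the const-array approach, assuming the data never changes at run-time. The names resource_data and get_resource and the four placeholder bytes are made up for illustration; where the array actually ends up (text, read-only data, or elsewhere) is up to the compiler and linker.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bytes standing in for the real binary resource. */
static const unsigned char resource_data[] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Hypothetical accessor: hands out a read-only view of the data.
   No copy is made and nothing needs to be freed. */
const unsigned char *get_resource(size_t *size_out)
{
    assert(size_out != NULL);
    *size_out = sizeof resource_data;
    return resource_data;
}
```

Because the array has static storage duration, there is no per-call stack copy and no malloc; code that needs a writable version can still memcpy it into its own buffer.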

Checking if a certain address in memory is allocated

I have a function that receives a pointer to a dynamic array of 100 ints. But instead of 100, only 50 were allocated by malloc or calloc beforehand.
Is there a way that I could check whether any element (the 79th, for example) is allocated, rather than wondering what this SIGSEGV actually means?
My question is purely theoretical and I have no actual code to show.
No, the pointer does not store its size. You may be better off storing the size and the pointer in a struct and passing it instead:
typedef struct
{
size_t size;
int *ptr;
} my_data;
void myFunc(my_data *data)
{
size_t i;
for(i = 0; i < data->size; i++)
{
// data->ptr[i];
}
}
void myFunc2(my_data *data, size_t index)
{
if(index < data->size)
{
// memory location exists
}
}
Well, you could do such a thing according to your description, given an array and looking for an index (which is slightly different from "any raw pointer"). And with some more work, it is even possible to do such a thing for any pointer.
The malloc function necessarily stores information about how much was allocated. Unfortunately, there is no standard for how this must be done. Some implementations over-allocate and store the size immediately preceding the allocated data. Others may store addresses in a map, and yet others may do something else; you don't know.
However, most (all?) C libraries and at least one linker that I know of have explicit support for overloading/hooking/replacing allocation functions.
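For example, the GNU C library historically let you set __malloc_hook (deprecated for years and removed in glibc 2.34), and GNU ld lets you do this at link time with --wrap=malloc, which redirects calls to malloc to a function you provide named __wrap_malloc.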
You could thus overload/hook malloc and free with a function that simply calls the real malloc function and stores the information how much was allocated yourself somewhere (e.g. by over-allocating and using the first word, or whatever you like).
Then write a function which takes a base pointer and an index. That function looks at the allocation info (now you know where to find it!), and can trivially check whether the index is in range. This does not work for "just any pointer".
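The over-allocating variant described above could be sketched like this. Instead of hooking the real malloc, this sketch assumes the program calls explicitly named wrappers; sized_malloc, sized_free and index_in_range are hypothetical names, and max_align_t requires C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header stored in front of every block. The union pads it to the
   strictest alignment so the user's data stays properly aligned. */
typedef union {
    size_t size;
    max_align_t align;
} block_header;

/* Hypothetical wrapper: over-allocate and remember the requested size. */
void *sized_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(block_header))
        return NULL;                        /* would overflow */
    block_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;                           /* user data starts after the header */
}

void sized_free(void *p)
{
    if (p != NULL)
        free((block_header *)p - 1);        /* step back to the real block */
}

/* Only valid for base pointers returned by sized_malloc(). */
int index_in_range(const int *base, size_t index)
{
    assert(base != NULL);
    const block_header *h = (const block_header *)(const void *)base - 1;
    return index < h->size / sizeof *base;
}
```

With 50 ints allocated through sized_malloc, index_in_range reports index 49 as valid and index 79 as out of range. Passing a pointer that did not come from sized_malloc reads garbage in front of the block, which is why this does not work for "just any pointer".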
An alternative solution which works for "just any pointer" would be to write an allocator that satisfies allocations from separate arenas rather than simply wrapping the real malloc. All allocations coming from the same arena have the same allocation size. Given any pointer, you would then only need to iterate over all your arenas and look whether the address is within the arena's start and end address.
However, one should normally be quite sure how much one has allocated, this should not be guesswork, or random luck, or something to figure out at runtime.
Also, given the presence of ready-to-use memory debuggers, I doubt it is really worth investing time in doing such a thing application-side. Just use something like valgrind, no need to write any code at all.
No, there's no portable and reliable way to check this from within the code.
There exist tools -- such as valgrind -- that may help diagnose certain types of memory bugs.
No, there isn't.
This is when you break out your dynamic analysis tool (e.g. valgrind), or use a real container that keeps information about its size.
Some years ago I used a library whose name I have forgotten. With it, you could create a try-catch block and attempt to access unknown data, e.g. x[79], inside the try block; if that memory was not allocated, an exception was raised.

Already freed memory

Is there any way in C to know if a memory block has previously been freed with free()? Can I do something like...
if(isFree(pointer))
{
//code here
}
OK, if you need to check whether a pointer has already been freed, you may want to revisit your design. You should never have to track either a reference count for a pointer or whether it has been freed. Also, some pointers do not point to dynamically allocated memory, so I hope you mean ones obtained from malloc(). This is my opinion, but if you have a solid design you should know when the things your pointers point to are no longer in use.
The only place I have seen this not work is in monolithic kernels because pages in memory need a usage count because of shared mappings among other things.
In your case simply set unused pointers to NULL and check that. This gives you a guaranteed way of knowing in the case that you have unused fields in structures that were malloced. A simple rule is wherever you free a pointer that needs to be checked in the above way just set it to NULL and replace isFree() with if pointer == NULL. This way no reference count needs to be tracked and you know for sure if your pointer is valid and not pointing to garbage.
No, there is no way.
You can, however, use a little code discipline as follows:
Always, always, always guard allocations made with malloc:
void * vp;
if((vp = malloc(SIZE))==NULL){
/* do something dreadful here to respond to the out of mem */
exit(-1);
}
After freeing a pointer, set it to 0
free(vp); vp = (void*)0;
/* I like to put them on one line and think of them as one operation */
Anywhere you'd be tempted to use your "is freed" function, just say
if(vp == NULL){
/* it's been freed already */
}
Update
@Jesus in comments says:
I can't really recommend this because as soon as you're done with that
memory the pointer should go out of scope immediately (or at least at
the end of the function that releases it) these dangling pointers
existence just doesn't sit right with me.
That's generally good practice when possible; the problem is that in real life in C it's often not possible. Consider as an example a text editor that contains a doubly-linked list of lines. The list is really simple:
struct line {
    struct line * prev;
    struct line * next;
    char * contents;
};
I define a guarded_malloc function that allocates memory
void * guarded_malloc(size_t sz){
    void * p = malloc(sz);
    if(p == NULL)
        exit(-1); /* out of memory: give up */
    return p;
}
and create list nodes with newLine()
struct line * newLine(void){
    struct line * lp;
    lp = (struct line *) guarded_malloc(sizeof(struct line));
    /* prev/next and contents have different pointer types, so assign separately */
    lp->prev = lp->next = NULL;
    lp->contents = NULL;
    return lp;
}
I add text in string s to my line
lp->contents = guarded_malloc(strlen(s)+1);
strcpy(lp->contents,s);
and don't quibble that I should be using the bounded-length forms, this is just an example.
Now, how can I implement deleting the contents of a line I created with the char * contents going out of scope after freeing?
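For a structure like this, the free-then-set-to-NULL discipline might look as follows. clearLine is a hypothetical helper name that is not part of the answer above; the struct is repeated only to keep the sketch self-contained.

```c
#include <assert.h>
#include <stdlib.h>

struct line {
    struct line *prev;
    struct line *next;
    char *contents;
};

/* Hypothetical helper: release the text but keep the node in the list.
   contents is a struct member, so it cannot go out of scope here;
   NULL is what marks it as "nothing allocated". */
void clearLine(struct line *lp)
{
    assert(lp != NULL);
    free(lp->contents);     /* free(NULL) is a no-op, so repeated calls are safe */
    lp->contents = NULL;
}
```

Any later code can test lp->contents == NULL instead of wishing for an isFree() function.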
I see nobody has addressed the reason why what you want is fundamentally impossible. To free a resource (in this case memory, but the same applies to basically any resource) means to return it to a resource pool where it's available for reuse. The only way the system could provide a reasonable answer to "Has the memory block at address X already been freed?" is to prevent this address from ever being reused, and store with it a status flag indicating whether it was "freed". But in this case, it has not actually been freed, since it is not available for reuse.
As others have said, the fact that you're trying to answer this question means you have fundamental design errors you need to address.
In general the only way to do this portably is to replace the memory allocation functions. But if you're only concerned about your own code, a fairly common technique is to set pointers to NULL after you free() them, so any subsequent dereference will typically crash right away (a segfault on most platforms) instead of silently touching freed memory:
free(pointer);
pointer = NULL;
For a platform-specific solution, you may be interested in the Win32 function IsBadReadPtr (and others like it). This function will be able to (almost) predict whether you will get a segmentation fault when reading from a particular chunk of memory.
Note: IsBadReadPtr has been deprecated by Microsoft.
However, this does not protect you in the general case, because the operating system knows nothing of the C runtime heap manager, and if a caller passes in a buffer that isn't as large as you expect, then the rest of the heap block will continue to be readable from an OS perspective.
Pointers carry no information other than where they point. The best you can do is say "I know how this particular compiler version allocates memory, so I'll dereference memory, move the pointer back 4 bytes, check the size, make sure it matches..." and so on. You cannot do it in a standard fashion, since memory allocation is implementation defined. Not to mention the memory might not have been dynamically allocated at all.
On a side note, I recommend reading 'Writing Solid Code' by Steve Maguire. Excellent sections on memory management.

Freeing all malloc()-created pointers with one command?

Is there a one-liner that will free the memory that is being taken by all pointers you created using mallocs? Or can this only be done manually by freeing every pointer separately?
You could do that by creating some kind of "wrapper" around malloc.
(Warning: this is only pseudocode showing the idea; there is no error checking at all.)
void* your_malloc(size_t size)
{
void* ptr = malloc(size);
// add ptr to a list of allocated ptrs here
return ptr;
}
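void your_free_all(void)
{
// pseudocode: walk the list built by your_malloc()
for each ptr_in_your_list in your list
{
free( ptr_in_your_list );
}
// then empty the list so nothing is freed twice
}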
But it doesn't sound like a good idea and I would certainly not do that, at least for general purpose allocation / deallocation. You'd better allocate and free memory responsibly when it is no longer needed.
You might want to look into memory pools. These are data structures built to do exactly this.
One common implementation is in the Apache Portable Runtime, which is used in the Apache web server, as well as other projects, such as Subversion.
malloc on its own has implementation-defined behavior. So it is not required to keep track of all the pointers it has handed out, which obviously puts a damper on the idea.
You'd need to make your own memory manager that tracks the pointers, and then provides a function called free_all or something that goes through the list of pointers it has and calls free on them.
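A minimal sketch of such a tracking manager, assuming a simple linked list is good enough. tracked_malloc and the list layout are hypothetical; free_all is the name suggested above. There is deliberately no way to free a single block here: freeing one by hand and then calling free_all would free it twice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One bookkeeping node per live allocation. */
struct tracked {
    void *ptr;
    struct tracked *next;
};

static struct tracked *tracked_list = NULL;

/* Hypothetical tracking wrapper around malloc(). */
void *tracked_malloc(size_t size)
{
    struct tracked *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->ptr = malloc(size);
    if (node->ptr == NULL) {        /* also taken if malloc(0) returns NULL */
        free(node);
        return NULL;
    }
    node->next = tracked_list;      /* push onto the front of the list */
    tracked_list = node;
    return node->ptr;
}

/* Frees every block obtained from tracked_malloc(); returns how many. */
size_t free_all(void)
{
    size_t count = 0;
    while (tracked_list != NULL) {
        struct tracked *node = tracked_list;
        tracked_list = node->next;
        free(node->ptr);
        free(node);
        count++;
    }
    assert(tracked_list == NULL);   /* list is empty afterwards */
    return count;
}
```

The sketch is not thread-safe and costs an extra allocation per block, which is part of why this is rarely worth doing for general-purpose code.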
Note, this sounds like a somewhat bad idea. It's better to be a bit more strict/responsible about your memory usage, and free things when you're done; not leave them hanging about.
Perhaps with a bit more background on where you want to apply your idea, we might find easier solutions.
Check out dlmalloc
ftp://g.oswego.edu/pub/misc/malloc.h
look at the following functions
/*
mspace is an opaque type representing an independent
region of space that supports mspace_malloc, etc.
*/
typedef void* mspace;
/*
create_mspace creates and returns a new independent space with the
given initial capacity, or, if 0, the default granularity size. It
returns null if there is no system memory available to create the
space. If argument locked is non-zero, the space uses a separate
lock to control access. The capacity of the space will grow
dynamically as needed to service mspace_malloc requests. You can
control the sizes of incremental increases of this space by
compiling with a different DEFAULT_GRANULARITY or dynamically
setting with mallopt(M_GRANULARITY, value).
*/
mspace create_mspace(size_t capacity, int locked);
/*
destroy_mspace destroys the given space, and attempts to return all
of its memory back to the system, returning the total number of
bytes freed. After destruction, the results of access to all memory
used by the space become undefined.
*/
size_t destroy_mspace(mspace msp);
...
/*
The following operate identically to their malloc counterparts
but operate only for the given mspace argument
*/
void* mspace_malloc(mspace msp, size_t bytes);
void mspace_free(mspace msp, void* mem);
void* mspace_calloc(mspace msp, size_t n_elements, size_t elem_size);
void* mspace_realloc(mspace msp, void* mem, size_t newsize);
You might want to do something called "arena allocation", where you allocate certain requests from a common "arena" which can be freed all at once when you're done.
If you're on Windows, you can use HeapCreate to create an arena, HeapAlloc to get memory from the heap/arena you just created, and HeapDestroy to free it all at once.
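Independently of the Windows heap functions, the arena idea can be sketched portably in a few lines. This is a deliberately simple fixed-capacity "bump" arena; the arena type and the arena_init, arena_alloc and arena_destroy names are all hypothetical, and _Alignof with max_align_t assumes C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed-capacity arena: one big block, carved up by bumping
   an offset. Individual allocations are never freed on their own. */
typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} arena;

/* Returns nonzero on success. */
int arena_init(arena *a, size_t capacity)
{
    assert(a != NULL);
    a->base = malloc(capacity);
    a->capacity = capacity;
    a->used = 0;
    return a->base != NULL;
}

void *arena_alloc(arena *a, size_t size)
{
    assert(a != NULL);
    /* Round up so each allocation stays suitably aligned. */
    const size_t align = _Alignof(max_align_t);
    if (a->base == NULL || size > SIZE_MAX - (align - 1))
        return NULL;
    size_t rounded = (size + align - 1) / align * align;
    if (rounded > a->capacity - a->used)
        return NULL;                /* arena exhausted */
    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Releases everything allocated from the arena in one call. */
void arena_destroy(arena *a)
{
    assert(a != NULL);
    free(a->base);
    a->base = NULL;
    a->capacity = a->used = 0;
}
```

Every pointer handed out by arena_alloc becomes invalid after arena_destroy, so this suits groups of objects that share a lifetime, such as everything built while handling one request.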
Note that when your program exit()s, all the memory you allocated with malloc() is freed.
Yes, you can do that, but only if you write your own wrappers around malloc() and free(). You would call myCustomMalloc() instead of regular malloc() and keep track of all the pointers in some memory location; when you call myCustomFree(), you can then release all the pointers that were created using myCustomMalloc(). Note: both custom functions will call malloc() and free() internally.
This way you can achieve your goal. I am a Java person, but I used to work a lot in C in my early days. I assume that you're trying to achieve a common solution where memory is handled automatically by the runtime. That has a performance cost, as seen in Java: you don't have to worry about allocating and freeing memory, but performance can suffer noticeably. It's a tradeoff that you have to live with.

Resources