gSOAP: allocating memory for a dynamic array of structures (C)

gSoap usefully creates stubs to help with memory management etc. One of these
commands is soap_malloc but there doesn't seem to be a corresponding soap_realloc.
Before I start to write my own push and pop methods I just want to ensure I'm not missing anything obvious.
//example.h generated with wsdl2h
struct ns1__Customer
{
    int __sizeProduct;
    struct ns1__Product *Product;
    int customerid;
};
struct ns1__Product
{
    int productid;
};
I am currently using soap_malloc and then realloc for dynamically growing the array.
//I could use soap_new_ns1__Product(&soap,XXX) and allocate mem for XXX
//number of ns1__Product structures but this is wasteful and doesn't solve
//anything
struct ns1__Customer cust;
soap_default_ns1__Customer(soap, &cust);
struct ns1__Product *prod_array = NULL;
//allocate mem for 1 product
prod_array = soap_new_ns1__Product(soap,1) ;
soap_default_ns1__Product(soap, &prod_array[0]);
prod_array[0].productid=111;
//Need to add product therefore need to realloc mem.
//IS THIS THE BEST WAY IN gsoap?
prod_array = realloc( prod_array, 2 * sizeof(struct ns1__Product)) ;
soap_default_ns1__Product(soap, &prod_array[1]);
prod_array[1].productid=222;
//assigning array ptr to Customer
cust.Product=prod_array;
// Remember to adjust sizeProduct
cust.__sizeProduct=2;
This seems wrong and clumsy, does gsoap suggest a better way? I can't find a clear example in the documentation or by searching online.

Before I start to write my own push and pop methods I just want to ensure I'm not missing anything obvious.
I suspect you're missing that soap_malloc() allocates memory that is automatically freed under at least some circumstances. As such, using realloc() to resize the allocated memory is begging for trouble. There's a fair chance that the reallocation as such will succeed, but you're likely at minimum to end up with a nasty mess when gSOAP's automatic freeing tries to kick in in soap_end().
On the other hand, I don't think you're overlooking any reallocation function. The docs indeed do not seem to describe any. You can always implement your own reallocation wrapper that allocates fresh memory with soap_malloc(), copies the contents of the original space (whose size you'll need to know somehow), and releases the original space with soap_dealloc().
The bottom line appears to be that soap_malloc() is not intended to be a general-purpose allocator, and it is not particularly well suited to your use case. Its primary objective appears to be internal: to relieve library users of any need to manually free the temporary objects that the library allocates. I take exposing it to library users for their direct use to be intended as a convenience.
If you want the ability to reallocate blocks, then I suggest you obtain them in the first place via regular malloc(). You'll want to read the docs carefully if you're going to mix malloc()ed data with soap_malloc()ed data, but it is likely possible. Alternatively, consider approaches that do not require reallocation, such as storing your data in a linked list instead of a dynamic array.

What you're doing there is indeed wrong. After using soap_new_T() (where T in your case is ns1__Product), the soap context now manages that memory by holding onto the ns1__Product* pointer internally. Later, when you call soap_destroy() to free all soap_new_T()-allocated objects managed by the soap context, the context will be trying to free a pointer that no longer points to valid memory since you called realloc().
As John Bollinger pointed out, there's no built-in way in gSOAP to do something similar to a realloc. You'd instead just need to do the reallocation manually, e.g.:
// allocate memory for 1 product (as you already do above)
prod_array = soap_new_ns1__Product(soap, 1);
// ... do some stuff, realize you need to "realloc" to add another product ...
// allocate a new array, managed by the soap context
struct ns1__Product* new_array = soap_new_ns1__Product(soap, 2);
// copy the old array into the new one (assuming old_size is however many elements are in prod_array)
for(size_t i = 0; i < old_size; ++i)
{
    new_array[i] = prod_array[i];
}
// tell the soap context to destroy the old array
soap_dealloc(soap, prod_array);
// from here on, use the new array
prod_array = new_array;
Aside:
It seems like it should be possible to use an std::vector<ns1__Product> rather than an array, which would solve your problem in an arguably better manner, but that question was already asked here to no avail. Unfortunately I don't know the answer to that at this time.

Related

Security in returning arrays

Suppose the following function:
float *dosomething(const float *src, const int N)
{
float *dst = (float *)malloc(sizeof(float) * N);
if(!dst)
{
printf("Cannot allocate memory\n");
exit(EXIT_FAILURE);
}
for(int i = 0; i < N; i++)
dst[i] = src[i] * 2;
return dst;
}
In this case we don't need to allocate memory beforehand if we want to use it, right?
Now, just another case:
void dosomething(float *dst, const float *src, const int N)
{
for(int i = 0; i < N; i++)
dst[i] = src[i] * 2;
}
In this last case we need to allocate the memory beforehand. So I'm wondering which is the best method for returning an array. Which of them provides more security to a user of the library or class? Which method is most recommended, and why?
What's better practice or a better idea depends on what you're actually trying to do.
A function like char *strdup(const char *s) (POSIX) is implemented like the first case, it takes a string as an argument, allocates memory for another of the same length and then copies the source to the new piece of memory. It's convenient and saves you from manually doing the common action of allocating a buffer for the copy of the string. You could assume this is simply like a call to malloc and then strcpy/memcpy.
Then you've got a function like char *strcpy(char *dest, const char *src), which is like the second case, where you control where the string is copied to. This way you're not forced into having the string copied into a dynamically allocated piece of memory that you didn't choose.
The first way might come in handy if you needed to create and initialise some sort of dynamic structure (list, tree, etc), but then again the second way also suffices and gives you control of what piece of memory is being used; you can use dynamically allocated memory on the heap, or local variables on the stack, etc.
Personally, I would usually go the second way, because I have more control of what variable's being initialised, and I'm not forced into having to use a newly malloc'd piece of memory (what if I wanted my local variable to be initialised?). You could always then write a wrapper function that makes a call to malloc and then to your function using the newly allocated memory as the destination.
It's really up to you and your design and what you're trying to achieve, there are no right and wrong ways and as long as you remember the allocated memory you shouldn't have any problems. I wouldn't say either of the two is more "secure."
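The wrapper idea mentioned above can be sketched like this. The names scale_into and scale_alloc are made up for illustration; the first is the caller-provides-memory form from the question, the second is a thin malloc wrapper on top of it. This is only a sketch and it returns NULL on failure rather than exiting, which is an assumption about what a library user would prefer.

```c
#include <assert.h>
#include <stdlib.h>

/* Caller-supplied destination: works with stack, static or heap memory. */
void scale_into(float *dst, const float *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * 2;
}

/* Convenience wrapper: allocates the destination, then delegates.
   Returns NULL on allocation failure; the caller must free() the result. */
float *scale_alloc(const float *src, size_t n)
{
    float *dst = malloc(n * sizeof *dst);
    if (dst == NULL)
        return NULL;
    scale_into(dst, src, n);
    return dst;
}
```

A caller who already has a buffer uses scale_into directly; a caller who just wants a fresh array uses scale_alloc and later calls free() on the result.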
There is no RIGHT answer.
C language is inherently insecure, i.e. you can only make data secure if you make a copy and return the copy. Thus hiding the real location of the original from the caller.
What matters more is how the de-allocation of shared data is handled; that usually dictates which approach is more correct.
In the example you cite the only data being accessed is the data the caller has already passed (and already owns). So the fact you allocate memory, do something with the data and return the allocated memory to the caller is just fine. Just document that is how the function works (like strdup() works on C strings, the caller is responsible for using free() on any returned non-NULL pointer).
FWIW you don't "share" the data. The caller invokes the function to do work on the data on its behalf; once the function returns, no more access occurs. If the function retained a memory pointer (or other data), it would be correct to describe the situation as sharing data, since at some point in the future that retained pointer (or other data) may be utilized in some way.
There is no definite "this is better than the other". I never actually think about these things, and just do whatever comes to mind. Which is likely to be the more "natural" solution for the problem at hand. And if it turns out to be "bad" along the way... well, luckily we are not programming by engraving on stone tablets.
In your case, without knowing anything about the software at all, nothing "feels" better. That's actually quite common; almost everything you do in programming can be done in different ways, and often there's no actual difference other than personal preference or just random "that's what I came up with first".
For example, your second solution lets the caller copy to existing memory, which might be part of a larger object. On the other hand, he has to provide the destination memory every time. Although this could also mean saving allocations by using just one memory block for multiple calls. The first solution seems slightly more convenient for the simple case, but 'locks' the user in that case: there's always a fresh memory block allocated.

Checking if a certain address in memory is allocated

I have a function that receives a pointer to a dynamic array of 100 ints. But instead of 100 I have just 50 allocated by malloc or calloc before that.
Is there a way that I could check if any element (like the 79th, for example) is allocated rather than wonder what this SIGSEGV actually means?
My question is purely theoretical and I have no actual code to show.
No, the pointer does not store its size. You may be better off storing the size and the pointer in a struct and passing it instead:
typedef struct
{
size_t size;
int *ptr;
} my_data;
void myFunc(my_data *data)
{
size_t i;
for(i = 0; i < data->size; i++)
{
// data->ptr[i];
}
}
void myFunc2(my_data *data, size_t index)
{
if(index < data->size)
{
// memory location exists
}
}
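To tie this back to the 50-versus-100 situation in the question, here is a small sketch of how such a struct might be filled in and queried. The helper names my_data_init, my_data_has_index and my_data_release are invented for this example and are not part of the answer above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct
{
    size_t size; /* number of ints actually allocated */
    int *ptr;
} my_data;

/* Hypothetical helper: allocate `count` zeroed ints and record the count. */
bool my_data_init(my_data *data, size_t count)
{
    assert(data != NULL);
    data->ptr = calloc(count, sizeof *data->ptr);
    data->size = (data->ptr != NULL) ? count : 0;
    return data->ptr != NULL;
}

/* True only if `index` refers to an element that was really allocated. */
bool my_data_has_index(const my_data *data, size_t index)
{
    return data->ptr != NULL && index < data->size;
}

/* Frees the array and resets the bookkeeping so later checks fail safely. */
void my_data_release(my_data *data)
{
    free(data->ptr);
    data->ptr = NULL;
    data->size = 0;
}
```

With 50 elements allocated, my_data_has_index(&d, 79) returns false, so the function can refuse the access instead of crashing.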
Well, you could do such a thing according to your description, given an array and looking for an index (which is slightly different from "any raw pointer"). And with some more work, it is even possible to do such a thing for any pointer.
The malloc function necessarily stores information about how much was allocated. Unluckily, there is no standard for how this must be done. Some implementations over-allocate and store the size immediately preceding the allocated data. Others may store addresses in a map, yet others may do something else; you don't know.
However, most (all?) C libraries and at least one linker that I know of have explicit support for overloading/hooking/replacing allocation functions.
For example, the GNU C library historically let you set __malloc_hook (it was removed in glibc 2.34), and GNU ld lets you do such a thing at link time with --wrap=malloc, which routes calls to your own __wrap_malloc.
You could thus overload/hook malloc and free with a function that simply calls the real malloc function and stores the information how much was allocated yourself somewhere (e.g. by over-allocating and using the first word, or whatever you like).
Then write a function which takes a base pointer and an index. That function looks at the allocation info (now you know where to find it!), and can trivially check whether the index is in range. This does not work for "just any pointer".
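A minimal sketch of the over-allocating variant, written as explicit wrapper functions rather than as a real hook. The names sized_malloc, sized_free and sized_index_ok are hypothetical, max_align_t assumes a C11 compiler, and overflow of the size computation is not checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every block; the union keeps the user data
   aligned for any type. */
typedef union
{
    size_t size;       /* usable bytes requested by the caller */
    max_align_t align; /* forces worst-case alignment */
} block_header;

void *sized_malloc(size_t size)
{
    block_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1; /* user data starts right after the header */
}

void sized_free(void *p)
{
    if (p != NULL)
        free((block_header *)p - 1);
}

/* Only valid for base pointers returned by sized_malloc(). */
bool sized_index_ok(const void *base, size_t index, size_t elem_size)
{
    assert(base != NULL && elem_size > 0);
    const block_header *h = (const block_header *)base - 1;
    return index < h->size / elem_size;
}
```

As the text says, this only works for base pointers that came from the wrapper; passing any other pointer reads memory that is not a header.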
An alternative solution which works for "just any pointer" would be to write an allocator that satisfies allocations from separate arenas rather than simply wrapping the real malloc. All allocations coming from the same arena have the same allocation size. Given any pointer, you would then only need to iterate over all your arenas and look whether the address is within the arena's start and end address.
However, one should normally be quite sure how much one has allocated, this should not be guesswork, or random luck, or something to figure out at runtime.
Also, given the presence of ready-to-use memory debuggers, I doubt it is really worth investing time in doing such a thing application-side. Just use something like valgrind, no need to write any code at all.
No, there's no portable and reliable way to check this from within the code.
There exist tools -- such as valgrind -- that may help diagnose certain types of memory bugs.
No, there isn't.
This is when you break out your dynamic analysis tool (e.g. valgrind), or use a real container that keeps information about its size.
Some years ago I used a library whose name I forget. With it, you could create a try-catch block and try to access unknown data, e.g. x[79], in the try block; if that memory was not allocated, an exception was generated.

Already freed memory

Is there any way in C to know if a memory block has previously been freed with free()? Can I do something like...
if(isFree(pointer))
{
//code here
}
Ok if you need to check whether a pointer has already been freed you may want to check your design. You should never have to either track reference count on a pointer or if it's freed. Also some pointers are not dynamically allocated memory so I hope you mean ones called with malloc(). This is my opinion but again if you have a solid design you should know when the things your pointers point to are done being used.
The only place I have seen this not work is in monolithic kernels because pages in memory need a usage count because of shared mappings among other things.
In your case simply set unused pointers to NULL and check that. This gives you a guaranteed way of knowing in the case that you have unused fields in structures that were malloced. A simple rule is wherever you free a pointer that needs to be checked in the above way just set it to NULL and replace isFree() with if pointer == NULL. This way no reference count needs to be tracked and you know for sure if your pointer is valid and not pointing to garbage.
No, there is no way.
You can, however, use a little code discipline as follows:
Always always always guard allocations with malloc:
void * vp;
if((vp = malloc(SIZE))==NULL){
/* do something dreadful here to respond to the out of mem */
exit(-1);
}
After freeing a pointer, set it to 0
free(vp); vp = (void*)0;
/* I like to put them on one line and think of them as one operation */
Anywhere you'd be tempted to use your "is freed" function, just say
if(vp == NULL){
/* it's been freed already */
}
Update
@Jesus in comments says:
I can't really recommend this because as soon as you're done with that
memory the pointer should go out of scope immediately (or at least at
the end of the function that releases it) these dangling pointers
existence just doesn't sit right with me.
That's generally good practice when possible; the problem is that in real life in C it's often not possible. Consider as an example a text editor that contains a doubly-linked list of lines. The list is really simple:
struct line {
    struct line * prev;
    struct line * next;
    char * contents;
};
I define a guarded_malloc function that allocates memory
void * guarded_malloc(size_t sz){
    void * p = malloc(sz);
    if(p == NULL)
        exit(-1); /* out of memory: give up */
    return p;
}
and create list nodes with newLine()
struct line * newLine(){
    struct line * lp;
    lp = (struct line *) guarded_malloc(sizeof(struct line));
    lp->prev = lp->next = NULL;
    lp->contents = NULL;
    return lp;
}
I add text in string s to my line
lp->contents = guarded_malloc(strlen(s)+1);
strcpy(lp->contents,s);
and don't quibble that I should be using the bounded-length forms, this is just an example.
Now, how can I implement deleting the contents of a line I created with the char * contents going out of scope after freeing?
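One way to read the point of this example: the contents pointer lives exactly as long as its line node does, so it cannot go out of scope, and the practical option is to free it and reset it to NULL. A minimal sketch, where clearContents is a made-up name and the struct is repeated only to keep the snippet self-contained:

```c
#include <assert.h>
#include <stdlib.h>

struct line {
    struct line *prev;
    struct line *next;
    char *contents;
};

/* Hypothetical helper: release the text but keep the node in the list. */
void clearContents(struct line *lp)
{
    assert(lp != NULL);
    free(lp->contents);  /* free(NULL) is a no-op, so calling twice is safe */
    lp->contents = NULL; /* marks the line as "no text"; nothing dangles */
}
```

Any code that later wants to know whether the line has text just tests lp->contents against NULL.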
I see nobody has addressed the reason why what you want is fundamentally impossible. To free a resource (in this case memory, but the same applies to basically any resource) means to return it to a resource pool where it's available for reuse. The only way the system could provide a reasonable answer to "Has the memory block at address X already been freed?" is to prevent this address from ever being reused, and store with it a status flag indicating whether it was "freed". But in this case, it has not actually been freed, since it is not available for reuse.
As others have said, the fact that you're trying to answer this question means you have fundamental design errors you need to address.
In general the only way to do this portably is to replace the memory allocation functions. But if you're only concerned about your own code, a fairly common technique is to set pointers to NULL after you free() them, so any subsequent use will throw an exception or segfault:
free(pointer);
pointer = NULL;
For a platform-specific solution, you may be interested in the Win32 function IsBadReadPtr (and others like it). This function will be able to (almost) predict whether you will get a segmentation fault when reading from a particular chunk of memory.
Note: IsBadReadPtr has been deprecated by Microsoft.
However, this does not protect you in the general case, because the operating system knows nothing of the C runtime heap manager, and if a caller passes in a buffer that isn't as large as you expect, then the rest of the heap block will continue to be readable from an OS perspective.
Pointers have no information with them other than where they point. The best you can do is say "I know how this particular compiler version allocates memory, so I'll dereference memory, move the pointer back 4 bytes, check the size, make sure it matches..." and so on. You cannot do it in a standard fashion, since memory allocation is implementation defined. Not to mention the memory might not have been dynamically allocated at all.
On a side note, I recommend reading 'Writing Solid Code' by Steve Maguire. Excellent sections on memory management.

Freeing all malloc()-created pointers with one command?

Is there a one-liner that will free the memory that is being taken by all pointers you created using mallocs? Or can this only be done manually by freeing every pointer separately?
You could do that by creating some kind of "wrapper" around malloc.
(Warning: this is only pseudocode showing the idea; there is no checking at all.)
void* your_malloc(size_t size)
{
void* ptr = malloc(size);
// add ptr to a list of allocated ptrs here
return ptr;
}
void your_free(void *pointer)
{
for each pointer in your list
{
free( ptr_in_your_list );
}
}
But it doesn't sound like a good idea and I would certainly not do that, at least for general purpose allocation / deallocation. You'd better allocate and free memory responsibly when it is no longer needed.
You might want to look into memory pools. These are data structures built to do exactly this.
One common implementation is in the Apache Portable Runtime, which is used in the Apache web server, as well as other projects, such as Subversion.
malloc on its own has implementation-defined behavior. So there isn't a necessity for it to keep track of all the pointers it has, which obviously puts a damper on the idea.
You'd need to make your own memory manager that tracks the pointers, and then provides a function called free_all or something that goes through the list of pointers it has and calls free on them.
Note, this sounds like a somewhat bad idea. It's better to be a bit more strict/responsible about your memory usage, and free things when you're done; not leave them hanging about.
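For what it's worth, a tracking manager of that kind can be quite small. The sketch below is one possible shape, not a recommendation: tracked_malloc is a hypothetical name, the list is a single global (so it is not thread-safe), individual blocks cannot be released early, and max_align_t assumes C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Each allocation is prefixed by a node linking it into one global list. */
typedef union track_node
{
    union track_node *next;
    max_align_t align; /* keeps the user data suitably aligned */
} track_node;

static track_node *track_head = NULL;

void *tracked_malloc(size_t size)
{
    track_node *n = malloc(sizeof *n + size);
    if (n == NULL)
        return NULL;
    n->next = track_head; /* push onto the list of live blocks */
    track_head = n;
    return n + 1;         /* user data starts after the node */
}

/* Frees every block handed out by tracked_malloc(); returns how many. */
size_t free_all(void)
{
    size_t freed = 0;
    while (track_head != NULL) {
        track_node *next = track_head->next;
        free(track_head);
        track_head = next;
        freed++;
    }
    return freed;
}
```

After free_all() every pointer previously returned by tracked_malloc() is dangling, which is a good part of why the approach is considered risky for general-purpose use.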
Perhaps with a bit more background on where you want to apply your idea, we might find easier solutions.
Check out dlmalloc
ftp://g.oswego.edu/pub/misc/malloc.h
look at the following functions
/*
mspace is an opaque type representing an independent
region of space that supports mspace_malloc, etc.
*/
typedef void* mspace;
/*
create_mspace creates and returns a new independent space with the
given initial capacity, or, if 0, the default granularity size. It
returns null if there is no system memory available to create the
space. If argument locked is non-zero, the space uses a separate
lock to control access. The capacity of the space will grow
dynamically as needed to service mspace_malloc requests. You can
control the sizes of incremental increases of this space by
compiling with a different DEFAULT_GRANULARITY or dynamically
setting with mallopt(M_GRANULARITY, value).
*/
mspace create_mspace(size_t capacity, int locked);
/*
destroy_mspace destroys the given space, and attempts to return all
of its memory back to the system, returning the total number of
bytes freed. After destruction, the results of access to all memory
used by the space become undefined.
*/
size_t destroy_mspace(mspace msp);
...
/*
The following operate identically to their malloc counterparts
but operate only for the given mspace argument
*/
void* mspace_malloc(mspace msp, size_t bytes);
void mspace_free(mspace msp, void* mem);
void* mspace_calloc(mspace msp, size_t n_elements, size_t elem_size);
void* mspace_realloc(mspace msp, void* mem, size_t newsize);
You might want to do something called "arena allocation", where you allocate certain requests from a common "arena" which can be freed all at once when you're done.
If you're on Windows, you can use HeapCreate to create an arena, HeapAlloc to get memory from the heap/arena you just created, and HeapDestroy to free it all at once.
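A portable toy version of the arena idea might look like the following. All names here (arena, arena_init, arena_alloc, arena_destroy) are invented for the sketch; it uses a fixed-size block that never grows, and it relies on C11 for max_align_t and _Alignof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct
{
    unsigned char *base; /* one big block obtained from malloc */
    size_t capacity;
    size_t used;
} arena;

int arena_init(arena *a, size_t capacity)
{
    assert(a != NULL);
    a->base = malloc(capacity);
    a->capacity = (a->base != NULL) ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hands out the next chunk, rounded up so every chunk stays aligned.
   Returns NULL when the arena is exhausted (this sketch does not grow). */
void *arena_alloc(arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t rounded = (size + align - 1) / align * align;
    if (rounded < size || rounded > a->capacity - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* One call releases everything that came from the arena. */
void arena_destroy(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = a->used = 0;
}
```

There is deliberately no per-chunk free: the whole point is that arena_destroy() is the single release call.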
Note that when your program exit()s, all the memory you allocated with malloc() is freed.
Yes, you can do that if you write your own versions of malloc() and free(). You would call myCustomMalloc() instead of regular malloc() and keep track of all the pointers in some memory location; when you call myCustomFree(), you can then release all the pointers that were created using myCustomMalloc(). Note: both custom functions will call malloc() and free() internally.
This way you can achieve your goal. I am a Java person, but I used to work a lot in C in my early days. I assume you're trying to achieve a general solution where memory is handled for you automatically, as in Java, so that you don't have to worry about allocating and freeing memory. That has a performance cost, as seen in Java. It's a tradeoff you have to live with.

Determining realloc() behaviour before calling it

As I understand it, when asked to reserve a larger block of memory, the realloc() function will do one of three different things:
if free contiguous block exists
grow current block
else if sufficient memory
allocate new memory
copy old memory to new
free old memory
else
return null
Growing the current block is a very cheap operation, so this is behaviour I'd like to take advantage of. However, if I'm reallocating memory because I want to (for example) insert a char at the start of an existing string, I don't want realloc() to copy the memory. I'll end up copying the entire string with realloc(), then copying it again manually to free up the first array element.
Is it possible to determine what realloc() will do? If so, is it possible to achieve in a cross-platform way?
realloc()'s behavior is likely dependent on its specific implementation. And basing your code on that would be a terrible hack which, to say the least, violates encapsulation.
A better solution for your specific example is:
Find the size of the current buffer
Allocate a new buffer (with malloc()), greater than the previous one
Copy the prefix you want to the new buffer
Copy the string in the previous buffer to the new buffer, starting after the prefix
Release the previous buffer
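The steps above can be sketched as a single helper. The name prepend and its pointer-to-pointer interface are assumptions made for this example; the string is copied exactly once, straight into its final position.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated "prefix + *str" and frees the old *str.
   On failure returns NULL and leaves *str untouched. */
char *prepend(char **str, const char *prefix)
{
    assert(str != NULL && *str != NULL && prefix != NULL);
    size_t old_len = strlen(*str);
    size_t pre_len = strlen(prefix);

    char *buf = malloc(pre_len + old_len + 1);
    if (buf == NULL)
        return NULL;

    memcpy(buf, prefix, pre_len);             /* prefix first */
    memcpy(buf + pre_len, *str, old_len + 1); /* then old text and '\0' */

    free(*str);                               /* release old buffer */
    *str = buf;
    return buf;
}
```

The old buffer must itself have come from malloc(), since the helper calls free() on it.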
As noted in the comments, case 3 in the question (no memory) is wrong; realloc() will return NULL if there is no memory available [question now fixed].
Steve McConnell in 'Code Complete' points out that if you save the return value from realloc() in the only copy of the original pointer when realloc() fails, you've just leaked memory. That is:
void *ptr = malloc(1024);
...
if ((ptr = realloc(ptr, 2048)) == 0)
{
/* Oops - cannot free original memory allocation any more! */
}
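The usual way around that leak is to keep the result of realloc() in a temporary until it is known to be non-NULL. A small sketch, where resize_buffer is a hypothetical helper name:

```c
#include <assert.h>
#include <stdlib.h>

/* Resizes *pp to new_size bytes. Returns 1 on success; on failure returns 0
   and *pp still points at the original, still valid, block. */
int resize_buffer(void **pp, size_t new_size)
{
    assert(pp != NULL && new_size > 0);
    void *tmp = realloc(*pp, new_size);
    if (tmp == NULL)
        return 0; /* original block is untouched and can still be freed */
    *pp = tmp;
    return 1;
}
```

On failure the caller still owns the old block and can either carry on with it or free() it.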
Different implementations of realloc() will behave differently. The only safe thing to assume is that the data will always be moved - that you will always get a new address when you realloc() memory.
As someone else pointed out, if you are concerned about this, maybe it is time to look at your algorithms.
Would storing your string backwards help?
Otherwise...
just malloc() more space than you need, and when you run out of room, copy to a new buffer. A simple technique is to double the space each time; this works pretty well because the larger the string (i.e. the more time copying to a new buffer will take), the less often it needs to occur.
Using this method you can also right-justify your string in the buffer, so it's easy to add characters to the start.
If obstacks are a good match for your memory allocation needs, you can use their fast growing functionality. Obstacks are a feature of glibc, but they are also available in the libiberty library, which is fairly portable.
No - and if you think about it, it can't work. Between you checking what it's going to do and actually doing it, another process could allocate memory.
In a multi-threaded application this can't work. Between you checking what it's going to do and actually doing it, another thread could allocate memory.
If you're worried about this sort of thing, it might be time to look at the data structures you're using to see if you can fix the problem there. Depending on how these strings are constructed, you can do so quite efficiently with a well designed buffer.
Why not keep some empty buffer space in the left of the string, like so:
char* buf = malloc(1024);
char* start = buf + 1024 - 3;
start[0]='t';
start[1]='o';
start[2]='\0';
To add "on" to the beginning of your string to make it "onto\0":
start-=2;
if(start < buf)
DO_MEMORY_STUFF(start, buf);//time to reallocate!
start[0]='o';
start[1]='n';
This way, you won't have to keep copying your buffer every single time you want to do an insertion at the beginning.
If you have to do insertions at both the beginning and end, just have some space allocated at both ends; insertions in the middle will still need you to shuffle elements around, obviously.
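Here is one way the "time to reallocate" step could be filled in, as a sketch only. The front_buf type and its functions are invented for this example; the text is kept right-justified and the capacity at least doubles whenever the gap on the left runs out. Overflow of the capacity is not checked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char *buf;   /* start of the allocation */
    size_t cap;  /* total bytes in buf */
    char *start; /* first character of the string, inside buf */
} front_buf;

int front_buf_init(front_buf *fb, size_t cap)
{
    assert(fb != NULL && cap > 0);
    fb->buf = malloc(cap);
    if (fb->buf == NULL)
        return 0;
    fb->cap = cap;
    fb->start = fb->buf + cap - 1; /* empty string parked at the far right */
    *fb->start = '\0';
    return 1;
}

/* Prepends s; when the gap on the left is too small, moves the text into
   a buffer at least twice as large, again right-justified. */
int front_buf_prepend(front_buf *fb, const char *s)
{
    size_t add = strlen(s);
    size_t gap = (size_t)(fb->start - fb->buf);
    if (add > gap) {
        size_t len = strlen(fb->start) + 1; /* text plus '\0' */
        size_t new_cap = fb->cap * 2;
        while (new_cap < len + add)
            new_cap *= 2;
        char *nb = malloc(new_cap);
        if (nb == NULL)
            return 0;
        memcpy(nb + new_cap - len, fb->start, len);
        free(fb->buf);
        fb->buf = nb;
        fb->cap = new_cap;
        fb->start = nb + new_cap - len;
    }
    fb->start -= add;
    memcpy(fb->start, s, add);
    return 1;
}
```

The string is always read through fb.start, and the memory is released with free(fb.buf), never free(fb.start).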
A better approach is to use a linked list. Have each of your data objects allocated on a page, and allocate another page and have a link to it, either from the previous page or from an index page. This way you know when the next alloc fails, and you never need to copy memory.
I don't think it's possible in cross platform way.
Here is the code for the uClibc implementation, which might give you a clue how to do it in a platform-dependent way. Actually it's better to find the glibc source, but this one was on top of a Google search :)

Resources