Code:
union foo
{
char c;
int i;
};
void func(void * src)
{
union foo dest;
memcpy(&dest, src, sizeof(union foo)); //here
}
If I call func() like this:
int main()
{
char c;
int i;
func(&c);
func(&i);
return 0;
}
In the call func(&c), the size of c is less than sizeof(union foo), which may be dangerous, right?
Is the line with memcpy correct? And if not, how to fix it?
What I want is a safe call to memcpy that copies the data behind a void * pointer into a union.
A little background: this is extracted from a very complicated function, and the signature of func() including the void * parameter is out of my control. Of course the example does nothing useful, that's because I removed all the code that isn't relevant to provide an example with minimum code.
In the call func(&c), the size of c is less than sizeof(union foo), which may be dangerous, right?
Right, this will lead to undefined behaviour. dest will likely contain some bytes from memory areas surrounding c, and which these are depends on the internal workings of the compiler. Of course, as long as you only access dest.c, that shouldn't cause any problems in most cases.
But let me be more specific. According to the C standard, writing dest.c but reading dest.i will always yield undefined behaviour. But most compilers on most platforms will have some well-defined behaviour for those cases as well. So often writing dest.c but reading dest.i makes sense despite what the standard says. In this case, however, reading from dest.i will still be affected by unknown surrounding variables, so it is undefined not only from the standard's point of view, but also in a very practical sense.
There also is a rare scenario you should consider: c might be located at the very end of allocated memory pages. (This refers to memory pages allocated from the operating system and eventually the memory management unit (MMU) hardware, not to the block-wise user space allocation done by malloc and friends.) In this case, reading more than that single byte might cause access to unmapped memory, and hence cause a severe error, most likely a program crash. Given the location of your c as an automatic variable in main, this seems unlikely, but I take it that this code snippet is only an example.
Is the line with memcpy correct? And if not, how to fix it?
Depends on what you want to do. As it stands, the code doesn't make too much sense, so I don't know what correct reasonable application you might have in mind. Perhaps you should pass the sizeof the src object to func.
Is the line with memcpy correct? And if not, how to fix it?
You should pass the size of the memory the void pointer points to, so that func knows how large src is and copies only that many bytes.
Furthermore, to be safe, you should also check the size of the destination and limit the copy length to it, so that neither the read nor the write can go out of bounds.
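As a minimal sketch of that suggestion, assuming the signature could be extended after all: func_sized and its src_size parameter are hypothetical names, not part of the original code. The copy length is clamped to the smaller of the two sizes, and the union is zeroed first so the bytes that are not copied are not garbage.

```c
#include <stddef.h>
#include <string.h>

union foo {
    char c;
    int i;
};

/* Hypothetical variant of func(): the caller also passes the size of *src. */
size_t func_sized(union foo *dest, const void *src, size_t src_size)
{
    /* Never read more than the caller owns, never write more than dest holds. */
    size_t n = src_size < sizeof *dest ? src_size : sizeof *dest;

    memset(dest, 0, sizeof *dest);  /* bytes that are not copied stay zero */
    memcpy(dest, src, n);
    return n;                       /* number of bytes actually copied */
}
```

A caller would then write func_sized(&dest, &c, sizeof c) or func_sized(&dest, &i, sizeof i), so the size always travels with the pointer.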
The memcpy is fine. But by passing the address of the smallest member of the union, you will end up with garbage in the larger member. One way to avoid the garbage bytes is to make all calls to func (which I assume you do control) use only pointers to the larger member. This can be achieved by assigning the smaller value to the larger variable, i = c, and then calling func(&i).
func itself is ok.
The problem lies in whether the caller really makes sure that the memory referenced when calling func() is at least sizeof(union foo) bytes.
If that is always the case, everything is fine. It is not the case for the first call to func() in the OP's example, which passes the address of a single char.
If the memory referenced when calling func() is smaller than sizeof(union foo), then memcpy() provokes undefined behaviour.
Since you know what you are copying and how large it is, why not make the function more explicit and tell it how many bytes of the memory behind the void pointer it should copy?
union foo
{
char c;
int i;
};
void func(void * src, const char * type)
{
union foo dest;
if(strcmp(type, "char") == 0){
memcpy(&dest, src, 1);
}else if(...){
}
}
I am trying to debug a piece of code written by someone else that results in a segfault sometimes, but not all the time, during a memcpy operation.
Also, I would dearly appreciate it if anyone could give me a hand in translating what's going on in a piece of code that occurs before the memcpy.
First off, we have a function into which is being passed a void pointer and a pointer to a struct, like so:
void ExampleFunction(void *dest, StuffStruct *buf)
The struct looks something like this:
typedef struct {
char *stuff;
unsigned int totalStuff;
unsigned int stuffSize;
unsigned int validStuff;
} StuffStruct;
Back to ExampleFunction. Inside ExampleFunction, this is happening:
void *src;
int numStuff;
numStuff = buf->validStuff;
src = (void *)(buf->stuff);
I'm confused by the above line. What happens exactly when the char array in buf->stuff gets cast to a void pointer, then set as the value of src? I can't follow what is supposed to happen with that step.
Right after this, the memcpy happens:
memcpy(dest, src, buf->bufSize*numStuff)
And that's where the segfault often happens. I've checked for dest/src being null, neither are ever null.
Additionally, in the function that calls ExampleFunction, the array for dest is declared with a size of 5000, if that matters. However, when I printf the value in buf->bufSize*numStuff in the above code, the value is often high above 5000 -- it can go up as high as 80,000 -- WITHOUT segfaulting, though. That is, it runs fine with the length variable (buf->bufSize*numStuff) being much higher than the supposed length that the dest variable was initialized with. However, maybe that doesn't matter since it was cast to a void pointer?
For various reasons I'm unable to use dbg or install an IDE. I'm just using basic printf debugging. Does anyone have any ideas I could explore? Thank you in advance.
First of all, the cast and assignment just copies the address of buf->stuff into the pointer src. There is no magic there.
numStuff = buf->validStuff;
src = (void *)(buf->stuff);
If dest has only enough storage for 5000 bytes, and you are trying to write beyond that length, then you are corrupting your program stack, which can lead to a segfault either on the copy or sometimes a little later. Whether you cast to a void pointer or not makes no difference at all.
memcpy(dest, src, buf->bufSize*numStuff)
I think you need to figure out exactly what buf->bufSize*numStuff is supposed to be computing, and then either fix it if it is incorrect (not intended), truncate the copy to the size of the destination, or increase the size of the destination array.
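One way to make that check explicit is sketched below. copy_checked and its parameters are hypothetical names, and it assumes the caller knows the real size of the destination. It refuses the copy instead of overrunning the buffer, and it also rejects a size multiplication that would wrap around.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy count items of item_size bytes each into dest,
 * but refuse if that would not fit into dest_size bytes.
 * Returns 0 on success, -1 if the copy was rejected.
 */
int copy_checked(void *dest, size_t dest_size,
                 const void *src, size_t item_size, size_t count)
{
    size_t total;

    /* Reject a multiplication that would wrap around. */
    if (item_size != 0 && count > SIZE_MAX / item_size)
        return -1;

    total = item_size * count;
    if (total > dest_size)
        return -1;              /* would overrun the destination */

    memcpy(dest, src, total);
    return 0;
}
```

With printf debugging only, logging the rejected sizes at this point shows which caller passes a length larger than the 5000-byte destination.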
A null-pointer dereference is not the only thing that can cause a segfault. When your program allocates memory, it is also possible to trigger a segfault when you attempt to access memory that is after the regions of memory that you have allocated.
Your code looks like it intends to copy the contents of a buffer pointed to by buf->stuff to a destination buffer. If either of those buffers is smaller than the size of the memcpy operation, the memcpy can overrun the bounds of allocated memory and trigger a segfault.
Because the memory allocator allocates memory in large chunks, and then divvies it up to various calls to malloc, your code won't consistently fail every time you run past the end of a malloc'ed buffer. You will get exactly the sporadic failure behavior you described.
The assumption that is baked into this code is that both the buffer pointed to by buf->stuff and by the dest pointer are at least "buf->bufSize * numStuff" bytes in length. One of those two assumptions is false.
I would suggest a couple of approaches:
Check the code that allocates both the buffer pointed to by dest and the buffer pointed to by buf->stuff, and ensure that they are always at least as big as buf->bufSize * numStuff.
Failing that, there are a bunch of tools that can help you get better diagnostic information from your program. The simplest to use is efence ("Electric Fence") that will help identify places in your code where you overrun any of your buffers. (http://linux.die.net/man/3/efence). A more thorough analysis can be done using valgrind (http://valgrind.org/) -- but Valgrind is a bit more involved to use.
Good luck!
PS. There's nothing special about casting a char* pointer to a void* pointer -- it's still just an address to an allocated block of memory.
A similar question may already exist on SO, but I couldn't find it. Here is the scenario:
Case 1
void main()
{
char g[10];
char a[10];
scanf("%[^\n] %[^\n]",a,g);
swap(a,g);
printf("%s %s",a,g);
}
Case 2
void main()
{
char *g=malloc(sizeof(char)*10);
char *a=malloc(sizeof(char)*10);
scanf("%[^\n] %[^\n]",a,g);
swap(a,g);
printf("%s %s",a,g);
}
I'm getting the same output in both cases. So my question is: when should I prefer malloc() over an array, or vice versa, and why? The common definition I found is that malloc() provides dynamic allocation. Is that the only difference between them? Could anyone please explain, with an example, what "dynamic" means when we are still specifying the size in malloc()?
The principal difference relates to when and how you decide the array length. Using fixed length arrays forces you to decide your array length at compile time. In contrast, using malloc allows you to decide the array length at runtime.
In particular, deciding at runtime allows you to base the decision on user input, on information not known at the time you compile. For example, you may allocate the array to be a size big enough to fit the actual data input by the user. If you use fixed length arrays, you have to decide at compile time an upper bound, and then force that limitation onto the user.
Another more subtle issue is that allocating very large fixed length arrays as local variables can lead to stack overflow runtime errors. And for that reason, you sometimes prefer to allocate such arrays dynamically using malloc.
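To illustrate what "deciding at runtime" means, here is a small sketch. repeat_string is a made-up helper, not something from the question. The number of bytes it needs depends on both arguments, so no compile-time array size could be correct for every call.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a heap copy of text repeated `times` times.
 * The required size is only known at runtime.
 * The caller owns the result and must free() it. Returns NULL on failure.
 */
char *repeat_string(const char *text, size_t times)
{
    size_t len = strlen(text);
    size_t k;
    char *out;

    if (len != 0 && times > (SIZE_MAX - 1) / len)
        return NULL;                 /* size computation would overflow */

    out = malloc(len * times + 1);   /* +1 for the terminating '\0' */
    if (out == NULL)
        return NULL;

    for (k = 0; k < times; k++)
        memcpy(out + k * len, text, len);
    out[len * times] = '\0';
    return out;
}
```

The size passed to malloc is still "specified", but it is computed from values that only exist while the program runs, for example something the user typed.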
Please any one explain with example, what is the meaning of dynamic although we are specifying the size.
I suspect this was significant before C99. Before C99, you couldn't have dynamically-sized auto arrays:
void somefunc(size_t sz)
{
char buf[sz];
}
is valid C99 but invalid C89. However, using malloc(), you can specify any value, you don't have to call malloc() with a constant as its argument.
Also, to clear up what other purpose malloc() has: you can't return stack-allocated memory from a function, so if your function needs to return allocated memory, you typically use malloc() (or some other member of the malloc family, including realloc() and calloc()) to obtain a block of memory. To understand this, consider the following code:
char *foo()
{
char buf[13] = "Hello world!";
return buf;
}
Since buf is a local variable, it's invalidated at the end of its enclosing function, so returning it results in undefined behavior. The function above is erroneous. However, a pointer obtained using malloc() remains valid across function calls (until you call free() on it):
char *bar()
{
char *buf = malloc(13);
strcpy(buf, "Hello World!");
return buf;
}
This is absolutely valid.
I would add that in this particular example, malloc() is very wasteful, as there is more memory allocated for the array than what would appear [due to overhead in malloc] as well as the time it takes to call malloc() and later free() - and there's overhead for the programmer to remember to free it - memory leaks can be quite hard to debug.
Edit: Case in point, your code is missing the free() at the end of main() - may not matter here, but it shows my point quite well.
So small structures (less than 100 bytes) should typically be allocated on the stack. If you have large data structures, it's better to allocate them with malloc (or, if it's the right thing to do, use globals - but this is a sensitive subject).
Clearly, if you don't know the size of something beforehand, and it MAY be very large (kilobytes in size), it is definitely a case of "consider using malloc".
On the other hand, stacks are pretty big these days (for "real computers" at least), so allocating a couple of kilobytes of stack is not a big deal.
I'm trying to write a simple C program on Ubuntu using Eclipse CDT (yes, I'm more comfortable with an IDE and I'm used to Eclipse from Java development), and I'm stuck with something weird. On one part of my code, I initialize a char array in a function, and it is by default pointing to the same location with one of the inputs, which has nothing to do with that char array. Here is my code:
char* subdir(const char input[], const char dir[]){
[*] int totallen = strlen(input) + strlen(dir) + 2;
char retval[totallen];
strcpy(retval, input);
strcat(retval, dir);
...}
OK, at the part I've marked with [*] there is a breakpoint. Even at that breakpoint, when I check my locals, I see that retval points to the same address as my argument input. That shouldn't even be possible, as input comes from another function and retval is created in this function. Is it me being inexperienced with C and missing something, or is there a bug somewhere in the C compiler?
It seems obvious to me that they shouldn't point to the same location (a valid one, of course; they aren't NULL). When the code goes on, it literally messes up everything; I get random characters and shapes in the console and the program crashes.
I don't think it makes sense to check the address of retval BEFORE it appears, it being a VLA and all (by definition the compiler and the debugger don't know much about it, it's generated at runtime on the stack).
Try checking its address after its point of definition.
EDIT
I just read the "I get random characters and shapes in console". It's obvious now that you are returning the VLA and expecting things to work.
A VLA is only valid inside the block where it was defined. Using it outside is undefined behavior and thus very dangerous. Even if the size were constant, it still wouldn't be valid to return it from the function. In this case you most definitely want to malloc the memory.
What cnicutar said.
I hate people who do this, so I hate me ... but ... Arrays of non-const size are a C99 extension and not supported by C++. Of course GCC has extensions to make it happen.
Under the covers you are essentially doing an _alloca, so your odds of blowing out the stack are proportional to who has access to abuse the function.
Finally, I hope it doesn't actually get returned, because that would be returning a pointer to a stack allocated array, which would be your real problem since that array is gone as of the point of return.
In C++ you would typically use a string class.
In C you would either pass a pointer and length in as parameters, or a pointer to a pointer (or return a pointer) and specify the calls should call free() on it when done. These solutions all suck because they are error prone to leaks or truncation or overflow. :/
Well, your fundamental problem is that you are returning a pointer to the stack allocated VLA. You can't do that. Pointers to local variables are only valid inside the scope of the function that declares them. Your code results in Undefined Behaviour.
At least I am assuming that somewhere in the ..... in the real code is the line return retval.
You'll need to use heap allocation, or pass a suitably sized buffer to the function.
As well as that, you only need +1 rather than +2 in the length calculation - there is only one null-terminator.
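A heap-based version might look like the sketch below. It assumes the elided part of the original function only concatenates the two strings and returns the result; that is a guess, since the real body is not shown.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of a heap-based subdir(): the result outlives the call,
 * so the caller owns it and must free() it. Returns NULL on failure.
 */
char *subdir(const char *input, const char *dir)
{
    size_t in_len = strlen(input);
    size_t dir_len = strlen(dir);
    char *retval = malloc(in_len + dir_len + 1);  /* one '\0' terminator */

    if (retval == NULL)
        return NULL;

    memcpy(retval, input, in_len);
    memcpy(retval + in_len, dir, dir_len + 1);    /* copies the '\0' too */
    return retval;
}
```

Every caller then has to pair the call with a free() once it is done with the string.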
Try changing retval to a character pointer and allocating your buffer using malloc().
Pass the two string arguments as, char * or const char *
Rather than returning char *, you should just pass another parameter with a string pointer that you already malloc'd space for.
Return bool or int describing what happened in the function, and use the parameter you passed to store the result.
Lastly don't forget to free the memory since you're having to malloc space for the string on the heap...
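#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//retstr is not const like the other two: the result is written into it.
//The caller must make sure it has room for both strings plus the terminator.
bool subdir(const char *input, const char *dir,char *retstr){
strcpy(retstr, input);
strcat(retstr, dir);
return true;
}
int main(void)
{
char h[]="Hello ";
char w[]="World!";
char *greet=malloc(strlen(h)+strlen(w)+1); //Size of the result plus room for the terminator!
if(greet==NULL) return 1; //allocation failed
subdir(h,w,greet);
printf("%s",greet);
free(greet); //release the buffer once we are done with it
return 0;
}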
This will print: "Hello World!" added together by your function.
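Also, when you're creating a string on the fly, malloc is the safe choice. The compiler doesn't know how long the two other strings are going to be, so char greet[totallen]; is not allowed in C89. C99 accepts it as a variable-length array, but such an array lives on the stack and cannot be returned from the function.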
I am new to C and trying to understand memory allocation, which I find kind of confusing.
#include <stdio.h>
#include <stdlib.h>
typedef struct
{
int a;
} struct1_t;
int main()
{
funct1(); //init pointer
return 1;
}
int funct2(struct1_t *ptr2struct)
{
printf("print a is %d\n",ptr2struct->a);
//free(ptr2struct);
printf("value of ptr in funct2 is %p\n", ptr2struct);
return 1; //success
}
int funct1(){
struct1_t *ptr2struct = NULL;
ptr2struct = malloc(sizeof(*ptr2struct));
ptr2struct->a = 5;
printf("value of ptr before used is %p", ptr2struct);
if (funct2(ptr2struct) == 0) {
goto error;
}
free(ptr2struct);
printf("value of ptr in funct1 after freed is is %p\n", ptr2struct);
return 1;
error:
if(ptr2struct) free(ptr2struct);
return 0;
}
I have funct 1 that calls funct 2, and after using the allocated pointer in funct1, I try to free the pointer. And I create a case where if the return value in funct2 is not 1, then try again to free the pointer.
My question is below
which practice is better, if I should free the memory in funct2 (after I pass it) or in funct1 (after I finish getting the return value of funct1)
The second thing is whether this is correct to make a goto error, and error:
if(ptr2struct) free(ptr2struct);
My third question is , how do I check if the allocated value is already freed or not? because after getting the return value, I free the pointer, but if I print it, it shows the same location with the allocated one (so not a null pointer).
Calling free() on a pointer doesn't change it, only marks memory as free. Your pointer will still point to the same location which will contain the same value, but that value can now get overwritten at any time, so you should never use a pointer after it is freed. To ensure that, it is a good idea to always set the pointer to NULL after free'ing it.
1) Should I free it in the calling function or in the called function?
I try to do the free-ing in the same function that does the malloc-ing. This keeps the memory-management concerns in one place and also gives better separation of concerns, since the called function in this case can also work with pointers that have not been malloc-ed or use the same pointer twice (if you want to do that).
2) Is it correct to do a "goto error"?
Yes! By jumping to a single place at the end of the function you avoid having to duplicate the resource-releasing code. This is a common pattern and isn't that bad since the "goto" is just serving as a kind of "return" statement and isn't doing any of its really tricky and evil stuff it is more known for.
//in the middle of the function, whenever you would have a return statement
// instead do
return_value = something;
goto DONE;
//...
DONE:
//resource management code all in one spot
free(stuff);
return return_value;
C++, on the other hand, has a neat way to do this kind of resource management. Since destructors are called deterministically right before a function exits, they can be used to neatly package this kind of resource management. This technique is called RAII.
Another way other languages have to deal with this is finally blocks.
3) Can I see if a pointer has already been freed?
Sadly, you can't. What some people do is setting the pointer variable value to NULL after freeing it. It doesn't hurt (since its old value shouldn't be used after being freed anyway) and it has the nice property that freeing a null pointer is specified to be a no-op.
However, doing so is not foolproof. Be careful about having other variables aliasing the same pointer since they will still contain the old value, that is now a dangerous dangling pointer.
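A small sketch of the "set it to NULL after freeing" habit, using the struct from the question. struct1_release is a hypothetical helper name. Taking the address of the caller's pointer lets the helper reset that one variable; any other copies of the pointer are not touched and stay dangling.

```c
#include <stdlib.h>

typedef struct {
    int a;
} struct1_t;

/* Hypothetical helper: free *pp and reset the caller's pointer to NULL. */
void struct1_release(struct1_t **pp)
{
    if (pp == NULL)
        return;
    free(*pp);   /* free(NULL) is a no-op, so a second call is harmless */
    *pp = NULL;
}
```

After struct1_release(&ptr2struct), a later check such as if (ptr2struct == NULL) does tell you that this particular variable no longer owns memory.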
My question is below
which practice is better, if I should free the memory in funct2 (after I pass it) or in funct1 (after I finish getting the return value of funct1)
This is an "ownership" question. Who owns the allocated memory. Typically, this has to be decided based on the design of your program. For example, the only purpose of func1() could be to only allocate memory. That is, in your implementation, func1() is the function for memory allocation and then the "calling" function uses the memory. In that case, the ownership to free the memory is with the caller of func1 and NOT with func1().
The second thing is whether this is correct to make a goto error, and error:
The use of "goto" is generally frowned upon. It causes a mess in the code that could easily be avoided. However, I say "generally". There are cases where goto can be quite handy and useful. For example, in big systems, configuration of the system is a big step. Now, imagine you call a single Config() function for the system which allocates memory for its different data structures at different points in the function, like
config()
{
...some config code...
if ( a specific feature is enabled)
{
f1 = allocateMemory();
level = 1;
}
....some more code....
if ( another feature is enabled)
{
f2 = allocateMemory();
level = 2;
}
....some more code....
if ( another feature is enabled)
{
f3 = allocateMemory();
level =3;
}
/*some error happens */
goto level_3;
level_3:
free(f3);
level_2:
free(f2);
level_1:
free(f1);
}
In this case, you can use goto and elegantly free only that much memory that was allocated till the point the configuration failed.
However, suffice to say in your example goto is easily avoidable and should be avoided.
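For reference, a compilable sketch of the staged cleanup idea follows. config_demo and its fail_at parameter are invented for illustration: fail_at simulates an allocation failure at a given step so that each exit path can be exercised. Each failure jumps to the label that frees exactly what has been allocated so far.

```c
#include <stdlib.h>

/*
 * Hypothetical config routine: allocates three buffers in sequence.
 * fail_at (1..3) simulates an allocation failure at that step; 0 means none.
 * Returns 0 on success, -1 on failure. Every path frees what was allocated.
 */
int config_demo(int fail_at)
{
    int rc = -1;
    char *f1 = NULL, *f2 = NULL, *f3 = NULL;

    f1 = (fail_at == 1) ? NULL : malloc(16);
    if (f1 == NULL)
        goto done;

    f2 = (fail_at == 2) ? NULL : malloc(16);
    if (f2 == NULL)
        goto free_f1;

    f3 = (fail_at == 3) ? NULL : malloc(16);
    if (f3 == NULL)
        goto free_f2;

    rc = 0;          /* all three allocations succeeded */

    free(f3);        /* fall through the labels in reverse order */
free_f2:
    free(f2);
free_f1:
    free(f1);
done:
    return rc;
}
```

The labels are ordered so that falling through releases resources in the reverse order of their allocation.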
My third question is , how do I check if the allocated value is already freed or not? because after getting the return value, I free the pointer, but if I print it, it shows the same location with the allocated one (so not a null pointer).
Easy. Set the pointer to NULL after freeing it. The other advantage, apart from the one mentioned by MK, is that passing a NULL pointer to free is a no-op, i.e. no operation is performed. This will also help you avoid any double-free problems.
What I am about to share are my own development practices in C. They are by NO means the ONLY way to organize yourself. I am just outlining a way, not the way.
Okay, so, in many ways "C" is a loose language, so a lot of discipline and strictness comes from oneself as a developer. I've been developing in "C" professionally for more than 20 years, and I have only very rarely had to fix any production-grade software that I have developed. While quite a bit of the success may be attributed to experience, a fair chunk of it is rooted in consistent practice.
I follow a set of development practices, which are quite extensive and deal with everything from things as trivial as tabs to naming conventions and what not. I will limit myself to what I do about dealing with structures in general and their memory management in particular.
If I have a structure that's used throughout the software, I write create/destroy; init/done type functions for it:
struct foo * init_foo();
void done_foo(struct foo *);
and allocate and de-allocate the structure in these functions.
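A minimal sketch of such a pair is shown below. The two members are an assumed layout for illustration only; the real structure would have whatever fields the program needs.

```c
#include <stdlib.h>

/* Assumed layout for illustration only. */
struct foo {
    int e1;
    char *buf;
};

/* Allocate a struct foo in a known state. Returns NULL on failure. */
struct foo *init_foo(void)
{
    struct foo *f = malloc(sizeof *f);

    if (f != NULL) {
        f->e1 = 0;
        f->buf = NULL;
    }
    return f;
}

/* Release a struct foo and everything it owns. Accepts NULL. */
void done_foo(struct foo *f)
{
    if (f == NULL)
        return;
    free(f->buf);   /* release owned members first */
    free(f);
}
```

Because all allocation and release goes through these two functions, a later change to the layout only has to be handled in one place.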
If I manipulate a structure's elements directly all over the program, then I don't typedef it. I take the pain of using the struct keyword in each declaration so that I know it's a structure. This is sufficient as long as the pain threshold is NOT so high that I would get annoyed by it. :-)
If I find that the structure is acting VERY much like an object, then I choose to manipulate the structure elements STRICTLY through an opaque API: I define its interface through set/get type functions for each element, I create a 'forward declaration' in the header file used by every other part of the program, create an opaque typedef to the pointer of the structure, and only declare the actual structure in the structure API implementation file.
foo.h:
#include <stddef.h>
struct foo;
typedef struct foo *foo_t; /* opaque pointer to the structure */
void set_e1(foo_t f, int e1);
int get_ei(foo_t f);
int set_buf(foo_t f, const char *buf);
char * get_buf_byref(foo_t f);
char * get_buf_byval(foo_t f, char *dest, size_t *dlen);
foo.c:
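#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "foo.h"
struct foo {
int e1;
char *buf;
...
};
void set_e1(foo_t f, int e1) {
f->e1 = e1;
}
int get_ei(foo_t f) { return f->e1; }
/* Returns 0 on success, -1 if the copy could not be allocated. */
int set_buf(foo_t f, const char *buf) {
char *copy = strdup(buf); /* strdup is POSIX; it is standard C only since C23 */
if ( copy == NULL ) return -1;
free ( f->buf ); /* free(NULL) is a no-op */
f->buf = copy;
return 0;
}
char *get_buf_byref(foo_t f) { return f->buf; }
/* On entry *dlen is the capacity of dest; on return it is the length of the stored string. */
char *get_buf_byval(foo_t f, char *dest, size_t *dlen) {
int n = snprintf(dest, *dlen, "%s", f->buf ? f->buf : ""); /* writes at most *dlen-1 bytes plus '\0' */
if ( n < 0 ) return NULL;
*dlen = (size_t)n;
return dest;
}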
If the related structures are very complicated you may even want to implement function pointers right into a base structure and then provide actual manipulators in particular extensions of that structure.
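A compact sketch of that idea is given below. All names (shape, rect, shape_area) are made up for illustration. The base structure carries a function pointer, and an extension embeds the base as its first member so that a pointer to the base can be converted back to the extension.

```c
/* Hypothetical base structure: carries its own operation. */
struct shape {
    double (*area)(const struct shape *self);
};

/* An "extension": the base must be the first member so the cast below is valid. */
struct rect {
    struct shape base;
    double w, h;
};

static double rect_area(const struct shape *self)
{
    const struct rect *r = (const struct rect *)self;
    return r->w * r->h;
}

void rect_init(struct rect *r, double w, double h)
{
    r->base.area = rect_area;
    r->w = w;
    r->h = h;
}

/* Generic code only sees the base and dispatches through the pointer. */
double shape_area(const struct shape *s)
{
    return s->area(s);
}
```

Code that holds only a struct shape * can call shape_area() without knowing which extension it is dealing with.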
You will see a strong similarity between the approach I've outlined above and object-oriented programming. It is meant to be that ...
If you keep your interfaces clean like this, whether or not you have to set instance variables to NULL all over the place won't matter. The code will, hopefully, lend itself to a tighter structure where silly mistakes are less likely.
Hope this helps.
I know this is answered, but I wanted to give my input. As far as I understand, when you call a function with parameters such as the pointer here, the parameters are pushed onto the stack (FILO).
Therefore the copy of the pointer passed to the function is automagically popped off the stack when the function returns, but that does not free the memory it points to. So you still need to free the pointer in funct1(). Correct me if I am wrong.
I have a piece of code written by a very old school programmer :-). It goes something like this:
typedef struct ts_request
{
ts_request_buffer_header_def header;
char package[1];
} ts_request_def;
ts_request_def* request_buffer =
malloc(sizeof(ts_request_def) + (2 * 1024 * 1024));
The programmer is basically working on a buffer overflow concept. I know the code looks dodgy. So my questions are:
1) Does malloc always allocate a contiguous block of memory? Because in this code, if the blocks are not contiguous, the code will fail big time.
2) Will free(request_buffer) free all the bytes allocated by malloc, i.e. sizeof(ts_request_def) + (2 * 1024 * 1024), or only sizeof(ts_request_def) bytes, the size of the structure?
3) Do you see any evident problems with this approach? I need to discuss this with my boss and would like to point out any loopholes in this approach.
To answer your numbered points.
1) Yes.
2) All the bytes. malloc/free don't know or care about the type of the object, just the size.
3) It is strictly speaking undefined behaviour, but a common trick supported by many implementations. See below for other alternatives.
Since ISO/IEC 9899:1999 (informally C99), the C standard allows flexible array members.
An example of this would be:
int main(void)
{
struct { size_t x; char a[]; } *p;
p = malloc(sizeof *p + 100);
if (p)
{
/* You can now access up to p->a[99] safely */
}
}
This now standardized feature allows you to avoid the common, but non-standard, implementation extension that you describe in your question. Strictly speaking, using a non-flexible array member and accessing beyond its bounds is undefined behaviour, but many implementations document and encourage it.
Furthermore, gcc allows zero-length arrays as an extension. Zero-length arrays are illegal in standard C, but gcc introduced this feature before C99 gave us flexible array members.
In a response to a comment, I will explain why the snippet below is technically undefined behaviour. Section numbers I quote refer to C99 (ISO/IEC 9899:1999)
struct {
char arr[1];
} *x;
x = malloc(sizeof *x + 1024);
x->arr[23] = 42;
Firstly, 6.5.2.1#2 shows a[i] is identical to (*((a)+(i))), so x->arr[23] is equivalent to (*((x->arr)+(23))). Now, 6.5.6#8 (on the addition of a pointer and an integer) says:
"If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."
For this reason, because x->arr[23] is not within the array, the behaviour is undefined. You might still think that it's okay because the malloc() implies the array has now been extended, but this is not strictly the case. Informative Annex J.2 (which lists examples of undefined behaviour) provides further clarification with an example:
An array subscript is out of range, even if an object is apparently accessible with the
given subscript (as in the lvalue expression a[1][7] given the declaration int
a[4][5]) (6.5.6).
3 - That's a pretty common C trick to allocate a dynamic array at the end of a struct. The alternative would be to put a pointer into the struct and then allocate the array separately, and not forgetting to free it too. That the size is fixed to 2mb seems a bit unusual though.
This is a standard C trick, and it isn't more dangerous than any other buffer.
If you are trying to show your boss that you are smarter than the "very old school programmer", this code isn't a case for you. Old school is not necessarily bad. It seems the "old school" guy knows enough about memory management ;)
1) Yes it does, or malloc will fail if there isn't a large enough contiguous block available. (A failure with malloc will return a NULL pointer)
2) Yes it will. The internal memory allocation will keep track of the amount of memory allocated with that pointer value and free all of it.
3) It's a bit of a language hack, and I'm a bit dubious about its use. It's still subject to buffer overflows as well; it may just take attackers slightly longer to find a payload that will cause one. The cost of the 'protection' is also pretty hefty (do you really need >2mb per request buffer?). It's also very ugly, although your boss may not appreciate that argument :)
I don't think the existing answers quite get to the essence of this issue. You say the old-school programmer is doing something like this;
typedef struct ts_request
{
ts_request_buffer_header_def header;
char package[1];
} ts_request_def;
ts_request_def* request_buffer =
malloc(sizeof(ts_request_def) + (2 * 1024 * 1024));
I think it's unlikely he's doing exactly that, because if that's what he wanted to do he could do it with simplified equivalent code that doesn't need any tricks;
typedef struct ts_request
{
ts_request_buffer_header_def header;
char package[2*1024*1024 + 1];
} ts_request_def;
ts_request_def* request_buffer =
malloc(sizeof(ts_request_def));
I'll bet that what he's really doing is something like this;
typedef struct ts_request
{
ts_request_buffer_header_def header;
char package[1]; // effectively package[x]
} ts_request_def;
ts_request_def* request_buffer =
malloc( sizeof(ts_request_def) + x );
What he wants to achieve is allocation of a request with a variable package size x. It is of course illegal to declare the array's size with a variable, so he is getting around this with a trick. It looks as if he knows what he's doing to me, the trick is well towards the respectable and practical end of the C trickery scale.
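In C99 and later the same intent can be written with a flexible array member instead of the char[1] trick. The sketch below is only an illustration: request_header and its package_len field stand in for the real header type, which the question does not show, and request_new is a made-up name.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the real header type, which the question does not show. */
typedef struct {
    size_t package_len;
} request_header;

typedef struct {
    request_header header;
    char package[];            /* C99 flexible array member, must be last */
} request;

/* Allocate a request with room for x package bytes; release it with free(). */
request *request_new(size_t x)
{
    request *r;

    if (x > (size_t)-1 - sizeof *r)
        return NULL;           /* size computation would overflow */

    r = malloc(sizeof *r + x);
    if (r == NULL)
        return NULL;

    r->header.package_len = x;
    memset(r->package, 0, x);
    return r;
}
```

A single free() on the returned pointer releases the header and the package bytes together, just as with the original trick.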
As for #3, without more code it's hard to answer. I don't see anything wrong with it, unless it's happening a lot. I mean, you don't want to allocate 2mb chunks of memory all the time. You also don't want to do it needlessly, e.g. if you only ever use 2k.
The fact that you don't like it for some reason isn't sufficient to object to it, or justify completely re-writing it. I would look at the usage closely, try to understand what the original programmer was thinking, look closely for buffer overflows (as workmad3 pointed out) in the code that uses this memory.
There are lots of common mistakes that you may find. For example, does the code check to make sure malloc() succeeded?
The exploit (question 3) is really up to the interface towards this structure of yours. In context this allocation might make sense, and without further information it is impossible to say if it's secure or not.
But if you mean problems with allocating memory bigger than the structure, this is by no means a bad C design (I wouldn't even say it's THAT old school... ;) )
Just a final note here: the point of having a char[1] is that the terminating NUL always fits in the declared struct, meaning there can be 2 * 1024 * 1024 characters in the buffer and you don't have to account for the NUL with a "+1". It might look like a small feat, but I just wanted to point it out.
I've seen and used this pattern frequently.
Its benefit is to simplify memory management and thus avoid risk of memory leaks. All it takes is to free the malloc'ed block. With a secondary buffer, you'll need two free. However one should define and use a destructor function to encapsulate this operation so you can always change its behavior, like switching to secondary buffer or add additional operations to be performed when deleting the structure.
Access to array elements is also slightly more efficient but that is less and less significant with modern computers.
The code will also correctly work if memory alignment changes in the structure with different compilers as it is quite frequent.
The only potential problem I could see would be the compiler permuting the storage order of the member variables, because this trick requires that the package field remains last in storage. The C standard rules that out, though: the members of a struct are laid out in the order in which they are declared.
Note also that the size of the allocated buffer will most probably be bigger than required, at least by one byte with the additional padding bytes if any.
Yes. malloc returns only a single pointer - how could it possibly tell a requester that it had allocated multiple discontiguous blocks to satisfy a request?
I would like to add that not only is it common, I might even call it standard practice, because the Windows API is full of such uses.
Check the very common BITMAP header structure for example.
http://msdn.microsoft.com/en-us/library/aa921550.aspx
The last member, the RGB quad, is an array of size 1, which depends on exactly this technique.
This common C trick is also explained in this StackOverflow question (Can someone explain this definition of the dirent struct in solaris?).
In response to your third question.
free always releases all the memory allocated at a single shot.
int *i = malloc(1024*2);
/* free(i+1); would be undefined behaviour: the pointer is offset, it is not the one malloc returned */
free(i); /* releases the whole 2KB block */
The answer to question 1 and 2 is Yes
About ugliness (ie question 3) what is the programmer trying to do with that allocated memory?
the thing to realize here is that malloc does not see the calculation being made in this
malloc(sizeof(ts_request_def) + (2 * 1024 * 1024));
It's the same as
size_t sz = sizeof(ts_request_def) + (2 * 1024 * 1024);
malloc(sz);
You might think that it's allocating 2 chunks of memory, and in your mind they are "the struct" and "some buffers". But malloc doesn't see that at all.