malloc once, then distribute memory over struct arrays - c

I have a struct that has the following memory layout:
uint32_t
variable length array of type uint16_t
variable length array of type uint16_t
Because of the variable length of the arrays I have pointers to these arrays, effectively:
struct struct1 {
uint32_t n;
uint16_t *array1;
uint16_t *array2;
};
typedef struct struct1 struct1;
Now, when allocating these structs, I see two options:
A) malloc the struct itself, then malloc space for the arrays individually and set the pointers in the struct to point to the correct memory location:
uint32_t n1 = 10;
uint32_t n2 = 20;
struct1 *s1 = malloc(sizeof(struct1));
uint16_t *array1 = malloc(sizeof(uint16_t) * n1);
uint16_t *array2 = malloc(sizeof(uint16_t) * n2);
s1->n = n1;
s1->array1 = array1;
s1->array2 = array2;
B) malloc memory for everything combined, then "distribute" the memory over the struct:
struct1 *s1 = malloc(sizeof(struct1) + (n1 + n2) * sizeof(uint16_t));
s1->n = n1;
s1->array1 = (uint16_t *)(s1 + 1); /* first byte after the struct */
s1->array2 = s1->array1 + n1; /* pointer arithmetic already counts in elements */
Note that array1 and array2 are not bigger than a few KB and usually not a lot of struct1s are needed. However, cache efficiency is a concern as numeric data crunching is done with this struct.
Is approach B) possible and if so better (faster) than A in terms of memory locality?
I am not very familiar with C. Is there a better way of doing B (or A), i.e. using memcpy or realloc or something?
Anything else to be mindful about in this situation?
Note, that right now I'm using gcc (C89?) on linux but could use C99/C11 if necessary. Thanks in advance.
EDIT: To clarify further: The size of the arrays will never change after creation. Multiple struct1s will not always be allocated at once, but rather occasionally during the program's runtime.

I think your option A is much cleaner and would scale in a more sensible way. Imagine having to realloc space when the array in one of the structures becomes larger: in option A, you can realloc that memory since it isn't logically attached to anything else. In option B, you need to add in additional logic to ensure you don't break your other array.
I also think (even in C89, but I could be wrong) that there is nothing wrong with this:
struct1 *s1 = malloc(sizeof(struct1));
s1->array1 = malloc(sizeof(uint16_t) * n1);
s1->array2 = malloc(sizeof(uint16_t) * n2);
s1->n = n1;
The above takes out the middle-man arrays. I think it is cleaner because you immediately see that you are allocating space for a pointer in a structure.
I have used your option B before for 2D arrays, where I just allocate a single space and use logical rules in my code to use it as a 2D space. This is useful when I want it to be a rectangular 2D space, so when I increase it, I always increase each row or column. In other words, I never want to have heterogeneous array sizes.
Update: 'Arrays will never change in size'
Because you clarified that your structures/arrays will never need to be reallocated, I think option B is less bad. It still seems to be a worse solution for this application than option A, and here are my reasons for thinking this:
malloc is optimized such that there wouldn't be much optimization from allocating a single space compared to allocating the spaces individually.
The ability of other engineers to look at and immediately understand your code would be reduced. To be clear, any competent software engineer should be able to look at option B and figure out what the writer of the code was doing, but it very well could waste that engineer's brain-cycles and could cause a junior engineer to misunderstand the code and create a bug.
So, if you comment the code thoroughly, and your application absolutely requires you to optimize everything you possibly can, at the expense of clean and logically sensible code (where memory space and data structures are logically separated in a similar way), and you know that this optimization is better than what a good compiler (like Clang) can do, then option B could be a better option.
Update: Testing
In the spirit of self-criticism I wanted to see if I could evaluate the difference. So I wrote two programs (one for option A and one for option B) and compiled them with optimizations off. I used a FreeBSD virtual machine to get as clean of an environment as possible, and I used gcc.
Here are the programs that I used to test the two methods:
optionA.c:
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#define NSIZE 100000
#define NTESTS 10000000
struct test_struct {
int n;
int *array1;
int *array2;
};
void freeA(struct test_struct *input) {
free(input->array1);
free(input->array2);
free(input);
return;
}
void optionA() {
struct test_struct *s1 = malloc(sizeof(*s1));
s1->array1 = malloc(sizeof(*(s1->array1)) * NSIZE);
s1->array2 = malloc(sizeof(*(s1->array1)) * NSIZE);
s1->n = NSIZE;
freeA(s1);
s1 = 0;
return;
}
int main() {
clock_t beginA = clock();
int i;
for (i=0; i<NTESTS; i++) {
optionA();
}
clock_t endA = clock();
int time_spent_A = (endA - beginA);
printf("Time spent for option A: %d\n", time_spent_A);
return 0;
}
optionB.c:
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#define NSIZE 100000
#define NTESTS 10000000
struct test_struct {
int n;
int *array1;
int *array2;
};
void freeB(struct test_struct *input) {
free(input);
return;
}
void optionB() {
struct test_struct *s1 = malloc(sizeof(*s1) + 2*NSIZE*sizeof(*(s1->array1)));
s1->array1 = (int *)(s1 + 1); /* s1 + sizeof(*s1) would skip sizeof(*s1) whole structs */
s1->array2 = s1->array1 + NSIZE;
s1->n = NSIZE;
freeB(s1);
s1 = 0;
return;
}
int main() {
clock_t beginB = clock();
int i;
for (i=0; i<NTESTS; i++) {
optionB();
}
clock_t endB = clock();
int time_spent_B = (endB - beginB);
printf("Time spent for option B: %d\n", time_spent_B);
return 0;
}
Results for these tests are given in clocks (see clock(3) for more information).
Series | Option A | Option B
------------------------------
1 | 332 | 158
------------------------------
2 | 334 | 155
------------------------------
3 | 334 | 156
------------------------------
4 | 333 | 154
------------------------------
5 | 339 | 156
------------------------------
6 | 334 | 155
------------------------------
avg | 334.3 | 155.7
------------------------------
Note that these speeds are still incredibly fast and translate to milliseconds over millions of tests. I have also found that Clang (cc) is better than gcc at optimizing. On my machine, even after writing a method that writes data to the arrays (to ensure they don't get optimized out of existence) I got no differential between the two methods when compiling with cc.

I would advise a hybrid of the two:
allocate the structs in one call (it is now an array of structs);
allocate the arrays in one call, and make sure the size includes any padding for the alignment required by your compiler/platform;
distribute the arrays over the structs, taking the alignment into account.
However, malloc is already optimized, so your first solution would still be preferred.
Note: as user Greg Schmit's solution points out, allocating all the arrays at once will cause difficulties if an array's size needs to change at run time.
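The three steps above could look roughly like the sketch below. The helper names struct1_pool_new and struct1_pool_free are made up for illustration, and the sketch assumes every struct uses the same n1 and n2. Because both arrays hold uint16_t, no padding is needed between them here; with mixed element types each offset would have to be rounded up to the stricter alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct struct1 {
    uint32_t n;
    uint16_t *array1;
    uint16_t *array2;
};

/* Hypothetical helper: allocates `count` structs in one block and all of
 * their arrays in a second block, then hands out slices of that block.
 * Returns NULL on failure. */
struct struct1 *struct1_pool_new(size_t count, uint32_t n1, uint32_t n2)
{
    const size_t per_struct = (size_t)n1 + n2;   /* elements per struct */
    struct struct1 *structs;
    uint16_t *pool;
    size_t i;

    if (count == 0 || per_struct == 0)
        return NULL;

    structs = malloc(count * sizeof *structs);
    pool = malloc(count * per_struct * sizeof *pool);
    if (!structs || !pool) {
        free(structs);
        free(pool);
        return NULL;
    }

    for (i = 0; i < count; i++) {
        structs[i].n = n1;
        structs[i].array1 = pool + i * per_struct;
        structs[i].array2 = structs[i].array1 + n1;
    }

    /* Sanity check: the slices cover the pool exactly. */
    assert(structs[count - 1].array2 + n2 == pool + count * per_struct);
    return structs;
}

/* The pool starts at structs[0].array1, so two frees release everything. */
void struct1_pool_free(struct struct1 *structs)
{
    if (structs) {
        free(structs[0].array1);
        free(structs);
    }
}
```

The price of this layout is that no single struct can be freed or resized on its own; the whole pool lives and dies together.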

Because the two arrays have the same type, there are many more options than that, based on creative use of the C99 flexible array member. I'd recommend you use a pointer only for the second array,
struct foo {
uint16_t *array2;
uint32_t field;
uint16_t array1[];
};
and allocate memory for both at the same time:
struct foo *foo_new(const size_t length1, const size_t length2)
{
struct foo *result;
result = malloc( sizeof (struct foo)
+ length1 * sizeof (uint16_t)
+ length2 * sizeof (uint16_t) );
if (!result)
return NULL;
result->array2 = result->array1 + length1;
return result;
}
Note that with struct foo *bar, accessing element i in the two arrays uses the same notation, bar->array1[i] and bar->array2[i], respectively.
In the context of scientific computing, I would consider completely other options, depending on the access patterns. For example, if the two arrays are accessed in lockstep (in any direction), I would use
typedef uint16_t pair16[2];
struct bar {
uint32_t field;
pair16 array[];
};
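For illustration, allocating and walking such a lockstep layout might look like the sketch below. bar_new and bar_sum_products are hypothetical helpers, and storing the pair count in field is an assumption made only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint16_t pair16[2];

struct bar {
    uint32_t field;     /* assumed here to hold the number of pairs */
    pair16   array[];   /* C99 flexible array member */
};

/* Hypothetical constructor: one malloc for the header and all pairs.
 * The pairs themselves are left uninitialized. */
struct bar *bar_new(uint32_t pairs)
{
    struct bar *b = malloc(sizeof *b + (size_t)pairs * sizeof(pair16));
    if (b)
        b->field = pairs;
    return b;
}

/* Both values of entry i sit next to each other in memory, so a lockstep
 * pass touches each cache line only once. */
uint32_t bar_sum_products(const struct bar *b)
{
    uint32_t sum = 0;
    uint32_t i;

    assert(b != NULL);
    for (i = 0; i < b->field; i++)
        sum += (uint32_t)b->array[i][0] * b->array[i][1];
    return sum;
}
```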
If the arrays were large, then copying them into temporary buffers (arrays of pair16 above, if accessed in lockstep) would possibly help, but with at most a few thousand entries, it is likely not going to give a significant speed boost.
In cases where the access pattern depends on the other, but you still do enough of computation on each entry, it may be useful to compute the address of the next entry early, and use __builtin_prefetch() GCC built-in to tell the CPU you'll need it soon, before doing the computation on the current entry. It may reduce the data access latencies (although the access predictors are pretty darn good on current processors already).
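As a rough illustration of the prefetch idea, here is a sketch that visits entries in a data-dependent order. chase_sum and the next index array are invented for the example; __builtin_prefetch is the GCC built-in mentioned above (Clang accepts it too), and the squaring is only a stand-in for real work. Whether the hint helps at all has to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Visits `count` entries starting at index `start`; next[i] names the
 * entry to visit after entry i. The address of the following entry is
 * known before the work on the current one starts, so it can be
 * prefetched early. */
uint64_t chase_sum(const uint16_t *values, const uint32_t *next,
                   uint32_t start, size_t count)
{
    uint64_t sum = 0;
    uint32_t cur = start;
    size_t k;

    assert(values != NULL && next != NULL);
    for (k = 0; k < count; k++) {
        const uint32_t upcoming = next[cur];
#if defined(__GNUC__)
        /* Arguments: address, 0 = read access, 1 = low temporal locality. */
        __builtin_prefetch(&values[upcoming], 0, 1);
#endif
        sum += (uint64_t)values[cur] * values[cur];   /* stand-in for real work */
        cur = upcoming;
    }
    return sum;
}
```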
With GCC (and to a lesser extent on other common compilers like Intel Compiler Collection, Portland Group, and Pathscale C compilers), I've noticed that code that manipulates pointers (instead of array pointers and array indexing) compiles to better machine code on x86 and x86-64. (The reason is actually quite simple: with array pointers and array indexing, you need at least two separate registers, and x86 has relatively few of those. Even x86-64 doesn't have that many of them. GCC in particular is not very strong at optimizing register usage -- it's much better now than in the version 3 era --, so this seems to help a lot in some cases). For example, if you were to access the first array in a struct foo sequentially, then
void do_something(struct foo *ref)
{
uint16_t *array1 = ref->array1;
uint16_t *const limit1 = ref->array1 + (number of elements in array1);
for (; array1 < limit1; array1++) {
/* ... */
}
}

Approach B is possible, (why don't you just try it?) and it is better, not so much because of memory locality, but because malloc() costs, so the fewer times you call it, the better off you are. (Assuming that 'better' means 'faster', which admittedly, is not necessarily the case.)
Memory locality is only marginally improved, since all memory blocks would most likely be contiguous (one after the other) in memory, so if you went with approach A your arrays would only be separated by block headers, which are not very big. (Of the order of 32 bytes each, maybe a bit larger, but not by much.) The only situation in which your blocks would not be contiguous is if you had previously been doing malloc() and free(), so your memory would be fragmented.

Related

How to improve performance of a dynamic array implemented with void**?

I need to implement a simple dynamic array that can work with any type.
Right now my void** implementation is ~50% slower than using int* directly:
#define N 1000000000
// Takes ~6 seconds
void** a = malloc(sizeof(void*) * N);
for (int i =0; i < N; i++) {
*(int*)(&a[i]) = i;
}
printf("%d\n", *(int*)&a[N-1]);
// Takes ~3 seconds
int* b = malloc(sizeof(int) * N) ;
for (int i =0; i < N; i++) {
b[i] = i;
}
printf("%d\n", b[N-1]);
I'm not a C expert. Is there a better way to do this?
Thanks
edit
Looks like using void** is a bad idea. Is there a way to implement this with void*?
Here's how it's implemented in Go:
type slice struct {
array unsafe.Pointer
len int
cap int
}
I'd like to do something similar.
edit2
I managed to implement this with void*.
The solution was really simple:
void* a = malloc(sizeof(int) * N);
for (int i = 0; i < N; i++) {
((int*)a)[i] = i;
}
printf("%d\n", ((int*)a)[N-1]);
Performance is the same now.
Your two alternative programs are not analogous. In the second one, which is valid, you allocate space sufficient to hold N integers, and then assign values to the int-size members of that space. In the first one, however, you allocate space large enough to accommodate N pointers to void and then, without initializing those pointers, you try to assign values to the objects to which they point. Even if those pointers had been initialized to point to int objects, there is an extra level of indirection.
Your first code could be corrected, in a sense, like so:
void** a = malloc(sizeof(void*) * N);
for (int i =0; i < N; i++) {
a[i] = (void *)(intptr_t) i; /* intptr_t (from <stdint.h>) avoids an int-to-pointer size warning */
}
printf("%d\n", (int)(intptr_t) a[N-1]);
That relies on the fact that C allows conversions between pointer and integer types (although not necessarily without data loss), and note that there is only a single level of indirection (array indexing), not two.
Inasmuch as the behavior of your implementation of the first alternative is undefined, we can only speculate about why it runs slower in practice. If we assume a straightforward implementation, however, then such a performance penalty as you observe might arise from poor cache locality for all the array writes.
Be aware that sizeof(void *) is twice sizeof(int) on most 64-bit platforms (an 8-byte address versus a 4-byte signed integer). If that's your case, I bet the difference comes only from page cache misses: your memory unit has to load twice as many pages, which is slow.
Please also note that C++ vectors aren't "dynamic array that can work with any type". They are bound to a type, for instance: std::vector<int> is a dynamic array but where you can only store int.
A solution to your problem would be to implement some sort of std::vector<void *> in C. But it's not efficient:
You need to do 2 allocations for each element (1 for the container and 1 for the element itself)
You need to do 2 levels of indirection each time you access the data (1 to get the pointer in the container and 1 to get the data in the element)
You need to store some kind of type information in each element. If not, you don't know what is in your dynamic array
I managed to implement this with void*.
The solution was really simple:
void* a = malloc(sizeof(int) * N);
for (int i =0;i<N;i++) {
((int*)a)[i] = i;
}
printf("%d\n", ((int*)a)[N-1]);
Performance is the same now.
I also came across this great article that explains how to implement a generic data structure in C:
http://jiten-thakkar.com/posts/writing-generic-stack-in-c
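For comparison, a growable container in the spirit of the Go slice header shown in the question could be sketched as follows. The elem_size member and all function names are assumptions added for this example; one flat buffer holds the elements, so there is no per-element allocation and only one level of indirection.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Loosely mirrors the Go slice header; elem_size is an addition so that
 * the container can compute element addresses for any type. */
struct slice {
    void  *array;
    size_t len;
    size_t cap;
    size_t elem_size;
};

void slice_init(struct slice *s, size_t elem_size)
{
    s->array = NULL;
    s->len = 0;
    s->cap = 0;
    s->elem_size = elem_size;
}

/* Appends a copy of *elem. Returns 0 on success, -1 if growing failed
 * (the existing contents stay valid in that case). */
int slice_push(struct slice *s, const void *elem)
{
    assert(s != NULL && elem != NULL);
    if (s->len == s->cap) {
        const size_t new_cap = s->cap ? s->cap * 2 : 8;
        void *p = realloc(s->array, new_cap * s->elem_size);
        if (!p)
            return -1;
        s->array = p;
        s->cap = new_cap;
    }
    memcpy((char *)s->array + s->len * s->elem_size, elem, s->elem_size);
    s->len++;
    return 0;
}

/* Returns the address of element i; the caller casts to the real type. */
void *slice_at(const struct slice *s, size_t i)
{
    assert(s != NULL && i < s->len);
    return (char *)s->array + i * s->elem_size;
}

void slice_free(struct slice *s)
{
    free(s->array);
    slice_init(s, s->elem_size);
}
```

In a hot loop it is usually cheaper to cast s.array once to the element type and index it directly, as in the void* version above, than to call slice_at for every element.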

Allocate Pointer and pointee at once

If I want to reduce malloc()s (especially if the data is small and allocated often), I would like to allocate the pointer and pointee at once.
If you assume something like the following:
struct entry {
size_t buf_len;
char *buf;
int something;
};
I would like to allocate memory in the following way (don't care about error checking here):
size_t buf_len = 4; // size of the buffer
struct entry *e = NULL;
e = malloc( sizeof(*e) + buf_len ); // allocate struct and buffer
e->buf_len = buf_len; // set buffer size
e->buf = (char *)(e + 1); // the buffer lies behind the struct
This could even be extended, so that a whole array is allocated at once.
How would you assess such a technique with regard to:
Portability
Maintainability / Extendability
Performance
Readability
Is this reasonable? If it is ok to use, are there any ideas on how to design a possible interface for that?
You could use a flexible array member instead of a pointer:
struct entry {
size_t buf_len;
int something;
char buf[];
};
// ...
struct entry *e = malloc(sizeof *e + buf_len);
e->buf_len = buf_len;
Portability and performance are fine. Readability: not perfect but good enough.
Extendability: you can't use this for more than one member at a time, you'd have to fall back to your explicit pointer version. Also, the explicit pointer version means that you have to muck around to ensure correct alignment if you use it with a type that doesn't have an alignment of 1.
If you are seriously thinking about this I'd consider revisiting your entire data structure's design to see if there is another way of doing it. (Maybe this way is actually the best way, but have a good think about it first).
As to portability, I am unaware of any issues, as long as the sizes are found via suitable calls to sizeof(), as in your code.
Regarding maintainability, extendability and readability, you should certainly wrap allocation and de-allocation in a well-commented function. Calls to...
struct entry *allocate_entry_with_buffer();
void deallocate_entry_with_buffer(struct entry **entry_with_buffer);
...do not need to know implementation details of how the memory actually gets handled. People use stranger things like custom allocators and memory pools quite frequently.
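A minimal sketch of such a pair of wrappers follows, assuming the flexible-array-member layout from the earlier answer. The buf_len parameter and the zero-filled buffer are choices made for this example, not part of the interface above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct entry {
    size_t buf_len;
    int    something;
    char   buf[];      /* C99 flexible array member */
};

/* Allocates the struct and its buffer in one block; the buffer is
 * zero-filled. Returns NULL on failure. */
struct entry *allocate_entry_with_buffer(size_t buf_len)
{
    struct entry *e = malloc(sizeof *e + buf_len);
    if (!e)
        return NULL;
    e->buf_len = buf_len;
    e->something = 0;
    memset(e->buf, 0, buf_len);
    return e;
}

/* Frees the block and clears the caller's pointer, so that an accidental
 * second call becomes a harmless no-op. */
void deallocate_entry_with_buffer(struct entry **entry_with_buffer)
{
    assert(entry_with_buffer != NULL);
    free(*entry_with_buffer);
    *entry_with_buffer = NULL;
}
```

Callers only ever see these two functions, so the layout could later change to two separate allocations without touching any call site.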
As for speed, this is certainly faster than making lots of small allocations. I used to allocate whole 2D matrices with a similar strategy...
It should work, but in fact you are using a pointer for a useless indirection. Windows API (for example) uses another method for variable size structs : the variable size buffer is last in struct and is declared to be char buf[1].
Your struct would become :
struct entry {
size_t buf_len;
int something;
char buf[1];
};
The allocation is (still no error checking) :
size_t buf_len = 4; // size of the buffer
struct entry *e;
e = malloc( sizeof(*e) + buf_len - 1); // struct already has room for 1 char
e->buf_len = buf_len; // set buffer size
That's all: e->buf is guaranteed to be a char array of size buf_len.
This approach ensures that even if the variable part were not a character array but an array of int, long, or anything else, the alignment would be correct, because the last member is an array of the proper type with size 1.
For starters, the line:
e->buf = e + sizeof(*e); // the buffer lies behind the struct
Should be:
e->buf = (char *)(e + 1); // the buffer lies behind the struct
This is because pointer arithmetic is scaled by the size of the pointed-to type: e + 1 is the address just past the end of the structure, whereas e + sizeof(*e) skips sizeof(*e) whole structures and lands far beyond the allocated block.
And, yes, it's reasonable. However, I prefer this approach:
struct entry {
size_t buf_len;
int something;
char buf[1];
};
This way, you don't mess with the pointers. Just append as many bytes as needed, and they will grow the size of your buf array.
Note: I wrote a text editor using an approach similar to this but used a Microsoft c++ extension that allowed me to declare the last member as char buf[]. So it was an empty array that was exactly as long as the number of extra bytes I allocated.
seems fine to me - put comments in though
Or you could do this - which is quite common
struct entry {
size_t buf_len;
int something;
char buf;
};
ie make the struct itself variable length. and do
size_t buf_len = 4; // size of the buffer
struct entry *e = NULL;
// check that it packs right
e = malloc(sizeof(size_t) + sizeof(int) + buf_len ); // allocate struct and buffer
e->buf_len = buf_len; // set buffer size
...... later
printf("%s\n", &e->buf); /* e is a pointer, and the buffer should not be used as the format string */

Get the length of an array with a pointer? [duplicate]

I've allocated an "array" of mystruct of size n like this:
if (NULL == (p = calloc(sizeof(struct mystruct) * n,1))) {
/* handle error */
}
Later on, I only have access to p, and no longer have n. Is there a way to determine the length of the array given just the pointer p?
I figure it must be possible, since free(p) does just that. I know malloc() keeps track of how much memory it has allocated, and that's why it knows the length; perhaps there is a way to query for this information? Something like...
int length = askMallocLibraryHowMuchMemoryWasAlloced(p) / sizeof(mystruct)
I know I should just rework the code so that I know n, but I'd rather not if possible. Any ideas?
No, there is no way to get this information without depending strongly on the implementation details of malloc. In particular, malloc may allocate more bytes than you request (e.g. for efficiency in a particular memory architecture). It would be much better to redesign your code so that you keep track of n explicitly. The alternative is at least as much redesign and a much more dangerous approach (given that it's non-standard, abuses the semantics of pointers, and will be a maintenance nightmare for those that come after you): store the length n at the malloc'd address, followed by the array. Allocation would then be:
void *p = calloc(sizeof(struct mystruct) * n + sizeof(unsigned long int), 1);
*((unsigned long int*)p) = n;
n is now stored at *((unsigned long int*)p) and the start of your array is now
void *arr = (char *)p + sizeof(unsigned long int); /* arithmetic on void * is a GNU extension, so cast first */
Edit: Just to play devil's advocate... I know that these "solutions" all require redesigns, but let's play it out.
Of course, the solution presented above is just a hacky implementation of a (well-packed) struct. You might as well define:
typedef struct {
unsigned int n;
void *arr;
} arrInfo;
and pass around arrInfos rather than raw pointers.
Now we're cooking. But as long as you're redesigning, why stop here? What you really want is an abstract data type (ADT). Any introductory text for an algorithms and data structures class would do it. An ADT defines the public interface of a data type but hides the implementation of that data type. Thus, publicly an ADT for an array might look like
typedef void* arrayInfo;
arrayInfo newArrayInfo(unsigned int n, unsigned int itemSize);
void deleteArrayInfo(arrayInfo);
unsigned int arrayLength(arrayInfo);
void *arrayPtr(arrayInfo);
...
In other words, an ADT is a form of data and behavior encapsulation... in other words, it's about as close as you can get to Object-Oriented Programming using straight C. Unless you're stuck on a platform that doesn't have a C++ compiler, you might as well go whole hog and just use an STL std::vector.
There, we've taken a simple question about C and ended up at C++. God help us all.
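Staying in C for a moment, the ADT interface listed above might be implemented along these lines. This is only a sketch: struct array_impl and its members are made up for the example, and in a real project that definition would sit in the .c file so that callers see nothing but the arrayInfo handle.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *arrayInfo;

/* Private representation behind the opaque handle. */
struct array_impl {
    unsigned int n;
    unsigned int itemSize;
    void *items;
};

/* Returns NULL on failure or when asked for an empty array. */
arrayInfo newArrayInfo(unsigned int n, unsigned int itemSize)
{
    struct array_impl *a;

    if (n == 0 || itemSize == 0)
        return NULL;
    a = malloc(sizeof *a);
    if (!a)
        return NULL;
    a->items = calloc(n, itemSize);   /* zero-filled storage */
    if (!a->items) {
        free(a);
        return NULL;
    }
    a->n = n;
    a->itemSize = itemSize;
    return a;
}

void deleteArrayInfo(arrayInfo info)
{
    struct array_impl *a = info;
    if (a) {
        free(a->items);
        free(a);
    }
}

unsigned int arrayLength(arrayInfo info)
{
    assert(info != NULL);
    return ((struct array_impl *)info)->n;
}

void *arrayPtr(arrayInfo info)
{
    assert(info != NULL);
    return ((struct array_impl *)info)->items;
}
```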
Keep track of the array size yourself; free uses the malloc chain to free the block that was allocated, which does not necessarily have the same size as the array you requested.
Just to confirm the previous answers: There is no way to know, just by studying a pointer, how much memory was allocated by a malloc which returned this pointer.
What if it worked?
One example of why this is not possible. Let's imagine the code with a hypothetical function called get_size(void *) which returns the memory allocated for a pointer:
typedef struct MyStructTag
{ /* etc. */ } MyStruct ;
void doSomething(MyStruct * p)
{
/* well... extract the memory allocated? */
size_t i = get_size(p) ;
initializeMyStructArray(p, i) ;
}
void doSomethingElse()
{
MyStruct * s = malloc(sizeof(MyStruct) * 10) ; /* Allocate 10 items */
doSomething(s) ;
}
Why, even if it worked, would it still not work?
The problem with this approach is that, in C, you can play with pointer arithmetic. Let's rewrite doSomethingElse():
void doSomethingElse()
{
MyStruct * s = malloc(sizeof(MyStruct) * 10) ; /* Allocate 10 items */
MyStruct * s2 = s + 5 ; /* s2 points to the item at index 5 (the 6th item) */
doSomething(s2) ; /* Oops */
}
How is get_size supposed to work when you pass the function a valid pointer that is not the one returned by malloc? And even if get_size went through all the trouble of finding the size (i.e. in an inefficient way), it would return, in this case, a value that would be wrong in your context.
Conclusion
There are always ways to avoid this problem, and in C, you can always write your own allocator, but again, it is perhaps too much trouble when all you need is to remember how much memory was allocated.
Some compilers provide msize() or similar functions (_msize() etc), that let you do exactly that
May I recommend a terrible way to do it?
Allocate all your arrays as follows:
void *blockOfMem = malloc(sizeof(mystruct)*n + sizeof(int));
((int *)blockOfMem)[0] = n;
mystruct *structs = (mystruct *)(((int *)blockOfMem) + 1);
Then you can always cast your arrays to int * and access the -1st element.
Be sure to free that pointer, and not the array pointer itself!
Also, this will likely cause terrible bugs that will leave you tearing your hair out. Maybe you can wrap the alloc funcs in API calls or something.
malloc will return a block of memory at least as big as you requested, but possibly bigger. So even if you could query the block size, this would not reliably give you your array size. So you'll just have to modify your code to keep track of it yourself.
For an array of pointers you can use a NULL-terminated array. The length can then be determined the way it is done with strings. In your example you could perhaps use a structure member to mark the end. Of course, that depends on whether there is a member that cannot be NULL. So let's say you have a member name that needs to be set for every struct in your array; you can then query the size with:
int size;
struct mystruct *cur;
for (cur = myarray; cur->name != NULL; cur++)
;
size = cur - myarray;
Btw it should be calloc(n, sizeof(struct mystruct)) in your example.
Others have discussed the limits of plain C pointers and the stdlib.h implementations of malloc(). Some implementations provide extensions which return the allocated block size, which may be larger than the requested size.
If you must have this behavior, you can use or write a specialized memory allocator. The simplest thing to do would be to implement a wrapper around the stdlib.h functions. Something like:
void* my_malloc(size_t s); /* Calls malloc(s), and if successful stores
(p,s) in a list of handled blocks */
void my_free(void* p); /* Removes list entry and calls free(p) */
size_t my_block_size(void* p); /* Looks up p, and returns the stored size */
...
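A bare-bones version of that wrapper, using a singly linked list for the bookkeeping, might look like this. It is only a sketch: the list makes lookups O(n), it is not thread-safe, and struct block_node is a name invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* One bookkeeping record per live allocation. */
struct block_node {
    void *ptr;
    size_t size;
    struct block_node *next;
};

static struct block_node *blocks = NULL;

/* Calls malloc(s) and, if successful, records (p, s). */
void *my_malloc(size_t s)
{
    struct block_node *node = malloc(sizeof *node);
    void *p;

    if (!node)
        return NULL;
    p = malloc(s);
    if (!p) {
        free(node);
        return NULL;
    }
    node->ptr = p;
    node->size = s;
    node->next = blocks;
    blocks = node;
    return p;
}

/* Removes the record for p and frees the block. */
void my_free(void *p)
{
    struct block_node **link = &blocks;
    int found = 0;

    if (!p)
        return;
    while (*link) {
        if ((*link)->ptr == p) {
            struct block_node *dead = *link;
            *link = dead->next;
            free(dead);
            found = 1;
            break;
        }
        link = &(*link)->next;
    }
    assert(found && "my_free: pointer did not come from my_malloc");
    free(p);
}

/* Returns the size originally requested for p, or 0 if p is unknown. */
size_t my_block_size(void *p)
{
    const struct block_node *node;

    for (node = blocks; node; node = node->next)
        if (node->ptr == p)
            return node->size;
    return 0;
}
```

With this in place, the element count from the question becomes my_block_size(p) / sizeof(struct mystruct), at the cost of one extra small allocation per block.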
Really, your question is: "can I find out the size of a malloc'd (or calloc'd) data block?" And as others have said: no, not in a standard way.
However there are custom malloc implementations that do it - for example http://dmalloc.com/
I'm not aware of a way, but I would imagine it would deal with mucking around in malloc's internals which is generally a very, very bad idea.
Why is it that you can't store the size of memory you allocated?
EDIT: If you know that you should rework the code so you know n, well, do it. Yes it might be quick and easy to try to poll malloc but knowing n for sure would minimize confusion and strengthen the design.
One of the reasons that you can't ask the malloc library how big a block is, is that the allocator will usually round up the size of your request to meet some minimum granularity requirement (for example, 16 bytes). So if you ask for 5 bytes, you'll get a block of size 16 back. If you were to take 16 and divide by 5, you would get three elements when you really only allocated one. It would take extra space for the malloc library to keep track of how many bytes you asked for in the first place, so it's best for you to keep track of that yourself.
This is a test of my sort routine. It sets up 7 variables to hold float values, then assigns them to an array, which is used to find the max value.
The magic is in the call to myMax:
float mmax = myMax((float *)&arr,(int) sizeof(arr)/sizeof(arr[0]));
And that was magical, wasn't it?
myMax expects a float array pointer (float *) so I use &arr to get the address of the array, and cast it as a float pointer.
myMax also expects the number of elements in the array as an int. I get that value by using sizeof to obtain the byte sizes of the whole array and of its first element, then dividing the total bytes by the number of bytes in each element. (We should not guess or hard-code element sizes, because they vary between platforms: an int, for example, is 2 bytes on some systems and 4 on others, such as my OS X Mac.) Note that this only works where arr is an actual array in scope; applied to a pointer, sizeof yields the size of the pointer itself.
NOTE:All this is important when your data may have a varying number of samples.
Here's the test code:
#include <stdio.h>
float a, b, c, d, e, f, g;
float myMax(float *apa,int soa){
int i;
float max = apa[0];
for(i=0; i< soa; i++){
if (apa[i]>max){max=apa[i];}
printf("on i=%d val is %0.2f max is %0.2f, soa=%d\n",i,apa[i],max,soa);
}
return max;
}
int main(void)
{
a = 2.0;
b = 1.0;
c = 4.0;
d = 3.0;
e = 7.0;
f = 9.0;
g = 5.0;
float arr[] = {a,b,c,d,e,f,g};
float mmax = myMax((float *)&arr,(int) sizeof(arr)/sizeof(arr[0]));
printf("mmax = %0.2f\n",mmax);
return 0;
}
In uClibc, there is a MALLOC_SIZE macro in malloc.h:
/* The size of a malloc allocation is stored in a size_t word
MALLOC_HEADER_SIZE bytes prior to the start address of the allocation:
+--------+---------+-------------------+
| SIZE |(unused) | allocation ... |
+--------+---------+-------------------+
^ BASE ^ ADDR
^ ADDR - MALLOC_HEADER_SIZE
*/
/* The amount of extra space used by the malloc header. */
#define MALLOC_HEADER_SIZE \
(MALLOC_ALIGNMENT < sizeof (size_t) \
? sizeof (size_t) \
: MALLOC_ALIGNMENT)
/* Set up the malloc header, and return the user address of a malloc block. */
#define MALLOC_SETUP(base, size) \
(MALLOC_SET_SIZE (base, size), (void *)((char *)base + MALLOC_HEADER_SIZE))
/* Set the size of a malloc allocation, given the base address. */
#define MALLOC_SET_SIZE(base, size) (*(size_t *)(base) = (size))
/* Return base-address of a malloc allocation, given the user address. */
#define MALLOC_BASE(addr) ((void *)((char *)addr - MALLOC_HEADER_SIZE))
/* Return the size of a malloc allocation, given the user address. */
#define MALLOC_SIZE(addr) (*(size_t *)MALLOC_BASE(addr))
malloc() stores metadata about the allocation in the 8 bytes just before the block it returns. This can be used to determine the size of the buffer, although it depends on the allocator's internal layout and is not portable. On my x86-64 system this always returns a multiple of 16. So if the allocated size is a multiple of 16 (which it is in most cases), then this could be used:
Code
#include <stdio.h>
#include <malloc.h>
int size_of_buff(void *buff) {
return ( *( ( int * ) buff - 2 ) - 17 ); // 32 bit system: ( *( ( int * ) buff - 1 ) - 17 )
}
int main(void) {
char *buff = malloc(1024);
printf("Size of Buffer: %d\n", size_of_buff(buff));
}
Output
Size of Buffer: 1024
This is my approach:
#include <stdio.h>
#include <stdlib.h>
typedef struct _int_array
{
int *number;
int size;
} int_array;
int int_array_append(int_array *a, int n)
{
static char c = 0;
if(!c)
{
a->number = NULL;
a->size = 0;
c++;
}
int *more_numbers = NULL;
a->size++;
more_numbers = (int *)realloc(a->number, a->size * sizeof(int));
if(more_numbers != NULL)
{
a->number = more_numbers;
a->number[a->size - 1] = n;
}
else
{
free(a->number);
printf("Error (re)allocating memory.\n");
return 1;
}
return 0;
}
int main()
{
int_array a;
int_array_append(&a, 10);
int_array_append(&a, 20);
int_array_append(&a, 30);
int_array_append(&a, 40);
int i;
for(i = 0; i < a.size; i++)
printf("%d\n", a.number[i]);
printf("\nLen: %d\nSize: %zu\n", a.size, a.size * sizeof(int));
free(a.number);
return 0;
}
Output:
10
20
30
40
Len: 4
Size: 16
If your compiler supports VLA (variable length array), you can embed the array length into the pointer type.
int n = 10;
int (*p)[n] = malloc(n * sizeof(int));
n = 3;
printf("%zu\n", sizeof(*p)/sizeof(**p));
The output is 10.
You could also choose to embed the information into the allocated memory yourself with a structure including a flexible array member.
struct myarray {
int n;
struct mystruct a[];
};
struct myarray *ma =
malloc(sizeof(*ma) + n * sizeof(struct mystruct));
ma->n = n;
struct mystruct *p = ma->a;
Then to recover the size, you would subtract the offset of the flexible member.
int get_size (struct mystruct *p) {
struct myarray *ma;
char *x = (char *)p;
ma = (void *)(x - offsetof(struct myarray, a));
return ma->n;
}
The problem with trying to peek into heap structures is that the layout might change from platform to platform or from release to release, and so the information may not be reliably obtainable.
Even if you knew exactly how to peek into the meta information maintained by your allocator, the information stored there may have nothing to do with the size of the array. The allocator simply returned memory that could be used to fit the requested size, but the actual size of the memory may be larger (perhaps even much larger) than the requested amount.
The only reliable way to know the information is to find a way to track it yourself.

allocate struct and memory for elements in one malloc

I am sure this is a basic question but I haven't been able to find whether or not this is a legitimate memory allocation strategy or not. I am reading in data from a file and I am filling in a struct. The size of the members are variable on each read so my struct elements are pointers like so
struct data_channel{
char *chan_name;
char *chan_type;
char *chan_units;
};
So before reading, I figure out the size of each string so that I can allocate memory for them. My question is: can I allocate the memory for the struct and the strings all in one malloc and then fill in the pointers?
Say the size of chan_name is 9, chan_type 10, and chan_units 5. I would then allocate the memory and do something like this.
struct data_channel *chan;
chan = malloc(sizeof(struct data_channel) + 9 + 10 + 5);
chan->chan_name = chan[1];
chan->chan_type = chan->chan_name + 9;
chan->chan_units = chan->chan_type + 10;
I read a couple of articles on memory alignment, but I don't know whether doing the above is a problem or what kind of unintended consequences it could have. I have already implemented it in my code and it seems to work fine. I just don't want to have to keep track of all those pointers, because in reality each of my structs has 7 elements and I could have upwards of 100 channels. That of course means 700 pointers plus the pointers for each struct, so 800 in total. Then I also have to devise a way to free them all. I also want to apply this strategy to arrays of strings, for which I then need an array of pointers. I don't have any structures right now that mix data types, but I might in the future; could that be a problem?
If chan_name is an 8-character string, chan_type is a 9-character string and chan_units is a 4-character string, then yes, it will work fine once you fix the compilation error in the assignment to chan_name.
If you allocate enough memory for the structure plus all the strings (including their string terminator) then it's okay to use such a method. Maybe not recommended by all, but it will work.
It depends in part on the element types. You will certainly be able to do it with character strings; with some other types, you have to worry about alignment and padding issues.
struct data_channel
{
char *chan_name;
char *chan_type;
char *chan_units;
};
struct data_channel *chan;
size_t name_size = 9;
size_t type_size = 10;
size_t unit_size = 5;
chan = malloc(sizeof(struct data_channel) + name_size + type_size + unit_size);
if (chan != 0)
{
chan->chan_name = (char *)chan + sizeof(*chan);
chan->chan_type = chan->chan_name + name_size;
chan->chan_units = chan->chan_type + type_size;
}
This will work OK in practice; it was being done for ages before C was standardized. I can't immediately see why the standard would disallow it.
What gets trickier is if you needed to allocate an array of int, say, as well as two strings. Then you have to worry about alignment issues.
struct data_info
{
char *info_name;
int *info_freq;
char *info_unit;
};
size_t name_size = 9;
size_t freq_size = 10;
size_t unit_size = 5;
size_t nbytes = sizeof(struct data_info) + name_size + freq_size * sizeof(int) + unit_size;
struct data_info *info = malloc(nbytes);
if (info != 0)
{
info->info_freq = (int *)((char *)info + sizeof(*info));
info->info_name = (char *)info->info_freq + freq_size * sizeof(int);
info->info_unit = info->info_name + name_size;
}
This has adopted the simple expedient of allocating the most stringently aligned type (the array of int) first, then allocating the strings afterwards. This part is, however, where you have to make judgement calls about portability. I'm confident that the code is portable in practice.
C11 has alignment facilities (_Alignof and _Alignas and <stdalign.h>, plus max_align_t in <stddef.h>) that could alter this answer (but I've not studied them sufficiently so I'm not sure how, yet), but the techniques outlined here will work in any version of C provided you are careful about the alignment of data.
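For what it's worth, here is a minimal sketch of how C11's _Alignof could be used when the int array does not come first in the block. The helper names align_up and data_info_alloc are made up for illustration, the layout (strings first, ints last) is an assumption, and the size arithmetic is not checked for overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Round offset up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

struct data_info
{
    char *info_name;
    int  *info_freq;
    char *info_unit;
};

/* Hypothetical helper: one malloc for the struct, both strings and the int
   array. The strings are placed first, so the offset of the int array has
   to be rounded up to a multiple of _Alignof(int). */
static struct data_info *data_info_alloc(size_t name_size, size_t unit_size,
                                         size_t freq_count)
{
    size_t name_off = sizeof(struct data_info);
    size_t unit_off = name_off + name_size;
    size_t freq_off = align_up(unit_off + unit_size, _Alignof(int));
    size_t nbytes   = freq_off + freq_count * sizeof(int);

    struct data_info *info = malloc(nbytes);
    if (info != NULL)
    {
        char *base = (char *)info;  /* malloc result is suitably aligned */
        info->info_name = base + name_off;
        info->info_unit = base + unit_off;
        info->info_freq = (int *)(base + freq_off);
    }
    return info;
}
```

Because malloc returns memory aligned for any fundamental type, aligning the offset from the start of the block is enough to align the resulting address.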
Note that if you have a single array in the structure, then C99 provides an alternative to the older 'struct hack' called a flexible array member (FAM). This allows you to have an array explicitly as the last element of the structure.
struct data_info
{
char *info_name;
char *info_units;
int info_freq[];
};
size_t name_size = 9;
size_t freq_size = 10;
size_t unit_size = 5;
size_t nbytes = sizeof(struct data_info) + name_size + freq_size * sizeof(int) + unit_size;
struct data_info *info = malloc(nbytes);
if (info != 0)
{
info->info_name = ((char *)info + sizeof(*info) + freq_size * sizeof(int));
info->info_units = info->info_name + name_size;
}
Note that there was no step to initialize the FAM, info_freq in this example. You cannot have multiple arrays like this.
Note that the techniques outlined cannot readily be applied to arrays of structures (at least, arrays of the outer structure). If you go to considerable effort, you can make it work. Also, beware of realloc(); if you reallocate space, you have to fix up the pointers if the data has moved.
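To illustrate the realloc() caveat, here is a sketch of one way to fix up the interior pointers: record their offsets from the start of the block before the call and rebuild them afterwards. chan_alloc and chan_resize are hypothetical names, and the struct is the three-pointer data_channel from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct data_channel
{
    char *chan_name;
    char *chan_type;
    char *chan_units;
};

/* Single-block layout: the struct first, then the three strings. */
static struct data_channel *chan_alloc(size_t name_size, size_t type_size,
                                       size_t unit_size)
{
    struct data_channel *chan =
        malloc(sizeof(*chan) + name_size + type_size + unit_size);
    if (chan != NULL)
    {
        chan->chan_name  = (char *)chan + sizeof(*chan);
        chan->chan_type  = chan->chan_name + name_size;
        chan->chan_units = chan->chan_type + type_size;
    }
    return chan;
}

/* Hypothetical helper: resize the whole block to new_total bytes.
   Returns NULL on failure, in which case the old block is still valid. */
static struct data_channel *chan_resize(struct data_channel *chan,
                                        size_t new_total)
{
    assert(chan != NULL && new_total >= sizeof(*chan));

    /* Save offsets first: the old pointers must not be used after realloc. */
    size_t name_off = (size_t)(chan->chan_name  - (char *)chan);
    size_t type_off = (size_t)(chan->chan_type  - (char *)chan);
    size_t unit_off = (size_t)(chan->chan_units - (char *)chan);

    struct data_channel *moved = realloc(chan, new_total);
    if (moved == NULL)
        return NULL;

    char *base = (char *)moved;
    moved->chan_name  = base + name_off;
    moved->chan_type  = base + type_off;
    moved->chan_units = base + unit_off;
    return moved;
}
```

An alternative design would be to store the offsets in the struct instead of pointers, so that nothing needs fixing up after a move.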
One other point: especially on 64-bit machines, if the sizes of the strings are uniform enough, you'd probably do better allocating the arrays in the structure, instead of using the pointers.
struct data_channel
{
char chan_name[16];
char chan_type[16];
char chan_units[8];
};
This occupies 40 bytes. On a 64-bit machine, the original data structure would occupy 24 bytes for the three pointers and another 24 bytes for the (9 + 10 + 5) bytes of data, for a total of 48 bytes allocated.
I know there is a sure way to do this when you have ONE array at the end of a structure, but since all your arrays have the same type, you may be in luck. The sure method is:
#include <stddef.h>
#include <stdlib.h>
struct StWithArray
{
int blahblah;
float arr[1];
};
struct StWithArray * AllocWithArray(size_t nb)
{
size_t size = nb*sizeof(float) + offsetof(struct StWithArray, arr);
return malloc(size);
}
The use of an actual array in the structure guarantees alignment is respected.
Now to apply it to your case:
#include <stddef.h>
#include <stdlib.h>
struct data_channel
{
char *chan_name;
char *chan_type;
char *chan_units;
char actualCharArray[1];
};
struct data_channel * AllocDataChannel(size_t nb)
{
size_t size = nb*sizeof(char) + offsetof(struct data_channel, actualCharArray);
return malloc(size);
}
struct data_channel * CreateDataChannel(size_t length1, size_t length2, size_t length3)
{
struct data_channel * pt = AllocDataChannel(length1 + length2 + length3);
if(pt != NULL)
{
pt->chan_name = &pt->actualCharArray[0];
pt->chan_type = &pt->actualCharArray[length1];
pt->chan_units = &pt->actualCharArray[length1+length2];
}
return pt;
}
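As a rough sketch of how this might look in use, here is a variant that uses a C99 flexible array member in place of the one-element array and copies the strings in. The names channel_create and storage are made up for illustration. The point is that a single free() releases the struct and all three strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct data_channel
{
    char *chan_name;
    char *chan_type;
    char *chan_units;
    char storage[];  /* C99 flexible array member holding all three strings */
};

/* Hypothetical helper: build a channel from three NUL-terminated strings
   with a single malloc. */
static struct data_channel *channel_create(const char *name, const char *type,
                                           const char *units)
{
    assert(name != NULL && type != NULL && units != NULL);

    size_t name_size  = strlen(name) + 1;   /* +1 for the terminator */
    size_t type_size  = strlen(type) + 1;
    size_t units_size = strlen(units) + 1;

    struct data_channel *ch =
        malloc(sizeof(*ch) + name_size + type_size + units_size);
    if (ch != NULL)
    {
        ch->chan_name  = ch->storage;
        ch->chan_type  = ch->chan_name + name_size;
        ch->chan_units = ch->chan_type + type_size;
        memcpy(ch->chan_name,  name,  name_size);
        memcpy(ch->chan_type,  type,  type_size);
        memcpy(ch->chan_units, units, units_size);
    }
    return ch;
}
```

A caller would write something like struct data_channel *ch = channel_create("pressure", "analog", "kPa"); and later free(ch); with no per-string cleanup.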
Joachim's and Jonathan's answers are nice. The only addition I would like to mention is this.
Separate mallocs and frees buy you some basic protection against things like buffer overruns and access after free. I mean basic protection, not Valgrind-like features. Allocating one single chunk and doling it out internally loses this.
In future, if the mallocs are for totally different sizes, then separate mallocs may buy you the efficiency of coming from different allocation buckets inside the malloc implementation, especially if you are going to free them at different times.
The last thing to consider is how frequently you call malloc. If the calls are frequent, the cost of multiple mallocs can add up.

keeping track of how much memory malloc has allocated

After a quick scan of related questions on SO, I have deduced that there's no function that would check the amount of memory that malloc has allocated to a pointer. I'm trying to replicate some of std::string basic functionality (mainly dynamic size) using simple char*'s in C and don't want to call realloc all the time. I guess I'll need to keep track of how much memory has been allocated. In order to do that, I'm considering creating a typedef that will contain the string itself and an integer with the amount of memory currently allocated, something like this:
typedef struct {
char * str;
int mem;
} my_string_t;
Is that an optimal solution, or perhaps you can suggest something that will bear better results? Thanks in advance for your help.
You will want to allocate the space for both the length and the string in the same block of memory. This may be what you intended with your struct, but you have reserved space for only a pointer to the string.
There must be space allocated to contain the characters of the string.
For example:
typedef struct
{
int num_chars;
char string[];
} my_string_t;
my_string_t * alloc_my_string(char *src)
{
my_string_t * p = NULL;
int N_chars = strlen(src) + 1;
p = malloc( N_chars + sizeof(my_string_t));
if (p)
{
p->num_chars = N_chars;
strcpy(p->string, src);
}
return p;
}
In my example, to access the pointer to your string, you address the string member of the my_string_t:
my_string_t * p = alloc_my_string("hello free store.");
printf("String of %d bytes is '%s'\n", p->num_chars, p->string);
Be careful to realize that you are obtaining the pointer for the string as a consequence of allocating space to store the characters. The resource you are allocating is the storage for the characters, the pointer obtained is a reference to the allocated storage.
In my example, the memory allocated is laid out sequentially as follows:
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| 00 | 00 | 00 | 12 | 'h'| 'e'| 'l'| 'l'| 'o'| 20 | 'f'| 'r'| 'e'| 'e'| 20 | 's'| 't'| 'o'| 'r'| 'e'| '.'| 00 |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
^^                   ^
||                   |
p|                   |
 p->num_chars        p->string
Notice that the value of p->string is not stored in the allocated memory; the array simply begins four bytes from the start of the allocation, immediately after the (presumed 32-bit, four-byte) integer.
Your compiler may require that you declare the flexible C array as:
typedef struct
{
int num_chars;
char string[0];
} my_string_t;
but the version lacking the zero is the form standardized in C99 (the zero-length form is a compiler extension).
You can accomplish the equivalent thing with no array member as follows:
typedef struct
{
int num_chars;
} mystr2;
char * str_of_mystr2(mystr2 * ms)
{
return (char *)(ms + 1);
}
mystr2 * alloc_mystr2(char *src)
{
mystr2* p = NULL;
size_t N_chars = strlen(src) + 1;
if (N_chars <= INT_MAX && (p = malloc(N_chars + sizeof(mystr2))) != NULL)
{
p->num_chars = (int)N_chars;
strcpy(str_of_mystr2(p), src);
}
return p;
}
printf("String of %d bytes is '%s'\n", p->num_chars, str_of_mystr2 (p));
In this second example, the value equivalent to p->string is calculated by str_of_mystr2(). It will have approximately the same value as the first example, depending on how the end of structs are packed by your compiler settings.
While some would suggest tracking the length in a size_t, I disagree, and I would point to an old Dr. Dobb's article for the reasons. Supporting values greater than INT_MAX is of doubtful value to your program's correctness. By using an int, you can write assert(p->num_chars >= 0); and have that test something. With an unsigned type, you would write the equivalent test as something like assert(p->num_chars < UINT_MAX / 2); As long as you write code which contains checks on run-time data, using a signed type can be useful.
On the other hand, if you are writing a library which handles strings in excess of UINT_MAX / 2 characters, I salute you.
This is the obvious solution. And while you are at it, you might want to have a struct member that maintains the amount of allocated memory actually in use. This will avoid having to call strlen() all the time, and would enable you to support non null-terminated strings, as the C++ std::string class does.
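A minimal sketch of that idea, assuming a struct with separate length and capacity fields and a doubling growth policy so that realloc is not called on every append. The field names len and cap, the function my_string_append, and the starting capacity of 16 are all made up for illustration; size_t is used here for simplicity, and overflow of the capacity is not checked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char  *str;  /* NUL-terminated whenever it is non-NULL */
    size_t len;  /* bytes in use, not counting the terminator */
    size_t cap;  /* bytes currently allocated */
} my_string_t;

/* Append src to s, growing the buffer geometrically when needed.
   Returns 0 on success, -1 if realloc fails (s is left unchanged). */
static int my_string_append(my_string_t *s, const char *src)
{
    assert(s != NULL && src != NULL);

    size_t add  = strlen(src);
    size_t need = s->len + add + 1;

    if (need > s->cap)
    {
        size_t new_cap = s->cap ? s->cap : 16;
        while (new_cap < need)
            new_cap *= 2;

        char *grown = realloc(s->str, new_cap);  /* realloc(NULL, n) acts like malloc */
        if (grown == NULL)
            return -1;
        s->str = grown;
        s->cap = new_cap;
    }

    memcpy(s->str + s->len, src, add + 1);  /* copies the terminator too */
    s->len += add;
    return 0;
}
```

Start from my_string_t s = {NULL, 0, 0}; and release the buffer with free(s.str); when done.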
That is how it was done in the Pleistocene, and that's how you should do it today. You are dead on the money that malloc does not offer any portable, supported, mechanism to query the size of an allocated block.
A more common way is to wrap malloc (and realloc) and keep a list of sizes and pointers.
That way you don't need to change any string functions.
Write wrapper functions. If you are using malloc then you should do that anyway.
For an example, look in "Writing Solid Code".
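A rough sketch of what such wrappers could look like. Note that this is a swapped-in variant: instead of keeping a separate list of sizes and pointers, it stores the requested size in a small header just before the block handed to the caller. The tracked_* names and the block_header type are hypothetical, and max_align_t requires C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored immediately before each user block. The union pads it to
   the strictest fundamental alignment so the user pointer stays aligned. */
typedef union
{
    size_t      size;
    max_align_t align;
} block_header;

static void *tracked_malloc(size_t size)
{
    block_header *h = malloc(sizeof(*h) + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;  /* user memory starts right after the header */
}

static void *tracked_realloc(void *ptr, size_t size)
{
    block_header *old = ptr ? (block_header *)ptr - 1 : NULL;
    block_header *h = realloc(old, sizeof(*h) + size);
    if (h == NULL)
        return NULL;  /* the old block is still valid */
    h->size = size;
    return h + 1;
}

/* Size that was requested for ptr, which must come from tracked_malloc
   or tracked_realloc. */
static size_t tracked_size(const void *ptr)
{
    assert(ptr != NULL);
    return ((const block_header *)ptr - 1)->size;
}

static void tracked_free(void *ptr)
{
    if (ptr != NULL)
        free((block_header *)ptr - 1);
}
```

Pointers from these wrappers must only be released with tracked_free, never with plain free, since the real allocation starts at the header.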
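I think you could use malloc_usable_size. Note that it is a glibc extension (declared in <malloc.h>), not standard C, and it reports the usable size of the block, which may be larger than the size you requested.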

Resources