What is the opposite of calloc in C - c

It is more than a funny question. :-)
I wish to initialize an array in C, but instead of zeroing out the array with calloc, I want to set every element to one. Is there a single function that does just that?
I searched Google using the question above but found no answer. I hope you can help me out! FYI, I am a first-year CS student just starting to program in C.

There isn't a standard C memory allocation function that allows you to specify a value other than 0 that the allocated memory is initialized to.
You could easily enough write a cover function to do the job:
void *set_alloc(size_t nbytes, char value)
{
void *space = malloc(nbytes);
if (space != 0)
memset(space, value, nbytes);
return space;
}
Note that this assumes you want to set each byte to the same value. If you have a more complex initialization requirement, you'll need a more complex function. For example:
void *set_alloc2(size_t nelems, size_t elemsize, void *initializer)
{
void *space = malloc(nelems * elemsize);
if (space != 0)
{
for (size_t i = 0; i < nelems; i++)
memmove((char *)space + i * elemsize, initializer, elemsize);
}
return space;
}
Example usage:
struct Anonymous
{
double d;
int i;
short s;
char t[2];
};
struct Anonymous a = { 3.14159, 23, -19, "A" };
struct Anonymous *b = set_alloc2(20, sizeof(struct Anonymous), &a);

memset is there for you:
memset(array, value, length);

There is no such function. You can implement it yourself with a combination of malloc() and either memset() (for character data) or a for loop (for other integer data).
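A minimal sketch of the "malloc() plus a for loop" approach for an int array, with a contrast case showing why memset() is only suitable for byte-sized data. The names alloc_ints_filled and memset_one_gives_one are hypothetical, not standard functions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate n ints and set every element to value.
   Returns NULL on failure; the caller releases the result with free(). */
int *alloc_ints_filled(size_t n, int value)
{
    int *a = malloc(n * sizeof *a);   /* overflow check omitted for brevity */
    if (a != NULL) {
        for (size_t i = 0; i < n; i++)
            a[i] = value;             /* element-wise, not byte-wise */
    }
    return a;
}

/* For contrast: memset() stores the byte 0x01 in every byte of the int,
   so the int does not hold the value 1 when sizeof(int) > 1. */
int memset_one_gives_one(void)
{
    int x;
    memset(&x, 1, sizeof x);
    return x == 1;
}
```

Calling alloc_ints_filled(5, 1) yields five ints that each equal 1, whereas memset_one_gives_one() reports 0 on platforms where an int is wider than one byte.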
The impetus for the calloc() function's existence (vs. malloc() + memset()) is that it can be a nice performance optimization in some cases. If you're allocating a lot of data, the OS might be able to give you a range of virtual addresses that are already initialized to zero, which saves you the extra cost of manually writing out 0's into that memory range. This can be a large performance gain because you don't need to page all of those pages in until you actually use them.
Under the hood, calloc() might look something like this:
void *calloc(size_t count, size_t size)
{
// Error checking omitted for expository purposes
size_t total_size = count * size;
if (total_size < SOME_THRESHOLD) // e.g. the OS's page size (typically 4 KB)
{
// For small allocations, allocate from normal malloc pool
void *mem = malloc(total_size);
memset(mem, 0, total_size);
return mem;
}
else
{
// For large allocations, allocate directly from the OS, already zeroed (!)
return mmap(NULL, total_size, PROT_READ|PROT_WRITE, MAP_ANON|MAP_PRIVATE, -1, 0);
// Or on Windows, use VirtualAlloc()
}
}

Related

How to concat byte arrays in C

My current concat function:
char* concat(char* a, int a_size,
char* b, int b_size) {
char* c = malloc(a_size + b_size);
memcpy(c, a, a_size);
memcpy(c + a_size, b, b_size);
free(a);
free(b);
return c;
}
But this uses extra memory. Is it possible to append two byte arrays using realloc without allocating extra memory?
Like:
void append(char* a, int a_size, char* b, int b_size)
...
char* a = malloc(2);
char* b = malloc(2);
append(a, 2, b, 2);
//The size of a will be 4.
While Jean-François Fabre answered the stated question, I'd like to point out that you can manage such byte arrays better by using a structure:
typedef struct {
size_t max; /* Number of chars allocated for */
size_t len; /* Number of chars in use */
unsigned char *data;
} bytearray;
#define BYTEARRAY_INIT { 0, 0, NULL }
void bytearray_init(bytearray *barray)
{
barray->max = 0;
barray->len = 0;
barray->data = NULL;
}
void bytearray_free(bytearray *barray)
{
free(barray->data);
barray->max = 0;
barray->len = 0;
barray->data = NULL;
}
To declare an empty byte array, you can use either bytearray myba = BYTEARRAY_INIT; or bytearray myba; bytearray_init(&myba);. The two are equivalent.
When you no longer need the array, call bytearray_free(&myba);. Note that free(NULL) is safe and does nothing, so it is perfectly safe to free a bytearray that you have initialized, but not used.
To append to a bytearray:
int bytearray_append(bytearray *barray, const void *from, const size_t size)
{
if (barray->len + size > barray->max) {
const size_t len = barray->len + size;
size_t max;
void *data;
/* Example policy: */
if (len < 8)
max = 8; /* At least 8 chars, */
else
if (len < 4194304)
max = (3*len) / 2; /* grow by 50% up to 4,194,304 bytes, */
else
max = (len | 2097151) + 2097153 - 24; /* then pad to next multiple of 2,097,152 sans 24 bytes. */
data = realloc(barray->data, max);
if (!data) {
/* Not enough memory available. Old data is still valid. */
return -1;
}
barray->max = max;
barray->data = data;
}
/* Copy appended data; we know there is room now. */
memmove(barray->data + barray->len, from, size);
barray->len += size;
return 0;
}
Since this function can at least theoretically fail to reallocate memory, it will return 0 if successful, and nonzero if it cannot reallocate enough memory.
There is no need for a malloc() call, because realloc(NULL, size) is exactly equivalent to malloc(size).
The "growth policy" is a very debatable issue. You can just make max = barray->len + size, and be done with it. However, dynamic memory management functions are relatively slow, so in practice, we don't want to call realloc() for every small little addition.
The above policy tries to do something better, but not too aggressive: it always allocates at least 8 characters, even if less is needed. Up to 4,194,304 characters, it allocates 50% extra. Above that, it rounds the allocation size to the next multiple of 2,097,152 and subtracts 24. The reasoning behind this is complex, but it is more for illustration and understanding than anything else; it is definitely NOT "this is best, and this is what you should do too". This policy ensures that each byte array allocates at most 4,194,304 = 2^22 unused characters. However, 2,097,152 = 2^21 is the size of a huge page on AMD64 (x86-64), and is a power-of-two multiple of a native page size on basically all architectures. It is also large enough to switch from so-called sbrk() allocation to memory mapping on basically all architectures that do that. It means that such huge allocations use a separate part of the heap for each, and the unused part is usually just virtual memory, not necessarily backed by any RAM, until accessed. As a result, this policy tends to work quite well for both very short byte arrays, and very long byte arrays, on most architectures.
Of course, if you know (or measure!) the typical size of the byte arrays in typical workloads, you can optimize the growth policy for that, and get even better results.
Finally, it uses memmove() instead of memcpy(), just in case someone wishes to repeat a part of the same byte array: memcpy() only works if the source and target areas do not overlap; memmove() works even in that case.
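A minimal usage sketch of the pointer-based byte array described above. To stay short and self-contained it repeats the structure and replaces the growth policy with plain doubling (an assumption made only for brevity); bytearray_demo is a hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t max;            /* bytes allocated */
    size_t len;            /* bytes in use */
    unsigned char *data;
} bytearray;

/* Simplified append with the same contract as above
   (0 on success, -1 on failure), using a doubling policy. */
static int bytearray_append(bytearray *barray, const void *from, size_t size)
{
    if (barray->len + size > barray->max) {
        size_t max = barray->max ? barray->max : 8;
        while (max < barray->len + size)
            max *= 2;
        void *data = realloc(barray->data, max);
        if (!data)
            return -1;     /* old contents remain valid */
        barray->max = max;
        barray->data = data;
    }
    memmove(barray->data + barray->len, from, size);
    barray->len += size;
    return 0;
}

/* Builds "hello, world" from two pieces and returns the final length,
   or 0 if an allocation failed. */
size_t bytearray_demo(void)
{
    bytearray ba = { 0, 0, NULL };   /* same as BYTEARRAY_INIT */
    size_t len = 0;
    if (bytearray_append(&ba, "hello, ", 7) == 0 &&
        bytearray_append(&ba, "world", 5) == 0)
        len = ba.len;
    free(ba.data);                   /* safe even if nothing was allocated */
    return len;
}
```

The caller never calls malloc() directly: the first append allocates through realloc(NULL, ...), and a single free() of the data pointer releases everything.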
When using more advanced data structures, like hash tables, a variant of the above structure is often useful. (That is, this is much better in cases where you have lots of empty byte arrays.)
Instead of having a pointer to the data, the data is part of the structure itself, as a C99 flexible array member:
typedef struct {
size_t max;
size_t len;
unsigned char data[];
} bytearray;
You cannot declare a byte array itself (i.e. bytearray myba; will not work); you always declare a pointer to such a byte array: bytearray *myba = NULL;. A NULL pointer is treated the same as an empty byte array.
In particular, to see how many data items such an array has, you use an accessor function (also defined in the same header file as the data structure), rather than myba.len:
static inline size_t bytearray_len(bytearray *const barray)
{
return (barray) ? barray->len : 0;
}
static inline size_t bytearray_max(bytearray *const barray)
{
return (barray) ? barray->max : 0;
}
The (expression) ? (if-true) : (if-false) is a ternary operator. In this case, the first function is exactly equivalent to
static inline size_t bytearray_len(bytearray *const barray)
{
if (barray)
return barray->len;
else
return 0;
}
If you wonder about the bytearray *const barray, remember that pointer declarations are read from right to left, with * as "a pointer to". So, it just means that barray is constant, a pointer to a byte array. That is, we may change the data it points to, but we won't change the pointer itself. Compilers can usually detect such stuff themselves, but it may help; the main point is however to remind us human programmers that the pointer itself is not to be changed. (Such changes would only be visible within the function itself.)
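The difference between a constant pointer and a pointer to constant data can be sketched as follows. The type thing and both functions are hypothetical stand-ins used only for illustration.

```c
#include <stddef.h>

typedef struct { size_t len; } thing;   /* hypothetical stand-in type */

/* t is a constant pointer to a modifiable thing. */
void set_len(thing *const t, size_t n)
{
    t->len = n;          /* allowed: the pointed-to object is not const */
    /* t = NULL; */      /* would not compile: t itself is const */
}

/* t is a modifiable pointer to a constant thing. */
size_t get_len(const thing *t)
{
    /* t->len = 0; */    /* would not compile: the pointed-to object is const */
    return t->len;
}
```

Reading right to left, thing *const t is "t is a const pointer to thing", while const thing *t is "t is a pointer to a thing that is const".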
Since such arrays often need to be resized, the resizing is often put into a separate helper function:
bytearray *bytearray_resize(bytearray *const barray, const size_t len)
{
bytearray *temp;
if (!len) {
free(barray);
errno = 0;
return NULL;
}
if (!barray) {
temp = malloc(sizeof (bytearray) + len * sizeof barray->data[0]);
if (!temp) {
errno = ENOMEM;
return NULL;
}
temp->max = len;
temp->len = 0;
return temp;
}
if (barray->len > len)
barray->len = len;
if (barray->max == len)
return barray;
temp = realloc(barray, sizeof (bytearray) + len * sizeof barray->data[0]);
if (!temp) {
free(barray);
errno = ENOMEM;
return NULL;
}
temp->max = len;
return temp;
}
What does that errno = 0 do in there? The idea is that because resizing/reallocating a byte array may change the pointer, we return the new one. If the allocation fails, we return NULL with errno == ENOMEM, just like malloc()/realloc() do. However, since the desired new length was zero, this saves memory by freeing the old byte array if any, and returns NULL. But since that is not an error, we set errno to zero, so that it is easier for callers to check if an error occurred or not. (If the function returns NULL, check errno. If errno is nonzero, an error occurred; you can use strerror(errno) to get a descriptive error message.)
You probably also noted the sizeof barray->data[0], used even when barray is NULL. This is okay, because sizeof is not a function, but an operator: it does not access the right side at all, it only evaluates to the size of the thing the right side refers to. (You only need to use parentheses when the right side is a type.) This form is nice, because it lets a programmer change the type of the data member, without changing any other code.
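A small sketch of that sizeof behaviour, assuming a hypothetical struct item: the operand mentions a NULL pointer, but nothing is dereferenced because only the operand's type is inspected.

```c
#include <stddef.h>

struct item { double values[4]; };   /* hypothetical type for illustration */

size_t element_size(void)
{
    struct item *p = NULL;
    /* p is never dereferenced here: sizeof only looks at the type of
       p->values[0], which is double. */
    return sizeof p->values[0];
}
```

If the member type is later changed from double to something else, element_size() follows automatically without any other edit.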
To append data to such a byte array, we probably want to be able to specify whether we anticipate further appends to the same array, or whether this is probably the final append, so that only the exact needed amount of memory is allocated. The function below takes a flag for this. Note that this function returns a pointer to the (modified) byte array:
bytearray *bytearray_append(bytearray *barray,
const void *from, const size_t size,
int exact)
{
size_t len = bytearray_len(barray) + size;
if (exact) {
barray = bytearray_resize(barray, len);
if (!barray)
return NULL; /* errno already set by bytearray_resize(). */
} else
if (bytearray_max(barray) < len) {
if (!exact) {
/* Apply growth policy */
if (len < 8)
len = 8;
else
if (len < 4194304)
len = (3 * len) / 2;
else
len = (len | 2097151) + 2097153 - 24;
}
barray = bytearray_resize(barray, len);
if (!barray)
return NULL; /* errno already set by the bytearray_resize() call */
}
if (size) {
memmove(barray->data + barray->len, from, size);
barray->len += size;
}
return barray;
}
This time, we declared bytearray *barray, because we change where barray points to in the function. If the fourth parameter, exact, is nonzero, then the resulting byte array is exactly the size needed; otherwise the growth policy is applied.
Yes, since realloc will preserve the start of your buffer if the new size is bigger:
char* concat(char* a, size_t a_size,
char* b, size_t b_size) {
char* c = realloc(a, a_size + b_size);
if (c == NULL)
return NULL; // allocation failed: a and b are untouched and still owned by the caller
memcpy(c + a_size, b, b_size); // dest is after "a" data, source is b with b_size
free(b);
return c;
}
c may be different from a (if the original memory block cannot be resized in-place contiguously to the new size by the system) but if that's the case, the location pointed by a will be freed (you must not free it), and the original data will be "moved".
My advice is to warn the users of your function that the input buffers must be allocated using malloc, else it will crash badly.
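A hedged usage sketch of the realloc-based approach. concat_checked is a hypothetical variant that spells out the ownership rules: on success both inputs are consumed, on failure neither is touched.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant: on success, a and b are consumed and the result
   must be freed by the caller; on failure, NULL is returned and both
   a and b remain valid and owned by the caller. Both inputs must come
   from malloc(), calloc() or realloc(). */
char *concat_checked(char *a, size_t a_size, char *b, size_t b_size)
{
    char *c = realloc(a, a_size + b_size);
    if (c == NULL)
        return NULL;
    memcpy(c + a_size, b, b_size);   /* append b after the preserved a data */
    free(b);
    return c;
}
```

A caller would write something like c = concat_checked(a, 2, b, 2); and, if c is not NULL, stop using a and b and eventually call free(c).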

realloc of array inside a struct

I'm trying to write a function that uses realloc() to extend the array pointed to within an instance of a struct, but I can't seem to get it to work.
The relevant part of my code is:
struct data_t {
int data_size;
uint16_t *data;
};
void extend_data(data_t container, uint16_t value) {
// adds an additional uint16_t to the array of DATA, updates its internal
// variables, and initialises the new uint to VALUE.
int len_data = sizeof(*(container->data)) / sizeof(uint16_t);
printf("LENGTH OF DATA: %d\n", len_data);
container->data = realloc(container->data, sizeof(*(container->data))+sizeof(uint16_t));
container->data_size++;
container->data[container->data_size-1] = value;
len_data = sizeof(*(container->data)) / sizeof(uint16_t);
printf("LENGTH OF DATA: %d\n", len_data);
printf("data_size: %d\n", container->data_size);
return;
}
Can anybody see what the problem is with this?
Edit
As R. Sahu points out, container is not a pointer in this function - when you said the code "wasn't working", I assumed you meant that you weren't growing your array, but what you've written here won't even compile.
Are you sure you've copied this code correctly? If so, does "not working" mean you're getting a compile-time error, a run-time error, or just unexpected output?
If you've copied the code as written, then the first thing you need to do is change the function prototype to
void extend_data(data_t *container, uint16_t value) {
and make sure you're passing a pointer to your data_t type, otherwise the update won't be reflected in calling code.
Original
In the line
container->data = realloc(container->data, sizeof(*(container->data))+sizeof(uint16_t));
sizeof(*(container->data)) evaluates to sizeof (uint16_t). container->data is a pointer to, not an array of, uint16_t; sizeof will give you the size of a single pointed-to element, not the number of elements you've allocated. What you want to do is something like the following:
/**
* Don't assign the result of a realloc call back to the original
* pointer - if the call fails, realloc will return NULL and you'll
* lose the reference to your original buffer. Assign the result to
* a temporary, then after making sure the temporary is not NULL,
* assign that back to your original pointer.
*/
uint16_t *tmp = realloc(container->data, sizeof *container->data * (container->data_size + 1) );
if ( tmp )
{
/**
* Only add to container->data and update the value of container->data_size
* if the realloc call succeeded.
*/
container->data = tmp;
container->data[container->data_size++] = value;
}
You don't calculate the new size correctly. Consider this:
typedef struct {
size_t size;
int *data;
} int_array;
#define INT_ARRAY_INIT { 0, NULL}
void int_array_resize(int_array *const array,
const size_t newsize)
{
if (!array) {
fprintf(stderr, "int_array_resize(): NULL int_array.\n");
exit(EXIT_FAILURE);
}
if (!newsize) {
free(array->data);
array->data = 0;
array->size = 0;
} else
if (newsize != array->size) {
void *temp;
temp = realloc(array->data, newsize * sizeof array->data[0]);
if (!temp) {
fprintf(stderr, "int_array_resize(): Out of memory.\n");
exit(EXIT_FAILURE);
}
array->data = temp;
array->size = newsize;
}
}
/* int_array my_array = INT_ARRAY_INIT;
is equivalent to
int_array my_array;
int_array_init(&my_array);
*/
void int_array_init(int_array *const array)
{
if (array) {
array->size = 0;
array->data = NULL;
}
}
void int_array_free(int_array *const array)
{
if (array) {
free(array->data);
array->size = 0;
array->data = NULL;
}
}
The key point is newsize * sizeof array->data[0]. This is the number of chars needed for newsize elements of whatever type array->data[0] has. Both malloc() and realloc() take the size in chars.
If you initialize new structures of that type using int_array my_array = INT_ARRAY_INIT; you can just call int_array_resize() to resize it. (realloc(NULL, size) is equivalent to malloc(size); free(NULL) is safe and does nothing.)
The int_array_init() and int_array_free() are just helper functions to initialize and free such arrays.
Personally, whenever I have dynamically resized arrays, I keep both the allocated size (size) and the size used (used):
typedef struct {
size_t size; /* Number of elements allocated for */
size_t used; /* Number of elements used */
int *data;
} int_array;
#define INT_ARRAY_INIT { 0, 0, NULL }
A function that ensures there are at least need elements that can be added is then particularly useful. To avoid unnecessary reallocations, the function implements a policy that calculates the new size to allocate for, as a balance between amount of memory "wasted" (allocated but not used) and number of potentially slow realloc() calls:
void int_array_need(int_array *const array,
const size_t need)
{
size_t size;
void *data;
if (!array) {
fprintf(stderr, "int_array_need(): NULL int_array.\n");
exit(EXIT_FAILURE);
}
/* Large enough already? */
if (array->size >= array->used + need)
return;
/* Start with the minimum size. */
size = array->used + need;
/* Apply growth/reallocation policy. This is mine. */
if (size < 256)
size = (size | 15) + 1;
else
if (size < 2097152)
size = (3 * size) / 2;
else
size = (size | 1048575) + 1048577 - 8;
/* TODO: Verify (size * sizeof array->data[0]) does not overflow. */
data = realloc(array->data, size * sizeof array->data[0]);
if (!data) {
/* Fallback: Try minimum allocation. */
size = array->used + need;
data = realloc(array->data, size * sizeof array->data[0]);
}
if (!data) {
fprintf(stderr, "int_array_need(): Out of memory.\n");
exit(EXIT_FAILURE);
}
array->data = data;
array->size = size;
}
There are many opinions on what kind of reallocation policy you should use, but it really depends on the use case.
There are three things in the balance: number of realloc() calls, as they might be "slow"; memory fragmentation if different arrays are grown requiring many realloc() calls; and amount of memory allocated but not used.
My policy above tries to do many things at once. For small allocations (up to 256 elements), it rounds the size up to the next multiple of 16. That is my attempt at a good balance between memory used for small arrays, and not very many realloc() calls.
For larger allocations, 50% is added to the size. This reduces the number of realloc() calls, while keeping the allocated but unused/unneeded memory below 50%.
For really large allocations, when you have 2^21 elements or more, the size is rounded up to the next multiple of 2^20, less a few elements. This caps the number of allocated but unused elements to about 2^21, or two million elements.
(Why less a few elements? Because it does not harm on any systems, and on certain systems it may help a lot. Some systems, including x86-64 (64-bit Intel/AMD) on certain operating systems and configurations, support large ("huge") pages that can be more efficient in some ways than normal pages. If they are used to satisfy an allocation, I want to avoid the case where an extra large page is allocated just to cater for the few bytes the C library needs internally for the allocation metadata.)
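To make the three tiers concrete, the size calculation from int_array_need() can be isolated into a pure function. grow_policy is a hypothetical name; the arithmetic is copied from the code above.

```c
#include <stddef.h>

/* Returns the number of elements to allocate when at least
   size elements are required. */
size_t grow_policy(size_t size)
{
    if (size < 256)
        return (size | 15) + 1;                 /* small: round up to a multiple of 16 */
    else if (size < 2097152)
        return (3 * size) / 2;                  /* medium: add 50% */
    else
        return (size | 1048575) + 1048577 - 8;  /* large: 2^20 granularity, less 8 */
}
```

For example, a request for 1 element allocates 16, 100 allocates 112, 256 allocates 384, and 2097152 allocates 4194296.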
It appears you aren't using sizeof correctly. In your struct you've defined a uint16_t pointer, not an array. The size of the uint16_t* data type is the size of a pointer on your system. You need to store the size of the allocated memory along with the pointer if you want to be able to accurately resize it. It appears you already have a field for this with data_size. Your example could be fixed as follows:
// I was unsure of the typedef-ing happening with data_t so I made it more explicit in this example
typedef struct {
int data_size;
uint16_t* data;
} data_t;
void extend_data(data_t* container, uint16_t value) {
// adds an additional uint16_t to the array of DATA, updates its internal
// variables, and initialises the new uint to VALUE.
// CURRENT LENGTH OF DATA
int len_data = container->data_size * sizeof(uint16_t);
printf("LENGTH OF DATA: %d\n", len_data);
uint16_t* tmp = realloc(container->data, (container->data_size + 1) * sizeof(uint16_t));
if (tmp) {
// realloc could fail and return NULL.
// If this is not handled it could overwrite the pointer in `container` and cause a memory leak
container->data = tmp;
container->data_size++;
container->data[container->data_size-1] = value;
} else {
// Handle allocation failure
}
len_data = container->data_size * sizeof(uint16_t);
printf("LENGTH OF DATA: %d\n", len_data);
printf("data_size: %d\n", container->data_size);
return;
}
void extend_data(data_t container, ...
In your function, container is not a pointer but the struct itself, passed by value, so you can't use the -> operator.
The realloc'ed memory will be lost, because you work on a local copy of the passed structure and that copy is discarded when the function returns.
sizeof(*(container.data)) / sizeof(uint16_t)
This will always be 1, because it is sizeof(uint16_t) / sizeof(uint16_t).
Why: the data member is a pointer to uint16_t, so *data has the type uint16_t.
sizeof is calculated at compile time, not at run time, and it does not return the amount of memory allocated by malloc.
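A short sketch that makes the point observable: whatever count is allocated, the sizeof-based "length" stays 1. apparent_length is a hypothetical name used only for this illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocates count elements, then computes the "length" the way the
   question does. The result does not depend on count. */
size_t apparent_length(size_t count)
{
    uint16_t *data = malloc(count * sizeof *data);
    size_t n = sizeof(*data) / sizeof(uint16_t);   /* always 1 */
    free(data);
    return n;
}
```

The real element count therefore has to be tracked separately, for example in the data_size member.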

How does MRI Ruby store the contents of a String?

Primer: This question is quite long, because I want to give an overview of my current understanding of the inner mechanisms of MRI and how I came to my conclusions. I want to understand the code better, so please correct me if any assumption I'm making is wrong.
I'm trying to find out where MRI Ruby stores the data part (aka the contents) of a String, because I'd like to create String objects which reuse memory allocated by another binary (same allocator of course).
Here's what I know so far:
RString: internal representation of a String.
struct RString {
struct RBasic basic;
union {
struct {
long len;
char *ptr;
union {
long capa;
VALUE shared;
} aux;
} heap;
char ary[RSTRING_EMBED_LEN_MAX + 1];
} as;
};
reference
From the above snippet I conclude that there are 2 ways the data can be stored:
on the heap via the heap struct (ptr points to data)
in the ary char array directly (probably some optimization)
I'm only interested in the heap case.
str_new0() seems to be the most common way to create a String from a pointer to some string data and a length.
static VALUE
str_new0(VALUE klass, const char *ptr, long len, int termlen)
{
VALUE str;
if (len < 0) {
rb_raise(rb_eArgError, "negative string size (or size too big)");
}
RUBY_DTRACE_CREATE_HOOK(STRING, len);
str = str_alloc(klass);
if (len > RSTRING_EMBED_LEN_MAX) {
RSTRING(str)->as.heap.aux.capa = len;
RSTRING(str)->as.heap.ptr = ALLOC_N(char, len + termlen);
STR_SET_NOEMBED(str);
}
else if (len == 0) {
ENC_CODERANGE_SET(str, ENC_CODERANGE_7BIT);
}
if (ptr) {
memcpy(RSTRING_PTR(str), ptr, len);
}
STR_SET_LEN(str, len);
TERM_FILL(RSTRING_PTR(str) + len, termlen);
return str;
}
reference
Memory is allocated with the macro ALLOC_N which is an alias for RB_ALLOC_N which expands to ruby_xmalloc2() which calls objspace_xmalloc2() which calls objspace_xmalloc0().
Phew
static void *
objspace_xmalloc0(rb_objspace_t *objspace, size_t size)
{
void *mem;
size = objspace_malloc_prepare(objspace, size);
TRY_WITH_GC(mem = malloc(size));
size = objspace_malloc_size(objspace, mem, size);
objspace_malloc_increase(objspace, mem, size, 0, MEMOP_TYPE_MALLOC);
return objspace_malloc_fixup(objspace, mem, size);
}
reference
So here we are. TRY_WITH_GC seems to check if the allocation mem = malloc(size) succeeds and if not it tries again after a GC run I think.
#define TRY_WITH_GC(alloc) do { \
objspace_malloc_gc_stress(objspace); \
if (!(alloc) && \
(!garbage_collect_with_gvl(objspace, TRUE, TRUE, TRUE, GPR_FLAG_MALLOC) || /* full/immediate mark && immediate sweep */ \
!(alloc))) { \
ruby_memerror(); \
} \
} while (0)
reference
Here's the first thing I'm unsure about: It seems to malloc just some memory (important: not in objspace). Is this the case? I don't know if they overwrote malloc somewhere to allocate GC friendly or whatever.
OK after that they mutate objspace with objspace_malloc_increase() and friends. I don't understand what these functions do. They do not seem to store the pointer mem in objspace, but maybe I overlooked it. I need clarification here.
As noted in the beginning I want to write code that creates a Ruby String, which uses memory allocated by some other binary, eg. C via FFI, of course with the system allocator. Do I have to register my "foreign" memory via the objspace_* functions? If yes, how does that exactly work? And are there subtleties when it comes to freeing the memory again? (I guess the GC does that, but what conditions must be true for this to work?)
I hope my question is not too vague, I can ask more precisely if necessary!
Thanks in advance!

Get the length of an array with a pointer? [duplicate]

I've allocated an "array" of mystruct of size n like this:
if (NULL == (p = calloc(sizeof(struct mystruct) * n,1))) {
/* handle error */
}
Later on, I only have access to p, and no longer have n. Is there a way to determine the length of the array given just the pointer p?
I figure it must be possible, since free(p) does just that. I know malloc() keeps track of how much memory it has allocated, and that's why it knows the length; perhaps there is a way to query for this information? Something like...
int length = askMallocLibraryHowMuchMemoryWasAlloced(p) / sizeof(mystruct)
I know I should just rework the code so that I know n, but I'd rather not if possible. Any ideas?
No, there is no way to get this information without depending strongly on the implementation details of malloc. In particular, malloc may allocate more bytes than you request (e.g. for efficiency in a particular memory architecture). It would be much better to redesign your code so that you keep track of n explicitly. The alternative is at least as much redesign and a much more dangerous approach (given that it's non-standard, abuses the semantics of pointers, and will be a maintenance nightmare for those that come after you): store the length n at the malloc'd address, followed by the array. Allocation would then be:
void *p = calloc(sizeof(struct mystruct) * n + sizeof(unsigned long int), 1);
*((unsigned long int*)p) = n;
n is now stored at *((unsigned long int*)p) and the start of your array is now
void *arr = (char *)p + sizeof(unsigned long int);
Edit: Just to play devil's advocate... I know that these "solutions" all require redesigns, but let's play it out.
Of course, the solution presented above is just a hacky implementation of a (well-packed) struct. You might as well define:
typedef struct {
unsigned int n;
void *arr;
} arrInfo;
and pass around arrInfos rather than raw pointers.
Now we're cooking. But as long as you're redesigning, why stop here? What you really want is an abstract data type (ADT). Any introductory text for an algorithms and data structures class would do it. An ADT defines the public interface of a data type but hides the implementation of that data type. Thus, publicly an ADT for an array might look like
typedef void* arrayInfo;
arrayInfo newArrayInfo(unsigned int n, unsigned int itemSize);
void deleteArrayInfo(arrayInfo);
unsigned int arrayLength(arrayInfo);
void *arrayPtr(arrayInfo);
...
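One way the interface above might be implemented is sketched below. The hidden struct arrayImpl and the choice of zero-initialized storage are assumptions; only the four function names come from the interface.

```c
#include <stdlib.h>

typedef void *arrayInfo;

/* Hidden representation; callers only ever see the opaque arrayInfo. */
struct arrayImpl {
    unsigned int n;
    void *items;
};

arrayInfo newArrayInfo(unsigned int n, unsigned int itemSize)
{
    struct arrayImpl *impl = malloc(sizeof *impl);
    if (impl == NULL)
        return NULL;
    impl->items = calloc(n, itemSize);   /* zero-initialized storage */
    if (impl->items == NULL && n != 0 && itemSize != 0) {
        free(impl);
        return NULL;
    }
    impl->n = n;
    return impl;
}

void deleteArrayInfo(arrayInfo a)
{
    struct arrayImpl *impl = a;
    if (impl != NULL) {
        free(impl->items);
        free(impl);
    }
}

unsigned int arrayLength(arrayInfo a)
{
    struct arrayImpl *impl = a;
    return impl ? impl->n : 0;
}

void *arrayPtr(arrayInfo a)
{
    struct arrayImpl *impl = a;
    return impl ? impl->items : NULL;
}
```

Because the length travels with the handle, code that receives an arrayInfo can always call arrayLength() instead of needing n passed separately.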
In other words, an ADT is a form of data and behavior encapsulation; it's about as close as you can get to object-oriented programming using straight C. Unless you're stuck on a platform that doesn't have a C++ compiler, you might as well go whole hog and just use an STL std::vector.
There, we've taken a simple question about C and ended up at C++. God help us all.
Keep track of the array size yourself; free uses the malloc chain to free the block that was allocated, which does not necessarily have the same size as the array you requested.
Just to confirm the previous answers: There is no way to know, just by studying a pointer, how much memory was allocated by a malloc which returned this pointer.
What if it worked?
One example of why this is not possible: let's imagine code with a hypothetical function called get_size(void *), which returns the memory allocated for a pointer:
typedef struct MyStructTag
{ /* etc. */ } MyStruct ;
void doSomething(MyStruct * p)
{
/* well... extract the memory allocated? */
size_t i = get_size(p) ;
initializeMyStructArray(p, i) ;
}
void doSomethingElse()
{
MyStruct * s = malloc(sizeof(MyStruct) * 10) ; /* Allocate 10 items */
doSomething(s) ;
}
Why even if it worked, it would not work anyway?
But the problem with this approach is that, in C, you can play with pointer arithmetic. Let's rewrite doSomethingElse():
void doSomethingElse()
{
MyStruct * s = malloc(sizeof(MyStruct) * 10) ; /* Allocate 10 items */
MyStruct * s2 = s + 5 ; /* s2 points to the 6th item (index 5) */
doSomething(s2) ; /* Oops */
}
How is get_size supposed to work here? You sent the function a valid pointer, but not the one returned by malloc. And even if get_size went through all the trouble to find the size (i.e. in an inefficient way), it would return, in this case, a value that would be wrong in your context.
Conclusion
There are always ways to avoid this problem, and in C, you can always write your own allocator, but again, it is perhaps too much trouble when all you need is to remember how much memory was allocated.
Some compilers provide msize() or similar functions (_msize() etc), that let you do exactly that
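As an illustration of such an extension, glibc declares malloc_usable_size() in <malloc.h> (Microsoft's C runtime has _msize(), and macOS has malloc_size()). The sketch below assumes a glibc-based system and is not portable; usable_size_for is a hypothetical name.

```c
#include <malloc.h>   /* glibc-specific header, not standard C */
#include <stdlib.h>

/* Returns the usable size the allocator reports for a block of the
   requested size, or 0 if the allocation failed. */
size_t usable_size_for(size_t requested)
{
    void *p = malloc(requested);
    size_t usable = p ? malloc_usable_size(p) : 0;
    free(p);
    return usable;
}
```

The reported value is at least the requested size but is often larger, so it describes the allocator's block, not the number of array elements that were asked for.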
May I recommend a terrible way to do it?
Allocate all your arrays as follows:
void *blockOfMem = malloc(sizeof(mystruct)*n + sizeof(int));
((int *)blockOfMem)[0] = n;
mystruct *structs = (mystruct *)(((int *)blockOfMem) + 1);
Then you can always cast your arrays to int * and access the -1st element.
Be sure to free that pointer, and not the array pointer itself!
Also, this will likely cause terrible bugs that will leave you tearing your hair out. Maybe you can wrap the alloc funcs in API calls or something.
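Wrapping the trick in a few functions might look like the sketch below. The names counted_alloc, counted_len and counted_free are hypothetical, and the header is padded with max_align_t (C11) so that the array following it stays suitably aligned, which the raw int-prefix version does not guarantee.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header stored in front of the array. The union pads it to the strictest
   fundamental alignment so the array that follows stays aligned. */
typedef union {
    size_t n;
    max_align_t align;
} counted_header;

void *counted_alloc(size_t n, size_t elem_size)
{
    if (elem_size != 0 && n > (SIZE_MAX - sizeof(counted_header)) / elem_size)
        return NULL;                      /* size computation would overflow */
    counted_header *h = malloc(sizeof *h + n * elem_size);
    if (h == NULL)
        return NULL;
    h->n = n;
    return h + 1;                         /* the array starts after the header */
}

size_t counted_len(const void *arr)
{
    return ((const counted_header *)arr - 1)->n;
}

void counted_free(void *arr)
{
    if (arr != NULL)
        free((counted_header *)arr - 1);  /* free the original block */
}
```

This only works for pointers obtained from counted_alloc(); passing any other pointer, or a pointer into the middle of the array, to counted_len() or counted_free() is undefined behavior.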
malloc will return a block of memory at least as big as you requested, but possibly bigger. So even if you could query the block size, this would not reliably give you your array size. So you'll just have to modify your code to keep track of it yourself.
For an array of pointers you can use a NULL-terminated array. The length can then be determined the same way it is done with strings. In your example you could use a structure member to mark the end. Of course, that depends on whether there is a member that cannot be NULL in a valid element. So let's say you have a member name that needs to be set for every struct in your array; you can then query the size by:
int size;
struct mystruct *cur;
for (cur = myarray; cur->name != NULL; cur++)
;
size = cur - myarray;
Btw it should be calloc(n, sizeof(struct mystruct)) in your example.
Others have discussed the limits of plain C pointers and the stdlib.h implementations of malloc(). Some implementations provide extensions which return the allocated block size, which may be larger than the requested size.
If you must have this behavior, you can use or write a specialized memory allocator. The simplest thing to do would be implementing a wrapper around the stdlib.h functions. Something like:
void* my_malloc(size_t s); /* Calls malloc(s), and if successful stores
(p,s) in a list of handled blocks */
void my_free(void* p); /* Removes list entry and calls free(p) */
size_t my_block_size(void* p); /* Looks up p, and returns the stored size */
...
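A minimal sketch of such a wrapper, assuming a singly linked list as the bookkeeping structure. It is not thread-safe, lookups are linear, and a malloc(0) that returns NULL is treated as a failure.

```c
#include <stdlib.h>

/* One bookkeeping record per live allocation. */
struct block {
    void *p;
    size_t size;
    struct block *next;
};

static struct block *blocks = NULL;   /* head of the list of handled blocks */

void *my_malloc(size_t s)
{
    struct block *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->p = malloc(s);
    if (b->p == NULL) {
        free(b);
        return NULL;
    }
    b->size = s;
    b->next = blocks;
    blocks = b;
    return b->p;
}

size_t my_block_size(void *p)
{
    for (struct block *b = blocks; b != NULL; b = b->next)
        if (b->p == p)
            return b->size;
    return 0;                          /* not a pointer we handed out */
}

void my_free(void *p)
{
    for (struct block **link = &blocks; *link != NULL; link = &(*link)->next) {
        if ((*link)->p == p) {
            struct block *b = *link;
            *link = b->next;           /* unlink the record */
            free(b->p);
            free(b);
            return;
        }
    }
}
```

my_block_size() returns the size that was requested, not whatever larger block the underlying allocator may have reserved.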
Really, your question is "can I find out the size of a malloc'd (or calloc'd) data block?" And as others have said: no, not in a standard way.
However there are custom malloc implementations that do it - for example http://dmalloc.com/
I'm not aware of a way, but I would imagine it would deal with mucking around in malloc's internals which is generally a very, very bad idea.
Why is it that you can't store the size of memory you allocated?
EDIT: If you know that you should rework the code so you know n, well, do it. Yes it might be quick and easy to try to poll malloc but knowing n for sure would minimize confusion and strengthen the design.
One of the reasons that you can't ask the malloc library how big a block is, is that the allocator will usually round up the size of your request to meet some minimum granularity requirement (for example, 16 bytes). So if you ask for 5 bytes, you'll get a block of size 16 back. If you were to take 16 and divide by 5, you would get three elements when you really only allocated one. It would take extra space for the malloc library to keep track of how many bytes you asked for in the first place, so it's best for you to keep track of that yourself.
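The arithmetic in that example can be checked with a few lines of C. The 16-byte granularity is just the figure used above, not a property of any particular allocator, and both function names are made up for the illustration.

```c
#include <stddef.h>

/* Hypothetical granularity; real allocators choose their own value. */
#define GRANULARITY 16u

/* Round a request up to the next multiple of GRANULARITY. */
size_t rounded_block_size(size_t request)
{
    return (request + GRANULARITY - 1) / GRANULARITY * GRANULARITY;
}

/* What a caller would infer by dividing the block size by the element size. */
size_t inferred_count(size_t block_size, size_t elem_size)
{
    return block_size / elem_size;
}
```

With these definitions, rounded_block_size(5) is 16 and inferred_count(16, 5) is 3, even though only one 5-byte element was requested.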
This is a test of my sort routine. It sets up 7 variables to hold float values, then assigns them to an array, which is used to find the max value.
The magic is in the call to myMax:
float mmax = myMax((float *)&arr,(int) sizeof(arr)/sizeof(arr[0]));
And that was magical, wasn't it?
myMax expects a float array pointer (float *), so I use &arr to get the address of the array and cast it to a float pointer. (Passing plain arr would also work, since an array decays to a pointer to its first element.)
myMax also expects the number of elements in the array as an int. I get that value by using sizeof to obtain the byte size of the whole array and of its first element, then dividing the total by the element size. (We should not guess or hard-code the size of a type: an int, for example, is 2 bytes on some systems and 4 on others, such as my OS X Mac, and could be something else elsewhere.) Note that this works only where arr is a real array in scope; applied to a pointer returned by malloc, sizeof gives the size of the pointer, not of the block.
NOTE: All this is important when your data may have a varying number of samples.
Here's the test code:
#include <stdio.h>
float a, b, c, d, e, f, g;
float myMax(float *apa,int soa){
int i;
float max = apa[0];
for(i=0; i< soa; i++){
if (apa[i]>max){max=apa[i];}
printf("on i=%d val is %0.2f max is %0.2f, soa=%d\n",i,apa[i],max,soa);
}
return max;
}
int main(void)
{
a = 2.0;
b = 1.0;
c = 4.0;
d = 3.0;
e = 7.0;
f = 9.0;
g = 5.0;
float arr[] = {a,b,c,d,e,f,g};
float mmax = myMax((float *)&arr,(int) sizeof(arr)/sizeof(arr[0]));
printf("mmax = %0.2f\n",mmax);
return 0;
}
In uClibc, there is a MALLOC_SIZE macro in malloc.h:
/* The size of a malloc allocation is stored in a size_t word
MALLOC_HEADER_SIZE bytes prior to the start address of the allocation:
+--------+---------+-------------------+
| SIZE |(unused) | allocation ... |
+--------+---------+-------------------+
^ BASE ^ ADDR
^ ADDR - MALLOC_HEADER_SIZE
*/
/* The amount of extra space used by the malloc header. */
#define MALLOC_HEADER_SIZE \
(MALLOC_ALIGNMENT < sizeof (size_t) \
? sizeof (size_t) \
: MALLOC_ALIGNMENT)
/* Set up the malloc header, and return the user address of a malloc block. */
#define MALLOC_SETUP(base, size) \
(MALLOC_SET_SIZE (base, size), (void *)((char *)base + MALLOC_HEADER_SIZE))
/* Set the size of a malloc allocation, given the base address. */
#define MALLOC_SET_SIZE(base, size) (*(size_t *)(base) = (size))
/* Return base-address of a malloc allocation, given the user address. */
#define MALLOC_BASE(addr) ((void *)((char *)addr - MALLOC_HEADER_SIZE))
/* Return the size of a malloc allocation, given the user address. */
#define MALLOC_SIZE(addr) (*(size_t *)MALLOC_BASE(addr))
On my system, malloc() stores metadata about the allocation in the 8 bytes immediately before the block it returns. This can be used to determine the size of the buffer. On my x86-64 machine the allocated size is always a multiple of 16, so if the requested size is a multiple of 16 (which it is in most cases), the following can be used. Keep in mind that it relies on allocator internals and is not portable:
Code
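#include <stdio.h>
#include <stdlib.h>
/* Non-portable hack: reads the allocator's size word stored just before
   the block. The offsets and the constant 17 were observed on one system. */
int size_of_buff(void *buff) {
    return ( *( ( int * ) buff - 2 ) - 17 ); // 32 bit system: ( *( ( int * ) buff - 1 ) - 17 )
}
int main(void) {
    char *buff = malloc(1024);
    if (buff == NULL)
        return 1;
    printf("Size of Buffer: %d\n", size_of_buff(buff));
    free(buff);
    return 0;
}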
Output
Size of Buffer: 1024
This is my approach:
#include <stdio.h>
#include <stdlib.h>
typedef struct _int_array
{
int *number;
int size;
} int_array;
int int_array_append(int_array *a, int n)
{
    /* Grow by one element; the old block stays valid if realloc fails. */
    int *more_numbers = realloc(a->number, (a->size + 1) * sizeof(int));
    if(more_numbers == NULL)
    {
        printf("Error (re)allocating memory.\n");
        return 1;
    }
    a->number = more_numbers;
    a->number[a->size] = n;
    a->size++;
    return 0;
}
int main(void)
{
    /* Start empty: realloc(NULL, n) behaves like malloc(n). */
    int_array a = { NULL, 0 };
    int_array_append(&a, 10);
    int_array_append(&a, 20);
    int_array_append(&a, 30);
    int_array_append(&a, 40);
    int i;
    for(i = 0; i < a.size; i++)
        printf("%d\n", a.number[i]);
    /* sizeof yields a size_t, so print it with %zu. */
    printf("\nLen: %d\nSize: %zu\n", a.size, a.size * sizeof(int));
    free(a.number);
    return 0;
}
Output:
10
20
30
40
Len: 4
Size: 16
If your compiler supports VLA (variable length array), you can embed the array length into the pointer type.
int n = 10;
int (*p)[n] = malloc(n * sizeof(int));
n = 3;
printf("%zu\n", sizeof(*p)/sizeof(**p));
The output is 10.
You could also choose to embed the information into the allocated memory yourself with a structure including a flexible array member.
struct myarray {
int n;
struct mystruct a[];
};
struct myarray *ma =
malloc(sizeof(*ma) + n * sizeof(struct mystruct));
ma->n = n;
struct mystruct *p = ma->a;
Then to recover the size, you would subtract the offset of the flexible member (offsetof is declared in <stddef.h>).
int get_size (struct mystruct *p) {
struct myarray *ma;
char *x = (char *)p;
ma = (void *)(x - offsetof(struct myarray, a));
return ma->n;
}
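Putting those fragments together, a complete version might look roughly like this. The element type, the use of size_t for the count, and the names array_alloc, array_size and array_free are assumptions added for the sketch.

```c
#include <stddef.h>   /* offsetof */
#include <stdlib.h>

struct mystruct { int value; };   /* hypothetical element type */

struct myarray {
    size_t n;                     /* element count, stored ahead of the data */
    struct mystruct a[];          /* flexible array member (C99) */
};

/* Allocate n elements and return a pointer to the first one. */
struct mystruct *array_alloc(size_t n)
{
    struct myarray *ma = malloc(sizeof *ma + n * sizeof(struct mystruct));
    if (ma == NULL)
        return NULL;
    ma->n = n;
    return ma->a;
}

/* Step back from the element pointer to the enclosing header. */
static struct myarray *array_header(struct mystruct *p)
{
    return (void *)((char *)p - offsetof(struct myarray, a));
}

/* Recover the element count stored in the header. */
size_t array_size(struct mystruct *p)
{
    return array_header(p)->n;
}

/* The pointer handed out is not the one malloc returned, so free the header. */
void array_free(struct mystruct *p)
{
    if (p != NULL)
        free(array_header(p));
}
```

This only works for pointers that came from array_alloc, and such pointers must be released with array_free rather than plain free.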
The problem with trying to peek into heap structures is that the layout might change from platform to platform or from release to release, and so the information may not be reliably obtainable.
Even if you knew exactly how to peek into the meta information maintained by your allocator, the information stored there may have nothing to do with the size of the array. The allocator simply returned memory that could be used to fit the requested size, but the actual size of the memory may be larger (perhaps even much larger) than the requested amount.
The only reliable way to know the information is to find a way to track it yourself.

How to make the bytes of the block be initialized so that they contain all 0s

I am writing the calloc function for a memory management assignment (I am using C). I have one question. I have already written malloc and am thinking of using it inside calloc: calloc takes num and size and returns a block of memory of (num * size) bytes, which I can create with malloc. However, the assignment says I need to initialize all bytes to 0, and I am confused about how to do that in general.
If you need more info please ask me :)
So malloc will return a pointer (a void pointer) to the start of the usable memory, and I have to go through the bytes, initialize them to zero, and return the pointer to the start of the usable memory.
I am assuming you can't use memset because it's a homework assignment and deals with memory management. So, I would just go in a loop and set all bytes to 0. Pseudocode:
for i = 0 to n - 1:
data[i] = 0
Oh, if you're having trouble understanding how to dereference void *, remember you can do:
void *b;
/* now make b point to somewhere useful */
unsigned char *a = b;
When you need to set a block of memory to the same value, use the memset function.
It looks like this: void * memset ( void * ptr, int value, size_t num );
You can find more information about the function at: http://www.cplusplus.com/reference/clibrary/cstring/memset/
If you can't use memset, then you'll need to resort to setting each byte individually.
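For the case where memset is allowed, a minimal sketch of the call described above. The helper name alloc_zeroed is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate n bytes and clear them with memset. */
void *alloc_zeroed(size_t n)
{
    void *block = malloc(n);
    if (block != NULL)
        memset(block, 0, n);   /* value is an int, converted to unsigned char */
    return block;
}
```

Because memset writes the same value into every byte, it is a good fit for zeroing but not for setting each int in an array to, say, 1.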
Since you're calling malloc from your calloc function, I'm going to assume it looks something like this:
void *calloc (size_t count, size_t sz) {
size_t realsz = count * sz;
void *block = malloc (realsz);
if (block != NULL) {
// Zero memory here.
}
return block;
}
and you just need the code for "// Zero memory here.".
Here's what you need to know.
In order to process the block one byte at a time, you need to cast the pointer to a type that references bytes (char would be good). To cast your pointer to (for example) an int pointer, you would use int *block2 = (int*)block;.
Once you have the right type of pointer, you can use that to store the correct data value based on the type. You would do this by storing the desired value in a loop which increments the pointer and decrements the count until the count reaches zero.
Hopefully that's enough to start with without giving away every detail of the solution. If you still have problems, leave a comment and I'll flesh out the answer until you have it correct (since it's homework, I'll be trying to get you to do most of the thinking).
Update: Since an answer's already been accepted, I'll post my full solution. To write a basic calloc in terms of just malloc:
void *calloc (size_t count, size_t sz) {
size_t realsz, i;
char *cblock;
// Get size to allocate (detect size_t overflow as well).
realsz = count * sz;
if (count != 0)
if (realsz / count != sz)
return NULL;
// Allocate the block.
cblock = malloc (realsz);
// Initialize all elements to zero (if allocation worked).
if (cblock != NULL) {
for (i = 0; i < realsz; i++)
cblock[i] = 0;
}
// Return allocated, cleared block.
return cblock;
}
Note that you can work directly with char pointers within the function since they freely convert to and from void pointers.
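The pointer-stepping loop described earlier in this answer (increment the pointer, decrement the count) is an alternative to the indexed loop in the full solution. A sketch, with zero_bytes as a made-up helper name:

```c
#include <stddef.h>

/* Hypothetical helper: clear count bytes starting at block, without memset. */
void zero_bytes(void *block, size_t count)
{
    unsigned char *p = block;   /* void * converts implicitly in C */
    while (count > 0) {
        *p = 0;
        p++;       /* advance to the next byte */
        count--;   /* one fewer byte left to clear */
    }
}
```

Both forms do the same work; which one reads better is mostly a matter of taste.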
Hints:
There is already a POSIX library function for zeroing a block of memory.
Consider casting the void * to some pointer type that you can dereference and assign to.

Resources