Understanding the purpose of pointers - c

I'm reading a book on data structures and having difficulty grasping the concept of pointers. Let me preface this by saying that I don't have a lot of experience with C. But here goes....
If I do the following:
int num = 5;
int *ptrNum;
ptrNum = &num;
It is my understanding that the pointer reserves enough memory for a 32 bit int along with the memory required for the actual pointer although its value is simply the memory address of the variable.
What is the purpose of doing this if the same amount of memory is reserved? Why would I use the pointer instead of the variable, num? Am I totally off base here?

You use pointers in situations where a value won't work. In your example, you're correct; there's no benefit. The archetypal borderline-useful example is a swap function:
void swap_int(int *i1, int *i2)
{
int t1 = *i1;
*i1 = *i2;
*i2 = t1;
}
Calling sequence:
int main(void)
{
int v1 = 0;
int v2 = 31;
printf("v1 = %d; v2 = %d\n", v1, v2);
swap_int(&v1, &v2);
printf("v1 = %d; v2 = %d\n", v1, v2);
return 0;
}
If you write that without using pointers — like this:
void swap_int(int i1, int i2)
{
int t1 = i1;
i1 = i2;
i2 = t1;
}
int main(void)
{
int v1 = 0;
int v2 = 31;
printf("v1 = %d; v2 = %d\n", v1, v2);
swap_int(v1, v2);
printf("v1 = %d; v2 = %d\n", v1, v2);
return 0;
}
then you simply swap two local variables in the function without affecting the values in the calling function. Using pointers, you can affect the variables in the calling function.
See also:
scanf()-family of functions
strcpy() et al
It is my understanding that the pointer reserves enough memory for a 32 bit int along with the memory required for the actual pointer although its value is simply the memory address of the variable.
What you appear to be describing is as if:
int *p1;
does the same job as:
int _Anonymous;
int *p1 = &_Anonymous;
It doesn't; this is C. Creating p1 allocates enough space for the pointer. As first written, it doesn't initialize it, so it points to an indeterminate location (or no location). It (the pointer) needs to be initialized before it is used. Hence:
int i1 = 37;
int *p1 = &i1;
But the allocation of p1 only reserves enough space for a pointer (normally, 32-bits for a 32-bit compilation, 64-bits for a 64-bit compilation); you have to allocate the space it points at separately, and you have to initialize the pointer. Another way of initializing pointers is with dynamically allocated memory:
int *p2 = malloc(1000 * sizeof(*p2));
if (p2 != 0)
{
...use p2 as an array of 1000 integers...
free(p2);
}
Have you covered structures yet? If not, examples covering structures, such as trees or linked lists, won't help. However, once you have covered structures too, you'll be able to use trees or linked lists:
struct list
{
int data;
struct list *next;
};
struct tree
{
int data;
struct tree *l_child;
struct tree *r_child;
};
Such structures rely heavily on pointers to connect entries correctly.

A couple of the other answers focus on taking the address of a variable and storing it in a pointer. That's only one use for pointers. An entirely different use for pointers is to point to dynamically allocated storage, and for structuring that storage.
For example, suppose you want to read in a file and work on it in memory. But, you don't know how big the file is ahead of time. You could put an arbitrary upper limit in your code:
#define MAX_FILE_SIZE (640 * 1024) /* 640K should be large enough for anyone */
char data[ MAX_FILE_SIZE ];
That wastes memory for smaller files, and isn't large enough for larger files. A better approach would be to actually allocate what you need. For example:
FILE *f = fopen("myfile", "rb");
off_t len;
char *data;
fseek(f, 0, SEEK_END); /* go to the end of the file */
len = ftell(f); /* get the actual file size */
fseek(f, 0, SEEK_SET); /* rewind to the beginning */
data = malloc( len ); /* Allocate just as much as you need */
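The snippet above skips error handling for brevity. A minimal sketch of the same idea with the checks filled in might look like this; read_whole_file is a hypothetical helper name, not something from a library, and it assumes the file fits in a long and is an ordinary seekable file:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file at `path` into a freshly
   malloc'd buffer. On success returns the buffer and stores its length
   in *out_len; on any failure returns NULL. The caller frees the buffer. */
char *read_whole_file(const char *path, long *out_len)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    char *data = NULL;
    long len = 0;

    if (fseek(f, 0, SEEK_END) == 0 &&   /* go to the end of the file */
        (len = ftell(f)) >= 0 &&        /* ftell returns long, -1 on error */
        fseek(f, 0, SEEK_SET) == 0)     /* rewind to the beginning */
    {
        /* Allocate just as much as needed (at least 1 byte, so an
           empty file still yields a non-NULL buffer). */
        data = malloc(len > 0 ? (size_t)len : 1);
        if (data != NULL && fread(data, 1, (size_t)len, f) != (size_t)len)
        {
            free(data);                 /* short read: give up cleanly */
            data = NULL;
        }
    }

    fclose(f);
    if (data != NULL)
        *out_len = len;
    return data;
}
```

A caller would write something like char *data = read_whole_file("myfile", &len); and free(data) when done. The buffer is not NUL-terminated, so treat it as raw bytes.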
Another major use of pointers is to structure data, say in lists, or trees, or other fun structures. (Your data structures book will go into many of these.) If you want to reorganize your data, moving pointers is often much cheaper than copying data around. For example, suppose you have a list of these:
struct mystruct
{
int x[1000];
int y[1000];
};
That's a lot of data. If you just store that in an array, then sorting that data might be very expensive:
struct mystruct array[1000];
Try qsort on that... it will be very slow.
You can speed this up by instead storing pointers to elements and sorting the pointers. ie.
struct mystruct *array[1000];
int i;
struct mystruct *temp;
/* be sure to allocate the storage, though: */
temp = malloc( 1000 * sizeof( struct mystruct ) );
for (i = 0; i < 1000; i++)
array[i] = temp + i;
Now if you had to sort those structures, you'd swap pointers in array[] rather than entire structures.
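To make the pointer-sorting idea concrete, here is a rough sketch using qsort on an array of pointers. The struct record type and its key field are assumptions made up for this illustration (the original struct has no obvious sort key):

```c
#include <stdlib.h>

/* Hypothetical record type standing in for a large structure. */
struct record
{
    int key;
    int payload[1000];
};

/* qsort hands the comparator pointers to the array elements. The
   elements are themselves pointers, hence the double indirection. */
static int compare_record_ptrs(const void *a, const void *b)
{
    const struct record *ra = *(const struct record *const *)a;
    const struct record *rb = *(const struct record *const *)b;
    return (ra->key > rb->key) - (ra->key < rb->key);
}

/* Sorts the pointer array by key; the records themselves never move. */
void sort_record_ptrs(struct record **ptrs, size_t n)
{
    qsort(ptrs, n, sizeof *ptrs, compare_record_ptrs);
}
```

Each swap inside qsort now moves one pointer instead of several kilobytes of payload.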
I won't go into the fancier data structures that are better covered by your book. But, I thought I might give you a taste of some other uses for pointers.

How would you add an element to a dynamic list? By creating a new array each time?
You just allocate the next element and link the previous cell's next pointer to it.
Without pointers, you are constrained to the order of arrays and the alignment of variables.
With pointers, you can select any address in the allocated area with any alignment you like, and you can have list elements pointing to and from any area you allocated.
So, pointers give you more freedom while needing only 32 or 64 bits of space per pointer.
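As a rough illustration of "link the previous cell's next pointer to it", here is a minimal sketch. The struct cell type and the function names are invented for this example:

```c
#include <stdlib.h>

/* Hypothetical list cell for this sketch. */
struct cell
{
    int data;
    struct cell *next;
};

/* Allocates a new cell holding `value` and links it after `tail`.
   Pass NULL as `tail` to start a new list. Returns the new cell,
   or NULL if allocation fails (the list is then left unchanged). */
struct cell *list_append(struct cell *tail, int value)
{
    struct cell *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->data = value;
    c->next = NULL;
    if (tail != NULL)
        tail->next = c;     /* link the previous cell to the new one */
    return c;
}

/* Frees every cell reachable from `head`. */
void list_free(struct cell *head)
{
    while (head != NULL)
    {
        struct cell *next = head->next;
        free(head);
        head = next;
    }
}
```

No existing element is copied or moved when the list grows; only one pointer is updated.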

Pointers serve 3 main purposes in C:
Fake pass-by-reference semantics;
Track dynamically-allocated memory;
Build dynamic data structures.
Fake pass-by-reference semantics: in C, all function arguments are passed by value. Given the following snippet:
void foo( int a, int b )
{
a = 1;
b = 2;
}
void bar( void )
{
int x=0, y=1;
foo( x, y );
printf( "x = %d, y = %d\n", x, y );
}
The formal parameters a and b in foo are different objects in memory from the actual parameters x and y in bar, so any changes to a and b are not reflected in x and y. The output will be "x = 0, y = 1". If you want foo to alter the values of x and y, you will need to pass pointers to those variables instead:
void foo( int *a, int *b )
{
*a = 1;
*b = 2;
}
void bar( void )
{
int x = 0, y = 1;
foo( &x, &y );
printf( "x = %d, y = %d\n", x, y );
}
This time, the formal parameters a and b are pointers to the variables x and y; writing to the expressions *a and *b in foo is equivalent to writing to x and y in bar. Thus, the output is "x = 1, y = 2".
This is how scanf() and scores of other library functions work; they use pointers to reference the actual memory we want to operate on.
Track dynamically allocated memory: The library functions malloc, calloc, and realloc allow us to allocate memory at runtime, and all three return pointers to the allocated memory (as of C89, all three return void *). For example, if we want to allocate an array of int at run time:
int *p = NULL;
size_t numItems;
// get numItems;
p = malloc( sizeof *p * numItems );
if ( p )
{
// do stuff with p[0] through p[numItems - 1];
}
free( p );
The pointer variable p will contain the address of the newly allocated block of memory large enough to hold numItems integers. We can access that memory by dereferencing p using either the * operator or the [] subscript operator (*(p+i) == p[i]).
So why not just declare an array of size numItems and be done with it? After all, as of C99, you can use a variable-length array, where the size doesn't have to be known until runtime:
// get numItems
int p[numItems];
Three reasons: first, VLA's are not universally supported, and as of the 2011 standard, VLA support is now optional; second, we cannot change the size of the array after it has been declared, whereas we can use realloc to resize the memory block we've allocated; and finally, VLAs are limited both in where they can be used and how large they can be - if you need to allocate a lot of memory at runtime, it's better to do it through malloc/calloc/realloc than VLAs.
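To illustrate the second reason, resizing with realloc, here is a small sketch. resize_ints is a hypothetical helper name; the important detail is that the result of realloc goes into a temporary so the original block is not lost if the call fails:

```c
#include <stdlib.h>

/* Resizes the block at *p to hold new_count ints.
   On success updates *p and returns 1. On failure (or a zero or
   overflowing request) leaves *p untouched and returns 0. */
int resize_ints(int **p, size_t new_count)
{
    if (new_count == 0 || new_count > (size_t)-1 / sizeof **p)
        return 0;

    int *tmp = realloc(*p, new_count * sizeof **p);
    if (tmp == NULL)
        return 0;           /* the old block is still valid */

    *p = tmp;               /* existing elements were preserved */
    return 1;
}
```

Nothing comparable exists for a VLA or a fixed-size array: their size is fixed for their whole lifetime.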
A quick note on pointer arithmetic: for any pointer T *p, the expression p+1 will evaluate to the address of the next element of type T, which is not necessarily the address value + 1. For example:
T        sizeof T   Original value of p   p + 1
-        --------   -------------------   ------
char     1          0x8000                0x8001
int      4          0x8000                0x8004
double   8          0x8000                0x8008
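The sizes in that table are typical rather than guaranteed. A quick sketch to measure the step on your own platform (the step_* function names are made up for this example) views both addresses as char * so the subtraction counts bytes:

```c
#include <stddef.h>

/* Each function returns the number of bytes between p and p + 1
   for one element type. */
size_t step_char(void)
{
    char a[2];
    return (size_t)((char *)(a + 1) - (char *)a);
}

size_t step_int(void)
{
    int a[2];
    return (size_t)((char *)(a + 1) - (char *)a);
}

size_t step_double(void)
{
    double a[2];
    return (size_t)((char *)(a + 1) - (char *)a);
}
```

Whatever the platform, each result equals sizeof of the element type, which is exactly the rule stated above.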
Build dynamic data structures: There are times when we want to store data in such a way that makes it easy to insert new elements into a list, or quickly search for a value, or force a specific order of access. There are a number of different data structures used for these purposes, and in almost all cases they use pointers. For example, we can use a binary search tree to organize our data in such a way that searching for a particular value is pretty fast. Each node in the tree has two children, each of which points to the next element in the tree:
struct node {
T key;
Q data;
struct node *left;
struct node *right;
};
The left and right members point to other nodes in the tree, or NULL if there is no child. Typically, the left child points to a node whose value is somehow "less than" the value of the current node, while the right child points to a node whose value is somehow "greater than" the current node. We can search the tree for a value like so:
int find( struct node *root, T key, Q *data )
{
int result = 0;
if ( root == NULL ) // we've reached the bottom of the tree
{ // without finding anything
result = 0;
}
else if ( root->key == key ) // we've found the element we're looking for
{
*data = root->data;
result = 1;
}
else if ( key < root->key )
{
// The input key is less than the current node's key,
// so we search the left subtree
result = find( root->left, key, data );
}
else
{
// The input key is greater than the current node's key,
// so we search the right subtree
result = find( root->right, key, data );
}
return result;
}
Assuming the tree is balanced (that is, the left and right subtrees of every node hold roughly the same number of elements), the number of elements checked is around log2 N, where N is the total number of elements in the tree.
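For completeness, here is a sketch of how such a tree could be built and searched. It assumes both T and Q are int, and the names insert and lookup are chosen for this example only; it does no rebalancing, so the log2 N estimate holds only if the keys arrive in a reasonably mixed order:

```c
#include <stdlib.h>

/* Assumption for this sketch: both T and Q are int. */
struct node
{
    int key;
    int data;
    struct node *left;
    struct node *right;
};

/* Inserts (key, data) and returns the root of the resulting subtree.
   An existing key has its data overwritten. If allocation fails the
   subtree is returned unchanged. */
struct node *insert(struct node *root, int key, int data)
{
    if (root == NULL)
    {
        struct node *n = malloc(sizeof *n);
        if (n == NULL)
            return NULL;
        n->key = key;
        n->data = data;
        n->left = NULL;
        n->right = NULL;
        return n;
    }
    if (key < root->key)
        root->left = insert(root->left, key, data);
    else if (key > root->key)
        root->right = insert(root->right, key, data);
    else
        root->data = data;
    return root;
}

/* Iterative search: returns 1 and stores the value in *data if the
   key is present, otherwise returns 0. */
int lookup(const struct node *root, int key, int *data)
{
    while (root != NULL)
    {
        if (key == root->key)
        {
            *data = root->data;
            return 1;
        }
        root = (key < root->key) ? root->left : root->right;
    }
    return 0;
}
```

Every link in the tree is a pointer, and inserting a node only ever changes one of them.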

Related

How to properly free a dynamically allocated array in C

I am trying to write a set of functions that will support a dynamically allocated array where a struct contains the array and other metadata. The goal is to return the function to the user, and the struct information can be called from a function. The code seems to work just fine until I get to the function to free the memory from heap. For reasons I do not understand, the code fails with a segmentation fault, which would indicate that the variable vec in the free_vector function is not pointing to the correct address. However, I have verified with print statements that it is pointing to the correct address. I am hoping someone can help me understand why the free_vector function is not working, specifically the free command. My code and implementation is shown below.
typedef struct
{
size_t allocated_length;
size_t active_length;
size_t num_bytes;
char *vector;
} Vector;
void *init_vector(size_t num_indices, size_t num_bytes) {
// Allocate memory for Vector struct
Vector *vec = malloc(sizeof(*vec));
vec->active_length = 0;
vec->num_bytes = num_bytes;
// Allocate heap memory for vector
void *ptr = malloc(num_bytes * num_indices);
if (ptr == NULL) {
printf("WARNING: Unable to allocate memory, exiting!\n");
return &vec->vector;
}
vec->allocated_length = num_indices;
vec->vector = ptr;
return &vec->vector;
}
// --------------------------------------------------------------------------------
int push_vector(void *vec, void *elements, size_t num_indices) {
Vector *a = get_vector_data(vec);
if(a->active_length + num_indices > a->allocated_length) {
printf("TRUE\n");
size_t size = (a->allocated_length + num_indices) * 2;
void *ptr = realloc(a->vector, size * a->num_bytes);
if (ptr == NULL) {
printf("WARNING: Unable to allocate memory, exiting!\n");
return 0;
}
a->vector = ptr;
a->allocated_length = size;
}
memcpy((char *)vec + a->active_length * a->num_bytes, elements,
num_indices * a->num_bytes);
a->active_length += num_indices;
return 1;
}
// --------------------------------------------------------------------------------
Vector *get_vector_data(void *vec) {
// - The Vector struct has three size_t variables that precede the vector
// variable. These variables consume 24 bytes of data. The code below
// points backwards in memory by 24 bytes to the beginning of the struct.
char *a = (char *)vec - 24;
return (Vector *)a;
}
// --------------------------------------------------------------------------------
void free_vector(void *vec) {
// Free all Vector struct elements
Vector *a = get_vector_data(vec);
// - This print statement shows that the variable is pointing to the
// correct data.
printf("%d\n" ((int *)vec)[2]);
// The function fails on the next line and I do not know why
free(a->vector);
a->vector = NULL;
a->allocated_length = 0;
a->active_length = 0;
a->num_bytes = 0;
}
int main() {
int *a = init_vector(3, sizeof(int));
int b[3] = {1, 2, 3};
push_vector(a, b, 3);
// The code begins to fails here
free_vector(a);
}
This program suffers from Undefined Behaviour.
The return value from init_vector is of type char **, a pointer-to-pointer-to-char,
return &vec->vector;
converted to void *.
In main, this value is converted to an int *
int *a = init_vector(3, sizeof(int));
This value is then converted back into a void * when passed to push_vector.
In push_vector, this value is cast to a char * in order to perform pointer arithmetic
memcpy((char *)vec + a->active_length * a->num_bytes, elements,
num_indices * a->num_bytes);
where this operation overwrites the original pointer returned by malloc contained in the vector member.
On my system, this attempts to write 12 bytes (three int) to memory starting with the position of the vector member in the Vector structure.
Vector *vec
| &vec->vector
| |
v v
+------+------+------+------+-----+
|size_t|size_t|size_t|char *|?????|
+------+------+------+------+-----+
This overflows, as sizeof (char *) is 8 on my system.
This is the wrong place to write data. The correct place to write data is *(char **) vec - or just a->vector.
If the write does not crash the program directly (UB), this surely results in free being passed a pointer value that was not returned by malloc, calloc, or realloc, or the pointer value NULL.
Aside: In free_vector, this value is also cast to an int *
printf("%d\n", ((int *)vec)[2]); /* added the missing comma. */
Additionally, it is unclear if free_vector should free the original allocation, or just the vector member. You do go to lengths to zero-out the structure here.
Still, as is, you have a memory leak - albeit a small one.
void free_vector(void *vec) {
Vector *a = get_vector_data(vec);
/* ... */
free(a); /* This has to happen at some point. */
}
Note, you should be using offsetof to calculate the position of members within a structure. A static offset of 24 assumes two things that may not hold true:
sizeof (size_t) is always 8 (actual minimum sizeof (size_t) is 2), and
the structure contains no padding to satisfy alignment (this seems likely given the form, but not strictly true).
The source you linked in the comments uses a flexible array member, not a pointer member, meaning the entirety of the data (allocation sizes and the vector) is stored in contiguous memory. That is why the & operator yields a valid location to copy data to in this implementation.
(Aside: the linked implementation appears to be broken by effectively using sizeof to get the base of the container structure from a pointer to the flexible array member (e.g., &((vector_container *) pointer_to_flexible_member)[-1]), which does not take into account the possibility of trailing padding, which would result in a larger offset than expected.)
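A minimal sketch of the offsetof approach, reusing the Vector layout from the question. vector_from_member is a name invented here as a stand-in for get_vector_data:

```c
#include <stddef.h>

typedef struct
{
    size_t allocated_length;
    size_t active_length;
    size_t num_bytes;
    char *vector;
} Vector;

/* Recovers the enclosing Vector from a pointer to its `vector`
   member. offsetof accounts for the real size of size_t and for any
   padding, so no hard-coded 24 is needed. */
Vector *vector_from_member(void *member_ptr)
{
    return (Vector *)((char *)member_ptr - offsetof(Vector, vector));
}
```

This only fixes the offset calculation. The pointer passed in must still really be the address of a vector member inside a Vector, and data must still be written through a->vector, not through that address.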

Dynamic allocation of array in C that points to linked list

I have these structs and I want to initialize the PageTable and PageEntry. I want to create the shape below.
typedef struct PageEntry {
unsigned int page_number;
char mode;
int count, R;
struct PageEntry* next;
} PE;
typedef struct PageTable {
int p_faults, reads, writes, disk_writes, maxFrames, curFrames;
char* algorithm;
struct PE **pe;
} PT;
I want to create a hash table, so I allocate for maxFrames PE*. My PageTable needs to have a pointer to the array and each element has to point to a linked list.
Here is my init function:
PT *initialize_Table(int maxFrames, char *algorithm) {
PT *ptr = malloc(sizeof(PT)); //Aloc
ptr->p_faults = 0;
ptr->reads = 0;
ptr->writes = 0;
ptr->curFrames = 0;
ptr->disk_writes = 0;
ptr->maxFrames = maxFrames;
ptr->algorithm = malloc(strlen(algorithm) + 1);
strcpy(ptr->algorithm, algorithm);
ptr->pe = malloc((ptr->maxFrames) * sizeof(PE*));
return ptr;
}
So Ptr->pe must be an array, but it isn't.
I get this error:
What should I do ?
No, ptr->pe is a pointer, not an array. You allocated memory for an array and you can index ptr->pe as if it was an array. So ptr->pe[i] is valid, if i is within range.
The contents of this freshly malloc'ed piece of memory are indeterminate. Use memset to set it to all zeros, or use calloc (instead of malloc) to allocate zeroed memory.
Arrays in C are second-rate citizens. You can declare them, initialize them and query their size with sizeof, but you can't do much else with them. For most other purposes an array variable decays to (is treated as) a pointer to its first element.
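A sketch of the calloc suggestion applied to the bucket array. make_buckets and bucket_insert are hypothetical names, and the modulo hash is just a placeholder. It relies on zeroed memory reading back as NULL pointers, which holds on mainstream platforms but is not strictly promised by the C standard:

```c
#include <stdlib.h>

typedef struct PageEntry {
    unsigned int page_number;
    char mode;
    int count, R;
    struct PageEntry *next;
} PE;

/* Allocates n bucket heads, each starting out as an empty list.
   Returns NULL on failure. */
PE **make_buckets(size_t n)
{
    return calloc(n, sizeof(PE *));
}

/* Inserts a new entry at the head of its bucket's linked list.
   Returns 1 on success, 0 on allocation failure. */
int bucket_insert(PE **buckets, size_t n, unsigned int page_number)
{
    PE *e = calloc(1, sizeof *e);
    if (e == NULL)
        return 0;
    size_t i = page_number % n;     /* placeholder hash function */
    e->page_number = page_number;
    e->next = buckets[i];           /* old head (or NULL) follows */
    buckets[i] = e;
    return 1;
}
```

With this in place, ptr->pe = make_buckets(maxFrames) gives an array that can be indexed as ptr->pe[i], where each element is the head of one list. Note that for this to compile against the question's struct, the member should be declared as PE **pe (or struct PageEntry **pe) rather than struct PE **pe.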

can i use "int" as my dynamic array inside a struct?

In general, i'm trying to allocate values of first.a and first.b
to a array's in struct secon.
typedef struct {
int a;
int b;
} firs;
//secon is my struct which contains dynamic array
//can i use int here ?
typedef struct {
int *aa;
int *bb;
} secon;
//pointer to secon intialised to NULL;
secon* sp=NULL;
int main()
{
firs first;
//plz assume 2 is coming from user ;
sp=malloc(sizeof(secon)*2);
//setting values
first.a=10;
first.b=11;
/* what i'm trying to do is assign values of first.a and first.b to my
dynamically created array*/
/* plz assume first.a and first.b are changing else where .. that means ,not
all arrays will have same values */
/* in general , i'm trying to allocate values of first.a and first.b
to a array's in struct second. */
for(int i=0; i<2; i++) {
*( &(sp->aa ) + (i*4) ) = &first.a;
*( &(sp->bb ) + (i*4) ) = &first.b;
}
for(int i=0; i<2; i++) {
printf("%d %d \n", *((sp->aa) + (i*4) ),*( (sp->bb) +(i*4) ) );
}
return 0;
}
MY output :
10 11
4196048 0
Problems with my code:
1. whats wrong with my code?
2. can i use int inside struct for dynamic array?
3. what are the alternatives?
4. why am i not getting correct answer?
Grigory Rechistov has done a really good job of untangling the code and you should probably accept his answer, but I want to emphasize one particular point.
In C pointer arithmetic, the offsets are always in units of the size of the type pointed to. Unless the type of the pointer is char* or void*, if you find yourself multiplying by the size of the type, you are almost certainly doing it wrong.
If I have
int a[10];
int *p = &(a[5]);
int *q = &(a[7]);
Then a[6] is the same as *(p + 1) not *(p + 1 * sizeof(int)). Likewise a[4] is *(p - 1)
Furthermore, you can subtract pointers when they both point to objects in the same array and the same rule applies; the result is in the units of the size of the type pointed to. q - p is 2, not 2 * sizeof(int). Replace the type int in the example with any other type and q - p will always be 2. For example:
struct Foo { int n ; char x[37] ; };
struct Foo a[10];
struct Foo *p = &(a[5]);
struct Foo *q = &(a[7]);
q - p is still 2. Incidentally, never be tempted to hard code a type's size anywhere. If you are tempted to malloc a struct like this:
struct Foo *r = malloc(41); // int size is 4 + 37 chars
Don't.
Firstly, sizeof(int) is not guaranteed to be 4. Secondly, even if it is, sizeof(struct Foo) is not guaranteed to be 41. Compilers often add padding to struct types to ensure that the members are properly aligned. In this case it is almost a certainty that the compiler will add 3 bytes (or 7 bytes) of padding to the end of struct Foo to ensure that, in arrays, the address of the n member is aligned to the size of an int. Always, always, always use sizeof.
It looks like your understanding how pointer arithmetic works in C is wrong. There is also a problem with data layout assumptions. Finally, there are portability issues and a bad choice of syntax that complicates understanding.
I assume that with this expression: *( &(sp->aa ) + (i*4) ) you are trying to access the i-th item in the array by taking the address of the 0-th item and then adding a byte offset to it. This is wrong for three reasons:
You assume that after sp[0].aa comes sp[1].aa in memory, but you forget that there is sp[0].bb in between.
You assume that size of int is always 4 bytes, which is not true.
You assume that adding an int to secon* will give you a pointer that is offset by specified number of bytes, while in fact it will be offset in specified number of records of size secon.
The second line of output that you see is random junk from unallocated heap memory, because when i == 1 your constructions reference memory that is outside the limits of what was allocated for sp.
To access the i-th item of an array referenced by a pointer, use []:
sp[0].aa is the same as (sp + 0)->aa, and sp[1].aa is the same as (sp + 1)->aa.
This is a complete mess. If you want to access an array of secons, use []
for(int i=0;i<2;i++)
{
sp[i].aa = &first.a; // Same pointer both times
sp[i].bb = &first.b;
}
You have two copies of pointers to the values in first, they point to the same value
for(int i=0;i<2;i++)
{
sp[i].aa = malloc(sizeof(int)); // new pointer each time
*sp[i].aa = first.a; // assigned with the current value
sp[i].bb = malloc(sizeof(int));
*sp[i].bb = first.b;
}
However the compiler is allowed to assume that first does not change, and it is allowed to re-order these expressions, so you are not assured to have different values in your secons
Either way, when you read back the values in your secons, you can still use []
for(int i=0;i<2;i++)
{
printf("%d %d \n", *sp[i].aa, *sp[i].bb);
}

Simulating a List with array

Good morning!
I must handle a struct array (global variable) that simulates a list. In practice, every time I call a method, I have to increase the size of the array 1 and insert it into the new struct.
Since the array size is static, my idea is to use pointers like this:
The struct array is declared as a pointer to a second struct array.
Each time I call the increaseSize () method, the content of the old array is copied to a new n + 1 array.
The global array pointer is updated to point to a new array
In theory, the solution seems easy ... but I'm a noob of c. Where is that wrong?
struct task {
char title[50];
int execution;
int priority;
};
struct task tasks = *p;
int main() {
//he will call the increaseSize() somewhere...
}
void increaseSize(){
int dimension = (sizeof(*p) / sizeof(struct task));
struct task newTasks[dimension+1];
for(int i=0; i<dimension; i++){
newTasks[i] = *(p+i);
}
free(&p);
p = newTasks;
}
You mix up quite a lot here!
int dimension = (sizeof(*p) / sizeof(struct task));
p is a pointer and *p is a single struct task, so sizeof(*p) will be equal to sizeof(struct task), and dimension will always be 1...
You cannot use sizeof in this situation. You will have to store the size (number of elements) in a separate variable.
struct task newTasks[dimension+1];
This will create a new array, yes – but with scope local to the current function (so normally, it is allocated on the stack). This means that the array will be cleaned up again as soon as you leave your function.
What you need is to create the array on the heap. For that you need the malloc function (or calloc or realloc).
Additionally, I recommend not increasing the array by 1, but rather doubling its size. You then need to store the number of elements it contains, too, though.
Putting all together:
struct task* p;
size_t count;
size_t capacity;
void initialize()
{
count = 0;
capacity = 16;
p = (struct task*) malloc(capacity * sizeof(struct task));
if(!p)
{
// malloc failed, appropriate error handling!
}
}
void increase()
{
size_t c = capacity * 2;
// realloc is very convenient here:
// if allocation is successful, it copies the old values
// to the new location and frees the old memory, so there is
// nothing to worry about except for allocation failure
struct task* pp = realloc(p, c * sizeof(struct task));
if(pp)
{
p = pp;
capacity = c;
}
// else: appropriate error handling
}
Finally, as completion:
void push_back(struct task t)
{
if(count == capacity)
increase();
p[count++] = t;
}
Removing elements is left to you – you'd have to copy the subsequent elements all to one position less and then decrease count.
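If you want a starting point for the removal, here is one possible sketch. Unlike the code above it takes the array and the element count as parameters instead of using the globals, and remove_at is just a name chosen for this example:

```c
#include <stddef.h>
#include <string.h>

struct task {
    char title[50];
    int execution;
    int priority;
};

/* Removes the element at `index` by shifting every later element one
   position down, then decrements *count.
   Returns 1 on success, 0 if `index` is out of range. */
int remove_at(struct task *tasks, size_t *count, size_t index)
{
    if (index >= *count)
        return 0;

    /* memmove (not memcpy) because source and destination overlap. */
    memmove(&tasks[index], &tasks[index + 1],
            (*count - index - 1) * sizeof tasks[0]);
    (*count)--;
    return 1;
}
```

With the globals from above, the call would be remove_at(p, &count, i). The capacity stays as it is; shrinking the allocation is optional.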

Get the length of an array with a pointer? [duplicate]

I've allocated an "array" of mystruct of size n like this:
if (NULL == (p = calloc(sizeof(struct mystruct) * n,1))) {
/* handle error */
}
Later on, I only have access to p, and no longer have n. Is there a way to determine the length of the array given just the pointer p?
I figure it must be possible, since free(p) does just that. I know malloc() keeps track of how much memory it has allocated, and that's why it knows the length; perhaps there is a way to query for this information? Something like...
int length = askMallocLibraryHowMuchMemoryWasAlloced(p) / sizeof(mystruct)
I know I should just rework the code so that I know n, but I'd rather not if possible. Any ideas?
No, there is no way to get this information without depending strongly on the implementation details of malloc. In particular, malloc may allocate more bytes than you request (e.g. for efficiency in a particular memory architecture). It would be much better to redesign your code so that you keep track of n explicitly. The alternative is at least as much redesign and a much more dangerous approach (given that it's non-standard, abuses the semantics of pointers, and will be a maintenance nightmare for those that come after you): store the length n at the malloc'd address, followed by the array. Allocation would then be:
void *p = calloc(sizeof(struct mystruct) * n + sizeof(unsigned long int), 1);
*((unsigned long int*)p) = n;
n is now stored at *((unsigned long int*)p) and the start of your array is now
void *arr = (char *)p + sizeof(unsigned long int); /* cast: arithmetic on void * is not standard C */
Edit: Just to play devil's advocate... I know that these "solutions" all require redesigns, but let's play it out.
Of course, the solution presented above is just a hacky implementation of a (well-packed) struct. You might as well define:
typedef struct {
unsigned int n;
void *arr;
} arrInfo;
and pass around arrInfos rather than raw pointers.
Now we're cooking. But as long as you're redesigning, why stop here? What you really want is an abstract data type (ADT). Any introductory text for an algorithms and data structures class would do it. An ADT defines the public interface of a data type but hides the implementation of that data type. Thus, publicly an ADT for an array might look like
typedef void* arrayInfo;
arrayInfo newArrayInfo(unsigned int n, unsigned int itemSize);
void deleteArrayInfo(arrayInfo);
unsigned int arrayLength(arrayInfo);
void *arrayPtr(arrayInfo);
...
In other words, an ADT is a form of data and behavior encapsulation; it's about as close as you can get to Object-Oriented Programming using straight C. Unless you're stuck on a platform that doesn't have a C++ compiler, you might as well go whole hog and just use an STL std::vector.
There, we've taken a simple question about C and ended up at C++. God help us all.
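For anyone staying in C, here is a rough sketch of what could sit behind such an interface. It is a variation, not the interface above verbatim: it uses a named struct instead of void *, size_t instead of unsigned int, and the array_* names are invented for this example:

```c
#include <stdlib.h>

/* In a real ADT this definition would live in the .c file, with only
   a forward declaration (struct array_info;) in the public header. */
struct array_info
{
    size_t length;
    size_t item_size;
    void *items;
};

/* Creates an array of n zeroed items; returns NULL on failure. */
struct array_info *array_new(size_t n, size_t item_size)
{
    struct array_info *a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    a->items = calloc(n, item_size);
    if (a->items == NULL && n != 0 && item_size != 0)
    {
        free(a);
        return NULL;
    }
    a->length = n;
    a->item_size = item_size;
    return a;
}

size_t array_length(const struct array_info *a) { return a->length; }

void *array_ptr(struct array_info *a) { return a->items; }

void array_delete(struct array_info *a)
{
    if (a != NULL)
    {
        free(a->items);
        free(a);
    }
}
```

The length travels with the storage, so the original question ("how long is this array?") becomes a call to array_length.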
Keep track of the array size yourself; free uses the malloc chain to free the block that was allocated, which does not necessarily have the same size as the array you requested.
Just to confirm the previous answers: There is no way to know, just by studying a pointer, how much memory was allocated by a malloc which returned this pointer.
What if it worked?
Here is one example of why this is not possible. Let's imagine the code with a hypothetical function called get_size(void *) which returns the memory allocated for a pointer:
typedef struct MyStructTag
{ /* etc. */ } MyStruct ;
void doSomething(MyStruct * p)
{
/* well... extract the memory allocated? */
size_t i = get_size(p) ;
initializeMyStructArray(p, i) ;
}
void doSomethingElse()
{
MyStruct * s = malloc(sizeof(MyStruct) * 10) ; /* Allocate 10 items */
doSomething(s) ;
}
Why, even if it worked, would it not work anyway?
But the problem with this approach is that, in C, you can play with pointer arithmetic. Let's rewrite doSomethingElse():
void doSomethingElse()
{
MyStruct * s = malloc(sizeof(MyStruct) * 10) ; /* Allocate 10 items */
MyStruct * s2 = s + 5 ; /* s2 points to the 6th item (index 5) */
doSomething(s2) ; /* Oops */
}
How is get_size supposed to work when you sent the function a valid pointer, but not the one returned by malloc? And even if get_size went through all the trouble to find the size (i.e. in an inefficient way), it would return, in this case, a value that would be wrong in your context.
Conclusion
There are always ways to avoid this problem, and in C, you can always write your own allocator, but again, it is perhaps too much trouble when all you need is to remember how much memory was allocated.
Some compilers provide msize() or similar functions (_msize() etc), that let you do exactly that
May I recommend a terrible way to do it?
Allocate all your arrays as follows:
void *blockOfMem = malloc(sizeof(mystruct)*n + sizeof(int));
((int *)blockOfMem)[0] = n;
mystruct *structs = (mystruct *)(((int *)blockOfMem) + 1);
Then you can always cast your arrays to int * and access the -1st element.
Be sure to free that pointer, and not the array pointer itself!
Also, this will likely cause terrible bugs that will leave you tearing your hair out. Maybe you can wrap the alloc funcs in API calls or something.
malloc will return a block of memory at least as big as you requested, but possibly bigger. So even if you could query the block size, this would not reliably give you your array size. So you'll just have to modify your code to keep track of it yourself.
For an array of pointers you can use a NULL-terminated array. The length can then be determined the way it is done with strings. In your example you could maybe use a structure attribute to mark the end. Of course that depends on whether there is a member that cannot be NULL. So let's say you have an attribute name that needs to be set for every struct in your array; you can then query the size by:
int size;
struct mystruct *cur;
for (cur = myarray; cur->name != NULL; cur++)
;
size = cur - myarray;
Btw it should be calloc(n, sizeof(struct mystruct)) in your example.
Others have discussed the limits of plain C pointers and the stdlib.h implementations of malloc(). Some implementations provide extensions which return the allocated block size, which may be larger than the requested size.
If you must have this behavior you can use or write a specialized memory allocator. The simplest thing to do would be implementing a wrapper around the stdlib.h functions. Something like:
void* my_malloc(size_t s); /* Calls malloc(s), and if successful stores
(p,s) in a list of handled blocks */
void my_free(void* p); /* Removes list entry and calls free(p) */
size_t my_block_size(void* p); /* Looks up p, and returns the stored size */
...
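One way those three declarations could be filled in is sketched below. The struct block bookkeeping type is an assumption of this example; the lookup is a linear scan and nothing here is thread-safe, so treat it as an illustration rather than a production allocator:

```c
#include <stdlib.h>

/* Hypothetical bookkeeping record: one per live allocation. */
struct block
{
    void *ptr;
    size_t size;
    struct block *next;
};

static struct block *blocks = NULL;   /* list of handled blocks */

/* Calls malloc(s) and, if successful, stores (p, s) in the list. */
void *my_malloc(size_t s)
{
    struct block *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->ptr = malloc(s);
    if (b->ptr == NULL)
    {
        free(b);
        return NULL;
    }
    b->size = s;
    b->next = blocks;
    blocks = b;
    return b->ptr;
}

/* Looks up p and returns the stored size, or 0 if p is unknown. */
size_t my_block_size(void *p)
{
    for (struct block *b = blocks; b != NULL; b = b->next)
        if (b->ptr == p)
            return b->size;
    return 0;
}

/* Removes the list entry for p and frees the memory. Pointers that
   were not handed out by my_malloc are ignored. */
void my_free(void *p)
{
    for (struct block **link = &blocks; *link != NULL; link = &(*link)->next)
    {
        if ((*link)->ptr == p)
        {
            struct block *b = *link;
            *link = b->next;
            free(b->ptr);
            free(b);
            return;
        }
    }
}
```

The stored size is the size you asked for, which is what you want for computing an element count, rather than whatever rounded-up size the underlying allocator actually reserved.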
Really, your question is: "can I find out the size of a malloc'd (or calloc'd) data block?" And as others have said: no, not in a standard way.
However there are custom malloc implementations that do it - for example http://dmalloc.com/
I'm not aware of a way, but I would imagine it would deal with mucking around in malloc's internals which is generally a very, very bad idea.
Why is it that you can't store the size of memory you allocated?
EDIT: If you know that you should rework the code so you know n, well, do it. Yes it might be quick and easy to try to poll malloc but knowing n for sure would minimize confusion and strengthen the design.
One of the reasons that you can't ask the malloc library how big a block is, is that the allocator will usually round up the size of your request to meet some minimum granularity requirement (for example, 16 bytes). So if you ask for 5 bytes, you'll get a block of size 16 back. If you were to take 16 and divide by 5, you would get three elements when you really only allocated one. It would take extra space for the malloc library to keep track of how many bytes you asked for in the first place, so it's best for you to keep track of that yourself.
This is a test of my max routine. It sets up 7 variables to hold float values, then assigns them to an array, which is used to find the max value.
The magic is in the call to myMax:
float mmax = myMax((float *)&arr,(int) sizeof(arr)/sizeof(arr[0]));
And that was magical, wasn't it?
myMax expects a pointer to the first float of the array (float *), so I use &arr to get the address of the array and cast it to a float pointer. (Passing plain arr would also work, since an array decays to a pointer to its first element.)
myMax also expects the number of elements in the array as an int. I get that value by using sizeof to get the byte size of the whole array and of its first element, then dividing the former by the latter. (We should not guess or hard-code the size of an element, because type sizes vary: an int is 2 bytes on some systems and 4 on others, like my OS X Mac, and could be something else elsewhere.)
NOTE: All this is important when your data may have a varying number of samples.
Here's the test code:
#include <stdio.h>
float a, b, c, d, e, f, g;
float myMax(float *apa,int soa){
int i;
float max = apa[0];
for(i=0; i< soa; i++){
if (apa[i]>max){max=apa[i];}
printf("on i=%d val is %0.2f max is %0.2f, soa=%d\n",i,apa[i],max,soa);
}
return max;
}
int main(void)
{
a = 2.0;
b = 1.0;
c = 4.0;
d = 3.0;
e = 7.0;
f = 9.0;
g = 5.0;
float arr[] = {a,b,c,d,e,f,g};
float mmax = myMax((float *)&arr,(int) sizeof(arr)/sizeof(arr[0]));
printf("mmax = %0.2f\n",mmax);
return 0;
}
In uClibc, there is a MALLOC_SIZE macro in malloc.h:
/* The size of a malloc allocation is stored in a size_t word
MALLOC_HEADER_SIZE bytes prior to the start address of the allocation:
+--------+---------+-------------------+
| SIZE |(unused) | allocation ... |
+--------+---------+-------------------+
^ BASE ^ ADDR
^ ADDR - MALLOC_HEADER_SIZE
*/
/* The amount of extra space used by the malloc header. */
#define MALLOC_HEADER_SIZE \
(MALLOC_ALIGNMENT < sizeof (size_t) \
? sizeof (size_t) \
: MALLOC_ALIGNMENT)
/* Set up the malloc header, and return the user address of a malloc block. */
#define MALLOC_SETUP(base, size) \
(MALLOC_SET_SIZE (base, size), (void *)((char *)base + MALLOC_HEADER_SIZE))
/* Set the size of a malloc allocation, given the base address. */
#define MALLOC_SET_SIZE(base, size) (*(size_t *)(base) = (size))
/* Return base-address of a malloc allocation, given the user address. */
#define MALLOC_BASE(addr) ((void *)((char *)addr - MALLOC_HEADER_SIZE))
/* Return the size of a malloc allocation, given the user address. */
#define MALLOC_SIZE(addr) (*(size_t *)MALLOC_BASE(addr))
On my system, malloc() stores metadata about the allocation in the 8 bytes just before the address it returns, and this can be used to determine the size of the buffer. On my x86-64 machine this always yields a multiple of 16, so if the requested size is a multiple of 16 (which it is in most cases), then this could be used:
Code
#include <stdio.h>
#include <malloc.h>
int size_of_buff(void *buff) {
return ( *( ( int * ) buff - 2 ) - 17 ); // 32 bit system: ( *( ( int * ) buff - 1 ) - 17 )
}
int main(void) {
char *buff = malloc(1024);
if (buff == NULL) return 1;
printf("Size of Buffer: %d\n", size_of_buff(buff));
free(buff);
return 0;
}
Output
Size of Buffer: 1024
This is my approach:
#include <stdio.h>
#include <stdlib.h>
typedef struct _int_array
{
int *number;
int size;
} int_array;
/* Appends n to the array. The caller must initialize *a to {NULL, 0} first. */
int int_array_append(int_array *a, int n)
{
/* Grow by one element; if realloc fails, the old block is left untouched. */
int *more_numbers = realloc(a->number, (a->size + 1) * sizeof(int));
if(more_numbers == NULL)
{
printf("Error (re)allocating memory.\n");
return 1;
}
a->number = more_numbers;
a->number[a->size] = n;
a->size++;
return 0;
}
int main(void)
{
/* Initialize here rather than through a static flag inside the function, */
/* which would only ever set up the first array passed in. */
int_array a = {NULL, 0};
int_array_append(&a, 10);
int_array_append(&a, 20);
int_array_append(&a, 30);
int_array_append(&a, 40);
int i;
for(i = 0; i < a.size; i++)
printf("%d\n", a.number[i]);
/* a.size * sizeof(int) has type size_t, so it is printed with %zu. */
printf("\nLen: %d\nSize: %zu\n", a.size, a.size * sizeof(int));
free(a.number);
return 0;
}
Output:
10
20
30
40
Len: 4
Size: 16
If your compiler supports VLA (variable length array), you can embed the array length into the pointer type.
int n = 10;
int (*p)[n] = malloc(n * sizeof(int));
n = 3;
printf("%zu\n", sizeof(*p)/sizeof(**p));
The output is 10.
You could also choose to embed the information into the allocated memory yourself with a structure including a flexible array member.
struct myarray {
int n;
struct mystruct a[];
};
struct myarray *ma =
malloc(sizeof(*ma) + n * sizeof(struct mystruct));
ma->n = n;
struct mystruct *p = ma->a;
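Then to recover the size, you would subtract the offset of the flexible member from the element pointer to get back to the enclosing structure (offsetof is declared in <stddef.h>).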
int get_size (struct mystruct *p) {
struct myarray *ma;
char *x = (char *)p;
ma = (void *)(x - offsetof(struct myarray, a));
return ma->n;
}
The problem with trying to peek into heap structures is that the layout might change from platform to platform or from release to release, and so the information may not be reliably obtainable.
Even if you knew exactly how to peek into the meta information maintained by your allocator, the information stored there may have nothing to do with the size of the array. The allocator simply returned memory that could be used to fit the requested size, but the actual size of the memory may be larger (perhaps even much larger) than the requested amount.
The only reliable way to know the information is to find a way to track it yourself.

Resources