One element array in struct - arrays

Why do some structs use a single-element array, as in the following?
typedef struct Bitmapset
{
int nwords;
uint32 words[1];
} Bitmapset;
Is it to make later dynamic allocation convenient?

In a word, yes.
Basically, the C99 way to do it is with a flexible array member:
uint32 words[];
Some pre-C99 compilers let you get away with:
uint32 words[0];
But the form that every compiler accepts, the classic "struct hack", is:
uint32 words[1];
And then, no matter how it's declared, you can allocate the object with:
Bitmapset *allocate(int n)
{
    Bitmapset *p = malloc(offsetof(Bitmapset, words) + n * sizeof(p->words[0]));
    if (p != NULL) /* malloc can fail */
        p->nwords = n;
    return p;
}
Though for best results you should use size_t instead of int.

This is usually to allow idiomatic access to variable-sized struct instances. Considering your example, at runtime, you may have a Bitmapset that is laid out in memory like this:
------------------
| nwords   |   3 |
| words[0] |  10 |
| words[1] |  20 |
| words[2] |  30 |
------------------
So you end up with a runtime-variable number of uint32 "hanging off" the end of your struct, but accessible as if they're defined inline in the struct. This is basically (ab)using the fact that C does no runtime array-bounds checking to allow you to write code like:
for (int i = 0; i < myset.nwords; i++) {
    printf("%u\n", (unsigned) myset.words[i]); /* words are unsigned, so %u rather than %d */
}
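To tie the two answers together, here is a minimal sketch of the C99 flexible-array form with allocation and indexed access. It assumes uint32_t from <stdint.h> in place of the question's uint32 typedef, and the function names (bitmapset_new, bitmapset_add, bitmapset_contains) are made up for illustration; they are not taken from any real library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* C99 flexible array member: words[] adds nothing to sizeof(Bitmapset). */
typedef struct Bitmapset {
    size_t   nwords;
    uint32_t words[];
} Bitmapset;

/* Allocate a set with room for n words, all cleared; NULL on failure. */
Bitmapset *bitmapset_new(size_t n)
{
    /* Reject sizes whose byte count would overflow size_t. */
    if (n > (SIZE_MAX - sizeof(Bitmapset)) / sizeof(uint32_t))
        return NULL;

    Bitmapset *p = calloc(1, sizeof(Bitmapset) + n * sizeof(uint32_t));
    if (p != NULL)
        p->nwords = n;
    return p;
}

/* Set bit number `bit`; the caller must keep bit < 32 * nwords. */
void bitmapset_add(Bitmapset *set, size_t bit)
{
    assert(bit / 32 < set->nwords);
    set->words[bit / 32] |= (uint32_t)1 << (bit % 32);
}

/* Return 1 if the bit is set, 0 otherwise (including out-of-range bits). */
int bitmapset_contains(const Bitmapset *set, size_t bit)
{
    if (bit / 32 >= set->nwords)
        return 0;
    return (int)((set->words[bit / 32] >> (bit % 32)) & 1u);
}
```

Because everything lives in one block, a single free(set) releases the header and the words together.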

Related

How is memory allocated to multi-nested structs in C?

A couple of days ago, I asked this question. I duplicated 90% of the code given in the answer to my previous question. However, when I used Valgrind to do memcheck, it told me that there were memory leaks. But I don't think it was that 10% difference that caused the memory leaks. In addition to the memory leak issue, I have a couple of other questions.
A brief summary of my previous post:
I have a multi-nested struct. I need to correctly allocate memory to it and free the memory later on. The structure of the entire struct should look like this:
                      College Sys
      |            |            |       ...       |
    ColleA       ColleB       ColleC    ...     ColleX
  |  |  |  |   |  |  |  |   |  |  |  |  ...   |  |  |  |
 sA sB sC sD  sA sB sC sD  sA sB sC sD  ...  sA sB sC sD
  |  |  |  |   |  |  |  |   |  |  |  |  ...   |  |  |  |
 fam fam ...
// Colle is short for college
// s is short for stu (which is short for student)
There could be an arbitrary number of colleges and students, controlled by #define MAX_NUM _num_.
As per the previous answer, I should allocate memory in the order of "outermost to innermost" and free the memory "innermost to outermost". I basically understand the logic behind the pattern. The following questions are extensions to it.
1) Does
CollegeSys -> Colle = malloc(sizeof(*(CollegeSys -> Colle)) * MAX_NUM);
CollegeSys -> Colle -> stu = malloc(sizeof(*(CollegeSys -> Colle -> stu)) * MAX_NUM);
CollegeSys -> Colle -> stu -> fam = malloc(sizeof(*(CollegeSys -> Colle -> stu -> fam)));
mean "there are MAX_NUM colleges under the college system, each of which has MAX_NUM students, each of which has one family"?
1.a) If yes, do I still need for loops to initialize every single value contained in this huge struct?
For example, the possibly correct way:
for (int i = 0; i < MAX_NUM; i++) {
    strcpy(CollegeSys -> Colle[i].name, "collestr");
    for (int n = 0; n < MAX_NUM; n++) {
        strcpy(CollegeSys -> Colle[i].stu[n].name, "stustr");
        ...
    }
}
the possibly incorrect way:
strcpy(CollegeSys -> Colle -> name, "collestr");
strcpy(CollegeSys -> Colle -> stu -> name, "stustr");
I tried the "possibly incorrect way". There was no syntax error, but it would only initialize CollegeSys -> Colle[0].name and ... -> stu[0].name. So, the second approach is very likely to be incorrect if I want to initialize every single attribute.
2) If I modularize this whole process, separating the process into several functions that return corresponding struct pointers — newSystem(void), newCollege(void), newStudent(void) (arguments might not necessarily be void; we might also pass a str as the name to the functions; besides, there might be a series of addStu(), etc... to assign those returned pointers to the corresponding part of CollegeSys). When I create a new CollegeSys in newSystem(), is it correct to malloc memory to every nested struct once and for all within newSystem()?
2.a) If I allocate memory to all parts of the struct in newSystem(), the possible consequence I can think of so far is there would be memory leaks. Since we've allocated memory to all parts when creating the system, we inevitably have to create a new struct pointer and allocate adequate memory to it in the other two functions too. For instance,
struct Student* newStudent(void) {
struct Student* newStu = malloc(sizeof(struct Student));
newStu -> fam = malloc(sizeof(*(newStu -> fam)));
// I'm not sure if I'm supposed to also allocate memory to fam struct
...
return newStu;
}
If so, we actually allocate the same amount of memory for an instance at least twice: once in newSystem(void) and once in newStudent(void). If I'm correct so far, this is definitely a memory leak. And the memory we allocate to the pointer newStu in newStudent(void) can never be freed (I think). Then what is the correct way to allocate memory for the whole structure when we split the allocation process into several small steps?
3) Do we have to use sizeof(*(layer1 -> layer2 -> ...)) when malloc'ing memory to a struct nested in a struct? Can we directly specify the type of it? For example, doing this
CollegeSys -> Colle = malloc(sizeof(struct College) * MAX_NUM);
// instead of
// CollegeSys -> Colle = malloc(sizeof(*(CollegeSys -> Colle)) * MAX_NUM);
4) It seems even if we allocate a certain amount of memory to a pointer, we still can't prevent segfault. For example, we code
// this is not completely correct C code, just to show what I mean
struct type *ptr = malloc(sizeof(struct type) * 3);
We could still access ptr[3], ptr[4] and so on, and the program will print out nonsense. Sometimes an error is raised, but sometimes not. So, essentially, we can't rely on malloc (or calloc and so forth) to avoid segfaults?
I'm sorry about writing such a long text. Thanks for your patience.
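As a rough sketch of the "allocate outermost to innermost, free innermost to outermost" pattern described above: the struct definitions below are hypothetical (the question does not show the real ones), MAX_NUM is given an arbitrary value, and new_system, free_system and student_at are illustrative names. Every college needs its own student array and every student its own family, so the allocation is done in loops rather than once per level.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NUM 3   /* hypothetical value; the question leaves it open */

/* Hypothetical definitions: the question does not show the real ones. */
struct Family  { char name[32]; };
struct Student { char name[32]; struct Family *fam; };
struct College { char name[32]; struct Student *stu; };
struct System  { struct College *colle; };

/* Free innermost to outermost. Works on a partially built system because
   calloc zero-fills, which reads as NULL pointers on mainstream platforms. */
void free_system(struct System *sys)
{
    if (sys == NULL)
        return;
    if (sys->colle != NULL) {
        for (int i = 0; i < MAX_NUM; i++) {
            struct Student *stu = sys->colle[i].stu;
            if (stu != NULL) {
                for (int n = 0; n < MAX_NUM; n++)
                    free(stu[n].fam);
                free(stu);
            }
        }
        free(sys->colle);
    }
    free(sys);
}

/* Allocate outermost to innermost; NULL if any allocation fails. */
struct System *new_system(void)
{
    struct System *sys = calloc(1, sizeof *sys);
    if (sys == NULL)
        return NULL;

    sys->colle = calloc(MAX_NUM, sizeof *sys->colle);
    if (sys->colle == NULL)
        goto fail;

    for (int i = 0; i < MAX_NUM; i++) {
        struct College *c = &sys->colle[i];
        strcpy(c->name, "collestr");
        c->stu = calloc(MAX_NUM, sizeof *c->stu);
        if (c->stu == NULL)
            goto fail;
        for (int n = 0; n < MAX_NUM; n++) {
            strcpy(c->stu[n].name, "stustr");
            c->stu[n].fam = calloc(1, sizeof *c->stu[n].fam);
            if (c->stu[n].fam == NULL)
                goto fail;
        }
    }
    return sys;

fail:
    free_system(sys);
    return NULL;
}

/* Bounds-checked accessor: malloc cannot stop out-of-range indexing,
   so the check has to live in the code that does the indexing. */
struct Student *student_at(struct System *sys, int i, int n)
{
    assert(i >= 0 && i < MAX_NUM && n >= 0 && n < MAX_NUM);
    return &sys->colle[i].stu[n];
}
```

With this shape there is exactly one owner for every block, so one call to free_system releases everything that new_system allocated.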

malloc once, then distribute memory over struct arrays

I have a struct that has the following memory layout:
uint32_t
variable length array of type uint16_t
variable length array of type uint16_t
Because of the variable length of the arrays I have pointers to these arrays, effectively:
struct struct1 {
uint32_t n;
uint16_t *array1;
uint16_t *array2;
};
typedef struct struct1 struct1;
Now, when allocating these structs, I see two options:
A) malloc the struct itself, then malloc space for the arrays individually and set the pointers in the struct to point to the correct memory location:
uint32_t n1 = 10;
uint32_t n2 = 20;
struct1 *s1 = malloc(sizeof(struct1));
uint16_t *array1 = malloc(sizeof(uint16_t) * n1);
uint16_t *array2 = malloc(sizeof(uint16_t) * n2);
s1->n = n1;
s1->array1 = array1;
s1->array2 = array2;
B) malloc memory for everything combined, then "distribute" the memory over the struct:
struct1 *s1 = malloc(sizeof(struct1) + (n1 + n2) * sizeof(uint16_t));
s1->n = n1;
s1->array1 = (uint16_t *)(s1 + 1); /* first byte after the struct */
s1->array2 = s1->array1 + n1;      /* n1 elements further on */
Note that array1 and array2 are not bigger than a few KB and usually not a lot of struct1s are needed. However, cache efficiency is a concern as numeric data crunching is done with this struct.
Is approach B) possible and if so better (faster) than A in terms of memory locality?
I am not very familiar with C. Is there a better way of doing B (or A), i.e. using memcpy or realloc or something?
Anything else to be mindful about in this situation?
Note that right now I'm using gcc (C89?) on Linux but could use C99/C11 if necessary. Thanks in advance.
EDIT: To clarify further: The size of the arrays will never change after creation. Multiple struct1s will not always be allocated at once, but rather occasionally during the program's runtime.
I think your option A is much cleaner and would scale in a more sensible way. Imagine having to realloc space when the array in one of the structures becomes larger: in option A, you can realloc that memory since it isn't logically attached to anything else. In option B, you need to add in additional logic to ensure you don't break your other array.
I also think (even in C89, but I could be wrong) that there is nothing wrong with this:
struct1 *s1 = malloc(sizeof(struct1));
s1->array1 = malloc(sizeof(uint16_t) * n1);
s1->array2 = malloc(sizeof(uint16_t) * n2);
s1->n = n1;
The above takes out the middle-man arrays. I think it is cleaner because you immediately see that you are allocating space for a pointer in a structure.
I have used your option B before for 2D arrays, where I just allocate a single space and use logical rules in my code to use it as a 2D space. This is useful when I want it to be a rectangular 2D space, so when I increase it, I always increase each row or column. In other words, I never want to have heterogeneous array sizes.
Update: 'Arrays will never change in size'
Because you clarified that your structures/arrays will never need to be reallocated, I think option B is less bad. It still seems to be a worse solution for this application than option A, and here are my reasons for thinking this:
malloc is optimized well enough that there is little to gain from allocating a single block instead of allocating the blocks individually.
The ability of other engineers to look at and immediately understand your code would be reduced. To be clear, any competent software engineer should be able to look at option B and figure out what the writer of the code was doing, but it very well could waste that engineer's brain-cycles and could cause a junior engineer to misunderstand the code and create a bug.
So, if you comment the code thoroughly, and your application absolutely requires you to optimize everything you possibly can, at the expense of clean and logically sensible code (where memory space and data structures are logically separated in a similar way), and you know that this optimization is better than what a good compiler (like Clang) can do, then option B could be a better option.
Update: Testing
In the spirit of self-criticism I wanted to see if I could evaluate the difference. So I wrote two programs (one for option A and one for option B) and compiled them with optimizations off. I used a FreeBSD virtual machine to get as clean of an environment as possible, and I used gcc.
Here are the programs that I used to test the two methods:
optionA.c:
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#define NSIZE 100000
#define NTESTS 10000000
struct test_struct {
int n;
int *array1;
int *array2;
};
void freeA(struct test_struct *input) {
free(input->array1);
free(input->array2);
free(input);
return;
}
void optionA() {
struct test_struct *s1 = malloc(sizeof(*s1));
s1->array1 = malloc(sizeof(*(s1->array1)) * NSIZE);
s1->array2 = malloc(sizeof(*(s1->array1)) * NSIZE);
s1->n = NSIZE;
freeA(s1);
s1 = 0;
return;
}
int main() {
clock_t beginA = clock();
int i;
for (i=0; i<NTESTS; i++) {
optionA();
}
clock_t endA = clock();
int time_spent_A = (endA - beginA);
printf("Time spent for option A: %d\n", time_spent_A);
return 0;
}
optionB.c:
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#define NSIZE 100000
#define NTESTS 10000000
struct test_struct {
int n;
int *array1;
int *array2;
};
void freeB(struct test_struct *input) {
free(input);
return;
}
void optionB() {
struct test_struct *s1 = malloc(sizeof(*s1) + 2*NSIZE*sizeof(*(s1->array1)));
s1->array1 = (int *)(s1 + 1);     /* s1 + 1 is the first byte past the struct */
s1->array2 = s1->array1 + NSIZE;  /* pointer arithmetic counts in ints here */
s1->n = NSIZE;
freeB(s1);
s1 = 0;
return;
}
int main() {
clock_t beginB = clock();
int i;
for (i=0; i<NTESTS; i++) {
optionB();
}
clock_t endB = clock();
int time_spent_B = (endB - beginB);
printf("Time spent for option B: %d\n", time_spent_B);
return 0;
}
Results for these tests are given in clocks (see clock(3) for more information).
Series | Option A | Option B
------------------------------
1 | 332 | 158
------------------------------
2 | 334 | 155
------------------------------
3 | 334 | 156
------------------------------
4 | 333 | 154
------------------------------
5 | 339 | 156
------------------------------
6 | 334 | 155
------------------------------
avg | 334.3 | 155.7
------------------------------
Note that these speeds are still incredibly fast and translate to milliseconds over millions of tests. I have also found that Clang (cc) is better than gcc at optimizing. On my machine, even after writing a method that writes data to the arrays (to ensure they don't get optimized out of existence), I got no difference between the two methods when compiling with cc.
I would advise a hybrid of the two:
allocate the structs in one call (it is now an array of structs);
allocate the arrays in one call, and make sure the size includes any padding for the alignment required by your compiler/platform;
distribute the arrays over the structs, taking the alignment into account.
However, malloc is already optimized, so your first solution would still be preferred.
Note: as user Greg Schmit's solution points out, allocating all the arrays at once will cause difficulties if the array size needs to change at run time.
Because the two arrays have the same type, there are many more options than that, based on creative use of the C99 flexible array member. I'd recommend you use a pointer only for the second array,
struct foo {
uint16_t *array2;
uint32_t field;
uint16_t array1[];
};
and allocate memory for both at the same time:
struct foo *foo_new(const size_t length1, const size_t length2)
{
struct foo *result;
result = malloc( sizeof (struct foo)
+ length1 * sizeof (uint16_t)
+ length2 * sizeof (uint16_t) );
if (!result)
return NULL;
result->array2 = result->array1 + length1;
return result;
}
Note that with struct foo *bar, accessing element i in the two arrays uses the same notation, bar->array1[i] and bar->array2[i], respectively.
In the context of scientific computing, I would consider completely different options, depending on the access patterns. For example, if the two arrays are accessed in lockstep (in any direction), I would use
typedef uint16_t pair16[2];
struct bar {
uint32_t field;
pair16 array[];
};
If the arrays were large, then copying them into temporary buffers (arrays of pair16 above, if accessed in lockstep) would possibly help, but with at most a few thousand entries, it is likely not going to give a significant speed boost.
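A small sketch of the lockstep layout, under the assumption that field holds the number of pairs (the answer does not say what field is for). bar_new and bar_dot are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint16_t pair16[2];

struct bar {
    uint32_t field;     /* used here as the number of pairs (an assumption) */
    pair16   array[];   /* array[i][0] and array[i][1] sit side by side */
};

/* One allocation for the header and all pairs; NULL on failure. */
struct bar *bar_new(uint32_t count)
{
    struct bar *b = malloc(sizeof *b + (size_t)count * sizeof(pair16));
    if (b != NULL)
        b->field = count;
    return b;
}

/* Walk both logical arrays in lockstep: a single pass over contiguous memory. */
uint64_t bar_dot(const struct bar *b)
{
    assert(b != NULL);
    uint64_t sum = 0;
    for (uint32_t i = 0; i < b->field; i++)
        sum += (uint64_t)b->array[i][0] * b->array[i][1];
    return sum;
}
```

Each iteration touches two adjacent uint16_t values, so both "arrays" arrive in the same cache line.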
In cases where the access pattern of one array depends on the other, but you still do enough computation on each entry, it may be useful to compute the address of the next entry early and use the GCC built-in __builtin_prefetch() to tell the CPU you'll need it soon, before doing the computation on the current entry. It may reduce the data access latencies (although the access predictors are pretty darn good on current processors already).
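To make the prefetch idea concrete, here is a hedged sketch in which one array supplies the indices into the other. gather_sum and its parameters are hypothetical, and prefetching only one step ahead is purely illustrative; whether it helps at all has to be measured on the target machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum values[index[i]] for i in [0, n). The address of the next element
   depends on index[], so it is computed one step early and prefetched. */
uint64_t gather_sum(const uint16_t *values, const uint32_t *index, size_t n)
{
    assert(values != NULL && index != NULL);
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__)
        if (i + 1 < n)
            __builtin_prefetch(&values[index[i + 1]]);  /* a hint only */
#endif
        sum += values[index[i]];    /* work on the current entry */
    }
    return sum;
}
```

The prefetch is a hint and never changes the result, so the function returns the same sum on compilers where the built-in is unavailable.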
With GCC (and to a lesser extent on other common compilers like Intel Compiler Collection, Portland Group, and Pathscale C compilers), I've noticed that code that manipulates pointers (instead of array pointers and array indexing) compiles to better machine code on x86 and x86-64. (The reason is actually quite simple: with array pointers and array indexing, you need at least two separate registers, and x86 has relatively few of those. Even x86-64 doesn't have that many of them. GCC in particular is not very strong at optimizing register usage -- it's much better now than in the version 3 era --, so this seems to help a lot in some cases). For example, if you were to access the first array in a struct foo sequentially, then
void do_something(struct foo *ref)
{
uint16_t *array1 = ref->array1;
uint16_t *const limit1 = ref->array1 + (number of elements in array1);
for (; array1 < limit1; array1++) {
/* ... */
}
}
Approach B is possible (why don't you just try it?), and it is better, not so much because of memory locality, but because malloc() costs, so the fewer times you call it, the better off you are. (Assuming that 'better' means 'faster', which, admittedly, is not necessarily the case.)
Memory locality is only marginally improved, since all memory blocks would most likely be contiguous (one after the other) in memory, so if you went with approach A your arrays would only be separated by block headers, which are not very big. (Of the order of 32 bytes each, maybe a bit larger, but not by much.) The only situation in which your blocks would not be contiguous is if you had previously been doing malloc() and free(), so your memory would be fragmented.
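For reference, a minimal sketch of approach B using the struct from the question. The detail that needs care is pointer arithmetic: adding to a struct1 * advances in whole structs, not bytes, so the array addresses are derived from s + 1 and from the uint16_t pointer. struct1_new is an illustrative name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct struct1 {
    uint32_t  n;
    uint16_t *array1;
    uint16_t *array2;
} struct1;

/* One malloc for the struct and both arrays; one free(s) releases it all. */
struct1 *struct1_new(uint32_t n1, uint32_t n2)
{
    size_t count = (size_t)n1 + n2;
    struct1 *s = malloc(sizeof *s + count * sizeof(uint16_t));
    if (s == NULL)
        return NULL;

    s->n = n1;
    /* s + 1 is the first byte past the struct. */
    s->array1 = (uint16_t *)(s + 1);
    /* array1 is a uint16_t *, so + n1 skips exactly n1 elements. */
    s->array2 = s->array1 + n1;

    /* Sanity check: the end of array2 is the end of the allocation. */
    assert((unsigned char *)(s->array2 + n2) ==
           (unsigned char *)s + sizeof *s + count * sizeof(uint16_t));
    return s;
}
```

Since the arrays are not separate allocations, they must never be passed to free or realloc on their own.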

Freeing 2D array - Heap Corruption Detected

EDIT: Sorry guys, I forgot to mention that this is coded in VS2013.
I have a globally declared struct:
typedef struct data //Struct for storing search & sort run-time statistics.
{
int **a_collision;
} data;
data data1;
I then allocate my memory:
data1.a_collision = (int**)malloc(sizeof(int)*2); //Declaring outer array size - value/key index.
for (int i = 0; i < HASH_TABLE_SIZE; i++)
data1.a_collision[i] = (int*)malloc(sizeof(int)*HASH_TABLE_SIZE); //Declaring inner array size.
I then initialize all the elements:
//Initializing 2D collision data array.
for (int i = 0; i < 2; i++)
for (int j = 0; j < HASH_TABLE_SIZE; j++)
data1.a_collision[i][j] = NULL;
And lastly, I wish to free the memory (which FAILS). I have unsuccessfully tried following some of the answers given on SO already.
free(data1.a_collision);
for (int i = 0; i < HASH_TABLE_SIZE; i++)
free(data1.a_collision[i]);
A heap corruption detected error is given at the first free statement. Any suggestions?
There are multiple mistakes in your code: the logic for allocating memory for a two-dimensional array is wrong, and there are some typos as well.
From the comment in your code, "outer array size - value/key index", it looks like you want to allocate memory for a 2D array of size "2 * HASH_TABLE_SIZE", whereas the loop condition "i < HASH_TABLE_SIZE;" suggests you want an array of size "HASH_TABLE_SIZE * 2".
Allocate memory:
Let's assume you want to allocate memory for "2 * HASH_TABLE_SIZE"; you can apply the same concept to other dimensions.
The dimension "2 * HASH_TABLE_SIZE" means two rows and HASH_TABLE_SIZE columns. The correct allocation steps are as follows:
Step 1: First create an array of int pointers whose length equals the number of rows.
data1.a_collision = malloc(2 * sizeof(int*));
// 2 rows ^ ^ you are missing `*`
This creates an array of two int pointers (int*). In your outer-array allocation you allocated memory for two int objects, 2 * sizeof(int), whereas you need memory to store addresses. The total number of bytes to allocate should be 2 * sizeof(int*).
You can picture the above allocation as:
                      343  347
                     +----+----+
data1.a_collision---►| ?  | ?  |
                     +----+----+
? means a garbage value; malloc does not initialize the memory it allocates.
It has allocated two memory cells, each of which can store the address of an int.
In the picture I have assumed that the size of int* is 4 bytes.
Additionally, notice that I didn't cast the address returned by malloc: malloc returns void*, which is generic and converts implicitly to any other object pointer type (in fact, in C we should avoid the cast; read "Do I cast the result of malloc?" for more).
Step 2: Allocate memory for each row as an array whose length is the number of columns you need, that is, HASH_TABLE_SIZE. So you need to loop over the number of rows (not over HASH_TABLE_SIZE) to allocate an array for each row, as below:
for(int i = 0; i < 2; i++)
// ^^^^ notice
data1.a_collision[i] = malloc(HASH_TABLE_SIZE * sizeof(int));
// ^^^^^
In each row you are going to store HASH_TABLE_SIZE ints, so the number of bytes you need per row is HASH_TABLE_SIZE * sizeof(int). You can picture it as:
data1.a_collision = 343
         |
         ▼                 201   205   209   213
     +--------+          +-----+-----+-----+-----+
 343 |        |          |  ?  |  ?  |  ?  |  ?  |  //for i = 0
     |  201   |-------+  +-----+-----+-----+-----+
     |        |       +-----------▲
     +--------+            502   506   510   514
     |        |          +-----+-----+-----+-----+
 347 |        |          |  ?  |  ?  |  ?  |  ?  |  //for i = 1
     |  502   |-------+  +-----+-----+-----+-----+
     +--------+       +-----------▲
data1.a_collision[0] = 201
data1.a_collision[1] = 502
In the picture I am assuming HASH_TABLE_SIZE = 4 and that the size of int is 4 bytes; note the address values.
These are the correct allocation steps.
Deallocate memory:
Besides the allocation, your deallocation steps are wrong!
Remember that once you have called free on a pointer, you can't access the memory it pointed to (through that pointer or through any other pointer). Doing so invokes undefined behavior: it is an illegal memory access that may be detected at runtime and may cause a segmentation fault or a "Heap Corruption Detected" error. Your code frees data1.a_collision first and then reads data1.a_collision[i] in the loop.
The correct deallocation steps are the reverse of the allocation, as below:
for(int i = 0; i < 2; i++)
free(data1.a_collision[i]); // free memory for each rows
free(data1.a_collision); //free for address of rows.
Furthermore, this is only one way to allocate memory for a two-dimensional array, similar to what you were trying to do. There is a better way that allocates the complete 2D array contiguously; for this you should read "Allocate memory 2d array in function C" (in that linked answer I have also given links on how to allocate memory for 3D arrays).
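One possible form of the contiguous allocation mentioned above (not necessarily the one in the linked answer) uses a pointer to fixed-size rows. The value of HASH_TABLE_SIZE and the names collision_row and collision_table_new are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

#define HASH_TABLE_SIZE 8   /* hypothetical value for illustration */

/* A row of HASH_TABLE_SIZE ints. A pointer to such rows gives one malloc,
   one free, and table[i][j] indexing still works. */
typedef int collision_row[HASH_TABLE_SIZE];

/* Allocate `rows` zero-filled rows in a single block; NULL on failure. */
collision_row *collision_table_new(size_t rows)
{
    assert(rows > 0);
    /* calloc zero-fills, so no separate initialization loop is needed. */
    return calloc(rows, sizeof(collision_row));
}
```

A table created as collision_row *t = collision_table_new(2); is indexed as t[i][j] and released with a single free(t), so the order-of-free mistake cannot happen.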
Here is a start:
Your "outer array" has space for two integers, not two pointers to integer.
Is HASH_TABLE_SIZE equal to 2? Otherwise, your first for loop will write outside the array you just allocated.
There are several issues:
The first allocation is not correct; you should allocate an array of (int *):
#define DIM_I 2
#define DIM_J HASH_TABLE_SIZE
data1.a_collision = (int**)malloc(sizeof(int*)*DIM_I);
The second one is not correct either:
for (int i = 0; i < DIM_I; i++)
data1.a_collision[i] = (int*)malloc(sizeof(int)*DIM_J);
When you free memory, you have to free in last-in, first-out order:
for (int i = 0; i < DIM_I; i++)
free(data1.a_collision[i]);
free(data1.a_collision);

Dynamically allocating array explain

This is sample code my teacher showed us about "How to dynamically allocate an array in C?". But I don't fully understand this. Here is the code:
int i, k;
int** test;
printf("Enter a value for k: ");
scanf("%d", &k);
test = (int **)malloc(k * sizeof(int*));
for (i = 0; i < k; i++) {
test[i] = (int*)malloc(k * sizeof(int)); //Initialize all the values
}
I thought in C, to define an array you had to put the [] after the name, so what exactly is int** test; isn't it just a pointer to a pointer? And the malloc() line is also really confusing me.....
According to the declaration int** test;, test is a pointer to a pointer, and the code piece allocates memory for a matrix of int values dynamically using the malloc function.
Statement:
test = (int **)malloc(k * sizeof(int*));
// ^^------^^-------
// allocate for k int* values
This allocates contiguous memory for k pointers to int (int*). So if k = 4 you get something like the following (the diagrams label the pointer temp; it is the variable test from the question):
 temp      343  347  351  355
+----+    +----+----+----+----+
|343 |---►| ?  | ?  | ?  | ?  |
+----+    +----+----+----+----+
I am assuming addresses are four bytes, and ? means garbage values.
The variable temp is assigned the address returned by malloc; malloc allocates a contiguous memory block of size k * sizeof(int*), which in my example is 16 bytes.
In the for loop you allocate memory for k ints and assign the returned address to temp[i] (a slot in the previously allocated array).
test[i] = (int*)malloc(k * sizeof(int)); //Initialize all the values
// ^^-----^^----------
// allocate for k int values
Note: the expression temp[i] is the same as *(temp + i). So in each iteration of the for loop you allocate memory for an array of k int values, which looks something like this:
First malloc                           For loop
---------------                        ------------------
 temp
+-----+
| 343 |--+
+-----+  |
         ▼                               201   205   209   213
     +--------+                        +-----+-----+-----+-----+
 343 |temp[0] |= *(temp + 0)           |  ?  |  ?  |  ?  |  ?  |  //for i = 0
     |  201   |-------+                +-----+-----+-----+-----+
     |        |       +------------------▲
     +--------+                          502   506   510   514
     |        |                        +-----+-----+-----+-----+
 347 |temp[1] |= *(temp + 1)           |  ?  |  ?  |  ?  |  ?  |  //for i = 1
     |  502   |-------+                +-----+-----+-----+-----+
     |        |       +------------------▲
     +--------+                          43    47    51    55
     |        |                        +-----+-----+-----+-----+
 351 |temp[2] |= *(temp + 2)           |  ?  |  ?  |  ?  |  ?  |  //for i = 2
     |   43   |-------+                +-----+-----+-----+-----+
     |        |       +------------------▲
     +--------+                          9002  9006  9010  9014
     |        |                        +-----+-----+-----+-----+
 355 |temp[3] |= *(temp + 3)           |  ?  |  ?  |  ?  |  ?  |  //for i = 3
     |  9002  |-------+                +-----+-----+-----+-----+
     |        |       +------------------▲
     +--------+
Again ? means garbage values.
Additional points:
1) You are casting the address returned by malloc, but in C you should avoid that. Read "Do I cast the result of malloc?" and just do as follows:
test = malloc(k* sizeof(int*));
for (i = 0; i < k; i++){
test[i] = malloc(k * sizeof(int));
}
2) If you allocate memory dynamically, you need to free it explicitly when you are done with it (after freeing dynamically allocated memory you can't access that memory). The steps to free the memory for test are as follows:
for (i = 0; i < k; i++){
free(test[i]);
}
free(test);
3) This is one way to allocate memory for a 2D matrix, as an array of arrays. If you want to allocate completely contiguous memory for all the arrays, check this answer: Allocate memory 2d array in function C
4) If the description helps and you want to learn about 3D allocation, check this answer: Matrix of String or/ 3D char array
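Putting points 1) and 2) together, a minimal sketch of a matched allocate/free pair that also cleans up if an allocation fails part-way. matrix_new and matrix_free are illustrative names, and calloc is used instead of malloc so that the cells start at zero rather than garbage.

```c
#include <assert.h>
#include <stdlib.h>

/* Free a matrix of `rows` rows built by matrix_new. */
void matrix_free(int **m, size_t rows)
{
    if (m == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(m[i]);          /* inner arrays first */
    free(m);                 /* then the array of pointers */
}

/* Allocate a rows x cols matrix of zeros; NULL on failure. */
int **matrix_new(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    int **m = calloc(rows, sizeof *m);       /* rows pointers to int */
    if (m == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        m[i] = calloc(cols, sizeof *m[i]);   /* cols ints per row */
        if (m[i] == NULL) {
            matrix_free(m, i);               /* release rows 0..i-1 and m */
            return NULL;
        }
    }
    return m;
}
```

For the teacher's example this would be called as matrix_new(k, k) and later released with matrix_free(test, k).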
Remember that arrays decay to pointers and can be used as pointers, and that pointers can be indexed like arrays. In fact, indexing an array can be seen as a form of pointer arithmetic. For example:
int a[3] = { 1, 2, 3 }; /* Define and initialize an array */
printf("a[1] = %d\n", a[1]); /* Use array indexing */
printf("*(a + 1) = %d\n", *(a + 1)); /* Use pointer arithmetic */
Both outputs above will print the second (index 1) item in the array.
The same is true of pointers: they can be used with pointer arithmetic or with array indexing.
From the above, you can think of a pointer-to-pointer-to-type as an array-of-arrays-of-type. But that's not the whole truth, as they are stored differently in memory. So you cannot pass an array-of-arrays as an argument to a function that expects a pointer-to-pointer. You can, however, after you have initialized it, use a pointer-to-pointer with array indexing like normal pointers.
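The difference can be seen in function signatures. In this sketch (COLS, sum_2d and sum_pp are names invented for the example) a true 2D array is received as a pointer to rows, while the pointer-to-pointer version needs a separate array of row pointers.

```c
#include <assert.h>
#include <stddef.h>

#define COLS 3

/* Takes a true 2D array: the parameter is a pointer to rows of COLS ints. */
int sum_2d(int (*grid)[COLS], size_t rows)
{
    int total = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < COLS; j++)
            total += grid[i][j];
    return total;
}

/* Takes a pointer-to-pointer: each rows[i] is a separate pointer to ints. */
int sum_pp(int **rows, size_t nrows, size_t ncols)
{
    assert(rows != NULL);
    int total = 0;
    for (size_t i = 0; i < nrows; i++)
        for (size_t j = 0; j < ncols; j++)
            total += rows[i][j];
    return total;
}
```

Given int a[2][COLS], the call sum_2d(a, 2) compiles, but sum_pp(a, 2, COLS) is a type error; to use sum_pp one first has to build int *rows[2] = { a[0], a[1] }; and pass rows instead.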
malloc is used to dynamically allocate memory for the test variable. Think of * as an array and ** as an array of arrays, except that, rather than holding the values directly, the pointers hold the memory addresses where the values live. When malloc is called, you allocate memory by taking the size of an element and multiplying it by the count the user supplies, because that count is not known before the user enters it.
Yes, it is perfectly OK. test is a pointer to a pointer, so test[i], which is equivalent to writing *(test + i), is a pointer. For better understanding, please have a look at this C FAQ.
Yes indeed, int** is a pointer to a pointer. Here it is used as a dynamically allocated array of pointers.
test = (int **) malloc(k * sizeof(int*));
This will allocate an array of k pointers first. malloc dynamically allocates memory.
test[i] = (int*) malloc(k * sizeof(int));
Allocating k ints per pointer gives a k x k matrix. If each pointer only needs to refer to a single int, it is enough to write
test[i] = (int*) malloc(sizeof(int));
Here we make each slot of the array point to valid memory. However, for basic types like int this kind of one-element allocation makes little sense. It is useful for larger types (structs).
Each pointer can be accessed like an array and vice versa; for example, the following two assignments are equivalent.
int a;
test[i] = &a;
*(test + i) = &a;
This could be array test in memory that is allocated beginning at offset 0x10000000:
+------------+------------+
| OFFSET | POINTER |
+------------+------------+
| 0x10000000 | 0x20000000 | test[0]
+------------+------------+
| 0x10000004 | 0x30000000 | test[1]
+------------+------------+
| ... | ...
Each element (in this example 0x20000000 and 0x30000000) is a pointer to another allocated block of memory.
+------------+------------+
| OFFSET | VALUE |
+------------+------------+
| 0x20000000 | 0x00000001 | *(test[0]) = 1
+------------+------------+
| ...
+------------+------------+
| 0x30000000 | 0x00000002 | *(test[1]) = 2
+------------+------------+
| ...
Each of these blocks has space for sizeof(int) only.
In this example, test[0][0] would be equivalent to *(test[0]); however, test[0][1] would not be valid, since it would access memory that was not allocated.
For every type T there exists a type “pointer to T”.
Variables can be declared as being pointers to values of various types, by means of the * type declarator. To declare a variable as a pointer, precede its name with an asterisk.
Since "for every type T" also applies to pointer types, there exist multi-level pointers like char** or int*** and so on. There also exist "pointer to array" types, but they are less common than "array of pointers" (http://en.wikipedia.org/wiki/C_data_types).
So int** test declares a pointer to a pointer, used here to refer to an array of pointers, each of which points to an "int array".
The line test = (int **)malloc(k*sizeof(int*)); sets aside enough memory for k (int*)'s,
so there are k pointers, each pointing to...
test[i] = (int*)malloc(k * sizeof(int)); (each pointer points to an array of k ints)
Summary...
int** test; ends up referring to k pointers, each pointing to k ints.
int** is a pointer to a pointer to int. Take a look at the "right-left" rule.

keeping track of how much memory malloc has allocated

After a quick scan of related questions on SO, I have deduced that there's no function that would check the amount of memory that malloc has allocated to a pointer. I'm trying to replicate some of std::string basic functionality (mainly dynamic size) using simple char*'s in C and don't want to call realloc all the time. I guess I'll need to keep track of how much memory has been allocated. In order to do that, I'm considering creating a typedef that will contain the string itself and an integer with the amount of memory currently allocated, something like this:
typedef struct {
char * str;
int mem;
} my_string_t;
Is that an optimal solution, or perhaps you can suggest something that will bear better results? Thanks in advance for your help.
You will want to allocate the space for both the length and the string in the same block of memory. This may be what you intended with your struct, but you have reserved space for only a pointer to the string.
There must be space allocated to contain the characters of the string.
For example:
#include <stdlib.h>
#include <string.h>
typedef struct
{
    int num_chars;
    char string[];
} my_string_t;
my_string_t * alloc_my_string(const char *src)
{
    my_string_t * p = NULL;
    int N_chars = (int)strlen(src) + 1; /* includes the terminating NUL */
    p = malloc( N_chars + sizeof(my_string_t));
    if (p)
    {
        p->num_chars = N_chars;
        strcpy(p->string, src);
    }
    return p;
}
In my example, to access the pointer to your string, you address the string member of the my_string_t:
my_string_t * p = alloc_my_string("hello free store.");
printf("String of %d bytes is '%s'\n", p->num_chars, p->string);
Be careful to realize that you are obtaining the pointer for the string as a consequence of allocating space to store the characters. The resource you are allocating is the storage for the characters, the pointer obtained is a reference to the allocated storage.
In my example, the memory allocated is laid out sequentially as follows:
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| 00 | 00 | 00 | 12 | 'h'| 'e'| 'l'| 'l'| 'o'| 20 | 'f'| 'r'| 'e'| 'e'| 20 | 's'| 't'| 'o'| 'r'| 'e'| '.'| 00 |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
  ^                   ^
  |                   |
  p                   p->string
  p->num_chars
Notice that the value of p->string is not stored in the allocated memory, it is four bytes from the beginning of the allocated memory, immediately subsequent to the (presumed 32-bit, four-byte) integer.
Your compiler may require that you declare the flexible array as:
typedef struct
{
int num_chars;
char string[0];
} my_string_t;
but the version without the zero is the standard C99 flexible array member; the [0] form is a compiler extension (GCC accepts it).
You can accomplish the equivalent thing with no array member as follows:
typedef struct
{
int num_chars;
} mystr2;
char * str_of_mystr2(mystr2 * ms)
{
return (char *)(ms + 1);
}
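mystr2 * alloc_mystr2(const char *src)
{
    mystr2* p = NULL;
    size_t N_chars = strlen(src) + 1;
    /* Assumed bound (needs <limits.h>): the count must fit in an int. */
    if (N_chars < INT_MAX)
    {
        p = malloc(N_chars + sizeof(mystr2));
        if (p)
        {
            p->num_chars = (int)N_chars;
            strcpy(str_of_mystr2(p), src);
        }
    }
    return p;
}
mystr2 * p = alloc_mystr2("hello free store.");
printf("String of %d bytes is '%s'\n", p->num_chars, str_of_mystr2 (p));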
In this second example, the value equivalent to p->string is calculated by str_of_mystr2(). It will have approximately the same value as the first example, depending on how the end of structs are packed by your compiler settings.
While some would suggest tracking the length in a size_t I would look up some old Dr. Dobb's article on why I disagree. Supporting values greater than INT_MAX is of doubtful value to your program's correctness. By using an int, you can write assert(p->num_chars >= 0); and have that test something. With an unsigned, you would write the equivalent test something like assert(p->num_chars < UINT_MAX / 2); As long as you write code which contains checks on run-time data, using a signed type can be useful.
On the other hand, if you are writing a library which handles strings in excess of UINT_MAX / 2 characters, I salute you.
This is the obvious solution. And while you are at it, you might want to have a struct member that maintains the amount of allocated memory actually in use. This will avoid having to call strlen() all the time, and would enable you to support non null-terminated strings, as the C++ std::string class does.
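A minimal sketch of that idea, keeping both the length in use and the allocated capacity in the struct from the question. The field names len and cap, the starting capacity of 16 and the doubling policy are all assumptions made for illustration, as are the function names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *str;   /* heap buffer, NUL-terminated once allocated */
    size_t len;   /* bytes in use, not counting the terminator */
    size_t cap;   /* bytes allocated */
} my_string_t;

/* Append src, growing the buffer geometrically so that realloc is only
   called occasionally. Returns 0 on success, -1 on allocation failure. */
int my_string_append(my_string_t *s, const char *src)
{
    assert(s != NULL && src != NULL);
    size_t add = strlen(src);
    size_t need = s->len + add + 1;

    if (need > s->cap) {
        size_t new_cap = s->cap ? s->cap : 16;
        while (new_cap < need)
            new_cap *= 2;
        char *p = realloc(s->str, new_cap);  /* realloc(NULL, n) acts as malloc */
        if (p == NULL)
            return -1;          /* s is left unchanged and still valid */
        s->str = p;
        s->cap = new_cap;
    }
    memcpy(s->str + s->len, src, add + 1);   /* copies the terminator too */
    s->len += add;
    return 0;
}

/* Release the buffer and reset the string to the empty state. */
void my_string_free(my_string_t *s)
{
    free(s->str);
    s->str = NULL;
    s->len = s->cap = 0;
}
```

A string starts out as my_string_t s = { NULL, 0, 0 }; and the first append performs the initial allocation.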
That is how it was done in the Pleistocene, and that's how you should do it today. You are dead on the money that malloc does not offer any portable, supported, mechanism to query the size of an allocated block.
A more common way is to wrap malloc (and realloc) and keep a list of sizes and pointers.
That way you don't need to change any string functions.
Write wrapper functions. If you are using malloc then you should do that anyway.
For an example, look in "Writing Solid Code".
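One way to sketch such wrappers is shown below. Note that it does not keep a separate list of sizes and pointers, and it is not the code from the book; it swaps in a different, common technique, storing the requested size in a small header in front of each block. All names (block_header, sized_malloc, sized_size, sized_free) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every block. Only `size` is used; the other
   members exist to force conservative alignment of the user pointer
   (C11 code could use max_align_t instead). */
typedef union {
    size_t      size;
    long double ld;
    long long   ll;
    void       *ptr;
} block_header;

/* Like malloc, but remembers the requested size. */
void *sized_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(block_header))
        return NULL;
    block_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;               /* user memory starts after the header */
}

/* Size originally requested for a block returned by sized_malloc. */
size_t sized_size(const void *ptr)
{
    assert(ptr != NULL);
    return ((const block_header *)ptr - 1)->size;
}

/* Like free, for blocks returned by sized_malloc. */
void sized_free(void *ptr)
{
    if (ptr != NULL)
        free((block_header *)ptr - 1);
}
```

Blocks from sized_malloc must only be released with sized_free, since the pointer handed to the caller is not the one malloc returned.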
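I think you could use malloc_usable_size. Note that it is a non-portable extension (glibc declares it in <malloc.h>), and the value it returns may be larger than the size you requested.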
