Shown below is a piece of C code that is intended to reallocate memory inside a function. I would like to know why it crashes during execution, and what an efficient way to do this would be.
int main()
{
int *kn_row, *kn_col, *uk_row, *uk_col;
double *kn_val, *uk_val;
kn_row=NULL, kn_col=NULL, kn_val=NULL, uk_row=NULL, uk_col=NULL, uk_val=NULL;
evaluate_matrices(&kn_row, &kn_col, &kn_val, &uk_row, &uk_col, &uk_val);
........
}
I tried with two types of function:
evaluate_matrices(int **ptr_kn_row, int **ptr_kn_col, double **ptr_kn_val,
int **ptr_uk_row, int **ptr_uk_col, double **ptr_uk_val)
{
........
/* i,j, and k are calculated */
*ptr_kn_row=(int*)realloc(*ptr_kn_row,k*sizeof(int));
*ptr_kn_col=(int*)realloc(*ptr_kn_col,k*sizeof(int));
*ptr_kn_val=(double*)realloc(*ptr_kn_val,k*sizeof(double));
/* and*/
*ptr_uk_row=(int*)realloc(*ptr_uk_row,j*sizeof(int));
*ptr_uk_col=(int*)realloc(*ptr_uk_col,i*sizeof(int));
*ptr_uk_val=(double*)realloc(*ptr_uk_val,i*sizeof(double));
}
The other way is:
evaluate_matrices(int **ptr_kn_row, int **ptr_kn_col, double **ptr_kn_val,
int **ptr_uk_row, int **ptr_uk_col, double **ptr_uk_val)
{
int *temp1,*temp2,*temp3,*temp4;
double *temp5,*temp6;
..........
temp1 =(int*)realloc(*ptr_kn_row, k*sizeof(*temp1));
if(temp1){*ptr_kn_row = temp1;}
temp2 =(int*)realloc(*ptr_kn_col, k*sizeof(*temp2));
if(temp2){*ptr_kn_col = temp2;}
temp5 =(double*) realloc(*ptr_kn_val, k*sizeof(*temp5));
if(temp5){*ptr_kn_val = temp5;}
......
temp3 = (int*)realloc(*ptr_uk_row, j*sizeof(*temp3));
if(temp3){*ptr_uk_row = temp3;}
temp4 = (int*)realloc(*ptr_uk_col, i*sizeof(*temp4));
if(temp4){*ptr_uk_col = temp4;}
temp6 = (double*)realloc(*ptr_uk_val, i*sizeof(*temp6));
if(temp6){*ptr_uk_val = temp6;}
}
The first function is a minor disaster if memory allocation fails. It overwrites the pointer to the previously allocated space with NULL, thereby leaking the memory. If your strategy for handling out of memory is 'exit at once', this barely matters. If you were planning to release the memory, then you've lost it — bad luck.
Consequently, the second function is better. You're probably going to need to keep track of array sizes, though, so I suspect you'd do better with structures rather than raw pointers, where the structure will contain size information as well as the pointers to the allocated data. You must be able to determine how much space is allocated for each array, somehow.
You also need to keep track of which, if any, of the arrays could not be reallocated – so you don't try to access unallocated space.
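A minimal sketch of the structure-based approach suggested above. The names Triplets, triplets_grow and triplets_free are made up for illustration and are not part of the question's code. The idea is to keep the element count next to the three arrays, and to update that count only once all three reallocations have succeeded.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container: three parallel arrays plus their shared length. */
typedef struct {
    int    *row;
    int    *col;
    double *val;
    size_t  count;   /* number of elements allocated in each array */
} Triplets;

/* Grow all three arrays to new_count elements.
 * Returns 0 on success, -1 on failure. On failure, count is unchanged and
 * every array still holds at least count elements, so existing data stays valid. */
int triplets_grow(Triplets *t, size_t new_count)
{
    int *row, *col;
    double *val;

    if (new_count <= t->count)
        return 0;                          /* nothing to do */
    if (new_count > SIZE_MAX / sizeof *val || new_count > SIZE_MAX / sizeof *row)
        return -1;                         /* the byte count would overflow */

    row = realloc(t->row, new_count * sizeof *row);
    if (row == NULL) return -1;
    t->row = row;                          /* realloc succeeded: keep the new block */

    col = realloc(t->col, new_count * sizeof *col);
    if (col == NULL) return -1;
    t->col = col;

    val = realloc(t->val, new_count * sizeof *val);
    if (val == NULL) return -1;
    t->val = val;

    t->count = new_count;                  /* commit only after all three succeeded */
    return 0;
}

void triplets_free(Triplets *t)
{
    free(t->row);
    free(t->col);
    free(t->val);
    t->row = t->col = NULL;
    t->val = NULL;
    t->count = 0;
}
```

A caller would start from a zero-initialized struct (Triplets t = {0};), call triplets_grow(&t, k), and only touch the arrays if the call returned 0. Writing sizeof *row rather than sizeof(int) also keeps the element size tied to the pointer type.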
I spy with my little eye:
*ptr_kn_val=(double*)realloc(*ptr_kn_val,k*sizeof(int));
^^^^^^^^^^^
I'm sure you meant sizeof(double) and this is just a copy-paste error.
On many systems, int is smaller than double, so if that's the case on yours, this is very likely to be the cause of your crash. That is, undefined behaviour at some point after writing past the end of the memory block.
Normally if I want to allocate a zero initialized array I would do something like this:
int size = 1000;
int* i = (int*)calloc(size, sizeof(int));
And later my code can do this to check if an element in the array has been initialized:
if(!i[10]) {
// i[10] has not been initialized
}
However, in this case I don't want to pay the upfront cost of zero-initializing the array because the array may be quite large (i.e. gigs). On the other hand, I can afford to use as much memory as I want.
I think I remember that there is a technique to keep track of the elements in the array that have been initialized, without paying any upfront cost, that also allows O(1) cost (not amortized with a hash table). My recollection is that the technique requires an extra array of the same size.
I think it was something like this:
int size = 1000;
int* i = (int*)malloc(size*sizeof(int));
int** i_markers = (int**)malloc(size*sizeof(int*));
If an entry in the array is used it is recorded like this:
i_markers[10] = &i[10];
And then its use can be checked later like this:
if(i_markers[10] != &i[10]) {
// i[10] has not been initialized
}
Of course this isn't quite right because i_markers[10] could have been randomly set to &i[10].
Can anyone out there remind me of the technique?
Thank you!
I think I remembered it.
Is this right? Is there a better way or are there variations on this?
Thanks again.
(This was updated to be the right answer)
struct lazy_array {
    int size;
    int* values;
    int* used;            /* used[i]: position of i in back_references, or garbage */
    int* back_references; /* indexes that have been used, in order of first use */
    int num_used;
};
int is_index_used(struct lazy_array* lazy, int index);
struct lazy_array* create_lazy_array(int size) {
    struct lazy_array* lazy = (struct lazy_array*)malloc(sizeof(struct lazy_array));
    lazy->size = size;
    lazy->values = (int*)malloc(size*sizeof(int));
    lazy->used = (int*)malloc(size*sizeof(int));
    lazy->back_references = (int*)malloc(size*sizeof(int));
    lazy->num_used = 0;
    return lazy;
}
void use_index(struct lazy_array* lazy, int index, int value) {
    lazy->values[index] = value;
    if(is_index_used(lazy, index))
        return;
    /* first use: record the index and remember where it was recorded */
    lazy->used[index] = lazy->num_used;
    lazy->back_references[lazy->used[index]] = index;
    ++lazy->num_used;
}
int is_index_used(struct lazy_array* lazy, int index) {
    /* used[index] may be garbage, so range-check it before trusting it */
    return lazy->used[index] >= 0 &&
        lazy->used[index] < lazy->num_used &&
        lazy->back_references[lazy->used[index]] == index;
}
On most compilers/standard libraries I know of, large calloc requests (and malloc for that matter) are implemented in terms of the OS's bulk memory request logic. On Linux, that means a copy-on-write mmap-ing of the zero page, and on Windows it means VirtualAlloc. In both cases, the OS gives you memory that is already zero, and calloc recognizes this; it only explicitly zeroes the memory if it was doing a small calloc from the small allocation heap. So until you write to any given page in the allocation, it's zero "for free". No need to be explicitly lazy; the allocator is being lazy for you.
For small allocations it does need to memset to clear the memory, but then, it's fairly cheap to memset a few thousand (or tens of thousands of) bytes. For the really large allocations where zeroing would be costly, you're getting OS provided memory that's zero-ed for free (separate from the rest of the heap); e.g. for dlmalloc in typical configuration, allocations beyond 256 KB will always be freshly mmap-ed and munmap-ed, which means you're getting freshly mapped copy-on-write mappings of the zero page (the cost to zero them being deferred until you perform a write somewhere in the page, and paid whether you got the 256 KB via malloc or calloc).
If you want better guarantees about zeroing, or to get free zeroing on smaller allocations (though it's more wasteful the closer to one page you get), you can just explicitly do what malloc/calloc do implicitly and use the OS provided zero-ed memory, e.g. replace:
sometype *x = calloc(num, sizeof(*x)); // Or the similar malloc(num * sizeof(*x));
if (!x) { ... do error handling stuff ... }
...
free(x);
with either:
sometype *x = mmap(NULL, num * sizeof(*x), PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
if (x == MAP_FAILED) { ... do error handling stuff ... }
...
munmap(x, num * sizeof(*x));
or on Windows:
sometype *x = VirtualAlloc(NULL, num * sizeof(*x), MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
if (!x) { ... do error handling stuff ... }
...
VirtualFree(x, 0, MEM_RELEASE); // VirtualFree with MEM_RELEASE only takes size of 0
It gets you the same lazy initialization (though on Windows, this may mean that the pages have simply been lazily zero-ed in the background between requests, so they'd be "real" zeroes when you got them, vs. *NIX where they'd be CoW-ed from the zero page, so they get zero-ed live when you write to them).
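For the question's use case, the practical upshot of this answer might look like the sketch below. It assumes that zero can serve as the "not yet written" marker, as in the question's first snippet, and the helper names make_zeroed_array and is_unset are invented here. Whether the zeroing is actually deferred depends on the allocator and the OS, as described above; the C standard only promises that the memory reads as zero.

```c
#include <stdlib.h>

/* Allocate n zero-initialized ints. For large n, typical allocators obtain
 * the block straight from the OS, so pages are not touched until written. */
int *make_zeroed_array(size_t n)
{
    return calloc(n, sizeof(int));
}

/* An element counts as "unset" while it still holds calloc's zero. */
int is_unset(const int *a, size_t index)
{
    return a[index] == 0;
}
```

The caller checks the result of make_zeroed_array for NULL, tests elements with is_unset, and releases the block with free as usual.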
This can be done, although it relies on undefined behavior. It is called a lazy array.
The trick is to use a reverse lookup table. Every time you store a value, you store its index in the lazy array:
void store(int value)
{
if (is_stored(value)) return;
lazy_array[value] = next_index;
table[next_index] = value;
++next_index;
}
int is_stored(int value)
{
if (lazy_array[value]<0) return 0;
if (lazy_array[value]>=next_index) return 0;
if (table[lazy_array[value]]!=value) return 0;
return 1;
}
The idea is that if the value has not been stored in the lazy array, then the lazy_array[value] will be garbage. Its value will either be an invalid index or a valid index into your reverse lookup table. If it is an invalid index, then you immediately know nothing has been stored there. If it is a valid index, then you check your table. If you have a match then the value was stored, otherwise it wasn't.
The downside is that reading from uninitialized memory is undefined behavior. Based on my experience, it will probably work, but there are no guarantees.
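Here is a self-contained sketch of the reverse-lookup idea described above, using unsigned indexes so the "negative index" check disappears. The type LazySet and its functions are hypothetical names. Because reading fresh malloc memory is exactly the questionable part, this demo takes a fill_byte parameter and overwrites both arrays with it: that only simulates arbitrary initial contents in a well-defined way, and a real lazy array would skip the memset (and accept the risk mentioned above).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical lazy set over the indexes 0..size-1. */
typedef struct {
    size_t *slot;    /* slot[i]: claimed position of i in dense; may be garbage */
    size_t *dense;   /* dense[0..count-1]: the indexes stored so far */
    size_t  count;
    size_t  size;
} LazySet;

/* fill_byte stands in for whatever garbage malloc would have returned. */
int lazy_init(LazySet *s, size_t size, int fill_byte)
{
    s->slot  = malloc(size * sizeof *s->slot);
    s->dense = malloc(size * sizeof *s->dense);
    if (s->slot == NULL || s->dense == NULL) {
        free(s->slot);
        free(s->dense);
        return -1;
    }
    memset(s->slot,  fill_byte, size * sizeof *s->slot);
    memset(s->dense, fill_byte, size * sizeof *s->dense);
    s->count = 0;
    s->size  = size;
    return 0;
}

/* O(1): garbage in slot[i] fails either the range check or the back-reference. */
int lazy_contains(const LazySet *s, size_t i)
{
    size_t k = s->slot[i];
    return k < s->count && s->dense[k] == i;
}

/* O(1): the caller must pass i < size. */
void lazy_add(LazySet *s, size_t i)
{
    if (lazy_contains(s, i))
        return;
    s->slot[i] = s->count;
    s->dense[s->count] = i;
    s->count++;
}

void lazy_free(LazySet *s)
{
    free(s->slot);
    free(s->dense);
}
```

The membership answer is the same whatever fill_byte is, which is the point of the technique: only entries below count in dense are ever trusted, and those were all written by lazy_add.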
There are many possible techniques. Everything depends on your task. For instance, you can remember the highest initialized index, max, of your array. That is, if your algorithm can guarantee that all elements from 0 to max are initialized, you can use a simple check such as if (0 <= i && i <= max).
But if your algorithm needs to initialize arbitrary elements (i.e. random access), you need a general solution, for instance a more suitable data structure (not a simple array, but a sparse array or something like it).
So, add more details about your task. I expect we'll find the best solution for it.
I'm writing a simple language that compiles to C, and I want to implement smart pointers. I need a bit of help with that though, as I can't seem to think of how I would go around it, or if it's even possible. My current idea is to free the pointer when it goes out of scope, the compiler would handle inserting the frees. This leads to my questions:
How would I tell when a pointer has gone out of scope?
Is this even possible?
The compiler is written in C, and compiles to C. I thought that I could check when the pointer goes out of scope at compile-time, and insert a free into the generated code for the pointer, i.e:
// generated C code.
int main() {
int *x = malloc(sizeof(*x));
*x = 5;
free(x); // inserted by the compiler
}
The scoping rules (in my language) are exactly the same as C.
My current setup is your standard compiler, first it lexes the file contents, then it parses the token stream, semantically analyzes it, and then generates code to C. The parser is a recursive descent parser. I would like to avoid something that happens on execution, i.e. I want it to be a compile-time check that has little to no overhead, and isn't full blown garbage collection.
For functions, each { starts a new scope, and each } closes the corresponding scope. When a } is reached, the variables inside that block go out-of-scope. Members of structs go out of scope when the struct instance goes out of scope. There's a couple exceptions, such as temporary objects go out-of-scope at the next ;, and compilers silently put for loops inside their own block scope.
struct thing {
int member;
};
int foo;
int main() {
struct thing a;
{
int b = 3;
for(int c=0; c<b; ++c) {
int d = rand(); //the return value of rand goes out of scope after assignment
} //d and c go out of scope here
} //b goes out of scope here
}//a and its members go out of scope here
//globals like foo go out-of-scope after main ends
C++ tries really hard to destroy objects in the opposite order they're constructed; you should probably do that in your language too.
(This is all from my knowledge of C++, so it might be slightly different from C, but I don't think it is)
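One way a compiler written in C might track this is a small scope stack that it updates while parsing. The sketch below is only an illustration under assumptions: ScopeTracker and its three functions are invented names, the fixed limits are arbitrary, and it ignores early exits such as return, break and goto, which a real compiler would also have to handle. On each closing brace it emits free() calls for the pointers declared in that block, newest first, matching the reverse-order advice above.

```c
#include <stdio.h>
#include <string.h>

#define MAX_VARS   64
#define MAX_SCOPES 16

typedef struct {
    const char *vars[MAX_VARS];     /* owned pointer names, in declaration order */
    int scope_start[MAX_SCOPES];    /* where each open scope begins in vars */
    int num_vars;
    int depth;
} ScopeTracker;

/* Called when the parser sees '{'. Returns -1 if nesting is too deep. */
int scope_open(ScopeTracker *t)
{
    if (t->depth == MAX_SCOPES) return -1;
    t->scope_start[t->depth++] = t->num_vars;
    return 0;
}

/* Called when an owning pointer is declared in the current scope. */
int scope_declare(ScopeTracker *t, const char *name)
{
    if (t->depth == 0 || t->num_vars == MAX_VARS) return -1;
    t->vars[t->num_vars++] = name;
    return 0;
}

/* Called when the parser sees '}': appends one free() call per variable of
 * the closing scope to the string in out, most recently declared first. */
int scope_close(ScopeTracker *t, char *out, size_t out_size)
{
    int start;
    if (t->depth == 0) return -1;
    start = t->scope_start[--t->depth];
    while (t->num_vars > start) {
        size_t len = strlen(out);
        snprintf(out + len, out_size - len, "free(%s);\n", t->vars[--t->num_vars]);
    }
    return 0;
}
```

The tracker starts zero-initialized (ScopeTracker t = {0};) and out must already hold a valid, possibly empty, string. This only works when each allocation has exactly one owning pointer; aliasing needs something like reference counting.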
As for memory, you'll probably want to do a little magic behind the scenes. Whenever the user mallocs memory, you replace it with something that allocates more memory, and "hide" a reference count in the extra space. It's easiest to do that at the beginning of the allocation, and to keep alignment guarantees, you use something akin to this:
typedef union {
    long double f;
    void* v;
    char* c;
    unsigned long long l;
} bad_alignment; //at least as big and as strictly aligned as any of these types
void* ref_count_malloc(size_t bytes)
{
    char* p = malloc(bytes + sizeof(bad_alignment));
    if (p == NULL)
        return NULL;
    *(int*)p = 1; //now is 1 pointer pointing at this block
    return p + sizeof(bad_alignment); //user data starts after the hidden count
}
When they copy a pointer, you silently add something akin to this before the copy
void copy_pointer(void* from, void* to) {
    //from is the block the pointer used to point at, to is the block it will point at
    if (to != NULL) {
        int* ref_count = (int*)((char*)to - sizeof(bad_alignment));
        ++*ref_count; //one additional pointing at this block
    }
    if (from != NULL)
        ref_count_free(from); //no longer points at previous block
}
And when they free or a pointer goes out of scope, you add/replace the call with something like this:
void ref_count_free(void* ptr) {
    if(ptr) {
        int* ref_count = (int*)((char*)ptr - sizeof(bad_alignment));
        if (--*ref_count == 0) //if no more pointing at this block
            free(ref_count); //free from the start of the real allocation
    }
}
If you have threads, you'll have to add locks to all that. My C is rusty and the code is untested, so do a lot of research on these concepts.
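Pulling the pieces above into one compilable unit, a single-threaded sketch could look like this. The rc_ names are invented for the example and nothing here is a standard API. The count lives in a header placed in front of the user data, and rc_assign stands for what the compiler might emit for a pointer assignment.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header hidden in front of every block. The union pads it so that the
 * user data after it stays aligned for the basic types listed. */
typedef union {
    size_t      count;
    long double ld;
    void       *p;
    long long   ll;
} rc_header;

static rc_header *rc_get_header(void *ptr)
{
    return (rc_header *)ptr - 1;
}

/* Allocate bytes of user data with an initial count of 1. */
void *rc_alloc(size_t bytes)
{
    rc_header *h;
    if (bytes > (size_t)-1 - sizeof *h)
        return NULL;                 /* size computation would overflow */
    h = malloc(sizeof *h + bytes);
    if (h == NULL)
        return NULL;
    h->count = 1;
    return h + 1;                    /* user data starts right after the header */
}

void rc_retain(void *ptr)
{
    if (ptr != NULL)
        rc_get_header(ptr)->count++;
}

/* Drop one reference. Returns 1 if the block was freed, 0 otherwise. */
int rc_release(void *ptr)
{
    rc_header *h;
    if (ptr == NULL)
        return 0;
    h = rc_get_header(ptr);
    if (--h->count > 0)
        return 0;
    free(h);                         /* free the real start of the allocation */
    return 1;
}

/* What might be emitted for "*dst = src;". */
void rc_assign(void **dst, void *src)
{
    rc_retain(src);                  /* retain first so self-assignment is safe */
    rc_release(*dst);
    *dst = src;
}

size_t rc_count(void *ptr)
{
    return rc_get_header(ptr)->count;
}
```

For the generated code, each owning pointer would be initialized to NULL or to the result of rc_alloc, every assignment would go through rc_assign, and every scope exit would call rc_release on the pointers declared in that scope.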
The problem is slightly more difficult, since your code is straightforward, but... what if another pointer is made to point to the same place as x?
// generated C code.
int main() {
int *x = malloc(sizeof(*x));
int *y = x;
*x = 5;
free(x); // inserted by the compiler, now wrong
}
You will doubtless have a heap structure, in which each block has a header that tells a) whether the block is in use, and b) the size of the block. This can be achieved with a small structure, or by using the highest bit for a) in the integer value for b) [is this a 64-bit compiler or 32-bit?]. For simplicity, let's consider:
typedef struct {
bool allocated: 1;
size_t size;
} BlockHeader;
You would have to add another field to that small structure, which would be a reference count. Each time a pointer points to that block in the heap, you increment the reference count. When a pointer stops pointing to a block, its reference count is decremented. If it reaches 0, then the block can be freed, compacted or whatever. The allocated field is no longer needed, because a reference count of 0 already means the block is not in use.
typedef struct {
size_t size;
size_t referenceCount;
} BlockHeader;
Reference counting is quite simple to implement, but it comes with a downside: there is overhead each time the value of a pointer changes. Still, it is the simplest scheme to get working, and that's why some programming languages, such as Python, still use it.
In the following function, I am parsing strings from a linked list and assigning the values to an array of structs. Is there any way to avoid the mallocs inside the while loop? I cannot handle the glibc errors, so I am looking for another way. I tried to use char arrays instead of char* for the temporary fields, but I get a segmentation fault. The function actually works, but I have to call it 15000 times later, so I want to make sure it won't cause any memory trouble then.
struct CoordNode
{
int resNum;
double coordX;
double coordY;
double coordZ;
char atomName[4];
};
void parseCrdList()
{
int resNum=1;
int tAtomNum,i;
char *tcoordX, *tcoordY, *tcoordZ, *tatomName, tresNum[5];
ccur_node=headCoord_node->next;
struct CoordNode *t;
t=malloc(numofRes*sizeof(struct CoordNode));
i=0;
while (ccur_node!=NULL)
{
tresNum=malloc(5*sizeof(char));
memcpy(tresNum,ccur_node->crdRow+26,4);
resNum=atoi(tresNum);
t[i].resNum=resNum;
tcoordX=malloc(8*sizeof(char));
memcpy(tcoordX,ccur_node->crdRow+35,7);
tcoordY=malloc(8*sizeof(char));
memcpy(tcoordY,ccur_node->crdRow+43,7);
tcoordZ=malloc(8*sizeof(char));
memcpy(tcoordZ,ccur_node->crdRow+51,7);
t[i].coordX=strtod(tcoordX,NULL);
t[i].coordY=strtod(tcoordY,NULL);
t[i].coordZ=strtod(tcoordZ,NULL);
tatomName=malloc(4*sizeof(char));
memcpy(tatomName,ccur_node->crdRow+17,3);
strcpy(t[i].atomName,tatomName);
old_ccur_node=ccur_node;
ccur_node=ccur_node->next;
//free(old_ccur_node);
i++;
}
numofRes=i;
addCoordData(t);
//free(t);
t=NULL;
}
A couple of thoughts and guesses.
First, as I mentioned before, sizeof(char) is always 1 in C; the standard defines sizes in units of char. So remove those multiplications, which are unnecessary and make the code harder to read.
Back to the main problem.
You never use an array of chars bigger than 8, so just make each one a fixed 8-byte array. If you have to call your function 15k times, that will save you a lot of time (malloc takes time to allocate memory for you).
Given the information in the question, I guess your segfault was caused by not initialising the memory you allocated with malloc (or reserved with an automatic char[8] declaration):
1. You allocate (or, in the second version, declare) 8 bytes. That works fine, but those 8 bytes are full of garbage.
2. You copy 7 bytes from your list. That's fine, too. But you forget to null-terminate, so anything that treats the buffer as a string (printing it, for example) can read past its end and segfault. EDIT: If it works, then you probably got lucky, because it shouldn't.
Solution
Replace char * with char[8], remove all the mallocs and frees corresponding to those char *, and null-terminate every char[8] after copying data into it with strcpy, strncpy, or memcpy (whichever you choose, depending on how confident you are that the data in your list is correct).
Check your code with valgrind before further use, too.
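The solution above can be sketched with a few small helpers that copy a fixed-width field into a local buffer, null-terminate it, and only then convert it. The helper names and FIELD_MAX are assumptions made for this example; the offsets and widths used with them would be the ones from the question (for instance offset 35, width 7 for the X coordinate).

```c
#include <stdlib.h>
#include <string.h>

#define FIELD_MAX 15   /* assumed upper bound on the width of a numeric field */

/* Copy width characters starting at row[offset] into dst and terminate it.
 * dst must have room for width + 1 chars, and the row must be at least
 * offset + width characters long. */
void field_to_string(char *dst, const char *row, size_t offset, size_t width)
{
    memcpy(dst, row + offset, width);
    dst[width] = '\0';
}

/* Convert one fixed-width field to double without any malloc. */
double field_to_double(const char *row, size_t offset, size_t width)
{
    char buf[FIELD_MAX + 1];
    if (width > FIELD_MAX)
        width = FIELD_MAX;
    field_to_string(buf, row, offset, width);
    return strtod(buf, NULL);
}

/* Convert one fixed-width field to int without any malloc. */
int field_to_int(const char *row, size_t offset, size_t width)
{
    char buf[FIELD_MAX + 1];
    if (width > FIELD_MAX)
        width = FIELD_MAX;
    field_to_string(buf, row, offset, width);
    return (int)strtol(buf, NULL, 10);
}
```

Inside the loop this would replace each malloc/memcpy pair, e.g. t[i].coordX = field_to_double(ccur_node->crdRow, 35, 7); and field_to_string(t[i].atomName, ccur_node->crdRow, 17, 3);. Copying the field first also protects against a neighbouring column that starts immediately after the field.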
It's surprising that you say this function worked for you. From what I see in your code, it had a lot of memory leaks to begin with, because none of those 8-byte mallocs was ever freed.
The second thing is that it looks like you're allocating the array of CoordNode before knowing the actual number of data records to parse. I added a proper numofRes calculation before the allocation.
Since you don't modify the input data, you don't actually need all those mallocs and memcpys; you can pass crdRow to strtod() directly, assuming it has type char *. This relies on every field being followed by a character that cannot be part of a number (a space, for example), because strtod and atoi stop at the first character they cannot parse.
The last thing: it's generally a bad practice to do allocation in one place and freeing data in another. So it's better you free your headCoord_node structure in a place where it was allocated, after parsing it. The decision of freeing t depends on how addCoordData(t) treats its parameter.
void parseCrdList()
{
struct CoordNode *t;
int i;
// count number of records to parse
numofRes = 0;
ccur_node = headCoord_node->next;
while (ccur_node != NULL)
{
numofRes++;
ccur_node=ccur_node->next;
}
t=malloc(numofRes*sizeof(struct CoordNode));
i=0;
ccur_node = headCoord_node->next;
while (ccur_node!=NULL)
{
t[i].resNum=atoi(ccur_node->crdRow+26);
t[i].coordX=strtod(ccur_node->crdRow+35,NULL);
t[i].coordY=strtod(ccur_node->crdRow+43,NULL);
t[i].coordZ=strtod(ccur_node->crdRow+51,NULL);
strncpy(t[i].atomName,ccur_node->crdRow+17,3);
t[i].atomName[3]='\0'; // atomName holds 3 characters plus the terminator
ccur_node=ccur_node->next;
i++;
}
numofRes=i;
addCoordData(t);
//free(t); // <<< it depends on how addCoordData treats t
}
I have a dynamic array of structures, so I thought I could store the information about the array in the first structure.
So one attribute will represent the amount of memory allocated for the array, and another one will represent the number of structures actually stored in the array.
The trouble is, that when I put it inside a function that fills it with these structures and tries to allocate more memory if needed, the original array gets somehow distorted.
Can someone explain why is this and how to get past it?
Here is my code
#define INIT 3
typedef struct point{
int x;
int y;
int c;
int d;
}Point;
Point empty(){
Point p;
p.x=1;
p.y=10;
p.c=100;
p.d=1000; //if you put different values it will act differently - weird
return p;
}
void printArray(Point * r){
int i;
int total = r[0].y+1;
for(i=0;i<total;i++){
printf("%2d | P [%2d,%2d][%4d,%4d]\n",i,r[i].x,r[i].y,r[i].c,r[i].d);
}
}
void reallocFunction(Point * r){
r=(Point *) realloc(r,r[0].x*2*sizeof(Point));
r[0].x*=2;
}
void enter(Point* r,int c){
int i;
for(i=1;i<c;i++){
r[r[0].y+1]=empty();
r[0].y++;
if( (r[0].y+2) >= r[0].x ){ /*when the amount of Points is near
*the end of allocated memory.
reallocate the array*/
reallocFunction(r);
}
}
}
int main(int argc, char** argv) {
Point * r=(Point *) malloc ( sizeof ( Point ) * INIT );
r[0]=empty();
r[0].x=INIT; /*so here I store for how many "Points" is there memory
//in r[0].y theres how many Points there are.*/
enter(r,5);
printArray(r);
return (0);
}
Your code does not look clean to me for other reasons, but...
void reallocFunction(Point * r){
r=(Point *) realloc(r,r[0].x*2*sizeof(Point));
r[0].x*=2;
r[0].y++;
}
The problem here is that r in this function is the parameter, hence any modifications to it are lost when the function returns. You need some way to change the caller's version of r. I suggest:
Point * // Note new return type...
reallocFunction(Point * r){
r=(Point *) realloc(r,r[0].x*2*sizeof(Point));
r[0].x*=2;
r[0].y++;
return r; // Note: now we return r back to the caller..
}
Then later:
r = reallocFunction(r);
Now... Another thing to consider is that realloc can fail. A common pattern for realloc that accounts for this is:
Point *reallocFunction(Point * r){
void *new_buffer = realloc(r, r[0].x*2*sizeof(Point));
if (!new_buffer)
{
// realloc failed, pass the error up to the caller..
return NULL;
}
r = new_buffer;
r[0].x*=2;
r[0].y++;
return r;
}
This ensures that you don't leak r when the memory allocation fails, and the caller then has to decide what happens when your function returns NULL...
But, some other things I'd point out about this code (I don't mean to sound like I'm nitpicking about things and trying to tear them apart; this is meant as constructive design feedback):
The names of variables and members don't make it very clear what you're doing.
You've got a lot of magic constants. There's no explanation for what they mean or why they exist.
reallocFunction doesn't seem to really make sense. Perhaps the name and interface can be clearer. When do you need to realloc? Why do you double the X member? Why do you increment Y? Can the caller make these decisions instead? I would make that clearer.
Similarly it's not clear what enter() is supposed to be doing. Maybe the names could be clearer.
It's a good thing to do your allocations and manipulation of member variables in a consistent place, so it's easy to spot (and later, potentially change) how you're supposed to create, destroy and manipulate one of these objects. Here it seems in particular like main() has a lot of knowledge of your structure's internals. That seems bad.
Use of the multiplication operator in parameters to realloc in the way that you do is sometimes a red flag... It's a corner case, but the multiplication can overflow and you can end up shrinking the buffer instead of growing it. This would make you crash and in writing production code it would be important to avoid this for security reasons.
You also do not seem to initialize r[0].y. As far as I understood, you should have a r[0].y=0 somewhere.
Anyway, using the first element of the array to do something different is definitely a bad idea. It makes your code much harder to understand. Just create a new structure holding the array size, the capacity, and the pointer.
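A rough sketch of such a structure follows. PointArray, point_array_push, point_array_free and POINT_ARRAY_INIT are hypothetical names chosen for this example. The array is passed by pointer to its owning struct, so the caller always sees the reallocated buffer. realloc goes through a temporary, and both the doubling and the byte count are checked for overflow.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct { int x, y, c, d; } Point;

typedef struct {
    Point *items;      /* the heap buffer, or NULL while empty */
    size_t size;       /* number of elements in use */
    size_t capacity;   /* number of elements allocated */
} PointArray;

#define POINT_ARRAY_INIT 3   /* initial capacity, mirroring INIT in the question */

/* Append p, doubling the capacity when the array is full.
 * Returns 0 on success, -1 on failure; on failure the array is unchanged. */
int point_array_push(PointArray *a, Point p)
{
    if (a->size == a->capacity) {
        size_t new_cap = a->capacity ? a->capacity * 2 : POINT_ARRAY_INIT;
        Point *tmp;
        if (new_cap <= a->capacity || new_cap > SIZE_MAX / sizeof *tmp)
            return -1;               /* doubling or byte count would overflow */
        tmp = realloc(a->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;               /* a->items is still valid and still owned */
        a->items = tmp;
        a->capacity = new_cap;
    }
    a->items[a->size++] = p;
    return 0;
}

void point_array_free(PointArray *a)
{
    free(a->items);
    a->items = NULL;
    a->size = 0;
    a->capacity = 0;
}
```

Usage starts from PointArray a = {0}; followed by point_array_push(&a, p) for each point. With this layout main() no longer needs to know how the bookkeeping is stored.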
Which is considered better style?
int set_int (int *source) {
*source = 5;
return 0;
}
int main(){
int x;
set_int (&x);
}
OR
int *set_int (void) {
int *temp = NULL;
temp = malloc(sizeof (int));
*temp = 5;
return temp;
}
int main (void) {
int *x = set_int ();
}
Coming from a higher-level programming background, I gotta say I like the second version more. Any tips would be very helpful. Still learning C.
Neither.
// "best" style for a function which sets an integer taken by pointer
void set_int(int *p) { *p = 5; }
int i;
set_int(&i);
Or:
// then again, minimise indirection
int an_interesting_int() { return 5; /* well, in real life more work */ }
int i = an_interesting_int();
Just because higher-level programming languages do a lot of allocation under the covers, does not mean that your C code will become easier to write/read/debug if you keep adding more unnecessary allocation :-)
If you do actually need an int allocated with malloc, and to use a pointer to that int, then I'd go with the first one (but bugfixed):
void set_int(int *p) { *p = 5; }
int *x = malloc(sizeof(*x));
if (x == 0) { do something about the error }
set_int(x);
Note that the function set_int is the same either way. It doesn't care where the integer it's setting came from, whether it's on the stack or the heap, who owns it, whether it has existed for a long time or whether it's brand new. So it's flexible. If you then want to also write a function which does two things (allocates something and sets the value) then of course you can, using set_int as a building block, perhaps like this:
int *allocate_and_set_int() {
int *x = malloc(sizeof(*x));
if (x != 0) set_int(x);
return x;
}
In the context of a real app, you can probably think of a better name than allocate_and_set_int...
Some errors:
int main(){
int x*; //should be int* x; or int *x;
set_int(x);
}
Also, you are not allocating any memory in the first code example.
int *x = malloc(sizeof(int));
About the style:
I prefer the first one, because there is less chance of forgetting to free the memory held by the pointer.
The first one is incorrect (apart from the syntax error) - you're passing an uninitialised pointer to set_int(). The correct call would be:
int main()
{
int x;
set_int(&x);
}
If they're just ints, and it can't fail, then the usual answer would be "neither" - you would usually write that like:
int get_int(void)
{
return 5;
}
int main()
{
int x;
x = get_int();
}
If, however, it's a more complicated aggregate type, then the second version is quite common:
struct somestruct *new_somestruct(int p1, const char *p2)
{
struct somestruct *s = malloc(sizeof *s);
if (s)
{
s->x = 0;
s->j = p1;
s->abc = p2;
}
return s;
}
int main()
{
struct somestruct *foo = new_somestruct(10, "Phil Collins");
free(foo);
return 0;
}
This allows struct somestruct * to be an "opaque pointer", where the complete definition of type struct somestruct isn't known to the calling code. The standard library uses this convention - for example, FILE *.
Definitely go with the first version. Notice that this allowed you to omit a dynamic memory allocation, which is SLOW, and may be a source of bugs, if you forget to later free that memory.
Also, if you decide for some reason to use the second style, notice that you don't need to initialize the pointer to NULL. This value will either way be overwritten by whatever malloc() returns. And if you're out of memory, malloc() will return NULL by itself, without your help :-).
So int *temp = malloc(sizeof(int)); is sufficient.
Memory management rules usually state that the allocator of a memory block should also deallocate it. This is impossible when you return allocated memory. Therefore, the first should be better.
For a more complex type like a struct, you'll usually end up with a function to initialize it and maybe a function to dispose of it. Allocation and deallocation should be done separately, by you.
C gives you the freedom to allocate memory dynamically or statically, and having a function work only with one of the two modes (which would be the case if you had a function that returned dynamically allocated memory) limits you.
typedef struct
{
int x;
float y;
} foo;
void foo_init(foo* object, int x, float y)
{
object->x = x;
object->y = y;
}
int main()
{
foo myFoo;
foo_init(&myFoo, 1, 3.1416f);
}
In the second one you would need a pointer to a pointer for it to work, and in the first you are not using the return value, though you should.
I tend to prefer the first one, in C, but that depends on what you are actually doing, as I doubt you are doing something this simple.
Keep your code as simple as you need to get it done, the KISS principle is still valid.
It is best not to return a piece of allocated memory from a function: somebody who does not know how the function works might not deallocate the memory.
The memory deallocation should be the responsibility of the code allocating the memory.
The first is preferred (assuming the simple syntax bugs are fixed) because it is how you simulate an Out Parameter. However, it's only usable where the caller can arrange for all the space to be allocated to write the value into before the call; when the caller lacks that information, you've got to return a pointer to memory (maybe malloced, maybe from a pool, etc.)
What you are asking more generally is how to return values from a function. It's a great question because it's so hard to get right. What you can learn are some rules of thumb that will stop you making horrid code. Then, read good code until you internalize the different patterns.
Here is my advice:
In general any function that returns a new value should do so via its return statement. This applies for structures, obviously, but also arrays, strings, and integers. Since integers are simple types (they fit into one machine word) you can pass them around directly, not with pointers.
Never pass pointers to integers, it's an anti-pattern. Always pass integers by value.
Learn to group functions by type so that you don't have to learn (or explain) every case separately. A good model is a simple OO one: a _new function that creates an opaque struct and returns a pointer to it; a set of functions that take the pointer to that struct and do stuff with it (set properties, do work); a set of functions that return properties of that struct; a destructor that takes a pointer to the struct and frees it. Hey presto, C becomes much nicer like this.
When you do modify arguments (only structs or arrays), stick to conventions, e.g. the standard C library functions always copy from right to left (destination first, as in memcpy and strcpy); the OO model I explained would always put the structure pointer first.
Avoid modifying more than one argument in one function. Otherwise you get complex interfaces you can't remember and you eventually get wrong.
Return 0 for success, -1 for errors, when the function does something which might go wrong. In some cases you may have to return -1 for errors, 0 or greater for success.
The standard POSIX APIs are a good template but don't use any kind of class pattern.
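The simple OO model from the advice above might be sketched as follows. The counter type and its functions are invented for illustration. In a real project only the typedef and the prototypes would go in the header, and the struct definition would stay in the .c file so that callers see an opaque pointer.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct counter counter;      /* all that callers need to see */

struct counter {                     /* would live in the .c file only */
    char *name;
    int   value;
};

/* Constructor: returns a new object, or NULL on allocation failure. */
counter *counter_new(const char *name)
{
    size_t len = strlen(name) + 1;
    counter *self = malloc(sizeof *self);
    if (self == NULL)
        return NULL;
    self->name = malloc(len);
    if (self->name == NULL) {
        free(self);
        return NULL;
    }
    memcpy(self->name, name, len);
    self->value = 0;
    return self;
}

/* Does work that can fail: returns 0 for success, -1 if the sum would overflow. */
int counter_add(counter *self, int amount)
{
    if ((amount > 0 && self->value > INT_MAX - amount) ||
        (amount < 0 && self->value < INT_MIN - amount))
        return -1;
    self->value += amount;
    return 0;
}

/* Property getters return their values directly. */
int counter_value(const counter *self)
{
    return self->value;
}

const char *counter_name(const counter *self)
{
    return self->name;
}

/* Destructor: accepts NULL, like free. */
void counter_destroy(counter *self)
{
    if (self != NULL) {
        free(self->name);
        free(self);
    }
}
```

Every function takes the structure pointer first, the module that allocates the object is also the one that frees it, and only one argument is ever modified per call.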