Specify layout of global variables in C

Consider this piece of code, where two global variables are defined:
int a;
int b;
As far as I know, the compiler may or may not place a and b in adjacent memory locations (please let me know if this is incorrect). For example, with GCC one may compile with -fdata-sections and reorder the two sections or whatever.
Is it possible to specify that a and b must be adjacent (in the sense that &a + 1 == &b), in either standard or GNU extended C?
Background: I am making an OpenGL loader, which is literally (omitting casts):
void (*glActiveShaderProgram)(GLuint, GLuint);
void (*glActiveTexture)(GLenum);
...
void load_gl(void (*(*load)(char *))()) {
glActiveShaderProgram = load("glActiveShaderProgram");
glActiveTexture = load("glActiveTexture");
...
}
Simple enough, but every assignment compiles into its own call to load. Since there is a relatively large number of functions to load, that can take up a lot of code space. (That is the reason I dropped glad.)
So I had something like this, which reduces binary size by ~30kB, which is extremely important for me:
char names[] = "glActiveShaderProgram glActiveTexture ...";
char *p = names, *pp;
for (int i = 0; i < COUNT; ++i) {
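pp = strchr(p, ' '); /* search from the current name, not from the start of names */
if (pp) *pp = '\0'; /* the last name has no trailing space */
(&glActiveShaderProgram)[i] = load(p);
if (pp) p = pp + 1;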
}
But this does assume the specific layout of these function pointers. Currently I wrap the function pointers in a struct which is type-punned into an array of pointers, like this:
union { struct {
void (*glActiveShaderProgram)(GLuint, GLuint);
void (*glActiveTexture)(GLenum);
...
}; void (*table[COUNT])(); } gl;
But then one #define for every function is required to make the user happy. So I wonder if there exists some more elegant way to specify the layout of global variables.

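As Ted suggested in the comments, you could put the variables next to each other inside an array:
int ab[2]; /* ab[0] takes the place of a, ab[1] the place of b */
Another way to control placement is to make the variables members of one struct. Members are laid out in declaration order, and a packed struct additionally removes any padding between them.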
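For the function-pointer case in the question, one way to keep the struct members and the name list in sync is an X-macro. The following is only a sketch under stated assumptions: FN_LIST, fn_table, fn_union, load_all and the alpha/beta/gamma entries are hypothetical names standing in for the real GL entry points, demo_load is a stand-in loader, and _Static_assert needs C11.

```c
#include <stddef.h>
#include <string.h>

typedef void (*fn_ptr)(void);

/* Hypothetical entry-point list; a real loader would list every GL function. */
#define FN_LIST(X) \
    X(alpha)       \
    X(beta)        \
    X(gamma)

/* One member per name, declared in list order. */
struct fn_table {
#define X(name) fn_ptr name;
    FN_LIST(X)
#undef X
};

/* The same names as strings, in the same order. */
static const char *const fn_names[] = {
#define X(name) #name,
    FN_LIST(X)
#undef X
};

enum { FN_COUNT = sizeof fn_names / sizeof fn_names[0] };

/* Refuse to compile if the struct is not laid out like the array. */
_Static_assert(sizeof(struct fn_table) == FN_COUNT * sizeof(fn_ptr),
               "fn_table has unexpected padding");

union fn_union {
    struct fn_table named;   /* u.named.alpha, u.named.beta, ... */
    fn_ptr slots[FN_COUNT];  /* the same storage, indexable */
};

/* Fill every slot by asking the loader for each name in turn. */
static void load_all(union fn_union *u, fn_ptr (*load)(const char *)) {
    for (size_t i = 0; i < FN_COUNT; ++i)
        u->slots[i] = load(fn_names[i]);
}

/* Stand-in loader for the demo: resolves names to local stubs. */
static void stub_alpha(void) {}
static void stub_beta(void) {}
static void stub_gamma(void) {}

static fn_ptr demo_load(const char *name) {
    if (strcmp(name, "alpha") == 0) return stub_alpha;
    if (strcmp(name, "beta") == 0) return stub_beta;
    if (strcmp(name, "gamma") == 0) return stub_gamma;
    return NULL;
}
```

The layout is checked at compile time instead of being assumed, and the single loop keeps the code size independent of the number of functions. The name table does cost one pointer per function, so whether this beats the single space-separated string is something to measure.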

Related

Is it possible to assign data to this "static array" in C using for-loop?

There is a line of code inside a method similar to:
static char data[] = "123456789";
I want to fill the above data array with a million characters, not just nine.
But since it is tedious to type them, I want to do that in a for loop.
Is that possible to do it keeping it as "static char data[]"?
edit:
static char data[1000000];
for(int i=0; i<1000000; i++)
{
data[i] = 1;
}
There are multiple ways to achieve this in C:
you can declare the global static array as uninitialized, write an initialization function and call this function at the beginning of the program. Unlike C++, C does not have a standard way to invoke such an initialisation function at program startup time, yet some compilers might provide an extension for this.
static char data[1000000];
void init_data(void) {
//the loop below will generate the same code as
//memset(data, 1, sizeof data);
for (int i = 0; i < 1000000; i++) {
data[i] = 1;
}
}
int main() {
init_data();
...
}
you can change your program logic so the array can be initialized to 0 instead of 1. This will remove the need for an initialization function and might simplify the code and reduce the executable size.
you can create the initializer for the array using an external program and include its output:
static char data[1000000] = {
#include "init_data.def"
};
you can initialize the array using macros
#define X10(s) s,s,s,s,s,s,s,s,s,s
#define X100(s) X10(s),X10(s),X10(s),X10(s),X10(s),X10(s),X10(s),X10(s),X10(s),X10(s)
#define X1000(s) X100(s),X100(s),X100(s),X100(s),X100(s),X100(s),X100(s),X100(s),X100(s),X100(s)
#define X10000(s) X1000(s),X1000(s),X1000(s),X1000(s),X1000(s),X1000(s),X1000(s),X1000(s),X1000(s),X1000(s)
#define X100000(s) X10000(s),X10000(s),X10000(s),X10000(s),X10000(s),X10000(s),X10000(s),X10000(s),X10000(s),X10000(s)
static char data[1000000] = {
X100000(1), X100000(1), X100000(1), X100000(1), X100000(1),
X100000(1), X100000(1), X100000(1), X100000(1), X100000(1),
};
Note however that this approach will be a stress test for both your compiler and readers of your code. Here are some timings:
clang: 1.867s
gcc: 5.575s
tcc: 0.690s
The last 2 solutions allow for data to be defined as a constant object.
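As for the compiler extension mentioned in the first option: GCC and Clang accept __attribute__((constructor)) on a function, which makes it run before main. This is a sketch that assumes one of those compilers; it is not standard C and other toolchains may need a different mechanism.

```c
#include <string.h>

static char data[1000000];

/* GCC/Clang extension, not standard C: this function runs before main(). */
__attribute__((constructor))
static void init_data(void) {
    memset(data, 1, sizeof data);
}
```

With this in place, main can read data without calling init_data explicitly. The array cannot be const in this variant, because it is still written at run time.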
Is that possible to do it keeping it as "static char data[]"?
No, you have to specify the size explicitly. If you wish to compile-time initialize the array rather than assigning to it in run-time with a for loop or memset, you can use tricks such as this.
Another option might be to use dynamic allocation with malloc instead, but then you have to assign everything in run-time.
You can define statically allocated arrays in various ways. Incidentally, this has nothing to do with the static keyword; see this if you need more information about static variables. The following discussion won't have anything to do with that, hence I will be omitting your static keyword for simplicity.
An array declared as:
char data[] = "123456789";
is sized at compile time (and, when declared inside a function without static, lives on the stack). The compiler can do that since the size of the array is implicitly given by the string "123456789": 10 characters, 9 for the data and 1 for the terminating null character.
char data[];
This, on the other hand, will not compile as a local variable; your compiler will complain about the missing array size. As I said, since the size of the array is fixed at compile time, your compiler wants to know how much to allocate.
char data[1000000];
This on the other hand will compile just fine. Since now the compiler knows how much to allocate. And you can assign elements as you did in a for loop:
for(int i=0; i<1000000; i++)
{
data[i] = 1;
}
Note:
An array of a million chars has quite a respectable size, typically 1 MB, and may overflow your stack if it is a non-static local variable. Whether or not it actually will depends on pretty much everything that it can depend on, but it certainly will raise some eyebrows even if your code works fine. And eventually, if you keep increasing the size, you will end up overflowing your stack.
If you have truly large arrays you need to work with, you can allocate them on the heap, i.e., in the vast empty oceans of your RAM.
The part above hopefully should have answered your question. Below is simply an alternative way to assign a fixed value, such as your (1), to a char array, instead of using for loops. This is nothing but a more convenient way (and perhaps a better practice), you are free to ignore it if it causes confusion.
#include <string.h>
#define SIZE 1000000
// Create the array; with static storage duration it starts out zero-filled.
static char data[SIZE];
int main( void )
{
// Initialise the array: assigns *integer 1* to each element.
memset( data, 1, sizeof data );
//^___ This single line is equivalent of:
// for ( int i = 0; i < SIZE; i++ )
// {
// data[i] = 1;
// }
.
.
.
return 0;
}
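The heap option mentioned in the note could look roughly like this. It is a minimal sketch; make_filled is a hypothetical helper name, not something from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap block of n bytes, each set to value, or NULL on failure.
   The caller owns the block and must free() it. */
static char *make_filled(size_t n, int value) {
    char *p = malloc(n);
    if (p != NULL)
        memset(p, value, n);
    return p;
}
```

A caller would write char *data = make_filled(1000000, 1); check the result for NULL, and call free(data) when done. The size can then be chosen at run time, at the price of having to manage the lifetime by hand.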

Preventing GCC from merging variables in braced groups

Edit:
Apparently accessing variables inside braced groups after they end is undefined behaviour. Since I don't want to use dynamic allocation for nodes (as suggested by @dbush, @ikegami), I assume the next best way to keep hidden variables (within a function) is generating unique variable names for the nodes (with __LINE__) and 'declaring' them without the use of a braced group. The code now reads something like
#define PASTE_(x, y) x ## y
#define PASTE(x, y) PASTE_(x, y)
#define LT_mark_(LABEL, NAME, DELETE)\
struct LifeTime LABEL ={\
.delete=DELETE,\
.prev=lt_head,\
.ref=NAME\
};\
\
lt_head = &LABEL;
#define LT_mark(NAME, DELETE) LT_mark_(PASTE(lt_, __LINE__), NAME, DELETE)
/Edit
I'm trying to keep records for memory allocated within a function's scope.
Records are kept by a LifeTime structure, which form a linked list. This list is later traversed when returning from said function, in order to automatically free the memory. The lt_head variable is used to keep track of the current head of the list.
struct LifeTime {
void (*delete)(void*);
struct LifeTime *prev;
void *ref;
};
#define LT_mark(NAME, DELETE)\
{\
struct LifeTime _ ={\
.delete=DELETE,\
.prev=lt_head,\
.ref=NAME\
};\
\
lt_head = &_;\
}
int example (){
struct LifeTime *lt_head = NULL;
char *s = malloc(64); LT_mark(s, free);
char *s2 = malloc(64); LT_mark(s2, free);
...
}
Using this code, the temporary variables (named _) within the braced groups created by the LT_mark macro, are created with the same memory address.
I assume the reason for this is, as stated in the answer to this question: In C, do braces act as a stack frame?
that variables with non-overlapping usage lifetimes may be merged if the compiler deems it appropriate.
Is there any way to override this behaviour? I acknowledge it may be impossible (I am using GCC without any optimization flags, so I can't simply remove them), but the actual code I am working with requires that the variables inside these groups are kept afterwards, though hidden from visibility (as braced groups do usually). I considered using __attribute__((used)) but apparently this is only valid for functions and global variables.
The lifetime of a variable is that of its enclosing scope, so when that scope ends the variable no longer exists. Saving the address of that variable and attempting to use it when its lifetime has ended causes undefined behavior.
For example:
int *p;
{
int i=4;
p=&i;
printf("*p=%d\n", *p); // prints *p=4
}
printf("*p=%d\n", *p); // undefined behavior, p points to invalid memory
Inside of the braces, p points to valid memory and can be dereferenced. Outside of the braces, p cannot be safely dereferenced.
You'll need to do some dynamic allocation to create these structures. Also, this isn't a place where you should be using a macro instead of a function:
void LT_mark(void *p, void (*cleanup)(void *))
{
struct LifeTime *l = malloc(sizeof *l);
l->delete = cleanup;
l->prev = lt_head;
l->ref = p;
lt_head = l;
}
And similarly the cleanup function:
void LT_clean(void)
{
struct LifeTime *p;
while (lt_head) {
lt_head->delete(lt_head->ref);
p = lt_head; // remember the current node before advancing
lt_head = lt_head->prev;
free(p); // free the node just processed, not the next one
}
}
Also, the prev field should be renamed to next, as the existing name is misleading.
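Putting the two functions together, here is a self-contained sketch of the dynamically allocated variant. It differs from the snippets above in ways that are my assumptions rather than the original design: the list head is passed explicitly instead of being a global, lt_mark reports allocation failure, and count_and_free is a hypothetical cleanup that only exists to make the result observable.

```c
#include <stdlib.h>

struct LifeTime {
    void (*delete)(void *);
    struct LifeTime *prev;
    void *ref;
};

/* Push a record for ref onto the list. Returns 0 on success, -1 if the
   record itself could not be allocated. */
static int lt_mark(struct LifeTime **head, void *ref, void (*cleanup)(void *)) {
    struct LifeTime *l = malloc(sizeof *l);
    if (l == NULL)
        return -1;
    l->delete = cleanup;
    l->prev = *head;
    l->ref = ref;
    *head = l;
    return 0;
}

/* Release every tracked object, newest first, then the records. */
static void lt_clean(struct LifeTime **head) {
    while (*head) {
        struct LifeTime *node = *head;
        *head = node->prev;
        node->delete(node->ref);
        free(node);
    }
}

/* Hypothetical cleanup for the demo: frees and counts the calls. */
static int released = 0;
static void count_and_free(void *p) { free(p); ++released; }
```

Because the records live on the heap, lt_mark may be called from nested blocks and loops without the lifetime problems of the macro version.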
Under most circumstances, you'll want to use @dbush's dynamic allocation solution. Since you're presumably using this with dynamic memory allocations of some kind anyway, dynamically allocating the descriptor blocks shouldn't be a huge overhead.
However, under some really restricted circumstances which you will have to police yourself, and assuming that you're not using an antediluvian version of the C compiler, it is possible to do this fairly simply with compound literals. Aside from the C compiler version limitation (C99 or better, which shouldn't be a huge burden), this will work in exactly the same circumstances as your edit#1 using token concatenation to generate a unique name: that is, if no use of the LT_mark macro is inside a braced-block subordinate to the function.
The reason for this restriction -- which, as I said, applies also to your solution with token concatenation -- is that the lifetime of automatic allocations terminates when control exits from the block in which they were declared. This is an essential aspect of C (and many other programming languages), so it's important to be clear about how it works.
Here's a simple example:
int example (){
struct LifeTime *lt_head = NULL;
char *s = malloc(64); LT_mark(s, free);
for (int i = 0; i < 4; ++i) {
/* InnerBlock */
char *s2 = malloc(64); LT_mark(s2, free);
....
}
/* Lifetime of all variables declared in InnerBlock expires */
....
/* If lt_head points to a struct automatically allocated inside
* InnerBlock, it is now a dangling pointer and cannot be used.
* The next statement is Undefined Behaviour.
*/
freeTheMallocs(lt_head);
}
Note that the problem is not that the inner block is executed more than once (although that will probably guarantee that you notice the problem). The same thing would happen had I written it as a conditional:
int example (int flag){
struct LifeTime *lt_head = NULL;
char *s = malloc(64); LT_mark(s, free);
if (flag) {
/* InnerBlock */
char *s2 = malloc(64); LT_mark(s2, free);
....
}
/* Lifetime of all variables declared in InnerBlock expires */
....
freeTheMallocs(lt_head); /* Dangling pointer */
}
The above cannot work with automatic allocation of descriptor blocks (but it will work fine with dynamic allocation).
OK, so what happens if you absolutely promise to only use LT_mark in the outermost block of your function, as with your original example:
int example (){
struct LifeTime *lt_head = NULL;
char *s = malloc(64); LT_mark(s, free);
char *s2 = malloc(64); LT_mark(s2, free);
freeTheMallocs(lt_head);
}
That will work. Your only problem is how to enforce the restriction, including on all the maintenance programmers who will modify the code after you leave the project, and may not have the foggiest idea of why they're not allowed to nest LT_mark inside a block (or even know that they're not allowed to do that).
But if you like playing with fire, you can do it like this:
#define LT_mark(NAME, DELETE) \
lt_head = &(struct LifeTime){ \
.delete=DELETE, \
.prev=lt_head, \
.ref=NAME \
}
This will work, in the limited set of cases in which it does work, because the compound literal created by the macro "has automatic storage duration associated with the enclosing block." (§6.5.2.5/5).
Honestly, I sincerely hope you don't use the above code. I contribute this answer mostly in the hopes that it provides some kind of explanation of the importance of understanding lifetimes.
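To see that each use of the macro really does produce a distinct record when everything stays in the outermost block, here is a small check. It is a sketch only: count_marks is a hypothetical function, and the cleanup pointers are NULL because nothing is freed in it.

```c
#include <stddef.h>

struct LifeTime {
    void (*delete)(void *);
    struct LifeTime *prev;
    void *ref;
};

/* Same idea as the macro above: each expansion creates its own compound
   literal, which lives until the enclosing block ends. */
#define LT_mark(NAME, DELETE) \
    lt_head = &(struct LifeTime){ .delete = (DELETE), .prev = lt_head, .ref = (NAME) }

/* Marks two objects in the function's outermost block and counts the
   records reachable from lt_head. */
static int count_marks(void) {
    struct LifeTime *lt_head = NULL;
    int a = 1, b = 2;
    LT_mark(&a, NULL);
    LT_mark(&b, NULL);
    int n = 0;
    for (struct LifeTime *p = lt_head; p != NULL; p = p->prev)
        ++n;
    return n;
}
```

Both records are still alive when the list is walked because the walk happens in the same block. Move either LT_mark into a nested block and the walk would read an object whose lifetime has ended.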

Smart Pointers in a language that compiles to C

I'm writing a simple language that compiles to C, and I want to implement smart pointers. I need a bit of help with that though, as I can't seem to think of how I would go around it, or if it's even possible. My current idea is to free the pointer when it goes out of scope, the compiler would handle inserting the frees. This leads to my questions:
How would I tell when a pointer has gone out of scope?
Is this even possible?
The compiler is written in C, and compiles to C. I thought that I could check when the pointer goes out of scope at compile-time, and insert a free into the generated code for the pointer, i.e:
// generated C code.
int main() {
int *x = malloc(sizeof(*x));
*x = 5;
free(x); // inserted by the compiler
}
The scoping rules (in my language) are exactly the same as C.
My current setup is your standard compiler, first it lexes the file contents, then it parses the token stream, semantically analyzes it, and then generates code to C. The parser is a recursive descent parser. I would like to avoid something that happens on execution, i.e. I want it to be a compile-time check that has little to no overhead, and isn't full blown garbage collection.
For functions, each { starts a new scope, and each } closes the corresponding scope. When a } is reached, the variables inside that block go out-of-scope. Members of structs go out of scope when the struct instance goes out of scope. There's a couple exceptions, such as temporary objects go out-of-scope at the next ;, and compilers silently put for loops inside their own block scope.
struct thing {
int member;
};
int foo;
int main() {
thing a;
{
int b = 3;
for(int c=0; c<b; ++c) {
int d = rand(); //the return value of rand goes out of scope after assignment
} //d and c go out of scope here
} //b goes out of scope here
}//a and its members go out of scope here
//globals like foo go out-of-scope after main ends
C++ tries really hard to destroy objects in the opposite order they're constructed, you should probably do that in your language too.
(This is all from my knowledge of C++, so it might be slightly different from C, but I don't think it is)
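Since your compiler is written in C, the bookkeeping for this can be a small stack kept by the code generator. The sketch below is one possible shape, with hypothetical names (scope_tracker, scope_open, scope_declare, scope_close) and fixed limits that a real compiler would replace with growable storage or an error report.

```c
#include <stdio.h>
#include <string.h>

enum { MAX_OWNED = 64, MAX_SCOPES = 16 };

struct scope_tracker {
    const char *owned[MAX_OWNED];  /* owning pointers, in declaration order */
    int owned_count;
    int scope_start[MAX_SCOPES];   /* index in owned[] where each open scope begins */
    int depth;
};

/* Call when the parser sees '{'. */
static void scope_open(struct scope_tracker *t) {
    if (t->depth < MAX_SCOPES)
        t->scope_start[t->depth++] = t->owned_count;
}

/* Call when an owning pointer is declared in the current scope. */
static void scope_declare(struct scope_tracker *t, const char *name) {
    if (t->owned_count < MAX_OWNED)
        t->owned[t->owned_count++] = name;
}

/* Call when the parser sees '}': appends one free() per pointer declared
   in the closing scope to out, newest declaration first. */
static void scope_close(struct scope_tracker *t, char *out, size_t cap) {
    if (t->depth == 0)
        return;
    int start = t->scope_start[--t->depth];
    size_t len = strlen(out);
    while (t->owned_count > start) {
        const char *name = t->owned[--t->owned_count];
        int n = snprintf(out + len, cap - len, "free(%s);\n", name);
        if (n < 0 || (size_t)n >= cap - len)
            break; /* output buffer full */
        len += (size_t)n;
    }
}
```

This only handles the simple case where each pointer is the sole owner of its block. Early exits such as return, break and goto would also need to emit the frees for every scope they leave.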
As for memory, you'll probably want to do a little magic behind the scenes. Whenever the user mallocs memory, you replace it with something that allocates more memory, and "hide" a reference count in the extra space. It's easiest to do that at the beginning of the allocation, and to keep alignment guarantees, you use something akin to this:
typedef union {
long double f;
void* v;
char* c;
unsigned long long l;
} bad_alignment;
void* ref_count_malloc(size_t bytes)
{
void* p = malloc(bytes + sizeof(bad_alignment)); //header first, user data after it
if (p == NULL)
return NULL;
int* ref_count = p;
*ref_count = 1; //now is 1 pointer pointing at this block
return (char*)p + sizeof(bad_alignment); //arithmetic on void* is not standard C
}
When they copy a pointer, you silently add something akin to this before the copy
void copy_pointer(void* from, void* to) {
if (to != NULL) {
int* ref_count = (int*)((char*)to - sizeof(bad_alignment));
++*ref_count; //one additional pointing at this block
}
if (from != NULL)
ref_count_free(from); //no longer points at previous block
}
And when they free or a pointer goes out of scope, you add/replace the call with something like this:
void ref_count_free(void* ptr) {
if(ptr) {
int* ref_count = (int*)((char*)ptr - sizeof(bad_alignment));
if (--*ref_count == 0) //if no more pointing at this block
free(ref_count); //free the original allocation, header included
}
}
If you have threads, you'll have to add locks to all that. My C is rusty and the code is untested, so do a lot of research on these concepts.
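For reference, the same idea can be written a bit more compactly with a header type whose alignment comes from max_align_t (C11). Again a single-threaded sketch with hypothetical names (rc_header, rc_alloc, rc_retain, rc_release, rc_count), not a finished runtime.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every block; the union pads it so that the
   user data after it stays suitably aligned for any type. */
typedef union {
    max_align_t align;
    size_t count;
} rc_header;

static rc_header *rc_header_of(void *p) { return (rc_header *)p - 1; }

/* Allocate bytes of user data with a reference count of 1. */
static void *rc_alloc(size_t bytes) {
    rc_header *h = malloc(sizeof *h + bytes);
    if (h == NULL)
        return NULL;
    h->count = 1;
    return h + 1;
}

/* A new pointer now refers to the block. Returns p for convenience. */
static void *rc_retain(void *p) {
    if (p != NULL)
        ++rc_header_of(p)->count;
    return p;
}

/* A pointer stopped referring to the block. Returns 1 if it was freed. */
static int rc_release(void *p) {
    if (p == NULL)
        return 0;
    rc_header *h = rc_header_of(p);
    if (--h->count > 0)
        return 0;
    free(h);
    return 1;
}

static size_t rc_count(void *p) { return rc_header_of(p)->count; }
```

The generated code for int *y = x; would then become int *y = rc_retain(x); and every scope exit would call rc_release on the pointers it owns.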
The problem is slightly more difficult, since your code is straightforward, but... what if another pointer is made to point to the same place as x?
// generated C code.
int main() {
int *x = malloc(sizeof(*x));
int *y = x;
*x = 5;
free(x); // inserted by the compiler, now wrong
}
You doubtlessly will have a heap structure, in which each block has a header that tells a) whether the block is in use, and b) the size of the block. This can be achieved with a small structure, or by using the highest bit for a) in the integer value for b) [is this a 64bit compiler or 32bit?]. For simplicity, let's consider:
typedef struct {
bool allocated: 1;
size_t size;
} BlockHeader;
You would have to add another field to that small structure, which would be a reference count. Each time a pointer points to that block in the heap, you increment the reference count. When a pointer stops pointing to a block, then its reference count is decremented. If it reaches 0, then it can be compacted or whatever. The use of the allocated field has now gone.
typedef struct {
size_t size;
size_t referenceCount;
} BlockHeader;
Reference counting is quite simple to implement, but comes with a downside: there is overhead each time the value of a pointer changes. Still, it is the simplest scheme to get working, and that's why some programming languages still use it, such as Python.

How do I set values inside a global, fixed-size array, in C (in Visual Studio)?

A part of my VS2012 Windows Phone project is in C. I've been struggling for a whole day trying to initialize an array to put stuff inside it.
Whenever I try to initialize it as global (outside any function), then I get a message telling me that I can't initialize it with a value that isn't a const.
const char* myArray = (const char*)malloc(256);
// Bad code: this isn't initialized with a const
If I don't initialize it with a value, then I'll have a message telling me to give it a value. So I assign a NULL value to the array.
const char* myArray = NULL;
Then I need to set a size somewhere, so I set the size within my main, or first function:
int myFirstFunctionInTheCode()
{
myArray = (char*)malloc(256);
}
Then I get something like:
';' expected before type
So I'm searching on forum and read that C in Visual Studio is C89, thus, I need to declare then to assign on two separate line, which isn't true elsewhere in my code, so I'm completely mixed-up about -real- standards. But I still get the same error when doing it on two lines.
I then decide to use some other tools from the available VS libraries to find out that in C, I can't include sstream, streambuf, etc, otherwise my whole project fails with thousands of bugs. So I could use boost to get a real stream libraries, but it's not compatible with Windows Phone because of some thread usage.
How do I set values inside a global, fixed-size array, in C (in Visual Studio)?
What I want to achieve is similar to something in C# like it:
static byte[] gentleCSharpArray = new byte[256];
private void easyShotCSharpFunction()
{
gentleCSharpArray[0] = 0x57;
gentleCSharpArray[1] = 0x54;
gentleCSharpArray[2] = 0x46;
}
I never spent so much time trying to assign a value to an array, so I guess I'm totally wrong with my global char* arrays?
PART ONE
const char* myArray = (const char*)malloc(256);
This doesn't work because global variables in C are divvied into two spaces: the data segment and the bss segment. For example:
#include <stdio.h>
#include <stdlib.h>
int* myArray; // uninitialized, represented by bss segment
const char* myArray2 = "abc"; // initialized, goes into data segment
int main ()
{
myArray = malloc(3*sizeof(int));
myArray[0] = 111;
myArray[1] = 222;
myArray[2] = 333;
int i;
for (i=0; i<3; i++)
printf("%d, %c\n", myArray[i], myArray2[i]);
return 0;
}
When this is compiled, the const char* myArray2 = "abc"; does not translate into machine instructions. Instead, an image of what "abc" looks like in memory is created and put into the data segment, along with every other initialized global variable. Later, the program loader picks up that entire data segment and sticks it in memory, before your program even starts to run.
Uninitialized variables, like myArray in the example, don't even have that much happen. Rather, it is represented in the BSS segment as the compiler says, "we're going to need n bytes of memory reserved for uninitialized variables." Later, the program loader takes note of this and reserves those n bytes, before your program even starts to run.
Thus, it doesn't make sense to try and malloc when you initialize global variables, because when the globals are created your program isn't running yet. The machine instructions for malloc may not even be in memory yet!
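In practice that means giving the global a constant initializer and doing the allocation in a function that runs after startup. A minimal sketch, with hypothetical names (buffer, buffer_init, buffer_release):

```c
#include <stdlib.h>

enum { BUFFER_SIZE = 256 };

/* NULL is a constant expression, so this initializer is allowed at file scope. */
static unsigned char *buffer = NULL;

/* Allocate the buffer on first use. Returns 0 on success, -1 on failure. */
static int buffer_init(void) {
    if (buffer == NULL)
        buffer = calloc(BUFFER_SIZE, 1);
    return buffer != NULL ? 0 : -1;
}

/* Give the memory back and return to the unallocated state. */
static void buffer_release(void) {
    free(buffer);
    buffer = NULL;
}
```

Call buffer_init once near the top of main (or your first function) before touching buffer. Note the pointer is not declared const char *, since the whole point is to write through it.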
PART TWO
static byte[] gentleCSharpArray = new byte[256];
private void easyShotCSharpFunction()
{
gentleCSharpArray[0] = 0x57;
gentleCSharpArray[1] = 0x54;
gentleCSharpArray[2] = 0x46;
}
Okay, let's translate this bit by bit from C# to C. Are you using const in C because constant and static are (almost) synonyms in standard English? Because they're very different in programming.
The keyword const in C and C# means that the variable cannot be assigned to after initialization (in C terms, it is not a modifiable lvalue).
The keyword static in object-oriented languages (like C#) means that a function or variable is unchanging with respect to the object instance of its class. C has no objects and thus no analog.
The keyword static is used in plain C to mean that a variable is unchanging with respect to its invocation, or a function is unchanging with respect to where it can be seen (similar to private in C#, you can read more here).
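A tiny illustration of both C meanings of static, as a sketch with a hypothetical function name:

```c
/* static on a function: next_id is only visible inside this source file. */
static int next_id(void) {
    /* static on a local: counter is initialized once and keeps its value
       between calls instead of being recreated on every invocation. */
    static int counter = 0;
    return ++counter;
}
```

The first call returns 1, the second returns 2, and so on. Neither use has anything to do with const: counter is modified on every call.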
But what are you really wanting to do there? Just reserve a huge chunk of memory for the program, right? C has no byte data type, but char is one byte in size; you can use that instead. The unsigned keyword makes it clear to program inspectors that this will not be used for a string:
// Compiled and ran with gcc -O0 -g -Wall on Ubuntu
#include <stdio.h>
#include <stdlib.h>
int* myArray;
const char* myArray2 = "abc";
unsigned char gentleCArray[256]; // <-- here's the declaration you want
static void easyShotCFunction()
{
gentleCArray[0] = 0x57;
gentleCArray[1] = 0x54;
gentleCArray[2] = 0x46;
}
int main ()
{
myArray = malloc(3*sizeof(int));
myArray[0] = 111;
myArray[1] = 222;
myArray[2] = 333;
easyShotCFunction();
int i;
for (i=0; i<3; i++)
printf("%d, %c, 0x%x\n", myArray[i], myArray2[i], gentleCArray[i]);
return 0;
}
When the program starts, gentleCArray will already be an array of 256 bytes of memory, all zeroes, since C zero-initializes objects with static storage duration. This is a consequence of the BSS segment I mentioned in part 1. Useful for doing your own memory management without malloc.
You either:
const char my_array[] = "abcdef";
or:
char *my_array;
int main(void)
{
my_array = malloc(some_size);
/* initialize elements of my_array */
}
Example 1 makes no sense because you are attempting to initialize a static variable at runtime.
Example 2 makes no sense because you are attempting to modify a const object. Essentially, you did the opposite of what could work in either situation.
What I want to achieve is similar to something in C# like it:
static byte[] gentleCSharpArray = new byte[256];
private void easyShotCSharpFunction()
{
gentleCSharpArray[0] = 0x57;
gentleCSharpArray[1] = 0x54;
gentleCSharpArray[2] = 0x46;
}
OK, then you want:
unsigned char arr[256];
void c_version(void)
{
arr[0] = 0x57;
arr[1] = 0x54;
arr[2] = 0x46;
}

How to return an integer from a function

Which is considered better style?
int set_int (int *source) {
*source = 5;
return 0;
}
int main(){
int x;
set_int (&x);
}
OR
int *set_int (void) {
int *temp = NULL;
temp = malloc(sizeof (int));
*temp = 5;
return temp;
}
int main (void) {
int *x = set_int ();
}
Coming from a higher-level programming background, I gotta say I like the second version more. Any tips would be very helpful. Still learning C.
Neither.
// "best" style for a function which sets an integer taken by pointer
void set_int(int *p) { *p = 5; }
int i;
set_int(&i);
Or:
// then again, minimise indirection
int an_interesting_int() { return 5; /* well, in real life more work */ }
int i = an_interesting_int();
Just because higher-level programming languages do a lot of allocation under the covers, does not mean that your C code will become easier to write/read/debug if you keep adding more unnecessary allocation :-)
If you do actually need an int allocated with malloc, and to use a pointer to that int, then I'd go with the first one (but bugfixed):
void set_int(int *p) { *p = 5; }
int *x = malloc(sizeof(*x));
if (x == 0) { /* do something about the error */ }
set_int(x);
Note that the function set_int is the same either way. It doesn't care where the integer it's setting came from, whether it's on the stack or the heap, who owns it, whether it has existed for a long time or whether it's brand new. So it's flexible. If you then want to also write a function which does two things (allocates something and sets the value) then of course you can, using set_int as a building block, perhaps like this:
int *allocate_and_set_int() {
int *x = malloc(sizeof(*x));
if (x != 0) set_int(x);
return x;
}
In the context of a real app, you can probably think of a better name than allocate_and_set_int...
Some errors:
int main(){
int x*; //should be int* x; or int *x;
set_int(x);
}
Also, you are not allocating any memory in the first code example.
int *x = malloc(sizeof(int));
About the style:
I prefer the first one, because you have less chances of not freeing the memory held by the pointer.
The first one is incorrect (apart from the syntax error) - you're passing an uninitialised pointer to set_int(). The correct call would be:
int main()
{
int x;
set_int(&x);
}
If they're just ints, and it can't fail, then the usual answer would be "neither" - you would usually write that like:
int get_int(void)
{
return 5;
}
int main()
{
int x;
x = get_int();
}
If, however, it's a more complicated aggregate type, then the second version is quite common:
struct somestruct *new_somestruct(int p1, const char *p2)
{
struct somestruct *s = malloc(sizeof *s);
if (s)
{
s->x = 0;
s->j = p1;
s->abc = p2;
}
return s;
}
int main()
{
struct somestruct *foo = new_somestruct(10, "Phil Collins");
free(foo);
return 0;
}
This allows struct somestruct * to be an "opaque pointer", where the complete definition of type struct somestruct isn't known to the calling code. The standard library uses this convention - for example, FILE *.
Definitely go with the first version. Notice that this allowed you to omit a dynamic memory allocation, which is SLOW, and may be a source of bugs, if you forget to later free that memory.
Also, if you decide for some reason to use the second style, notice that you don't need to initialize the pointer to NULL. This value will either way be overwritten by whatever malloc() returns. And if you're out of memory, malloc() will return NULL by itself, without your help :-).
So int *temp = malloc(sizeof(int)); is sufficient.
Memory managing rules usually state that the allocator of a memory block should also deallocate it. This is impossible when you return allocated memory. Therefore, the first should be better.
For a more complex type like a struct, you'll usually end up with a function to initialize it and maybe a function to dispose of it. Allocation and deallocation should be done separately, by you.
C gives you the freedom to allocate memory dynamically or statically, and having a function work only with one of the two modes (which would be the case if you had a function that returned dynamically allocated memory) limits you.
typedef struct
{
int x;
float y;
} foo;
void foo_init(foo* object, int x, float y)
{
object->x = x;
object->y = y;
}
int main()
{
foo myFoo;
foo_init(&myFoo, 1, 3.1416f);
}
In the second one you would need a pointer to a pointer for it to work, and in the first you are not using the return value, though you should.
I tend to prefer the first one, in C, but that depends on what you are actually doing, as I doubt you are doing something this simple.
Keep your code as simple as you need to get it done, the KISS principle is still valid.
It is best not to return a piece of allocated memory from a function: if somebody does not know how it works, they might not deallocate the memory.
The memory deallocation should be the responsibility of the code allocating the memory.
The first is preferred (assuming the simple syntax bugs are fixed) because it is how you simulate an Out Parameter. However, it's only usable where the caller can arrange for all the space to be allocated to write the value into before the call; when the caller lacks that information, you've got to return a pointer to memory (maybe malloced, maybe from a pool, etc.)
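An out parameter earns its keep when the return value is needed for a status code. A minimal sketch, using a hypothetical parse_digit function:

```c
/* Converts a decimal digit character to its value.
   On success stores the value in *out and returns 0.
   On failure returns -1 and leaves *out untouched. */
static int parse_digit(char c, int *out) {
    if (c < '0' || c > '9')
        return -1;
    *out = c - '0';
    return 0;
}
```

The caller provides the storage, for example int v; if (parse_digit('7', &v) == 0) { ... }, so no allocation is involved and nothing needs to be freed.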
What you are asking more generally is how to return values from a function. It's a great question because it's so hard to get right. What you can learn are some rules of thumb that will stop you making horrid code. Then, read good code until you internalize the different patterns.
Here is my advice:
In general any function that returns a new value should do so via its return statement. This applies for structures, obviously, but also arrays, strings, and integers. Since integers are simple types (they fit into one machine word) you can pass them around directly, not with pointers.
Never pass pointers to integers, it's an anti-pattern. Always pass integers by value.
Learn to group functions by type so that you don't have to learn (or explain) every case separately. A good model is a simple OO one: a _new function that creates an opaque struct and returns a pointer to it; a set of functions that take the pointer to that struct and do stuff with it (set properties, do work); a set of functions that return properties of that struct; a destructor that takes a pointer to the struct and frees it. Hey presto, C becomes much nicer like this.
When you do modify arguments (only structs or arrays), stick to conventions, e.g. stdc libraries always copy from right to left; the OO model I explained would always put the structure pointer first.
Avoid modifying more than one argument in one function. Otherwise you get complex interfaces you can't remember and you eventually get wrong.
Return 0 for success, -1 for errors, when the function does something which might go wrong. In some cases you may have to return -1 for errors, 0 or greater for success.
The standard POSIX APIs are a good template but don't use any kind of class pattern.
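The simple OO model from the list above could be sketched like this. The counter type and its functions are hypothetical, and everything sits in one block for brevity; in a real library only a declaration of struct counter would go in the header, which is what keeps the pointer opaque to callers.

```c
#include <stdlib.h>

/* In a real library this definition would live in the .c file only. */
struct counter {
    long value;
};

/* Constructor: returns a new object, or NULL if allocation fails. */
static struct counter *counter_new(long start) {
    struct counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;
}

/* Work function: structure pointer first, by convention. */
static void counter_add(struct counter *c, long delta) {
    c->value += delta;
}

/* Property accessor: returns the value directly, not through a pointer. */
static long counter_value(const struct counter *c) {
    return c->value;
}

/* Destructor: the only place that frees the object. */
static void counter_free(struct counter *c) {
    free(c);
}
```

Allocation and deallocation stay paired inside the same module, which addresses the ownership concern raised in the other answers while still returning a pointer from the constructor.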
