I found an answer on SO that suggests the following way to reinitialize an array in C:
int *foo = (int[]){1,2,3,4,5};
I am not really sure what exactly this syntax does, and I have a few questions:
Will it cause memory leaks if my array was previously created?
double *my_array = (double[]){1.1, 2.2, 3.3, 4.4, 5.5};
...
my_array = (double[]){-1.1, -2.2, -3.3}; // Do i need to call free(my_array) first?
Is it allowed to use such an approach in function calls?
void foo(int *arr)
{
arr = (int[]){-2, -7, 1, 255};
}
int main()
{
int *my_array = (int[]){1, 2, 3};
foo(my_array);
if (my_array[2] != 1)
return -1;
}
Generalizing:
Does such syntax just allocate new memory on the heap with predefined values and return a pointer?
Does it clear automatically everything that was in the previous pointer?
First, the declaration int *foo makes foo a pointer, not an array.
(int[]){1,2,3,4,5} is a compound literal. There is rarely a good reason to set pointers to compound literals in this way.
Compound literals are managed automatically, so you do not need to free them. If a compound literal appears outside any function, it has static storage duration; it exists for the entire execution of the program. Otherwise, it has automatic storage duration associated with the block it is in, and its memory reservation will end when execution of that block ends. You should not set a pointer to point to such a compound literal if the pointer is used after execution of the block ends.
In this code:
void foo(int *arr)
{
arr = (int[]){-2, -7, 1, 255};
}
arr is set to point to the compound literal, but arr is only a function parameter. It effectively ceases to exist when the function returns.
In this code:
int *my_array = (int[]){1, 2, 3};
foo(my_array);
if (my_array[2] != 1)
return -1;
When foo is called, its parameter arr is set to the value of my_array. When arr is changed inside foo, it does not affect my_array. my_array will still be pointing to the start of (int[]){1, 2, 3}. This would be true regardless of whether arr is set to point to a compound literal, allocated memory, or anything else: Changing a parameter inside a function does not change the thing that was passed as an argument.
To get the pointer out of the function, you could either return it or you could pass a pointer to a pointer so that the function had the address of the pointer:
void foo(int **arr)
{
*arr = (int []) { -2, -7, 1, 255 };
}
int main(void)
{
int *my_array = (int []) { 1, 2, 3 };
foo(&my_array);
…
}
However, then foo would be putting the address of its compound literal into a pointer that is used after the function ends. That is a situation where you ought to call malloc to reserve memory and then copy the data into the allocated memory. Later, after the program is done with that data, it would call free to release the memory.
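A minimal sketch of that malloc-and-copy approach. The helper name make_values and its count out-parameter are assumptions made for illustration, not part of the question; the caller owns the returned block and must free it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies the values of a compound literal into
   heap memory so they outlive this function. Returns NULL on failure. */
int *make_values(size_t *count)
{
    /* The compound literal only lives until this block ends... */
    const int *src = (const int []) { -2, -7, 1, 255 };
    size_t n = 4;

    int *dst = malloc(n * sizeof *dst);
    if (dst == NULL)
        return NULL;

    /* ...so its contents are copied before returning. */
    memcpy(dst, src, n * sizeof *dst);
    *count = n;
    return dst;
}
```

A caller would write something like int *my_array = make_values(&n); and later free(my_array); once it is done with the data.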
This is known as compound literals and code such as int *foo = (int[]){1,2,3,4,5}; can be regarded as 100% equivalent to this:
int arr[] = {1,2,3,4,5};
int *foo = arr;
That is, a compound literal has the same scope and storage duration as a named array declared at the same scope.
Will it cause memory leaks if my array was previously created?
No. In case a compound literal was declared at local scope, it will be valid inside the { } where it was declared (so-called automatic storage duration). After that, it gets automatically cleaned up just like any other local variable. Since it isn't using allocated storage, there are no leaks.
Is it allowed to use such an approach in function calls?
No. Just like any local variable, you cannot return a pointer to it from inside a function.
Additionally, your example has a bug, it just sets the local copy of the pointer parameter arr. The pointer in the caller was passed by value and is unaffected by the arr = (int[]){-2, -7, 1, 255}; line.
Does such syntax just allocate new memory on the heap with predefined values and return a pointer?
The C standard doesn't specify where variables are allocated. But when looking at all well-known compiler implementations out there, then the following will likely hold:
Compilers/linkers do not allocate anything on the heap unless explicitly told to through malloc.
Local compound literals are likely allocated on the stack and/or in registers.
File scope compound literals are likely allocated in the .data segment.
Does it clear automatically everything that was in the previous pointer?
Data isn't stored "inside pointers", but yes the pointed-at memory will get "cleared" (become invalid to use & available for other parts of the program) when it goes out of scope. No matter if there are pointers pointing at it or not.
Related
I have a very basic question on memory allocation in C.
If I write:
int* test;
test = malloc(5 * sizeof(int));
test[0] = 1;
test[1] = 2;
test[2] = 3;
test[3] = 4;
test[4] = 5;
test = realloc(test, 6 * sizeof(int));
I am able to use realloc. If I define test as:
int test[5] = {1,2,3,4,5};
I am not able to call realloc on it.
What is the lower level difference between these statements?
Can I somehow realloc on test[5]?
How do I free test[5]?
I am not sure where to look for an answer, if you could link a resource, I would be thankful.
You cannot use realloc() on int test[5] because test is not a pointer returned by a memory management function (such as malloc()), nor is it NULL.
Quote from N1570 7.22.3.5 The realloc function:
If ptr is a null pointer, the realloc function behaves like the malloc function for the
specified size. Otherwise, if ptr does not match a pointer earlier returned by a memory
management function, or if the space has been deallocated by a call to the free or
realloc function, the behavior is undefined. If memory for the new object cannot be
allocated, the old object is not deallocated and its value is unchanged.
You cannot reallocate int test[5]. (At least there is no standard way to do it, though I cannot rule out a compiler extension that supports it.)
To free int test[5], exit from the block in which that is declared if it is a local variable. Such variable has an automatic storage duration and it is freed on exiting from the block. If it is a global (or static local) variable, exit the process and the OS will free the memory used by the process.
You cannot change the size because the type of test in int test[5] = {1,2,3,4,5}; is int[5]. Number 5 is a part of its type and objects in C cannot change type after creation.
In int* test;, the test is a pointer and it can point to a memory region of an arbitrary number of consecutive ints.
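If the contents of a fixed-size array need to grow, one option is to copy them into allocated memory first and resize that instead. The sketch below assumes a hypothetical helper copy_and_grow, with old_count greater than zero and new_count at least old_count; it is an illustration, not the only way to do this.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies old_count ints from src into heap memory,
   then resizes that block to hold new_count ints. Returns NULL on failure. */
int *copy_and_grow(const int *src, size_t old_count, size_t new_count)
{
    int *p = malloc(old_count * sizeof *p);
    if (p == NULL)
        return NULL;
    memcpy(p, src, old_count * sizeof *p);

    /* Keep the old pointer until realloc succeeds: on failure the
       original block is still valid and still has to be freed. */
    int *bigger = realloc(p, new_count * sizeof *bigger);
    if (bigger == NULL) {
        free(p);
        return NULL;
    }
    return bigger;
}
```

The elements beyond old_count are uninitialized after the call, and the returned pointer must eventually be passed to free.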
int *tomato(int *a, int *b) {
int *foo = (int*)malloc(sizeof(int));
*foo = *a + *b;
return foo;
}
In this function I have foo that is allocated in heap and returns a pointer to int, but are the pointers *a and *b in the function arguments also allocated in heap? I am a bit confused here, generally, arguments are allocated in stack.
The pointers are local variables in the tomato function, just like foo.
The values that they point to can be allocated anywhere. For instance, you can call it like this:
int foo = 1;
int *bar = malloc(sizeof(int));
*bar = 3;
int *result = tomato(&foo, bar);
a will point to the foo variable, while b will point to the memory allocated by malloc.
Parameters in C are not necessarily allocated in the stack (*), but their scope will necessarily be restricted to the tomato function block, and they will necessarily be passed by value.
When you dereference a and b in the assignment *foo = *a + *b, you are reading the int values stored at the addresses held in a and b, summing them, and writing the result to the address stored in foo (which, in your example, happens to be on the heap).
After the assignment, you could change a and b at will by assigning different memory addresses to them (i.e. pointers), and there would be no consequence to any external memory references as their scope is limited to the function block (e.g. a = foo). If, however, you changed the memory contents referred by them (e.g. *a = 0), this would become visible outside the scope of the function as you would be writing on a memory space (stack or heap) allocated somewhere else.
(*) Parameters may not be passed in memory (i.e. stack) to functions. Depending on the compiler/architecture, they may be directly assigned to a processor register. Either way, this is a transparent compiler optimization and you don't have to worry about it... the parameters will behave just the same.
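The difference between changing a parameter and changing what it points to can be sketched as follows. The function sum_then_modify is a hypothetical variant of tomato written only for this illustration.

```c
/* Hypothetical variant of tomato that shows both kinds of change. */
int sum_then_modify(int *a, int *b)
{
    int sum = *a + *b;  /* reads the ints that a and b point to */

    *a = 0;             /* writes through the pointer: visible to the caller */
    a = b;              /* reassigns the local copy only: the caller's pointer is unchanged */
    (void)a;            /* silences "assigned but unused" warnings */

    return sum;
}
```

After a call such as sum_then_modify(&x, &y), x is 0 in the caller, while any pointer the caller passed still holds the same address as before.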
Is it safe to initialize pointers using compound literals in this way, and is it possible at all?
#include <stdio.h>
#include <string.h>
void numbers(int **p)
{
*p = (int []){1, 2, 3};
}
void chars(char **p)
{
*p = (char[]){'a','b','c'};
}
int main()
{
int *n;
char *ch;
numbers(&n);
chars(&ch);
printf("%d %c %c\n", n[0], ch[0], ch[1]);
}
output:
1 a b
I don't understand exactly how it works. Isn't it the same as initializing a pointer with a local variable?
Also, if I try to print:
printf("%s\n", ch);
it prints nothing.
A compound literal declared inside a function has automatic storage duration associated with its enclosing block (C 2018 6.5.2.5 5), which means its lifetime ends when execution of the block ends.
Inside numbers, *p = (int []){1, 2, 3}; assigns the address of the compound literal to *p. When numbers returns, the compound literal ceases to exist, and the pointer is invalid. After this, the behavior of a program that uses the pointer is undefined. The program might be able to print values because the data is still in memory, or the program might print different values because memory has changed, or the program might trap because it tried to access inaccessible memory, or the entire behavior of the program may change in drastic ways because compiler optimization changed the undefined behavior into something else completely.
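One way to avoid the dangling pointers is to let the caller own the storage and have the functions copy into it. The sketch below uses hypothetical names fill_numbers and fill_chars. It also adds a terminating '\0' to the character data: the original (char[]){'a','b','c'} has none, so printing it with %s would be invalid even if the array were still alive.

```c
#include <string.h>

/* Hypothetical variants of numbers/chars: the caller provides the storage. */
void fill_numbers(int out[3])
{
    const int *src = (const int []){1, 2, 3};
    memcpy(out, src, 3 * sizeof *out);   /* copy before the literal's lifetime ends */
}

void fill_chars(char out[4])
{
    /* The explicit '\0' makes the result a valid string for %s. */
    const char *src = (const char []){'a', 'b', 'c', '\0'};
    memcpy(out, src, 4);
}
```

In main, the caller would declare int n[3]; char ch[4]; and call fill_numbers(n); fill_chars(ch); before printing.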
It depends on where the compound literal is placed.
C17 6.5.2.5 §5
The value of the compound literal is that of an unnamed object initialized by the
initializer list. If the compound literal occurs outside the body of a function, the object
has static storage duration; otherwise, it has automatic storage duration associated with
the enclosing block.
That is, if the compound literal is at local scope, it works exactly like a local variable/array and it is not safe to return a pointer to it from a function.
If it is however declared at file scope, it works like any other variable with static storage duration, and you can safely return a pointer to it. However, doing so is probably an indication of questionable design. Plus you'll get the usual thread-safety issues in a multi-threaded application.
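A minimal sketch of the file-scope case, using hypothetical names shared_numbers and get_numbers. Every caller gets a pointer to the same object, which is where the thread-safety concern comes from.

```c
/* File scope: this compound literal has static storage duration. */
static int *const shared_numbers = (int []){1, 2, 3};

/* Hypothetical accessor: returning this pointer is safe because the
   object it points to exists for the whole run of the program. */
int *get_numbers(void)
{
    return shared_numbers;
}
```

A write through the returned pointer, such as get_numbers()[1] = 20;, is seen by every later caller.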
Given pointers to char, one can do the following:
char *s = "data";
As far as I understand, a pointer variable is declared here, memory is allocated for both variable and data, the latter is filled with data\0 and the variable in question is set to point to the first byte of it (i. e. variable contains an address that can be dereferenced). That's short and compact.
Given pointers to int, for example, one can do this:
int *i;
*i = 42;
or that:
int i = 42;
foo(&i); // prefix every time to get a pointer
bar(&i);
baz(&i);
or that:
int i = 42;
int *p = &i;
That's somewhat tautological. It's small and tolerable with one usage of a single variable. It's not with multiple uses of several variables, though, producing code clutter.
Are there any ways to write the same thing dry and concisely? What are they?
Are there any broader-scope approaches to programming that avoid the issue entirely? Maybe I should not use pointers at all (joke) or something?
String literals are a corner case: they trigger the creation of the literal in static memory, accessed as a char array. Note that the following doesn't compile, despite 42 being an int literal, because no object is implicitly allocated for it:
int *p = &42;
In all other cases, you are responsible for allocating the pointed-to object, be it in automatic or dynamic memory.
int i = 42;
int *p = &i;
Here i is an automatic variable, and p points to it.
int * i;
*i = 42;
You just invoked Undefined Behaviour. i has not been initialized, and is therefore pointing somewhere at random in memory. Then you assigned 42 to this random location, with unpredictable consequences. Bad.
int *i = malloc(sizeof *i);
Here i is initialized to point to a dynamically-allocated block of memory. Don't forget to free(i) once you're done with it.
int i = 42, *p = &i;
And here is how you create an automatic variable and a pointer to it as a one-liner. i is the variable, p points to it.
Edit : seems like you really want that variable to be implicitly and anonymously allocated. Well, here's how you can do it :
int *p = &(int){42};
This thingy is a compound literal. They are anonymous objects with automatic storage duration (or static at file scope), and only exist in C99 and later (but not in standard C++!). As opposed to string literals, compound literals are mutable, i.e. you can modify *p.
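A short sketch of using such a literal, including passing one directly as an argument. The function names add_one and compound_literal_demo are made up for this example.

```c
/* Hypothetical callee that modifies the int it is given. */
void add_one(int *value)
{
    *value += 1;
}

int compound_literal_demo(void)
{
    int *p = &(int){42};  /* anonymous int, alive until this block ends */
    add_one(p);           /* allowed: the object is modifiable */
    add_one(&(int){0});   /* a literal can also be passed directly */
    return *p;            /* 43 */
}
```

The pointer p must not be used after compound_literal_demo returns, since the anonymous int is gone by then.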
Edit 2 : Adding this solution inspired from another answer (which unfortunately provided a wrong explanation) for completeness :
int i[] = {42};
This will allocate a one-element mutable array with automatic storage duration. The name of the array, while not a pointer itself, will decay to a pointer as needed.
Note however that sizeof i will return the "wrong" result, that is the actual size of the array (1 * sizeof(int)) instead of the size of a pointer (sizeof(int*)). That should however rarely be an issue.
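The sizeof difference can be checked with a small sketch; the two function names are hypothetical.

```c
#include <stddef.h>

size_t bytes_in_array(void)
{
    int i[] = {42};
    return sizeof i;   /* size of the whole array: 1 * sizeof(int) */
}

size_t bytes_in_pointer(void)
{
    int i[] = {42};
    int *p = i;        /* the array decays to a pointer here */
    return sizeof p;   /* size of the pointer itself */
}
```

On a platform where int is 4 bytes and pointers are 8 bytes, the two functions return different values even though both start from the same array.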
int i=42;
int *ptr = &i;
this is equivalent to writing
int i=42;
int *ptr;
ptr=&i;
Though this may look confusing, it is quite useful in function calls:
#include <stdio.h>

void function2(int *ptr)
{
printf("%d", *ptr); // outputs 42
}

void function1(void)
{
int i = 42;
function2(&i); // ptr is initialized with the address of i
}
Here the parameter ptr is declared and initialized in one step by the call itself. We don't need to declare a pointer globally and then initialize it before the call; the notation does both at the same time.
int *ptr; //declares the pointer but does not initialize it
//so, ptr points to some random memory location
*ptr=42; //you gave a value to this random memory location
Although this will compile, it invokes undefined behaviour because you never initialized the pointer.
Also,
char *ptr;
char str[6]="hello";
ptr=str;
EDIT: as pointed out in the comments, these two cases are not equivalent.
But the pointer points to "hello" in both cases. This example is only meant to show that a pointer can be initialized in either way (to point to hello); the two differ in many other respects.
char *ptr;
ptr="hello";
This works because the name of the array, str, is converted to a pointer to its 0th element, i.e. 'h', when it is assigned to ptr.
The same goes for any array arr[]: in most expressions, arr is converted to the address of its 0th element.
You can also think of it as a one-element array, int i[1]={42}, where i converts to a pointer to int when needed.
int * i;
*i = 42;
will invoke undefined behavior. You are modifying an unknown memory location. You need to initialize pointer i first.
int i = 42;
int *p = &i;
is the correct way. Now p is pointing to i and you can modify the variable pointed to by p.
Are there any ways to write the same thing dry and concisely?
No. As there is no pass by reference in C you have to use pointers when you want to modify the passed variable in a function.
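As a small illustration of modifying a caller's variables through pointers, here is the classic swap. The name swap_ints is chosen only for this example.

```c
/* Swaps the two ints the caller points at. */
void swap_ints(int *x, int *y)
{
    int tmp = *x;
    *x = *y;
    *y = tmp;
}
```

It would be called as swap_ints(&a, &b); there is no way to write it so that the call swap_ints(a, b) with plain int arguments changes a and b.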
Are there any broader-scope approaches to programming that avoid the issue entirely? Maybe I should not use pointers at all (joke) or something?
If you are learning C then you can't avoid pointers, and you should learn to use them properly.
I've always programmed in Java, which is probably why I'm so confused about this:
In Java I declare a pointer:
int[] array
and initialize it or assign it some memory:
int[] array = {0,1,0}
int[] array = new int[3]
Now, in C, it's all so confusing. At first I thought it was as easy as declaring it:
int array[]
and initializing it or assigning it some memory:
int array[] = {0,1,0}
int array[] = malloc(3*sizeof(int))
int array[] = calloc(3,sizeof(int))
Unless I'm wrong, all of the above is equivalent Java-C, right?
Then, today I met a code in which I found the following:
pthread_t tid[MAX_OPS];
and some lines below, without any kind of initialization...
pthread_create(&tid[0],NULL,mou_usuari,(void *) 0);
Surprisingly (at least to me), the code works! At least in Java, that would return a nice "NullPointerException"!
So, in order:
Am I correct with all of the Java-C "translations"?
Why does that code work?
Is there any difference between using malloc(n*sizeof(int)) and calloc(n,sizeof(int))?
Thanks in advance
You can't assign memory to an array. An array has a fixed size, for the whole of its lifespan. An array can never be null. An array is not a pointer.
malloc returns the address of a memory block that is reserved for the program. You can't "assign" that (being the memory block) to an array, but you can store the address of this memory block in a pointer: luckily, array subscripting is defined in terms of pointers, so you can "use pointers like arrays", e.g.
int *ptr = malloc(5 * sizeof *ptr);
ptr[2] = 5; // access the third element "of ptr"
free(ptr); // always free at the end
When you declare an array without a size (i.e. array[]), it simply means the size of the array is determined from the initializer list. That is
int array[] = {1, 2, 3, 4, 5}; // is equal to
int array[5] = {1, 2, 3, 4, 5};
Trying to declare an array without a size and without an initializer is an error.
The code pthread_t tid[MAX_OPS]; declares an array named tid of type pthread_t and of size MAX_OPS.
If the array has automatic storage (i.e. the declaration is inside a function and is not static), then each of the array's elements has an indeterminate value (and trying to read such a value would cause undefined behavior). Luckily, all that the function call does is take the address of the first element of the array as its first argument, and it probably initializes that element inside the function.
The difference of calloc and malloc is that the memory block that calloc returns is initialized to zero. That is;
int *ptr = calloc(5, sizeof *ptr);
// is somewhat equal to
int *ptr = malloc(5 * sizeof *ptr);
memset(ptr, 0, 5 * sizeof *ptr);
The difference between
int *ptr = malloc(5 * sizeof *ptr);
// and
int array[5];
is that array has automatic storage (typically on the stack) and is "released" when it goes out of scope. The block that ptr points to, however, is dynamically allocated (typically on the heap) and must be freed by the programmer.
You are missing three very basic, tightly related (and misleading!) C topics:
the difference between arrays and pointers
the difference between static and dynamic allocation
the difference between declaring variables on the stack and on the heap
If you write int array[] = malloc(3*sizeof(int)); you would get a compilation error (something like 'identifier' : array initialization needs curly braces).
This means that declaring an array allows only static initialization:
int array[] = {1,2,3}; that reserves 3 contiguous integers on the stack;
int array[3] = {1,2,3}; which is the same as the previous one;
int array[3]; that still reserves 3 contiguous integers on the stack, but does not initialize them (the content will be random garbage)
int array[4] = {1,2,3}; when the initializer list doesn't initialize all the elements, the rest are set to 0 (C99 §6.7.8/19): in this case you'll get 1,2,3,0
Note that in all these cases you are not allocating new memory, you are just using memory already committed to the stack. You would run into a problem only if the stack is full (you guessed it, that would be a stack overflow). For this reason declaring int array[]; would be wrong and meaningless.
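Two of the initializer forms listed above can be checked with a short sketch; the function names are hypothetical.

```c
#include <stddef.h>

/* Number of elements the compiler infers from the initializer list. */
size_t inferred_length(void)
{
    int array[] = {1, 2, 3};
    return sizeof array / sizeof array[0];   /* 3 */
}

/* Elements without an explicit initializer are set to 0. */
int last_of_partial(void)
{
    int array[4] = {1, 2, 3};
    return array[3];                          /* 0 */
}
```

The uninitialized form, int array[3];, is deliberately left out: reading its elements before writing to them would not give a reliable result.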
To use malloc you have to declare a pointer: int* array.
When you write int* array = malloc(3*sizeof(int)); you are actually doing three operations:
int* array tells the compiler to reserve a pointer on the stack (a variable that holds a memory address)
malloc(3*sizeof(int)) allocates on the heap 3 contiguous integers and returns the address of the first one
= copies that return value (the address of the first integer you have allocated) to your pointer variable
So, to come back to your question:
pthread_t tid[MAX_OPS];
is an array on the stack, so it doesn't need to be allocated (if MAX_OPS is, say, 16 then on the stack will be reserved the number of contiguous bytes needed to fit 16 pthread_t). The content of this memory will be garbage (stack variables are not initialized to zero), but pthread_create returns a value in its first parameter (a pointer to a pthread_t variable) and disregards any previous content, so the code is just fine.
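The same out-parameter pattern can be sketched without threads. Everything here is hypothetical: the handle type stands in for pthread_t and handle_create for pthread_create, and only the way the array element gets initialized is meant to carry over.

```c
/* Hypothetical opaque handle, standing in for pthread_t. */
typedef struct { int id; } handle;

/* Hypothetical creator in the style of pthread_create: it only writes
   to *out and never reads the old contents. Returns 0 on success. */
int handle_create(handle *out, int id)
{
    out->id = id;
    return 0;
}

int first_handle_id(void)
{
    handle handles[16];              /* elements start out indeterminate */
    handle_create(&handles[0], 7);   /* element 0 is now initialized */
    return handles[0].id;            /* safe: only element 0 is read */
}
```

Reading handles[1] at this point would still be a bug, because nothing has written to it yet.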
C offers static memory allocation as well as dynamic: you can allocate arrays on the stack or in static memory (laid out by the compiler). This is much the same as how in Java you can have an int on the stack or an Integer on the heap. Arrays in C are just like any other stack variable: they go out of scope, etc. In C99 they can also have a variable size, although they cannot be resized.
The main difference between {} and malloc/calloc is that {} arrays are statically allocated (don't need freeing) and automatically initialized for you, whereas malloc/calloc arrays must be freed explicitly and you have to initialize them explicitly. But of course, malloc/calloc arrays don't go out of scope and you can (sometimes) realloc() them.
2 - This array declaration is static :
pthread_t tid[MAX_OPS];
We don't need to allocate a memory block for it, unlike with dynamic allocation:
pthread_t *tid = (pthread_t *)malloc( MAX_OPS * sizeof(pthread_t) );
Don't forget to free the memory :
free(tid);
3 - The difference between malloc and calloc is that calloc allocates a block of memory for an array and initializes all its bits to 0.
I find it helpful when you are programming in C (as opposed to C++) to indicate *array explicitly, to remember that there is a pointer that can be moved around. So I would like to start by rephrasing your example as:
int array[] = {0,1,2};
int *array = malloc(3*sizeof(int));
int *array = calloc(3,sizeof(int));
The first makes it clear that there is something called array which names a block of memory that contains a 0, 1 and 2. array can't be moved elsewhere.
Your next code:
pthread_t tid[MAX_OPS];
Does in fact cause an array with sizeof(pthread_t) * MAX_OPS to be allocated. But it does not allocate a pointer called *tid. There is an address of the base of the array, but you can't move it elsewhere.
The pthread_t type is opaque; on some implementations it is a cover for a pointer, in which case tid above is actually an array of pointers. Either way, the elements are all statically allocated but they are not initialized.
The pthread_create takes the location at the beginning of the array (&tid[0]), which is a pointer, and allocates a block of memory to hold the pthread data structure. The pointer is set to point to the new data structure and the data structure is allocated.
Your last question: the difference between malloc(n*sizeof(int)) and calloc(n,sizeof(int)) is that the latter initializes each byte to 0, while the former does not.