Initializing a pointer to compound literals in C

Here is one not-so-common way of initializing the pointer:
int *p = (int[10]){[1]=1};
Here, the pointer points to a compound literal.
#include <stdio.h>
int main(void)
{
int *p = (int[10]){[1]=1};
printf("%d\n", p[1]);
}
Output:
1
This program compiles and runs fine with the G++ compiler.
So,
Is this a correct way to initialize a pointer with a compound literal? Or
is it undefined behaviour to initialize a pointer with a compound literal?

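Yes, it is valid to have a pointer to a compound literal. The standard allows this. In the question's code the literal appears inside main, so the unnamed array has automatic storage duration and lives until the enclosing block ends (§6.5.2.5 p5), which is fine for this use. The standard's own example shows the file-scope case: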
n1570-§6.5.2.5 (p8):
EXAMPLE 1 The file scope definition
int *p = (int []){2, 4};
initializes p to point to the first element of an array of two ints, the first having the value two and the second, four. The expressions in this compound literal are required to be constant. The unnamed object has static storage duration.

Related

Why can't a defined, fix-sized array be assigned using a compound literal?

Why is it so that a struct can be assigned after defining it using a compound literal (case b) in sample code), while an array cannot (case c))?
I understand that case a) does not work as at that point compiler has no clue of the memory layout on the rhs of the assignment. It could be a cast from any type. But going with this line, in my mind case c) is a perfectly well-defined situation.
typedef struct MyStruct {
int a, b, c;
} MyStruct_t;
void function(void) {
MyStruct_t st;
int arr[3];
// a) Invalid
st = {.a=1, .b=2, .c=3};
// b) Valid since C99 (compound literals were introduced in C99)
st = (MyStruct_t){.a=1, .b=2, .c=3};
// c) Invalid
arr = (int[3]){[0]=1, [1]=2, [2]=3};
}
Edit:
I am aware that I cannot assign to an array - it's how C's been designed. I could use memcpy or just assign values individually.
After reading the comments and answers below, I guess now my question breaks down to the forever-debated conundrum of why you can't assign to arrays.
What's even more puzzling as suggested by this post and M.M's comment below is that the following assignments are perfectly valid (sure, it breaks strict aliasing rules). You can just wrap an array in a struct and do some nasty casting to mimic an assignable array.
typedef struct Arr3 {
int a[3];
} Arr3_t;
void function(void) {
Arr3_t a;
int arr[3];
a = (Arr3_t){{1, 2, 3}};
*(Arr3_t*)arr = a;
*(Arr3_t*)arr = (Arr3_t){{4, 5, 6}};
}
So then what's stopping the language designers from adding a feature like this to, say, C22(?)
C does not have assignment of arrays, at all. That is, where array has any array type, array = /* something here */ is invalid regardless of the contents of "something here". Whether it's a compound literal (which you seem to have confused with designated initializer, a completely different concept) is irrelevant. array1 = array2 would be just as invalid.
As to why it's invalid, at some level that's a question of the motivations/rationale of the C language and its design and unanswerable. However, mechanically, arrays in any context except the operand of sizeof or the operand of & (or a string literal used to initialize an array) "decay" to pointers to their first element. So in the case of:
arr = (int[3]){[0]=1, [1]=2, [2]=3};
you are attempting to assign a pointer to the first element of the compound literal array to a non-lvalue (the rvalue produced when arr decays). And of course that is nonsense.
A compound array literal can be used anywhere that an actual array variable can be used. Since you can't assign one array to another array, it's also not valid to assign a compound literal to an array.
Since you can copy arrays using memcpy(), you could write:
memcpy(arr, (int[3]){[0]=1, [1]=2, [2]=3}, sizeof(arr));
Just like the array variable, the array literal decays to a pointer to its first element.
Compound struct literals can also be used in place of an actual struct variable. But structs can be assigned to each other, so it's valid to assign a compound struct literal to a struct variable.
That's the difference between the two cases.

compound literals and pointers

Is it safe to initialize pointers using compound literals in this way, and is it possible at all?
#include <stdio.h>
#include <string.h>
void numbers(int **p)
{
*p = (int []){1, 2, 3};
}
void chars(char **p)
{
*p = (char[]){'a','b','c'};
}
int main()
{
int *n;
char *ch;
numbers(&n);
chars(&ch);
printf("%d %c %c\n", n[0], ch[0], ch[1]);
}
output:
1 a b
I don't understand exactly how it works. Isn't it the same as initializing a pointer with the address of a local variable?
Also, if I try to print:
printf("%s\n", ch);
It prints nothing.
A compound literal declared inside a function has automatic storage duration associated with its enclosing block (C 2018 6.5.2.5 5), which means its lifetime ends when execution of the block ends.
Inside numbers, *p = (int []){1, 2, 3}; assigns the address of the compound literal to *p. When numbers returns, the compound literal ceases to exist, and the pointer is invalid. After this, the behavior of a program that uses the pointer is undefined. The program might be able to print values because the data is still in memory, or the program might print different values because memory has changed, or the program might trap because it tried to access inaccessible memory, or the entire behavior of the program may change in drastic ways because compiler optimization changed the undefined behavior into something else completely.
It depends on where the compound literal is placed.
C17 6.5.2.5 §5
The value of the compound literal is that of an unnamed object initialized by the
initializer list. If the compound literal occurs outside the body of a function, the object
has static storage duration; otherwise, it has automatic storage duration associated with
the enclosing block.
That is, if the compound literal is at local scope, it works exactly like a local variable/array and it is not safe to return a pointer to it from a function.
If it is however declared at file scope, it works like any other variable with static storage duration, and you can safely return a pointer to it. However, doing so is probably an indication of questionable design. Plus you'll get the usual thread-safety issues in a multi-threaded application.

C compare two pointers greater than if one is null

If I compare two pointers in C I am aware of C 6.5.8/5 which says:
pointers to structure members declared later compare greater than pointers to members declared earlier in the structure
That is fine but what if one of the pointers is NULL? I know I can do foo != NULL but for example is this against the standard:
char *bar = NULL;
char *foo = "foo";
if(foo > bar) { function(); }
The section doesn't specifically address NULL in the case of greater than which is why I'm confused. Also if you could tell me if it applies to C89 as well as C99.
To clarify, this has nothing to do with structures; that is just the part of the standard I was quoting. The code is very similar to what I describe above. I have some pointers into an array, and one of those pointers may be null, so I'd like to know if it's OK to compare them using greater than.
Your example is indeed undefined. As explained in C11 6.5.8 p5, every rule requires that the pointers point into the same object or one past that object.
So two pointers may be compared using the relational operators <, >, <=, >= only if they point into the same object or one past that object. For everything else:
6.5.8 Relational operators
In all other cases, the behavior is undefined.
A pointer with the value NULL, a null pointer, doesn't point to an object. This is explained in:
6.3.2.3 Pointers
If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
The C99 standard says:
If the objects pointed to are members of the same aggregate object,
pointers to structure members declared later compare greater than pointers to members
declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript
values.
The key here is in same object:
struct {
int a; // structure members
int b;
} some_struct;
// so pointers to structure members:
&(some_struct.b) > &(some_struct.a)
The same applies for arrays:
char arr[128];
&(arr[100]) > &(arr[1])
If a pointer is NULL then it is most probably not pointing to a member of the same data structure, unless you're programming the BIOS (and even then it's against the standard, since the null pointer is guaranteed not to point inside any object), so the comparison is undefined as well as useless.
pointers to structure members declared later compare greater than pointers to members declared earlier in the structure
The quote only describes member variables inside a struct. For example:
struct A {
int a;
int b;
};
Then in any instance of struct A, the address of a is smaller than the address of b.
In other words, if someone writes statements like the following:
struct A instanceA;
int* aa = &(instanceA.a);
int* ab = &(instanceA.b);
then it is guaranteed that ab > aa as member variable b is defined later than a.
That section of the standard is talking about:
#include <stdio.h>
struct things {
int blah;
int laterz;
};
struct things t;
int main ( void ) {
int * a = &t.blah;
int * b = &t.laterz;
printf("%p < %p\n", (void *)a, (void *)b);
return 0;
}
A pointer to laterz is greater than one to blah.
The specification also says that the relational operators may only be used on two related pointers; unrelated pointers can only be compared for equality or inequality.
So if you have two pointers that both point to different places in the same buffer, you can compare them using the greater-than or less-than operators, but not otherwise.
Example
char buffer1[] = "foobar";
char buffer2[] = "some other text";
char *ptr1 = buffer1 + 3;
char *ptr2 = buffer2;
With the above, you can compare buffer1 and ptr1 using < and >. You can't do that with ptr1 and ptr2; with those you can only use the == or != operators.

Declare and initialize pointer concisely (i. e. pointer to int)

Given pointers to char, one can do the following:
char *s = "data";
As far as I understand, a pointer variable is declared here, memory is allocated for both variable and data, the latter is filled with data\0 and the variable in question is set to point to the first byte of it (i. e. variable contains an address that can be dereferenced). That's short and compact.
Given pointers to int, for example, one can do this:
int *i;
*i = 42;
or that:
int i = 42;
foo(&i); // prefix every time to get a pointer
bar(&i);
baz(&i);
or that:
int i = 42;
int *p = &i;
That's somewhat tautological. It's small and tolerable with one usage of a single variable. It's not with multiple uses of several variables, though, producing code clutter.
Are there any ways to write the same thing dry and concisely? What are they?
Are there any broader-scope approaches to programming, that allow to avoid the issue entirely? May be I should not use pointers at all (joke) or something?
String literals are a corner case: they trigger the creation of the literal in static memory, and its access as a char array. Note that the following doesn't compile, despite 42 being an integer constant, because no object is implicitly allocated for it, so there is nothing to take the address of:
int *p = &42;
In all other cases, you are responsible for allocating the pointed-to object, be it in automatic or dynamic memory.
int i = 42;
int *p = &i;
Here i is an automatic variable, and p points to it.
int * i;
*i = 42;
You just invoked undefined behaviour. i has not been initialized, so it holds an indeterminate value and may point anywhere in memory. Then you assigned 42 through it, with unpredictable consequences. Bad.
int *i = malloc(sizeof *i);
Here i is initialized to point to a dynamically-allocated block of memory. Don't forget to free(i) once you're done with it.
int i = 42, *p = &i;
And here is how you create an automatic variable and a pointer to it as a one-liner. i is the variable, p points to it.
Edit: it seems like you really want that variable to be implicitly and anonymously allocated. Well, here's how you can do it:
int *p = &(int){42};
This thingy is a compound literal. Compound literals are anonymous objects with automatic storage duration (or static at file scope), and they exist only in C99 and later (not in standard C++, although some compilers accept them as an extension). As opposed to string literals, compound literals are mutable, i.e. you can modify *p.
Edit 2 : Adding this solution inspired from another answer (which unfortunately provided a wrong explanation) for completeness :
int i[] = {42};
This will allocate a one-element mutable array with automatic storage duration. The name of the array, while not a pointer itself, will decay to a pointer as needed.
Note however that sizeof i will return the "wrong" result, that is the actual size of the array (1 * sizeof(int)) instead of the size of a pointer (sizeof(int*)). That should however rarely be an issue.
int i=42;
int *ptr = &i;
this is equivalent to writing
int i=42;
int *ptr;
ptr=&i;
Though this is definitely confusing, it is quite useful during function calls:
#include <stdio.h>
void function2(int *ptr)
{
printf("%d",*ptr); //outputs 42
}
void function1(void)
{
int i=42;
function2(&i); //the parameter ptr is initialized with &i
}
Here we can use this confusing notation to declare and initialize the pointer as part of the function call. We don't need to declare the pointer globally and then initialize it during the call. We have a notation to do both at the same time.
int *ptr; //declares the pointer but does not initialize it
//so, ptr points to some random memory location
*ptr=42; //you gave a value to this random memory location
Though this will compile, it invokes undefined behaviour because you never initialized the pointer.
Also,
char *ptr;
char str[6]="hello";
ptr=str;
EDIT: as pointed out in the comments, these two cases are not equivalent.
But the pointer points to "hello" in both cases. This example is written just to show that we can initialize pointers in both of these ways (to point to hello), although the two differ in many respects.
char *ptr;
ptr="hello";
The name of the array, str, is not itself a pointer, but in most expressions it decays to a pointer to the 0th element of the string, i.e. 'h'.
The same goes for any array arr[]: arr decays to the address of its 0th element.
You can also think of a one-element array, int i[1]={42}, where i decays to a pointer to int.
int * i;
*i = 42;
will invoke undefined behavior. You are modifying an unknown memory location. You need to initialize pointer i first.
int i = 42;
int *p = &i;
is the correct way. Now p is pointing to i and you can modify the variable pointed to by p.
Are there any ways to write the same thing dry and concisely?
No. As there is no pass by reference in C you have to use pointers when you want to modify the passed variable in a function.
Are there any broader-scope approaches to programming, that allow to avoid the issue entirely? May be I should not use pointers at all (joke) or something?
If you are learning C then you can't avoid pointers, and you should learn to use them properly.

Initialization different from assignment?

1.char str[] = "hello"; //legal
2.char str1[];
str1 = "hello"; // illegal
I understand that "hello" returns the address of the string literal from the string literal pool which cannot be directly assigned to an array variable. And in the first case the characters from the "hello" literal are copied one by one into the array with a '\0' added at the end.
Is this because the assignment operator "=" is overloaded here to support this?
I would also like to know other interesting cases wherein initialization is different from assignment.
You cannot think of it as overloading (which doesn't exist in C anyway), because the initialization of char arrays with string literals is a special case. The type of a string literal is char[N] in C (const char[N] in C++), so if it were similar to overloading, you'd be able to initialize a char array with any expression that has such an array type. But you cannot!
const char arr[3];
const char arr1[] = arr; //compiler error. Cannot initialize array with another array.
The language standard simply says that character arrays can be initialized with string literals. Since they say nothing about assignment, the general rules apply, in particular, that an array cannot be assigned to.
As for other cases when initialization is different from assignment: in C++, where there are references and classes, there would be zillions of examples. In C, with no full-fledged classes or references, the only other thing I can think of off the top of my head is const variables:
const int a = 4; //OK
const int b; //Compiles in C (an error in C++), but b can never be assigned afterwards
b = 4; //Error: cannot assign to a const object
Another example: array initialization with braces
int a[3] = {1,2,3}; //OK
int b[3];
b = {1,2,3}; //error
Same with structs
If you want to think of it as the operator being overloaded (even though C doesn't use the term), you can of course do that.
Do you also consider this to be overloading:
unsigned char x;
double y;
x = 2;
y = 1.243;
Those are assigning totally different types of data, after all, but using the "same operator", right?
It's just different, to be initializing or to be assigning.
Another big difference is that you used to be able to initialize structures, but there was no corresponding "struct literal" syntax for later assignments. This is no longer true as of C99, where we now have compound literals.
char str[] = "hello";
is array initialization, using syntactic sugar defined in C because string initialization is so common. The compiler allocates some fixed memory in your program and initializes it. The name of the array (str) evaluates to the address of this memory, and it cannot be changed because there is no variable which holds that address.
Grijesh Chauhan explains more details of this.
Other cases depend on what you mean. Extending the current case, you can easily see that other initialized arrays have the same properties, for example
int a[] = { 1, 2, 3, 4 };
An array has a non-modifiable address. You need a pointer if you want a modifiable lvalue.
By trying to assign a string literal, you are using its address, and an array cannot be given a different address. That is what makes the assignment illegal.
"hello" occupies some space in memory and has an address. In an initialization the characters stored there are copied into the array; the address itself is not stored.
