I've got a struct x:
struct x {
__s32 array[10];
};
How can I create a pointer to the array x->array if I only have a pointer to the structure?
The straightforward method is the commonly used one:
struct x * ptr = NULL;
//allocation
__s32 * otherPtr = ptr->array; //array name decays to pointer to first element
__s32 (*p) [10] = &(ptr->array); // pointer to whole array.
Otherwise, there's another way, but for specialized cases, quoting C11, chapter §6.7.2.1, Structure and union specifiers
[...] A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
So, if the array is the first member (or the only member, as in the example above) of the structure, a pointer to the structure, suitably converted to the proper type, also points to the beginning of the array member.
In this case, you can use a cast of (__s32 (*)[10]).
The correct way, given a pointer x to the structure, is
__s32 *pointer = x->array;
It is equivalent to
__s32 *pointer = &(x->array[0]);
Related
I have many types of structs in my project, and another struct that holds a pointer to one of these structs. Such as,
struct one{int num = 1;};
struct two{int num = 2;};
struct three{int num = 3;};
// These structs hold many other values as well, but the first value is always `int num`.
And I have another struct that holds references to these structs. I had to use void* because I do not know which of these structs is going to be referenced.
struct Holder{void* any_struct;};
My question is, I need the values inside these structs, but I have a void pointer, could I declare a base struct that the first variable is an int, cast it, and use it to extract the num variable from these structs, such as:
struct Base{int num;};
((Base*) Holder->any_struct)->num
// Gives 1, 2 or 3
If the only thing you need is to extract the num, you can use memcpy.
Assuming it's always int and always first and always present.
int num = 0;
memcpy(&num, Holder->any_struct, sizeof(int));
// Gives 1, 2 or 3 in num.
C99 standard section 6.7.2.1 bullet point 13:
A pointer to a structure object, suitably converted, points to its
initial member. There may be unnamed padding within a structure
object, but not at its beginning.
More info about the standard in this answer.
I think this is acceptable, and I've seen this pattern in other C projects. E.g., in libuv. They define a type uv_handle_t and refer to it as a "Base handle" ... here's info from their page (http://docs.libuv.org/en/v1.x/handle.html)
uv_handle_t is the base type for all libuv handle types.
Structures are aligned so that any libuv handle can be cast to
uv_handle_t. All API functions defined here work with any handle type.
And how they implement it is a pattern you could adopt. They define a macro for the common fields:
#define UV_HANDLE_FIELDS \
/* public */ \
void* data; \
/* read-only */ \
uv_loop_t* loop; \
uv_handle_type type; \
/* private */ \
uv_close_cb close_cb; \
void* handle_queue[2]; \
union { \
int fd; \
void* reserved[4]; \
} u; \
UV_HANDLE_PRIVATE_FIELDS \
/* The abstract base class of all handles. */
struct uv_handle_s {
UV_HANDLE_FIELDS
};
... and then they use this macro to define "derived" types:
/*
* uv_stream_t is a subclass of uv_handle_t.
*
* uv_stream is an abstract class.
*
* uv_stream_t is the parent class of uv_tcp_t, uv_pipe_t and uv_tty_t.
*/
struct uv_stream_s {
UV_HANDLE_FIELDS
UV_STREAM_FIELDS
};
The advantage of this approach is that you can add fields to the "base" class by updating this macro, and then be sure that all "derived" classes get the new fields.
First of all, the various rules of type conversions between different struct types in C are complex and not something one should meddle with unless one knows the rules of what makes two structs compatible, the strict aliasing rule, alignment issues and so on.
That being said, the simplest kind of base class interface is similar to what you have:
typedef struct
{
int num;
} base_t;
typedef struct
{
base_t base;
/* struct-specific stuff here */
} one_t;
one_t one = ...;
...
base_t* ref = (base_t*)&one;
ref->num = 0; // this is well-defined
In this code, the base_t* doesn't point directly at num but at the first object in the struct which is of base_t. It is fine to de-reference it because of that.
However, your original code with the int num spread over 3 structs doesn't necessarily allow you to cast from one struct type to another, even if you only access the initial member num. There are various details regarding strict aliasing and compatible types that may cause problems.
The construct you describe of using a pointer to a "base" structure as an alias to several "derived" structures, while often used with things like struct sockaddr, is not guaranteed to work by the C standard.
While there is some language to suggest it might be supported, particularly 6.7.2.1p15:
Within a structure object, the non-bit-field members and the
units in which bit-fields reside have addresses that increase in
the order in which they are declared. A pointer to a structure
object, suitably converted, points to its initial member (or
if that member is a bit-field, then to the unit in which it
resides), and vice versa. There may be unnamed padding within
a structure object, but not at its beginning.
Other parts suggest it is not, particularly 6.3.2.3 which discusses pointer conversions that are allowed:
1 A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a
pointer to void and back again; the result shall compare equal
to the original pointer.
2 For any qualifier q, a pointer to a non-q-qualified type may be converted to a pointer to the q-qualified version of the type;
the values stored in the original and converted pointers shall compare
equal.
3 An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer
constant. If a null pointer constant is converted to a pointer type,
the resulting pointer, called a null pointer, is guaranteed to compare
unequal to a pointer to any object or function.
4 Conversion of a null pointer to another pointer type yields a null pointer of that type. Any two null pointers shall
compare equal.
5 An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined,
might not be correctly aligned, might not point to an entity
of the referenced type, and might be a trap representation.
6 Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the
result cannot be represented in the integer type, the behavior
is undefined. The result need not be in the range of values
of any integer type.
7 A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly
aligned for the referenced type, the behavior is undefined.
Otherwise, when converted back again, the result shall compare
equal to the original pointer. When a pointer to an object is
converted to a pointer to a character type, the result points to the
lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining
bytes of the object.
8 A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare
equal to the original pointer. If a converted pointer is used to call
a function whose type is not compatible with the referenced type, the
behavior is undefined.
Nothing in the above states that casting from one struct to another, where the type of the first member is the same, is allowed.
What is allowed however is making use of a union to do essentially the same thing. Section 6.5.2.3p6 states:
One special guarantee is made in order to simplify the use of unions:
if a union contains several structures that share a common
initial sequence (see below), and if the union object
currently contains one of these structures, it is permitted
to inspect the common initial part of any of them anywhere that a
declaration of the completed type of the union is visible. Two
structures share a common initial sequence if corresponding
members have compatible types (and, for bit-fields, the same widths)
for a sequence of one or more initial members.
So what you can do is define a union that contains all the possible types as well as the base type:
union various {
struct base { int num; } b;
struct one { int num; int a; } s1;
struct two { int num; double b; } s2;
struct three { int num; char *c; } s3;
};
Then you use this union anywhere you need one of the three subtypes, and you can freely inspect the base member to determine the type. For example:
void foo(union various *u)
{
switch (u->b.num) {
case 1:
printf("s1.a=%d\n", u->s1.a);
break;
case 2:
printf("s2.b=%f\n", u->s2.b);
break;
case 3:
printf("s3.c=%s\n", u->s3.c);
break;
}
}
...
union various u;
u.s1.num = 1;
u.s1.a = 4;
foo(&u);
u.s2.num = 2;
u.s2.b = 2.5;
foo(&u);
u.s3.num = 3;
u.s3.c = "hello";
foo(&u);
This code snippet prints the value 5. I don't understand why.
#include <stdio.h>
struct A
{
int x;
};
struct B
{
struct A a;
int y;
};
void printA(struct A *a)
{
printf("A obj: %d\n", a->x);
}
int main(void)
{
struct B b = {
{
5
},
10
};
struct A *a = (struct A*)&b;
printA(a);
printf("Done.\n");
return 0;
}
When I create b, a pointer to it would point to the data { {5}, 10 }.
When I cast &b to struct A*, I'm assuring the compiler that this struct A* points to a struct of a single data element of data type int. Instead, I'm providing it a pointer to a struct of two data elements of data types struct A, int.
Even if the second variable is ignored (since struct A has only one data member) I am still providing it a struct whose member is of data type struct A, not int.
Thus, when I pass in a to printA, the line a->x is performed, essentially asking to access the first data element of a. The first data element of a is of data type struct A, which is a type mismatch due to the %d expecting a digit, not a struct A.
What exactly is happening here?
When I create b, a pointer to it would point to the data { {5}, 10 }.
Yes, in the sense of that being the text of a type-appropriate and value-correct C initializer. That text itself should not be taken literally as the value of the structure.
When I cast &b to struct A*, I'm assuring the compiler that this
struct A* points to a struct of a single data element of data type
int.
No, not exactly. You are converting the value of the expression &b to type struct A *. Whether the resulting pointer actually points to a struct A is a separate question.
Instead, I'm providing it a pointer to a struct of two data
elements of data types struct A, int.
No, not "instead". Given that struct B's first member is a struct A, and that C forbids padding before the first member of a structure, a pointer to a struct B also points to a struct A -- the B's first member -- in a general sense. As #EricPostpischi observed in comments, the C standard explicitly specifies the outcome in your particular case: given struct B b, converting a pointer to b to type struct A * yields a pointer to b's first member, a struct A.
Even if the second variable is ignored (since struct A has only one
data member) I am still providing it a struct whose member is of data
type struct A, not int.
The first sizeof(struct A) bytes of the representation of a struct B form the representation of its first member, a struct A. That the latter is a member of the former has no physical manifestation other than their overlap in memory.
Even if the language did not explicitly specify it, given your declaration of variable b as a struct B, there would be no practical reason to expect that the expression (struct A*)&b == &b.a would evaluate to false, and there can be no question that the right-hand pointer can be used to access a struct A.
Thus, when I pass in a to printA, the line a->x is performed,
essentially asking to access the first data element of a.
Yes, and this is where an assertion enters that a really does point to a struct A. Which it does in your case, as already discussed.
The first
data element of a is of data type struct A,
No. *a is by definition a struct A. Specifically, it is the struct A whose representation overlaps the beginning of the representation of b. If there were not such a struct A then the behavior would be undefined, but that's not an issue here. Like every struct A, it has a member, designated by x, that is an int.
which is a type mismatch
due to the %d expecting a digit, not a struct A.
You mean expecting an int. And that's what it gets. That's what the expression a->x reads, supposing the behavior is defined at all, because that is the type of that expression. Under different circumstances the behavior might indeed not be defined, but under no circumstance does that expression ever provide a struct A.
What exactly is happening here?
What seems to be happening is that you are imagining different, higher-level semantics than C actually provides. In particular, you seem to have a mental model of structures as lists of distinguishable member objects, and that's leading you to form incorrect expectations.
Perhaps you are more familiar with a weakly typed language such as Perl, or a dynamically typed language such as Python, but C works differently. You cannot look at a C object and usefully ask "what is your type"? Instead, you look at each and every object through the lens of the static type of the expression used to access it.
The language-lawyer explanation of why the code is fine:
Any object pointer in C may be converted to any other object pointer type (C17 6.3.2.3 §7).
Whether it is safe to dereference the pointed-at object after conversion depends on: 1) whether the types are compatible and thereby correctly aligned, and 2) whether the respective pointer types used are allowed to alias.
As a special case, a pointer to a struct type is equivalent to a pointer to its first member. The relevant part of C17 6.7.2.1 §15 says:
A pointer to a structure object,
suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in
which it resides), and vice versa.
This means that (struct A*)&b is fine. &b is suitably converted to the correct type.
There is no violation of "strict aliasing", since we fulfil C17 6.5 §7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
...
an aggregate or union type that includes one of the aforementioned types among its members
The effective type of the initial member being struct A. The lvalue access that happens inside the print function is fine. struct B is also an aggregate type that includes struct A among its members, so strict aliasing violations are impossible, regardless of the initial member rule cited at the top.
There is a special rule in the C standard for this case. C 2011 6.7.2.1 15 says:
A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa.
OK so I was reading the standard paper (ISO C11) in the part where it explains flexible array members (at 6.7.2.1 p18). It says this:
As a special case, the last element of a structure with more than one
named member may have an incomplete array type; this is called a
flexible array member. In most situations, the flexible array member
is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more
trailing padding than the omission would imply. However, when a . (or
->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member,
it behaves as if that member were replaced with the longest array
(with the same element type) that would not make the structure larger
than the object being accessed; the offset of the array shall remain
that of the flexible array member, even if this would differ from that
of the replacement array. If this array would have no elements, it
behaves as if it had one element but the behavior is undefined if any
attempt is made to access that element or to generate a pointer one
past it.
And here are some of the examples given below (p20):
EXAMPLE 2 After the declaration:
struct s { int n; double d[]; };
the structure struct s has a flexible array member d. A typical way to
use this is:
int m = /* some value */;
struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
and assuming that the call to malloc succeeds, the object pointed to
by p behaves, for most purposes, as if p had been declared as:
struct { int n; double d[m]; } *p;
(there are circumstances in which this equivalence is broken; in
particular, the offsets of member d might not be the same).
Added spoilers as examples inside the standard are not documentation.
And now my example (extending the one from the standard):
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
struct s { int n; double d[]; };
int m = 7;
struct s *p = malloc(sizeof (struct s) + sizeof (double [m])); //create our object
printf("%zu", sizeof(p->d)); //retrieve the size of the flexible array member
free(p); //free our object
}
Online example.
Now the compiler is complaining that p->d has incomplete type double[], which is clearly not the case according to the standard. Is this a bug in the GCC compiler?
As a special case, the last element of a structure with more than one named member may have an incomplete array type; ... C11dr 6.7.2.1 18
In the following d is an incomplete type.
struct s { int n; double d[]; };
The sizeof operator shall not be applied to an expression that has function type or an incomplete type ... C11dr §6.5.3.4 1
// This does not change the type of member `d`.
// It (that is `d`) behaves like a `double d[m]`, but it is still an incomplete type.
struct s *p = foo();
// Constraint violation: sizeof applied to an expression of incomplete type
printf("%zu", sizeof(p->d));
This looks like a defect in the Standard. We can see from the paper where flexible array members were standardized, N791 "Solving the struct hack problem", that the struct definition replacement is intended to apply only in evaluated context (to borrow the C++ terminology); my emphasis:
When an lvalue whose type is a structure
with a flexible array member is used to access an object, it behaves as
if that member were replaced by the longest array that would not make
the structure larger than the object being accessed.
Compare the eventual standard language:
[W]hen a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same
element type) that would not make the structure larger than the object being accessed [...]
Some form of language like "When a . (or ->) operator whose left operand is (a pointer to) a structure with a flexible array member and whose right operand names that member is evaluated [...]" would seem to work to fix it.
(Note that sizeof does not evaluate its argument, except for variable length arrays, which are another kettle of fish.)
There is no corresponding defect report visible via the JTC1/SC22/WG14 website. You might consider submitting a defect report via your ISO national member body, or asking your vendor to do so.
Standard says:
C11-§6.5.3.4/2
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand.
and it also says
C11-§6.5.3.4/1
The sizeof operator shall not be applied to an expression that has function type or an incomplete type, [...]
p->d is of incomplete type and it can't be an operand of sizeof operator. The statement
it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed
doesn't hold for the sizeof operator, as it determines the size of the object from the type of its operand, which must be a complete type.
First, what is happening is correct in terms of the standard: array members declared with [] have incomplete type, and you can't apply the sizeof operator to them.
But there is also a simple reason for it in your case. You never told your compiler that in that particular case the d member should be viewed as having a particular size. You only told malloc the total memory size to be reserved and made p point to it. The compiler has obtained no type information that could help it deduce the size of the array.
This is different from allocating a variable length array (VLA) or a pointer to VLA:
double (*q)[m] = malloc(sizeof(double[m]));
Here the compiler can know what type of array q is pointing to. But not because you told malloc the total size (that information is not returned from the malloc call) but because m is part of the type specification of q.
The C Standard is a bit loosey-goosey when it comes to the definition of certain terms in certain contexts. Given something like:
struct foo {uint32_t x; uint16_t y[]; };
char *p = 1024+(char*)malloc(1024); // Point to end of region
struct foo *q1 = (struct foo *)(p -= 512); // Allocate some space from it
... some code which uses *q1
struct foo *q2 = (struct foo *)(p -= 512); // Allocate more space from it
there's no really clear indication of what storage is occupied by objects
*q1 or *q2, nor by q1->y or q2->y. If *q1 will never be accessed afterward,
then q2->y may be treated as a uint16_t[509], but writing to *q1 will trash
the contents of q2->y[254] and above, and writing q2->y[254] and above will
trash *q1. Since a compiler will generally have no way of knowing what will
happen to *q1 in the future, it will have no way of sensibly reporting a size
for q2->y.
Is a pointer to the struct aligned as if it were a pointer to the first element?
or
Is a conversion between a pointer to a struct and a pointer to the type of its first member (or vice versa) ever UB?
(I hope they are the same question...)
struct element
{
tdefa x;
tdefb y;
};
int foo(struct element* e);
int bar(tdefa* a);
~~~~~
tdefa i = 0;
foo((struct element*)&i);
or
struct element e;
bar((tdefa*)&e);
Where tdefa and tdefb could be defined as any type
Background:
I asked this question
and a user in a comment on one of the answers brought up C11 6.3.2.3 p7 that states:
"A pointer to an object type may be converted to a pointer to a
different object type. If the resulting pointer is not correctly
aligned for the referenced type, the behavior is undefined"
However I am having trouble working out when this would become an issue, my understanding was that padding would allow all members of the struct to be aligned correctly. Have I misunderstood?
and if:
struct element e;
tdefa* a = &e.x;
would work then:
tdefa* a = (tdefa*)&e;
would too.
There is never any initial padding; the first member of a struct is required to start at the same address as the struct itself.
You can always access the first member of a struct by casting a pointer to the whole struct, to be a pointer to the type of the first member.
Your foo example might run into trouble because foo will be expecting its argument to point to a struct element which in fact it does not, and there might be an alignment mismatch.
However, the bar example and the final example are fine.
A pointer to a structure always points to its initial member.
Here is the citation directly from C99 standard (6.7.2.1, paragraph 13), emphasis mine:
Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning
As for your foo and bar examples:
The call to bar will be fine, as bar expects a tdefa, which is exactly what it's getting.
The call to foo, however, is problematic. foo expects a full struct element, but you're only passing a tdefa (while the struct consists of both tdefa and tdefb).
I have a struct:
struct TypeValue{
u8 type;
union{
u8 u8value;
s8 s8value;
}value;
};
Depending on the type member, the value is read from either u8value or s8value.
Now I have a struct TypeValue and I want to get a void pointer to the value element (I don't care about the pointer type). Which one is correct:
void *ptr = &typevalue.value; (1)
OR
void *ptr = &typevalue.value.u8value; (2)
(put this in a for loop to find correct value depending on type)
Of course (2) is correct, but I have to loop through the data types to get the pointer to the correct value. My question is whether (1) is also correct: is a pointer to the union equal to a pointer to each of its members?
Does big/little endian affect the result?
void *ptr = &typevalue.value will do, because the C standard (N1570, 6.7.2.1 Structure and union specifiers) says:
16 The size of a union is sufficient to contain the largest of its members. The value of at
most one of the members can be stored in a union object at any time. A pointer to a
union object, suitably converted, points to each of its members (or if a member is a bit-
field, then to the unit in which it resides), and vice versa.
An address only makes sense in the current running environment (you don't store the address to flash), so the endianness of the address doesn't matter.
The three ideas are correct.
void *ptr = &(typevalue.value.u8value);
void *ptr = &(typevalue.value.s8value);
void *ptr = &(typevalue.value);
A union is like a struct in memory whose size is at least that of the largest member declared inside. All the fields share the same memory; the member names only give semantic information about how to interpret it.