I'm trying to understand how 'const' works in C.
What I would like to create is a polygon struct whose members cannot be mutated.
I started by creating the following structs
struct vector2{
float x;
float y;
};
struct polygon2{
const size_t count;
struct vector2* const points;
};
To create a polygon I created the following function:
struct polygon2* polygon2_create(size_t count)
{
struct vector2* points = calloc(count, sizeof *points);
struct polygon2 temp = {.count = count,
.points = points};
struct polygon2* actual = malloc(sizeof *actual);
memcpy(actual, &temp, sizeof(*actual));
return actual;
}
I believe this function doesn't cause undefined behavior.
This way I can do things like
struct polygon2* poly = polygon2_create(30);
poly->points[3] = (struct vector2){7.1, 5.3};
But I can't do
poly->points = NULL;
Nor
poly->count = 3;
Which is great. I'm sure that I won't accidentally change the contents of struct polygon.
But I'd like to make vector2's members const too.
If I change vector2 to:
struct vector2{
const float x;
const float y;
};
I no longer am able to do this:
poly->points[3] = (struct vector2){7.1, 5.3};
I'd like to know why. I expected that making vector2's members const would prevent me from doing this
poly->points[3].x = 3;
But I'd still be able to do this
poly->points = otherpoint;
Can someone explain what I am missing? And how can I achieve the following:
create an "immutable" vector2 struct
create a polygon struct whose points or count member can't be changed, but the things pointed to by points can be 'swapped'.
const qualification of a type means that lvalue expressions of that type are not modifiable. In particular, lvalues of const type or of composite type with at least one const member, recursively, cannot be the left-hand side of an assignment operator or the operand of an increment or decrement operator.
Moreover, qualified types, including const-qualified ones, are different types from their unqualified counterparts and from differently-qualified versions of the underlying unqualified type. This has implications for the compatibility of composite types that have qualified members.
On the other hand, do not mistake const for a promise that the value is actually constant. The same object can be designated by multiple lvalues, some const and others non-const. In that case, the object can be modified via any of the non-const lvalues that designate it, and those modifications will be visible even via the const lvalues that designate it.
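To make that last point concrete, here is a minimal sketch. The names read_view and const_view_demo are made up for illustration; the point is only that a change made through the non-const name is visible through the const-qualified view of the same object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reads an int through a pointer-to-const view. */
static int read_view(const int *view)
{
    assert(view != NULL);
    return *view;
}

/* Returns the value seen through the const view after the object
   has been modified through its ordinary, non-const name. */
int const_view_demo(void)
{
    int value = 1;             /* the object itself is not const */
    const int *view = &value;  /* *view is a const lvalue for the same object */

    value = 42;                /* fine: 'value' is a modifiable lvalue */
    /* *view = 7;  would be rejected: *view is not modifiable */

    return read_view(view);    /* observes the change made via 'value' */
}
```

So const here restricts what can be done through a particular expression; it says nothing about whether the object changes by other routes.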
With respect to the specifics of your question:
I agree that your function polygon2_create() is valid and has well-defined behavior. In particular, const members of a struct can be initialized in an initializer, and functions such as memcpy() can modify memory in which an object that can be referenced via a const lvalue is stored. Your compiler might warn about the memcpy(), though.
More generally, the initialization and assignment behavior and constraints you describe are correct.
As for poly->points[3] = (struct vector2){7.1, 5.3};, how would it make sense for that to be acceptable if the members of struct vector2 were const? If allowed, the assignment certainly would modify them, and preventing that is exactly the point of const. Or if you prefer a citation to authority, C2011 6.3.2.1/1 specifies that if a structure type has any const members, then lvalue expressions of that type are not modifiable.
It sounds like you are confused about the semantics of whole-struct assignment. If you assign one struct to a different struct, you are not replacing one struct with the other; rather you are copying the value of one struct to the other. This is exactly analogous to assignments to simple types, such as int.
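A small sketch of that copy semantics, using a hypothetical struct point with non-const members so that the assignment is allowed at all:

```c
#include <assert.h>

struct point { float x; float y; };

/* Assigns b to a, then changes b, and returns a.x to show that
   a holds its own copy of the values rather than referring to b. */
float assign_then_change_source(void)
{
    struct point a = {1.0f, 2.0f};
    struct point b = {3.0f, 4.0f};

    a = b;                              /* copies b.x and b.y into a's members */
    assert(a.x == 3.0f && a.y == 4.0f); /* a's own members were overwritten */

    b.x = 9.0f;                         /* a later change to b does not touch a */
    return a.x;
}
```

Because the assignment writes into a's members, it is exactly the kind of modification that const members forbid.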
You asked,
how can I achieve the following:
create an "immutable" vector2 struct
You already know how to do this, to the extent that it is possible. If you make all the members const then they cannot be modified via an expression of type struct vector2. As I remarked before, however, this does not confer absolute immutability. C has no such thing.
create a polygon struct whose points or count member can't be changed, but the things pointed to by points can be 'swapped'.
I'm not sure I understand how an ability to swap points is consistent with your desired level of unmodifiability. Certainly, if struct vector2 has const members then you cannot assign to lvalue expressions of that type. You could still perform swapping via memcpy(), though, or by casting to a modifiable type. These mechanisms do, however, violate at least the spirit of const-ness. Your compiler will likely warn about them.
You could consider changing points from a struct vector2 * const to a struct vector2 ** const. You can then swap the (non-const) struct vector2 * objects accessible via *points:
struct vector2 *temp = poly->points[3];
poly->points[3] = poly->points[2];
poly->points[2] = temp;
Your focus on immutability makes me wonder whether you come from a Java background; either way, that's more Java-esque, since all non-primitives in Java are references, which are more or less pointers.
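A fuller sketch of the pointer-to-pointer layout, assuming the same memcpy-based construction as in the question. polygon2_swap is a hypothetical helper name, and ownership of the individual vector2 objects is deliberately left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct vector2 { const float x; const float y; };

struct polygon2 {
    const size_t count;
    struct vector2 **const points; /* const pointer to an array of modifiable pointers */
};

/* Allocates a polygon with 'count' empty (NULL) slots.
   Returns NULL on allocation failure (and possibly for count == 0,
   where calloc is allowed to return NULL). */
struct polygon2 *polygon2_create(size_t count)
{
    struct vector2 **points = calloc(count, sizeof *points);
    struct polygon2 *actual = malloc(sizeof *actual);
    if (points == NULL || actual == NULL) {
        free(points);
        free(actual);
        return NULL;
    }
    struct polygon2 temp = { .count = count, .points = points };
    memcpy(actual, &temp, sizeof *actual); /* initializes the const members */
    return actual;
}

/* Swaps two slots; the vector2 objects themselves are never written to. */
void polygon2_swap(struct polygon2 *poly, size_t i, size_t j)
{
    assert(poly != NULL && i < poly->count && j < poly->count);
    struct vector2 *temp = poly->points[i];
    poly->points[i] = poly->points[j];
    poly->points[j] = temp;
}
```

The cost is an extra level of indirection and one more allocation to manage per point, which is part of why this feels more like Java than C.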
Overall, however, I think you are overly focused on immutability. const-ness will cause you trouble, especially with memory management. Consider doing without, at least for your struct members themselves. At most, use const qualification on function parameter types to express that the function will not modify the actual argument, and perhaps on global variables where you want at least to be warned about any possibility that they will be modified.
Related
I came across older code in a project that declares a struct in the H file as
struct A {
const int i;
};
Now the function that creates struct A pointers internally looks like this
struct A * newStructA ( int i ) {
struct B * ptr = malloc(sizeof(struct B));
ptr->i = i;
return (struct A *)ptr;
}
and struct B is declared in the C file and looks like this:
struct B {
int i;
};
Question 1: Is that even allowed according to C standard? Sure, the data type of i is the same and the struct should have the same memory layout, but is that guaranteed or could the const modifier also change the memory layout on certain systems?
Question 2: Knowing that all struct A * pointers the code deals with are in fact struct B * pointers, would it be allowed to cast the pointers back to struct B * and then modify the int value? The declaration says it is const, but I know for sure that it isn't as it's always located in modifiable heap memory. Or could that have implications as the C compiler may rely on the value being constant, so it assumes it cannot ever change and thus if two lines of code contain ptr->i the compiler may not even fetch the value a second time as how could it have changed if it is const? As that would lead to very hard to trace bugs that may only be seen if a certain optimization level is being used.
Question 3: Is there a better way to achieve a const value that external code cannot directly change (or at least should never try to), yet internal code can change as the value is in fact not const at all? The only way I can think of would be to hide the struct layout altogether (struct A;) but then I need to provide a function like int getI(struct A * ptr) and always access the value using that function.
Are you allowed to modify a C value that claims to be const if you know for sure that it actually isn't constant?
Yes.
Is that even allowed according to C standard?
Yes.
is that guaranteed or could the const modifier also change the memory layout on certain systems?
It could. When a variable is defined as const, the implementation may place it in a different memory region, possibly a read-only one.
Knowing that all struct A * pointers the code deals with are in fact struct B * pointers, would it be allowed to cast the pointers back to struct B * and then modify the int value?
It is unclear. See the related question: Can you safely cast a C structure with non-const members to an equivalent structure with const members?
Or could that have implications as the C compiler may rely on the value being constant, so it assumes it cannot ever change and thus if two lines of code contain ptr->i the compiler may not even fetch the value a second time as how could it have changed if it is const?
Yes. Related https://stackoverflow.com/a/20707255/9072753 .
Is there a better way to achieve a const value that external code cannot directly change (or at least should never try to), yet internal code can change as the value is in fact not const at all?
This is C. The spirit of C says https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2086.htm :
(1) Trust the programmer.
(2) Don't prevent the programmer from doing what needs to be done.
In my opinion, a better way is not to hide and just use struct with non-const members. A C programmer will be able to access it anyway.
The only way I can think of would be to hide the struct layout altogether (struct A;) but then I need to provide a function like int getI(struct A * ptr)
Yes, hiding something will cause runtime overhead. FILE * has been with us since forever, and the members of FILE are visible to user code in many implementations. Yet no one uses them.
To achieve that a value should not be changed by external code, write a specification of your library that external code should not do it.
To hide your proprietary code, use accessors and let users operate only on pointers to your data with PIMPL idiom.
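A minimal sketch of that accessor approach, shown in one block for brevity. In a real project the first part would live in the header and the second in the C file; the function names a_create, a_get_i and a_destroy are assumptions for illustration, not taken from the question's code base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* --- what the public header might expose --- */
struct A;                           /* incomplete type: layout is hidden */
struct A *a_create(int i);
int a_get_i(const struct A *a);
void a_destroy(struct A *a);

/* --- what the implementation file would contain --- */
struct A {
    int i;                          /* plain int: internal code may change it */
};

struct A *a_create(int i)
{
    struct A *a = malloc(sizeof *a);
    if (a != NULL)
        a->i = i;
    return a;
}

int a_get_i(const struct A *a)
{
    assert(a != NULL);
    return a->i;
}

void a_destroy(struct A *a)
{
    free(a);
}
```

External code can hold and pass around struct A * but cannot name the member i, so there is no const member to cast around and no second struct type to keep in sync.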
C 2011 draft
6.7.3 Type qualifiers
...
6 If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is made to refer to an object defined with a volatile-qualified type through use of an lvalue with non-volatile-qualified type, the behavior is undefined.133)
133) This applies to those objects that behave as if they were defined with qualified types, even if they are never actually defined as objects in the program (such as an object at a memory-mapped input/output address).
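The wording "defined with a const-qualified type" is the key part of that paragraph. A rough sketch of the distinction, with hypothetical function names: writing through a cast is defined when the underlying object was not itself defined const, and undefined when it was.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes through a pointer-to-const by casting the
   qualifier away. Only defined if the pointed-to object was NOT defined
   with a const-qualified type. */
static void set_through_cast(const int *p, int v)
{
    assert(p != NULL);
    *(int *)p = v;
}

int cast_away_const_demo(void)
{
    int plain = 1;                /* not defined const */
    set_through_cast(&plain, 5);  /* defined: the object itself is modifiable */

    /* const int fixed = 1;
       set_through_cast(&fixed, 5);   undefined behavior per 6.7.3p6 */

    return plain;
}
```

Some compilers can warn about the cast (for example with -Wcast-qual in gcc and clang), but it is not a constraint violation.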
With these definitions:
struct My_Header { uintptr_t bits; };
struct Foo_Type { struct My_Header header; int x; };
struct Foo_Type *foo = ...;
struct Bar_Type { struct My_Header header; float x; };
struct Bar_Type *bar = ...;
Is it correct to say that this C code ("case one"):
foo->header.bits = 1020;
...is actually different semantically from this code ("case two"):
struct My_Header *alias = &foo->header;
alias->bits = 1020;
My understanding is that they should be different:
Case One considers the assignment unable to affect the header in a Bar_Type. It only is seen as being able to influence the header in other Foo_Type instances.
Case Two, by forcing access through a generic aliasing pointer, will cause the optimizer to realize all bets are off for any type which might contain a struct My_Header. It would be synchronized with access through any pointer type. (e.g. if you had a Foo_Type which was pointing to what was actually a Bar_Type, it could access through the header and reliably find out which it had--assuming that's something the header bits could tell you.)
This relies on the optimizer not getting "smart" and making case two back into case one.
The code bar->header.bits = 1020; is exactly identical to struct My_Header *alias = &bar->header; alias->bits = 1020;.
The strict aliasing rule is defined in terms of access to objects through lvalues:
6.5p7 An object shall have its stored value accessed
only by an lvalue expression that has one of the following
types:
The only things that matter are the type of the lvalue, and the effective type of the object designated by the lvalue. Not whether you stored some intermediate stages of the lvalue's derivation in a pointer variable.
NOTE: The question was edited since the following text was posted. The following text applies to the original question where the space was allocated by malloc, not the current question as of August 23.
Regarding whether the code is correct or not. Your code is equivalent to Q80 effective_type_9.c in N2013 rev 1571, which is a survey of existing C implementations with an eye to drafting improved strict aliasing rules.
Q80. After writing a structure to a malloc’d region, can its members be accessed via a pointer to a different structure type that has the same leaf member type at the same offset?
The stumbling block is whether the code (*bar).header.bits = 1020; sets the effective type of only the int bits; or of the entire *bar. And accordingly, whether reading (*foo).header.bits reads an int, or does it read the entire *foo?
Reading only an int would not be a strict aliasing violation (it's OK to read int as int); but reading a Bar_Type as a Foo_Type would be a violation.
The authors of this paper consider the write to set the effective type for the entire *bar, although they don't give their justification for that, and I do not see any text in the C Standard to support that position.
It seems to me there's no definitive answer currently for whether or not your code is correct.
The fact that you have two structures which contain My_Header is a red herring and complicates your thinking without bringing anything new to the table. Your problem can be stated and clarified without any struct (other than My_Header, of course).
foo->header.bits = 1020;
The compiler clearly knows which object to modify.
struct My_Header *alias = &foo->header;
alias->bits = 1020;
Again the same is true here: with a very rudimentary analysis the compiler knows exactly which object the alias->bits = 1020; modifies.
The interesting part comes here:
void foo(struct My_Header* p)
{
p->bits = 1020;
}
In this function the pointer p can alias any object (or sub-object) of type My_Header. It really doesn't matter whether you have N structures that contain My_Header members or none at all. Any object of type My_Header could potentially be modified in this function.
E.g.
// global:
struct My_Header* global_p;
uintptr_t foo(struct My_Header* p)
{
p->bits = 1020;
global_p->bits = 15;
return p->bits;
// the compiler can't just return 1020 here because it doesn't know
// if `p` and `global_p` both alias the same object or not.
}
To convince you that Foo_Type and Bar_Type are red herrings and don't matter, look at this example, for which the analysis is identical to the previous case even though the write now goes through a Foo_Type:
// global:
struct My_Header* global_p;
uintptr_t foo(struct Foo_Type* foo)
{
foo->header.bits = 1020;
global_p->bits = 15;
return foo->header.bits;
// the compiler can't just return 1020 here because it doesn't know
// if `foo->header` and `global_p` both alias the same object or not.
}
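A compilable version of that last case, as a sketch. The function names store_then_reload and alias_demo are made up; alias_demo deliberately makes global_p point at the header inside the Foo_Type, so the reload has to observe the second store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct My_Header { uintptr_t bits; };
struct Foo_Type { struct My_Header header; int x; };

static struct My_Header *global_p;

/* Writes through foo, then through global_p, then reloads through foo.
   Both stores use lvalues of the same type, so the compiler has to
   assume they may refer to the same object. */
uintptr_t store_then_reload(struct Foo_Type *foo)
{
    assert(foo != NULL && global_p != NULL);
    foo->header.bits = 1020;
    global_p->bits = 15;
    return foo->header.bits;
}

/* Sets up the aliasing case: global_p points at f.header. */
uintptr_t alias_demo(void)
{
    struct Foo_Type f = { {0}, 0 };
    global_p = &f.header;
    return store_then_reload(&f);
}
```

Here the second store overwrites the first, so the reload yields 15 rather than 1020.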
The way N1570 6.5p7 is written, the behavior of code that accesses individual members of structures or unions will only be defined if the accesses are performed using lvalues of character types, or by calling library functions like memcpy. Even if a struct or union has a member of type T, the Standard (deliberately IMHO) refrains from giving blanket permission to access that part of the aggregate's storage using seemingly-unrelated lvalues of type T. Presently, gcc and clang seem to grant blanket permission for accessing structs, but not unions, using lvalues of member type, but N1570 6.5p7 doesn't require that. It applies the same rules to both kinds of aggregates and their members. Because the Standard doesn't grant blanket permission to access structures using unrelated lvalues of member type, and granting such permission impairs useful optimizations, there's no guarantee gcc and clang will continue this behavior with unrelated lvalues of member types.
Unfortunately, as can be demonstrated using unions, gcc and clang are very poor at recognizing relationships among lvalues of different types, even when one lvalue is quite visibly derived from the other. Given something like:
struct s1 {short x; short y[3]; long z; };
struct s2 {short x; char y[6]; };
union U { struct s1 v1; struct s2 v2; } unionArr[100];
int i;
Nothing in the Standard would distinguish between the "aliasing" behaviors of the following pairs of functions:
int test1a(int i)
{
return unionArr[i].v1.x;
}
void test1b(int j)
{
unionArr[j].v2.x = 1;
}
int test2a(int i)
{
struct s1 *p = &unionArr[i].v1;
return p->x;
}
void test2b(int j)
{
struct s2 *p = &unionArr[j].v2;
p->x = 1;
}
Both of them use an lvalue of type short to access the storage associated with objects of type struct s1, struct s2, union U, and union U[100], even though short is not listed as an allowable type for accessing any of those.
While it may seem absurd that even the first form would invoke UB, that shouldn't be a problem if one recognizes support for access patterns beyond those explicitly listed in the Standard as a Quality of Implementation issue. According to the published rationale, the authors of the Standard thought compiler writers would try to produce high-quality implementations, and it was thus not necessary to forbid "conforming" implementations from being of such low quality as to be useless. An implementation could be "conforming" without being able to handle test1a() or test2b() in cases where they would access member v2.x of a union U, but only in the sense that an implementation could be "conforming" while being incapable of correctly processing anything other than some particular contrived and useless program.
Unfortunately, although I think the authors of the Standard would likely have expected that quality implementations would be able to handle code like test2a()/test2b() as well as test1a()/test1b(), neither gcc nor clang supports that pattern reliably(*). The stated purpose of the aliasing rules is to avoid forcing compilers to allow for aliasing in cases where there's no evidence of it, and where the possibility of aliasing would be "dubious" [doubtful]. I've seen no evidence that they intended that quality compilers wouldn't recognize that code which takes the address of unionArr[i].v1 and uses it is likely to access the same storage as other code that uses unionArr[i] (which is, of course, visibly associated with unionArr[i].v2). The authors of gcc and clang, however, seem to think it's possible for something to be a quality implementation without having to consider such things.
(*) Given e.g.
int test(int i, int j)
{
if (test2a(i))
test2b(j);
return test2a(i);
}
neither gcc nor clang will recognize that if i==j, test2b(j) would access the same storage as test2a(i), even though both would access the same element of the same array.
I'm trying to implement a linked list like this:
typedef struct SLnode
{
void* item;
void* next;
} SLnode;
typedef struct DLnode
{
void* item;
void* next;
struct DLnode* prev;
} DLnode;
typedef struct LinkedList
{
void* head; /*SLnode if doubly_linked is false, otherwise DLnode*/
void* tail; /* here too */
bool doubly_linked;
} LinkedList;
And I want to access it like this:
void* llnode_at(const LinkedList* ll, size_t index)
{
size_t i;
SLnode* current;
current = ll->head;
for(i = 0; i < index; i++)
{
current = current->next;
}
return current;
}
So my question is:
Am I allowed to cast between these structs as long as I only access the common members? I read differing opinions on this.
Could I also make the next-pointer of the respective types? Or would it be UB then to use it in my example function in case it really is DLnode?
In case this doesn't work, are there any other ways of doing something like this? I read that unions might work, but this code should also run in C89, and afaik reading a different union member than last written to is UB there.
So you are trying to build subclasses in C. A possible way is to make the base struct the first element of the child struct, because in that case the C standard explicitly allows casting back and forth between those two types:
6.7.2.1 Structure and union specifiers
§ 13 ... A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa...
The downside is that you need a cast to the base class to access its members:
Example code:
typedef struct SLnode
{
void* item;
void* next;
} SLnode;
typedef struct DLnode
{
struct SLnode base;
struct DLnode* prev;
} DLnode;
You can then use it that way:
DLnode *node = malloc(sizeof(DLnode));
((SLnode*) node)->next = NULL; // or node->base.next = NULL
((SLnode *)node)->item = val;
node->prev = NULL;
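Building on that layout, the traversal from the question could look roughly like this. It is a sketch that sticks to C89 syntax; next is typed as SLnode * here instead of void *, dl_link is a hypothetical helper, and the function takes the head node directly rather than a LinkedList.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SLnode
{
    void *item;
    struct SLnode *next;
} SLnode;

typedef struct DLnode
{
    SLnode base;            /* must stay the first member */
    struct DLnode *prev;
} DLnode;

/* Follows 'index' links from head using only the SLnode part.
   Returns NULL if the list is shorter than that. */
SLnode *llnode_at(SLnode *head, size_t index)
{
    SLnode *current = head;
    size_t i;
    for (i = 0; i < index && current != NULL; i++)
        current = current->next;
    return current;
}

/* Hypothetical helper: links two doubly linked nodes in order. */
void dl_link(DLnode *first, DLnode *second)
{
    assert(first != NULL && second != NULL);
    first->base.next = &second->base;   /* no cast needed */
    second->prev = first;
}
```

When the list is known to be doubly linked, the result can be converted back with (DLnode *), which is the "vice versa" direction of the quoted paragraph.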
You can do this safely provided you use a union to contain the two structures:
union Lnode {
struct SLnode slnode;
struct DLnode dlnode;
};
Section 6.5.2.3 of the current C standard, as well as section 6.3.2.3 of the C89 standard, states the following:
6 One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common
initial sequence (see below), and if the union object currently
contains one of these structures, it is permitted to inspect the
common initial part of any of them anywhere that a declaration of the
completed type of the union is visible. Two structures share a common
initial sequence if corresponding members have compatible types (and,
for bit-fields, the same widths) for a sequence of one or more initial
members.
Because the first two members of both structures are of the same type, you can freely access those members using either union member.
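As a sketch of what that looks like in use, assuming the node types from the question and a hypothetical accessor lnode_item. The read goes directly through the union type, which is the form the quoted guarantee talks about.

```c
#include <assert.h>
#include <stddef.h>

struct SLnode { void *item; void *next; };
struct DLnode { void *item; void *next; struct DLnode *prev; };

union Lnode {
    struct SLnode slnode;
    struct DLnode dlnode;
};

/* Reads 'item' via the slnode member, whichever member was stored last.
   item and next form the common initial sequence of both structs. */
void *lnode_item(const union Lnode *n)
{
    assert(n != NULL);
    return n->slnode.item;
}
```

The price is that every node is as large as the biggest member of the union, even in a singly linked list.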
What you describe should be allowed under the C Standard. The confusion about the Common Initial Sequence rule stems from a bigger problem: the Standard fails to specify when the use of a pointer or lvalue which is visibly derived from another is considered to be a use of the original. If the answer is "never", then any struct or union member of a non-character type would be pretty much useless, since the member would be an lvalue whose type isn't valid for accessing the struct or union. Such a view would clearly be absurd. If the answer is "only when it is formed by directly applying '.' or '->' to the struct or union type, or a pointer to such a type", that would make the ability to use "&" on struct and union members rather useless. I'd regard that view as only slightly less absurd.
I think it's clear that in order to be useful the C language must be viewed as allowing derived lvalues to be used in at least some circumstances. Whether your code, or most code relying upon the Common Initial Sequence rule, is usable depends upon what those circumstances are.
The language would be rather silly if code couldn't reliably use derived lvalues to access structure members. Unfortunately, even though this problem was apparent in 1992 (it forms the underlying basis of Defect Report #028, published in that year) the Committee didn't address the fundamental issue but instead reached a correct conclusion based upon totally nonsensical logic, and has since gone and added needless complexity in the form of "Effective Types" without ever bothering to actually define the behavior of someStruct.member.
Consequently, there is no way to write any code which does much of anything with structs or unions without relying upon more behaviors than would actually be guaranteed by a literal reading of the Standard, whether such accesses are done by coercing void* or pointers to proper member types.
If one reads the intention of 6.5p7 as being to somehow allow actions which use an lvalue which is derived from one of a particular type to access objects of that type, at least in cases that don't involve actual aliasing (not a huge stretch, given footnote #88 "The intent of this list is to specify those circumstances in which an object may or may not be aliased."), and recognizes that aliasing requires that a region of storage be accessed using a reference X at a time when there exists another reference from which X was not visibly derived that will in future be used to access the storage in conflicting fashion, then compilers that honor that intention should be able to handle code like yours without difficulty.
Unfortunately, both gcc and clang seem to interpret 6.5p7 as saying that an lvalue which is derived from one of another type should often be presumed incapable of actually identifying objects of that former type even in cases where the derivation is fully visible.
Given something like:
struct s1 {int x;};
struct s2 {int x;};
union u {struct s1 v1; struct s2 v2;};
int test(union u arr[], int i1, int i2)
{
struct s1 *p1 = &arr[i1].v1;
if (p1->x)
{
struct s2 *p2 = &arr[i2].v2;
p2->x=23;
}
struct s1 *p3 = &arr[i1].v1;
return p3->x;
}
At the time p1->x is accessed, p1 is clearly derived from an lvalue of union type, and should thus be capable of accessing such an object, and the only other existing references that will ever be used to access the storage are references to that union type. Likewise when p2->x and p3->x are accessed. Unfortunately, both gcc and clang interpret N1570 6.5p7 as an indication that they should ignore the relationships between the union and the pointers to its members. If gcc and clang can't be relied upon to usefully allow code like the above to access the Common Initial Sequence of identical structures, I wouldn't trust them to reliably handle structures like yours either.
Unless or until the Standard is corrected to say under what cases a derived lvalue may be used to access a member of a struct or union, it's unclear that any code that does anything remotely unusual with structures or unions should be particularly expected to work under the -fstrict-aliasing dialects of gcc and clang. On the other hand, if one recognizes the concept of lvalue derivation as working both ways, a compiler might be justified in assuming that a pointer which is of one structure type won't be used in ways that would alias a reference to another, even if the pointer is cast to the second type before use. I'd therefore suggest that using void* would be less likely to run into trouble if the Standard ever fixes the rules.
Now I know I can implement inheritance by casting the pointer to a struct to the type of the first member of this struct.
However, purely as a learning experience, I started wondering whether it is possible to implement inheritance in a slightly different way.
Is this code legal?
#include <stdio.h>
#include <stdlib.h>
struct base
{
double some;
char space_for_subclasses[];
};
struct derived
{
double some;
int value;
};
int main(void) {
struct base *b = malloc(sizeof(struct derived));
b->some = 123.456;
struct derived *d = (struct derived*)(b);
d->value = 4;
struct base *bb = (struct base*)(d);
printf("%f\t%f\t%d\n", d->some, bb->some, d->value);
return 0;
}
This code seems to produce the desired results, but as we know, that is far from proving it is not UB.
The reason I suspect that such code might be legal is that I cannot see any alignment issues that could arise here. But of course this is far from knowing that no such issues arise, and even if there are indeed no alignment issues, the code might still be UB for some other reason.
Is the above code valid?
If it's not, is there any way to make it valid?
Is char space_for_subclasses[]; necessary? Having removed this line the code still seems to be behaving itself
As I read the standard, chapter §6.2.6.1/P5,
Certain object representations need not represent a value of the object type. If the stored
value of an object has such a representation and is read by an lvalue expression that does
not have character type, the behavior is undefined. [...]
So, as long as space_for_subclasses is a char (array-decays-to-pointer) member and you use it to read the value, you should be OK.
That said, to answer
Is char space_for_subclasses[]; necessary?
Yes, it is.
Quoting §6.7.2.1/P18,
As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply. However, when a . (or ->) operator has a left operand that is
(a pointer to) a structure with a flexible array member and the right operand names that
member, it behaves as if that member were replaced with the longest array (with the same
element type) that would not make the structure larger than the object being accessed; the
offset of the array shall remain that of the flexible array member, even if this would differ
from that of the replacement array. If this array would have no elements, it behaves as if
it had one element but the behavior is undefined if any attempt is made to access that
element or to generate a pointer one past it.
Remove that and you'd be accessing invalid memory, causing undefined behavior. However, in your case (the second snippet), you're not accessing value anyway, so that is not going to be an issue here.
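For comparison, this is the kind of use the quoted flexible array member paragraph is written for: one allocation that holds the fixed part plus a run-time number of trailing elements. The struct message type and message_create function are hypothetical and not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct message {
    size_t length;
    char text[];            /* flexible array member: must be last */
};

/* Allocates the header and the string (with its terminator) in one block.
   Returns NULL on allocation failure. */
struct message *message_create(const char *s)
{
    size_t n;
    struct message *m;

    assert(s != NULL);
    n = strlen(s);
    m = malloc(sizeof *m + n + 1);  /* sizeof *m does not count text[] */
    if (m == NULL)
        return NULL;
    m->length = n;
    memcpy(m->text, s, n + 1);      /* text behaves like char[n + 1] here */
    return m;
}
```

Note that nothing in that paragraph says the trailing storage may be reinterpreted as the members of a different struct type, which is the part the question is really about.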
This is more-or-less the same poor man's inheritance used by struct sockaddr, and it is not reliable with the current generation of compilers. The easiest way to demonstrate a problem is like this:
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
struct base
{
double some;
char space_for_subclasses[];
};
struct derived
{
double some;
int value;
};
double test(struct base *a, struct derived *b)
{
a->some = 1.0;
b->some = 2.0;
return a->some;
}
int main(void)
{
void *block = malloc(sizeof(struct derived));
if (!block) {
perror("malloc");
return 1;
}
double x = test(block, block);
printf("x=%g some=%g\n", x, *(double *)block);
return 0;
}
If a->some and b->some were allowed by the letter of the standard to be the same object, this program would be required to print x=2 some=2 (%g drops the trailing zeros), but with some compilers and under some conditions (it won't happen at all optimization levels, and you may have to move test to its own file) it will print x=1 some=2 instead.
Whether the letter of the standard does allow a->some and b->some to be the same object is disputed. See http://blog.regehr.org/archives/1466 and the paper it links to.
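One layout that sidesteps the dispute, as a sketch: embed the base struct as the first member of the derived one, so that both stores go through an lvalue of the same struct type. The base here has no flexible array member, because a struct that has one cannot be a member of another struct. test_embedded is a hypothetical name mirroring test above.

```c
#include <assert.h>
#include <stddef.h>

struct base
{
    double some;
};

struct derived
{
    struct base base;   /* first member: &d and &d.base are the same address */
    int value;
};

/* Both stores are made through 'struct base' lvalues, so the compiler
   has to allow for a and &b->base designating the same object. */
double test_embedded(struct base *a, struct derived *b)
{
    assert(a != NULL && b != NULL);
    a->some = 1.0;
    b->base.some = 2.0;
    return a->some;
}
```

Called as test_embedded(&d.base, &d) for some struct derived d, the two pointers do alias and the function returns the value of the second store.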
NOTE: this is NOT a C++ question, i can't use a C++ compiler, only a C99.
Is this valid(and acceptable, beautiful) code?
typedef struct sA{
int a;
} A;
typedef struct aB{
struct sA a;
int b;
} B;
A aaa;
B bbb;
void init(){
bbb.b=10;
bbb.a.a=20;
set((A*)&bbb);
}
void set(A* a){
aaa=*a;
}
void useLikeB(){
printf("B.b = %d", ((B*)&aaa)->b);
}
In short, is it valid to cast a "sub class" to its "super class" and later cast the "super class" back to the "sub class" when I need its specific behavior?
Thanks
First of all, the C99 standard permits you to cast any struct pointer to a pointer to its first member, and the other way (6.7.2.1 Structure and union specifiers):
13 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
In other words, in your code you are free to:
(1) Convert B* to A*, which will always work correctly,
(2) Convert A* to B*, but if it doesn't actually point to a B, you're going to get random failures accessing the further members,
(3) Assign the structure pointed to through an A* to an A, but if the pointer was converted from B*, only the common members will be assigned and the remaining members of B will be ignored,
(4) Assign the structure pointed to through a B* to an A, but you have to convert the pointer first, and note (3).
So, your example is almost correct. But useLikeB() won't work correctly since aaa is a struct of type A which you assigned like stated in point (4). This has two results:
The non-common B members won't be actually copied to aaa (as stated in (3)),
Your program will fail randomly trying to access A like B which it isn't (you're accessing a member which is not there, as stated in (2)).
To explain that in a more practical way: when you declare aaa, the compiler reserves the amount of memory necessary to hold all members of A. B has more members, and thus requires more memory. As aaa is a regular variable, it can't change its size at run time and thus can't hold the remaining members of B.
And as a note, by (1) you can practically take a pointer to the member instead of converting the pointer which is nicer, and it will allow you to access any member, not only the first one. But note that in this case, the opposite won't work anymore!
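A short sketch of points (1) and (3) together, using the question's types (with the second struct tag spelled sB here) and a hypothetical slicing_demo function: the address of the member is passed instead of a cast pointer, and the assignment inside set copies only the A part.

```c
#include <assert.h>
#include <stddef.h>

typedef struct sA {
    int a;
} A;

typedef struct sB {
    struct sA a;
    int b;
} B;

static A aaa;

/* Copies only sizeof(A) worth of members; there is no room in aaa for b. */
void set(const A *a)
{
    assert(a != NULL);
    aaa = *a;
}

int slicing_demo(void)
{
    B bbb;
    bbb.b = 10;
    bbb.a.a = 20;
    set(&bbb.a);        /* pointer to the member: no cast required */
    return aaa.a;       /* the B-only member b was not copied anywhere */
}
```

After the call, aaa is still just an A, so treating &aaa as a B* and reading b would access memory beyond the object.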
I think this is quite dirty and relatively hazardous. What are you trying to achieve with this? Also, there is no guarantee that aaa is a B; it might also be an A, so when someone calls useLikeB() it might fail. Also, depending on the architecture, "int a" and "pointer to struct a" might or might not overlap correctly, which can produce surprising results when you assign to "int a" and then access "struct a".
Why would you do this? Having
set((A*)&bbb);
is not easier to write than the correct
set(&bbb.a);
Other things that you should please avoid when you post here:
you use set before it is declared
aaa=a should be aaa = *a
First of all, I agree with most concerns from previous posters about the safety of these assignments.
With that said, if you need to go that route, I'd add one level of indirection and some type-safety checkers.
static const int struct_a_id = 1;
static const int struct_b_id = 2;
struct MyStructPtr {
int type;
union {
A* ptra;
B* ptrb;
//continue if you have more types.
};
};
The idea is that you manage your pointers by passing them through a struct that contains some "type" information. You can build a tree of classes on the side that describe your class tree (note that given the restrictions for safely casting, this CAN be represented using a tree) and be able to answer questions to ensure you are correctly casting structures up and down. So your "useLikeB" function could be written like this.
struct MyStructPtr the_ptr;
void init_ptr(A* pa)
{
the_ptr.type = struct_a_id;
the_ptr.ptra = pa;
}
void useLikeB(){
//This function should FAIL IF the_ptr CAN'T BE SAFELY CAST TO B
//by checking in your type tree that the stored type is below the
//b type (not necessarily a direct child).
assert( is_castable_to(the_ptr.type,struct_b_id ) );
printf("B.b = %d", the_ptr.ptrb->b);
}
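For completeness, a self-contained sketch of the same idea with the missing piece filled in. is_castable_to is written here as a hard-coded rule for just two types (a B may be used as an A, not the reverse); the enum, the named union member u and the as_b helper are all assumptions for illustration, and a real type tree would replace the hard-coded rule.

```c
#include <assert.h>
#include <stddef.h>

typedef struct sA { int a; } A;
typedef struct sB { A a; int b; } B;

enum struct_id { STRUCT_A_ID = 1, STRUCT_B_ID = 2 };

struct MyStructPtr {
    enum struct_id type;    /* which member of u is valid */
    union {
        A *ptra;
        B *ptrb;
    } u;
};

/* Hypothetical rule for a two-type tree: every type converts to itself,
   and B (the "sub class") converts to A (the "super class"). */
int is_castable_to(enum struct_id from, enum struct_id to)
{
    return from == to || (from == STRUCT_B_ID && to == STRUCT_A_ID);
}

/* Returns the pointer as a B*, or NULL if the tag says it is not a B. */
B *as_b(const struct MyStructPtr *p)
{
    assert(p != NULL);
    return is_castable_to(p->type, STRUCT_B_ID) ? p->u.ptrb : NULL;
}
```

Returning NULL instead of asserting leaves the failure policy to the caller, which may or may not be what you want.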
My 2 cents.