What does ((Struct*)0) Mean? - c

I ran into a problem while reading a piece of C code. The code is shown below:
#define size_of_attribute(Struct, Attribute) sizeof(((Struct*)0)->Attribute)
This macro gets the size of an attribute (member) of a struct. I know what the macro is for, but I can't understand the meaning of "((Struct*)0)".
I would appreciate it if you could give me some explanation :).

The constant value 0 qualifies as a null pointer constant. The expression (Struct*)0 is therefore casting that null pointer constant to a pointer of type Struct *. The expression then gets the Attribute member.
Attempting to evaluate ((Struct*)0)->Attribute would result in a null pointer dereference; however, this expression is the operand of the sizeof operator. This means the expression is not actually evaluated but simply examined to determine its type.
So sizeof(((Struct*)0)->Attribute) gives you the size of the Attribute member of the struct named Struct without having to have an object of that type.

This is basically accessing a member variable type without actually mentioning / creating any variable of that structure type.
Here,
the 0 is cast to the structure type pointer, and
then that pointer is used to access the member variable
which is used as the operand of sizeof operator.
Since sizeof is evaluated at compile time (its operand here is not a variable-length array), the NULL dereference never actually executes at runtime.

It's casting a null pointer to the Struct* type so it can determine the size of the attribute of that struct. Normally, reading an attribute from NULL is illegal, but for sizeof, it doesn't actually read anything, it just looks at the definition of the struct to determine the statically defined size of the attribute of any such struct.
At least for C++, this is useful because unlike a non-pointer-based:
sizeof(Struct{}.Attribute)
it doesn't require Struct to have a default constructor. A pointer can be made with no knowledge of how to construct the object, while an actual object (even if none is actually constructed) must still be constructed in a valid way, and you can't say with any reliability how an arbitrary struct can be legally constructed.

Related

Parentheses around structure variable with asterisk operator

Consider the if statement:
if (((DbSignal*) ev->newVal.buff)->sig)
Where DbSignal is a structure.
Why is DbSignal within brackets and what is the asterisk operator doing in this case?
It is a cast: ev->newVal.buff is cast to a pointer to DbSignal. Then this pointer is dereferenced (the sig member is accessed).
What is the type cast: What exactly is a type cast in C/C++?
The syntax (DbSignal*) is a typecast. It converts one type to another.
In this case, the operand of the cast is ev->newVal.buff which presumably is a pointer to a character buffer. This pointer is converted to a pointer to DbSignal via the cast. The result is then dereferenced and the sig member is accessed.
We have ev which is a pointer in this case.
it points to a struct containing the variable newVal which contains a buff pointer.
So we have ev->newVal.buff
Here buff is either a char* or void* (a series of bytes, but apparently has some layout). Meaning that the memory it points to could potentially be interpreted in different ways.
By your example, we know that buff has a certain layout, corresponding to the DbSignal struct.
So in order to access ->sig we have to cast this .buff to a pointer to DbSignal, basically saying that we want to interpret that memory region with the layout described by DbSignal.
Hope this gives some context.

what does this line of code "#define LIBINJECTION_SQLI_TOKEN_SIZE sizeof(((stoken_t*)(0))->val)" do?

In particular I'd like to know what ->val does in the
sizeof(((stoken_t*)(0))->val)
and what the (stoken_t*)(0) pointer does, in particular what the (0) means?
I hope I have formulated my question clearly enough.
This is a way of accessing a member of a structure at compile time, without needing to have a variable defined of that structure type.
Applying the cast (stoken_t*) to the value 0 produces a pointer of that structure type, allowing you to use the -> operator on it, just like you would on a pointer variable of that type.
To add, as sizeof is a compile time operator, the expression is not evaluated at run-time, so unlike other cases, here there is no null-pointer dereference happening.
It is analogous to something like
stoken_t * ptr;
sizeof(ptr->val);
In detail:
(stoken_t*)(0) simply casts 0 to a pointer to stoken_t (this could be any integer literal, since the expression is never evaluated). ((stoken_t*)(0))->val then has the type of the val member of stoken_t, and sizeof returns the number of bytes this type occupies in memory. In short, this expression finds the size of a struct member at compile time without the need for an instance of that struct type.

Is it valid to pass the address of a non-array variable to a function parameter declared as `Type ptr[static 1]`?

As mentioned here, here and here a function (in c99 or newer) defined this way
void func(int ptr[static 1]){
//do something with ptr, knowing that ptr != NULL
}
has one parameter (ptr) of type pointer to int, and the compiler can assume that the function will never be called with null as an argument. (E.g. the compiler can optimize null pointer checks away or warn if func is called with a null pointer. And yes, I know that the compiler is not required to do any of that...)
C17 section 6.7.6.3 Function declarators (including prototypes) paragraph 7 says:
A declaration of a parameter as “array of type” shall be adjusted to “qualified pointer to type”, where
the type qualifiers (if any) are those specified within the [ and ] of the array type derivation. If the
keyword static also appears within the [ and ] of the array type derivation, then for each call to
the function, the value of the corresponding actual argument shall provide access to the first element
of an array with at least as many elements as specified by the size expression.
In case of the definition above the value of ptr has to provide access to the first element of an array with at least 1 element. It is therefore clear that the argument can never be null.
What I'm wondering is whether it is valid to call such a function with the address of an int that is not part of an array. E.g. is this (given the definition of func above) technically valid or is it undefined behavior:
int var = 5;
func(&var);
I am aware that this will practically never be an issue, because no compiler I know of differentiates between a pointer to a member of an int array and a pointer to a local int variable. But given that a pointer in C (at least from the perspective of the standard) can be much more than just some integer with a special compile time type, I wondered if there is some section in the standard that makes this valid.
I do suspect that it is actually not valid, as section 6.5.6 Additive operators paragraph 8 contains:
[...] If both the pointer operand and the result point
to elements of the same array object, or one past the last element of the array object, the evaluation
shall not produce an overflow; otherwise, the behavior is undefined. [...]
To me that sounds as if for any pointer that points to an array element adding 1 is a valid operation while it would be UB to add 1 to a pointer that points to a regular variable. That would mean, that there is indeed a difference between a pointer to an array element and a pointer to a normal variable, which would make the snippet above UB...
Section 6.5.6 Additive operators paragraph 7 contains:
For the purposes of these operators, a pointer to an object that is not an element of an array behaves
the same as a pointer to the first element of an array of length one with the type of the object as its
element type.
As the paragraph begins with "for the purposes of these operators" I suspect that there can be a difference in other contexts?
tl;dr;
Is there some section of the standard, that specifies, that there is no difference between a pointer to a regular variable of type T and a pointer to the element of an array of length one (array of type T[1])?
At face value, I think you have a point. We aren't really passing a pointer to the first element of an array. This may be UB if we consider the standard in a vacuum.
Other than the paragraph you quote in 6.5.6, there is no passage in the standard equating a single object to an array of one element. And there shouldn't be, since the two things are different. An array (of even one element) is implicitly converted to a pointer when appearing in most expressions. That's obviously not a property most object types possess.
The definition of the static keyword in [] mentions that the pointer being passed must be to the initial element of an array that contains at least a certain number of elements. There is another problem with the wording you cited. What about
int a[2];
func(a + 1);
Clearly the pointer being passed is not to the first element of an array. That is UB too if we take a literal interpretation of 6.7.6.3p7.
Putting the static keyword aside, when a function accepts a pointer to an object, whether the object is a member of an array (of any size) or not matters in only one context: pointer arithmetic.
In the absence of pointer arithmetic, there is no distinguishable difference in behavior when using a pointer to access an element of an array, or a standalone object.
I would argue that the intent behind 6.7.6.3p7 has pointer arithmetic in mind. And so the semantic being mentioned comes hand in hand with trying to do pointer arithmetic on the pointer being passed into the function.
The use of static 1 simply emerged naturally as a useful idiom, and maybe wasn't the intent from the get-go. While the normative text could do with a slight correction, I think the intent behind it is clear. It isn't meant to be undefined behavior by the standard.
The authors of the Standard almost certainly intended that quality implementations would treat the value of a pointer to a non-array object in the same way as it would treat the value of a pointer to the first element of an array object of length 1. Had it merely said that a pointer to a non-array object was equivalent to a pointer to an array, however, that might have been misinterpreted as applying to all expressions that yield pointer values. This could cause problems given e.g. char a[1],*p=a;, because the expressions a and p both yield pointers of type char* with the same value, but sizeof p and sizeof a would likely yield different values.
The language was in wide use before the Standard was written, and it was hardly uncommon for programs to rely upon such behavior. Implementations that make a bona fide effort to behave in a fashion consistent with the Standard Committee's intentions as documented in the published Rationale document should thus be expected to process such code meaningfully without regard for whether a pedantic reading of the Standard would require it. Implementations that do not make such efforts, however, should not be trusted to process such code meaningfully.

Why this redefinition of sizeof works

I'm redefining sizeof as:
#undef sizeof
#define sizeof(type) ((char*)((type*)(0) + 1) - (char*)((type*)(0)))
For this to work, the 2 '0' in the definition need to be the same entity in memory, or in other words, need to have the same address. Is this always guaranteed, or is it compiler/architecture/run-time dependent?
The 0 here is not an object – it is an address. So the question you ask is something of a non-sequitur.
You are thinking that the zeros are discrete pieces of data that need to be stored somewhere. They aren't; they are being cast to pointers to memory location zero.
When you increment a pointer to a type, it is actually incremented by the size of the type it points to. This is how C array arithmetic works.
In practice, a null pointer of a certain type always refers to the same location in memory (especially when constructed the same way, as you do above), simply because any other implementation would be senseless.
However, the standard actually does not guarantee a lot about this:
"[...] is guaranteed to compare unequal to a pointer to any object or function." 6.3.2.3§3
"[...] Any two null pointers shall compare equal." 6.3.2.3§4
This leaves a lot of leeway. Assume a memory model with two distinct regions. Each region could have a range of null pointers (say the first 128 bytes). It is easy to see that even in that weird case, the basic assumptions about null pointers can indeed hold! Well, given a proper compiler that makes weird null tests...
So, what else do we know about pointers in general...
What you are trying to do is first, increment a pointer
"one operand shall be a pointer to a complete object type and the other shall have integer type. (Incrementing is equivalent to adding 1.)" [6.5.6§2]
and then a pointer difference
"both operands are pointers to qualified or unqualified versions of compatible complete object types" [6.5.6§3]
OK, they are (well, assuming type is a complete object type). But what about semantics?
"For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type." [6.5.6§7]
This is actually a bit of a problem: The null pointer need not point to an actual object! (Otherwise you could dereference it safely...) Therefore, incrementing it or subtracting it from another pointer is UB!
To conclude: 0 does not point to an object, and therefore the answer to your question is No.
A strictly standards-conforming compiler could reject this, or return some nonsense. On "typical" machines integers and pointers have the same size, and casting an integer to a pointer just takes that bit pattern and looks at it as a pointer. There are machines where words contain extra data (type perhaps, permission bits). Some addresses might be forbidden for certain objects (i.e., nothing can have address 0), and so on. While it is guaranteed that sizeof(char) == 1, on e.g. Crays a character is actually 32 bits.
Besides, the C standard guarantees that the expression in sizeof(expression) is not evaluated at all (unless its type is a variable-length array); only its type is used. I.e., sizeof(x++) doesn't increment x.

What does the following macro do?

In the QEMU source code, I found the following macro named offsetof. Can anybody tell me what it does?
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *) 0)->MEMBER)
It's used in this manner :
offsetof(CPUState, icount_decr.u32)
where CPUState is a struct.
I think it gives the offset of the member inside a struct, but I'm not sure.
EDIT: Yeah, I found out what was happening. The definition of CPUState had a macro inside, which I missed, which included the variable icount_decr.
It gets the offset of a member of a struct. It does so by casting address zero to a pointer to a struct of that type and then taking the address of the member.
Your thinking is correct! And the name of the macro gives a good hint, too. ;)
It's defined in §7.17/3:
offsetof(type, member-designator)
which expands to an integer constant expression that has type size_t, the value of
which is the offset in bytes, to the structure member (designated by member-designator),
from the beginning of its structure (designated by type). The type and member designator
shall be such that given
static type t;
then the expression &(t.member-designator) evaluates to an address constant. (If the
specified member is a bit-field, the behavior is undefined.)
Because the library doesn't have to necessarily follow language rules, an implementation is free to get the result however it pleases.
So the result of this particular implementation is not undefined behavior, because you aren't supposed to care how it's implemented. (In other words, your implementation makes the guarantee that taking the address of an indirection through a null pointer is well-defined. You of course can't assume this in your own programs.)
If some library has (re)defined offsetof, they've made your program behavior undefined and should be using the standard library instead. (The dummies.)

Resources