How does memcpy find the first byte of the passed object? - c

Let g be an object designator.
void *p = &g;
char *pf = (char *)p;
1. Every pointer type can be converted to a pointer to void and back; the result shall compare equal to the original pointer.
2. When a pointer to an object type is cast to a pointer to a character type, the resulting pointer points to the first byte of the object.
3. Pointer to void and pointer to char types are interchangeable.
But consider the code example above. A void pointer doesn't need to point to anything; all it needs to do is satisfy condition number 1. So we can't even say that it points to our object. And if we cast that void pointer to a character pointer, we can't say that the resulting pointer points to the lowest-addressed byte of our object.
My question is: if my conclusion is true, how does the memcpy function find the lowest-addressed byte of the passed object, since every pointer passed to memcpy is converted to a pointer to void?

The C standard fails to present rules for pointer conversions expressed in formal mathematics or logic. It expresses rules in natural language (English) in clause 6.3.2.3 (in C 2018). While these natural language rules do not explicitly state that a pointer to an object converted first to void * and then to char * yields the same result as converting directly to char *, this is understood. That is, experienced practitioners with C and compilers understand this is the intent.

Related

When (void *) p == (void *) *p - What does the Standard say about this?

Example:
int a[99];
int (*p)[99] = &a;
// this prints 1
printf("%d\n", (void *) p == (void *) *p);
In general, if p is a pointer to an array, then both the object representations (i.e. the bit patterns) of p and *p are equal.
I'm just lost and completely unsure about the portability of this behaviour.
So, I'm curious whether this behaviour is guaranteed by the Standard. If so, could someone please quote all of the relevant paragraphs that guarantee it?
This comparison is guaranteed to be 1.
The relevant part of the C standard is section 6.5.9p6 regarding the equality operator and the comparison of pointers:
Two pointers compare equal if and only if both are null pointers, both
are pointers to the same object (including a pointer to an object and
a subobject at its beginning) or function, both are pointers to one
past the last element of the same array object, or one is a pointer to
one past the end of one array object and the other is a pointer to the
start of a different array object that happens to immediately
follow the first array object in the address space.
Take particular note of the parenthesized passage about a pointer to an object and a subobject at its beginning. This means two things: 1) a pointer to a struct and a pointer to its first member (suitably converted) will compare equal, and 2) a pointer to an array and a pointer to its first element (again, suitably converted) will compare equal.
In your particular case, p points to an array and *p is the array itself, and using *p in an expression yields a pointer to its first member. Both are converted to void * to give them a common type. So this comparison will always evaluate to 1.
In general, if p is a pointer to an array, then both the object representations (i.e. the bit patterns) of p and *p are equal.
If p is a pointer to an array, then *p is the array. The bit representation of the array is the concatenation of the bit representations of the elements of the array (because C 2018 6.2.5 20 says an array is made of contiguously allocated objects). The bits in the array are not generally equal to the bits in the pointer.
However, when an array is used in an expression other than as the operand of unary & or the operand of sizeof or as a string literal used to initialize an array, the array is automatically converted to a pointer to its first element. The first element of the array *p is (*p)[0], so *p is automatically converted to &(*p)[0].
Then the question is whether (void *) p equals (void *) &(*p)[0].
C 2018 6.3.2.3 1 tells us any pointer to an object type may be converted to void *. However, it does not tell us what the results of comparisons are while the pointer is void *. It does tell us that converting the void * back to its original type yields a pointer that compares equal to the original.
C 2018 6.5.9 6 tells us “Two pointers compare equal if and only if …, both are pointers to the same object (including a pointer to an object and a subobject at its beginning)…” (I elided some other cases that are not of concern here.) What are we to make of this given two void *? It seems the intent is for a pointer to “point to an object” even if it is currently in the form of a void *. Then (void *) p points to the array and (void *) &(*p)[0] points to a subobject at its beginning, so they compare equal.
The semantics would be clearer with (char *) p == (char *) *p because C 2018 6.3.2.3 7 tells us that converting to char * produces a pointer to the first byte of an object, and the first byte of an array is the same as the first byte of its first element.

What does this mean: a pointer to void will never be equal to another pointer?

One of my friends pointed out from "Understanding and Using C Pointers - Richard Reese, O'Reilly publications" the second bullet point and I wasn't able to explain the first sentence from it. What am I missing?
Pointer to void
A pointer to void is a general-purpose pointer used to hold references to any data type. An example of a pointer to void is shown below:
void *pv;
It has two interesting properties:
A pointer to void will have the same representation and memory alignment as a pointer to char.
A pointer to void will never be equal to another pointer. However, two void pointers assigned a NULL value will be equal.
This is my code, not from the book and all pointers are having the same value and are equal.
#include <stdio.h>
int main(void)
{
    int a = 10;
    int *p = &a;
    void *p1 = (void*)&a;
    void *p2 = (void*)&a;
    printf("%p %p\n", p1, p2);
    printf("%p\n", (void*)p); /* %p expects a void * argument */
    if (p == p1)
        printf("Equal\n");
    if (p1 == p2)
        printf("Equal\n");
}
Output:
0x7ffe1fbecfec 0x7ffe1fbecfec
0x7ffe1fbecfec
Equal
Equal
TL;DR: the book is wrong.
What am I missing?
Nothing, as far as I can see. Even the erratum version presented in comments ...
A pointer to void will never be equal to another pointer to void.
... simply is not supported by the C language specification. To the extent that the author is relying on the language specification, the relevant text would be paragraph 6.5.9/6:
Two pointers compare equal if and only if both are null pointers, both
are pointers to the same object (including a pointer to an object and
a subobject at its beginning) or function, both are pointers to one
past the last element of the same array object, or one is a pointer to
one past the end of one array object and the other is a pointer to the
start of a different array object that happens to immediately follow
the first array object in the address space.
void is an object type, albeit an "incomplete" one. Pointers to void that are valid and non-null are pointers to objects, and they compare equal to each other under the conditions expressed by the specification. The usual way that such pointers are obtained is by converting an object pointer of a different (pointer) type to void *. The result of such a conversion still points to the same object that the original pointer did.
My best guess is that the book misinterprets the spec to indicate that pointers to void should not be interpreted as pointers to objects. Although there are special cases that apply only to pointers to void, that does not imply that general provisions applying to object pointers do not also apply to void pointers.
C 2018 6.5.9 6 says:
Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.
So, suppose we have:
int a;
void *p0 = &a;
void *p1 = &a;
Then, if p0 and p1 “point to the same object”, p0 == p1 must evaluate as true. However, one might interpret the standard to mean that a void * does not point to anything while it is a void *; it just holds the information necessary to convert it back to its original type. But we can test this interpretation.
Consider the specification that two pointers compare equal if they point to an object and a subobject at its beginning. That means that given int a[1];, &a == &a[0] should evaluate as true. However, we cannot properly use &a == &a[0], because the constraints for == for pointers require the operands point to compatible types or that one or both is a void * (with qualifiers like const allowed). But a and a[0] neither have compatible types nor are void.
The only way for a fully defined situation to arise in which we are comparing pointers to this object and its subobject is for at least one of the pointers to have been converted either to void * or to a pointer to a character type (because these are given special treatment in conversions). We could interpret the standard to mean only the latter, but I judge the more reasonable interpretation to be that void * is included. The intent is that (void *) &a == (void *) &a[0] is to be interpreted as a comparison of a pointer to the object a to a pointer to the object a[0] even though those pointers are in the form void *. Thus, these two void * should compare as equal.
The following section from this Draft C11 Standard completely refutes the claim made (even with the clarification mentioned in the 'errata', in the comment by GSerg).
6.3.2.3 Pointers
1     A pointer to void may be converted to or from a pointer to any object type. A pointer to any
object type may be converted to a pointer to void and back again;
the result shall compare equal to the original pointer.
Or, this section from the same draft Standard:
7.20.1.4 Integer types capable of holding object pointers
1    The following type designates a signed integer type with
the property that any valid pointer to void can be converted to this
type, then converted back to pointer to void, and the result will
compare equal to the original pointer:
      intptr_t
A pointer is just an address in memory. Any two pointers are equal if they're NULL or if they point to the same address. You can go on and on about how that can happen with the language of structures, unions and so on. But in the end, it's simply just algebra with memory locations.
A pointer to void will never be equal to another pointer. However, two void pointers assigned a NULL value will be equal.
Since NULL is mentioned in that statement, I believe it is a typo. The statement should be something like
A pointer to void will never be equal to a NULL pointer. However, two void pointers assigned a NULL value will be equal.
That means any valid pointer to void is never equal to a NULL pointer.

Is it UB to compare (for equality) a void pointer with a typed pointer in C?

I have a typed pointer, typed, that was initialized using pointer arithmetic to point to an object within an array. I also have a function that takes two pointer args, the 1st typed the same as the afore-mentioned pointer and the 2nd is void * (see myfunc() in the code below).
If I pass typed as the 1st argument and another pointer typed the same as typed as the 2nd argument, and then compare these for equality within the function, is that Undefined Behavior?
#include <stdio.h>
typedef struct S {int i; float f;} s;
void myfunc(s * a, void * b)
{
    if (a == b) // <-------------------------------- is this UB?
        printf("the same\n");
}
int main()
{
    s myarray[] = {{7, 7.0}, {3, 3.0}};
    s * typed = myarray + 1;
    myfunc(typed, &(myarray[0]));
    return 0;
}
Update: Ok, so I come back a day after posting my question above and there are two great answers (thanks both to @SouravGhosh and @dbush). One came in earlier than the other by less than a minute (!) but from the looks of the comments on the 1st one, the answer was initially wrong and only corrected after the 2nd answer was posted. Which one do I accept? Is there a protocol for accepting one answer over the other in this case?
No, this is not undefined behaviour. This is allowed and explicitly defined in the spec for equality operator constraints. Quoting C11, chapter 6.5.9
one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void;
and from paragraph 5 of same chapter
[...] If one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void, the former is converted to the type of the latter.
This comparison is well defined.
When a void * is compared against another pointer type via ==, the other pointer is converted to void *.
Also, Section 6.5.9p6 of the C standard says the following regarding pointer comparisons with ==:
Two pointers compare equal if and only if both are null pointers, both
are pointers to the same object (including a pointer to an object and
a subobject at its beginning) or function, both are pointers to one
past the last element of the same array object, or one is a pointer to
one past the end of one array object and the other is a pointer to the
start of a different array object that happens to immediately
follow the first array object in the address space.
There is no mention here of undefined behavior.

Can you cast a "pointer to a function pointer" to void*

Inspired by comments to my answer here.
Is this sequence of steps legal in C standard (C11)?
1. Make an array of function pointers
2. Take a pointer to the first entry and cast that pointer to function pointer to void*
3. Perform pointer arithmetic on that void*
4. Cast it back to pointer to function pointer and dereference it.
Or equivalently as code:
void foo(void) { ... }
void bar(void) { ... }
typedef void (*voidfunc)(void);
voidfunc array[] = {foo, bar}; // Step 1
void *ptr1 = array; // Step 2
void *ptr2 = (char*)ptr1 + sizeof(voidfunc); // Step 3
voidfunc bar_ptr = *(voidfunc*)ptr2; // Step 4
I thought that this would be allowed, as the actual function pointers are only accessed through properly typed pointer. But Andrew Henle pointed out that this doesn't seem to be covered by Standard section 6.3.2.3: Pointers.
Your code is correct.
A pointer to a function is an object and you're casting a pointer to an object (a pointer to a function pointer) to void pointer and back again; and then finally dereferencing a pointer to an object.
As for the char pointer arithmetic, this is referred to by footnote 106 of C11:
106) Another way to approach pointer arithmetic is first to convert the pointer(s) to character pointer(s): In this scheme the integer expression added to or subtracted from the converted pointer is first multiplied by the size of the object originally pointed to, and the resulting pointer is converted back to the original type. For pointer subtraction, the result of the difference between the character pointers is similarly divided by the size of the object originally pointed to. When viewed in this way, an implementation need only provide one extra byte (which may overlap another object in the program) just after the end of the object in order to satisfy the ''one past the last element'' requirements.
Yes, the code is fine. There are various pitfalls and conversion rules at play here:
C splits all types in two main categories: objects and functions. A pointer to a function is a scalar type which in turn is an object. (C17 6.2.5)
void* is the generic pointer type for pointers to object type. Any pointer to object type may be converted to/from void*, implicitly. (C17 6.3.2.3 §1).
No such generic pointer type exists for pointers to function type. Thus a function pointer cannot be converted to a void* or vice versa. (C17 6.3.2.3 §1)
However, any function pointer type can be converted to another function pointer type and back, allowing us to use something like for example void(*)(void) as a generic function pointer type. As long as you don't call the function through the wrong function pointer type, it is fine. (C17 6.3.2.3 §8)
Function pointers point to functions, but they are objects in themselves, just like any pointer is. And so you can use a void* to point at the address of a function pointer.
Therefore, using a void* to point at a function pointer is fine. But not using it to point directly at a function. In case of void *ptr1 = array; the array decays into a pointer to the first element, a void (**)(void) (equivalent to voidfunc* in your example). You may point at such a pointer to function-pointer with a void*.
Furthermore, regarding pointer arithmetic:
No pointer arithmetic can be performed on a void*. (C17 6.3.2.2) Such arithmetic is a common non-standard extension that should be avoided. Instead, use a pointer to character type.
A pointer to character type may, as a special case, be used to iterate over any object (C17 6.3.2.3 §7). Apart from concerns regarding alignment, doing so is well-defined and does not violate "strict pointer aliasing", should you de-reference the character pointer (C17 6.5 §7).
Therefore, (char*)ptr1 + sizeof(voidfunc); is also fine. You then convert from void* to voidfunc*, to voidfunc which is the original function pointer type stored in the array.
As has been noted in comments, you can improve readability of this code significantly by using a typedef to a function type:
typedef void (voidfunc)(void);
voidfunc* array[] = {&foo, &bar}; // Step 1
void* ptr1 = array; // Step 2
void* ptr2 = (char*)ptr1 + sizeof(voidfunc*); // Step 3
voidfunc* bar_ptr = *(voidfunc**)ptr2; // Step 4
Pointer arithmetic on void* is not in the C language. You're not doing it though, you are doing pointer arithmetic on char* which is perfectly OK. You could have used char* instead of void* to begin with.
Andrew Henle seems to be missing the fact that a pointer to a function is an object, and its type is an object type. It is a plain simple fact, not something veiled in a shroud of mystery as some other commentators seem to imply. So his objection to casting a pointer to a function pointer is unfounded, as pointers to any object type can be cast to void*.
However, the C standard doesn't seem to allow using (T*)((char*)p + sizeof(T)) in lieu of (p+1) (where p is a pointer to an element of an array of type T), or at least I cannot find such permission in the text. Your code might not be legal because of that.

Can you ever assume typecasting pointers is safe?

I've heard from many people that you cannot guarantee typecasting will be performed losslessly. Is that only true if you don't know your processor, that is, you haven't verified the number of bytes used for your data types? Let me give an example:
If you execute the following:
typedef struct
{
    int i;
    char c;
    float f;
    double d;
} structure;
size_t voidPtrSz = sizeof(void *);
size_t charPtrSz = sizeof(char *);
size_t intPtrSz = sizeof(int *);
size_t floatPtrSz = sizeof(float *);
size_t doublePtrSz = sizeof(double *);
size_t structPtrSz = sizeof(structure *);
size_t funcPtrSz = sizeof(int (*)(float, char));
printf("%zu\n", voidPtrSz);
printf("%zu\n", charPtrSz);
printf("%zu\n", intPtrSz);
printf("%zu\n", floatPtrSz);
printf("%zu\n", doublePtrSz);
printf("%zu\n", structPtrSz);
printf("%zu\n", funcPtrSz);
…and the output is the following…
4
4
4
4
4
4
4
Can you assume that in all cases you can typecast a specific data type pointer to another data type pointer safely? For example, if you execute this:
int foo(float, char)
{
}
void *bar(void)
{
    return (void *)foo;
}
int (*pFunc)(float, char) = bar();
Can you assume with certitude that pFunc has the address of foo?
Regarding your specific code example, let's refer to section 6.3.2.3 of the C99 language standard:
A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
Note that a pointer-to-function is not the same as pointer-to-object. The only mention of pointer-to-function conversions is:
A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the pointed-to type, the behavior is undefined.
So your code example invokes undefined behaviour.
If we avoid function-pointer conversions, the following paragraph explains everything:
A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.
Note: Converting between pointer types is a separate issue from converting and then dereferencing (in general, that's only valid if you're converting to char * and then dereferencing.)
Can you assume that in all cases you can typecast a specific data type pointer to another data type pointer safely?
Any data pointer can be safely cast to char* or void*. Any char* or void* thus created can be cast back to its original type. Any other data pointer cast leads to undefined behavior when indirection is performed on the pointer.
Any function pointer type can be cast to any other function pointer type, although you should not call a function through the wrong type. Casting a function pointer to void* or any other data pointer type results in undefined behavior.
Is that only true if you don't know your processor, that is, you haven't verified the number of bytes used for your data types?
Even then, you're not safe. When the C standard says a construct has undefined behavior, compiler writers are free to handle the construct as they wish. The result is that even though you think you know a construct with UB will be handled because you know the target CPU, optimizing compilers may cut corners and generate very different code than you expect.
@Oli Charlesworth gives you a great answer.
I hope I can shed a little light on what pointers are so you can better understand pointer mechanics:
A pointer is an address. This address is the address of the first byte of your data. The type of the pointer specifies how many bytes starting from that first byte are part of the data and how those bytes encode the data.
For instance, on gcc x86, if you have an int *p, the value held by p gives the starting address of the data, and the type of p (int *) says that the 4 bytes at that address (in little-endian byte order) are interpreted as a signed two's complement number.
A void * pointer is a "generic pointer". The pointer still holds an address, but the pointer type doesn't specify what kind of data you find there, or even how many bytes form the data, so you can never access data through a void * pointer, but as answered before, you can safely convert between a pointer to void and a pointer to any incomplete or object type.
A pointer to function holds the address of a function, and the type of the pointer tells how to call that function (what parameters and of what kind) and what the function returns.

Resources