Can I do arithmetic on void * pointers in C?

Is this valid?
void *p = &X; /* some thing */
p += 12;
and if so what does p now point to?
I have (third party) code that does this (and compiles cleanly) and my guess is that the void * was treated as a char *. My trusty K&R is silent(ish) on the topic
EDIT: My little test app runs fine on gcc 4.1.1 and treats void * as char *. But g++ barfs
I know how to do it properly. I need to know whether I have to clean up this code base to find all the places it's done.
BTW gcc -pedantic throws up a warning
Summary:
The C spec is ambiguous. It says that, in terms of representation and use as function parameters, void * is equivalent to char *, but it is silent regarding pointer arithmetic.
gcc (4) permits it and treats it as char *
g++ refuses it
gcc -pedantic warns about it
vs2010 (both C and C++) refuses it

No, this is not legal. A void* cannot be arbitrarily incremented. It needs to be cast to a specific type first.
If you want to increment it by a specific number of bytes then this is the solution I use.
p = ((char*)p) + 12;
The char type is convenient because it has a defined size of 1 byte.
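The cast above can be wrapped in a small helper so that the byte arithmetic lives in one place. This is only a minimal sketch: the name advance_bytes is made up for illustration and is not part of the code base in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (name chosen for illustration): advance an untyped
 * pointer by a byte count using only standard C. */
void *advance_bytes(void *p, size_t nbytes)
{
    assert(p != NULL);
    /* sizeof(char) is 1 by definition, so char * arithmetic counts bytes. */
    return (char *)p + nbytes;
}
```

With such a helper, p += 12 would become p = advance_bytes(p, 12), which should compile as both C and C++.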
EDIT
It's interesting that it runs on gcc with a warning. I tested on Visual Studio 2010 and verified it does not compile. My limited understanding of the standard says that gcc is in error here. Can you add the following compilation flags?
-Wall -ansi -pedantic

To quote from the spec:
§6.5.6/2: For addition, either both operands shall have arithmetic type, or one operand shall be a pointer to an object type and the other shall have integer type. (Incrementing is equivalent to adding 1.)
A pointer to void is not a pointer to an object type, as per these excerpts:
§6.2.5/1: [...] Types
are partitioned into object types (types that fully describe objects), function types (types
that describe functions), and incomplete types (types that describe objects but lack
information needed to determine their sizes).
§6.2.5/19: The void type comprises an empty set of values; it is an incomplete type that cannot be
completed.
Therefore, pointer arithmetic is not defined for pointer to void types.

It depends on the compiler. Those that allow it treat sizeof(void) as 1.
EDIT: This applies only to void pointer arithmetic. It would make no sense to use steps of sizeof(int) or of 0 in this case. Someone who uses it would commonly expect the smallest possible step.
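When a step other than one byte is wanted, one portable option is to pass the step explicitly instead of relying on what a compiler assumes for void. The sketch below is an assumption-laden illustration (element_at is a made-up name), in the spirit of how qsort takes an element size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: address of element `index` in an untyped array whose
 * elements are `elem_size` bytes each. The caller supplies the stride. */
void *element_at(void *base, size_t index, size_t elem_size)
{
    assert(base != NULL && elem_size > 0);
    /* Byte arithmetic through unsigned char * is well defined in ISO C. */
    return (unsigned char *)base + index * elem_size;
}
```

For an int array v, element_at(v, 2, sizeof v[0]) yields the address of v[2].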

Your guess is correct.
In the standard ISO C99, section 6.2.5 paragraph 26, it declares that void pointers and character pointers will have the same representation and alignment requirements (paraphrasing).
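The representation guarantee can be observed, though not proven, with a small check. This is only a sketch with a made-up function name, and it says nothing about whether arithmetic on void * is allowed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: returns 1 when a void * and the char * converted from
 * it have the same size and the same stored bytes. */
int same_bytes_as_char_ptr(void *p)
{
    assert(p != NULL);
    char *c = p;   /* implicit conversion from void * is allowed in C */
    return sizeof p == sizeof c && memcmp(&p, &c, sizeof p) == 0;
}
```

On mainstream implementations this is expected to return 1 for any valid object pointer.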

You may want to have a look at the top voted answer for this question
Pointer arithmetic for void pointer in C

I don't think you can, because the compiler doesn't know the pointed-to type and therefore cannot step by the correct number of bytes.
Cast it to a pointer type first, e.g. (int *).

Related

Casting an address of subroutine into void pointer

Is it okay to set a function's location through a void pointer cast, even though a function pointer's size is not always the same as an opaque pointer's size?
I have already searched for information about opaque pointers and casting function pointers. I found out that function pointers and normal pointers are not the same on some systems.
void (*fptr)(void) = (void *) 0x00000009; // is that legal???
I feel I should do this
void (*fptr)(void)= void(*)(void) 0x00000009;
It worked fine, though I expected some errors or at least warnings.
I'm using the Keil ARM compiler.
No, the problem is that you cannot go between void* and function pointers, they are not compatible types. void* is the generic pointer type for object pointers only.
In the second case you have a minor syntax error, should be (void(*)(void)). Fixing that, we have:
void (*fptr)(void) = (void *) 0x00000009; // NOT OK
void (*fptr)(void) = (void(*)(void)) 0x00000009; // PERHAPS OK (but likely not on ARM)
Regarding the former, it is simply not valid C (but might be supported as a non-standard compiler extension). Specifically, it violates the rules of simple assignment, C17 6.5.16.1. Both operands of = must be compatible pointer types.
Regarding the latter, the relevant part is C17 6.3.2.3/5
An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.
Some special cases exist for null pointers, but that does not apply in your case.
So you can probably cast the value 9 to a function pointer; you have to check the Keil compiler manual regarding implementation-defined behavior. What meaningful purpose the resulting function pointer will have from there on, I have no idea, because the address 9 is certainly not an aligned address on ARM.
Rather, it is 1 byte past where I'd expect to find the address of the non-maskable interrupt ISR or some such. So what are you actually trying to do here? Grab the address of an ISR from the vector table?
void (*fptr)(void) = (void *) 0x00000009;
is not legal according to standard C as such.
If the expression on the right is an integer constant expression with value 0, or such an expression cast to (void *), it is a null pointer constant, and that can be assigned to a function pointer:
void (*fptr)(void) = 0;
This only applies to the null pointer constant. It does not even apply to a variable of type void * that contains a null pointer. The constraints (C11 6.5.16.1) for simple assignments include
the left operand is an atomic, qualified, or unqualified pointer, and the right is a null pointer constant; or
In the strictest sense, the standard does not provide a mechanism to convert a pointer-to-void to a pointer to function at all! Not even with a cast. However, it is available on most common platforms as a documented common extension, C11 J.5.7 Function pointer casts:
A pointer to an object or to void may be cast to a pointer to a function, allowing data to be invoked as a function (6.5.4).
A pointer to a function may be cast to a pointer to an object or to void, allowing a function to be inspected or modified (for example, by a debugger) (6.5.4).
but it is not required at all by the C standard - indeed it is possible to use C in a platform where the code can be executed only from memory that cannot be accessed as data at all.
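What the standard does guarantee is that a function pointer can be converted to another function pointer type and back without loss. A minimal sketch of type-erased storage built on that guarantee, avoiding void * altogether, might look as follows; every name in it is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Any function pointer type can serve as the "generic" one;
 * void (*)(void) is a common choice. */
typedef void (*generic_fn)(void);

/* Example function with a different signature (illustrative only). */
int add_one(int x) { return x + 1; }

/* Erase the real signature so the pointer can be stored generically. */
generic_fn erase_int_fn(int (*f)(int))
{
    return (generic_fn)f;
}

/* Restore the real signature before calling. Calling through the wrong
 * function type would be undefined behavior. */
int call_int_fn(generic_fn g, int arg)
{
    assert(g != NULL);
    int (*f)(int) = (int (*)(int))g;
    return f(arg);
}
```

The caller is responsible for remembering the original signature, for example with a tag stored next to the pointer.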
The first expression is accepted by many C compilers, because void * is assignment- and parameter-passing-compatible with other pointer types (the standard guarantees this only for object pointers, so for function pointers it relies on a compiler extension). You can still run into trouble if pointers differ in size from integers. It's not good programming style, and there's apparently no reason to assign the integer literal 9 to a function pointer. I cannot guess why you are doing so.
That said, there are a few (well, very few) cases in history where this has been done, e.g. to give the special values SIG_DFL and SIG_IGN to the signal(2) system call. One can assume nobody will ever use those values to call the function the pointer refers to. Indeed, you can use small nonzero integers that fall in page zero of a process's virtual address space, so that the pointers cannot be dereferenced (you cannot call the functions, or you'll get a segmentation violation immediately), while still having several distinct special values besides the NULL pointer itself.
But the second expression is not legal. It's not valid for an expression to start with a type identifier, so the subexpression to the right of the = sign is invalid. To do a correct assignment, with a valid cast, you have to write:
void (*ptr)(void) = (void (*)(void)) 0x9; /* wth to write so many zeros? */
(enclosing the whole type name in parentheses). Then you can call the function as:
(*ptr)();
or simply as:
ptr();
Just writing
void(*ptr)(void) = 9;
is also accepted by many compilers, although almost every compiler flags the implicit integer-to-pointer conversion with a warning. You'll get an executable from there.
If the integer is 0, then the compiler stays quiet, as 0 is converted automatically to the null pointer.
EDIT
To illustrate the simple use I mentioned above in the first paragraph, from the file <sys/signal.h> of FreeBSD 12.0:
File /usr/include/sys/signal.h
#define SIG_DFL ((__sighandler_t *)0)
#define SIG_IGN ((__sighandler_t *)1)
#define SIG_ERR ((__sighandler_t *)-1)
/* #define SIG_CATCH ((__sighandler_t *)2) See signalvar.h */
#define SIG_HOLD ((__sighandler_t *)3)
All those definitions are precisely of the type mentioned in the question: an integer value cast to a pointer to function, in order to permit special values that represent non-executable, non-callback values. The type __sighandler_t is defined further down in the same file as:
typedef void __sighandler_t(int);
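Sentinel values of this kind are meant to be compared, never called. A minimal sketch of that usage with the standard macros from <signal.h> follows; the enum, the function names and the handler on_signal are made up for illustration.

```c
#include <assert.h>
#include <signal.h>

enum handler_kind { HANDLER_DEFAULT, HANDLER_IGNORED, HANDLER_ERROR, HANDLER_USER };

/* Classify a handler value by comparison only; the sentinels are never
 * dereferenced or called. */
enum handler_kind classify_handler(void (*handler)(int))
{
    if (handler == SIG_DFL) return HANDLER_DEFAULT;
    if (handler == SIG_IGN) return HANDLER_IGNORED;
    if (handler == SIG_ERR) return HANDLER_ERROR;
    return HANDLER_USER;
}

/* A real handler, for contrast (hypothetical name). */
void on_signal(int signo) { (void)signo; }

/* Small self-check that can be called from main(). */
void check_sentinels(void)
{
    assert(classify_handler(SIG_IGN) == HANDLER_IGNORED);
    assert(classify_handler(on_signal) == HANDLER_USER);
}
```

Only a value classified as HANDLER_USER should ever be invoked.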
From CLANG:
$ cc -std=c17 -c pru.c
$ cat pru.c
void (*ptr)(void) = (void *)0x9;
you get no warning at all.
Without the cast:
$ cc -std=c11 pru.c
pru.c:1:8: warning: incompatible integer to pointer conversion
initializing 'void (*)(void)' with an expression of type 'int'
[-Wint-conversion]
void (*ptr)(void) = 0x9;
^ ~~~
1 warning generated.
(Only a warning, not an error)
With a zero literal:
$ cc -std=c11 -c pru.c
$ cat pru.c
void (*ptr)(void) = 0x0;
there is no warning at all either.

Does assigning a void * value to an intptr_t variable require an explicit cast?

I can't seem to make sense of a GCC compiler warning I get when I try to assign a void * value to an intptr_t variable. Specifically, when I compile with -std=c99 -pedantic, I get the following warning regarding the initialization of variable z on line 7:
warning: initialization makes integer from pointer without a cast [-Wint-conversion]
Here is the source code:
#include <stdint.h>
int main(void){
unsigned int x = 42;
void *y = &x;
intptr_t z = y; /* warning: initialization makes integer from pointer without a cast [-Wint-conversion] */
return 0;
}
Naturally, if I explicitly cast y to intptr_t then the warning disappears. However, I am confused about why the warning is present for implicit conversions when the whole purpose of intptr_t is the conversion and manipulation of void * values.
From section 7.18.1.4 of the C99 standard:
The following type designates a signed integer type with the property that any valid
pointer to void can be converted to this type, then converted back to pointer to void,
and the result will compare equal to the original pointer:
intptr_t
Am I misinterpreting the standard, or is GCC simply overly pedantic in its "integer from pointer" check in this case?
Summing up! Apologies in advance for any errors — please leave me a comment.
In C99:
Any pointer can be converted to an integer type. [1]
You might want to do that, e.g., if you are implementing your own operating system!
Conversions between pointers and integers can go horribly wrong, [1] so are usually not what you want.
Therefore, the compiler warns you when you convert pointers to integers without casting. This is not overly pedantic, but to save you from undefined behaviour.
intptr_t (and uintptr_t, and likewise throughout) is just an integer type, [2] so it is subject to the same risks as any other pointer-to-integer conversion. Therefore, you get the same warning.
However, with intptr_t, you at least know that the conversion from a pointer won't truncate any bits. So those are the types to use, with explicit casts, if you really need the integer values of pointers.
The spec [1], #6, says that
... the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined.
With intptr_t, the result can be represented in the integer type. Therefore, the behaviour is not undefined, but merely implementation-defined. That is (as far as I know) why those types are safe to use for receiving values from pointers.
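The round trip described above can be sketched with explicit casts in both directions. The function names are made up for illustration, and intptr_t is an optional type, so the sketch assumes an implementation that provides it.

```c
#include <assert.h>
#include <stdint.h>

/* Explicit casts in both directions keep gcc -pedantic quiet. */
intptr_t pointer_to_integer(void *p)
{
    return (intptr_t)p;
}

void *integer_to_pointer(intptr_t value)
{
    return (void *)value;
}

/* The standard promises that the round trip compares equal to the
 * original pointer for any valid pointer to void. */
void check_round_trip(void *p)
{
    void *back = integer_to_pointer(pointer_to_integer(p));
    assert(back == p);
    (void)back;   /* silences the unused warning when NDEBUG is defined */
}
```

The integer value itself is implementation-defined, so only the round trip, not the number, should be relied on.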
Edit
Reference 1, below, is part of section 6.3, "Conversions." The spec says: [3]
Several operators convert operand values from one type to another automatically. This
subclause specifies the result required from such an implicit conversion...
and refers to section 6.5.4 for a discussion of explicit casts. Therefore, the discussion in Reference 1 indeed covers implicit conversions from any pointer type to intptr_t. By my reading, then, an implicit conversion from void * to intptr_t is legal, and has an implementation-defined result. [1], [4]
Regarding whether the explicit cast should be used, gcc -pedantic thinks it should, and there must be a good reason! :) I personally agree that the explicit cast is more clear. I am also of the school of thought that code should compile without warnings if at all possible, so I would add the explicit cast if it were my code.
References
[1] C99 draft (since I don't have a copy of the final spec), sec. 6.3.2.3 #5 and #6.
[2] Id., sec. 7.18.1.4
[3] Id., sec. 6.3
[4] Id., sec. 3.4.1, defines "implementation-defined behavior" as "unspecified behavior where each implementation documents how the choice is made." The implication is that the conversion is legal, but that the result may be different on one platform than on another.

Is this behavior of clang standard compliant?

This is going to be a long, language lawyerish question, so I'd like to quickly state why I find it relevant. I am working on a project where strict standard compliance is crucial (writing a language that compiles to C). The example I am going to give seems like a standard violation on the part of clang, and so, if this is the case, I'd like to confirm it.
gcc says that a pointer to a restrict-qualified pointer cannot share a conditional expression with a void pointer. clang, on the other hand, compiles such code without complaint. Here is an example program:
#include <stdlib.h>
int main(void){
int* restrict* A = malloc(8);
A ? A : malloc(8);
return 0;
}
For gcc, the options -std=c11 and -pedantic may be included or not in any combination, likewise for clang and the options -std=c11 and -Weverything. In any case, clang compiles with no errors, and gcc gives the following:
tem-2.c: In function ‘main’:
tem-2.c:7:2: error: invalid use of ‘restrict’
A ? A : malloc(8);
^
The c11 standard says the following with regard to conditional statements, emphasis added:
6.5.15 Conditional operator
...
One of the following shall hold for the second and third operands:
— both operands have arithmetic type;
— both operands have the same structure or union type;
— both operands have void type;
— both operands are pointers to qualified or unqualified versions of compatible types;
— one operand is a pointer and the other is a null pointer constant; or
— one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void.
...
If both the second and third operands are pointers or one is a null pointer constant and the
other is a pointer, the result type is a pointer to a type qualified with all the type qualifiers
of the types referenced by both operands. Furthermore, if both operands are pointers to
compatible types or to differently qualified versions of compatible types, the result type is
a pointer to an appropriately qualified version of the composite type; if one operand is a
null pointer constant, the result has the type of the other operand; otherwise, one operand
is a pointer to void or a qualified version of void, in which case the result type is a
pointer to an appropriately qualified version of void.
...
The way I see it, the first bold portion above says that the two types can go together, and the second bold portion defines the result to be a pointer to a restrict qualified version of void. However, as the following states, this type can not exist, and so the expression is correctly identified as erroneous by gcc:
6.7.3 Type qualifiers, paragraph 2
Types other than pointer types whose referenced type is an object type shall not be restrict-qualified.
Now, the problem is that a "shall not" condition is violated by this example program, and so is required to produce an error, by the following:
5.1.1.3 Diagnostics, paragraph 1
A conforming implementation shall produce at least one diagnostic message (identified in
an implementation-defined manner) if a preprocessing translation unit or translation unit
contains a violation of any syntax rule or constraint, even if the behavior is also explicitly
specified as undefined or implementation-defined. Diagnostic messages need not be
produced in other circumstances.
It seems clang is not standard compliant by treating an erroneous type silently. That makes me wonder what else clang does silently.
I am using gcc version 5.4.0 and clang version 3.8.0, on an x86-64 Ubuntu machine.
Yes, it looks like a bug.
Your question more briefly: can void be restrict qualified? Since void is clearly not a pointer type, the answer is no. Because this violates a constraint, the compiler should give a diagnostic.
I was able to trick clang into confessing its sins by using a _Generic expression
puts(_Generic(A ? A : malloc(8), void* : "void*"));
and clang tells me
static.c:24:18: error: controlling expression type 'restrict void *' not compatible with any generic association type
puts(_Generic(A ? A : malloc(8), void* : "void*"));
which shows that clang here really tries to match a nonsense type restrict void*.
Please file them a bug report.
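The same _Generic technique can be used to inspect the result type of well-formed conditional expressions. The sketch below needs a C11 compiler; the enum and macro names are made up for illustration.

```c
#include <assert.h>

enum ptr_kind { KIND_OTHER, KIND_INT_PTR, KIND_VOID_PTR };

/* Map the static type of an expression to a tag. The controlling
 * expression of _Generic is not evaluated. */
#define PTR_KIND(expr) _Generic((expr), \
    int *:   KIND_INT_PTR,              \
    void *:  KIND_VOID_PTR,             \
    default: KIND_OTHER)

void check_conditional_types(int *ip, void *vp)
{
    /* Pointer to object type vs. pointer to void: the result is void *. */
    assert(PTR_KIND(1 ? ip : vp) == KIND_VOID_PTR);
    /* Pointer vs. null pointer constant: the result keeps the pointer's type. */
    assert(PTR_KIND(1 ? ip : (void *)0) == KIND_INT_PTR);
    (void)ip;
    (void)vp;
}
```

Both cases follow the rules of 6.5.15 quoted in the question. Neither involves restrict, so they say nothing about the disputed case itself.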
While a compiler could satisfy all obligations surrounding restrict by ignoring the qualifier altogether, a compiler which wants to keep track of what it is or is not allowed to do needs to keep track of which pointers hold copies of restrict pointers. Given something like:
int *foo;
int *bar;
int wow(int *restrict p)
{
foo = p;
...
*p = 123;
*foo = 456;
*p++;
*bar = 890;
return *p;
}
since foo is derived from p, a compiler must allow for accesses made via
foo to alias accesses via p. A compiler need not make such allowances
for accesses made via bar, since that is known not to hold an address derived from p.
The rules surrounding restrict get murky in cases where a pointer may or
may not be derived from another. A compiler would certainly be allowed to
simply ignore a restrict qualifier in cases where it can't track all of
the pointers derived from a pointer; I'm not sure if any such cases would
invoke UB even if nothing ever modifies the storage identified by the
pointer. If a syntactic construct is structurally guaranteed to invoke
UB, having a compiler squawk may be more useful than having it act in an
arbitrary fashion (though having a compiler simply ignore any restrict
qualifiers it can't fully handle might be more useful yet).
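For contrast, here is a minimal sketch of the easy case, where the only extra pointer is visibly derived from a restrict-qualified parameter. The function name is made up for illustration, and the caller is assumed to pass non-overlapping arrays.

```c
#include <assert.h>
#include <stddef.h>

/* restrict promises that, for the duration of the call, the objects
 * accessed through dst are not also accessed through src. */
void copy_through_derived(size_t n, int *restrict dst, const int *restrict src)
{
    assert(dst != NULL && src != NULL);
    int *cursor = dst;   /* derived from ("based on") dst, so it may alias dst */
    for (size_t i = 0; i < n; i++) {
        *cursor++ = src[i];
    }
}
```

Passing overlapping arrays would break the promise and make the behavior undefined.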

Sizeof a function that returns void in C

What would this statement yield?
void *p = malloc(sizeof(void));
Edit: An extension to the question.
If sizeof(void) yields 1 with the GCC compiler, then 1 byte of memory is allocated and the pointer p points to that byte. Suppose p was 0x2345: would p++ then increment it to 0x2346? I am talking about p and not *p.
The type void has no size; that would be a compilation error. For the same reason you can't do something like:
void n;
EDIT.
To my surprise, doing sizeof(void) actually does compile in GNU C:
$ echo 'int main() { printf("%d", sizeof(void)); }' | gcc -xc -w - && ./a.out
1
However, in C++ it does not:
$ echo 'int main() { printf("%d", sizeof(void)); }' | gcc -xc++ -w - && ./a.out
<stdin>: In function 'int main()':
<stdin>:1: error: invalid application of 'sizeof' to a void type
<stdin>:1: error: 'printf' was not declared in this scope
If you are using GCC and you are not using compilation flags that remove compiler specific extensions, then sizeof(void) is 1. GCC has a nonstandard extension that does that.
In general, void is an incomplete type, and you cannot use sizeof for incomplete types.
Although void may stand in place for a type, it cannot actually hold a value. Therefore, it has no size in memory. Getting the size of a void isn’t defined.
A void pointer is simply a language construct meaning a pointer to untyped memory.
void has no size. In both C and C++, the expression sizeof (void) is invalid.
In C, quoting N1570 6.5.3.4 paragraph 1:
The sizeof operator shall not be applied to an expression that
has function type or an incomplete type, to the parenthesized name of
such a type, or to an expression that designates a bit-field member.
(N1570 is a draft of the 2011 ISO C standard.)
void is an incomplete type. This paragraph is a constraint, meaning that any conforming C compiler must diagnose any violation of it. (The diagnostic message may be a non-fatal warning.)
The C++ 11 standard has very similar wording. Both editions were published after this question was asked, but the rules go back to the 1989 ANSI C standard and the earliest C++ standards. In fact, the rule that void is an incomplete type to which sizeof may not be applied goes back exactly as far as the introduction of void into the language.
gcc has an extension that treats sizeof (void) as 1. gcc is not a conforming C compiler by default, so in its default mode it doesn't warn about sizeof (void). Extensions like this are permitted even for fully conforming C compilers, but the diagnostic is still required.
Taking the size of void is a GCC extension.
sizeof() cannot be applied to incomplete types. And void is an incomplete type that cannot be completed.
In C, sizeof(void) == 1 in GCC, but this appears to depend on your compiler.
In C++, I get:
In function 'int main()':
Line 2: error: invalid application of 'sizeof' to a void type
compilation terminated due to -Wfatal-errors.
To the 2nd part of the question: note that sizeof(void *) != sizeof(void).
On a 32-bit architecture, sizeof(void *) is 4 bytes, but that is not the step p++ uses. The amount by which a pointer is incremented depends on the size of the type it points to, and GCC takes sizeof(void) to be 1. So p will be increased by 1 byte.
While sizeof(void) perhaps makes no sense in itself, it is important when you're doing any pointer math.
E.g.:
void *p;
while(...)
p++;
If sizeof(void) is considered 1 then this will work.
If sizeof(void) is considered 0 then you hit an infinite loop.
Most C++ compilers chose to raise a compile error when trying to get sizeof(void).
When compiling C, gcc is not conforming and chose to define sizeof(void) as 1. It may look strange, but it has a rationale. In pointer arithmetic, adding or subtracting one unit means adding or subtracting the size of the pointed-to object. Thus defining sizeof(void) as 1 helps define void* as a pointer to a byte (an untyped memory address). Otherwise you would get surprising behavior from pointer arithmetic, such as p+1 == p when p is void*. Such pointer arithmetic on void pointers is not allowed in C++ but works fine when compiling C with gcc.
The standard recommended way would be to use char* for that kind of purpose (pointer to byte).
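A byte-wise loop written that way might look like the sketch below, which steps through a buffer with an unsigned char * and so needs no assumption about sizeof(void). The function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Count the bytes in buf[0..len) that are not zero. */
size_t count_nonzero_bytes(const void *buf, size_t len)
{
    assert(buf != NULL);
    const unsigned char *p = buf;        /* byte view of the untyped memory */
    const unsigned char *end = p + len;  /* one past the last byte */
    size_t count = 0;
    while (p != end) {
        if (*p != 0) {
            count++;
        }
        p++;   /* advances exactly one byte on every implementation */
    }
    return count;
}
```

The loop terminates because each p++ moves one byte closer to end, with no reliance on a compiler extension.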
Another similar difference between C and C++ when using sizeof occurs when you define an empty struct like:
struct Empty {
} empty;
Using gcc as my C compiler sizeof(empty) returns 0.
Using g++ the same code will return 1.
I'm not sure what the C and C++ standards state on this point, but I believe giving empty structs/objects a nonzero size helps with reference management: it prevents two references to different consecutive objects, the first of which is empty, from getting the same address. If references are implemented using hidden pointers, as is often done, ensuring different addresses helps when comparing them.
But this merely avoids one surprising behavior (corner-case comparison of references) by introducing another (empty objects, even PODs, consume at least 1 byte of memory).

Equivalence of p[0] and *p for incomplete array types

Consider the following code (it came about as a result of this discussion):
#include <stdio.h>
void foo(int (*p)[]) { // Argument has incomplete array type
printf("%d\n", (*p)[1]);
printf("%d\n", p[0][1]); // Line 5
}
int main(void) {
int a[] = { 5, 6, 7 };
foo(&a); // Line 10
}
GCC 4.3.4 complains with the error message:
prog.c: In function ‘foo’:
prog.c:5: error: invalid use of array with unspecified bounds
GCC 4.1.2 gives the same error message, and it seems to be independent of -std=c99, -Wall, -Wextra.
So it's unhappy with the expression p[0], but it's happy with *p, even though these should (in theory) be equivalent. If I comment out line 5, the code compiles and does what I would "expect" (displays 6).
Presumably one of the following is true:
1. My understanding of the C standard(s) is incorrect, and these expressions aren't equivalent.
2. GCC has a bug.
I'd place my money on (1).
Question: Can anyone elaborate on this behaviour?
Clarification: I'm aware that this can be "solved" by specifying an array size in the function definition. That's not what I'm interested in.
For "bonus" points: Can anyone confirm that MSVC 2010 is in error when it rejects line 10 with the following message?
1><snip>\prog.c(10): warning C4048: different array subscripts : 'int (*)[]' and 'int (*)[3]'
Section 6.5.2.1 of n1570, Array subscripting:
Constraints
One of the expressions shall have type ‘‘pointer to complete object type’’, the other
expression shall have integer type, and the result has type ‘‘type’’.
So the standard forbids the expression p[0] if p is a pointer to an incomplete type. There is no such restriction for the indirection operator *.
In older versions/drafts of the standard, however (n1256 and C99), the word "complete" is absent in that paragraph. Not being involved in any way in the standardization process, I can only guess whether it's a breaking change or the correction of an omission. The behaviour of the compiler suggests the latter. That is reinforced by the fact that p[i] is, per the standard, identical to *(p + i), and the latter expression doesn't make sense for a pointer to an incomplete type. So for p[0] to work when p is a pointer to an incomplete type, an explicit special case would be needed.
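A function that sticks to the (*p)[i] form needs no stride for p itself, so it stays within the constraint quoted above. The sketch below uses a made-up function name and passes the length separately, since the parameter type does not carry it.

```c
#include <assert.h>
#include <stddef.h>

/* p points to an array of unknown length, so the caller supplies n. */
int sum_ints(int (*p)[], size_t n)
{
    assert(p != NULL);
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        /* *p dereferences without pointer arithmetic on p; the subscript
         * then applies to an int *, which points to a complete type. */
        total += (*p)[i];
    }
    return total;
}
```

For int a[] = { 5, 6, 7 }, the call sum_ints(&a, 3) is accepted in C because int [3] and int [] are compatible array types.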
My C is a bit rusty, but my reading is that when you have an int (*p)[] this:
(*p)[n]
Says "dereference p to get an array of ints, then take the nth one". Which seems naturally to be well defined. Whereas this:
p[n][m]
Says "take the nth array in p, then take the mth element of that array". Which doesn't seem well-defined at all; you have to know how big the arrays are to find where the nth one starts.
This could work for the specific special case where n = 0, because the 0th array is easy to find regardless of how big the arrays are. You've simply found that GCC isn't recognising this special case. I don't know the language spec in detail, so I don't know whether that's a "bug" or not, but my personal tastes in language design are that p[n][m] should either work or not, not that it should work when n is statically known to be 0 and not otherwise.
Is *p <===> p[0] really a definitive rule from the language specification, or just an observation? I don't think of dereferencing and indexing-by-zero as the same operation when I'm programming.
For your "bonus points" question (you probably should have asked this as a separate question), MSVC10 is in error. Note that MSVC only implements C89, so I have used that standard.
For the function call, C89 §3.3.2.2 tells us:
Each argument shall have a type such that its value may be assigned to
an object with the unqualified version of the type of its
corresponding parameter.
The constraints for assignment are in C89 §3.3.16:
One of the following shall hold: ... both operands are pointers to
qualified or unqualified versions of compatible types, and the type
pointed to by the left has all the qualifiers of the type pointed to
by the right;
So we can assign two pointers (and thus call a function with a pointer parameter using a pointer argument) if the two pointers point to compatible types.
The compatibility of various array types is defined in C89 §3.5.4.2:
For two array types to be compatible, both shall have compatible
element types, and if both size specifiers are present, they shall
have the same value.
For the two array types int [] and int [3] this condition clearly holds. Therefore the function call is legal.
void foo(int (*p)[])
{
printf("%d\n", (*p)[1]);
printf("%d\n", p[0][1]); // Line 5
}
Here, p is a pointer to an array of an unspecified number of ints. *p accesses that array, so (*p)[1] is the 2nd element in the array.
p[n] adds p and n times the size of the pointed-to array, which is unknown. Even before considering the [1], it's broken. It's true that zero times anything is still 0, but the compiler's obviously checking the validity of all the terms without short-circuiting as soon as it sees zero. So...
So it's unhappy with the expression p[0], but it's happy with *p, even though these should (in theory) be equivalent.
As explained, they're clearly not equivalent... think of p[0] as p + 0 * sizeof *p and it's obvious why....
For "bonus" points: Can anyone confirm that MSVC 2010 is in error when it rejects line 10 with the following message?
1><snip>\prog.c(10): warning C4048: different array subscripts : 'int (*)[]' and 'int (*)[3]'
Visual C++ (and other compilers) are free to warn about things that they think aren't good practice, things that have been found empirically to be often erroneous, or things the compiler writers just had an irrational distrust of, even if they're entirely legal re the Standard.... Examples that may be familiar include "comparing signed and unsigned" and "assignment within a conditional (suggest surrounding with extra parentheses)"

Resources