Some C compilers still treat a cast pointer as an lvalue, but gcc with default settings won't compile such code. Instead of having to port a legacy code base (by hand, which is error-prone) to:
p=(int *)p+1 ; /* increment by sizeof(int) */
can gcc be made to allow this code below (even if not technically correct)?
void f() {
void *p ; /* type to be cast later for size */
((int *)p)++ ; /* gcc -c f.c => lvalue required error */
}
Edit: even if technically incorrect, I assume the only programmer intent for such code is for p to remain lvalue and "increment" it, generating same code as my long form, right? (perreal flagged as "lvalue cast")
Edit2: We all agree refactoring to std C is best, but if something like Mateo's -fpermissive worked, it might not catch future programming errors, but hoping initial gcc porting effort will be less faulty... Any other similar suggestions?
If you want to clean up your code (remove excessive casts), you can concentrate the ugly casts inside a macro or, even better, an inline function.
The fragment below only makes use of implicit conversions to/from void*.
You are, of course, still responsible for proper alignment.
#include <stdio.h>
static void *increment(void * p,size_t offset)
{
char *tmp = p;
return tmp+offset;
}
int main(void)
{
void *ptr = "Hell0 world!\n";
ptr = increment( ptr, sizeof(int) );
printf("%s", (char*) ptr);
return 0;
}
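The macro variant mentioned above could look roughly like the following. This is only a sketch: ADVANCE_AS and second_int are hypothetical names, and the macro evaluates its first argument twice, so it should only be given a plain variable.

```c
/* Hypothetical helper: advance the void* lvalue p by one object of TYPE.
   Note that p is evaluated twice, so pass a simple variable only. */
#define ADVANCE_AS(p, TYPE) ((p) = (void *)((TYPE *)(p) + 1))

/* Example use: return the second int of an array reached through a void*.
   The caller must pass a pointer to at least two properly aligned ints. */
int second_int(void *p)
{
    ADVANCE_AS(p, int);   /* same effect as p = (int *)p + 1; */
    return *(int *)p;
}
```

This keeps the cast in one place, so each legacy `((int *)p)++` becomes `ADVANCE_AS(p, int)` without touching the surrounding logic.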
Is the following valid C code? (godbolt)
#include <stddef.h>
ptrdiff_t f(size_t n, void *x, void *y)
{
if (!n) return 0;
typedef unsigned char element[n];
element *a = x, *b = y;
return a - b;
}
With -Werror=pointer-arith clang loudly complains about
<source>:8:14: error: subtraction of pointers to type 'element' (aka 'unsigned char [n]') of zero size has undefined behavior [-Werror,-Wpointer-arith]
return a - b;
~ ^ ~
while gcc compiles the code without complaint.
What is the undefined behavior that clang thinks is occurring? The
possibility of the subtraction being zero and therefore not a valid
pointer to an element of the array or something different? There's no
array access being performed, right? So that shouldn't be the case...
If the code does exhibit undefined behavior, is there a simple way to
modify the code to be fully conforming, while still using pointers to
VM-types?
The code posted is valid C code if VLAs are supported by your compiler. Note that VLAs, introduced in C99, became optional in C11.
Both gcc and clang compile the code correctly as can be verified using Godbolt's compiler explorer.
Yet clang issues a warning regarding potential undefined behavior if the value of the n argument happens to be zero, failing to recognize that this case has already been handled by an explicit test. The problem is not the value of the subtraction but the size of the type, which would be 0 if n == 0. This warning is not really a bug, more a quality of implementation issue.
It is also arguable that a - b is only defined if both a and b point to elements of the same array or just past the last element of it. Hence x and y must satisfy this constraint and have type unsigned char (*)[n] or compatible. There is an exception to this rule for accessing any type as an array of character type, so passing pointers to the same array of int would be fine, but it would be incorrect (although probably harmless) to call f this way:
int x, y;
ptrdiff_t dist = f(sizeof(int), &x, &y);
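By contrast, a call that respects the constraint passes two pointers into the same array. The following is a minimal sketch, assuming VLA support; rows_apart and grid are hypothetical names, and f is repeated from the question only to keep the example self-contained.

```c
#include <stddef.h>

/* Same function as in the question. */
ptrdiff_t f(size_t n, void *x, void *y)
{
    if (!n) return 0;
    typedef unsigned char element[n];
    element *a = x, *b = y;
    return a - b;
}

/* Hypothetical caller: both pointers address rows of one 2-D array,
   and each row has exactly the size passed as n. */
ptrdiff_t rows_apart(void)
{
    static unsigned char grid[4][8];
    return f(sizeof grid[0], &grid[3], &grid[1]);  /* rows 3 and 1 */
}
```

Here the subtraction is performed between pointers to elements of the same array of `unsigned char[8]`, which is the situation the pointer-subtraction rule is written for.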
Compilers are free to issue diagnostic messages to attract the programmer's attention on potential problems, indeed such warnings are life savers in many cases for beginners and advanced programmers alike. Compiler options such as -Wall, -Werror and -Weverything are quite useful, but in this particular case, one will need to add -Wno-pointer-arith to allow clang to compile this function if -Werror is also active.
Note also that the same result can be obtained with a C89 function:
ptrdiff_t f89(size_t n, void *x, void *y)
{
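/* C89 requires declarations before statements */
unsigned char *a = x, *b = y;
if (!n) return 0;
/* cast n so a negative difference is not converted to unsigned */
return (a - b) / (ptrdiff_t)n;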
}
Consider this artificial example:
#include <stddef.h>
static inline void nullify(void **ptr) {
*ptr = NULL;
}
int main() {
int i;
int *p = &i;
nullify((void **) &p);
return 0;
}
&p (an int **) is cast to void **, which is then dereferenced. Does this break the strict aliasing rules?
According to the standard:
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
a type compatible with the effective type of the object,
So unless void * is considered compatible with int *, this violates the strict aliasing rules.
However, this is not what is suggested by gcc warnings (even if it proves nothing).
While compiling this sample:
#include <stddef.h>
void f(int *p) {
*((float **) &p) = NULL;
}
gcc warns about strict aliasing:
$ gcc -c -Wstrict-aliasing -fstrict-aliasing a.c
a.c: In function ‘f’:
a.c:3:7: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
*((float **) &p) = NULL;
~^~~~~~~~~~~~~
However, with a void **, it does not warn:
#include <stddef.h>
void f(int *p) {
*((void **) &p) = NULL;
}
So is it valid regarding the strict aliasing rules?
If it is not, how to write a function to nullify any pointer (for example) which does not break the strict aliasing rules?
There is no general requirement that implementations use the same representations for different pointer types. On a platform that would use a different representation for e.g. an int* and a char*, there would be no way to support a single pointer type void* that could act upon both int* and char* interchangeably. Although an implementation that can handle pointers interchangeably would facilitate low-level programming on platforms which use compatible representations, such ability would not be supportable on all platforms. Consequently, the authors of the Standard had no reason to mandate support for such a feature rather than treating it as a quality of implementation issue.
From what I can tell, quality compilers like icc which are suitable for low-level programming, and which target platforms where all pointers have the same representation, will have no difficulty with constructs like:
void resizeOrFail(void **p, size_t newsize)
{
void *newAddr = realloc(*p, newsize);
if (!newAddr) fatal_error("Failure to resize");
*p = newAddr;
}
anyType *thing;
... code chunk #1 that uses thing
resizeOrFail((void**)&thing, someDesiredSize);
... code chunk #2 that uses thing
Note that in this example, both the act of taking thing's address and all use of the resulting pointer visibly occur between the two chunks of code that use thing. Thus, there is no actual aliasing, and any compiler which is not willfully blind will have no trouble recognizing that the act of passing thing's address to resizeOrFail might cause thing to be modified.
On the other hand, if the usage had been something like:
void **myptr;
anyType *thing;
myptr = &thing;
... code chunk #1 that uses thing
*myptr = realloc(*myptr, newSize);
... code chunk #2 that uses thing
then even quality compilers might not realize that thing might be affected between the two chunks of code that use it, since there is no reference to anything of type anyType* between those two chunks. On such compilers, it would be necessary to write the code as something like:
myptr = &thing;
... code chunk #1 that uses thing
*(void *volatile*)myptr = realloc(*myptr, newSize);
... code chunk #2 that uses thing
to let the compiler know that the operation on *myptr is doing something "weird". Quality compilers intended for low-level programming will regard this as a sign that they should avoid caching the value of thing across such an operation, but even the volatile qualifier won't be enough for implementations like gcc and clang in optimization modes that are only intended to be suitable for purposes that don't involve low-level programming.
If a function like resizeOrFail needs to work with compiler modes that aren't really suitable for low-level programming, it could be written as:
void resizeOrFail(void **p, size_t newsize)
{
void *newAddr;
memcpy(&newAddr, p, sizeof newAddr);
newAddr = realloc(newAddr, newsize);
if (!newAddr) fatal_error("Failure to resize");
memcpy(p, &newAddr, sizeof newAddr);
}
This would, however, require that compilers allow for the possibility that resizeOrFail might alter the value of an arbitrary object of any type--not merely data pointers--and thus needlessly impair what should be useful optimizations. Worse, if the pointer in question happens to be stored on the heap (and isn't of type void*), a conforming compiler that isn't suitable for low-level programming would still be allowed to assume that the second memcpy can't possibly affect it.
A key part of low-level programming is ensuring that one chooses implementations and modes that are suitable for that purpose, and knowing when they might need a volatile qualifier to help them out. Some compiler vendors might claim that any code which requires that compilers be suitable for its purposes is "broken", but attempting to appease such vendors will result in code that is less efficient than could be produced by using a quality compiler suitable for one's purposes.
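For readers who must stay within what the Standard guarantees, one alternative that avoids void** altogether is to return the new address and let the caller assign it to its own typed pointer. This is a sketch rather than a recommendation from the answer above: resize_or_fail is a hypothetical name, fatal_error is a stand-in for the undefined helper used earlier, and newsize is assumed to be nonzero.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the fatal_error() used in the answer. */
static void fatal_error(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    abort();
}

/* Returns the resized block; the caller stores it in its own pointer,
   so no object is ever accessed through an lvalue of the wrong type.
   Assumes newsize is nonzero. */
void *resize_or_fail(void *p, size_t newsize)
{
    void *new_addr = realloc(p, newsize);
    if (!new_addr) fatal_error("Failure to resize");
    return new_addr;
}
```

A call then reads `thing = resize_or_fail(thing, someDesiredSize);`. The cost is that the function's return value is tied up and the caller must remember the assignment.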
In the following code...
#include <stdlib.h>
#include <stdint.h>
extern void get_buffer_from_HW_driver(volatile uint32_t **p);
void getBuffer(volatile uint32_t **pp)
{
// Write an address into pp, that is obtained from a driver
// The underlying HW will be DMA-ing into this address,
// so the data pointed-to by the pointer returned by this
// call are volatile.
get_buffer_from_HW_driver(pp);
}
void work()
{
uint32_t *p = NULL;
getBuffer((volatile uint32_t **)&p);
}
...the compiler rightfully detects that any potential accesses to the data pointed to by p inside work are dangerous accesses. As-is, the code instructs the compiler that it is safe to emit code that optimizes away repeated read accesses to *p - which is indeed wrong.
But the weird thing is, that the warning emitted by compiling this code...
$ gcc -c -Wall -Wextra -Wcast-qual constqual.c
...doesn't complain about the loss of volatile - it instead recommends using const:
constqual.c: In function ‘work’:
constqual.c:20:15: warning: to be safe all intermediate pointers in cast from
‘uint32_t ** {aka unsigned int **}’ to ‘volatile uint32_t **
{aka volatile unsigned int **}’ must be ‘const’ qualified
[-Wcast-qual]
getBuffer((volatile uint32_t **)&p);
^
I cannot see how const makes sense here.
P.S. Note that adding volatile in front of the uint32_t *p, as expected, fixes the issue. My question is why GCC recommends const instead of volatile.
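The volatile-qualified variant mentioned in the P.S. can be sketched as follows. The driver is replaced by a hypothetical stub (get_buffer_stub and fake_dma_buffer are made-up names), since the real get_buffer_from_HW_driver is not shown.

```c
#include <stdint.h>

/* Hypothetical stand-in for a buffer the hardware would DMA into. */
static volatile uint32_t fake_dma_buffer[4] = {1, 2, 3, 4};

/* Hypothetical stand-in for the driver call: stores the buffer address. */
static void get_buffer_stub(volatile uint32_t **pp)
{
    *pp = fake_dma_buffer;
}

uint32_t first_word(void)
{
    volatile uint32_t *p = 0;   /* pointee is volatile from the start */
    get_buffer_stub(&p);        /* types match exactly, so no cast is needed */
    return p[0];                /* read through a volatile lvalue */
}
```

With p declared this way there is no cast for -Wcast-qual to examine, and every access to the buffer goes through a volatile-qualified lvalue.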
Well, I raised a ticket in GCC's Bugzilla about this... and Joseph Myers replied laconically:
No, GCC is not confused. It's saying that it's type-safe to convert
uint32_t ** to volatile uint32_t *const *, but not to convert it to
volatile uint32_t **.
...and he also added a reference to this part of the C FAQ.
I have to admit that my first reaction to this was a "say what?". I quickly tested the suggestion, changing the code to make it use the proposed declaration (and cast) instead...
#include <stdlib.h>
#include <stdint.h>
extern void get_buffer_from_HW_driver(volatile uint32_t * const *p);
void getBuffer(volatile uint32_t * const *pp)
{
// Write an address into pp, that is obtained from a driver
// The underlying HW will be DMA-ing into this address,
// so the data pointed-to by the pointer returned by this
// call are volatile.
get_buffer_from_HW_driver(pp);
}
void work()
{
uint32_t *p = NULL;
getBuffer((volatile uint32_t * const *)&p);
}
$ gcc -c -Wall -Wextra -Wcast-qual constqual.c
$
...and indeed, no warning anymore.
So I went ahead and read the relevant FAQ - and I think I understand a bit more of what is happening. By adding the const modifier, the parameter we are passing is (reading from right to left, as we're supposed to do in this kind of C syntax):
a pointer to a constant pointer (that will never change) that points to volatile data
This indeed maps very well to what is happening here: I am getting a pointer that points to volatile data, that is a driver-provided buffer - i.e. one that I indeed am not allowed to change, since it comes from pre-allocated lists of buffers that the driver itself allocated. Modifying the pointer that get_buffer_from_HW_driver returned would make no sense; it's not mine to modify, I can only use it as-is.
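What the const at the middle level permits and forbids can be sketched in a few lines. All names here (sample_buffer, read_first, demo_read) are hypothetical and not part of the original code.

```c
#include <stdint.h>

/* Hypothetical stand-in for a driver-owned buffer. */
static volatile uint32_t sample_buffer[2] = {11, 22};

/* pp is a pointer to a const pointer to volatile data. */
uint32_t read_first(volatile uint32_t *const *pp)
{
    /* *pp = sample_buffer;   would be rejected: *pp is const here */
    return (*pp)[0];          /* reading through it is fine */
}

uint32_t demo_read(void)
{
    volatile uint32_t *p = sample_buffer;
    return read_first(&p);    /* adding const at this level needs no cast */
}
```

One consequence worth noting is that a function receiving this parameter type can read the pointer it is given but cannot store a new address through it.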
I confess I am really surprised that C's typesystem (augmented with the really strong static-analysis checks of -Wcast-qual) can actually help in guaranteeing these semantics.
Many thanks to Joseph - and I'll leave the question open for a few weeks, in case someone else wants to elaborate more.
P.S. Adding a mental note: from now on, when anyone claims that C is a simple language, I think I'll point them here.
I've tried to read up on the other questions here on SO with similar titles, but they are all a tiny bit too complex for me to be able to apply the solution (or even explanation) to my own issue, which seems to be of a simpler nature.
In my case, I have a wrapper around free() which sets the pointer to NULL after freeing it:
void myfree(void **ptr)
{
free(*ptr);
*ptr = NULL;
}
In the project I'm working on, it is called like this:
myfree((void **)&a);
This makes gcc (4.2.1 on OpenBSD) emit the warning "dereferencing type-punned pointer will break strict-aliasing rules" if I crank up the optimization level to -O3 and add -Wall (not otherwise).
Calling myfree() the following way does not make the compiler emit that warning:
myfree((void *)&a);
And so I wonder if we ought to change the way we call myfree() to this instead.
I believe that I'm invoking undefined behaviour with the first way of calling myfree(), but I haven't been able to wrap my head around why. Also, of all the compilers that I have access to (clang and gcc), on all systems (OpenBSD, Mac OS X and Linux), this is the only compiler and system that actually gives me that warning (and I know that emitting warnings is optional, if nice).
Printing the value of the pointer before, inside and after the call to myfree(), with both ways of calling it, gives me identical results (but that may not mean anything if it's undefined behaviour):
#include <stdio.h>
#include <stdlib.h>
void myfree(void **ptr)
{
printf("(in myfree) ptr = %p\n", *ptr);
free(*ptr);
*ptr = NULL;
}
int main(void)
{
int *a, *b;
a = malloc(100 * sizeof *a);
b = malloc(100 * sizeof *b);
printf("(before myfree) a = %p\n", (void *)a);
printf("(before myfree) b = %p\n", (void *)b);
myfree((void **)&a); /* line 21 */
myfree((void *)&b);
printf("(after myfree) a = %p\n", (void *)a);
printf("(after myfree) b = %p\n", (void *)b);
return EXIT_SUCCESS;
}
Compiling and running it:
$ cc -O3 -Wall free-test.c
free-test.c: In function 'main':
free-test.c:21: warning: dereferencing type-punned pointer will break strict-aliasing rules
$ ./a.out
(before myfree) a = 0x15f8fcf1d600
(before myfree) b = 0x15f876b27200
(in myfree) ptr = 0x15f8fcf1d600
(in myfree) ptr = 0x15f876b27200
(after myfree) a = 0x0
(after myfree) b = 0x0
I'd like to understand what is wrong with the first call to myfree() and I'd like to know if the second call is correct. Thanks.
Since a is an int* and not a void*, &a cannot be converted to a pointer to a void*. (Suppose void* were wider than a pointer to an integer, something which the C standard allows.) As a result, neither of your alternatives -- myfree((void**)&a) and myfree((void*)&a) -- is correct. (Casting to void* is not a strict aliasing issue. But it still leads to undefined behaviour.)
A better solution (imho) is to force the user to insert a visible assignment:
void* myfree(void* p) {
free(p);
return 0;
}
a = myfree(a);
With clang and gcc, you can use an attribute to indicate that the return value of myfree must be used, so that the compiler will warn you if you forget the assignment. Or you could use a macro:
#define myfree(a) (a = myfree(a))
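The attribute in question is spelled warn_unused_result in both gcc and clang. A minimal sketch follows; MUST_USE_RESULT is a hypothetical wrapper macro, added only so the code still compiles with other compilers.

```c
#include <stdlib.h>

/* Hypothetical wrapper so the attribute disappears on other compilers. */
#if defined(__GNUC__) || defined(__clang__)
#define MUST_USE_RESULT __attribute__((warn_unused_result))
#else
#define MUST_USE_RESULT
#endif

/* Frees p and returns a null pointer for the caller to assign back. */
MUST_USE_RESULT void *myfree(void *p)
{
    free(p);
    return NULL;
}
```

With this declaration, writing `myfree(a);` without using the result should draw a warning from gcc or clang, while `a = myfree(a);` compiles quietly.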
Here's a suggestion that:
Does not violate the strict aliasing rule.
Makes the call more natural.
void* myfree(void *ptr)
{
free(ptr);
return NULL;
}
#define MYFREE(ptr) ((ptr) = myfree(ptr))
You can use the macro simply as:
int* a = malloc(sizeof(int)*10);
...
MYFREE(a);
There are basically a few ways to have a function work with and modify a pointer in a fashion agnostic to the pointer's target type:
1. Pass the pointer into the function as void* and return it as void*, applying appropriate conversions in both directions at the call site. This approach has the disadvantage of tying up the function's return value, precluding its use for other purposes, and also precludes the possibility of performing the pointer update within a lock.
2. Pass a pointer to a function which accepts two void*, casts one of them into a pointer of the appropriate type and the other to a double-indirect pointer of that type, and possibly a second function that can read a passed-in pointer as a void*, and use those functions to read and write the pointer in question. This should be 100% portable, but likely very inefficient.
3. Use pointer variables and fields of type void* elsewhere and cast them to real pointer types whenever they're actually used, thus allowing pointers of type void** to be used to modify the pointer variables.
4. Use memcpy to read or modify pointers of unknown type, given double-indirect pointers which identify them.
5. Document that code is written in a dialect of C, popular in the 1990s, which treated "void**" as a double-indirect pointer to any type, and use compilers and/or settings that support that dialect. The C Standard allows for implementations to use different representations for pointers to things of different types, and because those implementations couldn't support a universal double-indirect pointer type, and because implementations which could easily allow void** to be used that way already did so before the Standard was written, there was no perceived need for the Standard to describe that behavior.
The ability to have a universal double-indirect pointer type was and is extremely useful on the 90%+ of implementations that could (and did) readily support it, and the authors of the Standard certainly knew that, but the authors were far less interested in describing behaviors that sensible compiler writers would support anyway, than in mandating behaviors which would be on the whole beneficial even on platforms where they could not be cheaply supported (e.g. mandating that even on a platform whose unsigned math instructions wrap mod 65535, a compiler must generate whatever code is needed to make calculations wrap mod 65536). I'm not sure why modern compiler writers fail to recognize that.
Perhaps if programmers start overtly writing for sane dialects of C, the maintainers of standards might recognize that such dialects have value. [Note that from an aliasing perspective, treating void** as a universal double-indirect pointer will have far less severe performance costs than forcing programmers to use any of the alternatives 2-4 above; any claims by compiler writers that treating void** as a universal double-indirect pointer would kill performance should thus be treated skeptically].
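The memcpy alternative from the list above might be sketched like this. The function name free_and_clear is hypothetical, and the code assumes a platform where every object pointer has the same size and representation as void*; the Standard does not guarantee that.

```c
#include <stdlib.h>
#include <string.h>

/* pptr is the address of some object pointer (int *, struct foo *, ...).
   ASSUMPTION: that pointer has the same size and representation as void*. */
void free_and_clear(void *pptr)
{
    void *tmp;

    memcpy(&tmp, pptr, sizeof tmp);   /* read the pointer's bytes */
    free(tmp);
    tmp = NULL;
    memcpy(pptr, &tmp, sizeof tmp);   /* write a null pointer back */
}
```

A call looks like `free_and_clear(&a);` with no cast, since any object pointer converts implicitly to void*. The price is that the compiler can no longer check that the argument really is the address of a pointer.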
#include <stdio.h>
#include <stdlib.h>
const int * func()
{
int * i = malloc(sizeof(int));
(*i) = 5; // initialize the value of the memory area
return i;
}
int main()
{
int * p = func();
printf("%d\n", (*p));
(*p) = 3; // attempt to change the memory area - compiles fine
printf("%d\n", (*p));
free(p);
return 0;
}
Why does the compiler allow me to change (*p) even if func() returns a const pointer?
I'm using gcc, it shows only a warning on the int * p = func(); line : "warning: initialization discards qualifiers from pointer target type".
Thanks.
Your program is not valid. C forbids implicitly removing a const like that, and in conformance to the spec GCC should give you at least a warning for that code. You would need a cast to remove the const.
Having consumed a warning for that, you can however rely on the program to work (although not anymore from a Standards point of view), because the pointer is pointing to a malloc'ed memory area. And you are allowed to write to that area. A const T* pointing to some memory doesn't mean that the memory is thereafter marked immutable.
Note that the Standard doesn't require a compiler to reject any program. The Standard merely requires compilers to sometimes emit a message to the user. Whether that's an error message or warning and how the message is emitted and whatever happens after that emission, isn't specified by the Standard at all.
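The explicit cast mentioned above, and the reason the write still works, can be sketched as follows. make_five and overwrite are hypothetical names; the write is well defined only because the object itself came from malloc and was never declared const.

```c
#include <stdlib.h>

/* Returns a pointer to const, but the underlying object is ordinary
   malloc'ed memory, which remains writable. */
const int *make_five(void)
{
    int *i = malloc(sizeof *i);
    if (i) *i = 5;
    return i;
}

/* Removes the qualifier with an explicit cast, then writes.
   Valid only because the pointed-to object is not itself const. */
int overwrite(const int *cp, int value)
{
    int *p = (int *)cp;
    *p = value;
    return *cp;
}
```

Had cp pointed at an object defined as `const int`, the same write would have undefined behavior, cast or no cast.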
The compiler and the C language "allow" you to do all manner of stupid things, especially if you ignore warnings. The conversion of a const int* to int* is the only point at which the compiler can detect that there's anything amiss here, and it issued a warning for that conversion. That's as much disapproval as you'll get, and it's why you shouldn't ignore warnings.
Since the behavior of this program is defined (by GCC, to be the same as if you'd explicitly cast to int*), it's at least possible that what you've done really is what you intended to do. That's why the code is accepted.
You are turning a pointer to const into a normal pointer, which essentially allows you to change the data it points to. You are breaking the "contract" you made by returning a pointer to const, but since C is a weakly-typed language it is syntactically legal.
Basically GCC is helping you here. Syntactically it is legal to turn a const pointer into a regular one, but chances are you didn't want to do that so GCC throws a warning.
Read design by contract.
First of all, the memory is valid in main, because it's stored on the heap and hasn't been freed. So the compiler just gives you a warning.
If you try
const int * p = func();
then of course (*p) = 3 will be an error.